26 Comments

🚀 How many tests do you really need?

We folks over at Stock Alarm have recently been writing tests like there's no tomorrow because we've reached a level where we can't afford a mistake in production (imagine thousands of users getting a phone call at 3 am lol).

While writing these tests I realized that I haven't read any threads on IH about testing in a long while, specifically regarding unit tests, integration tests, line coverage, branch coverage, etc.

My question to you all is: how do you know when you've written enough tests? Is there such a thing? What metrics do you use to track how well your code is tested?

Drop a link to your project as well so we can understand your niche.

  1. 5

    Is there such a thing as enough tests?
    For unit tests there is 100% coverage
    and complete input validation of all types.

    What's common is to identify core, mission-critical components and test the heck out of them,
    but not do as much for the edge things.
    You can think of it as a risk matrix: what happens if this component fails, then categorise the customer impact... if it's just going to be a display issue it's not as big as an alarm issue...
    Also, the more users you have, the more worthwhile it becomes to do transaction testing on things like sign-ups...
    Another way to 'test' indirectly is to have good stats and alerts on them. For example, you track logins and alarms, and you know that during "normal days" they should be within a range (there are anomaly detection systems for fancier stuff). So if something is suddenly out of range, you know there is an issue, even if not exactly where from... like no signups today where normal is 5-15, something is up. It could be stuff outside the code as well, such as servers, config and so on. It's a good view that isn't usually called 'testing', but it is protection of your systems, which is very similar.
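
    A rough sketch of the kind of range check I mean (the thresholds, the metric source and notify() are made up, just to illustrate the idea):

    ```js
    // Hypothetical sketch: compare today's signup count against a "normal" range
    // and alert if it falls outside. Replace the placeholders with your own
    // metric source and notification channel.
    const NORMAL_SIGNUPS = { min: 5, max: 15 };

    async function checkSignupRange(getSignupCountForToday, notify) {
      const count = await getSignupCountForToday();
      if (count < NORMAL_SIGNUPS.min || count > NORMAL_SIGNUPS.max) {
        await notify(
          `Signups today: ${count} (normal is ${NORMAL_SIGNUPS.min}-${NORMAL_SIGNUPS.max}) — check code, servers and config`
        );
      }
    }
    ```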

    1. 1

      Wow thank you for a super well thought out response.

      I totally agree on all points and it's very interesting to think of testing in terms of a risk matrix.

      And regarding metrics, I never thought about setting alarms on those, but based on your input I think it might be a worthy investment. How do you usually determine a safe percentage deviation from regular days? I suppose it would be fine to assume that a 30% deviation in signups from the previous day (assuming weekends/weekdays are handled separately) signals an error.

      Either way thanks for your input 🚀

      1. 1

        You usually start by just tracking the numbers; then, for simple things, you figure out your own patterns and set alerts up. You can also do it on multiple levels...
        I've been at a company where this was the main thing. For example, a 10-minute window with 0 signups was a critical, wake-people-up alert, but a 50% drop in daily registrations was only the next tier of alert, since the day-to-day variation was huge... (and spikes up were just notifications during the workday for a few people in the marketing department).
        Enterprise-level systems actually have anomaly detection, or at least a prediction metric. For example, with disk size it's easy to just say "give me an alert 10 days before the disk runs out" (and you don't need to think about the rate of consumption or changes in it; that's available in Datadog, for example).
        I've more often seen this done manually: after a deploy, say after an hour, compare the main metrics for changes... (sometimes things are expected to change).
        Don't be afraid to smooth the numbers out with moving averages or comparisons, say "this hour now vs. the same hour last week" kind of thing. (Businesses often differ a lot between weekdays and weekend days; the stock exchange doesn't even open on weekends for you...)
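
        As a rough sketch of that "this window vs. last week" comparison (the 50% threshold and the fetchHourlyCounts/alert helpers are made up for illustration):

        ```js
        // Hypothetical sketch: compare a smoothed recent count against the same
        // window last week and alert on a big drop.
        const HOUR_MS = 60 * 60 * 1000;
        const WEEK_MS = 7 * 24 * HOUR_MS;

        function average(values) {
          return values.reduce((sum, v) => sum + v, 0) / values.length;
        }

        async function checkWeekOverWeekDrop(fetchHourlyCounts, alert, dropThreshold = 0.5) {
          const now = Date.now();
          // Smooth with a 3-hour window instead of a single noisy hour.
          const current = average(await fetchHourlyCounts(now - 3 * HOUR_MS, now));
          const lastWeek = average(await fetchHourlyCounts(now - WEEK_MS - 3 * HOUR_MS, now - WEEK_MS));
          if (lastWeek > 0 && current < lastWeek * (1 - dropThreshold)) {
            const dropPct = Math.round(100 * (1 - current / lastWeek));
            await alert(`Metric is down ${dropPct}% vs. the same window last week`);
          }
        }
        ```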

  2. 3

    Zero, users will test on production 😈

    1. 3

      I see you like to live dangerously as well 👨‍💻

  3. 2

    For reference, I'm developing Nodewood, a JavaScript SaaS boilerplate for people looking to build a SaaS from a decent starting point, and save time over personally developing the common subset of features every SaaS has. Given that, my point of view is a little skewed, since bugs in my code ripple out to everyone using it. Like library authors, I have to be a little stricter and less-comfortable with taking on technical debt than a SaaS founder can be.

    I aim for 100% integration test coverage of the API, and unit tests only for things where they make sense. For example, validation can be cumbersome to test with integration tests, especially once you've set up a decent validation pattern. Unit testing those patterns means that anywhere else you use that validation, you don't need to be as exhaustive about testing all its possible permutations.

    That said, I use tests to develop my API in the first place, so it's not like it's a hardship on me to add that coverage. I start off by defining all the important actions and considerations for my endpoint with a bunch of it.todo('should do xyz'), and then I write each test one at a time, turning it from red to green. When all tests are written and green, I can start using the API from my UI directly, since I know exactly how it's supposed to behave.
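
    For anyone who hasn't seen the pattern, it looks roughly like this in Jest (the endpoint and assertions are an invented example, not Nodewood's actual API):

    ```js
    // Sketch of the it.todo() workflow: list the behaviours first, then turn
    // each todo into a real test and take it from red to green.
    describe('POST /api/widgets', () => {
      it.todo('rejects requests without an auth token');
      it.todo('rejects a payload with a missing name');
      it.todo('creates the widget and returns 201');

      // Later, each todo becomes a real test, e.g.:
      // it('rejects a payload with a missing name', async () => {
      //   const response = await api.post('/api/widgets', { name: undefined });
      //   expect(response.status).toBe(400);
      // });
    });
    ```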

    In the end, this development pattern saves me time, not just for the "less broken code to worry about later" standard reason, but because I don't have to worry about reloading pages, setting up state, sending requests, wondering whether the bug is in the UI or the API, etc.

    1. 2

      Thanks for the background info. Nodewood looks dope; I might use it for my next project, but I would love to see some examples of things built with Nodewood (maybe add a section to your website).

      Based on your brief example, I assume you are using Jest/Sinon/Chai and whatnot for your testing.

      Interesting that you aim for 100% integration tests, as a lot of people seem to emphasize unit tests (although I do agree integration testing is a lot more important).

      1. 1

        Thanks for the compliment! I'm just putting the finishing touches on Nodewood, so technically nobody is using it yet. I'll be looking for some people to take part in a paid (with large discount) beta once it's ready, though, and hopefully I can get some testimonials from them once I'm ready to go live for real.

        You're right - I do use Jest, and I do think integration tests are more important. For the most part, I don't care how the implementation works, so long as the API gives the correct outputs to specific inputs. This keeps the coupling for my tests very low. I hated writing tests for years, because my employers were so focused on unit tests. Too much mocking, and refactoring meant a whole lot of extra work rewriting the tests at the same time, which defeated the purpose. With integration tests, I can refactor to my heart's content, secure in the knowledge that my API still functions exactly the same as it's supposed to.
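
        To illustrate what I mean by low-coupling integration tests, here's a rough sketch using supertest against an Express app (the endpoint, payload and file path are made up):

        ```js
        // Sketch: black-box integration test. Nothing internal is mocked, so the
        // implementation can be refactored freely as long as inputs/outputs hold.
        const request = require('supertest');
        const app = require('../src/app'); // assumed path to the Express app

        describe('POST /api/users', () => {
          it('creates a user and returns its id', async () => {
            const response = await request(app)
              .post('/api/users')
              .send({ email: 'test@example.com', password: 'correct-horse-battery' });

            expect(response.status).toBe(201);
            expect(response.body.id).toBeDefined();
          });
        });
        ```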

  4. 2

    It's great that your team is making testing a priority.

    The question is a great one, very hard to answer, and to make matters more complicated, it can vary depending on the product you're building. Striving for high code coverage is a good idea for critical parts of your codebase that aren't likely to change much.

    Manual tests can also be a final sanity check before pushing a release; this is a great way to catch issues before they reach production and can help steer which automated tests are worth prioritising.

    Good luck!

    1. 1

      Totally agree that code coverage is a good idea, but it takes a lot of time that could be spent on building features.

      I do some simple sanity checks on our product before a release and after deployment, but I'm always worried something might go wrong, as our code coverage is still too low for my taste.

      Thank you for your input!

      1. 1

        Totally agree, does your team allocate a % of time for refactoring work per sprint? I find this to be a good way to slowly improve code coverage over a quarter or so.

  5. 2

    For Writing Analytics, I have high functional coverage for the backend – to ensure customers are charged correctly, aren't sent 2,000 emails, and that the data that goes into the db is in good shape. Basically, anything where the cleanup would be messy.

    The UI is the exact opposite – I make lots more changes there, and the potential damage isn't that high. Even if there were a severe issue like people not being able to log in or sign up (which hasn't happened yet, knock on wood), I could investigate and deploy a fix quickly.

    1. 2

      Totally agree, and I feel that our projects are twins in terms of testing haha!

      Our UI has little coverage, while our backend is currently going through a major testing overhaul to make sure we don't accidentally fire thousands of alerts lol

      I love your product btw; I had a similar idea a while back, but it looks like you beat me to it (in style as well, the UI looks super polished).

      1. 1

        Haha, great to see other people using the same approach.

        Thank you, I appreciate it!

  6. 2

    When writing tests, I ask myself: "Is it possible that this thing can be broken in the future? (Is this an area that we change often?)" and "If we break this, will we notice before we ship the bug?" If we can break it and might miss it, then the question becomes "What is the test I need to write so we're safe from the embarrassment?" In other words, I try to be very pragmatic about tests and write just enough to make sure we don't ship bugs.

    1. 2

      Thank you for the input, and that's a pretty good way to look at it. I always end up either writing too little coverage or testing way too much just as a sanity check. Seems like you've found a nice in-between mentality!

  7. 2

    Ideally 100% (*) unit test coverage, and of course as many functional and integration tests as possible.

    There is a saying, "untested code is broken code", and this rule is valid everywhere except for machine learning stuff (which cannot be properly tested). Sometimes property-based testing can help you discover corner cases that are pretty hard to imagine (see the sketch below).

    (*) The 100% coverage target is opinionated and definitely not a silver bullet; but in my experience, not being able to get your code to 100% coverage can be a sign that the code design could be improved.
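
    For the property-based testing point, a library like fast-check can generate inputs and surface corner cases for you; a minimal sketch (the percentChange helper is a made-up example):

    ```js
    // Sketch: property-based test with fast-check. It checks the property against
    // many generated inputs instead of a handful of hand-picked examples.
    const fc = require('fast-check');

    function percentChange(from, to) {
      return ((to - from) / from) * 100;
    }

    test('percentChange is 0 whenever the price does not move', () => {
      fc.assert(
        fc.property(fc.double({ min: 0.01, max: 1e6, noNaN: true }), (price) => {
          expect(percentChange(price, price)).toBe(0);
        })
      );
    });
    ```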
    1. 1

      Damn, 100% unit test coverage is insane to think about, but I guess that's why it's the ultimate goal. Love the saying hahah!

      1. 1

        Unit tests have very limited test scenarios, so 100% unit test coverage is not as harsh as it sounds. 100% test coverage (functional, integration, E2E, etc. combined), however, is probably undoable. You can literally come up with thousands, if not millions, of scenarios to test. If someone or your boss tells you to write 100% test coverage, tell them to either go hire a QA team or go fuck themselves. You could pay me $500k a year and I still wouldn't be able to write 100% test coverage and develop at the same time while meeting deadlines.

    2. 1

      This comment was deleted a year ago.

  8. 1

    IMO there are really two questions here.

    1. How important is quality?
    2. Which specific techniques are the best tradeoff for a given situation?

    It’s really important to get beyond platitudes here. Everybody theoretically wants a higher-quality product. The trick is actually redirecting development time that could have gone into features, product/market fit, etc. into improving quality instead, or getting customers to pass over cheap, low-quality options for an expensive, higher-quality one. These things definitely happen, but they are not universal. You need a clear vision of what level of quality your customers expect, or alternatively what percentage of your time you’re willing to spend, to focus the decision-making.

    Once you have that, you know where you want to be on the cost/benefit curve. So you start with the tactics that are great cost/benefit and move down the list of less benefit until you’ve achieved your goal.

    Indicators of excellent cost/benefit include:

    • testing for problems that would harm your reputation if they occurred
    • tests that can be written quickly
    • tests that will run quickly. A slow test suite won’t be run
    • tests for requirements people forget or are likely to break accidentally. Or did break.
    • tests that encode information people would need to communicate about anyway, such as what steps reproduce a bug
    • tests for things that aren’t obvious from the code, and the code can’t be changed to make them obvious

    Indicators of poor cost/benefit include:

    • We’ll probably change the expected results due to customer feedback, new learning etc.
    • Testing the implementation details of code we plan to rewrite
    • Behavior is nondeterministic due to time of day, network conditions, etc.

    Keep in mind, automated testing is just one strategy for improving quality. There are many other ways to find/fix issues: logging, analytics, playgrounds, user testing, fuzzing, etc. Consider which tool will be best for the problem you’re solving.

    1. 1

      Thank you so much for including a nice cost/benefit list. The IH community is a goldmine of information, and I've actually saved the link to your answer haha

      IMO the main issue arises when you have a feature with rapidly changing requirements that could also harm our reputation if a bug occurs. That tends to be where most of our bugs come from.

      We just started implementing analytics dashboards with a few alarms to check if something goes wrong and I will definitely look into the other options you mentioned.

  9. 1

    It's a continuous function, not a boolean. I feel that for Stock Alarm, there are so many math conversions that you do want test coverage for those to be 100%.
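
    For example, a pure conversion helper like the one below is cheap to cover exhaustively with plain unit tests (the function is a made-up stand-in, not Stock Alarm's actual code):

    ```js
    // Sketch: unit tests for a hypothetical price-conversion helper.
    function toPercentOfTarget(currentPrice, targetPrice) {
      if (targetPrice === 0) {
        throw new Error('targetPrice must be non-zero');
      }
      return (currentPrice / targetPrice) * 100;
    }

    test('returns 100 when the price hits the target', () => {
      expect(toPercentOfTarget(50, 50)).toBe(100);
    });

    test('throws on a zero target instead of returning Infinity', () => {
      expect(() => toPercentOfTarget(50, 0)).toThrow();
    });
    ```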

    I've also seen devs who are religious about unit tests trying to get 100% test coverage on UI, even when it's just a test feature that may get thrown out in a month. To me, that should be negotiated, because it can really slow down your speed of business innovation. Many devs don't understand that and insist on spending 3x the time to cover an experimental UI. It's especially a problem when your company only has 3-6 months of runway.

    On the other hand, for a company like Google, even an experiment should have some automated coverage, because it can afford the extra time engineers have to spend on it.

    I know I may freak out some people here, but the best devs I work with write code that is logically sound, so there are very few bugs. Usually they just need to write a few test cases to cover the corner cases and prevent future updates from breaking the behavior. To me that's honestly the ideal state.

    Unit tests cannot replace a clear logical understanding of the problem. There are devs who mechanically write tests and tweak their code until all the tests pass without really understanding the problem. That tends to be the case in many companies where good testing practices haven't been established, so people try to compensate by writing redundant tests.

    1. 1

      Thank you for the insightful feedback, and I totally agree that 100% coverage is more of an ideal-case scenario that startups cannot afford to pursue (should I keep testing or should I ship features?).

      I would be curious to hear about your testing strategy for your products (Higher goals & priorities). It seems that both products are somewhat related and might share infrastructure; in that case, do you share any backend between the two?

      1. 1

        I have 95% test coverage for a few modules where there are a lot of if/else statements. The two apps share a lot of the same components; having many reusable components is great leverage. They also share the same backend.

        I use immutable objects + functional programming + Rx, so I know they are bug-free as long as they are assembled properly.

        I screwed up a few times; in one instance the app was crashing for 2 days. It was because I accidentally deleted a line in the AndroidManifest.xml file, which no unit test can detect. I fixed it in 6 hours and very few people noticed that it was crashing.

        It comes down to your confidence in how fast you can recover from mistakes. I wouldn't have done that when I was working for a big company, but at a startup you almost always have less than 1 year of runway left. You have to take some measured risks to increase the odds of success.

  10. 2

    This comment was deleted a year ago.

    1. 1

      Thanks for the input!

      What is your web app out of curiosity?

      1. 2

        This comment was deleted a year ago.
