Several times in my career, senior engineers have told me that unit tests are not useful. These are the common arguments:
- You must hit a real database/3rd-party/dependency to “really” test stuff.
- Unit tests pass, but then the final result is buggy.
- High coverage means nothing.
I’m sure you’ve seen various images making the same point.
Of course, that’s fair: unit tests don’t test a system as a whole. And the farther our testing environment is from production, the more likely we are to face unpredictable outcomes. We would love to run our tests on production! Involve real dependencies, real data, real everything. Just imagine this QA heaven: high-level automated testing on production that hits every possible dependency and tests every possible user click. That’s what we need, not some detailed low-level testing. Right?
Describing perfect apples doesn’t negate the need for oranges. It’s true that testing a system as a whole, in an environment close to production, is very useful. But why compare that to unit testing?
Software without unit tests
Let’s imagine a system that processes numbers, applying a set of simple operations: adding, subtracting, and multiplying. For simplicity, let’s describe our system as a function:
S(a, b) => mul(sum(a, b), sub(a, b))
sum(a, b) => a + b
sub(a, b) => a - b
mul(a, b) => a * b
Any real application can be described as a super complex function, with a multitude of simple operations and with complex inputs and outputs.
High-level tests for our system may look as follows:
S(0, 0) == 0 // (0 + 0) * (0 - 0) = 0
S(0, 1) == -1 // (0 + 1) * (0 - 1) = -1
S(1, 1) == 0 // (1 + 1) * (1 - 1) = 0
The system consists of 3 units: sum, sub, and mul. In any high-level testing, we usually narrow expectations to a simple result: we check if a record was created or a message was displayed, etc. That’s because we can’t think of all the possible outcomes of a complex operation. Even in the artificial example above, we have a simple result: a number.
Now let’s imagine somebody introduces a bug by changing sum(a, b) from a + b to a + 1. Our system tests would appear greener than summer grass, because the change didn’t affect them, and we would happily ship the update to production:
S(0, 0) == 0 // (0 + 1) * (0 - 0) = 0
S(0, 1) == -1 // (0 + 1) * (0 - 1) = -1
S(1, 1) == 0 // (1 + 1) * (1 - 1) = 0
In our example, we had 3 units and 3 system tests. That’s a 1:1 ratio. Real systems have far more units, and the ratio of system tests to units can easily be 1:1000 or even more lopsided.
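To make the toy example concrete, here is a minimal Python sketch of the buggy system. The names follow the text, except that `sum_` is an assumed name used to avoid shadowing Python’s built-in `sum`, and the two helper functions and the unit-test cases are illustrative additions. It shows the three system tests still passing while a handful of unit-level checks on `sum_` alone expose the bug.

```python
def sum_(a, b):
    return a + 1  # the bug from the text: this should be a + b


def sub(a, b):
    return a - b


def mul(a, b):
    return a * b


def S(a, b):
    return mul(sum_(a, b), sub(a, b))


def run_system_tests():
    """The three high-level checks from the text; True if all of them pass."""
    return S(0, 0) == 0 and S(0, 1) == -1 and S(1, 1) == 0


def run_sum_unit_tests():
    """Unit-level checks for sum_ alone; returns the inputs that fail."""
    cases = [((0, 0), 0), ((0, 1), 1), ((1, 1), 2), ((2, 3), 5)]
    return [args for args, expected in cases if sum_(*args) != expected]


print(run_system_tests())    # True: the bug is invisible at the system level
print(run_sum_unit_tests())  # [(0, 0), (2, 3)]: the unit tests expose it
```

Note that two of the four unit cases also pass by coincidence, which is why a unit usually needs more than one or two checks.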
The more units we have, the higher the chance of missing a bug in one of them. Add to this the complexity of modern software: we use languages that support NULLs and dynamic typing, and we use tons of 3rd-party dependencies. Even a simple a + b can behave differently depending on the values and types of a and b. Good luck catching all those nuances in high-level testing!
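As a small illustration of how much behavior can hide behind one addition in a dynamically typed language, consider the Python sketch below. The `add` and `describe` helpers are hypothetical names introduced only for this example.

```python
def add(a, b):
    return a + b


def describe(a, b):
    """Return the result of add, or the exception type name if it raises."""
    try:
        return add(a, b)
    except TypeError as exc:
        return type(exc).__name__


print(describe(1, 2))        # 3
print(describe("1", "2"))    # 12 (string concatenation, not arithmetic)
print(describe([1], [2]))    # [1, 2] (list concatenation)
print(describe(True, True))  # 2 (booleans are integers in Python)
print(describe(None, 1))     # TypeError
print(describe(0.1, 0.2) == 0.3)  # False (floating-point rounding)
```

Each of these cases is trivial to pin down in a unit test and easy to miss in a test that only checks a final high-level result.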
After all, when you have a bug, it’s usually in a single unit (read: method, function, line). It may cause other units to fail, but the source, in many cases, is limited to one location.
Software with unit tests only
A set of unit tests does not guarantee the system is working as a whole. When we run 10K unit tests, we shouldn’t think that we’re testing our system. We’re merely running 10K independent tests. We may have a set of tests that involve the same unit, and when they all pass, we may assume the unit is working as expected. That would be a valid assumption. But all unit tests combined don’t test the system as a whole. Just like a high-level test that validates a feature doesn’t test each individual unit.
It is, however, much easier to grow a layer of high-level tests on top of the existing layer of unit tests. Adding unit tests to an existing system is often much harder.
Additional benefits of unit tests
So far, we have talked about the granularity of testing. Unit tests are supposed to operate at a very low level, as opposed to other kinds of tests. At the same time, there are at least two additional reasons to invest in unit tests: code decoupling (reducing complexity) and documentation.
You won’t be able to create proper unit tests if your code is tightly coupled to its dependencies. That’s especially painful when your code depends on an external service: a web API, a database, etc. Some developers think it’s cool to interact with a real database in tests. Yes, it is cool. In high-level testing. When good practices are followed and unit tests are created for every unit, there will be hundreds if not thousands of tests over time. They need to run fast so that developers can run the complete suite before pushing their changes. Otherwise, developers will run the whole suite less often than they should, which delays bug detection and eventually makes the entire deployment pipeline longer than desired. Low coupling and mocking are great tools for keeping unit tests fast.
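One possible way to keep such tests fast is to inject the external dependency and replace it with a mock in unit tests. The sketch below uses Python’s standard `unittest.mock`. The `PriceService` class, its `get_rate` client method, and the test names are all hypothetical, invented only for illustration.

```python
from unittest.mock import Mock


class PriceService:
    """Hypothetical unit: converts an amount using a rate from an injected client."""

    def __init__(self, rates_client):
        # The client would normally call a web API; here it is simply injected.
        self._rates = rates_client

    def convert(self, amount, currency):
        rate = self._rates.get_rate(currency)
        if rate <= 0:
            raise ValueError("rate must be positive")
        return round(amount * rate, 2)


def test_convert_uses_rate_from_client():
    client = Mock()
    client.get_rate.return_value = 1.5  # no network call is made
    service = PriceService(client)
    assert service.convert(10, "EUR") == 15.0
    client.get_rate.assert_called_once_with("EUR")


def test_convert_rejects_non_positive_rate():
    client = Mock()
    client.get_rate.return_value = 0
    service = PriceService(client)
    try:
        service.convert(10, "EUR")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a non-positive rate")


test_convert_uses_rate_from_client()
test_convert_rejects_non_positive_rate()
```

Because the client is passed in rather than created inside the class, the same code can talk to the real service in production and in high-level tests, while the unit tests never leave the process.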
Apart from coupling, there’s also complexity. If your unit does not follow the single responsibility principle, it will be harder to account for all possible scenarios and do thorough testing.
TDD (test-driven development) is an excellent way to keep code simple and loosely coupled. When you practice it, you also get great test coverage. Without TDD, it may be challenging to create high-quality unit tests.
Finally, when you have proper unit tests, they serve as technical documentation of your code. You can go through the tests to learn the code’s behavior. It may be harder to understand all possible outcomes from the code itself, even if it’s super clean. Tests serve as examples, and when we learn something, we need examples to understand faster.
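As a rough sketch of tests doubling as documentation, the example below uses a hypothetical `normalize_username` function. The function and the test names are assumptions made for illustration. The idea is that the test names alone tell a reader what the unit promises.

```python
def normalize_username(raw):
    """Hypothetical unit under test."""
    cleaned = raw.strip().lower()
    if not cleaned:
        raise ValueError("username must not be empty")
    return cleaned


def test_surrounding_whitespace_is_removed():
    assert normalize_username("  alice ") == "alice"


def test_case_is_folded_to_lowercase():
    assert normalize_username("Alice") == "alice"


def test_blank_input_is_rejected():
    try:
        normalize_username("   ")
    except ValueError:
        return
    raise AssertionError("expected ValueError for blank input")


test_surrounding_whitespace_is_removed()
test_case_is_folded_to_lowercase()
test_blank_input_is_rejected()
```

A newcomer who reads only the three test names already knows how the function treats whitespace, letter case, and blank input.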