← Back to home

Writing tests is different from writing code

Abstraction and factoring aren't super helpful when you're writing tests.

Writing decent software is difficult because the surface area (what works) is very different from the volume of the code (how it works and, more importantly, how it doesn't work).

This is something I’ve been thinking more about lately, and I’ve come to think of tests as a different kind of code. They’re more or less executable requirements, or executable documentation. In this sense, documentation can be thought of as a subset of requirements.

For tests you don’t want a ton of abstraction or function calls between you and the code you’re testing. You really want something that’s flat, concrete, and easily readable. Abstracting away duplication is not the thing you want to do in tests, even if it makes them less verbose. Abstraction leads to something like a DSL for your tests, where each test does something almost completely different in very few lines.
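As a minimal sketch of the two styles (the function and names here are hypothetical, not from any real codebase): the abstracted version funnels every case through a helper, while the flat version keeps each input and expectation visible at the point of assertion.

```python
def apply_discount(price, code):
    """Toy function under test (hypothetical)."""
    if code == "SAVE10":
        return round(price * 0.9, 2)
    return price

# Abstracted style: a tiny test "DSL" that hides the inputs
# behind a helper call.
def check_discount(price, code, expected):
    assert apply_discount(price, code) == expected

def test_discounts_abstracted():
    check_discount(100.0, "SAVE10", 90.0)
    check_discount(100.0, "BOGUS", 100.0)

# Flat style: more verbose, but each test reads on its own.
def test_save10_reduces_price_by_ten_percent():
    assert apply_discount(100.0, "SAVE10") == 90.0

def test_unknown_code_leaves_price_unchanged():
    assert apply_discount(100.0, "BOGUS") == 100.0
```

The flat version costs a few extra lines, but nothing has to be looked up elsewhere to understand what a failing test was actually checking.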

Tests with factored-out builders, helpers, and utils are harder to understand. Factoring moves some logically dense parts out of the way in order to focus on others, but it doesn’t reduce the complexity, it just spreads it out. It reduces the reader’s ability to understand the test, because the brain has an upper limit on abstraction, and code is already abstract enough as it is.

The bottom-up approach of concrete examples is better. Something plain, even if verbose, lets us build the mental models of abstraction ourselves as we read each example. Something falls into place in your head when you see the similarities and differences in relation to each other, not when the differences are highlighted by abstracting away the similarities. Think of it as the difference between a relative bar graph and an absolute one: one highlights the delta, the other gives you some perspective on that delta. I’d take the perspective every time, because I can use context to figure out whether the delta is as important as the relative bar graph would suggest.

A couple of years ago I worked on some software that had very thorough tests that were abstracted and factored into horror shows. The code being tested was complex, and as the code organically grew year after year, the tests did too, but were abstracted each time, ending up with a testing system that was even more complicated than the live code it was attempting to test. Some tests were testing cases that would not, could not, and did not actually exist.

A good example of this type of abstraction is a builder pattern for constructing objects for services. You end up passing large objects back and forth through builders, effectively checking for a single property changing on each one. It either tests something very simple, or masks the complexity, leaving you with tests that don’t entirely document the reality of the underlying code. The solution to this sort of thing is usually just plain duplication of data. The tests are larger, but the complexity isn’t hidden or spread out. If the code is solving a real problem, there’s a minimum amount of complexity that has to exist for it to work. There’s also a minimum amount of code that needs to be in one place to test it. What works for the production path - abstraction, factoring, separation of concerns - doesn’t work for testing.
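To make the builder-vs-duplication trade-off concrete, here is a hypothetical sketch (the `Order` type, `shipping_cost` function, and default values are all invented for illustration). The first test tweaks one field on a shared default object; the second spells out every field, duplicating the data:

```python
from dataclasses import dataclass, replace

@dataclass
class Order:
    customer: str
    items: int
    total: float
    express: bool

def shipping_cost(order):
    """Toy function under test (hypothetical)."""
    if order.express:
        return 25.0
    return 0.0 if order.total >= 100.0 else 10.0

# Builder-ish style: a shared default, with tests overriding
# a single property on it.
DEFAULT_ORDER = Order(customer="acme", items=3, total=120.0, express=False)

def test_express_shipping_builder_style():
    # To trust this test, the reader must go look up what
    # DEFAULT_ORDER contains (does total matter here? items?).
    order = replace(DEFAULT_ORDER, express=True)
    assert shipping_cost(order) == 25.0

# Plain duplication: every field is spelled out in the test itself.
def test_express_shipping_plain_duplication():
    order = Order(customer="acme", items=3, total=120.0, express=True)
    assert shipping_cost(order) == 25.0
```

Both tests pass, but only the second documents its full input. If `DEFAULT_ORDER` drifts over time, the builder-style test silently changes what it's actually exercising.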

code | testing