Write tests for people

A human-oriented sales pitch for getting your team on board with writing more tests.

A lot has been said about the virtues of automated software testing, such as helping you write less error-prone and more modular code. While these things can be true, I think the biggest value of automated tests is that they are one of the most useful tools you can have for sustainably growing an engineering team. In fact, I think a robust test suite is a prerequisite for growth.

Case study: a transition to state machine transitions

In a recent role, an engineer who had just joined my team identified a nasty bit of front-end logic that computed some state from a number of different inputs, all of which had to be fetched from an API. This engineer suggested that we instead precompute this state on the server using a finite-state machine, which would greatly improve load times and remove some really gnarly, mind-bending tech debt.

This was an objectively good idea, so they immediately got started on a proposal doc laying out all of the changes needed to integrate this new state machine into our existing services. This doc was then peer-reviewed, revised, and approved by other members of our own team and by engineers from related teams. The engineer then got to work, and shipped the change after hours of thoroughly testing the new state machine in our local and staging environments.

Unfortunately, after a couple of days of running this code in production, serious issues began cropping up that were impacting the business financially (in a negative way), despite our valiant efforts to avoid this situation with documentation, peer review, and manual testing. Initially we were perplexed as to how we could have missed these issues in our many hours of testing, but then we realized that most (if not all) of the bugs we were seeing were located in services that our own team was unfamiliar with.

This ended up becoming a common theme as the size of our codebase and team grew: engineers became more likely to introduce bugs into systems they had little to no knowledge of while working on seemingly unrelated systems. This was especially true for newer members, who had no pre-existing knowledge about the inner workings of existing systems and how they were all interrelated.

And even if you were a tenured member of the team, the codebase had grown to the point where it was impossible to hold it all in your head and know where you should be testing when opening a PR.

One thing all of these failing systems had in common was a lack of any type of meaningful test coverage. This was because we had consciously and collectively decided not to impose a testing requirement for merging PRs in the early days of the engineering org, in an effort to… ship things faster. This was fine when the team was composed of just a handful of engineers and we were all aware of what everyone else was working on, but ended up coming back to haunt us as the team and the code we were responsible for maintaining increased in size.

Tests are a safety net

Tests can help you ship less buggy code, but I think they are even more valuable as a tool for documenting and enforcing the desired behavior of a system, and for preventing engineers from erroneously changing that behavior.

This creates a safety net that is especially valuable to newer teammates, who need to get work done but are unsure of what exactly they need to change to get things working properly: the tests will tell them what systems their change would affect. The same applies to more experienced team members as the size of the codebase grows past the point of being able to comprehend it in its entirety at any given time.
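To make that concrete, here is a minimal sketch of tests that document and enforce a state machine's rules. Everything in it is hypothetical: the states, the transition table, and the function names are invented for illustration and are not taken from the system in the case study above.

```python
from enum import Enum


class State(Enum):
    # Hypothetical states, invented for this example.
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"


# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED_TRANSITIONS = {
    State.DRAFT: {State.SUBMITTED},
    State.SUBMITTED: {State.APPROVED, State.REJECTED},
    State.APPROVED: set(),
    State.REJECTED: {State.DRAFT},
}


def transition(current: State, target: State) -> State:
    """Return the target state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def test_submitted_can_be_approved():
    assert transition(State.SUBMITTED, State.APPROVED) is State.APPROVED


def test_approved_is_terminal():
    # Documents a rule a newcomer might not know: approved items never change.
    for target in State:
        try:
            transition(State.APPROVED, target)
        except ValueError:
            continue
        raise AssertionError(f"approved unexpectedly moved to {target.value}")
```

If someone working on an unrelated feature later adds a transition out of APPROVED, test_approved_is_terminal fails and points them straight at the rule they changed, whether or not they knew the rule existed. The functions follow the test_ naming convention that runners such as pytest pick up, but they can also be called directly.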

Tests can also be valuable to engineers who are earlier in their careers and may be less confident in their abilities. In the absence of a good test suite, they might view shipping a critical bug as a failing on their part, when it is in fact a failing of the engineers who came before them to follow good engineering practices and put systems in place that reduce the chances of things blowing up in the first place.

Tests are just one component of a fully fledged quality assurance program and will never completely prevent bugs from cropping up, but they definitely do help, and will pay massive dividends as you start to bring on more folks.