The Myth of Code Coverage

One question I often ask software engineering candidates is what percentage of code coverage an ideal project should have. Interestingly enough, many of them shoot for the sky with numbers beyond 90%. They start preaching about how well-tested code is more reliable and brings more value to all stakeholders. When I ask about the projects they currently work on, the reality looks a bit more down to earth.

Want to know my answer? I'd say it's around 66.7%.

The reasons for that vary, but allow me to be blunt and say that 1/3 of the code in every software project is irrelevant, buggy, overly complicated, or simply sucks. It has a reason to be where it is, but chances are that one year down the road it will become a liability. Being dogmatic about tests and covering every line will only make it more difficult to get rid of it.

See, no two lines of code are equal in value and importance. Adding a new feature to an existing application affects its capabilities only marginally. However, it takes a disproportionately large amount of time to develop and integrate due to the existing complexity. The greater the complexity, the longer it takes to introduce new functionality. By the time the feature finds itself in production, it may well already be irrelevant.

The only sure-fire way to improve code coverage (and thereby keep software relevant) is to identify and remove the unnecessary code.
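To see why deleting code moves the number, here is a minimal sketch of the arithmetic. The project size and line counts are made up for illustration; the only assumption is that coverage is measured as covered lines divided by total lines.

```python
def coverage(covered_lines: int, total_lines: int) -> float:
    """Line coverage as a percentage of all lines."""
    return 100.0 * covered_lines / total_lines


# Hypothetical project: 900 lines, 500 of them exercised by tests.
before = coverage(500, 900)

# Remove 150 lines of unnecessary, untested code. No new tests are written.
after = coverage(500, 900 - 150)

print(f"before: {before:.1f}%, after: {after:.1f}%")
# prints "before: 55.6%, after: 66.7%"
```

Not a single test was added, yet the coverage went up, and there is less code left to maintain.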

How do you identify irrelevant code? Don't search for it. Instead, let it reveal itself to you. One dimension of software that few teams make good use of is its history. Git is a great analysis tool. Start using it not only to prevent future problems, but also to understand where and how often certain parts of the code change over time.
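As a rough sketch of what such an analysis could look like, the snippet below counts how many commits touched each file. It assumes the input is the output of `git log --name-only --pretty=format:`, which lists one changed path per line with blank lines between commits. The function names and the sample log are hypothetical, made up for this illustration.

```python
import subprocess
from collections import Counter


def count_changes(log_output: str) -> Counter:
    """Count how many times each file path appears in the log output."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def git_change_counts(repo_path: str = ".", since: str = "1 year ago") -> Counter:
    """Ask git for the files changed per commit and count them per path."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_changes(result.stdout)


# Made-up log output standing in for a real repository history.
sample_log = "src/pricing.py\nconfig/app.yml\n\nsrc/pricing.py\n\nsrc/core.py\n"
for path, changes in count_changes(sample_log).most_common():
    print(f"{changes:3d}  {path}")
```

On a real repository, calling `git_change_counts()` from the project root and printing `most_common(10)` would give a first impression of where the churn is.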

Chances are, you will find pieces of code that have undergone far fewer changes than the rest over long periods. Those are the pillars of your application: the 2/3 that must be well-tested.

You will also find others where changes occur more or less every week. Ask yourselves whether those are still relevant, from both a technical and a business perspective. Adding tests to them for the sake of coverage would lower the code quality rather than raise it. In a perfectly designed software project, the part that is allowed to change most often is the configuration. A simple analysis of the code change frequency would show whether that is the case. If it isn't, try separating the moving parts from the core logic. If this is not feasible either, those portions of the code most probably don't belong in the codebase anyway. Turning them into interchangeable scripts (even stored in a separate repository) is one way of tackling them in their own right.
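One way to check whether the configuration really is the part that changes most is to compute its share of all recorded file changes. The sketch below assumes you already have a mapping from file path to number of changes. The list of configuration suffixes and the sample numbers are assumptions for illustration, so adapt them to your own project.

```python
from pathlib import PurePosixPath
from typing import Mapping

# Hypothetical convention: files with these suffixes count as configuration.
CONFIG_SUFFIXES = {".yml", ".yaml", ".json", ".toml", ".ini"}


def config_share(change_counts: Mapping[str, int]) -> float:
    """Return the fraction of all file changes that hit configuration files."""
    total = sum(change_counts.values())
    if total == 0:
        return 0.0
    config_changes = sum(
        changes
        for path, changes in change_counts.items()
        if PurePosixPath(path).suffix in CONFIG_SUFFIXES
    )
    return config_changes / total


# Made-up change counts for three files.
sample_counts = {
    "config/app.yml": 30,
    "src/core.py": 5,
    "src/pricing_rules.py": 25,
}
print(f"{config_share(sample_counts):.0%} of changes touched configuration")
# prints "50% of changes touched configuration"
```

In this made-up example, `src/pricing_rules.py` changes almost as often as the configuration, which makes it a candidate for being separated from the core logic.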

Let's not get too much into technical details. I have already alluded to the work of Adam Tornhill on code analysis in a previous post of mine:

Use the Git History to Identify Pain Points in Any Project
Have you heard of Adam Tornhill [https://twitter.com/AdamTornhill]’s work? If not, I highly recommend that you set some time aside and check out Your Code as a Crime Scene [https://amzn.to/32DM1G9] or Software Design X-Rays [https://amzn.to/2vtbjdR…

In a future post, I will discuss some of the new ideas I applied to the simple git one-liner I presented there.