Code Review Limbo: How Low Should You Go?
Code reviews have been a standard and accepted part of software development for decades, and, despite the vast changes in software, supporting tools, and methodologies, one question has survived: how thorough do we need to be when reviewing code changes? Is a cursory skim sufficient, does it require a methodical analysis of all changes, or do I need to pick through it like a monkey looking for ticks?
Budgets are too tight to be burned on inflexible “one-size-fits-all” processes, and we cannot afford ineffective reviews that let obvious bugs slip through. Whether we are staring down a large Pull Request (PR) or a simple change, there should be no rigid methodology for what a code review is or how it should be conducted. All code changes, developers, projects and reviewers are not the same, and, therefore, all code reviews will not be the same. In this article, I will explore the diversity of factors I have used over the years to keep our reviews limber and help us answer the question: “how low should you go?”
Nature of the Change
At the top of the list is the nature of the change, measured in four ways: size, risk, testability, and complexity.
A trivial label change will typically garner much less scrutiny than rewriting a search engine’s primary algorithm. While most code reviews lie between these extremes, the hyperbole illustrates the point: bigger changes require more attention. While a brief diff might suffice for minor changes, where the impact can be absorbed quickly and easily, larger PRs usually require more in-depth analysis to usefully assess their impact, even including running the code, and/or running the associated unit tests.
Changes to safety-critical code call for more intense scrutiny, e.g. code used in healthcare decision making, air traffic control or other life and death scenarios. Even the most trivial change with a low probability of error still requires deeper examination than normal due to the extreme cost of an error. When the risk is high, we must "go low” in our code review.
Not all code changes are easily tested by a QA team. Examples of this are error handling when unexpected database exceptions happen mid-transaction and duplicate processes running concurrently (when they shouldn’t be). These, and many others like them, can be important technical considerations to ensure the integrity of the system, but may not be realistic for QA to test. Knowing this means that the engineering team must take extra responsibility for the correctness of the code, and that includes the code review process “going lower” than it might normally.
But it isn’t just size that matters. While complexity usually correlates to the size of a PR, a small volume of diffs can also contain dense logic that is not easily digested at a glance. A good rule of thumb: if it takes more than a first look to absorb what the code is doing, then it will require more than a passing look to review the changes in any useful sense. You cannot effectively review what you do not understand. And that leads to another type of review.
Some code simply cannot be reviewed in the traditional, meaningful way. When the Wizard of Oz (aka the Architect) drops a two-inch stack of paper on your desk containing a new subsystem that needs to be reviewed, it is time to get innovative. And yes, this has happened to me.
This is the time to “go high” and do an old fashioned “walkthrough,” where the author provides a high-level overview of the code, explains the design, organization of modules, logic flow, etc. Apart from extreme circumstances, or if your project has no time constraints and you have infinite patience, it is not realistic to inspect every line of this code in a meaningful way.
The primary goal of reviewing code like this is cross-training. Any detailed review that happens along the way is a bonus. At best, you may be able to identify a few critical areas that require greater attention, e.g. safety-critical code. We can wring our hands that this type of review should not happen, that PRs should be small, etc. but, when you find yourself staring down a menacing pull request, you need a battle plan.
With the above factors in mind, another significant influence on how to review a PR is the developer who made the change. Changes from a junior developer, or someone new to the project, may warrant more careful attention, especially for errors of omission. It is common for such developers to not understand the broader context of the system, how their change might interact with other subsystems and modules, or even forget or not fully understand the full nature of requirements. A code review here is an opportunity to not only ensure the integrity of the change, but to help educate the developer and broaden their understanding of the system.
A senior developer, who is an expert in the system, generally has earned a higher level of trust and will not be as prone to errors of omission. Of course, an inspection still is required, but typically it would be overkill to pick through each line with the same intensity as in the previous example. There is an exception worth exploring, however, what I call the sniff test.
The Sniff Test basically says, if the code under review has a funky smell, it is time to start digging for problems. The funkier the smell, the deeper the dig. Things that get my nose twitchy include:
- If one change shows carelessness or lack of understanding, then it is reasonable to suspect other code was done poorly.
- If one requirement was misunderstood and implemented wrongly, that could reflect a related problem in other changes as well.
- How do the automated tests look? Code that is poorly tested is often poorly written, both showing a lack of attention to detail or comprehension.
- Does the code demonstrate adherence to project standards and conventions? If not, are there other more functional ways that the code does not conform?
- Does this developer have a demonstrated history of buggy code?
To a large extent, the depth of scrutiny we apply during a code review reflects our trust in the developer and their change. Each of the above areas, and those like them, erode that trust. Even the most senior, expert, trusted person has off days, and any of the above should be regarded as a warning flag.
The final factor that may influence how much time and effort is given to a code review is the number of safety nets the project has. Specifically, how many quality safeguards are in place, and, therefore, how much is this review depended on for ensuring the quality of the release.
Did the developer add thorough and meaningful unit tests, designed to execute both standard functional areas, as well as unlikely or difficult to test corner cases? Does this project even make use of automated testing? If not, has the developer provided solid manual testing instructions and provided evidence of their own testing?
I once worked on a complex enterprise system that had tens of thousands of automated unit tests, complete with a Continuous Integration (CI) build. When code was checked in you knew within an hour or so if it broke anything. Over time these tests had proven they were worth their weight in gold and provided an incredible safety net. Because of the complex nature of the larger system, it was difficult to see every possible impact of a change. The full range of protection provided by the tests made it easier to focus code review energies on areas that were more meaningful and reasonable to inspect.
Similarly, what kind of QA testing will be done for this product? Strong QA will provide the same confidence that a CI build does. It does not exempt the reviewer from doing a thorough job, but with time as a precious commodity, it allows a more strategic and focused use of time during review.
Unit testing, CI builds, QA: each of these indicate how much the code review is relied upon to provide the safety for the changes, and therefore, can help us know, “how low do we go?”
Code reviews are a critical part of the software development process, but can be performed in an infinite number of ways. This wide range of methods and strategies make it necessary to carefully consider the best use of time to ensure maximum effectiveness without draining the entire budget on rigid, but not helpful, busywork. The nature of the change, the developer, other safety nets and the all-important sniff test can help you effectively budget your time and effort as you decide, “how low will you go.”