Code Reviews – How to Determine the Effectiveness of a Code Review Process?

code-reviews, measurement, metrics, quality

We've introduced a code review process within our organisation and it seems to be working well. However, I would like to be able to measure the effectiveness of the process over time, i.e. are we not finding bugs because the code is clean or are people just not picking up on bugs?

Currently, we don't have an effective, fully automated test process. We rely primarily on manual testing, so we can't use the defects found at that stage to confirm that the code review process is working.

Has anyone come across this issue before or has any thoughts on what works well in measuring code reviews?

Best Answer

There are a number of metrics that can be gathered from code reviews, some even extending throughout the lifecycle of the project.

The first metric that I would recommend gathering is defect removal effectiveness (DRE). For every defect, you identify the phase in which it was introduced and the phase in which it was removed. All of your defect detection techniques are assessed together, so DRE applies equally to requirements reviews, design reviews, code reviews, unit tests, and so on. You would be particularly interested in the number of defects caught in the code phase, since this would probably encompass your unit tests as well as your code reviews. If many defects introduced in the code phase are making it through to the integration test phase or even the field, you know that your post-coding practices should be evaluated.
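To make the bookkeeping concrete, here is a minimal sketch of per-phase DRE: a defect counts against a phase if it was still present when that phase ran, and is credited to the phase that removed it. The phase names and the defect log are made-up examples, not data from any real project.

```python
# Hypothetical defect log: (phase introduced, phase removed).
defects = [
    ("requirements", "design"),
    ("design", "code"),
    ("code", "code"),
    ("code", "code"),
    ("code", "integration"),
    ("code", "field"),
]

PHASES = ["requirements", "design", "code", "integration", "field"]
ORDER = {p: i for i, p in enumerate(PHASES)}

def removal_effectiveness(defects, phase):
    """Fraction of defects present during `phase` that were removed in it."""
    i = ORDER[phase]
    # Present = introduced at or before this phase, not removed earlier.
    present = [d for d in defects if ORDER[d[0]] <= i and ORDER[d[1]] >= i]
    removed = [d for d in present if d[1] == phase]
    return len(removed) / len(present) if present else 1.0

print(removal_effectiveness(defects, "code"))  # 3 of 5 live defects removed -> 0.6
```

With this log, 0.6 for the code phase means two defects escaped past code review and unit testing into integration or the field, which is exactly the signal the DRE analysis above is after.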

Various meeting metrics would also be relevant. These include time to prepare, time in the meeting, lines of code read, defects found in the review, and so on. Several observations can be made from this data. For example, if your reviewers spend a large amount of time reading the code in preparation for the review but find very few problems, and the DRE data shows defects surfacing in integration testing or the field, then your team needs to focus on its review techniques to find those problems earlier. Another interesting comparison is the lines of code (or some other size measurement) covered in a meeting against the length of the meeting. Research has found that the speed of a typical code review is around 150 lines of code per hour.
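As a sketch of how these meeting metrics might be combined, the snippet below computes review rate and defect density per meeting and flags reviews running far faster than the 150 LOC/hour benchmark mentioned above. The meeting records and the 2x flagging threshold are illustrative assumptions, not a recommendation from the answer.

```python
# Hypothetical meeting records: (name, lines read, minutes spent, defects found).
meetings = [
    ("module_a", 300, 120, 6),
    ("module_b", 900, 60, 1),
]

TYPICAL_RATE = 150  # LOC/hour, the benchmark cited in the text

def review_stats(lines, minutes, defects):
    """Return (review rate in LOC/hour, defect density in defects/KLOC)."""
    rate = lines / (minutes / 60.0)
    density = defects / lines * 1000.0
    return rate, density

for name, lines, minutes, defects in meetings:
    rate, density = review_stats(lines, minutes, defects)
    # An arbitrary 2x-typical cutoff for "probably skimming" -- tune to taste.
    flag = " (much faster than typical; may be skimming)" if rate > 2 * TYPICAL_RATE else ""
    print(f"{name}: {rate:.0f} LOC/hour, {density:.1f} defects/KLOC{flag}")
```

A review covering 900 lines in an hour with one defect found is the pattern to investigate: either the code is unusually clean, or the reviewers are moving too fast to catch anything.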

With any of these metrics, it's then important to understand their impact on the process. Root cause analysis, using techniques such as why-because, Five Whys, or Ishikawa diagrams can be used to identify the reasons why code reviews (or any other quality improvement technique) are (in)effective.

You might also be interested in this article about inspections from The Ganssle Group and an article by Capers Jones in Crosstalk about Defect Potentials and DRE.
