Quality Gates – Concept and Application in Software Testing

code-quality, functional-testing, programming-practices, testing, unit-testing

We are using SonarQube for code quality testing. It assesses the quality of the code, not the function of the code. It has the concept of quality gates: you can, for instance, set a 90% quality gate, meaning that anything scoring above 90% is considered a pass.

Some folks here like this idea and have decided to apply it to functional and unit tests. After running our functional and unit tests, we check what percentage passed and promote the code to the next environment if a high enough percentage of tests pass. For the code to be promoted to production, the pass percentage must be 100%.
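To make the scheme concrete, here is a rough sketch of the rule as I understand it; the stage names and thresholds are invented for illustration:

```python
# Illustrative sketch of the promotion rule described above; the stage names
# and thresholds are made up for the example.
PROMOTION_THRESHOLDS = {
    "test": 0.90,        # 90% of tests must pass to reach the test environment
    "staging": 0.95,     # 95% to reach staging
    "production": 1.00,  # production requires every test to pass
}


def may_promote(target_stage: str, passed: int, total: int) -> bool:
    """Return True if the pass ratio meets the threshold for the target stage."""
    if total == 0:
        return False
    return passed / total >= PROMOTION_THRESHOLDS[target_stage]


print(may_promote("staging", 96, 100))     # True: 96% clears the 95% gate
print(may_promote("production", 99, 100))  # False: anything short of 100% blocks release
```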

To me, the tests themselves are the quality gate. Tests should never fail. If tests are failing, a risk has been introduced into the application and it must be fixed right away.

I'm struggling to see a valid argument for requiring that only a certain percentage of functional and unit tests pass as the code travels through our different environments en route to production. Can anyone provide one?

Best Answer

A test suite should only pass if all tests pass. Otherwise, the tests become worthless: which failure is important, and which failure can be ignored? The result would be that, after a while, all test failures get ignored. Bad.

There is one exception to this: a test suite may contain tests that are known to fail, as the necessary functionality has yet to be implemented or the bug has yet to be fixed. Such tests are valuable because they clearly document a bug. But because their failure would not be a regression, their failure should not fail the whole test suite (on the contrary, if they start to pass that would indicate your test suite isn't up to date with your code). Ideally, your test framework has a concept of such “TODO tests”.
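As one concrete example (not something from the question), pytest exposes this idea as an `xfail` marker. The function and test below are purely illustrative:

```python
import pytest


def reverse_words(sentence: str) -> str:
    """Planned helper; the real implementation does not exist yet."""
    raise NotImplementedError


@pytest.mark.xfail(reason="reverse_words is not implemented yet", strict=True)
def test_reverse_words_swaps_order():
    # Documents the desired behaviour today without failing the suite.
    # With strict=True, an unexpected pass is reported as a failure, forcing
    # the marker to be removed once the feature lands, which keeps the suite
    # in sync with the code.
    assert reverse_words("hello world") == "world hello"
```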

Quality metrics are a different beast. If a quality metric crosses a threshold, that indicates that something is probably but not necessarily ripe for a refactoring. But some “violations” may be OK in the context of that code. As long as certain code regions can be excluded from specific analysis tools, gating on that quality metric is OK. Obviously, any explicit exclusion would be a red flag in a code review and subject to extra scrutiny, but keeping such an escape hatch open for exceptional circumstances is important.
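SonarQube itself offers such escape hatches: as far as I know, a `NOSONAR` marker comment suppresses issues on that line, and the `sonar.exclusions` analysis property can exclude whole files from analysis. A purely illustrative example:

```python
import hashlib


def legacy_checksum(data: bytes) -> str:
    # MD5 is normally flagged as a weak hash, but here it only has to match
    # checksums produced by an existing legacy system; it is not used for
    # security. The suppression itself should be justified in code review.
    return hashlib.md5(data).hexdigest()  # NOSONAR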

In particular, the idea of requiring increasing quality as an artefact travels through the release pipeline is not necessarily good. Where does the necessary quality increment come from? From the devs who improve the code and re-submit a new artefact into the pipeline. Since the quality level needed to traverse the whole pipeline is known beforehand, submitting any artefact below that level is a waste of time. So why are you doing it? Likely because the stages in the pipeline provide feedback on your program that is useful before the main release. To get this feedback, you have to submit the code even when you have no intention of it making it through the pipeline. Again, false negatives are bad. Such a workflow is unsuitable for a pipeline model; the feedback should be available independently.

That does not mean you should give up on quality gating. But if your target is a 100% metric for a release, the current quality metric becomes a progress indicator for your project, like a burn-down chart for technical debt.