Are unit tests and acceptance tests enough?

Tags: acceptance-testing, automated-tests, integration-testing, unit testing

If I have unit tests for each class and/or member function and acceptance tests for every user story do I have enough tests to ensure the project functions as expected?

For instance if I have unit tests and acceptance tests for a feature do I still need integration tests or should the unit and acceptance tests cover the same ground? Is there overlap between test types?

I'm talking about automated tests here. I know manual testing is still needed for things like ease of use, etc.

Best Answer

I'd recommend reading chapters 20 - 22 in the 2nd edition of Code Complete. It covers software quality very well.

Here's a quick breakdown of some of the key points (all credit goes to McConnell, 2004):

Chapter 20 - The Software-Quality Landscape:

No single defect-detection technique is completely effective by itself
The earlier you find a defect, the less intertwined it will become with the rest of your code and the less damage it will cause

Chapter 21 - Collaborative Construction:

Collaborative development practices tend to find a higher percentage of defects than testing and to find them more efficiently
Collaborative development practices tend to find different kinds of errors than testing does, implying that you need to use both reviews and testing to ensure the quality of your software
Pair programming typically costs about the same as inspections and produces code of similar quality

Chapter 22 - Developer Testing:

Automated testing is useful in general and is essential for regression testing
The best way to improve your testing process is to make it regular, measure it, and use what you learn to improve it
Writing test cases before the code takes the same amount of time and effort as writing the test cases after the code, but it shortens defect-detection-debug-correction cycles (test-driven development)

As far as how you are formulating your unit tests, you should consider basis testing, data-flow analysis, boundary analysis etc. All of these are explained in great detail in the book (which also includes many other references for further reading).
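As a rough illustration of boundary analysis (this is my own sketch, not an example from the book): the idea is to test just below, on, and just above every limit in the code. The `shipping_fee` function and its limits are hypothetical, made up only to have some boundaries to probe.

```python
def shipping_fee(weight_kg: float) -> float:
    """Hypothetical rule: free up to 1 kg, flat 5.0 up to 20 kg, rejected otherwise."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= 1:
        return 0.0
    if weight_kg <= 20:
        return 5.0
    raise ValueError("parcel too heavy")


def boundary_cases():
    """Inputs just below, on, and just above each limit, with the expected outcome."""
    return [
        (0.0, ValueError),    # on the lower limit: rejected
        (0.01, 0.0),          # just above the lower limit
        (1.0, 0.0),           # on the free-shipping limit
        (1.01, 5.0),          # just above the free-shipping limit
        (20.0, 5.0),          # on the maximum weight
        (20.01, ValueError),  # just above the maximum weight
    ]


def run_boundary_cases():
    """Return the list of (input, expected, actual) triples that did not match."""
    failures = []
    for weight, expected in boundary_cases():
        try:
            actual = shipping_fee(weight)
        except ValueError:
            actual = ValueError
        if actual != expected:
            failures.append((weight, expected, actual))
    return failures
```

An off-by-one mistake such as writing `<` instead of `<=` would show up as a non-empty list from `run_boundary_cases()`.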

Maybe this isn't exactly what you were asking, but I would say automated testing is definitely not enough of a strategy. You should also consider such things as pair programming, formal reviews (or informal reviews, depending on the size of the project) and test scaffolding along with your automated testing (unit tests, regression testing etc.).

Related Solutions

The difference between integration and unit tests

The key difference, to me, is that integration tests reveal if a feature is working or is broken, since they stress the code in a scenario close to reality. They invoke one or more software methods or features and test if they act as expected.

By contrast, a unit test that tests a single method relies on the (often wrong) assumption that the rest of the software works correctly, because it explicitly mocks every dependency.

Hence, when a unit test for a method implementing some feature is green, it does not mean the feature is working.

Say you have a method somewhere like this:

public SomeResults DoSomething(someInput) {
  var someResult = [Do your job with someInput];
  Log.TrackTheFactYouDidYourJob();
  return someResult;
}

DoSomething is very important to your customer: it's a feature, the only thing that matters. That's why you usually write a Cucumber specification asserting it: you want to verify and communicate whether the feature is working.

Feature: To be able to do something
  In order to do something
  As someone
  I want the system to do this thing

Scenario: A sample one
  Given this situation
  When I do something
  Then what I get is what I was expecting

No doubt: if the test passes, you can assert you are delivering a working feature. This is what you can call Business Value.

If you want to write a unit test for DoSomething, you should pretend (using some mocks) that the rest of the classes and methods are working (that is, that all the dependencies the method uses work correctly) and assert that your method is working.

In practice, you do something like:

public SomeResults DoSomething(someInput) {
  var someResult = [Do your job with someInput];
  FakeAlwaysWorkingLog.TrackTheFactYouDidYourJob(); // Using a mock Log
  return someResult;
}

You can do this with Dependency Injection, or some Factory Method or any Mock Framework or just extending the class under test.
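Here is a minimal sketch of the dependency-injection variant, written in Python rather than the pseudocode above. The `Worker` class, the snake_case method names, and the `upper()` call standing in for [Do your job with someInput] are all my assumptions for illustration.

```python
class FakeAlwaysWorkingLog:
    """Test double standing in for the real Log: it records calls and never fails."""

    def __init__(self):
        self.calls = 0

    def track_the_fact_you_did_your_job(self):
        self.calls += 1


class Worker:
    """Hypothetical class owning DoSomething; the log is injected, not created inside."""

    def __init__(self, log):
        self.log = log

    def do_something(self, some_input):
        some_result = some_input.upper()  # placeholder for [Do your job with someInput]
        self.log.track_the_fact_you_did_your_job()
        return some_result


def test_do_something_in_isolation():
    fake_log = FakeAlwaysWorkingLog()
    worker = Worker(fake_log)
    # Only the method's own job is checked; the real Log is never involved.
    assert worker.do_something("abc") == "ABC"
    assert fake_log.calls == 1
```

Production code would pass the real log to `Worker`, and the test passes the fake, so the method body itself does not change between the two.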

Suppose there's a bug in Log.TrackTheFactYouDidYourJob(). Fortunately, the Gherkin spec will find it and your end-to-end tests will fail.

The feature won't work, because Log is broken, not because [Do your job with someInput] is not doing its job. And, by the way, [Do your job with someInput] is the sole responsibility of that method.

Also, suppose Log is used in 100 other features, in 100 other methods of 100 other classes.

Yep, 100 features will fail. But, fortunately, 100 end-to-end tests are failing as well and revealing the problem. And, yes: they are telling the truth.

It's very useful information: I know I have a broken product. It's also very confusing information: it tells me nothing about where the problem is. It shows me the symptom, not the root cause.

Yet, DoSomething's unit test is green, because it's using a fake Log, built to never break. And, yes: it's clearly lying. It's reporting that a broken feature is working. How can it be useful?

(If DoSomething()'s unit test fails, be sure: [Do your job with someInput] has some bugs.)
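To make the green/red split concrete, here is a small Python sketch. `BrokenLog`, the `passes` helper, and the use of `sorted` as the method's "job" are hypothetical stand-ins I made up; the point is only that the same method gives a green isolated check and a red end-to-end check.

```python
class BrokenLog:
    """Stands in for the real Log with a bug in it (assumed for illustration)."""

    def track_the_fact_you_did_your_job(self):
        raise RuntimeError("log storage unavailable")


class FakeAlwaysWorkingLog:
    """Mock log built to never break."""

    def track_the_fact_you_did_your_job(self):
        pass


def do_something(some_input, log):
    some_result = sorted(some_input)  # placeholder for [Do your job with someInput]
    log.track_the_fact_you_did_your_job()
    return some_result


def passes(check):
    """Run a check and report True (green) or False (red)."""
    try:
        check()
        return True
    except Exception:
        return False


def unit_check():
    # Isolated: the dependency is mocked, so only the method's own job is tested.
    assert do_something([3, 1, 2], FakeAlwaysWorkingLog()) == [1, 2, 3]


def end_to_end_check():
    # Realistic: the real (here, broken) dependency is wired in.
    assert do_something([3, 1, 2], BrokenLog()) == [1, 2, 3]


results = {"unit": passes(unit_check), "end_to_end": passes(end_to_end_check)}
```

Here `results["unit"]` is green and `results["end_to_end"]` is red, even though the method under test is identical in both checks.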

Suppose a system has one broken class.

A single bug will break several features, and several integration tests will fail.

On the other hand, the same bug will break just one unit test.

Now, compare the two scenarios:

All your features using the broken Log are red
All your unit tests are green; only the unit test for Log is red

Actually, the unit tests for all modules that use the broken dependency are green because, by using mocks, they removed their dependencies. In other words, they run in an ideal, completely fictional world. And this is the only way to isolate bugs and track them down. Unit testing means mocking. If you aren't mocking, you aren't unit testing.

The difference

Integration tests tell what's not working. But they are of no use in guessing where the problem could be.

Unit tests are the only tests that tell you exactly where the bug is. To obtain this information, they must run the method in a mocked environment, where all other dependencies are assumed to work correctly.

That's why I think that your sentence "Or is it just a unit test that spans 2 classes" is somewhat misplaced. A unit test should never span 2 classes.

This reply is basically a summary of what I wrote here: Unit tests lie, that's why I love them.

Random data in Unit Tests

There's a compromise. Your coworker is actually onto something, but I think he's doing it wrong. I'm not sure that totally random testing is very useful, but it's certainly not invalid.

A program (or unit) specification is a hypothesis that there exists some program that meets it. The program itself is then evidence of that hypothesis. What unit testing ought to be is an attempt to provide counter-evidence to refute that the program works according to the spec.

Now, you can write the unit tests by hand, but it really is a mechanical task. It can be automated. All you have to do is write the spec, and a machine can generate lots and lots of unit tests that try to break your code.

I don't know what language you're using, but see here:

Java http://functionaljava.org/

Scala (or Java) http://github.com/rickynils/scalacheck

Haskell http://www.cs.chalmers.se/~rjmh/QuickCheck/

.NET: http://blogs.msdn.com/dsyme/archive/2008/08/09/fscheck-0-2.aspx

These tools will take your well-formed spec as input and automatically generate as many unit tests as you want, with automatically generated data. They use "shrinking" strategies (which you can tweak) to find the simplest possible test case to break your code and to make sure it covers the edge cases well.
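If you want to see the idea without installing anything, here is a hand-rolled sketch in Python using only the standard library. It is not the API of any of the tools above; `buggy_dedupe`, `prop_no_duplicates`, `shrink`, and `find_counterexample` are names I invented, and the shrinking strategy (drop one element at a time) is far simpler than what the real libraries do.

```python
import random


def buggy_dedupe(items):
    """Hypothetical code under test: meant to drop duplicates, keeping first occurrences."""
    result = []
    for i, item in enumerate(items):
        if i == 0 or item != items[i - 1]:  # bug: only compares with the previous element
            result.append(item)
    return result


def prop_no_duplicates(items):
    """The spec as a property: the output contains each value at most once."""
    out = buggy_dedupe(items)
    return len(out) == len(set(out))


def shrink(items):
    """Yield simpler candidates: the list with one element removed."""
    for i in range(len(items)):
        yield items[:i] + items[i + 1:]


def find_counterexample(prop, trials=500, seed=0):
    """Try random inputs; on the first failure, shrink it and return it."""
    rng = random.Random(seed)
    for _ in range(trials):
        case = [rng.randint(0, 3) for _ in range(rng.randint(0, 8))]
        if prop(case):
            continue
        # Shrink greedily: keep any smaller input that still breaks the property.
        shrunk = True
        while shrunk:
            shrunk = False
            for smaller in shrink(case):
                if not prop(smaller):
                    case, shrunk = smaller, True
                    break
        return case
    return None
```

Calling `find_counterexample(prop_no_duplicates)` should hand back a three-element list of the form `[x, y, x]`, which is the smallest input that exposes this particular bug under this shrinking strategy.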

Happy testing!
