Testing Data – How to Test When Data Arrangement Is Cumbersome

integration-tests, testing, unit testing

I am writing a parser, and as part of that I have an Expander class that "expands" a single complex statement into multiple simple statements. For example, it would expand this:

x = 2 + 3 * a

into:

tmp1 = 3 * a
x = 2 + tmp1
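For concreteness, here is a minimal sketch of what such an expansion could look like, together with a hand-arranged test input. The record shapes, the `Expand` method and the `tmpN` naming are all assumptions made for illustration; the real Expander and syntax tree classes are not shown in the question and may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal AST: names follow the question, shapes are assumed.
public enum BinaryOperator { Plus, Multiply }

public abstract record Expression;
public sealed record Constant(int Value) : Expression;
public sealed record Variable(string Name) : Expression;
public sealed record BinaryExpression(Expression Left, BinaryOperator Op, Expression Right) : Expression;
public sealed record AssignStatement(Variable Target, Expression Value);

// Hypothetical Expander: hoists each nested BinaryExpression into its own temporary.
public sealed class Expander
{
    private int _nextTemp = 1;

    public List<AssignStatement> Expand(AssignStatement statement)
    {
        var output = new List<AssignStatement>();
        // The top-level operation stays in place; only its operands are flattened.
        var value = statement.Value is BinaryExpression b
            ? new BinaryExpression(Flatten(b.Left, output), b.Op, Flatten(b.Right, output))
            : statement.Value;
        output.Add(new AssignStatement(statement.Target, value));
        return output;
    }

    // Returns a Constant or Variable, emitting temporary assignments as needed.
    private Expression Flatten(Expression e, List<AssignStatement> output)
    {
        if (e is not BinaryExpression b) return e;
        var simple = new BinaryExpression(Flatten(b.Left, output), b.Op, Flatten(b.Right, output));
        var temp = new Variable($"tmp{_nextTemp++}");
        output.Add(new AssignStatement(temp, simple));
        return temp;
    }
}

public static class Program
{
    public static void Main()
    {
        // Arrange: x = 2 + 3 * a, built by hand.
        var input = new AssignStatement(
            new Variable("x"),
            new BinaryExpression(
                new Constant(2),
                BinaryOperator.Plus,
                new BinaryExpression(new Constant(3), BinaryOperator.Multiply, new Variable("a"))));

        // Expected: tmp1 = 3 * a; x = 2 + tmp1
        var expected = new List<AssignStatement>
        {
            new(new Variable("tmp1"),
                new BinaryExpression(new Constant(3), BinaryOperator.Multiply, new Variable("a"))),
            new(new Variable("x"),
                new BinaryExpression(new Constant(2), BinaryOperator.Plus, new Variable("tmp1"))),
        };

        var actual = new Expander().Expand(input);
        Console.WriteLine(actual.SequenceEqual(expected)); // records compare structurally
    }
}
```

Using records in this sketch means expected and actual trees can be compared by value, so the Assert step stays short even when the Arrange step is verbose.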

Now I'm thinking about how to test this class, specifically how to Arrange the tests. I could manually create the input syntax tree:

var input = new AssignStatement(
    new Variable("x"),
    new BinaryExpression(
        new Constant(2),
        BinaryOperator.Plus,
        new BinaryExpression(new Constant(3), BinaryOperator.Multiply, new Variable("a"))));

Or I could write it as a string and parse it:

var input = new Parser().ParseStatement("x = 2 + 3 * a");

The second option is much simpler, shorter and more readable. But it also introduces a dependency on Parser, which means that a bug in Parser could make this test fail. So the test would stop being a unit test of Expander, and I guess it technically becomes an integration test of Parser and Expander.

My question is: is it okay to rely mostly (or completely) on this kind of integration test to test this Expander class?

Best Answer

You're going to find yourself writing a lot more tests, of much more complicated, interesting, and useful behavior, if you can do so simply. So the option that involves

var input = new Parser().ParseStatement("x = 2 + 3 * a");

is quite valid. It does depend on another component. But everything depends on dozens of other components. If you mock something to within an inch of its life, you're probably depending on a lot of mocking features and test fixtures.

Developers sometimes over-focus on the purity of their unit tests, or on developing unit tests and unit tests only, without any module, integration, stress or other kinds of tests. All those forms are valid and useful, and they're all the proper responsibility of developers, not just of QA or operations personnel further down the pipeline.

One approach I've used is to start with these higher-level runs, then use the data they produce to construct the long-form, lowest-common-denominator expression of the test. E.g. when you dump the data structure from the input produced above, you can easily construct the:

var input = new AssignStatement(
    new Variable("x"),
    new BinaryExpression(
        new Constant(2),
        BinaryOperator.Plus,
        new BinaryExpression(new Constant(3), BinaryOperator.Multiply, new Variable("a"))));

kind of test that tests at the very lowest level. That way you get a nice mix: a handful of the most basic, primitive tests (pure unit tests), without having spent a week writing tests at that primitive level. That leaves you the time to write many more, slightly less atomic tests using the Parser as a helper. End result: more tests, more coverage, more corner cases and other interesting cases, better code and higher quality assurance.
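The "dump the data structure" step can be sketched as a small helper that prints a parsed tree as the constructor expression that would rebuild it, ready to paste into a low-level test. This is only a sketch: `AstDumper`, `ToConstructorCode` and the record shapes below are hypothetical, and the parsed input is built by hand here as a stand-in for the real `Parser`.

```csharp
using System;

// Hypothetical minimal AST: names follow the question, shapes are assumed.
public enum BinaryOperator { Plus, Multiply }

public abstract record Expression;
public sealed record Constant(int Value) : Expression;
public sealed record Variable(string Name) : Expression;
public sealed record BinaryExpression(Expression Left, BinaryOperator Op, Expression Right) : Expression;
public sealed record AssignStatement(Variable Target, Expression Value);

// Hypothetical helper: renders a tree as C# source that reconstructs it.
public static class AstDumper
{
    public static string ToConstructorCode(AssignStatement s) =>
        $"new AssignStatement({ToConstructorCode(s.Target)}, {ToConstructorCode(s.Value)})";

    public static string ToConstructorCode(Expression e) => e switch
    {
        Constant c => $"new Constant({c.Value})",
        Variable v => $"new Variable(\"{v.Name}\")",
        BinaryExpression b =>
            $"new BinaryExpression({ToConstructorCode(b.Left)}, BinaryOperator.{b.Op}, {ToConstructorCode(b.Right)})",
        _ => throw new ArgumentException($"Unknown node type: {e.GetType().Name}")
    };
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for: new Parser().ParseStatement("x = 2 + 3 * a")
        var parsed = new AssignStatement(
            new Variable("x"),
            new BinaryExpression(
                new Constant(2),
                BinaryOperator.Plus,
                new BinaryExpression(new Constant(3), BinaryOperator.Multiply, new Variable("a"))));

        // Prints a single-line constructor expression equivalent to the tree above.
        Console.WriteLine(AstDumper.ToConstructorCode(parsed));
    }
}
```

The printed expression should be reviewed by eye before it is pasted into a test, since a Parser bug would otherwise be copied straight into the expected input.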
