Unit-testing – How to unit-test code using graph structures

graphtest-coverageunit testing

I am writing (recursive) code that is navigating a dependency graph looks for cycles or contradictions in the dependencies. However, I am not sure how to approach unit testing this. The problem is that one of our main concerns is will the code handle on all the interesting graph structures that can arise and making sure that all nodes will be handled appropriately.

While usually 100% line or branch coverage is sufficient to be confident that some code works, it feels like even with 100% path coverage you'd still have doubts.

So, how does one go about selecting graph structures for test cases to have confidence that their code could handle all the conceivable permutations you'll find in real world data.


PS- If it matters, all edges in my graph are labelled "must have" xor "cannot have" and there are no trivial cycles, and there is only one edge between any two nodes.


PPS- This additional problem statement was originally posted by the question's author in a comment below:

For all vertices N in forest F, for all vertices M, in F, such that if there are any walks between N and M they all must either use only edges labelled 'conflict' or 'requires'.

Best Answer

We keep thinking that we have all the tricky ones covered and then we realise that a certain structure causes problems we hadn't considered before. Testing every tricky that we can think of is what we are doing.

That sounds like a good start. I guess you already tried to apply some classic techniques like boundary value analysis or equivalence partitioning, as you already mentioned coverage based testing. After you invested a lot of time in constructing good test cases, you will come to a point where you, your team and also your testers (if you have those) run out of ideas. And that's the point where you should leave the path of unit testing and start testing with as much real world data as you can.

It should be obvious that you should try to pick a big diversity of graphs from your production data. Maybe you have to write some additional tools or programs just for that part of the process. The hard part here is probably to verify the correctness of your programs output, when you put ten thousand different real world graphs into your program, how will you know if your program produces always the correct output? Obviously you cannot check than manually. So if you are lucky, you may be able to make a second, very simple implementation of your dependency check, which might not fulfill your performance expectations, but is easier to verify than your original algorithm. You should also try to integrate a lot of plausibility checks directly into your program (for example, make some bookkeeping if your graph search hit every node and every edge of the graph).

Finally, learn to accept that every test can only proof the existence of bugs, but not the absence of bugs.