After having written lots of tests, I am strongly in favour of splitting up large methods, and of testing private methods. Splitting up functionality into smaller steps has two great advantages:
- By introducing a name for an operation, the code becomes more self-documenting.
- By using smaller methods, the code is simpler and thus more likely correct. E.g. you can immediately understand `getChildNode()`.
While the overall program cyclomatic complexity isn't reduced, these two advantages outweigh the bit of extra code in my book. There is a third advantage: the code becomes much easier to test, assuming we can get around the `private` access modifier.
People who advise against testing private implementation details make a good point: the tests should show that the implementation adheres to its public interface, but this can only be done by black-box tests of the public methods. Such tests don't require that you get 100% coverage, but missing coverage is an indication of dead code that's not required by the specification. Since TDD tests should define externally observable behaviour, such tests fall into this category.
But you can also use a different approach to testing: showing that an existing implementation is likely correct, and works as the programmer expected. Since we already have the code, we can design our tests to maximize coverage. We can use boundary values to meticulously exercise the behaviour of the program. In other words, we can write white-box tests that know about implementation details of the system under test. These tests are highly coupled to the implementation, but that is OK as long as you also have more general black-box tests.
As a result, I prefer a few black-box tests that walk through the “happy path” and basic guarantees of the interface. But it is way too cumbersome to exercise all possibilities: with each input parameter or state variable in the function, the test space increases exponentially! A function `f(bool)` might take two tests, a function `f(bool, bool)` already 2² = 4, and `f(bool, bool, bool, bool)` already 2⁴ = 16. This is untenable. But by splitting a large function into smaller functions, I only have to show that each smaller function works as expected, and that the functions work together correctly (I call this testing by induction). My workload now adds, instead of multiplying – a great improvement if you want to be thorough!
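To illustrate how the workload adds instead of multiplying, here is a minimal sketch; the fee-calculation example and all names in it are invented for illustration, not taken from the question:

```java
// A minimal sketch of "testing by induction". Only the counting argument
// comes from the text above; the fee example itself is hypothetical.
public class InductionDemo {
    // Monolithic version: 2 x 2 = 4 input combinations to test exhaustively.
    static int fee(boolean premium, boolean bulky) {
        int base = premium ? 10 : 5;
        int surcharge = bulky ? 3 : 0;
        return base + surcharge;
    }

    // Split version: each helper has only 2 cases...
    static int baseFee(boolean premium) { return premium ? 10 : 5; }
    static int surcharge(boolean bulky) { return bulky ? 3 : 0; }

    // ...plus one composition rule to verify: 2 + 2 + 1 checks instead of 4.
    static int feeSplit(boolean premium, boolean bulky) {
        return baseFee(premium) + surcharge(bulky);
    }

    public static void main(String[] args) {
        // Exhaustive tests of the helpers (additive workload).
        assert baseFee(true) == 10 && baseFee(false) == 5;
        assert surcharge(true) == 3 && surcharge(false) == 0;
        // One spot check that the composition matches the monolith.
        assert feeSplit(true, true) == fee(true, true);
        System.out.println("all checks passed");
    }
}
```

With only two flags the saving is negligible (2 + 2 + 1 checks versus 4), but with each additional parameter the split version grows linearly while the monolith doubles.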
In your concrete example, either possibility is suboptimal because in the first try there's loads of code duplication requiring duplicate testing, and in the second try there are interdependencies between the functions that cannot be mocked away. Only `getChildNode()` is easy to test, but this function is incorrect if `n` is equal to `t.getValue()`, which will never happen if that function is only ever called by `traverseByValue()`. An easy-to-test alternative would be:
```java
public static int calculateDepth(BinaryTreeNode<Integer> t, int n) {
    if (t == null) {
        throw new TreeException("Value not found in tree!");
    }
    if (t.getValue() == n) {
        return 0;
    }
    BinaryTreeNode<Integer> child = null;
    if (n < t.getValue()) {
        child = t.getLeft();
    } else {
        child = t.getRight();
    }
    return 1 + calculateDepth(child, n);
}
```
This specific case doesn't even use any helper functions, because it's simple enough to do without – there are only 4 paths through this code. Your previous implementation hid this by nesting conditionals when the other branch had already been terminated by a `throw` or `return`, and by unnecessary code duplication.
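Those four paths can be pinned down with one test each. The sketch below is self-contained; the `BinaryTreeNode` and `TreeException` stubs are minimal stand-ins (assumptions) for the classes in the question, and the asserts need `java -ea` to fire:

```java
// One test per path through calculateDepth(). The stub classes are
// assumptions standing in for the originals from the question.
public class DepthTest {
    static class TreeException extends RuntimeException {
        TreeException(String msg) { super(msg); }
    }

    static class BinaryTreeNode<T> {
        private final T value;
        private final BinaryTreeNode<T> left, right;
        BinaryTreeNode(T value, BinaryTreeNode<T> left, BinaryTreeNode<T> right) {
            this.value = value; this.left = left; this.right = right;
        }
        T getValue() { return value; }
        BinaryTreeNode<T> getLeft() { return left; }
        BinaryTreeNode<T> getRight() { return right; }
    }

    public static int calculateDepth(BinaryTreeNode<Integer> t, int n) {
        if (t == null) {
            throw new TreeException("Value not found in tree!");
        }
        if (t.getValue() == n) {
            return 0;
        }
        BinaryTreeNode<Integer> child = (n < t.getValue()) ? t.getLeft() : t.getRight();
        return 1 + calculateDepth(child, n);
    }

    public static void main(String[] args) {
        BinaryTreeNode<Integer> leaf3 = new BinaryTreeNode<>(3, null, null);
        BinaryTreeNode<Integer> leaf7 = new BinaryTreeNode<>(7, null, null);
        BinaryTreeNode<Integer> root  = new BinaryTreeNode<>(5, leaf3, leaf7);

        assert calculateDepth(root, 5) == 0;   // path 2: match at the root
        assert calculateDepth(root, 3) == 1;   // path 3: descend left
        assert calculateDepth(root, 7) == 1;   // path 4: descend right
        try {                                  // path 1: missing value throws
            calculateDepth(root, 4);
            throw new AssertionError("expected TreeException");
        } catch (TreeException expected) { }
        System.out.println("all four paths covered");
    }
}
```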
However, testing a recursion or loop can be difficult. While we could create a fairly complex tree and check for the correct result, we would like a way to check the loop invariant. In a language with higher-order functions, this might be:
```
public static int calculateDepth(BinaryTreeNode<Integer> t, int n) {
    return calculateDepthLoop(t, n, calculateDepthLoop);
}

type Recurser = int(BinaryTreeNode<Integer>, int, Recurser);

private static int calculateDepthLoop(
        BinaryTreeNode<Integer> t,
        int n,
        Recurser recurse) {
    if (t == null) {
        throw new TreeException("Value not found in tree!");
    }
    if (t.getValue() == n) {
        return 0;
    }
    BinaryTreeNode<Integer> child = null;
    if (n < t.getValue()) {
        child = t.getLeft();
    } else {
        child = t.getRight();
    }
    return 1 + recurse(child, n, recurse);
}
```
Now we could run a test plan like:

- `calculateDepthLoop(null, ANY, ANY)` throws.
- `calculateDepthLoop(Tree(x, ANY, ANY), x, ANY)` is `0`.
- `calculateDepthLoop(Tree(x, left, ANY), y, callback)` for `y < x` invokes `result = callback(left, y, callback)` and returns `result + 1`.
- `calculateDepthLoop(Tree(x, ANY, right), y, callback)` for `x < y` invokes `result = callback(right, y, callback)` and returns `result + 1`.
With only 4 tests (one for each path) we can be sure that `calculateDepthLoop()` works as expected. We might want a couple more just to be sure that everything works for all valid values of `x` and `y`. Now we only need another test to check that everything integrates as it should. This can be done with a black-box test of `calculateDepth()`, which I'd do by creating a moderately complex tree requiring the function to recurse both left and right and return some value.
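In actual Java, the `Recurser` pseudocode above could be approximated with a functional interface. The sketch below is one possible rendering, not the definitive one; `Node` is a pared-down stand-in for `BinaryTreeNode<Integer>`, and the mock callback shows how a single loop iteration can be tested in isolation:

```java
// Approximating the higher-order recursion in plain Java. All class and
// method names here are assumptions for the sake of the sketch.
public class RecurserDemo {
    static class TreeException extends RuntimeException {
        TreeException(String msg) { super(msg); }
    }

    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // The "Recurser" type from the pseudocode as a functional interface.
    interface Recurser {
        int apply(Node t, int n, Recurser self);
    }

    static int calculateDepthLoop(Node t, int n, Recurser recurse) {
        if (t == null) throw new TreeException("Value not found in tree!");
        if (t.value == n) return 0;
        Node child = (n < t.value) ? t.left : t.right;
        return 1 + recurse.apply(child, n, recurse);
    }

    static int calculateDepth(Node t, int n) {
        return calculateDepthLoop(t, n, RecurserDemo::calculateDepthLoop);
    }

    public static void main(String[] args) {
        Node left = new Node(3, null, null);
        Node root = new Node(5, left, new Node(7, null, null));

        // Mock callback: records its argument and returns a sentinel depth.
        final Node[] seen = new Node[1];
        Recurser mock = (t, n, self) -> { seen[0] = t; return 41; };

        // One iteration in isolation: for y < x it must hand the LEFT child
        // to the callback and return the callback's result plus one.
        assert calculateDepthLoop(root, 3, mock) == 42;
        assert seen[0] == left;

        // Integration: the tied-together recursion computes real depths.
        assert calculateDepth(root, 7) == 1;
        System.out.println("invariant and integration checks passed");
    }
}
```

The mock makes the loop invariant directly observable: each iteration either terminates or delegates exactly one step down, which is precisely what the test plan above specifies.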
Best Answer
Generally, one should be judicious about using protected access at all. The reasons for that are laid out in answers to prior questions over here: Why is Clean Code suggesting avoiding protected variables?
As for using it the way you're thinking of here – weakening the access limitation because it feels more comfortable to test – that looks like a terribly bad idea.
The "right option" when you test functionality covered by a private method is to leave its modifier alone and figure out a testing approach based on the assumption that the method is, and will stay, private.
There could be many cases and reasons when one needs to redesign code for testability. Various difficulties in writing unit tests often serve as a good indication of such a need. But "oh, that method is private" is not one of those indicative difficulties.
I wrote an explanation for this in an answer to a prior question, as a side note, because that other question and answer were focused on different matters. But for the question you asked, it seems to apply fully and directly, so I will simply copy that part over here.
<rant "I think I heard enough whining. Guess it's about time to say loud and clear...">
Private methods are beneficial to unit testing.
The note below assumes that you are familiar with code coverage. If not, take the time to learn it, since it's quite useful to anyone interested in unit testing, and in testing at all.
All right, so I've got that private method, unit tests, and a coverage analysis telling me that there's a gap: my private method isn't covered by tests. Now...
What do I gain from keeping it private
Since the method is private, the only way to proceed is to study the code to learn how it is used through the non-private API. Typically, such a study reveals that the reason for the gap is that a particular usage scenario is missing in the tests.
For the sake of completeness: other (less frequent) reasons for such coverage gaps could be bugs in the specification or design. I won't dive deep into these here, to keep things simple; suffice it to say that if you weaken the access limitation "just to make the method testable", you'll miss a chance to learn that these bugs exist at all.
Fine; to fix the gap, I add a unit test for the missing scenario, repeat the coverage analysis, and verify that the gap is gone. What do I have now? I've got a new unit test for a specific usage of the non-private API.
- The new test ensures that the expected behavior for this usage won't change without notice, since if it changes, the test will fail.
- An outside reader may look into this test and learn how the class is supposed to be used and to behave (here, "outside reader" includes my future self, since I tend to forget the code a month or two after I'm done with it).
- The new test is tolerant to refactoring (do I refactor private methods? you bet!). Whatever I do to `privateMethod`, I'll always want to test `nonPrivateMethod(true)`. No matter what I do to `privateMethod`, there will be no need to modify the test, because the method isn't directly invoked.

Not bad? You bet.
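To make the contrast concrete, here is a minimal sketch; `NumberFormatter` and its private helper are invented names, not from any post above. The tests touch only the public `format()` method, so the private `pad()` helper can be renamed or rewritten without breaking them:

```java
// Testing a private helper through the public API only. The class and its
// methods are hypothetical, chosen purely to illustrate the point.
public class FormatterDemo {
    static class NumberFormatter {
        // Private helper: never invoked by the tests directly.
        private String pad(String s, int width) {
            StringBuilder sb = new StringBuilder(s);
            while (sb.length() < width) sb.insert(0, '0');
            return sb.toString();
        }

        // Public API: the only surface the tests touch.
        public String format(int n) {
            return pad(Integer.toString(n), 4);
        }
    }

    public static void main(String[] args) {
        NumberFormatter f = new NumberFormatter();
        // These tests pin the observable behaviour; pad() can be renamed,
        // inlined, or rewritten and they keep compiling and passing.
        assert f.format(7).equals("0007");
        assert f.format(12345).equals("12345");
        System.out.println("public-API tests passed");
    }
}
```

Full coverage of `pad()` falls out of exercising `format()` with short and long inputs, which is exactly the coverage-gap workflow described above.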
What do I lose from weakening the access limitation
Now imagine that instead of the above, I simply weaken the access limitation. I skip the study of the code that uses the method and proceed straight to a test that invokes my `exPrivateMethod`. Great? Not!

- Do I gain a test for the specific usage of the non-private API mentioned above? No: there was no test for `nonPrivateMethod(true)` before, and there is no such test now.
- Do outside readers get a chance to better understand usage of the class? No. "- Hey, what's the purpose of the method tested here? - Forget it, it's strictly for internal use. - Oops."
- Is it tolerant to refactoring? No way: whatever I change in `exPrivateMethod` will likely break the test. Rename it, merge it into some other method, or change its arguments, and the test will just stop compiling. Headache? You bet!

Summing up, sticking with the private method brings me a useful, reliable enhancement to the unit tests. In contrast, weakening access limitations "for testability" only gives me an obscure, hard-to-understand piece of test code, which is additionally at permanent risk of being broken by any minor refactoring; frankly, what I get looks suspiciously like technical debt.
</rant>