TDD Red-Green-Refactor: Testing Methods That Become Private

Tags: encapsulation, refactoring, tdd

As far as I understand it, most people seem to agree that private methods should not be tested directly, but rather through whatever public methods call them. I can see their point, but I run into problems with this when I try to follow the "Three Laws of TDD" and use the "red-green-refactor" cycle. I think it's best explained by an example:

Right now, I need a program that can read a file (containing tab-separated data) and filter out all columns that contain non-numerical data. I guess there are probably some simple tools available already to do this, but I decided to implement it from scratch myself, mostly because I figured it could be a nice and clean project for me to get some practice with TDD.

So, first, I "put the red hat on", that is, I need a test that fails. I figured, I'll need a method that finds all the non-numerical fields in a line. So I write a simple test. Of course it fails to compile immediately, so I start writing the function itself, and after a couple of cycles back and forth (red/green) I have a working function and a complete test.

Next, I continue with a function, "gatherNonNumericColumns", that reads the file one line at a time and calls my "findNonNumericFields" function on each line to gather up all the columns that eventually must be removed. A couple of red-green cycles later I'm done, again with a working function and a complete test.

Now, I figure I should refactor. Since my method "findNonNumericFields" was designed only because I figured I would need it when implementing "gatherNonNumericColumns", it seems to me that it would be reasonable to let "findNonNumericFields" become private. However, that would break my first tests, since they would no longer have access to the method they were testing.

So, I end up with a private method and a suite of tests that test it. Since so many people advise that private methods should not be tested, it feels like I've painted myself into a corner here. But where exactly did I fail?

I gather I could have started out at a higher level, writing a test that tests what will eventually become my public method (that is, findAndFilterOutAllNonNumericalColumns), but that feels somewhat counter to the whole point of TDD (at least according to Uncle Bob): that you should switch constantly between writing tests and production code, and that at any point in time, all your tests worked within the last minute or so. Because if I start out by writing a test for a public method, there will be several minutes (or hours, or even days in very complex cases) before I get all the details in the private methods to work so that the test testing the public method passes.

So, what to do? Is TDD (with the rapid red-green-refactor cycle) simply not compatible with private methods? Or is there a fault in my design?

Best Answer

Units

I think I can pinpoint exactly where the problem started:

"I figured, I'll need a method that finds all the non-numerical fields in a line."

This should be immediately followed by asking yourself "Will that be a separate testable unit from gatherNonNumericColumns, or part of the same one?"

If the answer is "yes, separate", then your course of action is simple: that method needs to be public on an appropriate class, so it can be tested as a unit. Your mentality is something like "I need to test drive out one method, and I also need to test drive out another method".
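
As a rough illustration of the "yes, separate" option, the line-level logic could live on a small class of its own and be injected into the column-gathering function. This is only a sketch: the question does not name a language, so Python is used here for brevity, and the class name `FieldClassifier`, the snake_case names, and the choice of `float()` as the definition of "numeric" are all assumptions.

```python
class FieldClassifier:
    """Hypothetical home for the line-level logic, tested as its own unit."""

    def find_non_numeric_fields(self, line):
        # Return the indices of the tab-separated fields that are not numeric.
        fields = line.rstrip("\n").split("\t")
        return [i for i, field in enumerate(fields) if not self._is_numeric(field)]

    @staticmethod
    def _is_numeric(field):
        # Assumption: "numeric" means "float() accepts it".
        try:
            float(field)
            return True
        except ValueError:
            return False


def gather_non_numeric_columns(lines, classifier=None):
    # The classifier is a collaborator with its own public, tested interface.
    classifier = classifier or FieldClassifier()
    columns = set()
    for line in lines:
        columns.update(classifier.find_non_numeric_fields(line))
    return columns
```

With this shape, the tests written first for the line-level method stay valid, because the method never stops being public; it has simply moved to the class that owns that responsibility.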

From what you say though, you figured that the answer is "no, part of the same". At this point, your plan should no longer be to fully write and test findNonNumericFields then write gatherNonNumericColumns. Instead, it should be simply to write gatherNonNumericColumns. For now, findNonNumericFields should just be a likely part of the destination you have in mind when you're choosing your next red test case and doing your refactoring. This time your mentality is "I need to test drive out one method, and while I do so I should keep in mind that my finished implementation will probably include this other method".


Keeping a short cycle

Doing the above should not lead to the problems you describe in your penultimate paragraph:

"Because if I start out by writing a test for a public method, there will be several minutes (or hours, or even days in very complex cases) before I get all the details in the private methods to work so that the test testing the public method passes."

At no point does this technique require you to write a red test which will only turn green when you implement the entirety of findNonNumericFields from scratch. Much more likely, findNonNumericFields will start as some code in-line in the public method you're testing, which will be built up over the course of several cycles and eventually extracted during a refactoring.


Roadmap

To give an approximate roadmap for this particular example, I don't know the exact test cases you used, but say you were writing gatherNonNumericColumns as your public method. Then most likely the test cases would be the same as the ones you wrote for findNonNumericFields, each one using a table with only one row. When that one-row scenario was fully implemented, and you wanted to write a test to force you to extract out the method, you'd write a two-row case which would require you to add your iteration.
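
A possible test sequence for that roadmap is sketched below in Python. The concrete rows are made up for illustration (the original test cases are not known), and a compact stand-in implementation is included only so the tests can run.

```python
def gather_non_numeric_columns(lines):
    # Compact stand-in implementation (an assumption, not the asker's code).
    columns = set()
    for line in lines:
        for index, field in enumerate(line.rstrip("\n").split("\t")):
            try:
                float(field)
            except ValueError:
                columns.add(index)
    return columns


# One-row cases: the same inputs that would otherwise have been used to
# test findNonNumericFields directly.
def test_single_row_all_numeric():
    assert gather_non_numeric_columns(["1\t2\t3"]) == set()


def test_single_row_one_text_field():
    assert gather_non_numeric_columns(["1\tabc\t3"]) == {1}


def test_single_row_several_text_fields():
    assert gather_non_numeric_columns(["a\t2\tb"]) == {0, 2}


# Two-row case: the first test that cannot pass without iterating over rows.
def test_two_rows_text_in_different_columns():
    assert gather_non_numeric_columns(["1\tabc\t3", "x\t2\t3"]) == {0, 1}
```

Each test is one small red-green step, so the cycle stays short even though only the public function is ever called.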