Best practices for retrofitting legacy code with automated tests

legacy testing

I'm about to take on the task of reimplementing an already defined interface (a set of C++ header files) in a relatively large and old code base. Before doing this, I would like to have as complete test coverage as possible, so I can detect reimplementation errors as early and easily as possible. The problem is that the already existing code base was not designed to be easily testable, with (very) large classes and functions, a high degree of coupling, functions with (many) side effects, etc.

It would be nice to hear about any previous experience with similar tasks, along with concrete tips on how you went about retrofitting automated tests (unit, integration, regression, etc.) to your legacy code.

Best Answer

First of all, get and read Working Effectively with Legacy Code by Michael Feathers. It is an indispensable aid for such tasks.

Then, a few notes:

  • Do you have a precise specification / contract for the interface, or do you practically only have the existing implementation as the "specification"? In the former case a complete rewrite from scratch is easier; in the latter it ranges from difficult to impossible, because the only definition of correct behavior is whatever the old code happens to do.
  • if you want to reimplement the interface, the most useful way to spend your testing resources is to write tests only against the interface. Of course, this does not qualify as unit testing in the strict sense, rather functional/acceptance testing, but I am not a purist :-) However, these tests are reusable and enable you to directly compare results from the two implementations side by side.
  • overall, I would favor refactoring the existing code rather than rewriting from scratch, unless it is completely unmaintainable. (But in this case, how are you going to write unit tests against it anyway?) Check out this post from Joel for a more detailed discussion on the subject. A set of acceptance tests against the interface gives you a thin but useful safety net, under which you can start cautiously refactoring the existing code towards making it unit testable (using the ideas from Feathers' book).
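
To make the "tests only against the interface" idea a bit more concrete, here is a minimal sketch of a side-by-side comparison. Everything in it is hypothetical: `Tokenizer`, `LegacyTokenizer`, `NewTokenizer` and `find_mismatches` are placeholders I made up, since I don't know your actual headers. The only point is that the check talks to both implementations purely through the abstract interface, so the same inputs can be replayed against the old and the new code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface, standing in for the real C++ headers.
class Tokenizer {
public:
    virtual ~Tokenizer() = default;
    virtual std::vector<std::string> split(const std::string& text) const = 0;
};

// Placeholder for the existing (legacy) implementation.
class LegacyTokenizer : public Tokenizer {
public:
    std::vector<std::string> split(const std::string& text) const override {
        std::vector<std::string> out;
        std::string current;
        for (char c : text) {
            if (c == ' ') {
                // A space ends the current token, if there is one.
                if (!current.empty()) {
                    out.push_back(current);
                    current.clear();
                }
            } else {
                current += c;
            }
        }
        if (!current.empty()) out.push_back(current);
        return out;
    }
};

// Placeholder for the reimplementation of the same interface.
class NewTokenizer : public Tokenizer {
public:
    std::vector<std::string> split(const std::string& text) const override {
        std::vector<std::string> out;
        std::size_t pos = 0;
        while (pos < text.size()) {
            // Skip leading spaces, then cut at the next space (or end of text).
            std::size_t start = text.find_first_not_of(' ', pos);
            if (start == std::string::npos) break;
            std::size_t end = text.find(' ', start);
            if (end == std::string::npos) end = text.size();
            out.push_back(text.substr(start, end - start));
            pos = end;
        }
        return out;
    }
};

// Feeds every input to both implementations through the interface only
// and returns the inputs on which their results differ.
std::vector<std::string> find_mismatches(const Tokenizer& reference,
                                         const Tokenizer& candidate,
                                         const std::vector<std::string>& inputs) {
    std::vector<std::string> mismatches;
    for (const auto& input : inputs) {
        if (reference.split(input) != candidate.split(input)) {
            mismatches.push_back(input);
        }
    }
    return mismatches;
}
```

In practice you would call something like `find_mismatches(legacy, rewrite, inputs)` from whatever test framework you already use and fail the test if the result is non-empty. The returned list tells you which inputs to investigate. If the legacy code has side effects rather than plain return values, the comparison has to capture those effects instead (files written, messages sent, and so on), which is usually the harder part.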