Unit Testing – How to Refactor a Large Method Without Breaking Anything

formal-methods, legacy, refactoring, unit-testing

I'm currently refactoring part of a large codebase with no unit tests whatsoever. I tried to refactor the code the brute-force way, i.e. by guessing what the code does and making changes that shouldn't alter its meaning, but without success: it randomly breaks features all over the codebase.

Note that the refactoring includes moving legacy C# code to a more functional style (the legacy code doesn't use any features of .NET Framework 3 or later, including LINQ), adding generics where the code may benefit from them, etc.
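For concreteness, here is a hypothetical sketch (not from the actual codebase; the `Invoice` type and method names are made up) of the kind of change I mean: rewriting a non-generic loop into LINQ over a generic collection, keeping the observable behaviour the same.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Invoice
{
    public int Id;
    public DateTime DueDate;
    public bool Paid;
}

public static class InvoiceQueries
{
    // Legacy style: non-generic collection, manual loop, mutation.
    public static ArrayList GetOverdueInvoiceIdsLegacy(ArrayList invoices, DateTime today)
    {
        var result = new ArrayList();
        foreach (Invoice invoice in invoices)
        {
            if (!invoice.Paid && invoice.DueDate < today)
            {
                result.Add(invoice.Id);
            }
        }
        return result;
    }

    // Refactored style: generics + LINQ, intended to preserve the same results.
    public static List<int> GetOverdueInvoiceIds(IEnumerable<Invoice> invoices, DateTime today)
    {
        return invoices
            .Where(i => !i.Paid && i.DueDate < today)
            .Select(i => i.Id)
            .ToList();
    }
}
```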

I can't use formal methods, given how much they would cost.

On the other hand, I presume that at least the rule "any refactored legacy code shall come with unit tests" should be strictly followed, no matter how much it costs. The problem is that when I refactor a tiny part of a 500 LOC private method, adding unit tests turns out to be a difficult task.
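For illustration, here is one way such a slice could in principle be made testable (a hypothetical sketch: the class, the helper, the fee formula, and the `Billing.Tests` assembly name are all made up): extract the piece being refactored into an internal, side-effect-free helper so a test project can reach it without driving the whole 500-line method.

```csharp
using System;
using System.Runtime.CompilerServices;

// Lets the test project call internal members; the assembly name is an assumption.
[assembly: InternalsVisibleTo("Billing.Tests")]

public class BillingEngine
{
    // The 500 LOC private method now delegates one small slice to a testable helper.
    private void ProcessMonthlyRun(DateTime today)
    {
        // ... hundreds of lines ...
        // decimal lateFee = ComputeLateFee(amountDue, daysLate);
        // ... hundreds of lines ...
    }

    // Extracted slice: a pure function of its inputs, so it can be pinned down by unit tests.
    internal static decimal ComputeLateFee(decimal amountDue, int daysLate)
    {
        if (daysLate <= 0) return 0m;
        return Math.Round(amountDue * 0.01m * daysLate, 2);
    }
}
```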

What can help me figure out which unit tests are relevant for a given piece of code? I'm guessing that static analysis of the code would somehow be helpful, but what tools and techniques can I use to:

  • Know exactly which unit tests I should create,

  • And/or know whether a change I've made causes the code to execute differently than it did before?

Best Answer

I have had similar challenges. The Working Effectively with Legacy Code book is a great resource, but it assumes you can shoe-horn in unit tests to support your work. Sometimes that's just not possible.

In my archeology work (my term for maintenance on legacy code like this), I follow an approach similar to the one you outlined.

  • Start with a solid understanding of what the routine is currently doing.
  • At the same time, identify what the routine was supposed to be doing. Many think this bullet and the previous one are the same, but there is a subtle difference: oftentimes, if the routine were doing what it was supposed to do, you wouldn't be applying maintenance changes.
  • Run some samples through the routine and make sure you hit the boundary cases and the relevant error paths along with the mainline path (see the pinning-test sketch after this list). My experience is that the collateral damage (feature breakage) comes from boundary conditions not being implemented in exactly the same way.
  • After those sample cases, identify what's being persisted that doesn't necessarily need to be persisted. Again, I have found that it's side effects like this that lead to collateral damage elsewhere; the sketch below pins one such side effect down.
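To make the last two bullets concrete, here is a minimal pinning (characterization) test sketch. It assumes xUnit as the test framework, and the `LegacyPricing` class is a made-up stand-in for whatever routine you're actually maintaining; the point is to record what the code does today, mainline, boundaries, error path, and persisted side effects included, before changing a line.

```csharp
using System;
using Xunit;

// Stand-in for the legacy routine under test (hypothetical; the real one lives in the codebase).
public static class LegacyPricing
{
    public static string LastRunStatus { get; private set; } = "idle";

    public static decimal ApplyDiscount(decimal amount, int percent)
    {
        if (percent < 0) throw new ArgumentOutOfRangeException(nameof(percent));
        LastRunStatus = "priced";   // the kind of persisted side effect that causes collateral damage
        return amount - amount * percent / 100m;
    }
}

public class ApplyDiscountPinningTests
{
    // Mainline and boundary cases: these assertions record what the current
    // implementation returns today, not what we think it "should" return.
    [Theory]
    [InlineData(100.00, 0, 100.00)]    // no discount
    [InlineData(100.00, 10, 90.00)]    // mainline
    [InlineData(100.00, 100, 0.00)]    // upper boundary
    [InlineData(0.00, 10, 0.00)]       // zero amount
    public void ApplyDiscount_matches_current_behaviour(double amount, int percent, double expected)
    {
        Assert.Equal((decimal)expected, LegacyPricing.ApplyDiscount((decimal)amount, percent));
    }

    // Error path: pin down the exception the legacy code actually throws.
    [Fact]
    public void ApplyDiscount_rejects_negative_percentages_as_it_always_has()
    {
        Assert.Throws<ArgumentOutOfRangeException>(() => LegacyPricing.ApplyDiscount(100m, -1));
    }

    // Side effect: pin down the state the routine persists, so an inadvertent
    // change to it shows up immediately.
    [Fact]
    public void ApplyDiscount_sets_last_run_status_as_a_side_effect()
    {
        LegacyPricing.ApplyDiscount(100m, 10);
        Assert.Equal("priced", LegacyPricing.LastRunStatus);
    }
}
```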

At this point, you should have a candidate list of what's been exposed and/or manipulated by that routine. Some of those manipulations are likely to be inadvertent. Now I use findstr and the IDE to understand what other areas might reference the items in the candidate list. I'll spend some time understanding how those references work and what their nature is.
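findstr gets the job done on its own; if you want to post-process the hits, the same sweep can be scripted. The sketch below is illustrative only, with placeholder paths and a placeholder symbol name, and does nothing more than a recursive, case-insensitive text search.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ReferenceScan
{
    public static void Main()
    {
        const string root = @"C:\src\legacy-app";   // placeholder: solution root
        const string symbol = "LastRunStatus";      // placeholder: item from the candidate list

        // Equivalent in spirit to: findstr /s /n /i "LastRunStatus" *.cs
        var hits =
            from file in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories)
            from line in File.ReadLines(file).Select((text, i) => (text, number: i + 1))
            where line.text.IndexOf(symbol, StringComparison.OrdinalIgnoreCase) >= 0
            select $"{file}({line.number}): {line.text.Trim()}";

        foreach (var hit in hits)
        {
            Console.WriteLine(hit);
        }
    }
}
```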

Finally, once I've deluded myself into thinking I understand the impacts of the original routine, I'll make my changes one at a time and rerun the analysis steps outlined above to verify that each change works as I expect. I specifically avoid changing multiple things at once, as I have found that doing so blows up on me when I try to verify the impact. Sometimes you can get away with multiple changes, but if I can follow the one-at-a-time route, that's my preference.
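One way to make that re-verification mechanical is a small golden-master harness. The sketch below is illustrative only and assumes the routine's output for a batch of representative inputs can be rendered as text: record a baseline once, commit it, then diff against it after every individual change.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class GoldenMaster
{
    // Runs the routine under test over a fixed batch of inputs and either
    // records a baseline (first run) or compares against the committed one.
    public static void Verify(string baselinePath, Func<int, string> routine)
    {
        var builder = new StringBuilder();
        foreach (var input in Enumerable.Range(0, 100))   // representative input batch
        {
            builder.AppendLine($"{input} => {routine(input)}");
        }
        var actual = builder.ToString();

        if (!File.Exists(baselinePath))
        {
            File.WriteAllText(baselinePath, actual);       // first run: record the baseline
            Console.WriteLine($"Baseline recorded at {baselinePath}");
            return;
        }

        var expected = File.ReadAllText(baselinePath);
        Console.WriteLine(expected == actual
            ? "OK: behaviour unchanged"
            : "DIFF: behaviour changed - inspect before committing");
    }
}
```

Rerun the harness after each individual change; any diff is an observable behaviour change to investigate before moving on to the next change.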

In short, my approach is similar to what you laid out: a lot of prep work; then circumspect, individual changes; and then verify, verify, verify.