C++ Refactoring – How to Start Refactoring a Mostly-Procedural Application

refactoring

We have a program written in C++ that is mostly procedural, but we do use some C++ containers from the standard library (vector, map, list, etc). We are constantly making changes to this code, so I wouldn't call it a stagnant piece of legacy code that we can just wrap up.

There are a lot of issues with this code making it harder and harder for us to make changes, but I see the three biggest issues being:

  1. Many of the functions do more (way more) than one thing
  2. We violate the DRY principle left and right
  3. We have global variables and global state up the wazoo.

I was thinking we should attack areas 1 and 2 first. Along the way, we can "de-globalize" our smaller functions from the bottom up by passing in information that is currently global as parameters to the lower level functions from the higher level functions and then concentrate on figuring out how to removing the need for global variables as much as possible.

I just finished reading Code Complete 2 and The Pragmatic Programmer, and I learned a lot, but I am feeling overwhelmed. I would like to implement unit testing, change from a procedural to OO approach, automate testing, use a better logging system, fully validate all input, implement better error handling and many other things, but I know if we start all this at once, we would screw ourselves. I am thinking the three I listed are the most important to start with.

Any suggestions are welcome.

We are a team of two programmers mostly with experience with in-house scripting. It is going to be hard to justify taking the time to refactor, especially if we can't bill the time to a client. Believe it or not, this project has been successful enough to keep us busy full time and also keep several consultants busy using it for client work.

Best Answer

My advice would be to be very very careful not to break something that works!

Look on this as a long term incremental project, improve code in the area that is relevant to a current change request and as far as possible leave the rest alone.

Don't get rid of global's just because they are global. Some data is naturally global -- time of day, run number, business date, connection status, shutdown requested etc.

The DRY principal applies mostly to development, once the code has been repeated it makes very little difference until you need to make a major change to the repeated code. So leave this stuff alone until it requires a major change.

Many functions that do more than one thing. -- welcome to the real world. Business requirments are never as simple as the ones you see in the text books, and, are often an amalgam of complex interdependent functionality. If you see a case where a function can be split cleanly into two separate simpler functions then jump on it, but, don't expect to find too many.

Related Topic