C++ Refactoring – Refactoring Large Legacy C++ Codebase Mixed with C Style

Architecturecc++11refactoring

Background:

Currently I am working on a C++ project. It's for business use, so this project is not open sourced. However, this is a project with a fairly large code base (around 10k lines of code) and a couple of external linked libraries and binaries. The problem is nearly 80% of the code base is written in C style rather than C++ style, but the main interface is written in C++11 standard and needs to be compiled in C++11. One of the most inconvenient parts of the C style code is plenty of goto statement are used, and I can tell the purpose of using goto is to ensure the code is thread safe and clean up the garbage right in time. (They worked) When I first learned C++, I started with C++11, and I think RAII is the modern fashion to replace the hacky style to ensure the code is thread safe and resources being used are cleaned up in time.

Puzzle:

I am assigned to add a series of new functions in the project. Some of my changes have to make in these C style code base. And if I changed one of them, why not refactor all of them in C++1x style? The conflict is given limited time, I can't finish refactoring all of them, not even 80%. And these C style code bases are also using some C style based libraries, I don't know if I changed my code from C style to C++ style, will these external libraries work as before? Will there be equivalent C++ libraries to replace those deprecated C libraries? So my puzzles are: Given limited time, how to refactor large code base legacy C++ project from C style to C++ style effectively? What is the trade off? Is the trade off worth considering?

Best Answer

C code using the goto cleanup style like this:

int function(int argument)
{
  int result;
  opaque_handle handle = NULL;
  char* text = NULL;
  size_t size;

  handle = extlib_create();
  if (!handle) {
    result = extlib_failed;
    goto cleanup;
  }
  size = extlib_size(handle, argument);
  text = malloc(size);
  if (!text) {
    result = oom;
    goto cleanup;
  }
  extlib_read(handle, text, size, argument);
  // etc.

  result = success;
  goto cleanup;

cleanup:
  if (handle) extlib_destroy(handle);
  if (text) free(text);
  return result;
}

can always refactored to C++ by wrapping each resource in a RAII wrapper and perhaps using some exception wrappers.

class extlib_handle {
  opaque_handle handle;
public:
  explicit extlib_handle(opaque_handle handle) : handle(handle) {}
  extlib_handle(const extlib_handle&) = delete;
  extlib_handle& operator =(const extlib_handle&) = delete;
  ~extlib_handle() { extlib_destroy(handle); }
  operator opaque_handle() const { return handle; }
};

template <typename V>
bool resize_nothrow(V& v, typename V::size_type sz)
try {
  v.resize(sz);
  return true;
} catch (std::bad_alloc&) {
  return false;
}

int function(int argument)
{
  extlib_handle handle(extlib_create());
  if (!handle) return extlib_failed;

  size_t size = extlib_size(handle, argument);
  std::vector<char> text;
  if (!resize_nothrow(text, size)) return oom;

  extlib_read(handle, text.data(), size, argument);
  // etc.

  return success;
}

This refactoring does not affect any code outside the function in any way, and allows you to slowly build up a collection of reusable RAII wrappers.

You can use this technique to refactor exactly the parts of the C code you actually touch, not any others. Touching any others introduces unnecessary risks (you could make a mistake and add a bug in code that would otherwise be untouched - this risk is mitigated in the code you have to touch anyway because there could already be new bugs there anyway, so it will receive more attention from testers, and because it makes writing new code less error-prone).

As usual, if you can, write tests. But chances are the code is not amenable to testing because the 3rd party dependencies can't be substituted. Refactoring for that is a time-consuming task.

Related Topic