Need help eliminating dead code paths and variables from C source code

c gcc

I have a legacy C code base on my hands, and I have been given the task of removing dead/unused symbols and code paths from it. Over time there were many insertions and deletions, which left lots of unused symbols. I have identified many dead variables that were written to only once or twice but never read.

Black-box, white-box, and regression testing all showed that removing the dead code did not affect any procedures (we have a comprehensive test suite).

But this removal was done on only a small part of the code. Now I am looking for a way to automate the work.

We rely on GCC to do the work.

P.S. I'm interested in removing stuff like:

  1. variables which are read only for the sake of reading them.
  2. variables which are spread across multiple source files and are only ever written to.

For example:

file1.c:
int i;

file2.c:
extern int i;
....
i=x;

Best Answer

You should check what clang's scan-build can do for your build. It can sometimes determine that a variable is only written to and that the result is never used (a "dead store" in scan-build's language).
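To make the term concrete, here is a minimal sketch of the kind of code a dead-store check (deadcode.DeadStores in the Clang static analyzer) is meant to catch. The function and variable names are hypothetical and only serve as illustration; the exact diagnostics depend on your clang version.

```c
/* Hypothetical example: the names are illustrative, not taken from any real code base. */
int scale(int x)
{
    int scratch = x * 3;   /* dead store: overwritten below before it is ever read */
    int result  = x * 2;

    scratch = result + 1;  /* dead store: scratch is never read after this point */

    return result;
}
```

Both assignments to scratch can be deleted without changing what scale returns, because the right-hand sides have no side effects.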

The problem all these tools have is of course that they cannot determine if a call like

int status = do_something(input_var);

does more than assign a value to status; e.g., do_something might modify not only input_var but also global state. So when cleaning things out, make sure you check anything that relies on side effects.
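A small sketch of that pitfall, under stated assumptions: here do_something is assumed to take a pointer so that it can modify its argument, and call_count and process are hypothetical names invented for the example.

```c
/* Hypothetical global state that is updated as a side effect. */
int call_count = 0;

/* Assumed signature: returns a status, but also changes its argument
 * and the global counter. */
int do_something(int *input_var)
{
    *input_var += 1;
    call_count++;
    return 0;
}

int process(int value)
{
    int status = do_something(&value);  /* status is written but never read */

    return value;                       /* depends on the call having happened */
}
```

Here the variable status is dead, but the call is not. Removing the whole line would change the result of process and leave call_count untouched; the safe cleanup keeps the call, e.g. as `(void)do_something(&value);`.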

scan-build integrates nicely into a make-based build process, e.g.

$ scan-build ./configure
$ scan-build make

would run your code through scan-build when using autotools.

You could also try cppcheck, but in my experience it doesn't find more than what scan-build finds (and scan-build can find other issues which cppcheck cannot see).