C Programming – When to Check Pointers for NULL

Tags: c, null, validation

Summary:

Should a function in C always check to make sure it is not dereferencing a NULL pointer? If not, when is it appropriate to skip these checks?

Details:

I've been reading some books about programming interviews, and I'm wondering what the appropriate degree of input validation for function arguments is in C. Obviously, any function that takes input from a user needs to perform validation, including checking a pointer for NULL before dereferencing it. But what about a function within the same file that you don't expect to expose through your API?

For example, the following appears in the source code of git:

static unsigned short graph_get_current_column_color(const struct git_graph *graph)
{
    if (!want_color(graph->revs->diffopt.use_color))
        return column_colors_max;
    return graph->default_column_color;
}

If graph is NULL, then a null pointer will be dereferenced, probably crashing the program but possibly resulting in some other unpredictable behavior. On the other hand, the function is static, so maybe the programmer already validated the input. I don't know; I just selected it at random because it was a short example in an application program written in C. I've seen many other places where pointers are used without checking for NULL. My question is general, not specific to this code segment.

I saw a similar question asked in the context of exception handling. However, C has no exception mechanism at all, so there is nothing that propagates errors automatically; every failure must be detected and passed along by hand, as sketched below.
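
As an illustration of that manual propagation, here is a hedged sketch; parse_config, settings.conf, and the -1/0 error convention are all invented for the example:

#include <stdio.h>
#include <stdlib.h>

/* C has no exceptions, so failures are reported through return
 * values and must be checked by every caller along the chain. */
static int parse_config(const char *path, int *out_value)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;              /* runtime error: report, don't crash */
    if (fscanf(f, "%d", out_value) != 1) {
        fclose(f);
        return -1;              /* malformed file */
    }
    fclose(f);
    return 0;                   /* success */
}

int main(void)
{
    int value;
    if (parse_config("settings.conf", &value) != 0) {
        fprintf(stderr, "could not read settings.conf\n");
        return EXIT_FAILURE;    /* the caller decides how to react */
    }
    printf("value = %d\n", value);
    return EXIT_SUCCESS;
}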

On the other hand, I have seen lots of code in open source projects (such as the example above) that does not check pointers before using them. I'm wondering if anyone has guidelines for when to put checks in a function versus assuming that the function was called with correct arguments.

I'm interested in this question in general for writing production code, but I'm also interested in the context of programming interviews. For instance, many algorithms textbooks (such as CLR) present their algorithms in pseudocode without any error checking. While this is good for understanding the core of an algorithm, it's obviously not good programming practice, so I would not want to tell an interviewer that I was skipping error checking just to simplify my examples (as a textbook might). But I also would not want to appear to produce inefficient code cluttered with excessive error checking. For instance, graph_get_current_column_color could have been modified to check graph for NULL, but it's not clear what it should do if graph were NULL, other than not dereference it; a hypothetical defensive rewrite is sketched below.
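
To make the dilemma concrete, here is one possible defensive rewrite of the snippet above. It reuses git's own identifiers (struct git_graph, want_color, column_colors_max), so it is a sketch rather than standalone code:

/* Hypothetical defensive variant. The NULL check is trivial to add,
 * but there is no meaningful color to return for a NULL graph, so
 * the fallback value is arbitrary and merely hides the caller's bug
 * instead of surfacing it. */
static unsigned short graph_get_current_column_color(const struct git_graph *graph)
{
    if (!graph || !graph->revs)
        return column_colors_max;   /* arbitrary fallback */
    if (!want_color(graph->revs->diffopt.use_color))
        return column_colors_max;
    return graph->default_column_color;
}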

Best Answer

Invalid null pointers are caused either by programmer error or by runtime error. Runtime errors are things the programmer can't prevent ahead of time, like a malloc failing due to low memory, the network dropping a packet, or the user entering something stupid. Programmer errors are caused by a programmer using the function incorrectly.
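
As a concrete example of the first category, a malloc failure is a runtime error: no amount of debugging guarantees it never happens, so the check stays in the code forever. A minimal, self-contained illustration:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* Runtime error: malloc can legitimately fail at any time, so
     * its return value must always be checked before use. */
    char *buf = malloc(1024);
    if (!buf) {
        fprintf(stderr, "out of memory\n");
        return EXIT_FAILURE;
    }
    strcpy(buf, "hello");
    puts(buf);
    free(buf);
    return EXIT_SUCCESS;
}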

The general rule of thumb I've seen is that runtime errors should always be checked, but programmer errors don't have to be checked every time. Let's say some idiot programmer directly called graph_get_current_column_color(0). It will segfault the first time it's called, but once you fix it, the fix is compiled in permanently. No need to check every single time it's run.

Sometimes, especially in third-party libraries, you'll see an assert to check for programmer errors instead of an if statement. That allows you to compile the checks in during development and leave them out of production code. I've also occasionally seen gratuitous checks where the source of the potential programmer error is far removed from the symptom; a sketch of the assert pattern follows.
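
A sketch of that assert pattern applied to the earlier snippet, again reusing git's identifiers for illustration; assert compiles to nothing when NDEBUG is defined, so the checks cost nothing in a release build:

#include <assert.h>

static unsigned short graph_get_current_column_color(const struct git_graph *graph)
{
    /* Precondition checks: in a debug build a NULL argument aborts
     * immediately with a file/line message; built with -DNDEBUG the
     * asserts disappear entirely. */
    assert(graph != NULL);
    assert(graph->revs != NULL);
    if (!want_color(graph->revs->diffopt.use_color))
        return column_colors_max;
    return graph->default_column_color;
}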

Obviously, you can always find someone more pedantic, but most C programmers I know favor less cluttered code over code that is marginally safer. And "safer" is a subjective term. A blatant segfault during development is preferable to a subtle corruption error in the field.