Why would you ever use `malloc(0)`?


While reading an answer here, I saw this code:

char ** v = malloc(0);
while ((r = strtok(s, " ")) != NULL) {
    char ** vv = realloc(v, (n+1)*sizeof(*vv));

The thing that bugged me was the call to malloc with an argument of zero. According to the standard, this returns either NULL or a non-NULL pointer that must not be dereferenced but can be successfully passed to free. I know this does not cause any problems (unless you do things like if (v == NULL)), but is there any practical reason whatsoever to prefer malloc(0) over NULL?

I saw the argument "to indicate the goal of that pointer is to be given to realloc later". To me that sounds like a pretty strange argument. I cannot see the value of that convention at all. First, because it's an extra function call that's not needed. Second, because the value of signaling that you will use realloc later seems almost zero. And according to the answers on this question, there do not seem to be any technical benefits whatsoever.

Personally, if I ever felt the need to tell that realloc would be used later I'd do this:

char **v = NULL; // Will be realloced later

or give it a name that makes that intention clear. I would not use a strange unmotivated function call. But IMHO, just initializing it to NULL is a very clear indication that SOMETHING will be done to it later on. I don't see the value of knowing in advance that it's realloc. What's next? A convention saying that malloc(0*0) indicates that strdup will be used later?
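To make the NULL-initialized alternative concrete, here is a minimal sketch. It relies on the fact that realloc with a NULL pointer behaves like malloc for the requested size. The helper name split_words and its signature are assumptions made for illustration; they are not from the code quoted above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `s` in place on spaces and returns a
 * realloc-grown array of token pointers. Returns NULL if there are no
 * tokens or an allocation fails. The token count is stored in *count. */
char **split_words(char *s, size_t *count)
{
    char **v = NULL;   /* will be grown with realloc */
    size_t n = 0;

    for (char *tok = strtok(s, " "); tok != NULL; tok = strtok(NULL, " ")) {
        /* On the first pass v is NULL, so this call acts like malloc. */
        char **vv = realloc(v, (n + 1) * sizeof *vv);
        if (vv == NULL) {
            free(v);   /* free(NULL) is a no-op, so this is always safe */
            *count = 0;
            return NULL;
        }
        v = vv;
        v[n++] = tok;
    }

    *count = n;
    return v;
}
```

The caller frees the returned array with free; the tokens themselves point into the original string and are not freed separately.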

So to sum it up, these are the cons that I know of:

An extra unnecessary function call
Looks weird if you don't know that it indicates later realloc (and still looks weird to me anyway)
May return a valid pointer that should not be dereferenced (just strange)
May allocate memory that you cannot use (quite pointless)
Less predictable. You may get NULL. You may get something else.

Pros:

The only sensible explanation I can think of for where this habit may come from is that it dates from very early C, before NULL became a part of stddef.h, when calling malloc(0) was the only portable way to get a pointer that was guaranteed to be safe to pass to free without allocating anything. Could that be the case?

So is this really an accepted convention for indicating a later realloc? If so, is it a good convention? Does it have any benefits that I fail to see?

There is a related question on SO: What's the point of malloc(0)?

Clarification:

I'm not talking about malloc(n) where n happens to be zero in some cases. I'm talking about calling malloc(0) on purpose.

Best Answer

In my opinion, that is a horrible paradigm.

I see absolutely no pros and at least three substantial cons.

Needless code complexity

Since malloc(0) can return NULL, the code has to be written to handle that anyway.

And since malloc(0) can also produce a non-NULL result, the code also has to be written in a way to handle a non-NULL pointer.
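As a rough illustration of that extra handling, the sketch below wraps malloc in a hypothetical helper (alloc_or_flag is an invented name). It assumes the caller wants a failure flag, and it has to special-case a zero-size request because a NULL result there cannot be told apart from an ordinary "empty" answer.

```c
#include <stdlib.h>

/* Hypothetical wrapper: allocates n bytes and reports failure via *failed. */
void *alloc_or_flag(size_t n, int *failed)
{
    void *p = malloc(n);

    /* For n == 0 a NULL result may simply be the implementation's
     * answer to a zero-size request, so it is not treated as an error. */
    *failed = (p == NULL && n != 0);

    /* When n == 0, p may be NULL or non-NULL; either way it must not
     * be dereferenced, only passed to free or realloc. */
    return p;
}
```

Without the n != 0 test, the usual "NULL means out of memory" check would misfire on implementations that return NULL for zero-size requests.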

Pointer state loses all meaning

By potentially producing a pointer that cannot be dereferenced, malloc(0) removes a critical distinction between NULL and non-NULL pointers: the distinction where NULL pointers mean "there's nothing here" and non-NULL pointers mean "here's some actual valid data".

The NULL/non-NULL state of a pointer loses all information.

Using malloc(0) renders the almost universal use of code such as if (ptr) ... or if (ptr != NULL) ... useless by removing information from the state of a pointer simply being non-NULL. This simple code

if ( ptr )
{
    ...

would have to be

if ( ptr && pointerActuallyPointsToActualObject )
{
    ...

And now there are two values, the pointer and its "validity flag", that have to be kept in sync and passed around.

Code such as

Foo *dataPtr = getNewFoo();

would no longer work should the prospective new Foo * being returned from the function be initialized with malloc(0), because NULL would no longer be the only return value that means "no new Foo for you!".
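For contrast, here is a minimal sketch of the conventional contract, where NULL is the only "nothing here" value. The Foo layout, the int parameter to getNewFoo, and the fooValueOr helper are all assumptions made for illustration.

```c
#include <stdlib.h>

typedef struct Foo {
    int value;
} Foo;

/* Hypothetical factory: returns a fully initialized Foo, or NULL when
 * there is nothing to hand out (negative input or failed allocation). */
Foo *getNewFoo(int value)
{
    if (value < 0)
        return NULL;              /* "no new Foo for you!" */

    Foo *f = malloc(sizeof *f);
    if (f != NULL)
        f->value = value;
    return f;
}

/* Because non-NULL always means "valid object", one check is enough. */
int fooValueOr(const Foo *f, int fallback)
{
    return f ? f->value : fallback;
}
```

If getNewFoo could also return a malloc(0) result, fooValueOr would need a second piece of state to know whether f->value is safe to read.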

Substantially Increased Potential for Heisenbugs

Any non-NULL pointer that cannot be safely dereferenced creates serious potential Heisenbugs.

In general, any erroneous dereference of a NULL pointer results in an immediate failure where the cause is obvious. Dereferencing a non-NULL pointer that cannot be safely dereferenced is extremely likely to result in corrupt data and/or a corrupt heap, laying a land mine or twelve that will cause later failures in what can be totally unrelated code.

Your code will have bugs. There's nothing but downside in using a code construct that makes those bugs more likely to occur along with making them harder to find when they do occur.

Related Solutions

Correct For Loop Design

You forgot to mention whether your string variables in Felix start at index 0 or 1. Having to search the web for that is extra work for readers, and it affects the way your example is evaluated.

Anyway, are you sure that this loop:

for(i=0; predicate(i); increment(i))

which you describe for C as "The predicate is tested after the increment, but the terminating increment is not universally valid!", translates to this:

i=0
continue:
  body
  increment(i)
  if not predicate(i) goto break
  goto continue
break:

Instead of this:

i=0
continue:
  if not predicate(i) goto break
  body
  increment(i)
  goto continue
break:
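For C specifically, the second, test-first form is the one that matches a for loop: the predicate is checked before the first execution of the body. A small sketch (the function names are invented for illustration) that writes the same loop both ways:

```c
/* Sum 0..limit-1 with an ordinary for loop. */
int sum_with_for(int limit)
{
    int sum = 0;
    for (int i = 0; i < limit; i++)
        sum += i;
    return sum;
}

/* The same loop written out with labels and goto, test first. */
int sum_with_goto(int limit)
{
    int sum = 0;
    int i = 0;                        /* initialization runs once */
loop_top:
    if (!(i < limit)) goto loop_end;  /* predicate checked before the body */
    sum += i;                         /* body */
    i++;                              /* increment */
    goto loop_top;
loop_end:
    return sum;
}
```

With a limit of 0 neither version runs the body, which is exactly where a body-first translation would differ.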

Since your for loop is more specific, like Pascal's, you may want to consider how it should be translated and evaluated when the final index value is equal to or less than the initial value.

Usually, if the initial value and the final value are the same, the loop is executed once; if the final value is less than the initial value, the loop is not executed.

C Programming – When to Check Pointers for NULL

Invalid null pointers can either be caused by programmer error or by runtime error. Runtime errors are something a programmer can't fix, like a malloc failing due to low memory or the network dropping a packet or the user entering something stupid. Programmer errors are caused by a programmer using the function incorrectly.

The general rule of thumb I've seen is that runtime errors should always be checked, but programmer errors don't have to be checked every time. Let's say some idiot programmer directly called graph_get_current_column_color(0). It will segfault the first time it's called, but once you fix it, the fix is compiled in permanently. No need to check every single time it's run.

Sometimes, especially in third party libraries, you'll see an assert to check for the programmer errors instead of an if statement. That allows you to compile in the checks during development, and leave them out in production code. I've also occasionally seen gratuitous checks where the source of the potential programmer error is far removed from the symptom.
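A minimal sketch of that split, assuming a made-up struct graph and helper names (they are stand-ins, not the real API behind graph_get_current_column_color): the programmer error is guarded by an assert that disappears when NDEBUG is defined, while the runtime error is always checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a graph type. */
struct graph {
    int current_column_color;
};

/* Passing NULL is a programmer error: checked with assert, which is
 * compiled out when NDEBUG is defined. */
int get_column_color(const struct graph *g)
{
    assert(g != NULL);
    return g->current_column_color;
}

/* Allocation failure is a runtime error: always checked and reported
 * to the caller as NULL. */
struct graph *graph_create(int color)
{
    struct graph *g = malloc(sizeof *g);
    if (g == NULL)
        return NULL;
    g->current_column_color = color;
    return g;
}
```

Building with -DNDEBUG removes the assert from get_column_color but leaves the malloc check in graph_create untouched.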

Obviously, you can always find someone more pedantic, but most C programmers I know favor less cluttered code over code that is marginally safer. And "safer" is a subjective term. A blatant segfault during development is preferable to a subtle corruption error in the field.
