While reading an answer here, I saw this code:
char ** v = malloc(0);
while ((r = strtok(s, " ")) != NULL) {
char ** vv = realloc(v, (n+1)*sizeof(*vv));
The thing that bugged me was the call to malloc with an argument of zero. According to the standard, this returns either NULL or a non-NULL pointer that can be successfully passed to free. I know this does not cause any problems (unless you do something like if (v == NULL) or similar), but is there any practical reason whatsoever to prefer malloc(0) over NULL?
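To make the behaviour in question concrete, here is a minimal sketch, assuming a hosted C implementation. The helper name zero_alloc_is_nonnull is hypothetical; it only reports which of the two permitted results malloc(0) produced and then releases it.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc(0) produced a non-NULL pointer
   and 0 if it produced NULL. Which one you get is implementation-defined. */
int zero_alloc_is_nonnull(void)
{
    void *p = malloc(0);
    int nonnull = (p != NULL);
    /* p must not be dereferenced in either case */
    free(p);    /* free(NULL) is a no-op, so this is safe either way */
    return nonnull;
}
```

Portable code cannot rely on either answer, which is why a check like if (v == NULL) right after malloc(0) says nothing about whether the allocation failed.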
I saw the argument "to indicate the goal of that pointer is to be given to realloc later". To me that sounds like a pretty strange argument. I cannot see the value of that convention at all. First, because it's an extra function call that's not needed. And second, because the value of signalling that you will use realloc later seems almost zero. And according to the answers on this question, there do not seem to be any technical benefits whatsoever.
Personally, if I ever felt the need to signal that realloc would be used later, I'd do this:
char **v = NULL; // Will be realloced later
or give it a name that makes that intention clear. I would not use a strange unmotivated function call. But IMHO, just initializing it to NULL is a very clear indication that SOMETHING will be done to it later on. I don't see the value of knowing in advance that it's realloc. What's next? A convention saying that malloc(0*0) indicates that strdup will be used later?
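As a sketch of the NULL-initialized alternative, the function below grows a vector of token pointers with realloc, starting from NULL. It relies on the standard guarantee that realloc(NULL, size) behaves like malloc(size). The name split_words and its interface are hypothetical, not taken from the answer that prompted this question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `s` in place on spaces and return a
   realloc-grown array of token pointers; *count receives its length.
   Returns NULL when there are no tokens or when allocation fails. */
char **split_words(char *s, size_t *count)
{
    char **v = NULL;    /* nothing allocated yet */
    size_t n = 0;

    for (char *tok = strtok(s, " "); tok != NULL; tok = strtok(NULL, " ")) {
        /* On the first pass v is NULL, so realloc acts like malloc. */
        char **vv = realloc(v, (n + 1) * sizeof *vv);
        if (vv == NULL) {
            free(v);    /* the old block is untouched when realloc fails */
            *count = 0;
            return NULL;
        }
        v = vv;
        v[n++] = tok;
    }
    *count = n;
    return v;
}
```

The caller owns the returned array and releases it with free. The tokens themselves point into the caller's buffer, so they need no separate cleanup.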
So to sum it up, these are the cons that I know of:
- An extra unnecessary function call
- Looks weird if you don't know that it indicates later realloc (and still looks weird to me anyway)
- May return a valid pointer that should not be dereferenced (just strange)
- May allocate memory that you cannot use (quite pointless)
- Less predictable. You may get NULL. You may get something else.
Pros:
- ?
The only sensible explanation I can think of for where this habit may come from is that it is something from very early C, before NULL became a part of stddef.h, when calling malloc(0) was the only portable way to get a pointer that was guaranteed to be safe to pass to free without allocating anything. Could that be the case?
So is this really an accepted convention for indicating a later realloc? If so, is it a good convention? Does it have any benefits that I fail to see?
There is a related question on SO: What's the point of malloc(0)?
Clarification:
I'm not talking about malloc(n) where n happens to be zero in some cases. I'm talking about calling malloc(0) on purpose.
Best Answer
In my opinion, that is a horrible paradigm.
I see absolutely no pros and at least three substantial cons.
Needless code complexity
Since malloc(0) can return NULL, the code has to be written to handle that anyway. And since malloc(0) can also produce a non-NULL result, the code also has to be written in a way to handle a non-NULL pointer.
Pointer state loses all meaning
By potentially producing a pointer that can not be dereferenced, malloc(0) removes a critical distinction between NULL and non-NULL pointers: the distinction where NULL pointers mean "there's nothing here" and non-NULL pointers mean "here's some actual valid data".
The NULL/non-NULL state of a pointer loses all information.
Using malloc(0) renders the almost universal use of code such as if (ptr) ... or if (ptr != NULL) ... useless by removing information from the state of a pointer simply being non-NULL. A simple test of the pointer alone would have to become a test of the pointer plus a separate indication that it is actually valid.
And now there are two values, the pointer and its "validity flag", that have to be kept in sync and passed around.
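The contrast can be sketched roughly as follows. This is an illustration, not the answer's original code: the names name_length, maybe_name and maybe_name_length are hypothetical, and the flag-carrying struct is just one possible way of tracking validity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Conventional style: NULL means "nothing here", so one test suffices. */
size_t name_length(const char *name)
{
    if (name)    /* non-NULL implies readable data */
        return strlen(name);
    return 0;
}

/* Hypothetical style forced by malloc(0)-initialized pointers: a non-NULL
   pointer may still be unusable, so a flag has to travel with it. */
struct maybe_name {
    const char *ptr;
    bool valid;    /* must be kept in sync with ptr by hand */
};

size_t maybe_name_length(struct maybe_name n)
{
    if (n.ptr && n.valid)
        return strlen(n.ptr);
    return 0;
}
```

Every function that touches the second form has to update both fields together, which is exactly the bookkeeping that a plain NULL check avoids.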
Code that returns a prospective new Foo * from a function and treats a NULL result as failure would no longer work should that pointer be initialized with malloc(0), because a non-NULL pointer would no longer mean a new Foo was actually created, and "no new Foo for you!" would no longer reliably show up as NULL.
Substantially Increased Potential for Heisenbugs
Any non-NULL pointer that can not be safely dereferenced creates serious potential Heisenbugs.
In general, any erroneous dereference of a NULL pointer results in an immediate failure where the cause is obvious. Dereferencing a non-NULL pointer that can not be safely dereferenced is extremely likely to result in corrupt data and/or a corrupt heap, laying a land mine or twelve that will cause later failures in what can be totally unrelated code.
Your code will have bugs. There's nothing but downside in using a code construct that makes those bugs more likely to occur along with making them harder to find when they do occur.