When was block scope for variables introduced to C, and why is it still frowned upon

cmemory usagescope

In light of the recent OBJ_obj2txt vulnerability in LibreSSL (which was found during the OpenSMTPD audit, and does not affect OpenSSL), it came to my attention that the memory leak issue likely resulted from some earlier code refactoring, where the block scoped variable char *bndec was moved out to be function scoped instead.

I know first-hand that there is this great resistance to block scoped variables within old-school projects like OpenBSD, but what other justification would there be to move the char *bndec declaration?

More broadly, when was block scoping introduced for variables in C? All I could find is that it was already part of C89. Is that where it started, or was it also part of an earlier spec?

Best Answer

TL;DR: It's not an issue of block scoping, but rather a misunderstanding of how pointers should be handled.

Courtesy of the vulnerability listing that you posted, let's take a look at the code in question.

489 int
490 OBJ_obj2txt(char *buf, int buf_len, const ASN1_OBJECT *a, int no_name)
491 {
...
494         char *bndec = NULL;
...
516         len = a->length;
...
519         while (len > 0) {
...
570                         bndec = BN_bn2dec(bl);
571                         if (!bndec)
572                                 goto err;
573                         i = snprintf(buf, buf_len, ".%s", bndec);
...
598         }
...
601         free(bndec);
...
609 }

And if we delve into the declaration of BN_bn2dec we see a comment of:

/* Must 'free' the returned data */

What you see as a block scoping issue, I see as a poor understanding of pointer operations as the root issue.

This diff revision shows when the *bndec was moved from inside the while loop to outside the while loop. The error was also introduced at the same time when the free(bndec) statement was moved outside the loop.

Where *bndec is declared for this function is a bit irrelevant. It can safely be defined either outside or inside the while loop. And if memory serves correctly, the actual declaration will be moved to the top of the function by the compiler anyway.

Declaring *bndec inside the loop merely would have forced an error if the free() call remained outside of the loop. It remains to be seen whether that error would have been sufficient to alert the programmer to the fallacy behind their refactoring logic.

Being a bit of a pessimist when it comes to most programmers and pointers, I'm going to posit that the free accessing an out-of-scope variable error wouldn't have helped them discover the error of their ways. Instead, they would have taken the naive (and incorrect) solution of resolving the error by merely moving the variable declaration outside of the block. And that's why I disagree with the contention that the root cause of this is a block scoping issue.

If C happened to have garbage collection, then block scoping would have been of more value in this case. But it doesn't, and that line of thought becomes pointless speculation.

Unfortunately, we don't really know why the refactoring programmer chose to move that free(bndec) statement. All that they left via check-in comments was this:

Clean up some of the nightmare of string and pointer arithmatic in this nasty function.
This gets rid of the nasty tmp variables used to hold temporary strings and the DECIMAL_SIZE hack. it gets rid of the rather pointless null checks for buf (since the original code dereferences it before checking). It also gets rid of the insane possibility this could return -1 when stuff is using the return values to compute lengths All the failure cases now return 0 and an empty string like the first error case in the original code.

And while I respect their dedication to clean code, I'm not certain I fully agree with their removing of various checks from the code. Having written a fair amount of C code with a fairly large team of developers, my experience is that defensive coding practices are almost always warranted. And while a developer may not be able to justify the time expense to add those defensive practices, I can't say that I have ever found valid grounds to remove those defenses.

To delve into an equal amount of hyperbole as the refactoring programmer, I find their check-in comment to smack a little bit too much of hubris and not enough of humility. That hubris is likely what led them to assume they "knew what they were doing" ^(TM) in moving the free(bndec) statement, and they didn't mentally walk through what that was going to do to the code.

This error was introduced with the refactoring not because of the block scoping, but because of the failure to understand how pointers operate and the failure to understand how pointers relate to memory.

While I think I have established why block scoping is irrelevant to this issue, I want to address what you see as "great resistance to block scoped variables within old-school projects". And to my knowledge, ANSI C has always had block scoping available for variables. Going out on a limb, I suspect K&R C also had it.

Keep in mind that C is not a gentle language for the naive or the initiate. It was created in order to provide an additional layer of abstraction over B and assembly level programming. Programmers writing in C were still expected to have an understanding of the underlying assembler that was being generated. For example, variables are not automatically initialized, and many a programmer has been burned by that lack of initialization. C expects the programmer to understand what that variable declaration maps to in memory. C also has more than its fair share of unwritten style rules that are considered to be "good form."

Moving all variable declarations is one of those unwritten style rules, and the pragmatic reason goes back to the lack of automatic initialization. Disciplined C programmers declared all of their variables at the beginning of the function so they could make a quick visual scan to verify that all declared variables had been initialized. It's an easy way to make sure that a common mistake (e.g. lack of initialization) has been avoided.

Best Answer

Related Solutions

C Syntax – Design Choices for Arrays, Pointers, and Functions

Are Block-Scope Variables in C/C++ Stacked Only if the Block is Executed?

Related Topic