Variables Scope – Should I Declare Variables at the Top of the Function for Reasons Other Than the Scope Rules?

scopevariables

In JavaScript, one should declare all variables at the beginning of the function to mitigate the risk of mistakes related to the fact that the scope of variables is a function. The following code illustrates this non-intuitive scope by actually printing "Hello, World" to the console:

var demo = function () {
    if (false) {
        var hello = "Hello, World";
    }

    console.log(hello);
};

Similarly, in VB6 (but not in VB.NET), the variables have a scope of a function/sub, which also means that they would better be declared at the top of the function/sub block.

C, C++, Java, C#, Python or PHP don't have any reason to move the declaration of variables to the top of the method/function: instead, variables are preferably declared close to the location where they are used.

Are there languages where it is better to declare the variables at the top of the function-type block (instead of declaring them as close as possible to the location where they are used) for a reason other than function-level scoping?

In other words, is the non-intuitive scoping similar to the one used by VB6 and JavaScript the only reason to move the declaration of variables at the top of a function?

Best Answer

Each computer language has its strengths and pitfalls. Each has its own coding standards/practices often accepted, built and dictated by the community around it. Because programming is a precise art I would say, unless there is a good reason to do something, you shouldn't.

Now there are very good reasons to declare variables close to where they are used:

Makes it easy to determine the type, when reading code.
Makes it easy to delete code including the declaration all together.

But as you suggest, for some languages there are better reasons to not do that, for sake of clarity or scope. Another reason could be to reserve memory or registers early on, again it depends on the language you are using. A new language may come along with non-C like semantics and may require a different approach to where variables get declared.

I would also always recommend following the coding standards established by the community for your language of choice.

Related Solutions

The reason for using lowercase for the first word in a local variable (eg, employeeCount, firstName)

That naming convention is often used when people want to be able to give a variable the same name as its type. For example:

Employee employee;

Some languages even enforce that capitalization. This prevents having to use annoying variable names like MyEmployee, CurrentEmployee, EmployeeVar, etc. You can always tell if something is a type or a variable, just from the capitalization. That prevents confusion in situations like:

employee.function(); // instance method
Employee.function(); // static method

Also, in English, nouns are not generally capitalized, so you can't really claim your capitalization is "proper."

So what does that have to do with your situation? You obviously have no trouble reading your own code, but by keeping things as consistent as possible, you reduce the mental workload of anyone else needing to read your code. In coding as in writing, you adjust your style to match the readers.

When was block scope for variables introduced to C, and why is it still frowned upon

TL;DR: It's not an issue of block scoping, but rather a misunderstanding of how pointers should be handled.

Courtesy of the vulnerability listing that you posted, let's take a look at the code in question.

489 int
490 OBJ_obj2txt(char *buf, int buf_len, const ASN1_OBJECT *a, int no_name)
491 {
...
494         char *bndec = NULL;
...
516         len = a->length;
...
519         while (len > 0) {
...
570                         bndec = BN_bn2dec(bl);
571                         if (!bndec)
572                                 goto err;
573                         i = snprintf(buf, buf_len, ".%s", bndec);
...
598         }
...
601         free(bndec);
...
609 }

And if we delve into the declaration of BN_bn2dec we see a comment of:

/* Must 'free' the returned data */

What you see as a block scoping issue, I see as a poor understanding of pointer operations as the root issue.

This diff revision shows when the *bndec was moved from inside the while loop to outside the while loop. The error was also introduced at the same time when the free(bndec) statement was moved outside the loop.

Where *bndec is declared for this function is a bit irrelevant. It can safely be defined either outside or inside the while loop. And if memory serves correctly, the actual declaration will be moved to the top of the function by the compiler anyway.

Declaring *bndec inside the loop merely would have forced an error if the free() call remained outside of the loop. It remains to be seen whether that error would have been sufficient to alert the programmer to the fallacy behind their refactoring logic.

Being a bit of a pessimist when it comes to most programmers and pointers, I'm going to posit that the free accessing an out-of-scope variable error wouldn't have helped them discover the error of their ways. Instead, they would have taken the naive (and incorrect) solution of resolving the error by merely moving the variable declaration outside of the block. And that's why I disagree with the contention that the root cause of this is a block scoping issue.

If C happened to have garbage collection, then block scoping would have been of more value in this case. But it doesn't, and that line of thought becomes pointless speculation.

Unfortunately, we don't really know why the refactoring programmer chose to move that free(bndec) statement. All that they left via check-in comments was this:

Clean up some of the nightmare of string and pointer arithmatic in this nasty function.
This gets rid of the nasty tmp variables used to hold temporary strings and the DECIMAL_SIZE hack. it gets rid of the rather pointless null checks for buf (since the original code dereferences it before checking). It also gets rid of the insane possibility this could return -1 when stuff is using the return values to compute lengths All the failure cases now return 0 and an empty string like the first error case in the original code.

And while I respect their dedication to clean code, I'm not certain I fully agree with their removing of various checks from the code. Having written a fair amount of C code with a fairly large team of developers, my experience is that defensive coding practices are almost always warranted. And while a developer may not be able to justify the time expense to add those defensive practices, I can't say that I have ever found valid grounds to remove those defenses.

To delve into an equal amount of hyperbole as the refactoring programmer, I find their check-in comment to smack a little bit too much of hubris and not enough of humility. That hubris is likely what led them to assume they "knew what they were doing" ^(TM) in moving the free(bndec) statement, and they didn't mentally walk through what that was going to do to the code.

This error was introduced with the refactoring not because of the block scoping, but because of the failure to understand how pointers operate and the failure to understand how pointers relate to memory.

While I think I have established why block scoping is irrelevant to this issue, I want to address what you see as "great resistance to block scoped variables within old-school projects". And to my knowledge, ANSI C has always had block scoping available for variables. Going out on a limb, I suspect K&R C also had it.

Keep in mind that C is not a gentle language for the naive or the initiate. It was created in order to provide an additional layer of abstraction over B and assembly level programming. Programmers writing in C were still expected to have an understanding of the underlying assembler that was being generated. For example, variables are not automatically initialized, and many a programmer has been burned by that lack of initialization. C expects the programmer to understand what that variable declaration maps to in memory. C also has more than its fair share of unwritten style rules that are considered to be "good form."

Moving all variable declarations is one of those unwritten style rules, and the pragmatic reason goes back to the lack of automatic initialization. Disciplined C programmers declared all of their variables at the beginning of the function so they could make a quick visual scan to verify that all declared variables had been initialized. It's an easy way to make sure that a common mistake (e.g. lack of initialization) has been avoided.

Best Answer

Related Solutions

The reason for using lowercase for the first word in a local variable (eg, employeeCount, firstName)

When was block scope for variables introduced to C, and why is it still frowned upon

Related Topic