While case folding is fairly trivial in English, it's much less so in some other languages.
If a German programmer uses ß
in a variable name, what are you going to consider the upper-case equivalent? Just FYI, "ß" is only ever used in lower case. OTOH, "ss" is equivalent -- would you consider a compiler obliged to match them? When you get into Unicode, you get even more interesting problems, such as characters with pre-composed diacritical marks versus separate combining diacriticals. Then you get to some Arabic scripts, with three separate forms of many letters, rather than just two.
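To make the problem concrete, here is a minimal sketch in C, assuming a UTF-8 execution character set (the names and strings are invented purely for illustration):

```c
/* A minimal sketch of the Unicode problem, assuming a UTF-8 execution
 * character set.  "é" can be written as one precomposed code point
 * (U+00E9) or as "e" plus a combining acute accent (U+0301); the byte
 * sequences differ, so a naive byte-wise comparison sees two visually
 * identical identifiers as different. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *precomposed = "caf\u00E9";   /* c a f é             */
    const char *combining   = "cafe\u0301";  /* c a f e + combining */

    printf("same bytes? %s\n",
           strcmp(precomposed, combining) == 0 ? "yes" : "no");  /* no */

    /* Case folding is no better: the upper-case form of "ß" is "SS",
     * a one-to-two mapping that a per-character API such as towupper()
     * cannot express at all. */
    return 0;
}
```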
In the dark ages most programming languages were case-insensitive almost out of necessity. For example, Pascal started out on Control Data mainframes, which used only six bits per character (64 codes in total). Most such machines used the "CDC Scientific" character set, which contained only upper-case characters. You could switch to other character sets, but most of those had either upper-case or lower-case letters, not both, and used the same codes for whichever they had. The same was true of the ancient Baudot codes and the like considered standard in the early days of COBOL, FORTRAN, BASIC, etc. By the time more capable hardware was widely available, case-insensitivity was so thoroughly ingrained in those languages that changing it was impossible.
Over time, the real difficulty of case-insensitivity has become more apparent, and language designers have mostly decided ("realized" would probably be a more accurate term) that when/if people really want case insensitivity, it's better handled by ancillary tools than in the language itself.
At least IMO, a compiler should take its input exactly as presented, not decide that "you wrote this, but I'm going to assume you really meant something else." If you want translations to happen, you're better off doing them separately, with tools built to handle that well.
In English the semicolon is used to separate items in a list, particularly when the items themselves contain commas, for example:
She saw three men: Jamie, who came from New Zealand; John, the
milkman's son; and George, a gaunt kind of man.
When programming you are separating a number of statements, and using a full stop could easily be confused with a decimal point. Using the semicolon provides an easy-to-parse method of separating the individual program statements while remaining close to normal English punctuation.
Edit to add
In the early days, when memory was expensive, processing slow, and the first programming languages were being devised, there was a need to split the program up into separate statements for processing. Some languages required that each statement be placed on its own line so that the carriage return could act as the statement delimiter. Other languages allowed a more free-format text layout and so required a specific delimiter character. This character was chosen to be the semicolon, most likely because of the similarity to its use in English (this has to be a supposition; I was not there at the time) and because it did not conflict with the other punctuation marks and symbols required for mathematical or other syntactic purposes.
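As a rough illustration (not how any real compiler was written), a dedicated terminator makes the splitting step a scan for a single character, and a full stop used as a decimal point never interferes:

```c
/* A rough sketch, not any real compiler's code: with ';' as the
 * statement delimiter, splitting free-format source text into
 * statements is a single scan for one character, and the '.' in
 * "3.14" never gets in the way.  (The input is an invented
 * Pascal-ish fragment, purely for illustration.) */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char source[] = "x := 3.14; y := x * 2; write(y)";
    int n = 0;

    for (char *stmt = strtok(source, ";"); stmt != NULL;
         stmt = strtok(NULL, ";"))
        printf("statement %d:%s\n", ++n, stmt);

    return 0;
}
```

A real lexer would also have to skip a ';' inside string literals and comments, but the job stays simple because exactly one character ends a statement.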
Edit again
The need for some terminator character goes back to the requirements for parsing the language text. The early compilers were written in assembly language or, in some cases, directly in hand-crafted binary machine instructions. Having a special character that identified the end of the statement and delimited the chunk of text being processed made the processing that much easier. As I said above, other languages have used the carriage return or brackets. The Algol, Pascal, Ada, BCPL, B, C, PL/M, and other families of languages happen to use the semicolon. As to which one was first to use this particular character, I do not go back far enough in history to remember. Its choice and adoption make perfect sense, as:
- Its use mirrors the use in normal English punctuation.
- Other characters (e.g. the full stop) could be confusing as they already have a common use (a full stop is also used as a decimal point).
- A visible punctuation character allows free format code layout.
- Using a similar delimiter character in derivative or later languages builds upon the familiarity gained by all of the programmers that have used the earlier language.
As a final remark, I think that there has been more time spent on these answers and comments than was spent in deciding to use the semicolon to end a statement when designing the first language that used it in this way.
Two of the major influences to C were the Algol family of languages (Algol 60 and Algol 68) and BCPL (from which C takes its name).
From http://www.princeton.edu/~achaney/tmve/wiki100k/docs/BCPL.html
From http://progopedia.com/language/bcpl/
Within BCPL, one often sees curly braces, but not always; this was a limitation of the keyboards of the time. The characters $( and $) were lexically equivalent to { and }. Digraphs and trigraphs were maintained in C (though with a different set for curly-brace replacement: ??< and ??>). The use of curly braces was further refined in B (which preceded C).
From Users' Reference to B by Ken Thompson:
There are indications that curly braces were used as shorthand for begin and end within Algol.
From http://www.bobbemer.com/BRACES.HTM
The use of square brackets (as a suggested replacement in the question) goes back even further. As mentioned, the Algol family influenced C. Within Algol 60 and 68 (C was written in 1972 and BCPL in 1966), the square bracket was used to designate an index into an array or matrix.
As programmers were already familiar with square brackets for arrays in Algol and BCPL, and curly braces for blocks in BCPL, there was little need or desire to change this when making another language.
The updated question includes an addendum about productivity and curly-brace usage, and mentions Python. There are some resources that study this, though the answer boils down to "it's anecdotal, and what you are used to is what you are most productive with." Because of the widely varying skills in programming and familiarity with different languages, these effects are difficult to account for.
See also: Stack Overflow Are there statistical studies that indicates that Python is “more productive”?
Much of the gain would depend on the IDE (or lack of one) being used. In vi-based editors, putting the cursor over one of a matching pair of delimiters and pressing % moves the cursor to the other. This was very efficient with C-based languages back in the old days; less so now.
A better comparison would be between {} and begin/end, which were the options of the day (horizontal space was precious). Many Wirth languages were based on a begin and end style (Algol (mentioned above), Pascal (which many are familiar with), and the Modula family). I have difficulty finding anything that isolates this specific language feature; the best I can do is show that the curly-brace languages are much more popular than begin/end languages and that this is a common construct. As mentioned in the Bob Bemer link above, the curly brace was used as shorthand to make programming easier.
From Why Pascal is Not My Favorite Programming Language
Which is about all that can be said: it's familiarity and preference.