There are many more aspects one should consider when settling on an assignment/declaration syntax than simple `=` vs. `:=` bikeshedding.
Type inference or not, you will want a syntax for explicit type annotations. In some type systems, inference may not be possible without occasional explicit annotations. There are two possible classes of syntax for this:
- A type-variable statement without further operators implies a declaration, e.g. `int i` in C. Some languages use postfix types like `i int` (Go, to a certain degree).
- There is a typing operator, often `:` or `::`. Sometimes, this declares the type of a name: `let i : int = 42` (e.g. OCaml). In an interesting spin on this, Julia allows a programmer to use type assertions for arbitrary expressions, along the lines of `sum = (a + b)::Int`.
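As a small illustration of the typing-operator style, Python's optional annotations use `:` both with and without an initial value. This is only a sketch: the class `Config` and its fields are made-up names, and Python does not enforce these annotations at run time.

```python
from typing import get_type_hints

class Config:
    retries: int = 3   # annotation plus initial value
    name: str          # annotation only: a type is recorded, no value is bound

# Collect the declared types of both names.
hints = get_type_hints(Config)
```

Here `hints` records a type for both names, but only `retries` actually exists as an attribute. That is roughly the distinction between declaring a type and binding a value.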
You may also want to consider an explicit declaration keyword, like `var`, `val` or `let`. The advantage is not primarily that they make parsing and understanding of the code much easier, but that they unambiguously introduce a variable. Why is this important?
If you have closures, you need to precisely declare which scope a variable belongs to. Imagine a language without a declaration keyword, and implicit declaration through assignment (e.g. PHP or Python). Both of these are syntactically challenged with respect to closures, because they either ascribe a variable to the outermost or innermost possible scope. Consider this Python:
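def make_closures():
    x = 42
    def incr():
        x = x + 1  # the assignment makes x a new local of incr()
    def value():
        print(x)   # only reads x, so the enclosing x is found
    return incr, value

i, v = make_closures()
v()  # prints 42
i()  # expected: x becomes 43; behaviour: UnboundLocalError, because x is local to incr()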
Compare with a language that allows explicit declaration:
var make_closures = function() {
    var x = 42,
        incr = function() { x++ },
        value = function() { console.log(x) };
    return { incr: incr, value: value };
};
var closures = make_closures();
closures.value(); // everything works
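For comparison, Python 3 offers a `nonlocal` statement for this situation: it states that an assigned name belongs to an enclosing function's scope, which is a limited form of explicit declaration. A minimal sketch of the Python example rewritten with it (here `value` returns `x` instead of printing it, purely for brevity):

```python
def make_closures():
    x = 42

    def incr():
        nonlocal x   # x refers to the variable in make_closures()
        x = x + 1

    def value():
        return x

    return incr, value

incr, value = make_closures()
incr()
result = value()
```

After one call to `incr()`, `value()` sees the updated variable, because both closures now share the same `x`.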
Explicit declarations allow variable shadowing. While generally a bad practice, it sometimes makes code much easier to follow – no reason to disallow it.
Explicit declarations offer a form of typo detection, because unbound variables are not implicitly declared. Consider:
var1 = 42
if 0 < 1:
    varl = 12
print(var1) # 42
versus:
use strict;
use feature 'say';
my $var1 = 42;
$varl = 12 if 0 < 1; # Doesn't compile: Global symbol "$varl" requires explicit package name
say $var1;
You should also consider whether you would like to (optionally) enforce single-assignment form, e.g. through keywords like `val` (Scala), `let`, or `const`, or by default. In my experience, such code is easier to reason about.
How would a short declaration, e.g. via `:=`, fare on these points?
- Assuming you have typing via a `:` operator and assignment via `=`, then `i : int = 42` could declare a variable, the syntax `i : = 42` would invoke inference of the variable's type, and `i := 42` would be a nice contraction, but not an operator in itself. This avoids problems later on.
- Another rationale is the mathematical syntax for the declaration of new names, `x := expression` or `expression =: x`. However, this is not significantly different from the `=` relation, except that the colon draws attention to one name. Simply using `:=` for similarity to maths is silly (considering the `=` abuse), as is using it for similarity to Pascal.
We can declare some more or less sane characteristics for `:=`, like:
- It declares a new variable in the current scope
- which is re-assignable,
- and performs type inference.
- Re-declaring a variable in the same scope is a compilation error.
- Shadowing is permitted.
But in practice, things get murky. What happens when you have multiple assignments (which you should seriously consider), like
x := 1
x, y := 2, 3
Should this throw an error because `x` is already declared in this scope? Or should it just assign `x` and declare `y`? Go takes the second route, with the result that typo detection is weakened:
var1 := 1
varl, var2 := 2, 3 // oops, var1 is still 1, and now varl was declared
Note that the “RHS of typing-operator is optional” idea from above would disambiguate this, as every new variable would have to be followed by a colon:
x: = 1
x:, y: = 2, 3 # error because x is already declared
x, y: = 2, 3 # OTOH this would have been fine
Should `=` be declaration but `:=` be assignment? Hell no. First, no language I know of does this. Second, when you don't use single-assignment form, then assignment is more common than declaration. Huffman coding of operators requires that the shorter operator be used for the more common operation. But if you don't generally allow reassignment, the `=` is somewhat free to use (depending on whether you use `=` or `==` as comparison operator, and whether you could disambiguate a `=` from context).
Summary
- If assignment and declaration use the same operator, bad things happen: Closures, variable shadowing, and typo detection all get ugly with implicit declarations.
- But if you don't have re-assignments, things clear up again.
- Don't forget that explicit types and variable declarations are somewhat related. Combining their syntax has served many languages well.
- Are you sure you want such little visual distinction between assignment and declaration?
Personal opinion
I am fond of declaration keywords like `val` or `my`. They stand out, making code easier to grok. Explicit declarations are always a good idea for a serious language.
There are two well known syntactic sugar languages for C:
- C++
- Objective C
And two less well known, but more truly syntactic sugar, as they are set on always compiling to C while the above switched to direct assembly generation:
- GObject Builder (GOB)
- Vala
Both C++ and Objective C started as translators to C. They switched to direct assembly generation because the code generation was getting complicated for some features. They were never designed to generate interfaces easily callable from C, but both embed complete C.
The other two, GOB and Vala, are on the other hand designed to generate interfaces easily callable from C. GOB only generates some boilerplate for object-oriented programming while the function bodies are written in plain C; Vala is a completely different language that maps to C.
All those languages provide features, not just simply different syntax. Because frankly, providing just different syntax has absolutely no benefit. It is not the syntax that governs how complicated it is to write anything in a language. It is the structural complexity of expressions and the amount of things the programmer has to keep track of. Saving a few keystrokes won't help. Programmers can generally type quite a bit faster than they think. So a different language can only help by:
- Saving a lot of keystrokes. This is the case of GObject Builder and in part Vala. This is because boilerplate needed to create a GObject is rather large.
- Providing actual features, like automatic reference counting in Vala.
Even then, of the four languages mentioned above, only C++ became universally popular, and only because it added really powerful features. Objective C is only used on a single platform, and GOB and Vala are mainly niche languages for the GNOME project.
Because it is a lot of work to learn a new language. It's not enough if one programmer knows a language. Most projects are the work of large teams these days, and all members of the team have to know the language to use it on a project. And because people leave jobs, project leaders hesitate to use a new language not only until they have enough programmers proficient in it, but also until they are sure they can hire more if needed. So new languages only catch on if they have really important new features. (Java, while being a rather weak language in itself, has a huge advantage in its comprehensive standard library. For productivity the standard library is even more important than features of the language, not to mention syntax.)
Best Answer
The reason to choose one or the other is intent, and, as a result of this, readability.
Intent: the loop should run for as long as `i` is smaller than 10, not for as long as `i` is not equal to 10. Even though the latter may be the same in this particular case, it's not what you mean, so it shouldn't be written like that.
Readability: a result of writing down what you mean is that it's also easier to understand. For example, if you use `i != 10`, someone reading the code may wonder whether inside the loop there is some way `i` could become bigger than 10 and that the loop should continue (btw: it's bad style to mess with the iterator somewhere else than in the head of the `for`-statement, but that doesn't mean people don't do it and as a result maintainers expect it).
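To make the difference in intent concrete, here is a small sketch in Python. The helper `count_iterations` and its `cap` safety limit are hypothetical names introduced only for this illustration. It counts how often the loop body runs under each condition when the step size changes.

```python
def count_iterations(step, use_less_than, limit=10, cap=1000):
    """Count loop iterations; `cap` stops a loop that would otherwise never end."""
    i = 0
    iterations = 0
    while (i < limit if use_less_than else i != limit) and iterations < cap:
        i += step
        iterations += 1
    return iterations

same_lt = count_iterations(1, use_less_than=True)    # step 1, condition i < 10
same_ne = count_iterations(1, use_less_than=False)   # step 1, condition i != 10
skip_lt = count_iterations(3, use_less_than=True)    # step 3, condition i < 10
skip_ne = count_iterations(3, use_less_than=False)   # step 3, i never equals 10
```

With a step of 1 both conditions behave identically. With a step of 3, `i` jumps from 9 to 12, so the `i != 10` version only stops when it hits the safety cap, while `i < 10` still ends as intended.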