Syntax – Variable Declaration vs Assignment in Programming Languages

language-design, language-features, syntax

I'm working on a statically typed language with type inference and streamlined syntax, and need to make a final decision about the syntax for variable declaration versus assignment. Specifically, I'm trying to choose between:

// Option 1. Create new local variable with :=, assign with =
foo := 1
foo = 2

// Option 2. Create new local variable with =, assign with :=
foo = 1
foo := 2

Creating functions will use = regardless:

// Indentation delimits blocks
square x =
    x * x

And assignment to compound objects will do likewise:

sky.color = blue
a[i] = 0

Which of options 1 or 2 would people find most convenient/least surprising/otherwise best?

Best Answer

There are many more aspects to consider when settling on assignment/declaration syntax than simple = vs. := bikeshedding.

Type inference or not, you will want a syntax for explicit type annotations. In some type systems, inference may not be possible without occasional explicit annotations. There are two possible classes of syntax for this:

  1. A type-variable statement without further operators implies a declaration, e.g. int i in C. Some languages put the type after the name instead, like i int (Go does this after the var keyword and in parameter lists).
  2. There is a typing operator, often : or ::. Sometimes, this declares the type of a name: let i : int = 42 (e.g. OCaml). In an interesting spin on this, Julia allows a programmer to use type assertions for arbitrary expressions, along the lines of sum = (a + b):int (Julia itself spells the operator ::, as in (a + b)::Int).

You may also want to consider an explicit declaration keyword, like var, val or let. The advantage is not primarily that they make parsing and understanding of the code much easier, but that they unambiguously introduce a variable. Why is this important?

  • If you have closures, you need to precisely declare which scope a variable belongs to. Imagine a language without a declaration keyword, and implicit declaration through assignment (e.g. PHP or Python). Both of these are syntactically challenged with respect to closures, because they ascribe a variable either to the outermost or to the innermost possible scope. Consider this Python:

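    def make_closures():
      x = 42
      def incr():
        x = x + 1  # the assignment makes x a new local of incr()
      def value():
        print(x)
      return incr, value

    i, v = make_closures()
    v()  # prints 42: value() only reads x, so it sees the enclosing one
    i()  # expected: x becomes 43, behaviour: UnboundLocalError because x is uninitialized in incr()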
    

    Compare with a language that allows explicit declaration:

    var make_closures = function() {
      var x     = 42,
          incr  = function() { x++ },
          value = function() { console.log(x) };
      return { incr: incr, value: value };
    };
    
    var closures = make_closures();
    closures.value();  // everything works
    
  • Explicit declarations allow variable shadowing. While shadowing is generally a bad practice, it sometimes makes code much easier to follow, so there is no reason to disallow it.

  • Explicit declarations offer a form of typo detection, because unbound variables are not implicitly declared. Consider:

    var1 = 42
    if 0 < 1:
       varl = 12
    
    print(var1)  # 42
    

    versus:

    use strict;
    my $var1 = 42;
    $varl = 12 if 0 < 1;  # Doesn't compile: Global symbol "$varl" requires explicit package name
    say $var1;
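
The closure problem from the first bullet can be reproduced, and worked around, in Python itself. The following is a minimal sketch, not part of the original answer: the names make_counter and incr_broken are made up for illustration. It shows that Python eventually needed an explicit scope marker (nonlocal) precisely because assignment doubles as declaration.

```python
def make_counter():
    x = 42

    def incr_broken():
        # The assignment makes x a new local of incr_broken(),
        # so reading x on the right-hand side fails.
        x = x + 1

    def incr():
        nonlocal x  # explicitly bind to the x of make_counter()
        x = x + 1

    def value():
        return x

    return incr_broken, incr, value


incr_broken, incr, value = make_counter()

try:
    incr_broken()
except UnboundLocalError as err:
    print("incr_broken failed:", err)

incr()
print(value())  # 43
```

With a declaration keyword, neither the failure nor the extra nonlocal statement would be needed: the single declaration of x would fix its scope, and every later assignment would refer to it.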
    

You should also consider whether you would like to enforce single-assignment form, either optionally through keywords like val (Scala), let, or const, or by default. In my experience, such code is easier to reason about.

How would a short declaration e.g. via := fare in these points?

  • Assuming you have typing via a : operator and assignment via =, then i : int = 42 could declare a variable, the syntax i : = 42 would invoke inference of the variable's type, and i := 42 would be a nice contraction, but not an operator in itself. This avoids problems later on.
  • Another rationale is the mathematical syntax for the declaration of new names, x := expression or expression =: x. However, this is not significantly different from the = relation, except that the colon draws attention to one name. Simply using := for similarity to maths is silly (considering the = abuse), as is using it for similarity to Pascal.
  • We can declare some more or less sane characteristics for :=, like:

    • It declares a new variable in the current scope
    • which is re-assignable,
    • and performs type inference.
    • Re-declaring a variable in the same scope is a compilation error.
    • Shadowing is permitted.

    But in practice, things get murky. What happens when you have multiple assignments (which you should seriously consider), like

    x := 1
    x, y := 2, 3
    

    Should this throw an error because x is already declared in this scope? Or should it just assign x and declare y? Go takes the second route, with the result that typo detection is weakened:

    var1 := 1
    varl, var2 := 2, 3  // oops, var1 is still 1, and now varl was declared
    

    Note that the “RHS of typing-operator is optional” idea from above would disambiguate this, as every new variable would have to be followed by a colon:

    x: = 1
    x:, y: = 2, 3  # error because x is already declared
    x,  y: = 2, 3  # OTOH this would have been fine
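
To make the two possible rules for multiple := concrete, here is a toy model of a scope in Python. It is only a sketch under stated assumptions: the Scope class, its methods, and the go_style flag are hypothetical names invented for this illustration, and real compilers track much more than a dictionary per scope. The strict rule rejects any redeclaration in the same scope, while the Go-like rule only requires at least one new name on the left-hand side.

```python
class Scope:
    """Toy model of one lexical scope (hypothetical, for illustration only)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, names, values, go_style=False):
        """Model `names := values` in this scope."""
        existing = [n for n in names if n in self.names]
        if go_style:
            # Go-like rule: acceptable as long as at least one name is new.
            if len(existing) == len(names):
                raise NameError("no new variables on left side of :=")
        elif existing:
            # Strict rule: every name must be new in this scope.
            raise NameError("already declared: " + ", ".join(existing))
        self.names.update(zip(names, values))

    def assign(self, name, value):
        """Model `name = value`: search outwards for an existing binding."""
        scope = self
        while scope is not None:
            if name in scope.names:
                scope.names[name] = value
                return
            scope = scope.parent
        raise NameError("undeclared variable: " + name)


strict = Scope()
strict.declare(["x"], [1])
try:
    strict.declare(["x", "y"], [2, 3])
except NameError as err:
    print("strict rule:", err)  # strict rule: already declared: x

go_like = Scope()
go_like.declare(["x"], [1])
go_like.declare(["x", "y"], [2, 3], go_style=True)  # assigns x, declares y
print(go_like.names)  # {'x': 2, 'y': 3}

inner = Scope(parent=go_like)
inner.declare(["x"], [10])  # shadowing: a new x in the inner scope
inner.assign("y", 30)       # plain assignment finds y in the enclosing scope
print(inner.names, go_like.names)  # {'x': 10} {'x': 2, 'y': 30}
```

Under the Go-like rule, a mistyped name on the left-hand side simply counts as the required new variable, which is how the typo in the example above goes unnoticed.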
    

Should = be declaration but := be assignment? Hell no. First, no language I know of does this. Second, when you don't use single-assignment form, assignment is more common than declaration. Huffman-coding the operators suggests that the shorter operator should be used for the more common operation. But if you don't generally allow reassignment, the = is somewhat free to use (depending on whether you use = or == as the comparison operator, and whether you could disambiguate a = from context).

Summary

  • If assignment and declaration use the same operator, bad things happen: Closures, variable shadowing, and typo detection all get ugly with implicit declarations.
  • But if you don't have re-assignments, things clear up again.
  • Don't forget that explicit types and variable declarations are somewhat related. Combining their syntax has served many languages well.
  • Are you sure you want such little visual distinction between assignment and declaration?

Personal opinion

I am fond of declaration keywords like val or my. They stand out, making code easier to grok. Explicit declarations are always a good idea for a serious language.
