How do languages with Maybe types instead of nulls handle edge conditions

null type-systems

Eric Lippert made a very interesting point in his discussion of why C# uses a null rather than a Maybe<T> type:

Consistency of the type system is important; can we always know that a non-nullable reference is never under any circumstances observed to be invalid? What about in the constructor of an object with a non-nullable field of reference type? What about in the finalizer of such an object, where the object is finalized because the code that was supposed to fill in the reference threw an exception? A type system that lies to you about its guarantees is dangerous.

That was a bit of an eye-opener. The concepts involved interest me, and I've done some playing around with compilers and type systems, but I never thought about that scenario. How do languages that have a Maybe type instead of a null handle edge cases such as initialization and error recovery, in which a supposedly guaranteed non-null reference is not, in fact, in a valid state?

Best Answer

That quote points to a problem that occurs if the declaration and assignment of identifiers (here: instance members) are separate from each other. As a quick pseudocode sketch:

class Broken {
    val foo: Foo  // where Foo and Bar are non-nullable reference types
    val bar: Bar

    Broken() {
        foo = new Foo()
        throw new Exception()
        // this code is never reached, so "bar" is not assigned
        bar = new Bar()
    }

    ~Broken() {
        foo.cleanup()
        bar.cleanup()
    }
}

The scenario is now that during construction of an instance, an error will be thrown, so construction will be aborted before the instance has been fully constructed. This language offers a destructor method which will run before the memory is deallocated, e.g. to manually free non-memory resources. It must also be run on partially constructed objects, because manually managed resources might already have been allocated before construction was aborted.

With nulls, the destructor could test whether a variable had been assigned, e.g. if (foo != null) foo.cleanup(). Without nulls, the object is now in an undefined state – what is the value of bar?
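The "has it been assigned yet?" test does not disappear in a language with Maybe/Option; it just becomes explicit in the types. A minimal Rust sketch (the Broken/foo/bar names mirror the pseudocode above; the log field is an addition of this sketch so the destructor's effect is observable):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log recording which cleanup actions actually ran.
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Broken {
    // Fields that may be "not yet assigned" are Option, not null.
    foo: Option<&'static str>,
    bar: Option<&'static str>,
    log: Log,
}

impl Drop for Broken {
    fn drop(&mut self) {
        // The null comparison becomes an explicit pattern match:
        // the None case must be handled before the value can be used.
        if let Some(name) = self.foo {
            self.log.borrow_mut().push(name);
        }
        if let Some(name) = self.bar {
            self.log.borrow_mut().push(name);
        }
    }
}

fn main() {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    // Simulate construction aborted after foo was assigned but before bar:
    let partial = Broken { foo: Some("foo"), bar: None, log: Rc::clone(&log) };
    drop(partial);
    // Only the assigned field was cleaned up.
    assert_eq!(*log.borrow(), ["foo"]);
    println!("cleaned up: {:?}", log.borrow());
}
```

The difference from the null version is that Option<T> forces the check: there is no way to use the wrapped value without first handling the None case, so the "undefined state" is representable and visible in the type.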

However, this problem exists due to the combination of three aspects:

  • The absence of default values like null or guaranteed initialization for the member variables.
  • The difference between declaration and assignment. Forcing variables to be assigned immediately (e.g. with a let statement as seen in functional languages) is an easy way to guarantee initialization – but restricts the language in other ways.
  • The specific flavor of destructors as a method that gets called by the language runtime.
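The second aspect can also be softened by flow analysis rather than syntax: a sketch in Rust (pick is a name invented for this example), where declaration and assignment may be separate, but the compiler rejects any read of a variable before every path has assigned it:

```rust
// Guaranteed initialization without forcing "declare and assign together":
// the compiler's flow analysis proves x is assigned on every path.
fn pick(condition: bool) -> i32 {
    let x: i32; // declared, not yet assigned
    if condition {
        x = 1;
    } else {
        x = 2;
    }
    // Legal only because both branches assign x; deleting the else branch
    // would make reading x a compile-time error ("possibly-uninitialized").
    x
}

fn main() {
    assert_eq!(pick(true), 1);
    assert_eq!(pick(false), 2);
    println!("ok");
}
```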

It is easy to choose another design that does not exhibit these problems, for example by always combining declaration with assignment and having the language offer multiple finalizer blocks instead of a single finalization method:

// the body of the class *is* the constructor
class Working() {
    val foo: Foo = new Foo()
    FINALIZE { foo.cleanup() }  // block is registered to run when object is destroyed

    throw new Exception()

    // the below code is never reached, so
    //  1. the "bar" variable never enters the scope
    //  2. the second finalizer block is never registered.
    val bar: Bar = new Bar()
    FINALIZE { bar.cleanup() }  // block is registered to run when object is destroyed
}
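This alternative design is close to what Rust actually does, so a concrete sketch is possible (Resource, Working, build and the LOG bookkeeping are inventions of this example): a constructor is just a function, and each local's destructor is effectively "registered" the moment the local is initialized, so an aborted construction cleans up exactly what was built.

```rust
use std::panic;
use std::sync::Mutex;

// Records which destructors actually ran, so the cleanup is observable.
static LOG: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

struct Resource(&'static str);

impl Drop for Resource {
    fn drop(&mut self) {
        LOG.lock().unwrap().push(self.0);
    }
}

struct Working {
    foo: Resource,
    bar: Resource,
}

#[allow(unreachable_code)]
fn build() -> Working {
    let foo = Resource("foo"); // from here on, foo's cleanup is guaranteed
    panic!("construction aborted");
    // Never reached: bar never enters scope, so its cleanup is never registered.
    let bar = Resource("bar");
    Working { foo, bar }
}

fn main() {
    // Unwinding from the panic drops only "foo". No partially constructed
    // Working value ever exists, and no null test is needed anywhere.
    let result = panic::catch_unwind(build);
    assert!(result.is_err());
    assert_eq!(*LOG.lock().unwrap(), ["foo"]);
    println!("cleaned up: {:?}", LOG.lock().unwrap());
}
```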

So the issue is not the absence of null by itself, but the combination of the absence of null with that set of other features.

The interesting question is now why C# chose one design and not the other. Here, the context of the quote lists many other arguments for having null in the C# language, which can mostly be summarized as “familiarity and compatibility” – and those are good reasons.
