Design – Sum Types vs Polymorphism

design · functional programming · object-oriented

This past year I took the leap and learned a functional programming language (F#), and one of the more interesting things I've found is how it affects the way I design OO software. The two things I find myself missing most in OO languages are pattern matching and sum types. Everywhere I look I see situations that would be trivially modeled with a discriminated union, but I am reluctant to crowbar in some OO DU implementation that feels unnatural to the paradigm.

This generally leads me to create intermediate types to handle the 'or' relationships that a sum type would handle for me, and it also seems to lead to a good deal of branching. Reading people like Misko Hevery, I see the suggestion that good OO design can minimize branching through polymorphism.

One of the things I avoid as much as possible in OO code is types with null values. Obviously the 'or' relationship can be modeled by a type holding two fields, exactly one of which is null at a time, but this means null tests everywhere. A minimal sketch of that encoding follows (all names invented for illustration). Is there a way to model heterogeneous but logically associated types polymorphically? Design strategies or patterns would be very helpful, as would general ways of thinking about heterogeneous, associated types in the OO paradigm.
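```fsharp
// Hypothetical names, for illustration only: an 'or' of success and
// failure encoded as two fields, exactly one of which is expected to
// be null at any time.
type PaymentResult(receipt: string, error: exn) =
    member _.Receipt = receipt   // non-null on success
    member _.Error = error       // non-null on failure

// Every consumer is forced into a null test:
let report (result: PaymentResult) =
    if isNull result.Error then printfn "ok: %s" result.Receipt
    else printfn "failed: %s" result.Error.Message
```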

Best Answer

Like you, I wish that discriminated unions were more prevalent; however, the reason they are useful in most functional languages is that they provide exhaustive pattern matching. Not just pattern matching, but exhaustive pattern matching: the code doesn't compile if you don't cover every possibility. Without that guarantee, discriminated unions are just pretty syntax; exhaustiveness is what gives them their power.
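As a minimal illustration (the Shape type and its cases are invented for this sketch):

```fsharp
type Shape =
    | Circle of radius: float
    | Rectangle of width: float * height: float

// Exhaustive match: if a case is ever added to Shape, the compiler
// flags this match as incomplete (warning FS0025 in F#, commonly
// promoted to an error) until the new case is handled.
let area shape =
    match shape with
    | Circle radius -> System.Math.PI * radius * radius
    | Rectangle (width, height) -> width * height
```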

The only way to do anything useful with a sum type is to decompose it and branch on which case you have (e.g. by pattern matching). The great thing about interfaces is that you don't care what the concrete type is, because you know you can treat it through its interface: no type-specific logic, no branching.
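For contrast, a sketch of the same invented domain behind an interface; the consumer never branches on the concrete type:

```fsharp
type IShape =
    abstract member Area: unit -> float

type Circle(radius: float) =
    interface IShape with
        member _.Area() = System.Math.PI * radius * radius

type Rectangle(width: float, height: float) =
    interface IShape with
        member _.Area() = width * height

// No branching on concrete types: any IShape works here, including
// ones defined long after this function was written.
let totalArea (shapes: IShape list) =
    shapes |> List.sumBy (fun shape -> shape.Area())
```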

This isn't a case of "functional code has more branching, OO code has less". Rather, functional languages are better suited to domains where you have unions, which mandate branching, and OO languages are better suited to code where you can expose common behaviour through a common interface, which can feel like it branches less. The branching is a function of your design and the domain. Quite simply, if your "heterogeneous but logically associated types" can't expose a common interface, then you have to branch or pattern-match over them. That is a domain and design problem, not a language one.

What Misko may be referring to is the general idea that if you can expose your types as a common interface, then using OO features (interfaces/polymorphism) will make your life better by putting type-specific behaviour in the type rather than in the consuming code.

It is important to recognise that interfaces and unions are in a sense opposites of each other: an interface defines some stuff the type has to implement, while a union defines some stuff the consumer has to consider. If you add a method to an interface, you have changed that contract, and now every type that previously implemented it needs to be updated. If you add a new case to a union, you have changed that contract, and now every exhaustive pattern match over the union has to be updated. They fill different roles, and while it may sometimes be possible to implement a system either way, which one you go with is a design decision: neither is inherently better.
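To make the duality concrete, here is a sketch of what each contract change looks like, reusing the invented Shape/IShape names from above:

```fsharp
// Union side: adding a case changes the contract for consumers.
// Every exhaustive match over Shape (like `area` above) is now
// flagged as incomplete until it handles Triangle.
type Shape =
    | Circle of radius: float
    | Rectangle of width: float * height: float
    | Triangle of baseLen: float * height: float   // new case

// Interface side: adding a member changes the contract for
// implementers. Every type implementing IShape now fails to
// compile until it provides Perimeter.
type IShape =
    abstract member Area: unit -> float
    abstract member Perimeter: unit -> float       // new member
```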

One benefit of going with interfaces/polymorphism is that the consuming code is more extensible: you can pass in a type that wasn't defined at design time, so long as it exposes the agreed interface. On the flip side, with a static union, you can exploit behaviours that weren't considered at design time by writing new exhaustive pattern matches, so long as they stick to the contract of the union.
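For example, with the invented Shape union from above, a brand-new behaviour can be written without touching the types at all:

```fsharp
type Shape =
    | Circle of radius: float
    | Rectangle of width: float * height: float

// A new operation written long after Shape was designed; the union's
// contract (its fixed set of cases) is all this function relies on.
let describe shape =
    match shape with
    | Circle radius -> sprintf "a circle of radius %g" radius
    | Rectangle (width, height) -> sprintf "a %g x %g rectangle" width height
```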


Regarding the 'Null Object Pattern': it isn't a silver bullet, and it does not replace null checks. All it does is provide a way to avoid some null checks, where the 'null' behaviour can be exposed behind the common interface. If you can't express the 'null' behaviour through the type's interface, then you will find yourself thinking "I really wish I could exhaustively pattern match this" and will end up performing a branching check anyway.
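A minimal sketch of where the pattern does work, assuming a hypothetical ILogger: 'do nothing' is a legitimate implementation of the interface, so no null test is needed.

```fsharp
type ILogger =
    abstract member Log: string -> unit

// A real implementation.
let consoleLogger =
    { new ILogger with
        member _.Log message = printfn "%s" message }

// The null object: a legitimate ILogger whose behaviour is 'do nothing'.
let nullLogger =
    { new ILogger with
        member _.Log _ = () }

// Consumers never test for null; both loggers satisfy the contract.
let run (logger: ILogger) =
    logger.Log "starting work"
```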