Haskell’s algebraic data types

algebraic-data-typesdata structuresfunctional programminghaskelltypes

I'm trying to fully understand all of Haskell's concepts.

In what ways are algebraic data types similar to generic types, e.g., in C# and Java? And how are they different? What's so algebraic about them anyway?

I'm familiar with universal algebra and its rings and fields, but I only have a vague idea of how Haskell's types work.

Best Answer

Haskell's algebraic data types are named such since they correspond to an initial algebra in category theory, giving us some laws, some operations and some symbols to manipulate. We may even use algebraic notation for describing regular data structures, where:

+ represents sum types (disjoint unions, e.g. Either).
• represents product types (e.g. structs or tuples)
X for the singleton type (e.g. data X a = X a)
1 for the unit type ()
and μ for the least fixed point (e.g. recursive types), usually implicit.

with some additional notation:

X² for X•X

In fact, you might say (following Brent Yorgey) that a Haskell data type is regular if it can be expressed in terms of 1, X, +, •, and a least ﬁxed point.

With this notation, we can concisely describe many regular data structures:

Units: data () = ()

1
Options: data Maybe a = Nothing | Just a

1 + X
Lists: data [a] = [] | a : [a]

L = 1+X•L
Binary trees: data BTree a = Empty | Node a (BTree a) (BTree a)

B = 1 + X•B²

Other operations hold (taken from Brent Yorgey's paper, listed in the references):

Expansion: unfolding the fix point can be helpful for thinking about lists. L = 1 + X + X² + X³ + ... (that is, lists are either empty, or they have one element, or two elements, or three, or ...)
Composition, ◦, given types F and G, the composition F ◦ G is a type which builds “F-structures made out of G-structures” (e.g. R = X • (L ◦ R) ,where L is lists, is a rose tree.
Differentiation, the derivative of a data type D (given as D') is the type of D-structures with a single “hole”, that is, a distinguished location not containing any data. That amazingly satisfy the same rules as for differentiation in calculus:

1′ = 0

X′ = 1

(F + G)′ = F' + G′

(F • G)′ = F • G′ + F′ • G

(F ◦ G)′ = (F′ ◦ G) • G′

References:

Species and Functors and Types, Oh My!, Brent A. Yorgey, Haskell’10, September 30, 2010, Baltimore, Maryland, USA
Clowns to the left of me, jokers to the right (Dissecting Data Structures), Conor McBride POPL 2008

Related Solutions

C++ – What are POD types in C++

POD stands for Plain Old Data - that is, a class (whether defined with the keyword struct or the keyword class) without constructors, destructors and virtual members functions. Wikipedia's article on POD goes into a bit more detail and defines it as:

A Plain Old Data Structure in C++ is an aggregate class that contains only PODS as members, has no user-defined destructor, no user-defined copy assignment operator, and no nonstatic members of pointer-to-member type.

Greater detail can be found in this answer for C++98/03. C++11 changed the rules surrounding POD, relaxing them greatly, thus necessitating a follow-up answer here.

Haskell – What does the `forall` keyword in Haskell/GHC do

Let's start with a code example:

foob :: forall a b. (b -> b) -> b -> (a -> b) -> Maybe a -> b
foob postProcess onNothin onJust mval =
    postProcess val
    where
        val :: b
        val = maybe onNothin onJust mval

This code doesn't compile (syntax error) in plain Haskell 98. It requires an extension to support the forall keyword.

Basically, there are 3 different common uses for the forall keyword (or at least so it seems), and each has its own Haskell extension: ScopedTypeVariables, RankNTypes/Rank2Types, ExistentialQuantification.

The code above doesn't get a syntax error with either of those enabled, but only type-checks with ScopedTypeVariables enabled.

Scoped Type Variables:

Scoped type variables helps one specify types for code inside where clauses. It makes the b in val :: b the same one as the b in foob :: forall a b. (b -> b) -> b -> (a -> b) -> Maybe a -> b.

A confusing point: you may hear that when you omit the forall from a type it is actually still implicitly there. (from Norman's answer: "normally these languages omit the forall from polymorphic types"). This claim is correct, but it refers to the other uses of forall, and not to the ScopedTypeVariables use.

Rank-N-Types:

Let's start with that mayb :: b -> (a -> b) -> Maybe a -> b is equivalent to mayb :: forall a b. b -> (a -> b) -> Maybe a -> b, except for when ScopedTypeVariables is enabled.

This means that it works for every a and b.

Let's say you want to do something like this.

ghci> let putInList x = [x]
ghci> liftTup putInList (5, "Blah")
([5], ["Blah"])

What must be the type of this liftTup? It's liftTup :: (forall x. x -> f x) -> (a, b) -> (f a, f b). To see why, let's try to code it:

ghci> let liftTup liftFunc (a, b) = (liftFunc a, liftFunc b)
ghci> liftTup (\x -> [x]) (5, "Hello")
    No instance for (Num [Char])
    ...
ghci> -- huh?
ghci> :t liftTup
liftTup :: (t -> t1) -> (t, t) -> (t1, t1)

"Hmm.. why does GHC infer that the tuple must contain two of the same type? Let's tell it they don't have to be"

-- test.hs
liftTup :: (x -> f x) -> (a, b) -> (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)

ghci> :l test.hs
    Couldnt match expected type 'x' against inferred type 'b'
    ...

Hmm. so here GHC doesn't let us apply liftFunc on v because v :: b and liftFunc wants an x. We really want our function to get a function that accepts any possible x!

{-# LANGUAGE RankNTypes #-}
liftTup :: (forall x. x -> f x) -> (a, b) -> (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)

So it's not liftTup that works for all x, it's the function that it gets that does.

Existential Quantification:

Let's use an example:

-- test.hs
{-# LANGUAGE ExistentialQuantification #-}
data EQList = forall a. EQList [a]
eqListLen :: EQList -> Int
eqListLen (EQList x) = length x

ghci> :l test.hs
ghci> eqListLen $ EQList ["Hello", "World"]
2

How is that different from Rank-N-Types?

ghci> :set -XRankNTypes
ghci> length (["Hello", "World"] :: forall a. [a])
    Couldnt match expected type 'a' against inferred type '[Char]'
    ...

With Rank-N-Types, forall a meant that your expression must fit all possible as. For example:

ghci> length ([] :: forall a. [a])
0

An empty list does work as a list of any type.

So with Existential-Quantification, foralls in data definitions mean that, the value contained can be of any suitable type, not that it must be of all suitable types.

Best Answer

Related Solutions

C++ – What are POD types in C++

Haskell – What does the `forall` keyword in Haskell/GHC do

Related Topic