Haskell – A monad is just a monoid in the category of endofunctors, what’s the problem

category-theoryhaskellmonadsmonoids

Who first said the following?

A monad is just a monoid in the
category of endofunctors, what's the
problem?

And on a less important note, is this true and if so could you give an explanation (hopefully one that can be understood by someone who doesn't have much Haskell experience)?

Best Answer

That particular phrasing is by James Iry, from his highly entertaining Brief, Incomplete and Mostly Wrong History of Programming Languages, in which he fictionally attributes it to Philip Wadler.

The original quote is from Saunders Mac Lane in Categories for the Working Mathematician, one of the foundational texts of Category Theory. Here it is in context, which is probably the best place to learn exactly what it means.

But, I'll take a stab. The original sentence is this:

All told, a monad in X is just a monoid in the category of endofunctors of X, with product × replaced by composition of endofunctors and unit set by the identity endofunctor.

X here is a category. Endofunctors are functors from a category to itself (which is usually all Functors as far as functional programmers are concerned, since they're mostly dealing with just one category; the category of types - but I digress). But you could imagine another category which is the category of "endofunctors on X". This is a category in which the objects are endofunctors and the morphisms are natural transformations.

And of those endofunctors, some of them might be monads. Which ones are monads? Exactly the ones which are monoidal in a particular sense. Instead of spelling out the exact mapping from monads to monoids (since Mac Lane does that far better than I could hope to), I'll just put their respective definitions side by side and let you compare:

A monoid is...

A set, S
An operation, • : S × S → S
An element of S, e : 1 → S

...satisfying these laws:

(a • b) • c = a • (b • c), for all a, b and c in S
e • a = a • e = a, for all a in S

A monad is...

An endofunctor, T : X → X (in Haskell, a type constructor of kind * -> * with a Functor instance)
A natural transformation, μ : T × T → T, where × means functor composition (μ is known as join in Haskell)
A natural transformation, η : I → T, where I is the identity endofunctor on X (η is known as return in Haskell)

...satisfying these laws:

μ ∘ Tμ = μ ∘ μT
μ ∘ Tη = μ ∘ ηT = 1 (the identity natural transformation)

With a bit of squinting you might be able to see that both of these definitions are instances of the same abstract concept.

Related Solutions

Monad in plain English? (For the OOP programmer with no FP background)

UPDATE: This question was the subject of an immensely long blog series, which you can read at Monads — thanks for the great question!

In terms that an OOP programmer would understand (without any functional programming background), what is a monad?

A monad is an "amplifier" of types that obeys certain rules and which has certain operations provided.

First, what is an "amplifier of types"? By that I mean some system which lets you take a type and turn it into a more special type. For example, in C# consider Nullable<T>. This is an amplifier of types. It lets you take a type, say int, and add a new capability to that type, namely, that now it can be null when it couldn't before.

As a second example, consider IEnumerable<T>. It is an amplifier of types. It lets you take a type, say, string, and add a new capability to that type, namely, that you can now make a sequence of strings out of any number of single strings.

What are the "certain rules"? Briefly, that there is a sensible way for functions on the underlying type to work on the amplified type such that they follow the normal rules of functional composition. For example, if you have a function on integers, say

int M(int x) { return x + N(x * 2); }

then the corresponding function on Nullable<int> can make all the operators and calls in there work together "in the same way" that they did before.

(That is incredibly vague and imprecise; you asked for an explanation that didn't assume anything about knowledge of functional composition.)

What are the "operations"?

There is a "unit" operation (confusingly sometimes called the "return" operation) that takes a value from a plain type and creates the equivalent monadic value. This, in essence, provides a way to take a value of an unamplified type and turn it into a value of the amplified type. It could be implemented as a constructor in an OO language.
There is a "bind" operation that takes a monadic value and a function that can transform the value, and returns a new monadic value. Bind is the key operation that defines the semantics of the monad. It lets us transform operations on the unamplified type into operations on the amplified type, that obeys the rules of functional composition mentioned before.
There is often a way to get the unamplified type back out of the amplified type. Strictly speaking this operation is not required to have a monad. (Though it is necessary if you want to have a comonad. We won't consider those further in this article.)

Again, take Nullable<T> as an example. You can turn an int into a Nullable<int> with the constructor. The C# compiler takes care of most nullable "lifting" for you, but if it didn't, the lifting transformation is straightforward: an operation, say,

int M(int x) { whatever }

is transformed into

Nullable<int> M(Nullable<int> x) 
{ 
    if (x == null) 
        return null; 
    else 
        return new Nullable<int>(whatever);
}

And turning a Nullable<int> back into an int is done with the Value property.

It's the function transformation that is the key bit. Notice how the actual semantics of the nullable operation — that an operation on a null propagates the null — is captured in the transformation. We can generalize this.

Suppose you have a function from int to int, like our original M. You can easily make that into a function that takes an int and returns a Nullable<int> because you can just run the result through the nullable constructor. Now suppose you have this higher-order method:

static Nullable<T> Bind<T>(Nullable<T> amplified, Func<T, Nullable<T>> func)
{
    if (amplified == null) 
        return null;
    else
        return func(amplified.Value);
}

See what you can do with that? Any method that takes an int and returns an int, or takes an int and returns a Nullable<int> can now have the nullable semantics applied to it.

Furthermore: suppose you have two methods

Nullable<int> X(int q) { ... }
Nullable<int> Y(int r) { ... }

and you want to compose them:

Nullable<int> Z(int s) { return X(Y(s)); }

That is, Z is the composition of X and Y. But you cannot do that because X takes an int, and Y returns a Nullable<int>. But since you have the "bind" operation, you can make this work:

Nullable<int> Z(int s) { return Bind(Y(s), X); }

The bind operation on a monad is what makes composition of functions on amplified types work. The "rules" I handwaved about above are that the monad preserves the rules of normal function composition; that composing with identity functions results in the original function, that composition is associative, and so on.

In C#, "Bind" is called "SelectMany". Take a look at how it works on the sequence monad. We need to have two things: turn a value into a sequence and bind operations on sequences. As a bonus, we also have "turn a sequence back into a value". Those operations are:

static IEnumerable<T> MakeSequence<T>(T item)
{
    yield return item;
}
// Extract a value
static T First<T>(IEnumerable<T> sequence)
{
    // let's just take the first one
    foreach(T item in sequence) return item; 
    throw new Exception("No first item");
}
// "Bind" is called "SelectMany"
static IEnumerable<T> SelectMany<T>(IEnumerable<T> seq, Func<T, IEnumerable<T>> func)
{
    foreach(T item in seq)
        foreach(T result in func(item))
            yield return result;            
}

The nullable monad rule was "to combine two functions that produce nullables together, check to see if the inner one results in null; if it does, produce null, if it does not, then call the outer one with the result". That's the desired semantics of nullable.

The sequence monad rule is "to combine two functions that produce sequences together, apply the outer function to every element produced by the inner function, and then concatenate all the resulting sequences together". The fundamental semantics of the monads are captured in the Bind/SelectMany methods; this is the method that tells you what the monad really means.

We can do even better. Suppose you have a sequences of ints, and a method that takes ints and results in sequences of strings. We could generalize the binding operation to allow composition of functions that take and return different amplified types, so long as the inputs of one match the outputs of the other:

static IEnumerable<U> SelectMany<T,U>(IEnumerable<T> seq, Func<T, IEnumerable<U>> func)
{
    foreach(T item in seq)
        foreach(U result in func(item))
            yield return result;            
}

So now we can say "amplify this bunch of individual integers into a sequence of integers. Transform this particular integer into a bunch of strings, amplified to a sequence of strings. Now put both operations together: amplify this bunch of integers into the concatenation of all the sequences of strings." Monads allow you to compose your amplifications.

What problem does it solve and what are the most common places it's used?

That's rather like asking "what problems does the singleton pattern solve?", but I'll give it a shot.

Monads are typically used to solve problems like:

I need to make new capabilities for this type and still combine old functions on this type to use the new capabilities.
I need to capture a bunch of operations on types and represent those operations as composable objects, building up larger and larger compositions until I have just the right series of operations represented, and then I need to start getting results out of the thing
I need to represent side-effecting operations cleanly in a language that hates side effects

C# uses monads in its design. As already mentioned, the nullable pattern is highly akin to the "maybe monad". LINQ is entirely built out of monads; the SelectMany method is what does the semantic work of composition of operations. (Erik Meijer is fond of pointing out that every LINQ function could actually be implemented by SelectMany; everything else is just a convenience.)

To clarify the kind of understanding I was looking for, let's say you were converting an FP application that had monads into an OOP application. What would you do to port the responsibilities of the monads into the OOP app?

Most OOP languages do not have a rich enough type system to represent the monad pattern itself directly; you need a type system that supports types that are higher types than generic types. So I wouldn't try to do that. Rather, I would implement generic types that represent each monad, and implement methods that represent the three operations you need: turning a value into an amplified value, (maybe) turning an amplified value into a value, and transforming a function on unamplified values into a function on amplified values.

A good place to start is how we implemented LINQ in C#. Study the SelectMany method; it is the key to understanding how the sequence monad works in C#. It is a very simple method, but very powerful!

Suggested, further reading:

For a more in-depth and theoretically sound explanation of monads in C#, I highly recommend my (Eric Lippert's) colleague Wes Dyer's article on the subject. This article is what explained monads to me when they finally "clicked" for me.
- The Marvels of Monads
A good illustration of why you might want a monad around (uses Haskell in it's examples).
- You Could Have Invented Monads! (And Maybe You Already Have.) by Dan Piponi
Sort of, "translation" of the previous article to JavaScript.
- Translation from Haskell to JavaScript of selected portions of the best introduction to monads I’ve ever read by James Coglan

Python – Speed comparison with Project Euler: C vs Python vs Erlang vs Haskell

Using GHC 7.0.3, gcc 4.4.6, Linux 2.6.29 on an x86_64 Core2 Duo (2.5GHz) machine, compiling using ghc -O2 -fllvm -fforce-recomp for Haskell and gcc -O3 -lm for C.

Your C routine runs in 8.4 seconds (faster than your run probably because of -O3)
The Haskell solution runs in 36 seconds (due to the -O2 flag)
Your factorCount' code isn't explicitly typed and defaulting to Integer (thanks to Daniel for correcting my misdiagnosis here!). Giving an explicit type signature (which is standard practice anyway) using Int and the time changes to 11.1 seconds
in factorCount' you have needlessly called fromIntegral. A fix results in no change though (the compiler is smart, lucky for you).
You used mod where rem is faster and sufficient. This changes the time to 8.5 seconds.
factorCount' is constantly applying two extra arguments that never change (number, sqrt). A worker/wrapper transformation gives us:

 $ time ./so
 842161320  

 real    0m7.954s  
 user    0m7.944s  
 sys     0m0.004s

That's right, 7.95 seconds. Consistently half a second faster than the C solution. Without the -fllvm flag I'm still getting 8.182 seconds, so the NCG backend is doing well in this case too.

Conclusion: Haskell is awesome.

Resulting Code

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)
    where square = sqrt $ fromIntegral number
          isquare = floor square

factorCount' :: Int -> Int -> Int -> Int -> Int
factorCount' number sqrt candidate0 count0 = go candidate0 count0
  where
  go candidate count
    | candidate > sqrt = count
    | number `rem` candidate == 0 = go (candidate + 1) (count + 2)
    | otherwise = go (candidate + 1) count

nextTriangle index triangle
    | factorCount triangle > 1000 = triangle
    | otherwise = nextTriangle (index + 1) (triangle + index + 1)

main = print $ nextTriangle 1 1

EDIT: So now that we've explored that, lets address the questions

Question 1: Do erlang, python and haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?

In Haskell, using Integer is slower than Int but how much slower depends on the computations performed. Luckily (for 64 bit machines) Int is sufficient. For portability sake you should probably rewrite my code to use Int64 or Word64 (C isn't the only language with a long).

Question 2: Why is haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as haskell is a book with seven seals to me.)

Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.

That was what I answered above. The answer was

0) Use optimization via -O2
1) Use fast (notably: unbox-able) types when possible
2) rem not mod (a frequently forgotten optimization) and
3) worker/wrapper transformation (perhaps the most common optimization).

Question 4: Do my functional implementations permit LCO and hence avoid adding unnecessary frames onto the call stack?

Yes, that wasn't the issue. Good work and glad you considered this.

Best Answer

A monoid is...

...satisfying these laws:

A monad is...

...satisfying these laws:

Related Solutions

Monad in plain English? (For the OOP programmer with no FP background)

Python – Speed comparison with Project Euler: C vs Python vs Erlang vs Haskell

Related Topic