Most of what you do in programming is combining functions to build bigger functions from them. Usually you have not only functions in your toolbox but also other things like operators, variable assignments and the like, but generally your program combines lots of "computations" into bigger computations that will be combined further.
A monad is some way to do this "combining of computations".
Usually your most basic "operator" for combining two computations is `;`:

    a; b

When you say this you mean "first do `a`, then do `b`". The result `a; b` is basically again a computation that can be combined with more stuff.

This is a simple monad: it is a way of combining small computations into bigger ones. The `;` says "do the thing on the left, then do the thing on the right".
Another thing that can be seen as a monad in object-oriented languages is the `.`. Often you find things like this:

    a.b().c().d()

The `.` basically means "evaluate the computation on the left, then call the method on the right on the result of that". It is another way to combine functions/computations together, a little more complicated than `;`. And the concept of chaining things together with `.` is a monad, since it is a way of combining two computations into a new computation.
Another fairly common monad, one that has no special syntax, is this pattern:
    rv = socket.bind(address, port);
    if (rv == -1)
        return -1;
    rv = socket.connect(...);
    if (rv == -1)
        return -1;
    rv = socket.send(...);
    if (rv == -1)
        return -1;
A return value of -1 indicates failure, but there is no real way to abstract out this error checking, even if you have lots of API calls that you need to combine in this fashion. This is basically just another monad that combines the function calls by the rule "if the function on the left returned -1, return -1 ourselves; otherwise call the function on the right". If we had an operator `>>=` that did this we could simply write:

    socket.bind(...) >>= socket.connect(...) >>= socket.send(...)
It would make things more readable and help to abstract out our special way of combining functions, so that we don't need to repeat ourselves over and over again.
And there are many more ways of combining functions/computations that are useful as a general pattern and can be abstracted into a monad. This lets the user of the monad write much more concise and clear code, since all the book-keeping and management of the functions involved is done inside the monad.
For example, the above `>>=` could be extended to "do the error checking and then call the right side on the socket that we got as input", so that we don't need to explicitly specify `socket` lots of times:

    new socket() >>= bind(...) >>= connect(...) >>= send(...);
The formal definition is a bit more complicated, since you have to worry about how to get the result of one function into the next one as input, if that function needs it, and since you want to make sure the functions you combine actually fit the way you try to combine them in your monad. But the basic concept is just that you formalize different ways to combine functions together.
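For illustration, here is a minimal sketch of such an operator in C#. The names `Result<T>`, `Ok`, `Fail` and `Then` are invented for this example, and since C# does not allow a user-defined `>>=` operator, an instance method stands in for it:

    using System;

    // A made-up result type: either a value, or a failure.
    class Result<T>
    {
        public readonly T Value;
        public readonly bool Failed;
        private Result(T value, bool failed) { Value = value; Failed = failed; }

        public static Result<T> Ok(T value) { return new Result<T>(value, false); }
        public static Result<T> Fail() { return new Result<T>(default(T), true); }

        // The ">>=" of this little monad: if the step on the left failed,
        // propagate the failure; otherwise feed its value to the next step.
        public Result<U> Then<U>(Func<T, Result<U>> next)
        {
            return Failed ? Result<U>.Fail() : next(Value);
        }
    }

    class Program
    {
        // Stand-ins for the socket calls; each one reports success or failure.
        static Result<string> Bind(string s)    { return Result<string>.Ok(s + " bound"); }
        static Result<string> Connect(string s) { return Result<string>.Ok(s + " connected"); }
        static Result<string> Send(string s)    { return Result<string>.Fail(); }

        static void Main()
        {
            // Reads like: new socket() >>= bind(...) >>= connect(...) >>= send(...)
            Result<string> r = Result<string>.Ok("socket").Then(Bind).Then(Connect).Then(Send);
            Console.WriteLine(r.Failed ? "failed" : r.Value); // prints "failed"
        }
    }

All of the "check for -1 and bail out" plumbing lives in `Then`, written exactly once.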
> Or if you play a video game, there are tons of state variables, beginning with the positions of all the characters, who tend to move around constantly. How can you possibly do anything useful without keeping track of changing values?
If you're interested, here's a series of articles which describe game programming with Erlang.
You probably won't like this answer, but you won't get functional programming until you use it. I can post code samples and say "Here, don't you see?", but if you don't understand the syntax and underlying principles, your eyes just glaze over. From your point of view, it looks as if I'm doing the same thing as in an imperative language, but setting up all kinds of boundaries to purposely make programming more difficult. From my point of view, you're just experiencing the Blub paradox.
I was skeptical at first, but I jumped on the functional programming train a few years ago and fell in love with it. The trick with functional programming is learning to recognize patterns, particularly variable assignments, and move the imperative state to the stack. A for-loop, for example, becomes recursion:
    // Imperative
    let printTo x =
        for a in 1 .. x do
            printfn "%i" a

    // Recursive
    let printTo x =
        let rec loop a =
            if a <= x then
                printfn "%i" a
                loop (a + 1)
        loop 1
It's not very pretty, but we get the same effect with no mutation. Of course, wherever possible, we like to avoid looping altogether and just abstract it away:
    // Preferred
    let printTo x = seq { 1 .. x } |> Seq.iter (fun a -> printfn "%i" a)
The `Seq.iter` function will enumerate through the collection and invoke the anonymous function for each item. Very handy :)
I know, printing numbers isn't exactly impressive. However, we can use the same approach with games: hold all state on the stack, and create a new object with our changes in the recursive call. In this way, each frame is a stateless snapshot of the game: each frame simply creates a brand new object with the desired changes applied to whatever objects need updating. The pseudocode for this might be:
    // imperative version
    pacman = new pacman(0, 0)
    while true
        if key = UP then pacman.y++
        elif key = DOWN then pacman.y--
        elif key = LEFT then pacman.x--
        elif key = RIGHT then pacman.x++
        render(pacman)

    // functional version
    let rec loop pacman =
        render(pacman)
        let x, y = switch(key)
            case LEFT: pacman.x - 1, pacman.y
            case RIGHT: pacman.x + 1, pacman.y
            case UP: pacman.x, pacman.y + 1
            case DOWN: pacman.x, pacman.y - 1
        loop(new pacman(x, y))
The imperative and functional versions do the same thing, but the functional version clearly uses no mutable state: all state is held on the stack. The nice thing about this approach is that, if something goes wrong, debugging is easy; all you need is a stack trace.
This scales up to any number of objects in the game, because all objects (or collections of related objects) can be rendered in their own thread.
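For a concrete sketch of the same idea, here is roughly what the functional version might look like in C#. The `Pacman` type and the key handling are invented for this example, and note that C# does not guarantee tail-call elimination, so a production game would rebind a local in a loop rather than recurse:

    using System;

    // An immutable game object: "moving" it means building a new one.
    class Pacman
    {
        public readonly int X, Y;
        public Pacman(int x, int y) { X = x; Y = y; }
    }

    class Game
    {
        static void Render(Pacman p) { Console.WriteLine("pacman at ({0}, {1})", p.X, p.Y); }

        // Each frame receives the previous state and passes a new state along;
        // no field is ever mutated.
        static void Loop(Pacman p)
        {
            Render(p);
            ConsoleKey key = Console.ReadKey(true).Key;
            switch (key)
            {
                case ConsoleKey.LeftArrow:  Loop(new Pacman(p.X - 1, p.Y)); break;
                case ConsoleKey.RightArrow: Loop(new Pacman(p.X + 1, p.Y)); break;
                case ConsoleKey.UpArrow:    Loop(new Pacman(p.X, p.Y + 1)); break;
                case ConsoleKey.DownArrow:  Loop(new Pacman(p.X, p.Y - 1)); break;
                // any other key ends the game
            }
        }

        static void Main() { Loop(new Pacman(0, 0)); }
    }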
> Just about every user application I can think of involves state as a core concept.
In functional languages, rather than mutating the state of objects, we simply return a new object with the changes we want. It's more efficient than it sounds. Many data structures are very easy to represent as immutable data structures. Stacks, for example, are notoriously easy to implement:
    using System;

    namespace ConsoleApplication1
    {
        static class Stack
        {
            // Prepend an item to the front of a stack.
            public static Stack<T> Cons<T>(T hd, Stack<T> tl) { return new Stack<T>(hd, tl); }

            // Append two stacks by rebuilding the first on top of the second.
            public static Stack<T> Append<T>(Stack<T> x, Stack<T> y)
            {
                return x == null ? y : Cons(x.Head, Append(x.Tail, y));
            }

            // Apply an action to every item, front to back.
            public static void Iter<T>(Stack<T> x, Action<T> f) { if (x != null) { f(x.Head); Iter(x.Tail, f); } }
        }

        class Stack<T>
        {
            public readonly T Head;
            public readonly Stack<T> Tail;

            public Stack(T hd, Stack<T> tl)
            {
                this.Head = hd;
                this.Tail = tl;
            }
        }

        class Program
        {
            static void Main(string[] args)
            {
                Stack<int> x = Stack.Cons(1, Stack.Cons(2, Stack.Cons(3, Stack.Cons(4, null))));
                Stack<int> y = Stack.Cons(5, Stack.Cons(6, Stack.Cons(7, Stack.Cons(8, null))));
                Stack<int> z = Stack.Append(x, y);
                Stack.Iter(z, a => Console.WriteLine(a));
                Console.ReadKey(true);
            }
        }
    }
The code above constructs two immutable lists, appends them together to make a new list, and prints the result. No mutable state is used anywhere in the application. It looks a little bulky, but that's only because C# is a verbose language. Here's the equivalent program in F#:
    type 'a stack =
        | Cons of 'a * 'a stack
        | Nil

    let rec append x y =
        match x with
        | Cons(hd, tl) -> Cons(hd, append tl y)
        | Nil -> y

    let rec iter f = function
        | Cons(hd, tl) -> f(hd); iter f tl
        | Nil -> ()

    let x = Cons(1, Cons(2, Cons(3, Cons(4, Nil))))
    let y = Cons(5, Cons(6, Cons(7, Cons(8, Nil))))
    let z = append x y
    iter (fun a -> printfn "%i" a) z
No mutables are necessary to create and manipulate lists. Nearly all data structures can be easily converted into their functional equivalents. I wrote a page here which provides immutable implementations of stacks, queues, leftist heaps, red-black trees, and lazy lists. Not a single snippet of code contains any mutable state. To "mutate" a tree, I create a brand new one with the new node I want. This is very efficient, because I don't need to make a copy of every node in the tree: I can reuse the old ones in the new tree.
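To illustrate the node-reuse point, here is a minimal sketch of an immutable binary search tree in C#. The `Tree` type and `Insert` method are invented for this example; inserting rebuilds only the nodes on the path from the root, and every other subtree is shared with the old tree:

    // Immutable binary search tree node.
    class Tree
    {
        public readonly int Value;
        public readonly Tree Left, Right;
        public Tree(Tree left, int value, Tree right) { Left = left; Value = value; Right = right; }

        // "Mutating" insert: returns a new tree that shares all untouched
        // subtrees with the original. Only O(depth) new nodes are allocated.
        public static Tree Insert(Tree t, int value)
        {
            if (t == null) return new Tree(null, value, null);
            if (value < t.Value) return new Tree(Insert(t.Left, value), t.Value, t.Right);
            if (value > t.Value) return new Tree(t.Left, t.Value, Insert(t.Right, value));
            return t; // already present: reuse the whole tree
        }
    }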
For a more significant example, I also wrote this SQL parser, which is totally stateless (or at least my code is stateless; I don't know whether the underlying lexing library is stateless).
Stateless programming is just as expressive and powerful as stateful programming; it just requires a little practice to train yourself to start thinking statelessly. Of course, "stateless programming when possible, stateful programming where necessary" seems to be the motto of most impure functional languages. There's no harm in falling back on mutables when the functional approach just isn't as clean or efficient.
Best Answer
UPDATE: This question was the subject of an immensely long blog series, which you can read at Monads — thanks for the great question!
A monad is an "amplifier" of types that obeys certain rules and which has certain operations provided.
First, what is an "amplifier of types"? By that I mean some system which lets you take a type and turn it into a more special type. For example, in C# consider `Nullable<T>`. This is an amplifier of types. It lets you take a type, say `int`, and add a new capability to that type, namely, that now it can be null when it couldn't before.

As a second example, consider `IEnumerable<T>`. It is an amplifier of types. It lets you take a type, say, `string`, and add a new capability to that type, namely, that you can now make a sequence of strings out of any number of single strings.

What are the "certain rules"? Briefly, that there is a sensible way for functions on the underlying type to work on the amplified type such that they follow the normal rules of functional composition. For example, if you have a function on integers, say
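    // a sketch; any body that computes an int from an int will do
    int M(int x) { return x + 1; }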
then the corresponding function on `Nullable<int>` can make all the operators and calls in there work together "in the same way" that they did before.

(That is incredibly vague and imprecise; you asked for an explanation that didn't assume anything about knowledge of functional composition.)
What are the "operations"?
There is a "unit" operation (confusingly sometimes called the "return" operation) that takes a value from a plain type and creates the equivalent monadic value. This, in essence, provides a way to take a value of an unamplified type and turn it into a value of the amplified type. It could be implemented as a constructor in an OO language.
There is a "bind" operation that takes a monadic value and a function that can transform the value, and returns a new monadic value. Bind is the key operation that defines the semantics of the monad. It lets us transform operations on the unamplified type into operations on the amplified type, that obeys the rules of functional composition mentioned before.
There is often a way to get the unamplified type back out of the amplified type. Strictly speaking this operation is not required to have a monad. (Though it is necessary if you want to have a comonad. We won't consider those further in this article.)
Again, take `Nullable<T>` as an example. You can turn an `int` into a `Nullable<int>` with the constructor. The C# compiler takes care of most nullable "lifting" for you, but if it didn't, the lifting transformation is straightforward: an operation, say,
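    // the original operation on the plain type (arbitrary body, as above)
    int M(int x) { return x + 1; }

is transformed into

    // the "lifted" version: a null input propagates to a null output
    Nullable<int> M(Nullable<int> x)
    {
        if (x == null)
            return null;
        else
            return new Nullable<int>(x.Value + 1); // the original computation on the unwrapped value
    }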
And turning a `Nullable<int>` back into an `int` is done with the `Value` property.

It's the function transformation that is the key bit. Notice how the actual semantics of the nullable operation (that an operation on a `null` propagates the `null`) is captured in the transformation. We can generalize this.

Suppose you have a function from `int` to `int`, like our original `M`. You can easily make that into a function that takes an `int` and returns a `Nullable<int>`, because you can just run the result through the nullable constructor. Now suppose you have this higher-order method:
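    // a sketch of "bind" for the nullable amplifier; the struct constraint
    // is what C# requires in order to use Nullable<T>
    static Nullable<T> Bind<T>(Nullable<T> amplified, Func<T, Nullable<T>> func)
        where T : struct
    {
        if (amplified == null)
            return null;
        else
            return func(amplified.Value);
    }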
See what you can do with that? Any method that takes an `int` and returns an `int`, or takes an `int` and returns a `Nullable<int>`, can now have the nullable semantics applied to it.

Furthermore: suppose you have two methods
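    Nullable<int> X(int q) { return q + 1; } // arbitrary example bodies
    Nullable<int> Y(int r) { return r * 2; }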
and you want to compose them:
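    Nullable<int> Z(int s) { return X(Y(s)); } // type error: X wants an int, but Y returns a Nullable<int>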
That is, `Z` is the composition of `X` and `Y`. But you cannot do that, because `X` takes an `int` and `Y` returns a `Nullable<int>`. Since you have the "bind" operation, though, you can make this work:
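    Nullable<int> Z(int s) { return Bind(Y(s), X); }

The bind operation on a monad is what makes composition of functions on amplified types work. The "rules" I handwaved about above are that the monad preserves the rules of normal function composition; that composing with identity functions results in the original function, that composition is associative, and so on.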
In C#, "Bind" is called "SelectMany". Take a look at how it works on the sequence monad. We need to have two things: turn a value into a sequence and bind operations on sequences. As a bonus, we also have "turn a sequence back into a value". Those operations are:
The nullable monad rule was "to combine two functions that produce nullables together, check to see if the inner one results in null; if it does, produce null, if it does not, then call the outer one with the result". That's the desired semantics of nullable.
The sequence monad rule is "to combine two functions that produce sequences together, apply the outer function to every element produced by the inner function, and then concatenate all the resulting sequences together". The fundamental semantics of the monad are captured in the `Bind`/`SelectMany` method; this is the method that tells you what the monad really means.

We can do even better. Suppose you have a sequence of ints, and a method that takes ints and results in sequences of strings. We could generalize the binding operation to allow composition of functions that take and return different amplified types, so long as the inputs of one match the outputs of the other:
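    // "bind" generalized over two element types; this is the shape of
    // LINQ's SelectMany
    static IEnumerable<U> Bind<T, U>(IEnumerable<T> seq, Func<T, IEnumerable<U>> func)
    {
        foreach (T t in seq)
            foreach (U u in func(t))
                yield return u;
    }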
So now we can say "amplify this bunch of individual integers into a sequence of integers. Transform this particular integer into a bunch of strings, amplified to a sequence of strings. Now put both operations together: amplify this bunch of integers into the concatenation of all the sequences of strings." Monads allow you to compose your amplifications.
That's rather like asking "what problems does the singleton pattern solve?", but I'll give it a shot.
Monads are typically used to solve problems like the one above: you have an "amplified" version of a type, and you need to combine operations on it without writing the same plumbing over and over.
C# uses monads in its design. As already mentioned, the nullable pattern is highly akin to the "maybe monad". LINQ is entirely built out of monads; the `SelectMany` method is what does the semantic work of composition of operations. (Erik Meijer is fond of pointing out that every LINQ function could actually be implemented by `SelectMany`; everything else is just a convenience.)

Most OOP languages do not have a rich enough type system to represent the monad pattern itself directly; you need a type system that supports types that are higher-kinded than generic types. So I wouldn't try to do that. Rather, I would implement generic types that represent each monad, and implement methods that represent the three operations you need: turning a value into an amplified value, (maybe) turning an amplified value into a value, and transforming a function on unamplified values into a function on amplified values.
A good place to start is how we implemented LINQ in C#. Study the `SelectMany` method; it is the key to understanding how the sequence monad works in C#. It is a very simple method, but very powerful!