Functional Programming – Understanding the Free Monad + Interpreter Pattern

design-patternsdslfunctional programminghaskellmonad

I've seen people talking about Free Monad with Interpreter, particularly in the context of data-access. What is this pattern? When might I want to use it? How does it work, and how would I implement it?

I understand (from posts such as this) that it's about separating model from data-access. How does it differ from the well-known Repository pattern? They appear to have the same motivation.

Best Answer

The actual pattern is actually significantly more general than just data access. It's a lightweight way of creating a domain-specific language that gives you an AST, and then having one or more interpreters to "execute" the AST however you like.

The free monad part is just a handy way to get an AST that you can assemble using Haskell's standard monad facilities (like do-notation) without having to write lots of custom code. This also ensures that your DSL is composable: you can define it in parts and then put the parts together in a structured way, letting you take advantage of Haskell's normal abstractions like functions.

Using a free monad gives you the structure of a composable DSL; all you have to do is specify the pieces. You just write a data type that encompasses all of the actions in your DSL. These actions could be doing anything, not just data access. However, if you specified all your data accesses as actions, you would get an AST that specifies all the queries and commands to the data store. You could then interpret this however you like: run it against a live database, run it against a mock, just log the commands for debugging or even try optimizing the queries.

Lets look at a very simple example for, say, a key value store. For now, we'll just treat both keys and values as strings, but you could add types with a bit of effort.

data DSL next = Get String (String -> next)
              | Set String String next
              | End

The next parameter lets us combine actions. We can use this to write a program that gets "foo" and sets "bar" with that value:

p1 = Get "foo" $ \ foo -> Set "bar" foo End

Unfortunately, this is not enough for a meaningful DSL. Since we used next for composition, the type of p1 is the same length as our program (ie 3 commands):

p1 :: DSL (DSL (DSL next))

In this particular example, using next like this seems a little odd, but it's important if we want our actions to have different type variables. We might want a typed get and set, for example.

Note how the next field is different for each action. This hints that we can use it to make DSL a functor:

instance Functor DSL where
  fmap f (Get name k)          = Get name (f . k)
  fmap f (Set name value next) = Set name value (f next)
  fmap f End                   = End

In fact, this is the only valid way to make it a Functor, so we can use deriving to create the instance automatically by enabling the DeriveFunctor extension.

The next step is the Free type itself. That's what we use to represent our AST structure, build on top of the DSL type. You can think of it like a list at the type level, where "cons" is just nesting a functor like DSL:

-- compare the two types:
data Free f a = Free (f (Free f a)) | Return a
data List a   = Cons a (List a)     | Nil

So we can use Free DSL next to give programs of different sizes the same types:

p2 = Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))

Which has the much nicer type:

p2 :: Free DSL a

However, the actual expression with all of its constructors is still very awkward to use! This is where the monad part comes in. As the name "free monad" implies, Free is a monad—as long as f (in this case DSL) is a functor:

instance Functor f => Monad (Free f) where
  return         = Return
  Free a >>= f   = Free (fmap (>>= f) a)
  Return a >>= f = f a

Now we're getting somewhere: we can use do notation to make our DSL expressions nicer. The only question is what to put in for next? Well, the idea is to use the Free structure for composition, so we will just put Return for each next field and let the do-notation do all the plumbing:

p3 = do foo <- Free (Get "foo" Return)
        Free (Set "bar" foo (Return ()))
        Free End

This is better, but it's still a bit awkward. We have Free and Return all over the place. Happily, there's a pattern we can exploit: the way we "lift" a DSL action into Free is always the same—we wrap it in Free and apply Return for next:

liftFree :: Functor f => f a -> Free f a
liftFree action = Free (fmap Return action)

Now, using this, we can write nice versions of each of our commands and have a full DSL:

get key       = liftFree (Get key id)
set key value = liftFree (Set key value ())
end           = liftFree End

Using this, here's how we can write our program:

p4 :: Free DSL a
p4 = do foo <- get "foo"
        set "bar" foo
        end

The neat trick is that while p4 looks just like a little imperative program, it's actually an expression that has the value

Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))

So, the free monad part of the pattern has gotten us a DSL that produces syntax trees with nice syntax. We can also write composable sub-trees by not using End; for example, we could have follow which takes a key, gets its value and then uses that as a key itself:

follow :: String -> Free DSL String
follow key = do key' <- get key
                get key'

Now follow can be used in our programs just like get or set:

p5 = do foo <- follow "foo"
        set "bar" foo
        end

So we get some nice composition and abstraction for our DSL as well.

Now that we have a tree, we get to the second half of the pattern: the interpreter. We can interpret the tree however we like just by pattern-matching on it. This would let us write code against a real data store in IO, as well as other things. Here's an example against a hypothetical data store:

runIO :: Free DSL a -> IO ()
runIO (Free (Get key k)) =
  do res <- getKey key
     runIO $ k res
runIO (Free (Set key value next)) =
  do setKey key value
     runIO next
runIO (Free End) = close
runIO (Return _) = return ()

This will happily evaluate any DSL fragment, even one that isn't ended with end. Happily, we can make a "safe" version of the function that only accepts programs closed with end by setting the input type signature to (forall a. Free DSL a) -> IO (). While the old signature accepts a Free DSL a for any a (like Free DSL String, Free DSL Int and so on), this version only accepts a Free DSL a that works for every possible a—which we can only create with end. This guarantees we won't forget to close the connection when we're done.

safeRunIO :: (forall a. Free DSL a) -> IO ()
safeRunIO = runIO

(We can't just start by giving runIO this type because it won't work properly for our recursive call. However, we could move the definition of runIO into a where block in safeRunIO and get the same effect without exposing both versions of the function.)

Running our code in IO is not the only thing we could do. For testing, we might want to run it against a pure State Map instead. Writing out that code is a good exercise.

So this is the free monad + interpreter pattern. We make a DSL, taking advantage of the free monad structure to do all the plumbing. We can use do-notation and the standard monad functions with our DSL. Then, to actually use it, we have to interpret it somehow; since the tree is ultimately just a data structure, we can interpret it however we like for different purposes.

When we use this to manage accesses to an external data store, it is indeed similar to the Repository pattern. It intermediates between our data store and our code, separating the two out. In some ways, though, it's more specific: the "repository" is always a DSL with an explicit AST which we can then use however we like.

However, the pattern itself is more general than that. It can be used for lots of things which do not necessarily involve external databases or storage. It makes sense wherever you want fine control of effects or multiple targets for a DSL.

Best Answer

Related Solutions

Repository Pattern – How to Implement for Complex Object Models

JavaScript Monad – When Does a Monad Become a Hammer?

Related Topic