I've seen people talking about Free Monad with Interpreter, particularly in the context of data-access. What is this pattern? When might I want to use it? How does it work, and how would I implement it?
I understand (from posts such as this) that it's about separating model from data-access. How does it differ from the well-known Repository pattern? They appear to have the same motivation.
Best Answer
The actual pattern is actually significantly more general than just data access. It's a lightweight way of creating a domain-specific language that gives you an AST, and then having one or more interpreters to "execute" the AST however you like.
The free monad part is just a handy way to get an AST that you can assemble using Haskell's standard monad facilities (like do-notation) without having to write lots of custom code. This also ensures that your DSL is composable: you can define it in parts and then put the parts together in a structured way, letting you take advantage of Haskell's normal abstractions like functions.
Using a free monad gives you the structure of a composable DSL; all you have to do is specify the pieces. You just write a data type that encompasses all of the actions in your DSL. These actions could be doing anything, not just data access. However, if you specified all your data accesses as actions, you would get an AST that specifies all the queries and commands to the data store. You could then interpret this however you like: run it against a live database, run it against a mock, just log the commands for debugging or even try optimizing the queries.
Lets look at a very simple example for, say, a key value store. For now, we'll just treat both keys and values as strings, but you could add types with a bit of effort.
The
next
parameter lets us combine actions. We can use this to write a program that gets "foo" and sets "bar" with that value:Unfortunately, this is not enough for a meaningful DSL. Since we used
next
for composition, the type ofp1
is the same length as our program (ie 3 commands):In this particular example, using
next
like this seems a little odd, but it's important if we want our actions to have different type variables. We might want a typedget
andset
, for example.Note how the
next
field is different for each action. This hints that we can use it to makeDSL
a functor:In fact, this is the only valid way to make it a Functor, so we can use
deriving
to create the instance automatically by enabling theDeriveFunctor
extension.The next step is the
Free
type itself. That's what we use to represent our AST structure, build on top of theDSL
type. You can think of it like a list at the type level, where "cons" is just nesting a functor likeDSL
:So we can use
Free DSL next
to give programs of different sizes the same types:Which has the much nicer type:
However, the actual expression with all of its constructors is still very awkward to use! This is where the monad part comes in. As the name "free monad" implies,
Free
is a monad—as long asf
(in this caseDSL
) is a functor:Now we're getting somewhere: we can use
do
notation to make our DSL expressions nicer. The only question is what to put in fornext
? Well, the idea is to use theFree
structure for composition, so we will just putReturn
for each next field and let the do-notation do all the plumbing:This is better, but it's still a bit awkward. We have
Free
andReturn
all over the place. Happily, there's a pattern we can exploit: the way we "lift" a DSL action intoFree
is always the same—we wrap it inFree
and applyReturn
fornext
:Now, using this, we can write nice versions of each of our commands and have a full DSL:
Using this, here's how we can write our program:
The neat trick is that while
p4
looks just like a little imperative program, it's actually an expression that has the valueSo, the free monad part of the pattern has gotten us a DSL that produces syntax trees with nice syntax. We can also write composable sub-trees by not using
End
; for example, we could havefollow
which takes a key, gets its value and then uses that as a key itself:Now
follow
can be used in our programs just likeget
orset
:So we get some nice composition and abstraction for our DSL as well.
Now that we have a tree, we get to the second half of the pattern: the interpreter. We can interpret the tree however we like just by pattern-matching on it. This would let us write code against a real data store in
IO
, as well as other things. Here's an example against a hypothetical data store:This will happily evaluate any
DSL
fragment, even one that isn't ended withend
. Happily, we can make a "safe" version of the function that only accepts programs closed withend
by setting the input type signature to(forall a. Free DSL a) -> IO ()
. While the old signature accepts aFree DSL a
for anya
(likeFree DSL String
,Free DSL Int
and so on), this version only accepts aFree DSL a
that works for every possiblea
—which we can only create withend
. This guarantees we won't forget to close the connection when we're done.(We can't just start by giving
runIO
this type because it won't work properly for our recursive call. However, we could move the definition ofrunIO
into awhere
block insafeRunIO
and get the same effect without exposing both versions of the function.)Running our code in
IO
is not the only thing we could do. For testing, we might want to run it against a pureState Map
instead. Writing out that code is a good exercise.So this is the free monad + interpreter pattern. We make a DSL, taking advantage of the free monad structure to do all the plumbing. We can use do-notation and the standard monad functions with our DSL. Then, to actually use it, we have to interpret it somehow; since the tree is ultimately just a data structure, we can interpret it however we like for different purposes.
When we use this to manage accesses to an external data store, it is indeed similar to the Repository pattern. It intermediates between our data store and our code, separating the two out. In some ways, though, it's more specific: the "repository" is always a DSL with an explicit AST which we can then use however we like.
However, the pattern itself is more general than that. It can be used for lots of things which do not necessarily involve external databases or storage. It makes sense wherever you want fine control of effects or multiple targets for a DSL.