The actual pattern is actually significantly more general than just data access. It's a lightweight way of creating a domain-specific language that gives you an AST, and then having one or more interpreters to "execute" the AST however you like.
The free monad part is just a handy way to get an AST that you can assemble using Haskell's standard monad facilities (like do-notation) without having to write lots of custom code. This also ensures that your DSL is composable: you can define it in parts and then put the parts together in a structured way, letting you take advantage of Haskell's normal abstractions like functions.
Using a free monad gives you the structure of a composable DSL; all you have to do is specify the pieces. You just write a data type that encompasses all of the actions in your DSL. These actions could be doing anything, not just data access. However, if you specified all your data accesses as actions, you would get an AST that specifies all the queries and commands to the data store. You could then interpret this however you like: run it against a live database, run it against a mock, just log the commands for debugging or even try optimizing the queries.
Lets look at a very simple example for, say, a key value store. For now, we'll just treat both keys and values as strings, but you could add types with a bit of effort.
data DSL next = Get String (String -> next)
| Set String String next
| End
The next
parameter lets us combine actions. We can use this to write a program that gets "foo" and sets "bar" with that value:
p1 = Get "foo" $ \ foo -> Set "bar" foo End
Unfortunately, this is not enough for a meaningful DSL. Since we used next
for composition, the type of p1
is the same length as our program (ie 3 commands):
p1 :: DSL (DSL (DSL next))
In this particular example, using next
like this seems a little odd, but it's important if we want our actions to have different type variables. We might want a typed get
and set
, for example.
Note how the next
field is different for each action. This hints that we can use it to make DSL
a functor:
instance Functor DSL where
fmap f (Get name k) = Get name (f . k)
fmap f (Set name value next) = Set name value (f next)
fmap f End = End
In fact, this is the only valid way to make it a Functor, so we can use deriving
to create the instance automatically by enabling the DeriveFunctor
extension.
The next step is the Free
type itself. That's what we use to represent our AST structure, build on top of the DSL
type. You can think of it like a list at the type level, where "cons" is just nesting a functor like DSL
:
-- compare the two types:
data Free f a = Free (f (Free f a)) | Return a
data List a = Cons a (List a) | Nil
So we can use Free DSL next
to give programs of different sizes the same types:
p2 = Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))
Which has the much nicer type:
p2 :: Free DSL a
However, the actual expression with all of its constructors is still very awkward to use! This is where the monad part comes in. As the name "free monad" implies, Free
is a monad—as long as f
(in this case DSL
) is a functor:
instance Functor f => Monad (Free f) where
return = Return
Free a >>= f = Free (fmap (>>= f) a)
Return a >>= f = f a
Now we're getting somewhere: we can use do
notation to make our DSL expressions nicer. The only question is what to put in for next
? Well, the idea is to use the Free
structure for composition, so we will just put Return
for each next field and let the do-notation do all the plumbing:
p3 = do foo <- Free (Get "foo" Return)
Free (Set "bar" foo (Return ()))
Free End
This is better, but it's still a bit awkward. We have Free
and Return
all over the place. Happily, there's a pattern we can exploit: the way we "lift" a DSL action into Free
is always the same—we wrap it in Free
and apply Return
for next
:
liftFree :: Functor f => f a -> Free f a
liftFree action = Free (fmap Return action)
Now, using this, we can write nice versions of each of our commands and have a full DSL:
get key = liftFree (Get key id)
set key value = liftFree (Set key value ())
end = liftFree End
Using this, here's how we can write our program:
p4 :: Free DSL a
p4 = do foo <- get "foo"
set "bar" foo
end
The neat trick is that while p4
looks just like a little imperative program, it's actually an expression that has the value
Free (Get "foo" $ \ foo -> Free (Set "bar" foo (Free End)))
So, the free monad part of the pattern has gotten us a DSL that produces syntax trees with nice syntax. We can also write composable sub-trees by not using End
; for example, we could have follow
which takes a key, gets its value and then uses that as a key itself:
follow :: String -> Free DSL String
follow key = do key' <- get key
get key'
Now follow
can be used in our programs just like get
or set
:
p5 = do foo <- follow "foo"
set "bar" foo
end
So we get some nice composition and abstraction for our DSL as well.
Now that we have a tree, we get to the second half of the pattern: the interpreter. We can interpret the tree however we like just by pattern-matching on it. This would let us write code against a real data store in IO
, as well as other things. Here's an example against a hypothetical data store:
runIO :: Free DSL a -> IO ()
runIO (Free (Get key k)) =
do res <- getKey key
runIO $ k res
runIO (Free (Set key value next)) =
do setKey key value
runIO next
runIO (Free End) = close
runIO (Return _) = return ()
This will happily evaluate any DSL
fragment, even one that isn't ended with end
. Happily, we can make a "safe" version of the function that only accepts programs closed with end
by setting the input type signature to (forall a. Free DSL a) -> IO ()
. While the old signature accepts a Free DSL a
for any a
(like Free DSL String
, Free DSL Int
and so on), this version only accepts a Free DSL a
that works for every possible a
—which we can only create with end
. This guarantees we won't forget to close the connection when we're done.
safeRunIO :: (forall a. Free DSL a) -> IO ()
safeRunIO = runIO
(We can't just start by giving runIO
this type because it won't work properly for our recursive call. However, we could move the definition of runIO
into a where
block in safeRunIO
and get the same effect without exposing both versions of the function.)
Running our code in IO
is not the only thing we could do. For testing, we might want to run it against a pure State Map
instead. Writing out that code is a good exercise.
So this is the free monad + interpreter pattern. We make a DSL, taking advantage of the free monad structure to do all the plumbing. We can use do-notation and the standard monad functions with our DSL. Then, to actually use it, we have to interpret it somehow; since the tree is ultimately just a data structure, we can interpret it however we like for different purposes.
When we use this to manage accesses to an external data store, it is indeed similar to the Repository pattern. It intermediates between our data store and our code, separating the two out. In some ways, though, it's more specific: the "repository" is always a DSL with an explicit AST which we can then use however we like.
However, the pattern itself is more general than that. It can be used for lots of things which do not necessarily involve external databases or storage. It makes sense wherever you want fine control of effects or multiple targets for a DSL.
You have described an effect system. It’s true that there are other effect systems than monads, but in practice monads give you a lot of expressive power that you would need to reinvent in any practical effect system you might devise.
For example:
- I start by tagging my I/O procedures with an
io
effect.
- My pure functions can still throw exceptions, but throwing exceptions isn’t
io
, so I add an exn
effect. Now I need to be able to combine effects.
- I want to have non-I/O functions that use mutation internally on a local heap
h
, and I want the compiler to make sure that my mutable references don’t escape their scope, so I add an st<h>
effect. Now I need parameterised effects.
- For higher-order functions, I need to write multiple versions for every combination of effects I want to support—for example,
map
with a pure function versus map
with an impure function. Now I need effect polymorphism.
- I notice that my user-defined type could be used to model effects (benign failure, nondeterminism) that the language doesn’t have natively. Now I need user-defined effects.
Monads support all of these use cases:
- Functions that may return errors can use the
Either
monad.
- Effects can be combined with monad transformers—if I need exceptions and I/O, I can use the
EitherT IO
monad.
- Monads can be parameterised like any other type; the
ST h
monad lets you do local mutation on STRef
s in the scope of a heap h
.
- Because of the
Monad
typeclass, I can overload an operation to work on any kind of effect. For example, mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
works in IO
, but it also works in Either
or ST
or any other monad.
- All monads are implemented as user-defined types, except for things like
IO
and ST
which must be implemented in the Haskell runtime.
There are of course some significant warts in Haskell’s use of monads for side effects.
For one thing, monad transformers are generally not commutative, so StateT Maybe
is different from MaybeT State
, and you have to be careful to choose the combination that you actually mean.
The bigger problem is that a function can be polymorphic in which monad is being used, but it cannot be polymorphic in whether a monad is being used. So we get loads of duplicated code: map
vs. mapM
; zipWith
vs. zipWithM
, &c.
Now, in answer to your actual question…
Koka from Microsoft Research is a language with an effect system that does not suffer from these problems. For example, the type of Koka’s map
function includes a type parameter e
for effects:
map : (xs : list<a>, f : (a) -> e b) -> e list<b>
And this parameter can be instantiated to the null effect, making map
work for pure and impure functions alike. In addition, the order in which effects are specified is immaterial: <exn,io>
is the same as <io,exn>
.
It’s also worth mentioning that Java’s checked exception mechanism is an example of an effect system that people have widely accepted, but I don’t feel it adequately answers your question because it’s not general.
Best Answer
That's the bit where I think you're not seeing it from the Haskellers' perspective. So we have a program like this:
I think a typical Haskeller's take on this would be that
convert
, the pure part:IO
parts;IO
at all.So they don't see this as
convert
being "contained" inIO
, but rather, as it being isolated fromIO
. From its type, whateverconvert
does can never depend on anything that happens in anIO
action.I'd say that this splits into two things:
convert
depends on the state of the file.convert
function does, that doesn't depend on the state of the file.convert
is always the same function, even if it is invoked with different arguments at different points.This is a somewhat abstract point, but it's really key to what Haskellers mean when they talk about this. You want to write
convert
in such a way that given any valid argument, it will produce a correct result for that argument. When you look at it like that, the fact that reading a file is a stateful operation doesn't enter into the equation; all that matters is that whatever argument is fed to it and wherever that may have come from,convert
must handle it correctly. And the fact that purity restricts whatconvert
can do with its input simplifies that reasoning.So if
convert
produces incorrect results from some arguments, andreadFile
feeds it such an argument, we don't see that as a bug introduced by state. It's a bug in a pure function!