IO Monad Pattern – Academic Benefits for Handling Side Effects

Tags: functional-programming, monad, python, side-effects, state

Sorry for yet another FP + side effects question, but I couldn't find an existing one which quite answered this for me.

My (limited) understanding of functional programming is that state/side effects should be minimised and kept separate from stateless logic.

I also gather Haskell’s approach to this, the IO monad, achieves this by wrapping stateful actions in a container, for later execution, considered outside the scope of the program itself.

I’m trying to understand this pattern, mainly to decide whether to use it in a Python project, so I'd like to avoid Haskell specifics if possible.

Crude example incoming.

If my program converts an XML file to a JSON file:

def main():
    xml_data = read_file('input.xml')  # impure
    json_data = convert(xml_data)  # pure
    write_file('output.json', json_data)  # impure

Isn’t the IO monad’s approach effectively to do this:

steps = [
    read_file,
    convert,
    write_file,
]

then absolve itself of responsibility by not actually calling those steps, but letting the interpreter do it?

Or put another way, it’s like writing:

def main():  # pure
    def inner():  # impure
        xml_data = read_file('input.xml')
        json_data = convert(xml_data)
        write_file('output.json', json_data)
    return inner

then expecting someone else to call inner() and saying your job is done because main() is pure.
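
To make that concrete, here is a minimal sketch of the kind of container I have in mind. The IO class and its map, bind and run methods are hypothetical names of my own, not an existing library, and convert is only a placeholder for the real conversion.

```python
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


class IO(Generic[A]):
    """Hypothetical container: describes an effect without performing it."""

    def __init__(self, thunk: Callable[[], A]) -> None:
        self._thunk = thunk

    def map(self, f: Callable[[A], B]) -> "IO[B]":
        # Apply a pure function to the eventual result.
        return IO(lambda: f(self._thunk()))

    def bind(self, f: Callable[[A], "IO[B]"]) -> "IO[B]":
        # Sequence a further effect that depends on this one's result.
        return IO(lambda: f(self._thunk()).run())

    def run(self) -> A:
        # The only place where effects actually happen.
        return self._thunk()


def read_file(path: str) -> IO[str]:
    def action() -> str:
        with open(path, encoding="utf-8") as f:
            return f.read()
    return IO(action)


def write_file(path: str, data: str) -> IO[None]:
    def action() -> None:
        with open(path, "w", encoding="utf-8") as f:
            f.write(data)
    return IO(action)


def convert(xml_data: str) -> str:
    # Placeholder for the real, pure conversion.
    return xml_data.upper()


def main(src: str, dst: str) -> IO[None]:
    # Builds a description of the program; nothing is read or written yet.
    return read_file(src).map(convert).bind(lambda data: write_file(dst, data))
```

Calling main('input.xml', 'output.json') only returns an IO value; the files are touched when somebody calls .run() on it, which is the "someone else's job" part of my question.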

The whole program is going to end up contained in the IO monad, basically.

When the code is actually executed, everything after reading the file depends on that file’s state so will still suffer from the same state-related bugs as the imperative implementation, so have you actually gained anything, as a programmer who will maintain this?

I totally appreciate the benefit of reducing and isolating stateful behaviour, which is in fact why I structured the imperative version like that: gather inputs, do pure stuff, spit out outputs. Hopefully convert() can be completely pure and reap the benefits of cacheability, thread safety, etc.

I also appreciate that monadic types can be useful, especially in pipelines operating on comparable types, but don’t see why IO should use monads unless already in such a pipeline.

Is there some additional benefit to dealing with side effects the IO monad pattern brings, which I’m missing?

Best Answer

> The whole program is going to end up contained in the IO monad, basically.

That's the bit where I think you're not seeing it from the Haskellers' perspective. So we have a program like this:

module Main where

main :: IO ()
main = do
  xmlData <- readFile "input.xml"
  let jsonData = convert xmlData
  writeFile "output.json" jsonData

convert :: String -> String
convert xml = ...

I think a typical Haskeller's take on this would be that convert, the pure part:

  1. Is probably the bulk of this program, and by far more complicated than the IO parts;
  2. Can be reasoned about and tested without having to deal with IO at all.

So they don't see this as convert being "contained" in IO, but rather as convert being isolated from IO. Its type, String -> String, tells you that whatever convert does can never depend on anything that happens in an IO action.
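
Since the question is about Python, the same separation can be sketched there without any monad machinery. This is only an illustration under assumptions: the one-level XML-to-JSON mapping is a toy I made up, and Python will not enforce the purity of convert the way a Haskell type does.

```python
import json
import xml.etree.ElementTree as ET


def convert(xml_data: str) -> str:
    """Pure: the same input string always gives the same output string."""
    root = ET.fromstring(xml_data)
    # Assumed toy mapping: one level of child elements becomes a JSON object.
    return json.dumps({child.tag: child.text for child in root}, sort_keys=True)


def main() -> None:
    # Impure shell: the only code that touches the file system.
    with open("input.xml", encoding="utf-8") as f:
        xml_data = f.read()
    with open("output.json", "w", encoding="utf-8") as f:
        f.write(convert(xml_data))


def test_convert() -> None:
    # No files, mocks, or temporary directories are needed here.
    xml_data = "<p><name>Ada</name><lang>en</lang></p>"
    assert convert(xml_data) == '{"lang": "en", "name": "Ada"}'
```

The test exercises the bulk of the program while never going near main, which is the sense in which convert is isolated from IO rather than contained in it.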

> When the code is actually executed, everything after reading the file depends on that file’s state so will still suffer from the same state-related bugs as the imperative implementation, so have you actually gained anything, as a programmer who will maintain this?

I'd say that this splits into two things:

  1. When the program runs, the value of the argument to convert depends on the state of the file.
  2. But what the convert function does doesn't depend on the state of the file. convert is always the same function, even if it is invoked with different arguments at different points.

This is a somewhat abstract point, but it's really key to what Haskellers mean when they talk about this. You want to write convert in such a way that given any valid argument, it will produce a correct result for that argument. When you look at it like that, the fact that reading a file is a stateful operation doesn't enter into the equation; all that matters is that whatever argument is fed to it and wherever that may have come from, convert must handle it correctly. And the fact that purity restricts what convert can do with its input simplifies that reasoning.

So if convert produces incorrect results for some arguments, and readFile feeds it such an argument, we don't see that as a bug introduced by state. It's a bug in a pure function!
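
As a rough Python illustration of that last point, suppose convert had a bug where repeated tags overwrite each other. Both functions below are hypothetical, as is the toy XML-to-JSON mapping. The point is only that the failing case can be reproduced and fixed with a plain string, with no file involved.

```python
import json
import xml.etree.ElementTree as ET


def convert(xml_data: str) -> str:
    # Hypothetical buggy version: repeated tags silently overwrite each other.
    root = ET.fromstring(xml_data)
    return json.dumps({child.tag: child.text for child in root})


def convert_fixed(xml_data: str) -> str:
    # Collect every value per tag so that nothing is lost.
    root = ET.fromstring(xml_data)
    result = {}
    for child in root:
        result.setdefault(child.tag, []).append(child.text)
    return json.dumps(result)


# The failing input is just a string; it does not matter that it was
# first seen in a file on disk.
bad_input = "<tags><tag>a</tag><tag>b</tag></tags>"
assert convert(bad_input) == '{"tag": "b"}'  # data loss: "a" is gone
assert convert_fixed(bad_input) == '{"tag": ["a", "b"]}'
```

Once the offending argument is captured as a literal, the debugging session is entirely about a pure function and its input.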