Haskell Performance – Why Haskell’s Built-in Max Function Runs Faster

haskell performance

I noticed that, for some reason, Haskell's built-in max function (which returns the greater of two numbers) runs much faster than the one I wrote, even though they are essentially identical.

From this site: http://www.haskell.org/onlinereport/standard-prelude.html,
I found that the standard max function is defined as:

    max x y 
     | x <= y    =  y
     | otherwise =  x

which is capable of executing

    foldr max 0 [0..10000000]

in 7.6 seconds (my laptop is in super power-saving mode).

However, when I wrote the exact same function and ran it,

    foldr myMax 0 [0..10000000]

took an average of 23.74 seconds.

The two functions look identical, except that the built-in max doesn't seem to have a type signature (unless it's hidden somewhere).

Does anyone know what might be going on here? I seriously doubt that a built-in function should run more than three times faster than an identical, user-defined one. To me that would be very strange.

(When I say that they're identical, I mean literally clones of each other. Just to test, I C&P'd it right out of the Prelude, and it's still significantly slower.)

Edit: I thought about it more, and I think it might have something to do with the included functions being pre-compiled, whereas my functions are being interpreted by GHCi (which would make sense). I'll leave this up in case someone has a better answer, but I suspect this to be the cause.

(One thing I realized I don't understand, though, is why GHCi says that it compiled my code after an edit, but then goes on to say that it's interpreting it. You don't interpret compiled code, do you?)

Best Answer

When you loaded the definition of your own max function into ghci, you may not have noticed that ghci indicated that it was interpreted (something along these lines):

    Prelude> :l mymax.hs
    [1 of 1] Compiling Main             ( mymax.hs, interpreted )
    Ok, modules loaded: Main.

In order to really benchmark performance, compile your definition with ghc and run the test again. It should execute at about the same speed as the built-in max function. Don't forget to enable optimization too (-O2).
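
As a rough sketch of what that compiled test could look like, the file below puts the question's myMax next to the Prelude's max and times both folds. The `timed` helper, the `Int` annotation on the list, and the use of `getCPUTime` for timing are assumptions added for illustration, not part of the original question; the actual timings will depend on your machine and GHC version.

    -- mymax.hs
    -- Build and run (assuming ghc is on your PATH):
    --   ghc -O2 mymax.hs
    --   ./mymax
    module Main (main) where

    import Control.Exception (evaluate)
    import System.CPUTime (getCPUTime)

    -- Same definition as the Prelude's default max, copied from the question.
    myMax :: Ord a => a -> a -> a
    myMax x y
      | x <= y    = y
      | otherwise = x

    -- Hypothetical helper: force a value and report the CPU seconds it took.
    -- getCPUTime returns picoseconds, hence the division by 1e12.
    timed :: a -> IO (a, Double)
    timed x = do
      start  <- getCPUTime
      result <- evaluate x
      end    <- getCPUTime
      return (result, fromIntegral (end - start) / 1e12)

    main :: IO ()
    main = do
      -- The Int annotation is an assumption; without it the literals
      -- default to Integer.
      (a, ta) <- timed (foldr max   0 [0 .. 10000000 :: Int])
      (b, tb) <- timed (foldr myMax 0 [0 .. 10000000 :: Int])
      putStrLn ("max:   " ++ show a ++ " in " ++ show ta ++ " s")
      putStrLn ("myMax: " ++ show b ++ " in " ++ show tb ++ " s")

Compiled this way, both folds run as native code, so any remaining difference between them should be far smaller than the gap seen when myMax is interpreted in ghci.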