Haskell – Why is Haskell (GHC) so darn fast

ghchaskellhigher-order-functionslambda-calculusperformance

Haskell (with the GHC compiler) is a lot faster than you'd expect. Used correctly, it can get close-ish to low-level languages. (A favorite thing for Haskellers to do is to try and get within 5% of C (or even beat it, but that means you are using an inefficient C program, since GHC compiles Haskell to C).) My question is, why?

Haskell is declarative and based on lambda calculus. Machine architectures are clearly imperative, being based on turing machines, roughly. Indeed, Haskell doesn't even have a specific evaluation order. Also, instead of dealing with machine data types, you make algebraic data types all the time.

Weirdest of all though is higher order functions. You would think that creating functions on the fly, and throwing them around, would make a program slower. But using higher order functions actually makes Haskell faster. Indeed, it seems that, to optimize Haskell code, you need to make it more elegant and abstract instead of more machine-like. None of Haskell's more advanced features seem to even affect its performance, if they don't improve it.

Sorry if this is sounding ranty, but here is my question: Why is Haskell (compiled with GHC) so fast, considering its abstract nature and differences from physical machines?

Note: The reason I say C and other imperative languages are somewhat similar to Turing Machines (but not to the extent that Haskell is similar to Lambda Calculus) is that in an imperative language, you have a finite number of states (a.k.a. line number), along with a Tape (the ram), such that the state and the current tape determine what to do to the tape. See the Wikipedia entry, Turing machine equivalents, for the transition from Turing Machines to computers.

Best Answer

I agree with Dietrich Epp: it's a combination of several things that make GHC fast.

First and foremost, Haskell is very high-level. This enables the compiler to perform aggressive optimisations without breaking your code.

Think about SQL. Now, when I write a SELECT statement, it might look like an imperative loop, but it isn't. It might look like it loops over all rows in that table trying to find the one that matches the specified conditions, but actually the "compiler" (the DB engine) could be doing an index lookup instead — which has completely different performance characteristics. But because SQL is so high-level, the "compiler" can substitute totally different algorithms, apply multiple processors or I/O channels or entire servers transparently, and more.

I think of Haskell as being the same. You might think you just asked Haskell to map the input list to a second list, filter the second list into a third list, and then count how many items resulted. But you didn't see GHC apply stream-fusion rewrite rules behind the scenes, transforming the entire thing into a single tight machine code loop that does the whole job in a single pass over the data with no allocation — the kind of thing that would be tedious, error-prone and non-maintainable to write by hand. That's only really possible because of the lack of low-level details in the code.

Another way to look at it might be… why shouldn't Haskell be fast? What does it do that should make it slow?

It's not an interpreted language like Perl or JavaScript. It's not even a virtual machine system like Java or C#. It compiles all the way down to native machine code, so no overhead there.

Unlike OO languages [Java, C#, JavaScript…], Haskell has full type erasure [like C, C++, Pascal…]. All type checking happens at compile-time only. So there's no run-time type-checking to slow you down either. (No null-pointer checks, for that matter. In, say, Java, the JVM must check for null pointers and throw an exception if you deference one. Haskell doesn't have to bother with that check.)

You say it sounds slow to "create functions on the fly at run-time", but if you look very carefully, you don't actually do that. It might look like you do, but you don't. If you say (+5), well, that's hard-coded into your source code. It cannot change at run-time. So it's not really a dynamic function. Even curried functions are really just saving parameters into a data block. All the executable code actually exists at compile-time; there is no run-time interpretation. (Unlike some other languages that have an "eval function".)

Think about Pascal. It's old and nobody really uses it any more, but nobody would complain that Pascal is slow. There are plenty of things to dislike about it, but slowness is not really one of them. Haskell isn't really doing that much that's different to Pascal, other than having garbage collection rather than manual memory management. And immutable data allows several optimisations to the GC engine [which lazy evaluation then complicates somewhat].

I think the thing is that Haskell looks advanced and sophisticated and high-level, and everybody thinks "oh wow, this is really powerful, it must be amazingly slow!" But it isn't. Or at least, it isn't in the way you'd expect. Yes, it's got an amazing type system. But you know what? That all happens at compile-time. By run-time, it's gone. Yes, it allows you to construct complicated ADTs with a line of code. But you know what? An ADT is just a plain ordinary C union of structs. Nothing more.

The real killer is lazy evaluation. When you get the strictness / laziness of your code right, you can write stupidly fast code that is still elegant and beautiful. But if you get this stuff wrong, your program goes thousands of times slower, and it's really non-obvious why this is happening.

For example, I wrote a trivial little program to count how many times each byte appears in a file. For a 25KB input file, the program took 20 minutes to run and swallowed 6 gigabytes of RAM! That's absurd!! But then I realized what the problem was, added a single bang-pattern, and the run-time dropped to 0.02 seconds.

This is where Haskell goes unexpectedly slowly. And it sure takes a while to get used to it. But over time, it gets easier to write really fast code.

What makes Haskell so fast? Purity. Static types. Laziness. But above all, being sufficiently high-level that the compiler can radically change the implementation without breaking your code's expectations.

But I guess that's just my opinion...

Related Solutions

When is assembly faster than C?

Here is a real world example: Fixed point multiplies on old compilers.

These don't only come handy on devices without floating point, they shine when it comes to precision as they give you 32 bits of precision with a predictable error (float only has 23 bit and it's harder to predict precision loss). i.e. uniform absolute precision over the entire range, instead of close-to-uniform relative precision (float).

Modern compilers optimize this fixed-point example nicely, so for more modern examples that still need compiler-specific code, see

Getting the high part of 64 bit integer multiplication: A portable version using uint64_t for 32x32 => 64-bit multiplies fails to optimize on a 64-bit CPU, so you need intrinsics or __int128 for efficient code on 64-bit systems.
_umul128 on Windows 32 bits: MSVC doesn't always do a good job when multiplying 32-bit integers cast to 64, so intrinsics helped a lot.

C doesn't have a full-multiplication operator (2N-bit result from N-bit inputs). The usual way to express it in C is to cast the inputs to the wider type and hope the compiler recognizes that the upper bits of the inputs aren't interesting:

// on a 32-bit machine, int can hold 32-bit fixed-point integers.
int inline FixedPointMul (int a, int b)
{
  long long a_long = a; // cast to 64 bit.

  long long product = a_long * b; // perform multiplication

  return (int) (product >> 16);  // shift by the fixed point bias
}

The problem with this code is that we do something that can't be directly expressed in the C-language. We want to multiply two 32 bit numbers and get a 64 bit result of which we return the middle 32 bit. However, in C this multiply does not exist. All you can do is to promote the integers to 64 bit and do a 64*64 = 64 multiply.

x86 (and ARM, MIPS and others) can however do the multiply in a single instruction. Some compilers used to ignore this fact and generate code that calls a runtime library function to do the multiply. The shift by 16 is also often done by a library routine (also the x86 can do such shifts).

So we're left with one or two library calls just for a multiply. This has serious consequences. Not only is the shift slower, registers must be preserved across the function calls and it does not help inlining and code-unrolling either.

If you rewrite the same code in (inline) assembler you can gain a significant speed boost.

In addition to this: using ASM is not the best way to solve the problem. Most compilers allow you to use some assembler instructions in intrinsic form if you can't express them in C. The VS.NET2008 compiler for example exposes the 32*32=64 bit mul as __emul and the 64 bit shift as __ll_rshift.

Using intrinsics you can rewrite the function in a way that the C-compiler has a chance to understand what's going on. This allows the code to be inlined, register allocated, common subexpression elimination and constant propagation can be done as well. You'll get a huge performance improvement over the hand-written assembler code that way.

For reference: The end-result for the fixed-point mul for the VS.NET compiler is:

int inline FixedPointMul (int a, int b)
{
    return (int) __ll_rshift(__emul(a,b),16);
}

The performance difference of fixed point divides is even bigger. I had improvements up to factor 10 for division heavy fixed point code by writing a couple of asm-lines.

Using Visual C++ 2013 gives the same assembly code for both ways.

gcc4.1 from 2007 also optimizes the pure C version nicely. (The Godbolt compiler explorer doesn't have any earlier versions of gcc installed, but presumably even older GCC versions could do this without intrinsics.)

See source + asm for x86 (32-bit) and ARM on the Godbolt compiler explorer. (Unfortunately it doesn't have any compilers old enough to produce bad code from the simple pure C version.)

Modern CPUs can do things C doesn't have operators for at all, like popcnt or bit-scan to find the first or last set bit. (POSIX has a ffs() function, but its semantics don't match x86 bsf / bsr. See https://en.wikipedia.org/wiki/Find_first_set).

Some compilers can sometimes recognize a loop that counts the number of set bits in an integer and compile it to a popcnt instruction (if enabled at compile time), but it's much more reliable to use __builtin_popcnt in GNU C, or on x86 if you're only targeting hardware with SSE4.2: _mm_popcnt_u32 from <immintrin.h>.

Or in C++, assign to a std::bitset<32> and use .count(). (This is a case where the language has found a way to portably expose an optimized implementation of popcount through the standard library, in a way that will always compile to something correct, and can take advantage of whatever the target supports.) See also https://en.wikipedia.org/wiki/Hamming_weight#Language_support.

Similarly, ntohl can compile to bswap (x86 32-bit byte swap for endian conversion) on some C implementations that have it.

Another major area for intrinsics or hand-written asm is manual vectorization with SIMD instructions. Compilers are not bad with simple loops like dst[i] += src[i] * 10.0;, but often do badly or don't auto-vectorize at all when things get more complicated. For example, you're unlikely to get anything like How to implement atoi using SIMD? generated automatically by the compiler from scalar code.

Haskell – Getting started with Haskell

I'm going to order this guide by the level of skill you have in Haskell, going from an absolute beginner right up to an expert. Note that this process will take many months (years?), so it is rather long.

Absolute Beginner

Firstly, Haskell is capable of anything, with enough skill. It is very fast (behind only C and C++ in my experience), and can be used for anything from simulations to servers, guis and web applications.

However there are some problems that are easier to write for a beginner in Haskell than others. Mathematical problems and list process programs are good candidates for this, as they only require the most basic of Haskell knowledge to be able to write.

Some good guides to learning the very basics of Haskell are the Happy Learn Haskell Tutorial and the first 6 chapters of Learn You a Haskell for Great Good (or its JupyterLab adaptation). While reading these, it is a very good idea to also be solving simple problems with what you know.

Another two good resources are Haskell Programming from first principles, and Programming in Haskell. They both come with exercises for each chapter, so you have small simple problems matching what you learned on the last few pages.

A good list of problems to try is the haskell 99 problems page. These start off very basic, and get more difficult as you go on. It is very good practice doing a lot of those, as they let you practice your skills in recursion and higher order functions. I would recommend skipping any problems that require randomness as that is a bit more difficult in Haskell. Check this SO question in case you want to test your solutions with QuickCheck (see Intermediate below).

Once you have done a few of those, you could move on to doing a few of the Project Euler problems. These are sorted by how many people have completed them, which is a fairly good indication of difficulty. These test your logic and Haskell more than the previous problems, but you should still be able to do the first few. A big advantage Haskell has with these problems is Integers aren't limited in size. To complete some of these problems, it will be useful to have read chapters 7 and 8 of learn you a Haskell as well.

Beginner

After that you should have a fairly good handle on recursion and higher order functions, so it would be a good time to start doing some more real world problems. A very good place to start is Real World Haskell (online book, you can also purchase a hard copy). I found the first few chapters introduced too much too quickly for someone who has never done functional programming/used recursion before. However with the practice you would have had from doing the previous problems you should find it perfectly understandable.

Working through the problems in the book is a great way of learning how to manage abstractions and building reusable components in Haskell. This is vital for people used to object-orientated (oo) programming, as the normal oo abstraction methods (oo classes) don't appear in Haskell (Haskell has type classes, but they are very different to oo classes, more like oo interfaces). I don't think it is a good idea to skip chapters, as each introduces a lot new ideas that are used in later chapters.

After a while you will get to chapter 14, the dreaded monads chapter (dum dum dummmm). Almost everyone who learns Haskell has trouble understanding monads, due to how abstract the concept is. I can't think of any concept in another language that is as abstract as monads are in functional programming. Monads allows many ideas (such as IO operations, computations that might fail, parsing,...) to be unified under one idea. So don't feel discouraged if after reading the monads chapter you don't really understand them. I found it useful to read many different explanations of monads; each one gives a new perspective on the problem. Here is a very good list of monad tutorials. I highly recommend the All About Monads, but the others are also good.

Also, it takes a while for the concepts to truly sink in. This comes through use, but also through time. I find that sometimes sleeping on a problem helps more than anything else! Eventually, the idea will click, and you will wonder why you struggled to understand a concept that in reality is incredibly simple. It is awesome when this happens, and when it does, you might find Haskell to be your favorite imperative programming language :)

To make sure that you are understanding Haskell type system perfectly, you should try to solve 20 intermediate haskell exercises. Those exercises using fun names of functions like "furry" and "banana" and helps you to have a good understanding of some basic functional programming concepts if you don't have them already. Nice way to spend your evening with a bunch of papers covered with arrows, unicorns, sausages and furry bananas.

Intermediate

Once you understand Monads, I think you have made the transition from a beginner Haskell programmer to an intermediate haskeller. So where to go from here? The first thing I would recommend (if you haven't already learnt them from learning monads) is the various types of monads, such as Reader, Writer and State. Again, Real world Haskell and All about monads gives great coverage of this. To complete your monad training learning about monad transformers is a must. These let you combine different types of Monads (such as a Reader and State monad) into one. This may seem useless to begin with, but after using them for a while you will wonder how you lived without them.

Now you can finish the real world Haskell book if you want. Skipping chapters now doesn't really matter, as long as you have monads down pat. Just choose what you are interested in.

With the knowledge you would have now, you should be able to use most of the packages on cabal (well the documented ones at least...), as well as most of the libraries that come with Haskell. A list of interesting libraries to try would be:

Parsec: for parsing programs and text. Much better than using regexps. Excellent documentation, also has a real world Haskell chapter.
QuickCheck: A very cool testing program. What you do is write a predicate that should always be true (eg length (reverse lst) == length lst). You then pass the predicate the QuickCheck, and it will generate a lot of random values (in this case lists) and test that the predicate is true for all results. See also the online manual.
HUnit: Unit testing in Haskell.
gtk2hs: The most popular gui framework for Haskell, lets you write gtk applications.
happstack: A web development framework for Haskell. Doesn't use databases, instead a data type store. Pretty good docs (other popular frameworks would be snap and yesod).

Also, there are many concepts (like the Monad concept) that you should eventually learn. This will be easier than learning Monads the first time, as your brain will be used to dealing with the level of abstraction involved. A very good overview for learning about these high level concepts and how they fit together is the Typeclassopedia.

Applicative: An interface like Monads, but less powerful. Every Monad is Applicative, but not vice versa. This is useful as there are some types that are Applicative but are not Monads. Also, code written using the Applicative functions is often more composable than writing the equivalent code using the Monad functions. See Functors, Applicative Functors and Monoids from the learn you a haskell guide.
Foldable,Traversable: Typeclasses that abstract many of the operations of lists, so that the same functions can be applied to other container types. See also the haskell wiki explanation.
Monoid: A Monoid is a type that has a zero (or mempty) value, and an operation, notated <> that joins two Monoids together, such that x <> mempty = mempty <> x = x and x <> (y <> z) = (x <> y) <> z. These are called identity and associativity laws. Many types are Monoids, such as numbers, with mempty = 0 and <> = +. This is useful in many situations.
Arrows: Arrows are a way of representing computations that take an input and return an output. A function is the most basic type of arrow, but there are many other types. The library also has many very useful functions for manipulating arrows - they are very useful even if only used with plain old Haskell functions.
Arrays: the various mutable/immutable arrays in Haskell.
ST Monad: lets you write code with a mutable state that runs very quickly, while still remaining pure outside the monad. See the link for more details.
FRP: Functional Reactive Programming, a new, experimental way of writing code that handles events, triggers, inputs and outputs (such as a gui). I don't know much about this though. Paul Hudak's talk about yampa is a good start.

There are a lot of new language features you should have a look at. I'll just list them, you can find lots of info about them from google, the haskell wikibook, the haskellwiki.org site and ghc documentation.

Multiparameter type classes/functional dependencies
Type families
Existentially quantified types
Phantom types
GADTS
others...

A lot of Haskell is based around category theory, so you may want to look into that. A good starting point is Category Theory for Computer Scientist. If you don't want to buy the book, the author's related article is also excellent.

Finally you will want to learn more about the various Haskell tools. These include:

ghc (and all its features)
cabal: the Haskell package system
darcs: a distributed version control system written in Haskell, very popular for Haskell programs.
haddock: a Haskell automatic documentation generator

While learning all these new libraries and concepts, it is very useful to be writing a moderate-sized project in Haskell. It can be anything (e.g. a small game, data analyser, website, compiler). Working on this will allow you to apply many of the things you are now learning. You stay at this level for ages (this is where I'm at).

Expert

It will take you years to get to this stage (hello from 2009!), but from here I'm guessing you start writing phd papers, new ghc extensions, and coming up with new abstractions.

Getting Help

Finally, while at any stage of learning, there are multiple places for getting information. These are:

the #haskell irc channel
the mailing lists. These are worth signing up for just to read the discussions that take place - some are very interesting.
other places listed on the haskell.org home page

Conclusion

Well this turned out longer than I expected... Anyway, I think it is a very good idea to become proficient in Haskell. It takes a long time, but that is mainly because you are learning a completely new way of thinking by doing so. It is not like learning Ruby after learning Java, but like learning Java after learning C. Also, I am finding that my object-orientated programming skills have improved as a result of learning Haskell, as I am seeing many new ways of abstracting ideas.

Best Answer

Related Solutions

When is assembly faster than C?

Haskell – Getting started with Haskell

Related Topic