How/why do functional languages (specifically Erlang) scale well?

concurrency, erlang, functional programming, scalability

I have been watching the growing visibility of functional programming languages and features for a while. I looked into them and didn't see the reason for the appeal.

Then, recently I attended Kevin Smith's "Basics of Erlang" presentation at Codemash.

I enjoyed the presentation and learned that many attributes of functional programming make it much easier to avoid threading/concurrency issues. I understand that the lack of state and mutability makes it impossible for multiple threads to alter the same data, but Kevin said (if I understood correctly) that all communication takes place through messages and that the messages are processed synchronously (again avoiding concurrency issues).

But I have read that Erlang is used in highly scalable applications (the whole reason Ericsson created it in the first place). How can it be efficient handling thousands of requests per second if everything is handled as a synchronously processed message? Isn't that why we started moving towards asynchronous processing – so we can take advantage of running multiple threads of operation at the same time and achieve scalability? It seems like this architecture, while safer, is a step backwards in terms of scalability. What am I missing?

I understand the creators of Erlang intentionally avoided supporting threading to avoid concurrency problems, but I thought multi-threading was necessary to achieve scalability.

How can functional programming languages be inherently thread-safe, yet still scale?

Best Answer

A functional language doesn't (in general) rely on mutating a variable. Because of this, we don't have to protect the "shared state" of a variable, because the value is fixed. This in turn avoids the majority of the hoop jumping that traditional languages have to go through to implement an algorithm across processors or machines.
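To make the "value is fixed" point concrete, here is a minimal sketch in Python rather than Erlang, purely as an analogy. The names `READINGS`, `scaled_sum` and `run_in_parallel` are made up for illustration. The data is an immutable tuple and the function is pure, so the calls can be handed to any number of workers without a lock.

```python
from concurrent.futures import ThreadPoolExecutor

# An immutable tuple: no worker can change it, so reading it needs no lock.
READINGS = (3, 1, 4, 1, 5, 9, 2, 6)

def scaled_sum(factor):
    """Pure function: the result depends only on the argument and the fixed data."""
    return sum(factor * r for r in READINGS)

def run_in_parallel(factors):
    # Each call is independent of the others, so they can run on any worker
    # in any order; pool.map still returns results in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(scaled_sum, factors))

results = run_in_parallel([1, 2, 3])
print(results)
```

Nothing here needs protecting because nothing is ever modified after it is created, which is the situation a functional language gives you by default.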

Erlang takes it further than traditional functional languages by baking in a message-passing system. Everything operates in an event-based style: a piece of code only worries about receiving messages and sending messages, not about the bigger picture.

What this means is that the programmer is (nominally) unconcerned that the message will be handled on another processor or machine: simply sending the message is good enough for it to continue. If it cares about a response, it will wait for it as another message.

The end result of this is that each snippet is independent of every other snippet. No shared code, no shared state, and all interactions come through a message system that can be distributed among many pieces of hardware (or not).
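A rough sketch of that send-and-continue pattern, again in Python as an analogy and not Erlang itself. The `spawn` helper, the `doubler` handler and the message shapes are all assumptions for illustration: a thread with a `queue.Queue` stands in for a process with a mailbox. The worker takes its messages one at a time, but the sender never waits for the worker to process them.

```python
import queue
import threading

def spawn(handler):
    """Start a worker with its own mailbox and return that mailbox (hypothetical helper)."""
    mailbox = queue.Queue()

    def loop():
        while True:
            message = mailbox.get()   # messages are taken one at a time
            if message is None:       # stop signal used by this sketch only
                break
            handler(message)

    threading.Thread(target=loop, daemon=True).start()
    return mailbox

def doubler(message):
    # The message carries the sender's mailbox, so the reply is just another message.
    reply_to, value = message
    reply_to.put(("doubled", value * 2))

worker = spawn(doubler)
inbox = queue.Queue()          # the caller's own mailbox

worker.put((inbox, 21))        # sending returns immediately
# ... the caller is free to do other work here ...
reply = inbox.get(timeout=5)   # wait only where the answer is actually needed
worker.put(None)               # ask the worker to stop
print(reply)
```

Each worker is sequential on the inside, which is what keeps it safe. The scaling comes from running many such workers at once, none of which shares anything with the others.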

Contrast this with a traditional system: we have to place mutexes and semaphores around "protected" variables and code execution. We have tight binding in a function call via the stack (the caller waits for the return to occur). All of this creates bottlenecks that are less of a problem in a shared-nothing system like Erlang.
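The contrast can be sketched side by side. This is Python used as an analogy, and `LockedCounter`, `counter_process` and `hammer` are hypothetical names. In the first style every writer has to take the lock; in the second, one worker owns the count and everyone else only sends it messages. The queue does synchronize internally, but the application code never touches a lock.

```python
import queue
import threading

class LockedCounter:
    """Shared-state style: every writer must remember to take the lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1

def counter_process(mailbox, result):
    """Message style: this worker alone owns count, so it needs no lock."""
    count = 0
    while True:
        message = mailbox.get()
        if message == "stop":
            result.put(count)
            return
        count += 1

def hammer(action, workers=4, times=1000):
    """Call action() `times` times from each of `workers` threads."""
    def repeat():
        for _ in range(times):
            action()
    threads = [threading.Thread(target=repeat) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

locked = LockedCounter()
hammer(locked.increment)

mailbox, result = queue.Queue(), queue.Queue()
owner = threading.Thread(target=counter_process, args=(mailbox, result))
owner.start()
hammer(lambda: mailbox.put("inc"))
mailbox.put("stop")            # queued after every "inc", since hammer has joined
message_total = result.get(timeout=5)
owner.join()
print(locked.value, message_total)
```

Both versions arrive at the same total. The difference is where the coordination lives: spread across every caller in the first, confined to one owner in the second.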

EDIT: I should also point out that message passing in Erlang is asynchronous. You send your message and maybe/someday another message arrives back. Or not.

Spencer's point about out of order execution is also important and well answered.
