Multithreading Server – Many Blocking vs. Single Non-Blocking Workers


Assume there is an HTTP server which accepts connections and then has to somehow wait for the headers to be fully sent. I wonder what the most common way of implementing this is, and what the pros and cons are. I can only think of these:

Many blocking workers are good because:

  • It is more responsive.
  • Easier to introduce new connections (workers pick them up themselves rather than an outsider waiting until it can add them to a synchronized list).
  • CPU usage balances automatically (without any additional effort) as the number of connections increases and decreases.
  • Less CPU usage (blocked threads are taken out of the execution loop and do not require any logic for jumping between clients).

Single non-blocking worker is good because:

  • Uses less memory.
  • Less vulnerable to lazy clients (which connect to the server and send headers slowly or don't send at all).

As you can probably see, in my opinion multiple worker threads seem like the slightly better solution overall. The only problem with it is that it is easier to attack such a server.

Edit (more research):
A resource I found on the web (Thousands of Threads and Blocking I/O – The old way to write Java Servers is New again (and way better) by Paul Tyma) hints that the blocking approach is generally better, but I still don't really know how to deal with fake connections.

P.S. Do not suggest using some library or application for the task. I am more interested in knowing how it actually works, or may work, rather than just having it work.

P.P.S. I have split the logic into multiple parts, and this one only handles accepting HTTP headers; it does not process them.

Best Answer

There's no silver bullet

In practice it depends...

tl;dr - easy solution, use nginx...

Blocking:

For instance, Apache by default uses a blocking scheme where a process is forked for every connection. That means every connection needs its own memory space, and the sheer amount of context-switching overhead increases as the number of connections grows. But the benefit is that once a connection is closed, the context can be disposed of and all of its memory easily reclaimed.

A multi-threaded approach would be similar in that the overhead of context switching increases with the number of connections, but it may be more memory efficient because the threads share a context. The problem with such an approach is that it's difficult to manage shared memory safely. The approaches to overcoming memory synchronization problems include their own overhead: for instance, locking can stall threads under contention on CPU-intensive loads, and using immutable types adds a lot of unnecessary copying of data.
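
To make that concrete, here's a rough sketch of the thread-per-connection model in plain Java, doing only what the question asks about (accepting a connection and reading the header block). The port, worker count, timeout and 8 KB limit are arbitrary numbers picked for illustration, not recommendations, and it assumes the JDK tolerates several threads blocking in accept() on the same ServerSocket, which the stock implementations do. Note that each worker accepts a connection itself, and the socket timeout is one simple way to drop the "lazy clients" you mention:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class BlockingHeaderServer {
        public static void main(String[] args) throws IOException {
            ServerSocket server = new ServerSocket(8080);   // port chosen arbitrarily
            for (int i = 0; i < 64; i++) {                   // arbitrary worker count
                new Thread(() -> workerLoop(server)).start();
            }
        }

        // Every worker accepts connections itself; no outside code hands it work.
        static void workerLoop(ServerSocket server) {
            while (true) {
                try (Socket client = server.accept()) {
                    client.setSoTimeout(5_000);              // drop clients that send headers too slowly
                    String headers = readHeaders(client.getInputStream());
                    // ... hand the raw header block to the next stage here ...
                } catch (IOException ignored) {
                    // timeout, oversized headers or a reset: just close and move on
                }
            }
        }

        // Blocks until the blank line that ends the HTTP header block, or gives up at 8 KB.
        static String readHeaders(InputStream in) throws IOException {
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
                if (sb.length() > 8192) throw new IOException("header block too large");
                if (sb.length() >= 4 && sb.substring(sb.length() - 4).equals("\r\n\r\n")) return sb.toString();
            }
            throw new IOException("connection closed before headers completed");
        }
    }

While a worker sits in read(), all of its state (the partially read headers) simply lives on its own stack, which is exactly why this style is easy to write.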

AFAIK, using a multi-process approach with a blocking HTTP server is generally preferred because it's safer and simpler to manage and reclaim memory. Garbage collection becomes a non-issue when recovering memory is as simple as stopping a process. For long-running processes (ie a daemon) that characteristic is especially important.

While context-switching overhead may seem insignificant with a small number of workers, the disadvantages become more relevant as the load scales up to hundreds or thousands of concurrent connections. At best, context switching scales O(n) with the number of workers present, but in practice it's most likely worse.

While servers that use blocking may not be the ideal choice for IO-heavy loads, they are ideal for CPU-intensive work where message passing is kept to a minimum.

Non-Blocking:

Non-blocking would be something like Node.js or nginx. These are especially known for scaling to a much larger number of connections per node under IO-intensive load. Basically, once people hit the upper limit of what thread/process-based servers could handle they started to explore alternative options. This is otherwise known as the C10K problem (ie the ability to handle 10,000 concurrent connections).

Non-blocking async servers generally share a lot of characteristics with a multi-threaded-with-locking approach in that you have to be careful to avoid CPU-intensive loads, because you don't want to overload the main thread. The advantage is that the overhead incurred by context switching is essentially eliminated, and with only one context, message passing becomes a non-issue.
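
For contrast, a single non-blocking worker along the lines of what the question describes could look roughly like this with Java NIO (error handling, the write side, and slow-client timeouts are omitted; the 8 KB buffer is again an arbitrary limit). The thing to notice is that per-connection state has to live in an explicit object, here the buffer attached to the selection key, rather than on a thread's stack:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class NonBlockingHeaderServer {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(8080));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            while (true) {                                        // the single event loop
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        // per-connection state is the attachment, not a thread's stack
                        client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(8192));
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer buf = (ByteBuffer) key.attachment();
                        int n = client.read(buf);                 // reads whatever is available, never blocks
                        if (n == -1 || headersComplete(buf)) {
                            key.cancel();
                            // ... hand the buffered header block to the next stage, then close or reuse ...
                        }
                    }
                }
            }
        }

        // True once the buffer ends with the blank line that terminates the header block.
        static boolean headersComplete(ByteBuffer buf) {
            int p = buf.position();
            return p >= 4 && buf.get(p - 4) == '\r' && buf.get(p - 3) == '\n'
                          && buf.get(p - 2) == '\r' && buf.get(p - 1) == '\n';
        }
    }

A lazy client costs this loop almost nothing: its buffer just sits there half-filled, which is why the question's "less vulnerable to lazy clients" point holds for this design.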

While it may not work for many networking protocols, HTTP's stateless nature works especially well for non-blocking architectures. By combining a reverse proxy with multiple non-blocking HTTP servers, it's possible to identify and route around the nodes experiencing heavy load.

Even in a deployment with only one node, it's very common for the setup to include one server per processor core to maximize throughput.

Both:

The 'ideal' use case would be a combination of both: a reverse proxy at the front dedicated to routing requests, then a mix of blocking and non-blocking servers behind it. Non-blocking for IO tasks like serving static content, cached content, and HTML content. Blocking for CPU-heavy tasks like encoding images/video, streaming content, number crunching, database writes, etc.

In your case:

If you're just checking headers but not actually processing the requests, what you're essentially describing is a reverse proxy. In such a case I'd definitely go with an async approach.

I'd suggest checking out the documentation for the nginx built-in reverse proxy.

Aside:

I read the write-up from the link you provided and it makes sense that async was a poor choice for their particular implementation. The issue can be summed up in one statement:

Found that when switching between clients, the code for saving and restoring values/state was difficult

They were building a stateful platform. In such a case, an async approach would mean you'd have to constantly save/load the state every time the context switches (ie when an event fires). In addition, on the SMTP side they're doing a lot of CPU-intensive work.
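
In code, that save/restore problem looks roughly like the following (the Session class and onReadable callback are made up purely for illustration of an SMTP-like, stateful protocol). With blocking workers this state would just be local variables sitting on the worker's stack while it waits; in the async model it has to be parked in an explicit object and looked up again on every event:

    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical per-connection state for an SMTP-like, stateful protocol.
    final class Session {
        String sender;                                       // set after MAIL FROM
        final List<String> recipients = new ArrayList<>();   // grows with each RCPT TO
        final StringBuilder body = new StringBuilder();      // accumulated across many read events
    }

    final class AsyncStateExample {
        private final Map<SocketChannel, Session> sessions = new HashMap<>();

        // Called by the event loop whenever a socket becomes readable.
        void onReadable(SocketChannel channel, ByteBuffer data) {
            Session s = sessions.computeIfAbsent(channel, c -> new Session());  // "load" the state
            // ... advance the protocol state machine using s and the new data ...
            // whatever was written into s is the "saved" state until the next event fires
        }
    }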

It sounds like they had a pretty poor grasp of async and, as a result, made a lot of bad assumptions.
