Architecture – Scaling websocket client connections (not server) to multiple servers

Tags: architecture, distributed-computing, scalability, websockets

I wrote a Slack bot which must connect to Slack teams through websocket connections. Since the bot might be used by thousands of teams, I will eventually need to distribute the teams across multiple servers. New teams are added through an HTTP server which handles the initial OAuth authentication.

I'm looking for a solution that will help me achieve the following:

  • When a server goes down or reboots, all the teams it was assigned to must be re-assigned to the remaining servers. It's ok if the connection to the Slack team temporarily goes down as long as the team is quickly picked up by a server.

  • When a team is added, it gets assigned to the least "busy" server. Busy could simply be defined as the number of teams the server currently handles.

  • I'd like to achieve all of this while writing as little custom code as possible.

So far, I've considered the following solutions:

1) Work queue with RabbitMQ. Bot servers compete to receive teams. That's an OK solution, though I need a reliable way to put teams back in the queue when a server goes down (see the rough sketch after this list).

2) Write a custom "orchestration" service. The orchestration service would receive teams from the HTTP server and dispatch them to a cluster of servers. It would need to keep track of when servers go down and which teams need to be re-assigned. I'm not really sure how to write such a service reliably, and it would become a single point of failure.

3) Your suggestions!
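For reference, here is a rough sketch of what I had in mind for option 1, using pika's competing-consumers pattern. The "teams" queue name, the broker host, and handle_team() are placeholders; the idea is to never ack a delivery while a bot server holds the team, so the broker requeues it if that server dies:

```python
import pika

def open_slack_websocket(team_id: str) -> None:
    # Placeholder for the real per-team websocket setup.
    print(f"connecting websocket for team {team_id}")

def handle_team(channel, method, properties, body):
    team_id = body.decode()
    open_slack_websocket(team_id)
    # Deliberately no basic_ack: if this process dies, RabbitMQ requeues the
    # unacked delivery and another bot server picks the team up.

connection = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq-host"))
channel = connection.channel()
channel.queue_declare(queue="teams", durable=True)
channel.basic_qos(prefetch_count=100)  # rough cap on teams per server (assumption)
channel.basic_consume(queue="teams", on_message_callback=handle_team)
channel.start_consuming()
```

One caveat I'm aware of: newer RabbitMQ versions enforce a delivery acknowledgement timeout, so holding deliveries unacked indefinitely would likely require tuning that on the broker.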

Best Answer

Fundamentally, you're looking for advice on how to achieve load balancing. Even with the limitations you've offered, this is a pretty broad topic.

One possible solution:

Starting under the assumption that you have some means for any arbitrary client to obtain a list of currently active servers, you could use some variation on Consistent Hashing or Rendezvous Hashing.

As a simplified example, you map each server to numerous random bucket values between 0 and 1. For a given client, you hash some sort of client id (e.g., the client's IP address) to a number in the same range, then send the client to the server whose bucket value is closest to that hashed value[1].
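To make that concrete, here is a minimal sketch in Python. It assumes the bucket positions are derived deterministically from each server's name (rather than being truly random) so that every client computes the same ring without coordination; the HashRing class, server names, and ids below are illustrative, not a prescribed implementation:

```python
import hashlib

def hash_to_unit(value: str) -> float:
    """Hash an arbitrary string to a float in [0, 1)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

class HashRing:
    """Each server owns several points on [0, 1); a client goes to the nearest point."""

    def __init__(self, servers, points_per_server=50):
        self.buckets = []  # (position, server) pairs
        for server in servers:
            self.add_server(server, points_per_server)

    def add_server(self, server, points_per_server=50):
        # Deriving positions from the server name keeps every client's view identical.
        for i in range(points_per_server):
            self.buckets.append((hash_to_unit(f"{server}#{i}"), server))

    def remove_server(self, server):
        # Only the removed server's clients get remapped; everyone else stays put.
        self.buckets = [(p, s) for p, s in self.buckets if s != server]

    def server_for(self, client_id: str) -> str:
        h = hash_to_unit(client_id)
        # Distance wraps around the ring, so 0.01 and 0.99 are 0.02 apart (see [1]).
        def distance(position: float) -> float:
            d = abs(position - h)
            return min(d, 1.0 - d)
        return min(self.buckets, key=lambda bucket: distance(bucket[0]))[1]

ring = HashRing(["bot-1", "bot-2", "bot-3"])
print(ring.server_for("team-1234"))  # same answer on every machine with the same server list
ring.remove_server("bot-2")          # only clients that mapped to bot-2 move
```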

There are several benefits to this approach:

  • Other than maintaining the list of servers, all of this logic can be done client-side.
  • The logic is incredibly simple; implementing this functionality should only require 10-20 lines of code.
  • Adding or removing servers is handled cleanly. When a server is added, most clients will not switch servers. When a server is lost, only that server's clients will be remapped.

The downside of this approach is that it is random. There is a risk that load will not be distributed evenly, especially if the number of clients is low.

[1] This calculation wraps (hence why most discussions talk about angles or circles), but the impact of ignoring this is pretty small. The simplest fix is to add an extra 1+Min(bucket_value) bucket. Basically, 0.01 should be treated as closer to 0.99 than to 0.5, so that the highest and lowest bucket values aren't weighted unevenly.
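If you'd rather not add the extra bucket, another way to handle the wrap is to measure distance on the ring directly, which is what the sketch above does:

```python
def ring_distance(a: float, b: float) -> float:
    """Distance on a ring of circumference 1, so 0.01 and 0.99 are 0.02 apart."""
    d = abs(a - b)
    return min(d, 1.0 - d)

assert abs(ring_distance(0.01, 0.99) - 0.02) < 1e-9  # wraps around the ends
assert abs(ring_distance(0.01, 0.50) - 0.49) < 1e-9  # no wrap needed here
```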