Architecture – Saving High-Frequency Events to a Connection-Limit Constrained Database

architecture · event-sourcing · performance · scalability

We've got a situation where we have to deal with a massive influx of events coming into our server: about 1000 events per second on average (peaks could reach ~2000).

The problem

Our system is hosted on Heroku and uses a relatively expensive Heroku Postgres DB, which allows a maximum of 500 DB connections. We use connection pooling to connect from the server to the DB.

Events come in faster than the DB connection pool can handle

The problem we have is that events come in faster than the connection pool can handle. By the time one connection has finished the network round trip from the server to the DB and can be released back to the pool, more events have arrived than the pool can keep up with.

Eventually the events stack up waiting to be saved; because there are no available connections in the pool, they time out and the whole system becomes non-operational.

We've resolved the immediate emergency by having the clients emit the offending high-frequency events at a slower pace, but we still want to know how to handle this scenario in case we need to handle such high-frequency events again.

Constraints

Other clients might want to read events concurrently

Other clients continuously request to read all the events with a particular key, even if they are not saved in the DB yet.

A client can query `GET api/v1/events?clientId=1` and get all the events sent by client 1, even if those events haven't finished saving to the DB yet.

Are there any "classroom" examples on how to deal with this?

Possible solutions

Enqueue the events on our server

We could enqueue the events on the server (with the queue having a maximum concurrency of 400 so the connection pool doesn't run out).
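For concreteness, here is roughly what that in-process queue could look like — a minimal sketch assuming Node/TypeScript, the `pg` driver and a hypothetical `events(client_id, payload)` table; the 400 cap comes from the number above. The unbounded `backlog` array is exactly where the problems listed below come from:

```typescript
import { Pool } from "pg";

// Pool capped below the plan's 500-connection limit (numbers from the text above).
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 400 });

type IncomingEvent = { clientId: number; payload: unknown };

const backlog: IncomingEvent[] = []; // grows without bound -> eats RAM
let inFlight = 0;
const MAX_CONCURRENCY = 400;

export function enqueue(event: IncomingEvent): void {
  backlog.push(event);
  drain();
}

function drain(): void {
  while (inFlight < MAX_CONCURRENCY && backlog.length > 0) {
    const event = backlog.shift()!;
    inFlight++;
    pool
      .query("INSERT INTO events (client_id, payload) VALUES ($1, $2)", [
        event.clientId,
        JSON.stringify(event.payload),
      ])
      .catch((err) => console.error("insert failed", err)) // event is lost on failure or dyno restart
      .finally(() => {
        inFlight--;
        drain();
      });
  }
}
```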

This is a bad idea because:

  • It will eat up available server memory. The stacked-up enqueued events will consume massive amounts of RAM.
  • Our servers restart once every 24 hours. This is a hard limit imposed by Heroku. The server can restart while events are enqueued causing us to lose the enqueued events.
  • It introduces state on the server, thus hurting scalability. If we have a multi-server setup and a client wants to read all the enqueued + saved events, we won't know on which server the enqueued events live.

Use a separate message queue

I assume we could use a message queue (like RabbitMQ?) into which we pump the messages, with another server on the other end whose only job is saving the events to the DB.

I'm not sure whether message queues allow querying enqueued events (those that haven't been saved yet). If they do, then when a client wants to read the messages of another client, I can fetch the saved messages from the DB and the pending messages from the queue, concatenate them, and send them back to the client that made the read request.
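The consumer side of that idea might look roughly like this — a sketch assuming Node/TypeScript with `amqplib` and `pg`; the queue name, table and prefetch value are illustrative. The web server would just `sendToQueue` with `{ persistent: true }` and return immediately. Note that a plain work queue like this can't be queried by key, which is exactly the uncertainty above:

```typescript
import amqp from "amqplib";
import { Pool } from "pg";

const QUEUE = "events"; // hypothetical queue name
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 50 });

// Writer process: drain the queue into Postgres at a pace the pool can sustain.
async function runDbWriter() {
  const conn = await amqp.connect(process.env.CLOUDAMQP_URL!);
  const ch = await conn.createChannel();
  await ch.assertQueue(QUEUE, { durable: true });
  ch.prefetch(50); // at most 50 unacked messages in flight

  await ch.consume(QUEUE, async (msg) => {
    if (!msg) return;
    const evt = JSON.parse(msg.content.toString());
    await pool.query(
      "INSERT INTO events (client_id, payload) VALUES ($1, $2)",
      [evt.clientId, JSON.stringify(evt.payload)]
    );
    ch.ack(msg); // only ack after the row is safely stored
  });
}

runDbWriter().catch(console.error);
```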

Use multiple databases, each saving a portion of the messages with a central DB-coordinator server to manage them

Another solution we've thought of is to use multiple databases with a central "DB coordinator/load balancer". Upon receiving an event, this coordinator would choose one of the databases to write the message to. This would allow us to use multiple Heroku databases, raising the connection limit to 500 × the number of databases.

Upon a read query, this coordinator could issue SELECT queries to each database, merge all the results and send them back to the client that requested the read.
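The read path of such a coordinator could be a simple scatter-gather, sketched here with two shards; the env var names and the `events` table are assumptions:

```typescript
import { Pool } from "pg";

// One pool per Heroku Postgres instance (connection strings are placeholders).
const shards = [
  new Pool({ connectionString: process.env.DATABASE_URL_1, max: 400 }),
  new Pool({ connectionString: process.env.DATABASE_URL_2, max: 400 }),
];

// Fan the read out to every shard, then merge and re-sort the partial results.
export async function readEvents(clientId: number) {
  const results = await Promise.all(
    shards.map((pool) =>
      pool.query(
        "SELECT client_id, payload, created_at FROM events WHERE client_id = $1",
        [clientId]
      )
    )
  );
  return results
    .flatMap((r) => r.rows)
    .sort((a, b) => a.created_at - b.created_at);
}
```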

This is a bad idea because:

  • This idea sounds like… ahem… over-engineering. It would also be a nightmare to manage (backups etc.). It's complicated to build and maintain, and unless it's absolutely necessary it sounds like a KISS violation.
  • It sacrifices consistency. Transactions across multiple DBs are a no-go if we go with this idea.

Best Answer

Input stream

It is not clear if your 1000 events/second represent peaks or if it's a continuous load:

  • if it's a peak, you could use a message queue as a buffer to spread the load on the DB server over a longer time;
  • if it's a constant load, the message queue alone is not sufficient, because the DB server will never be able to catch up. Then you'd need to think about a distributed database.
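A rough way to decide which case you're in (the latency figure below is an illustrative assumption, not a measurement): the sustainable write rate is roughly the pool size divided by the average insert round trip, and a buffer only helps if the long-run input rate stays below that.

```typescript
// Back-of-the-envelope capacity check (illustrative numbers only).
const poolSize = 400;             // connections the writer is allowed to use
const avgInsertSeconds = 0.05;    // measured average INSERT round trip, e.g. 50 ms
const maxSustainedWritesPerSec = poolSize / avgInsertSeconds; // = 8000 here

const avgInputRate = 1000;        // events/s from the question
// A queue only absorbs peaks; if avgInputRate exceeds the sustainable rate,
// the backlog grows forever and you need batching or a distributed DB.
console.log(avgInputRate < maxSustainedWritesPerSec ? "buffer is enough" : "must scale the DB");
```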

Proposed solution

Intuitively, in both cases I'd go for a Kafka-based event stream:

  • All events are systematically published to a Kafka topic.
  • A consumer subscribes to the events and stores them in the database.
  • A query processor handles the requests from clients and queries the DB.
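A minimal sketch of the first two pieces, assuming Node/TypeScript with the `kafkajs` client and `pg`; the broker address, topic name and table are placeholders:

```typescript
import { Kafka } from "kafkajs";
import { Pool } from "pg";

const kafka = new Kafka({ clientId: "event-api", brokers: ["kafka:9092"] });
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 50 });

// 1. The web server publishes every incoming event to the topic and returns.
//    Keying by clientId keeps one client's events in order on one partition.
const producer = kafka.producer();

export async function startProducer() {
  await producer.connect();
}

export async function publishEvent(evt: { clientId: number; payload: unknown }) {
  await producer.send({
    topic: "events",
    messages: [{ key: String(evt.clientId), value: JSON.stringify(evt) }],
  });
}

// 2. A separate consumer process drains the topic into Postgres at its own pace;
//    the connection pool never has to keep up with the raw input rate.
export async function runDbWriter() {
  const consumer = kafka.consumer({ groupId: "db-writer" });
  await consumer.connect();
  await consumer.subscribe({ topic: "events", fromBeginning: true });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const evt = JSON.parse(message.value!.toString());
      await pool.query(
        "INSERT INTO events (client_id, payload) VALUES ($1, $2)",
        [evt.clientId, JSON.stringify(evt.payload)]
      );
    },
  });
}

// 3. The query processor is an ordinary HTTP handler that reads from the DB
//    (the options below cover events that haven't reached the DB yet).
```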

This is highly scalable at all levels:

  • If the DB server is the bottleneck, just add several consumers. Each could subscribe to the topic and write to a different DB server. However, if events are distributed randomly across the DB servers, the query processor cannot predict which DB server holds a given event and has to query all of them. This could create a new bottleneck on the query side.
  • The DB distribution scheme can therefore be anticipated by organising the event stream into several topics (for example, using groups of keys or properties to partition the DB according to a predictable logic); a sketch of this key-based routing follows the list.
  • If one message server is not sufficient to handle a growing flood of input events, you can add Kafka partitions to distribute a topic across several physical brokers.
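Here is the key-based routing mentioned above, sketched under the same assumptions (hypothetical env vars, shard count and `events` table): the same `clientId → shard` function is used by the writer and by the query processor, so a read hits exactly one DB instead of all of them. If the same key also drives the Kafka partition, each consumer can own a set of partitions and its matching shard.

```typescript
import { Pool } from "pg";

// Deterministic routing: the same clientId always maps to the same DB,
// so the query processor knows exactly which shard to ask.
const shards = [
  new Pool({ connectionString: process.env.DATABASE_URL_1, max: 400 }),
  new Pool({ connectionString: process.env.DATABASE_URL_2, max: 400 }),
];

export function shardFor(clientId: number): Pool {
  return shards[clientId % shards.length];
}

// DB-writer consumer: insert into the shard the key routes to.
export async function storeEvent(evt: { clientId: number; payload: unknown }) {
  await shardFor(evt.clientId).query(
    "INSERT INTO events (client_id, payload) VALUES ($1, $2)",
    [evt.clientId, JSON.stringify(evt.payload)]
  );
}

// Query processor: a single-shard read instead of a scatter-gather.
export async function readEvents(clientId: number) {
  const { rows } = await shardFor(clientId).query(
    "SELECT client_id, payload, created_at FROM events WHERE client_id = $1 ORDER BY created_at",
    [clientId]
  );
  return rows;
}
```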

Offering events not yet written in the DB to clients

You want your clients to also be able to access information that is still in the pipe and not yet written to the DB. This is a little more delicate.

Option 1: Using a cache to complement DB queries

I have not analysed this in depth, but the first idea that comes to my mind would be to make the query processor(s) consumers of the Kafka topics, but in a different Kafka consumer group. Each query processor would then receive all the messages that the DB writer receives, but independently, and could keep them in a local cache. Queries would then run against the DB plus the cache (with elimination of duplicates).
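A rough sketch of that query processor, assuming `kafkajs`, the same hypothetical `events` topic and table as above, and that every event carries a unique `eventId` so duplicates can be eliminated:

```typescript
import { Kafka } from "kafkajs";
import { Pool } from "pg";

const kafka = new Kafka({ clientId: "query-processor", brokers: ["kafka:9092"] });
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20 });

// In-memory cache of recent events per client, filled from the topic.
// (A real implementation would evict entries once they are known to be in the DB.)
const recentByClient = new Map<string, { eventId: string; payload: unknown }[]>();

export async function startCacheConsumer() {
  // A different groupId than the DB writer, so both receive every message.
  const consumer = kafka.consumer({ groupId: "query-cache" });
  await consumer.connect();
  await consumer.subscribe({ topic: "events" });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const evt = JSON.parse(message.value!.toString());
      const list = recentByClient.get(String(evt.clientId)) ?? [];
      list.push({ eventId: evt.eventId, payload: evt.payload });
      recentByClient.set(String(evt.clientId), list);
    },
  });
}

// Query path: DB rows plus cached rows, de-duplicated on the event id.
export async function readEvents(clientId: number) {
  const { rows } = await pool.query(
    "SELECT event_id, payload FROM events WHERE client_id = $1",
    [clientId]
  );
  const saved = rows.map((r) => ({ eventId: r.event_id, payload: r.payload }));
  const seen = new Set(saved.map((e) => e.eventId));
  const pending = (recentByClient.get(String(clientId)) ?? []).filter(
    (e) => !seen.has(e.eventId)
  );
  return [...saved, ...pending];
}
```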

The design would then look like:

(architecture diagram: producers → Kafka topic → DB writer and query processors in separate consumer groups)

The scalability of this query layer could be achieved by adding more query processors (each in its own consumer group).

Option 2: design a dual API

A better approach IMHO would be to offer a dual API (using the separate-consumer-group mechanism described above):

  • a query API for accessing events in the DB and/or making analytics
  • a streaming API that just forwards messages directly from the topic

The advantage is that you let the client decide what is interesting. This avoids systematically merging DB data with freshly cached data when the client is only interested in new incoming events. If the delicate merge between fresh and archived events is really needed, then the client has to organise it.
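The streaming half could be as simple as a server-sent-events endpoint fed by its own consumer group; this sketch assumes Express and `kafkajs`, with an illustrative endpoint path and the same `events` topic keyed by clientId as above:

```typescript
import express from "express";
import { Kafka } from "kafkajs";

const app = express();
const kafka = new Kafka({ clientId: "streaming-api", brokers: ["kafka:9092"] });

// Currently connected SSE clients, keyed by the clientId they asked for.
const subscribers = new Map<string, Set<express.Response>>();

// Streaming API: forward new events straight from the topic, no DB involved.
app.get("/api/v1/events/stream", (req, res) => {
  const clientId = String(req.query.clientId);
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.flushHeaders();

  const set = subscribers.get(clientId) ?? new Set<express.Response>();
  set.add(res);
  subscribers.set(clientId, set);
  req.on("close", () => set.delete(res));
});

async function startForwarder() {
  const consumer = kafka.consumer({ groupId: "streaming-api" });
  await consumer.connect();
  await consumer.subscribe({ topic: "events" });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const clientId = message.key?.toString() ?? "";
      for (const res of subscribers.get(clientId) ?? []) {
        res.write(`data: ${message.value?.toString()}\n\n`);
      }
    },
  });
}

startForwarder().catch(console.error);
app.listen(Number(process.env.PORT) || 3000);
```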

Variants

I proposed Kafka because it's designed for very high volumes and persists messages, so you can restart the servers if needed.

You could build a similar architecture with RabbitMQ. However, if you need persistent queues, it might decrease performance. Also, as far as I know, the only way to achieve parallel consumption of the same messages by several readers (e.g. writer + cache) with RabbitMQ is to clone the queues. So higher scalability might come at a higher price.
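For reference, the usual way to get that duplication with RabbitMQ is a fanout exchange bound to one durable queue per reader; a sketch with `amqplib`, where the exchange and queue names are illustrative:

```typescript
import amqp from "amqplib";

// "Cloning the queues": a fanout exchange copies every event into one durable
// queue per independent reader (DB writer, query cache, ...).
async function setupTopology() {
  const conn = await amqp.connect(process.env.CLOUDAMQP_URL!);
  const ch = await conn.createChannel();

  await ch.assertExchange("events", "fanout", { durable: true });
  for (const queue of ["events.db-writer", "events.query-cache"]) {
    await ch.assertQueue(queue, { durable: true });
    await ch.bindQueue(queue, "events", "");
  }

  // Publishers send to the exchange; every bound queue receives a copy.
  ch.publish("events", "", Buffer.from(JSON.stringify({ clientId: 1, payload: {} })), {
    persistent: true,
  });
}

setupTopology().catch(console.error);
```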
