Caching – Redis Write-Through Cache vs Write-Behind Cache

Tags: caching, design, redis

I have a simple use case:

  • Newsfeeds aggregated based on posts that are linked to geopoints
  • Users can comment on feed posts
  • Users can like feed posts and comments

I'm certain that Redis as a cache can provide me with what I need if this app were to come under high load; however, I am a bit puzzled about the following options:

Write-through cache

Java app -> Redis -> DB

Seems like the best option data consistency-wise, in that the request only returns once the database has been updated. However, the disadvantage is that no batch updates are possible.
If the app were to come under heavy load, this doesn't seem to be the best option.

Write-behind cache

Java app -> Redis -> Java app (batch config) -> DB

Seems like the best option performance-wise. In case of heavy load, data is transferred from the cache to the DB through Java (configuring batch sizes, delays, etc.).
However, the disadvantage seems to be that if Redis suffers a system failure, all in-memory data is gone (?)!

My question is simple:

  • Does a high-availability cluster with write-behind cache solve all of my problems?
  • Does high availability mean that Redis writes to disk, guaranteeing that no data will be lost?
  • Even with a write-behind cache, is it the safest course of action to add a message queue (e.g. RabbitMQ), or is this generally not necessary even under high load?

Losing posts in a user app like this in case of failure is a big no-no, so I wonder what your view on this is, and what the best resolution of the problem would be.

Best Answer

Your question is specific, so I'll answer it head-on first.

I anticipate that a write-behind cache will be better. Write-through might help with occasional spikes in ingress data, but it won't alleviate the problem of disk throughput (and transaction throughput).

Of course, this would need to be verified in real life with benchmarking.


My recommended implementation

I strongly recommend that you don't overengineer your architecture. Less is more.

YAGNI: Start simple and tunable, then update your architecture as needed.

1. Bypass Java and Redis, and write directly to the database

  • See https://colossal.gitbook.io/microprocess/definition/data-web-gateway. This approach lets you write directly to the database over HTTP.
  • Choose MariaDB (MySQL) as the database and choose the InnoDB engine; I am quite sure that it generally has higher throughput than PostgreSQL, and supports a higher number of concurrent connections (~200k vs 500) - but you should measure that yourself using the target hardware/VM that you plan to use. (See [Compatible Database Performance Comparisons] in https://colossal.gitbook.io/microprocess/database-system/introduction)
  • Have a table with the available Database-Web-Gateway endpoints (that you can add to in the future) and have your client connect to a random one: select * from View_DataWebGateways order by RAND() limit 1;
  • Use a writeable view so that you can point straight to the Posts table today, and be able to change that in the future. Query something like: insert into WriteView_Posts (...) values (...). The same goes for PostComments and PostReactions (see the sketch after this list).
  • (Install MySQL and the Database Web Gateway on the same VM)
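
A minimal SQL sketch of those two pieces, assuming hypothetical table and column names (Posts, DataWebGateways) - a starting point, not a definitive schema:

    -- Hypothetical posts table (column names are illustrative).
    CREATE TABLE Posts (
        PostID    BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        GeoPoint  POINT NOT NULL,
        Body      TEXT NOT NULL,
        CreatedAt DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    ) ENGINE=InnoDB;

    -- Gateway registry; clients pick a random endpoint from it.
    CREATE TABLE DataWebGateways (
        GatewayID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        Url       VARCHAR(255) NOT NULL
    ) ENGINE=InnoDB;

    CREATE VIEW View_DataWebGateways AS
        SELECT GatewayID, Url FROM DataWebGateways;

    -- Writeable view over Posts; the view name stays stable even if
    -- the underlying table is swapped out later (see step 4).
    CREATE VIEW WriteView_Posts AS
        SELECT PostID, GeoPoint, Body, CreatedAt FROM Posts;

    -- Client side: pick a random gateway.
    SELECT * FROM View_DataWebGateways ORDER BY RAND() LIMIT 1;

    -- Client side: write a post through the view.
    INSERT INTO WriteView_Posts (GeoPoint, Body, CreatedAt)
    VALUES (POINT(13.405, 52.52), 'Hello from Berlin', NOW());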

2. Optimise DB and Hardware

Right off the bat, you need a DBA and a SysOps person who you can call on to scale up your DB and VM/hardware and stay ahead of your growth curve. You'll probably find that this starts happening in, say, two years anyway. If it happens sooner, your business is doing great and you can now hire more programmers, yay!

3. Logical Sharding

With posts, you have a natural shard topic - the post. You won't have PostComments that belong to two posts; they will have an affinity for the Post. You also have geo-affiliated data, so I am guessing that people close to those locations will want that data more - with sharding you can master that data in a closer data centre.

Create a "Management" database to hold a Shards table, as well as the DataWebGateways table. This will be a replicated database with one master, and each new shard will have a read-replica of this.

For each new shard:

  • Create a new VM, and create a new database with a numbered naming scheme ("Posts_1", "Posts_2"). (There is NO need to replicate these databases.)
  • Make sure you have a read-replica of the Management database on this MariaDB instance.
  • Insert the relevant Shard and DataWebGateway records into the master instance (manually, I guess).

Now, you can make the View_DataWebGateways view a bit smarter. Let the client get all of the records, so that it can randomly choose one to use, while pinging the others to see which is closest (by latency), then switch to that.
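
A sketch of what the Management schema could look like, under the same hypothetical naming as before (the Region and IsReadOnly columns are my assumptions, added for the latency-based selection and for step 5):

    -- Management database (replicated: one master, a read-replica per shard).
    CREATE TABLE Shards (
        ShardID      INT UNSIGNED PRIMARY KEY,
        DatabaseName VARCHAR(64) NOT NULL,   -- e.g. 'Posts_1', 'Posts_2'
        Region       VARCHAR(32) NOT NULL    -- e.g. 'eu-central'
    ) ENGINE=InnoDB;

    CREATE TABLE DataWebGateways (
        GatewayID  INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        ShardID    INT UNSIGNED NOT NULL,
        Url        VARCHAR(255) NOT NULL,
        IsReadOnly TINYINT(1) NOT NULL DEFAULT 0,
        FOREIGN KEY (ShardID) REFERENCES Shards (ShardID)
    ) ENGINE=InnoDB;

    -- The smarter view: the client fetches all records, starts with a
    -- random one, pings the rest for latency, then switches.
    CREATE OR REPLACE VIEW View_DataWebGateways AS
        SELECT g.GatewayID, g.Url, g.IsReadOnly, s.ShardID, s.Region
        FROM DataWebGateways g
        JOIN Shards s ON s.ShardID = g.ShardID;

    -- Registering a new shard on the master (manually):
    INSERT INTO Shards (ShardID, DatabaseName, Region)
    VALUES (2, 'Posts_2', 'eu-west');
    INSERT INTO DataWebGateways (ShardID, Url, IsReadOnly)
    VALUES (2, 'https://dwg2.example.com', 0);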

4. Staging Table

It's probably best that you just create another shard, but if testing shows that batching greatly improves insert performance, then you can do the following:

  • Change WriteView_Posts to point to a PostStaging table instead of the Posts table.
  • Benchmark different MariaDB engines, and select the engine and configuration for the PostStaging table that optimise for insert performance. This table can be placed on a dedicated disk volume that is also benchmarked to perform best for the simulated load (sequential writes from the inserts, sequential reads from the batch job).
  • Ensure you don't have any constraints or triggers on this PostStaging table (while Posts can have them).
  • Create a Timed-Interval Microprocess which batches inserts into Posts from PostStaging every X seconds.

Now, if the power goes out, you don't have to lose your "cached" posts. This also helps because you can redirect posts to the appropriate shard by geography at this point.
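
A sketch of the staging setup and the batching step, reusing the hypothetical Posts/WriteView_Posts names from step 1; the batch size of 1000 is arbitrary and should come out of your benchmarking:

    -- Insert-optimised staging table: no constraints, no triggers,
    -- no secondary indexes; put it on its own benchmarked disk volume.
    CREATE TABLE PostStaging (
        PostID    BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        GeoPoint  POINT NOT NULL,
        Body      TEXT NOT NULL,
        CreatedAt DATETIME NOT NULL
    ) ENGINE=InnoDB;

    -- Repoint the writeable view; clients keep using WriteView_Posts.
    CREATE OR REPLACE VIEW WriteView_Posts AS
        SELECT PostID, GeoPoint, Body, CreatedAt FROM PostStaging;

    -- Timed-interval microprocess, every X seconds: copy the oldest
    -- batch into Posts, then delete the same rows. The pairing relies
    -- on AUTO_INCREMENT giving newer rows higher PostIDs; verify the
    -- behaviour under your isolation level and binlog settings.
    START TRANSACTION;
    INSERT INTO Posts (GeoPoint, Body, CreatedAt)
        SELECT GeoPoint, Body, CreatedAt
        FROM PostStaging
        ORDER BY PostID
        LIMIT 1000;
    DELETE FROM PostStaging
        ORDER BY PostID
        LIMIT 1000;
    COMMIT;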

5. "High load" probably means Reads not writes

  • Add more VM/hardware: create read-replicas of shards within the same datacentre, and add DataWebGateways records with a ReadOnly flag for the client to use. The client will do all reading from the read-replicas, and all writing to the single write-master DWebG.
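
With the hypothetical IsReadOnly flag from the earlier Management sketch, the client-side split could look like:

    -- Reads: pick a random read-replica gateway of the relevant shard.
    SELECT * FROM View_DataWebGateways
    WHERE ShardID = 2 AND IsReadOnly = 1
    ORDER BY RAND() LIMIT 1;

    -- Writes: always go to the shard's single write master.
    SELECT * FROM View_DataWebGateways
    WHERE ShardID = 2 AND IsReadOnly = 0;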

6. Curate feeds with FeedViews

Avoid caching at all costs - https://colossal.gitbook.io/microprocess/building/caching

You might want to curate specific geographic feeds (e.g. Germany, France, ...). While a geography shard server is great, Germans might want a mix of posts from around the world, while the French only want French posts - that's up to you to figure out.

To make this work:

  • Have a crude view View_CountryFeed and use it like select * from View_CountryFeed order by post_id desc limit 10; (see the sketch after this list). This will only work on a single shard to begin with.
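
A sketch of that crude view, assuming a hypothetical CountryID column on Posts derived from each post's geopoint (that derivation is not shown here):

    -- Crude first version: a filtered pass-through over Posts.
    -- Assumes Posts has gained a CountryID column.
    CREATE VIEW View_CountryFeed AS
        SELECT PostID, CountryID, Body, CreatedAt
        FROM Posts;

    -- Client side: latest 10 posts for one country.
    SELECT * FROM View_CountryFeed
    WHERE CountryID = 49
    ORDER BY PostID DESC
    LIMIT 10;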

When that reaches its limits, improve your algorithm and also introduce manual materialisation with the following:

  • Log interest in posts per country
  • Run your algorithms as microprocesses
  • The algorithm parameters will be shared on the Management replicated database
  • The algorithm will run in isolation on each shard, grouping by CountryID
  • The Management database will be configured with which shard stores each country's feed
  • A microprocess will pull country feeds from other shards to aggregate. Each shard will pull the right CountryIDs as configured. "Pulling" is done on the resource-sensitive side so this can be tuned per-shard to minimise resource usage.
  • Update View_CountryFeed to expose PostID, PostBodyJSON and ShardID (see the sketch after this list). The ShardID is important because that's where post interactions will be sourced from and published, while the PostBodyJSON will enable efficient rendering of the post by the client.
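
A sketch of the materialised feed table behind the updated view; PostBodyJSON holds a pre-rendered copy of the post so the client never has to join back to Posts:

    -- Materialised feed, stored on whichever shard the Management
    -- database designates for each country.
    CREATE TABLE CountryFeed (
        CountryID    INT UNSIGNED NOT NULL,
        PostID       BIGINT UNSIGNED NOT NULL,
        ShardID      INT UNSIGNED NOT NULL,  -- where interactions live
        PostBodyJSON JSON NOT NULL,          -- client renders from this
        PRIMARY KEY (CountryID, PostID)
    ) ENGINE=InnoDB;

    -- Repoint the view; client queries stay unchanged.
    CREATE OR REPLACE VIEW View_CountryFeed AS
        SELECT CountryID, PostID, ShardID, PostBodyJSON
        FROM CountryFeed;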

7. Realtime Notifications

The Database Web Gateway will be able to help here too in the future. See https://colossal.gitbook.io/microprocess/database-system/feature-gaps. When the CountryFeed table is updated, a trigger will fire and notify the DWebG, which will rerun subscribed Views and send the new records to the subscribed web clients.
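
How the DWebG gets notified is the gateway's concern, but the database side could be as simple as a trigger writing to a change-log table that the gateway watches - a speculative sketch:

    -- Change log the gateway can watch for new feed records.
    CREATE TABLE CountryFeedChanges (
        ChangeID  BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        CountryID INT UNSIGNED NOT NULL,
        PostID    BIGINT UNSIGNED NOT NULL,
        ChangedAt DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    ) ENGINE=InnoDB;

    DELIMITER //
    CREATE TRIGGER trg_CountryFeed_AfterInsert
    AFTER INSERT ON CountryFeed
    FOR EACH ROW
    BEGIN
        INSERT INTO CountryFeedChanges (CountryID, PostID)
        VALUES (NEW.CountryID, NEW.PostID);
    END//
    DELIMITER ;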
