Design – Persist Notification Before or After Publishing in Redis?

Architecture, database, design, MySQL, redis

I'm implementing a mechanism to notify a group of users about newly inserted blog comments.
The architecture uses the Redis Pub/Sub mechanism.

By definition, the pub/sub mechanism aims to propagate messages to the right subscribers without storing or persisting anything.

Although blog comments are obviously persisted in the DB, I also want every notification to be persisted so that an offline user can retrieve it at any time after logging in again.
Of course, Redis could persist messages, but the cost of that is pretty high.

I use a separate main database; for this example, let's say MySQL.
Currently, the workflow is:

  1. A blog comment posted by a user is handled by my backend and first stored in the DB.
  2. An event is raised, let's call it "CommentedBlogEvent", which triggers a worker that determines all the users targeted by the comment.
  3. Assuming that 10 users are targets of the comment, I insert 10 "notification rows" in my database, each associating the commentId with a targeted userId. Each row is specific to one user because it carries a read/unread flag.
  4. Then, once all notifications have been persisted, I use Redis Pub/Sub to trigger subscribers that push the results to the online clients concerned (through WebSockets, for instance).
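
For illustration, steps 3 and 4 of this workflow could be sketched roughly as follows. This is only a minimal sketch under loud assumptions: `sqlite3` stands in for MySQL, `FakePubSub` is a hypothetical in-process stand-in for Redis Pub/Sub (it delivers to current subscribers and stores nothing), and the `notifications` table layout and `user:<id>` channel names are made up for the example.

```python
import json
import sqlite3
from collections import defaultdict


class FakePubSub:
    """Hypothetical in-process stand-in for Redis Pub/Sub: fire and forget."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Only currently subscribed (online) clients get the message.
        for callback in self._subscribers[channel]:
            callback(message)


def notify_persist_then_publish(db, bus, comment_id, target_user_ids):
    # Step 3: one notification row per targeted user, in one transaction.
    with db:
        db.executemany(
            "INSERT INTO notifications (comment_id, user_id, is_read) "
            "VALUES (?, ?, 0)",
            [(comment_id, uid) for uid in target_user_ids],
        )
    # Step 4: publish only after the rows have been committed.
    for uid in target_user_ids:
        bus.publish(f"user:{uid}", json.dumps({"comment_id": comment_id}))


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE notifications (comment_id INTEGER, user_id INTEGER, is_read INTEGER)"
)
bus = FakePubSub()
received = []
bus.subscribe("user:2", received.append)  # only user 2 is online

notify_persist_then_publish(db, bus, comment_id=42, target_user_ids=[1, 2, 3])
stored = db.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]
```

Here all three rows are stored, but only the online user receives a push; the offline users would read their rows from the database when they log in again.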

The "issue" is that the process could be slow because of step 3.

Would it be acceptable to perform step 4 before step 3, meaning before persisting to the DB, since a potential data loss isn't dramatic in my case (non-financial data, etc.)?
Advantage: The client gets the result more quickly.
Drawback: A user could receive a notification that then failed to be stored in the background, so the notification would be missing when the user refreshes the page.
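
To make the drawback concrete, here is a rough sketch of the publish-first ordering with a persistence failure. Everything in it is an assumption for illustration: `sqlite3` stands in for MySQL, the `publish` callback stands in for Redis Pub/Sub, and the failure is simulated by simply not creating the hypothetical `notifications` table.

```python
import sqlite3


def notify_publish_then_persist(db, publish, comment_id, target_user_ids):
    # Step 4 first: clients are pushed the notification immediately.
    for uid in target_user_ids:
        publish(uid, comment_id)
    # Step 3 afterwards: persistence may still fail.
    try:
        with db:
            db.executemany(
                "INSERT INTO notifications (comment_id, user_id, is_read) "
                "VALUES (?, ?, 0)",
                [(comment_id, uid) for uid in target_user_ids],
            )
        return True
    except sqlite3.Error:
        # The push has already happened and cannot be taken back.
        return False


db = sqlite3.connect(":memory:")
# The notifications table is deliberately missing to simulate a failed insert.
pushed = []
persisted = notify_publish_then_persist(
    db, lambda uid, cid: pushed.append((uid, cid)), 42, [1, 2, 3]
)
```

In this run every user is pushed a notification, yet nothing is stored, so a page refresh would show none of them.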

What is the best way to handle this case, while keeping my main database as the notifications store?

Best Answer

The "issue" is that the process could be slow because of step 3.

You may want to use a pipeline: while step 3 is still inserting notifications, have step 4 start processing the notifications that are already inserted. You may have to redesign how step 3 works, e.g. by splitting the users into batches.
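
One possible shape for such a pipeline, as a sketch only: the main thread inserts the users batch by batch, and a second thread publishes each batch as soon as it has been committed. The names here (`BATCH_SIZE`, `notify_pipelined`, the `publish` callback, the `notifications` table) are hypothetical, and `sqlite3` merely stands in for MySQL.

```python
import queue
import sqlite3
import threading

BATCH_SIZE = 4  # hypothetical value; tune for the real database
_DONE = object()  # sentinel telling the publisher to stop


def batches(items, size):
    for start in range(0, len(items), size):
        yield items[start:start + size]


def publisher(work, publish):
    # Step 4: publish batches as they become available.
    while True:
        batch = work.get()
        if batch is _DONE:
            break
        for comment_id, uid in batch:
            publish(uid, comment_id)


def notify_pipelined(db, publish, comment_id, target_user_ids):
    work = queue.Queue()
    worker = threading.Thread(target=publisher, args=(work, publish))
    worker.start()
    try:
        for batch in batches(target_user_ids, BATCH_SIZE):
            rows = [(comment_id, uid) for uid in batch]
            # Step 3: commit one batch at a time.
            with db:
                db.executemany(
                    "INSERT INTO notifications (comment_id, user_id, is_read) "
                    "VALUES (?, ?, 0)",
                    rows,
                )
            # Hand over only rows that are already persisted.
            work.put(rows)
    finally:
        work.put(_DONE)
        worker.join()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE notifications (comment_id INTEGER, user_id INTEGER, is_read INTEGER)"
)
pushed = []
notify_pipelined(db, lambda uid, cid: pushed.append(uid), 42, list(range(1, 11)))
stored = db.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]
```

The first users are notified after one batch insert instead of after all of them, and nothing is published before its row is committed. If a later batch fails, the earlier batches stay both stored and published.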

Would it be acceptable to perform step 4 before step 3, since a potential data loss isn't dramatic in my case (non-financial data, etc.)?

Whether data loss is dramatic very much depends on how users perceive the loss and how often it happens. If you tell your users "never miss a blog comment" and then fail to deliver notifications reliably, that could still be quite dramatic. If you fail to deliver 1 ppm (one notification in a million), that's probably not too bad; if you fail to deliver 500,000 ppm (every second notification), well, that's just bad design.
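
To put those rates into numbers, a quick back-of-the-envelope calculation (the volume of one million notifications is an arbitrary assumption, and `expected_lost` is just a helper for this example):

```python
def expected_lost(total_notifications, failure_rate_ppm):
    """Expected number of lost notifications at a given parts-per-million rate."""
    return total_notifications * failure_rate_ppm / 1_000_000


total = 1_000_000  # assumed volume, for illustration only
lost_rare = expected_lost(total, 1)          # 1 ppm
lost_half = expected_lost(total, 500_000)    # 500,000 ppm, i.e. 50%
```

At 1 ppm that is about one lost notification per million sent; at 500,000 ppm it is half of them.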
