CQRS: How to restore read model

cqrs event-sourcing read-model

I'm interested in how to restore a read model in a system based on CQRS.

In normal operation, the system processes commands, creates domain events, and posts them to a message bus. Another part of the system (call it the RM subsystem) then processes these messages and updates the read model. This mode is good enough for regular operation.

But how should I repair my read model? For example, the storage holding the read model might get corrupted, or I might move it to a different location. I want my system to restore the read model during initialization, before queries start trying to read data. I also want to know when the repair process has finished.

I can imagine two ways:

  1. Create a REST controller through which my RM subsystem can query all domain events (in message form) and restore the read model synchronously.
  2. Create a special mechanism that my RM subsystem can call to start replaying all messages. To me, this way isn't very good, because I cannot control when the repair process finishes. Also, if there are other consumers of the messages, the replay could corrupt their data.
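
To make the first option concrete, here is a minimal sketch of a pull-based rebuild that blocks until it is done, so the caller knows exactly when the repair has finished. Everything in it is an assumption for illustration: `InMemoryEventStore` stands in for whatever would sit behind the REST controller, and `Event`, `ReadModel`, `read_from`, and the `Deposited` event type are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    sequence: int   # global position in the event log (assumed to start at 1)
    type: str
    payload: dict


class InMemoryEventStore:
    """Hypothetical stand-in for the endpoint that serves stored events."""

    def __init__(self, events: Iterable[Event]):
        self._events = sorted(events, key=lambda e: e.sequence)

    def read_from(self, after: int, limit: int) -> list[Event]:
        # Return at most `limit` events whose sequence is greater than `after`.
        return [e for e in self._events if e.sequence > after][:limit]


class ReadModel:
    def __init__(self):
        self.balances = {}
        self.ready = False  # queries should be refused until this is True

    def apply(self, event: Event) -> None:
        if event.type == "Deposited":
            account = event.payload["account"]
            self.balances[account] = self.balances.get(account, 0) + event.payload["amount"]

    def rebuild(self, store: InMemoryEventStore, page_size: int = 100) -> int:
        """Pull every event page by page; returns the last sequence applied."""
        self.balances.clear()
        self.ready = False
        position = 0
        while True:
            page = store.read_from(position, page_size)
            if not page:
                break  # no more events: the rebuild is complete
            for event in page:
                self.apply(event)
            position = page[-1].sequence
        self.ready = True
        return position
```

Because `rebuild` only returns after the last page has been applied, the initialization code can call it before opening the query side, and `ready` gives queries a flag to check.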

Which way is preferable?

Best Answer

Both methods are totally fine, and as usual, the answer is: "it depends".

The 2nd method you suggested is easy to implement and is used quite often - in particular, I'd consider it to be the standard method to bring new event handlers/projections online into an existing system, since new consumers need to be replayed the full event history at least once.

Regarding other consumers, please note that

  • you can (I'm tempted to say should) make them idempotent consumers, which at the same time helps with the "at least once"-guarantee paradigm of common message bus systems, and
  • you can always opt to only replay events into a single consumer, such that other consumers are not affected by the selective replay.
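
Both points can be sketched in a few lines. This is only an illustration under assumptions: events are `(sequence, amount)` tuples delivered in order with increasing sequence numbers, and `IdempotentProjection` and `replay` are hypothetical names, not part of any particular message bus.

```python
class IdempotentProjection:
    """Ignores any event whose sequence number it has already applied."""

    def __init__(self, name):
        self.name = name
        self.last_applied = 0  # in a real system, persisted together with the read model
        self.total = 0

    def handle(self, sequence, amount):
        if sequence <= self.last_applied:
            return False  # duplicate delivery or replay: skip it
        self.total += amount
        self.last_applied = sequence
        return True

    def reset(self):
        # Throw away the (possibly corrupted) state before a rebuild.
        self.last_applied = 0
        self.total = 0


def replay(events, consumers):
    """Deliver the stored events only to the consumers passed in."""
    for sequence, amount in events:
        for consumer in consumers:
            consumer.handle(sequence, amount)


events = [(1, 10), (2, 5), (3, 7)]
orders = IdempotentProjection("orders")
reports = IdempotentProjection("reports")
replay(events, [orders, reports])  # normal processing

orders.total = -999                # simulate a corrupted read model
orders.reset()
replay(events, [orders])           # selective replay: `reports` never sees it
```

Even if the replay were broadcast to every consumer, `reports` would skip all three events because their sequence numbers are not greater than its `last_applied`.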

Furthermore, have you actually measured the time that it takes to replay all events into a single projection (the corrupted one)? Usually you can easily handle tens of thousands of events and read model updates per second (use one big transaction), so replaying all events to repair a single read model should be a matter of minutes. We actually replay all events on every system startup, since our read model is only stored in memory, and it's blazing fast.
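
As a rough sketch of the "one big transaction" idea, the following uses Python's built-in `sqlite3` module. The `account_balance` table and the `(account, amount)` event shape are assumptions made up for the example; the point is only that the whole replay commits once at the end instead of once per event.

```python
import sqlite3


def rebuild_read_model(conn, events):
    """Replay all events into the read model inside a single transaction."""
    with conn:  # commits once on success, rolls back if anything raises
        conn.execute("DELETE FROM account_balance")
        for account, amount in events:
            # Make sure the row exists, then apply the event to it.
            conn.execute(
                "INSERT OR IGNORE INTO account_balance (account, balance) VALUES (?, 0)",
                (account,),
            )
            conn.execute(
                "UPDATE account_balance SET balance = balance + ? WHERE account = ?",
                (amount, account),
            )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_balance (account TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
rebuild_read_model(conn, [("a", 10), ("b", 7), ("a", 5)])
rows = conn.execute(
    "SELECT account, balance FROM account_balance ORDER BY account"
).fetchall()
```

A side benefit of the single transaction is that a failed rebuild leaves the previous read model untouched, since the initial `DELETE` is rolled back along with everything else.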

If your event store/messaging infrastructure does not support query by event type, the 1st method you suggested is a bit harder to implement, since you need to implement the query interface. This might be extremely hard or it might be trivial, depending on how your event store is designed. So if you don't want to use the second method you suggested, implement the 1st method, use selective queries to repair a read model and call it a day.
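
For illustration, a query-by-event-type interface could look roughly like the sketch below. It is an in-memory toy, not a real event store: `EventStore`, `append`, `query`, and the event type names are all hypothetical, and a real store would keep the index in its database rather than in a dictionary.

```python
from collections import defaultdict


class EventStore:
    """Append-only log with a secondary index from event type to log position."""

    def __init__(self):
        self._log = []                     # (sequence, event_type, payload) tuples
        self._by_type = defaultdict(list)  # event_type -> positions in _log

    def append(self, event_type, payload):
        sequence = len(self._log) + 1
        self._by_type[event_type].append(len(self._log))
        self._log.append((sequence, event_type, payload))
        return sequence

    def query(self, event_types):
        """Return only events of the given types, in original log order."""
        positions = sorted(
            pos for t in event_types for pos in self._by_type.get(t, [])
        )
        return [self._log[pos] for pos in positions]


store = EventStore()
store.append("OrderPlaced", {"order": 1})
store.append("UserRegistered", {"user": "alice"})
store.append("OrderShipped", {"order": 1})

# Only the events an order read model cares about are returned.
order_events = store.query({"OrderPlaced", "OrderShipped"})
```

A projection being repaired would then iterate over `order_events` and ignore the rest of the history.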