CQRS: How to restore read model

cqrs, event-sourcing, read-model

I'm interested in how to restore a read model in a system based on CQRS.

In normal operation, the system processes commands, creates domain events, and posts them to a message bus. Another part of the system (call it the RM subsystem) then processes these messages and saves the results to the read model. This mode is good enough for regular use.

But how should I repair my read model? For example, the storage holding the read model was corrupted, or I changed the storage location. I want my system to restore the read model during initialization, before queries start trying to read data. I also want to know when the repair process has finished.

I can imagine two ways:

1. Create a REST controller through which my RM subsystem can query all domain events (in message form) and restore the read model synchronously.
2. Create a special mechanism that my RM subsystem can call to start replaying all messages. To me this way isn't very good, because I cannot control when the repair process finishes. Also, if there are other consumers of the messages, the replay could corrupt their data.

Which way is preferable?

Best Answer

Both methods are totally fine, and as usual, the answer is: "it depends".

The 2nd method you suggested is easy to implement and is used quite often - in particular, I'd consider it to be the standard method to bring new event handlers/projections online into an existing system, since new consumers need to be replayed the full event history at least once.

Regarding other consumers, please note that

you can (I'm tempted to say should) make them idempotent consumers, which at the same time helps with the "at least once"-guarantee paradigm of common message bus systems, and
you can always opt to only replay events into a single consumer, such that other consumers are not affected by the selective replay.
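
The two points above can be sketched together. This is only an illustration, not a reference to any particular message bus: the Event shape, the OrderCountProjection read model and the replay helper are all hypothetical names, and it assumes every event carries a unique id that the consumer can remember.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: int   # assumed to be unique per event
    kind: str
    payload: dict


@dataclass
class OrderCountProjection:
    """Hypothetical read model: number of orders per customer."""
    counts: dict = field(default_factory=dict)
    seen: set = field(default_factory=set)  # ids of events already applied

    def handle(self, event):
        if event.event_id in self.seen:
            return  # duplicate delivery or repeated replay: ignore it
        self.seen.add(event.event_id)
        if event.kind == "OrderPlaced":
            customer = event.payload["customer"]
            self.counts[customer] = self.counts.get(customer, 0) + 1


def replay(events, consumer):
    """Feed the history to one chosen consumer only; other consumers never see it."""
    sent = 0
    for event in events:
        consumer.handle(event)
        sent += 1
    return sent


history = [
    Event(1, "OrderPlaced", {"customer": "alice"}),
    Event(2, "OrderPlaced", {"customer": "bob"}),
    Event(3, "OrderPlaced", {"customer": "alice"}),
]
projection = OrderCountProjection()
replay(history, projection)
replay(history, projection)  # replaying a second time changes nothing
```

Because the consumer remembers which event ids it has applied, an accidental second replay (or a redelivered message) leaves the read model unchanged.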

Furthermore, have you actually measured the time that it takes to replay all events into a single projection (the corrupted one)? Usually you can easily handle tens of thousands of events and read model updates per second (use one big transaction), so replaying all events to repair a single read model should be a matter of minutes. We actually replay all events on every system startup, since our read model is only stored in memory, and it's blazing fast.
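
As a rough sketch of the "one big transaction" idea, the snippet below rebuilds a read model in SQLite. The balance table, the (account, delta) event shape and the rebuild_read_model function are assumptions made up for this example. The function returns only after the commit, which is one simple way to know that the repair has finished.

```python
import sqlite3


def rebuild_read_model(conn, events):
    """Wipe and refill the read model in one transaction; return the event count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS balance "
        "(account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    applied = 0
    with conn:  # commits once at the end, rolls back if anything raises
        conn.execute("DELETE FROM balance")  # discard the corrupted rows
        for account, delta in events:
            conn.execute("INSERT OR IGNORE INTO balance VALUES (?, 0)", (account,))
            conn.execute(
                "UPDATE balance SET amount = amount + ? WHERE account = ?",
                (delta, account),
            )
            applied += 1
    return applied  # returning means the rebuild is committed


conn = sqlite3.connect(":memory:")
history = [("alice", 10), ("bob", 5), ("alice", -3)]
count = rebuild_read_model(conn, history)
rows = conn.execute("SELECT account, amount FROM balance ORDER BY account").fetchall()
```

Queries can be held back until rebuild_read_model returns. If the replay fails halfway, the rollback leaves the previous contents in place instead of a half-built model.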

If your event store/messaging infrastructure does not support query by event type, the 1st method you suggested is a bit harder to implement, since you need to implement the query interface. This might be extremely hard or it might be trivial, depending on how your event store is designed. So if you don't want to use the second method you suggested, implement the 1st method, use selective queries to repair a read model and call it a day.

Related Solutions

Event Sourcing – Replaying and Versioning Explained

First, it is important to understand and be able to leverage the difference between Commands and Events.

As this question succinctly points out, Commands are things we would like to happen, and Events are things that have already happened. A command does not necessarily result in a significant event in the system, but it usually does. For example, a send message command may be rejected, in which case no event happens (typically an error would not be considered an event in this sense, though we may still choose to log it in a diagnostic log). Now, if the send message command is accepted, the message sent event occurs, and event details could describe the sender, the receiver, and the content.

When we talk about the system state, we are actually discussing not a culmination of commands, but of events. Only events reflect changes of state in the system. To draw from a life example, suppose I go to the local Publix supermarket and buy a Florida lottery ticket. The command was "Buy Ticket" and the event was "Ticket issued." My next command then is to the lottery to draw my numbers for the PowerBall. The lottery is going to ignore my command (but I have no knowledge), and the event "PowerBall numbers chosen" takes place irrespective of my wishes. If my numbers match, the event "Jackpot won" happens to me (and I think my command was heard). If not, I realize my command was ignored.

From a historical perspective, the lottery is only interested in a subset of events. The lottery only cares that (a) a ticket was issued, (b) the numbers were chosen, and (c) the jackpot was won. Those are the items of interest. The act of purchasing the ticket, wanting to win, etc. are all irrelevant, as is what I do with my ticket after I lose. While the real world does change for mundane events, we only need to record those events which are significant to our system.

In theory, under an event-sourcing technique, a stream of events may be replayed from the beginning of time to arrive at the current state. This relies upon the assumption that the underlying system conditions are constant and deterministic. However, these assumptions are not valid in many systems. The data associated with an event, as well as the types of events we are interested in, may change as our computer software evolves. In addition, it can be computationally expensive to re-compute the current state in response to every query. For this reason, snapshots of the system state are often taken to represent known points in time, to which the most recent events can then be applied.
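
A minimal sketch of the snapshot idea, under simplifying assumptions: the state is a plain running total, events are integers, and the Snapshot class with its version field (how many events are already folded in) is invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    version: int  # number of events already folded into `state`
    state: int    # a running total, purely for illustration


def apply(state, event):
    """Fold one event into the state."""
    return state + event


def current_state(snapshot, events):
    """Start from the snapshot and apply only the events recorded after it."""
    state = snapshot.state
    for event in events[snapshot.version:]:
        state = apply(state, event)
    return state


events = [5, -2, 7, 1]
from_scratch = current_state(Snapshot(version=0, state=0), events)
from_snapshot = current_state(Snapshot(version=2, state=3), events)
```

Both calls arrive at the same state, but the second one touches only the two events that came after the snapshot.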

While it is still possible to replay an event stream across multiple versions, the amount of human effort involved in doing so is likely to be cost-prohibitive. Unless there is a justifiable reason to design that capability into the system, you are better off building your system to utilize snapshots.

Example in Question

In the example given in the question, the architecture is not truly event-based; it is command-based. Replaying commands creates the system state. This is an anti-pattern and should be fixed. Instead, the primary events are:

Buyer asks question
Seller responds to question

Each of these events can be "replayed" to give the current state. For example, in the act of asking a question, the system behavior might be to email the seller and increment the unanswered question counter. This behavior can be changed; however, the fact that the question was asked does not. Similarly, the system might decrement the unanswered question counter when the seller responds. This behavior is changeable, but the fact that the seller responded is not.

Most event-sourcing systems would dynamically compute the count of unanswered questions by replaying the specific event stream in response to a query.
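
Computing the counter by replay could look roughly like this. The event names QuestionAsked and QuestionAnswered and the (kind, question_id) tuple shape are assumptions for the sketch, not taken from the question.

```python
def unanswered_count(events):
    """Replay the stream and count questions that have no response yet."""
    open_questions = set()
    for kind, question_id in events:
        if kind == "QuestionAsked":
            open_questions.add(question_id)
        elif kind == "QuestionAnswered":
            open_questions.discard(question_id)
    return len(open_questions)


stream = [
    ("QuestionAsked", 1),
    ("QuestionAsked", 2),
    ("QuestionAnswered", 1),
]
pending = unanswered_count(stream)
```

The events stay the same if the counting rule changes later; only this function would need to be rewritten.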

Event Sourcing – How to Deal with Side Effects in Event Sourcing

How do I deal with side effects in Event Sourcing?

Short version: the domain model doesn't perform side effects. It tracks them. Side effects are performed using a port that connects to the boundary; when the email is sent, you send the acknowledgement back to the domain model.

This means that the email is sent outside of the transaction that updates the event stream.

Precisely where, outside, is a matter of taste.

So conceptually, you have a stream of events like

EmailPrepared(id:123)
EmailPrepared(id:456)
EmailPrepared(id:789)
EmailDelivered(id:456)
EmailDelivered(id:789)

And from this stream you can create a fold

{
    deliveredMail : [ 456, 789 ],
    undeliveredMail : [123]
}
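
As an illustration, the fold over the stream above might be written like this, assuming each event is represented as a (kind, id) tuple. The function name fold is only a label for this sketch.

```python
def fold(events):
    """Reduce the event stream to the sets of delivered and undelivered mail."""
    prepared, delivered = [], set()
    for kind, mail_id in events:
        if kind == "EmailPrepared":
            prepared.append(mail_id)
        elif kind == "EmailDelivered":
            delivered.add(mail_id)
    return {
        "deliveredMail": [m for m in prepared if m in delivered],
        "undeliveredMail": [m for m in prepared if m not in delivered],
    }


stream = [
    ("EmailPrepared", 123),
    ("EmailPrepared", 456),
    ("EmailPrepared", 789),
    ("EmailDelivered", 456),
    ("EmailDelivered", 789),
]
state = fold(stream)
```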

The fold tells you which emails haven't been acknowledged, so you send them again:

undeliveredMail.each ( mail -> {
    // perform the side effect first
    send(mail);
    // then record the acknowledgement in the event stream
    dispatch( EmailDelivered.from(mail) );
} );

Effectively, this is a two-phase commit: you are modifying SMTP in the real world, and then you are updating the model.

The pattern above gives you an at-least-once delivery model. If you want at-most-once, you can turn it around

undeliveredMail.each ( mail -> {
    // record the acknowledgement first
    commit( EmailDelivered.from(mail) );
    // then perform the side effect
    send(mail);
} );

There's a transaction barrier between making EmailPrepared durable and actually sending the email. There's also a transaction barrier between sending the email and making EmailDelivered durable.
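
The difference between the two orderings shows up when the process fails between the two steps. The following toy simulation is a sketch under stated assumptions: Crash, run and deliver are hypothetical helpers, the outbox list stands in for the real SMTP server, and the log set stands in for the recorded EmailDelivered events.

```python
class Crash(Exception):
    """Simulated process failure between the two steps."""


def run(order, crash_between, outbox, log, mail_id):
    # "send-first" sends and then records; "record-first" does the opposite.
    steps = {
        "send-first": [lambda: outbox.append(mail_id), lambda: log.add(mail_id)],
        "record-first": [lambda: log.add(mail_id), lambda: outbox.append(mail_id)],
    }[order]
    steps[0]()
    if crash_between:
        raise Crash()
    steps[1]()


def deliver(order, crash_first_attempt):
    outbox, log = [], set()  # emails actually sent; deliveries recorded
    for attempt in range(2):  # the second pass models the retry after a restart
        if 123 in log:
            break  # the fold reports the mail as delivered, so nothing is retried
        try:
            run(order, crash_first_attempt and attempt == 0, outbox, log, 123)
        except Crash:
            pass
    return outbox


duplicated = deliver("send-first", crash_first_attempt=True)
lost = deliver("record-first", crash_first_attempt=True)
```

With a crash in the middle, sending first ends up sending the mail twice (at-least-once), while recording first ends up never sending it (at-most-once). Without a crash, both orderings send it exactly once.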

Udi Dahan's Reliable Messaging Without Distributed Transactions may be a good starting point.
