These events will be fetched by AR2 and then processed.
Ohh, that sounds like a bad idea.
One more note: in AR2 this layer would be read-only; it is needed to do some business logic inside AR2.
So looking at your picture, AR2 is writing out events (d,e,f), which says that AR2 is not read-only -- which is good; read-only aggregate roots don't make sense.
The usual pattern for what you seem to be trying to do here is to use a process manager to coordinate the activities of the two aggregate roots. The role of the process manager is to listen for events, and respond by dispatching commands.
In that picture, you would have something like the following:
Command(A) arrives at AR(1)
AR(1) loads its history [x,y,z]
AR(1) executes Command(A), producing events [a,b,c]
AR(1) writes its new history [x,y,z,a,b,c]
Events(a,b,c) are published
ProcessManager receives the events (a,b,c)
ProcessManager dispatches Command(B) to AR(2)
Command(B) arrives at AR(2)
AR(2) loads its own history [d,e,f]
AR(2) executes Command(B), producing events [g,h]
AR(2) writes its new history [d,e,f,g,h]
Events(g,h) are published
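In code, a process manager along these lines can be quite small. Here is a minimal sketch; CommandBus, EventA, and CommandB are invented names for illustration, not part of any particular framework:

    interface CommandBus { void dispatch(Object command); }

    record EventA(String aggregate2Id, String someValue) {}
    record CommandB(String aggregate2Id, String someValue) {}

    // The process manager holds no business rules of its own; it only
    // listens for events and responds by dispatching commands.
    class OrderProcessManager {
        private final CommandBus commandBus;

        OrderProcessManager(CommandBus commandBus) { this.commandBus = commandBus; }

        // Invoked by the messaging infrastructure when AR(1) publishes an event
        void when(EventA event) {
            // Copy the state we need out of the event into a command for AR(2)
            commandBus.dispatch(new CommandB(event.aggregate2Id(), event.someValue()));
        }
    }

Note that the process manager never touches AR(2) directly; it only sends a command, and AR(2) enforces its own invariants when handling it.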
Trying to have two different aggregate roots share a common event history is really weird; it strongly suggests that your model needs rethinking (why are there two different authorities for the same fact? what happens when AR1 writes an event that violates the invariant enforced by AR2?).
But taking some of the state from one event and making that an argument in a command sent to another aggregate -- that pattern is pretty common. The process manager itself is just a substitute for a human being reading the events and deciding what commands to fire.
I always need the current state of the AR1 layer; even if AR2 is created from scratch, it needs to have the content of the layer from AR1. The layer would be read-only.
There's no such thing as getting the "current" state of another aggregate; AR1 could be changing while AR2 is doing its work, and there's no way to know that. If that's not acceptable, then your aggregate boundaries are in the wrong place.
If stale data is acceptable, you can have the AR2 command handler query the state of AR1, and use that information in processing the command. If you are going to do that, I normally prefer to wrap the query in a Domain Service, which gives you an extra layer of indirection to work with (the domain model doesn't need to know how the service is implemented). In this design, AR2 doesn't see the AR1 events at all; AR2 passes some state to the domain service, and the domain service looks at the events to figure out the answer, and passes that answer back as a value that AR2 will understand.
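As a rough sketch of that shape (PricingService and the other names are invented for illustration):

    // AR2's command handler depends only on this interface; how it is
    // implemented (a projection over AR1's events, a cache, ...) stays
    // hidden behind the Domain Service.
    interface PricingService {
        long priceFor(String productId); // answers with a plain value AR2 understands
    }

    record PlaceOrder(String productId) {}

    class Aggregate2 {
        void handle(PlaceOrder command, PricingService pricing) {
            // AR2 never sees AR1's events; it passes some state to the
            // service and gets a (possibly stale) value back.
            long price = pricing.priceFor(command.productId());
            // ... apply AR2's own business rules using that value ...
        }
    }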
Whittaker's solution isn't bad; once you recognize that the data is stale anyway, you have the option of deciding whether the state available at the time of creating the command is good enough. I'm of mixed minds on this -- putting everything into the command is nice, and really easy to understand. On the other hand, there is a larger window for a change to happen, and to some degree discovering the right data to use requires accessing state internal to the aggregate that can change while the command is in flight.
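For illustration, a fat command in that style simply carries copies of the data it needs, so the handler never has to look anywhere else (names invented):

    // Built by the client from whatever state was visible at the time;
    // the values may be stale by the time AR(2) handles the command.
    record ShipOrder(
        String orderId,
        String shippingAddress, // copied from AR(1)'s state when the command was created
        long quotedPrice        // likewise copied, not looked up at handling time
    ) {}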
I much prefer designs where the aggregates aren't coupled, though.
But it seems that this is again sharing of data between the ARs, since the fat command will use data from the AR1 layer to supply to AR2.
You might look into what Udi Dahan has to say about services as technical authorities. In that case, the data that gets shared is mostly limited to opaque identifiers.
How do I deal with side effects in Event Sourcing?
Short version: the domain model doesn't perform side effects. It tracks them. Side effects are performed using a port that connects to the boundary; when the email is sent, you send the acknowledgement back to the domain model.
This means that the email is sent outside of the transaction that updates the event stream.
Precisely where, outside, is a matter of taste.
So conceptually, you have a stream of events like
EmailPrepared(id:123)
EmailPrepared(id:456)
EmailPrepared(id:789)
EmailDelivered(id:456)
EmailDelivered(id:789)
And from this stream you can create a fold:

    {
        deliveredMail: [456, 789],
        undeliveredMail: [123]
    }
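Computing that fold is a plain reduction over the stream. A minimal Java sketch, with the event types invented for illustration:

    import java.util.ArrayList;
    import java.util.List;

    record EmailPrepared(int id) {}
    record EmailDelivered(int id) {}

    // Fold the stream into the list of ids that were prepared but never
    // acknowledged as delivered.
    static List<Integer> undeliveredMail(List<Object> events) {
        List<Integer> pending = new ArrayList<>();
        for (Object event : events) {
            if (event instanceof EmailPrepared prepared) {
                pending.add(prepared.id());                      // prepared, awaiting ack
            } else if (event instanceof EmailDelivered delivered) {
                pending.remove(Integer.valueOf(delivered.id())); // acknowledged
            }
        }
        return pending; // [123] for the stream above
    }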
The fold tells you which emails haven't been acknowledged, so you send them again:
undeliveredMail.forEach(mail -> {
    send(mail);                          // side effect: hand the mail to SMTP via the port
    dispatch(EmailDelivered.from(mail)); // then record the acknowledgement
});
Effectively, this is a two-phase commit: you are modifying the real world (SMTP), and then you are updating the model.
The pattern above gives you an at-least-once delivery model. If you want at-most-once, you can turn it around:
undeliveredMail.forEach(mail -> {
    commit(EmailDelivered.from(mail)); // record the delivery first...
    send(mail);                        // ...then attempt the side effect
});
There's a transaction barrier between making EmailPrepared durable and actually sending the email. There's also a transaction barrier between sending the email and making EmailDelivered durable.
Udi Dahan's Reliable Messaging with Distributed Transactions may be a good starting point.
Best Answer
I think you're mixing up event sourcing and command sourcing. There is a crucial difference between them. With command sourcing, you register external commands, like user inputs. With event sourcing, you register the effects of those commands, i.e. what you've called secondary events.
In practice, you should choose either command sourcing or event sourcing and stick to it, not combine the two approaches. As for which one to choose, the industry consensus seems to be skewing towards event sourcing; for example, Akka's Akka Persistence module is based on event sourcing.
The main idea behind choosing event sourcing over command sourcing is that the former allows for idempotency: however many times you replay the effects of your operation, your system ends up in the same state. With command sourcing, you are relying on the assumption that the same command executed at a different time would produce the same effect, which is, needless to say, incautious.
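To make the difference concrete, here is a toy sketch (the names, and the collections events and commands and the helper currentExchangeRate, are all invented for illustration): replaying events re-applies recorded facts, while replaying commands re-executes decisions whose inputs may have changed.

    record MoneyDeposited(long amount) {}                 // an effect: what actually happened
    record DepositMoney(long amount, String currency) {}  // an input: what was asked for

    // Event sourcing: replay re-applies recorded facts, so any number of
    // replays produces the same state.
    long balance = 0;
    for (MoneyDeposited event : events) {
        balance += event.amount();
    }

    // Command sourcing: replay re-executes the decision; if, say, the
    // exchange rate is looked up at replay time rather than when the
    // command first ran, two replays can disagree.
    long replayed = 0;
    for (DepositMoney command : commands) {
        replayed += command.amount() * currentExchangeRate(command.currency());
    }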