How Add/Create* commands should be handled in CQRS + Event Sourcing architecture

cqrs, domain-driven-design, event-sourcing, repository

I want to implement my first application using the CQRS pattern along with Event Sourcing, and I am wondering how the creation of aggregate roots should be handled. Say someone sends a CreateItem command. How should it be handled? Where should the ItemCreated event be stored: as the first event of a new Item? Or should I have some kind of ItemList entity that aggregates all items, whose event list consists only of ItemCreated events?

Udi Dahan suggests never creating aggregate roots directly and always using some kind of fetch method instead. But how can I fetch something that is new and certainly does not have an ID assigned yet? I understand the idea, and it is reasonable to think of a new object as one whose state is composed of zero events replayed on it. But how should I use it? Should I have a distinct method in my Repository such as getNewItem(), or make my get(id) method accept an Optional<ItemId> instead?

Edit: After some digging I found a really interesting implementation of the aforementioned patterns using actors. Instead of creating the aggregate, the author retrieves it from some kind of repository using a newly created UUID. The drawback of this approach is that it allows a temporarily inconsistent state. I am also wondering how I can implement a delete method with this approach. Do I simply add a Deleted event to the event list of the aggregate?

Best Answer

The idea in Udi's post, as I gather, is that no item ever appears out of thin air. There is (almost) always something, or more specifically some domain operation, that caused the item to be created. Take Udi's example of a user being born out of a visitor registering on the site. At that point, in that bounded context, Visitor is the aggregate root, and it is retrieved by its IP address. This Visitor then creates the new "item", a user in this case, through a domain operation called Register. The same goes for the step before, which is another bounded context: Referrer is the AR, which is retrieved by the URL and has a domain operation called BroughtVisitorWithIp, where the visitor is born.
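To make that concrete, here is a minimal sketch of a user being created through a domain operation on Visitor rather than through a bare "create". Everything here is an assumption made for illustration: the UserRegistered event, the in-memory EventStore, the one-registration-per-email rule, and the choice to generate the new id inside register(). It also shows that loading a stream nobody has written to yet just yields an empty history, which is one way to read "a new object is an object with zero events".

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UserRegistered:
    """First event in the new user's stream (hypothetical name)."""
    user_id: str
    visitor_ip: str
    email: str


@dataclass
class Visitor:
    """Aggregate root in the visitor context, looked up by IP address."""
    ip: str
    registered_emails: set = field(default_factory=set)

    def register(self, email: str) -> UserRegistered:
        # Assumed domain rule, for illustration only: one registration per email.
        if email in self.registered_emails:
            raise ValueError(f"{email} is already registered")
        self.registered_emails.add(email)
        # The new user's identity is generated inside the domain operation.
        return UserRegistered(user_id=str(uuid.uuid4()), visitor_ip=self.ip, email=email)


class EventStore:
    """In-memory stand-in for an event store: one list of events per stream id."""

    def __init__(self):
        self.streams = {}

    def append(self, stream_id, event):
        self.streams.setdefault(stream_id, []).append(event)

    def load(self, stream_id):
        # A stream that has never been written to is simply an empty history.
        return list(self.streams.get(stream_id, []))


store = EventStore()
visitor = Visitor(ip="203.0.113.7")
event = visitor.register("alice@example.com")
store.append(event.user_id, event)  # UserRegistered becomes event #1 of the user stream
```

With this shape there is no need for a separate getNewItem(): the repository's ordinary load works for both existing and brand-new ids.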

Udi writes very nicely on deletion as well: http://www.udidahan.com/2009/09/01/dont-delete-just-dont/. The main idea is that you don't delete anything, ever. There is always a domain operation behind the deletion, and that is what we want to capture: an order being cancelled, for example, rather than deleted. Read it, it's a very good post.
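In event-sourcing terms, "cancelled rather than deleted" could look something like the sketch below. The event names, the status values, and the rule that only a placed order can be cancelled are all assumptions for illustration; the point is only that the history grows by one event and nothing is removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str
    reason: str


class Order:
    """Event-sourced aggregate: its state is whatever the replayed events say."""

    def __init__(self, history):
        self.order_id = None
        self.status = "unknown"
        self.changes = []  # new events that have not been persisted yet
        for event in history:
            self._apply(event)

    def _apply(self, event):
        if isinstance(event, OrderPlaced):
            self.order_id = event.order_id
            self.status = "placed"
        elif isinstance(event, OrderCancelled):
            self.status = "cancelled"

    def cancel(self, reason):
        # Assumed rule: only an order that is currently placed can be cancelled.
        if self.status != "placed":
            raise ValueError(f"cannot cancel an order that is {self.status}")
        event = OrderCancelled(self.order_id, reason)
        self._apply(event)
        self.changes.append(event)


history = [OrderPlaced("order-1")]
order = Order(history)
order.cancel("customer changed their mind")
history.extend(order.changes)  # the stream gets longer; nothing is deleted
```

Replaying the extended history gives a cancelled order, and the reason for the cancellation stays on record.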

The main point on both accounts, when doing DDD and especially Event Sourcing, is that you should never do straight CRUD operations. If you find yourself in a situation where you really need to just insert, update or delete some data, and there truly is no domain operation behind it, then maybe DDD and Event Sourcing are not a good fit for that bounded context. You are free to combine the two styles as you wish, as long as any single bounded context adheres to one of them. This way a CRUD-style bounded context might create a row in the database that becomes an entity and an aggregate root in another bounded context, where you can now retrieve the AR and do not have to create it.

Related Solutions

Sharing of event source stream between aggregates

These events will be fetched by AR2 and then processed.

Ohh, that sounds like a bad idea.

One more notice - in AR2 this layer would be read-only = needed to do some business logic inside AR2.

So looking at your picture, AR2 is writing out events (d,e,f), which says that AR2 is not read only -- which is good; read only aggregate roots don't make sense.

The usual pattern for what you seem to be trying to do here is to use a process manager to coordinate the activities of the two aggregate roots. The role of the process manager is to listen for events, and respond by dispatching commands.

In that picture, you would have something like the following:

Command(A) arrives at AR(1)
    AR(1) loads its history [x,y,z]
    AR(1) executes Command(A), producing events [a,b,c]
    AR(1) writes its new history [x,y,z,a,b,c]

Events(a,b,c) are published

ProcessManager receives the events (a,b,c)
    ProcessManager dispatches Command(B) to AR(2)

Command(B) arrives at AR(2)
    AR(2) loads its own history [d,e,f]
    AR(2) executes Command(B), producing events [g,h]
    AR(2) writes its new history [d,e,f,g,h]

Events(g,h) are published

Trying to have two different aggregate roots share a common event history is really weird; it strongly suggests that your model needs rethinking (why are there two different authorities for the same fact? what happens when AR1 writes an event that violates the invariant enforced by AR2?).

But taking some of the state from one event, and making that an argument in a command sent to another aggregate; that pattern is pretty common. The process manager itself is just a substitute for a human being reading the events and deciding what commands to fire.
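The flow above can be sketched in a few lines. This is a toy, not a framework: the Event, Command, Aggregate and ProcessManager classes are made up for illustration, the business logic is injected as a function to keep it short, and the rule "when event c arrives, send Command(B) with its value" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    payload: dict


@dataclass(frozen=True)
class Command:
    name: str
    target: str
    args: dict


class Aggregate:
    """Tiny aggregate: owns its history, handles a command, appends new events."""

    def __init__(self, name, history):
        self.name = name
        self.history = list(history)

    def handle(self, command, produce):
        new_events = produce(command)  # business logic injected for brevity
        self.history.extend(new_events)
        return new_events


class ProcessManager:
    """Listens for events and responds by dispatching commands."""

    def __init__(self, dispatch):
        self.dispatch = dispatch

    def on_event(self, event):
        # Assumed rule: once AR1 emits "c", ask AR2 to run Command(B),
        # passing along a value taken from the event.
        if event.name == "c":
            self.dispatch(Command("B", target="AR2", args={"value": event.payload["value"]}))


ar1 = Aggregate("AR1", [Event(n, {}) for n in "xyz"])
ar2 = Aggregate("AR2", [Event(n, {}) for n in "def"])


def dispatch(command):
    # AR2 never sees AR1's events, only the command and its arguments.
    ar2.handle(command, lambda c: [Event("g", dict(c.args)), Event("h", {})])


pm = ProcessManager(dispatch)
published = ar1.handle(
    Command("A", target="AR1", args={}),
    lambda c: [Event("a", {}), Event("b", {}), Event("c", {"value": 42})],
)
for event in published:
    pm.on_event(event)
```

Each aggregate only ever writes to its own history; the only thing that crosses the boundary is the value copied from the event into the command.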

I always need the current state of the AR1 layer; even if AR2 is created from scratch, it needs to have the content of the layer from AR1. The layer would be read only.

There's no such thing as getting the "current" state of another aggregate; AR1 could be changing while AR2 is doing its work, and there's no way to know that. If that's not acceptable, then your aggregate boundaries are in the wrong place.

If stale data is acceptable, you can have the AR2 command handler query the state of AR1, and use that information in processing the command. If you are going to do that, I normally prefer to wrap the query in a Domain Service, which gives you an extra layer of indirection to work with (the domain model doesn't need to know how the service is implemented). In this design, AR2 doesn't see the AR1 events at all; AR2 passes some state to the domain service, and the domain service looks at the events to figure out the answer, and passes that answer back as a value that AR2 will understand.
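A rough sketch of that indirection, assuming stale reads are acceptable. The LayerLookup interface, the event names, and the WorkDone event are all hypothetical; the asker's actual "layer" is not described in enough detail to model, so a plain list of items stands in for it.

```python
from typing import Protocol


class LayerLookup(Protocol):
    """Domain service interface (hypothetical): all the domain model gets to see."""

    def layer_for(self, ar1_id: str) -> tuple: ...


class EventBackedLayerLookup:
    """One possible implementation: folds AR1's events into a plain value."""

    def __init__(self, events_by_stream):
        self.events_by_stream = events_by_stream

    def layer_for(self, ar1_id):
        layer = []
        for kind, item in self.events_by_stream.get(ar1_id, []):
            if kind == "LayerItemAdded":
                layer.append(item)
            elif kind == "LayerItemRemoved" and item in layer:
                layer.remove(item)
        # An immutable snapshot; it may already be stale when AR2 uses it.
        return tuple(layer)


class AR2:
    def __init__(self):
        self.events = []

    def do_work(self, ar1_id, lookup: LayerLookup):
        # AR2 asks the service a question and gets back a value it understands.
        layer = lookup.layer_for(ar1_id)
        self.events.append(("WorkDone", {"ar1_id": ar1_id, "layer_size": len(layer)}))


lookup = EventBackedLayerLookup({
    "ar1-1": [
        ("LayerItemAdded", "roads"),
        ("LayerItemAdded", "rivers"),
        ("LayerItemRemoved", "rivers"),
    ],
})
ar2 = AR2()
ar2.do_work("ar1-1", lookup)
```

Because AR2 depends only on the interface, the lookup could later be backed by a read model or a remote call without touching the domain model.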

Whittaker's solution isn't bad; once you recognize that the data is stale anyway, you have the option of deciding whether the state available at the time of creating the command is good enough. I'm of mixed minds on this -- putting everything into the command is nice, and really easy to understand. On the other hand, there is a larger window for a change to happen, and to some degree discovering the right data to use requires accessing state internal to the aggregate that can change while the command is in flight.

I much prefer designs where the aggregates aren't coupled, though.

But it seems that this is again sharing of data between the AR-s, since fat command will use data from layer from AR1 to supply to AR2

You might look into what Udi Dahan has to say about services as technical authorities. In that case, the data that gets shared is mostly limited to opaque identifiers.

How to Implement a Process Manager in Event Sourcing

Review what Rinat Abdullin wrote about evolving business process. In particular, notice his recommendation for developing a business process in a fast changing environment -- a process manager is "just" an automated replacement for a human being staring at a screen.

My own mental model of a process manager is that it is an event sourced projection that you can query for a list of pending commands.

Do I need to persist the process manager? It seems like I do, but I'm not sure

It's a read model. You can rebuild the process manager from the history of events each time you need it; or you can treat it like a snapshot that you update.
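Here is one way that mental model might look, using the ItemAdded and ProductReserved events from the question. The ReserveProduct command and the dictionary field names are assumptions made for the sketch.

```python
def pending_commands(history):
    """Fold the observed events into the list of commands still to be sent."""
    pending = {}
    for event in history:
        if event["type"] == "ItemAdded":
            # Every added item needs a reservation until we hear otherwise.
            pending[event["product_id"]] = {
                "type": "ReserveProduct",
                "product_id": event["product_id"],
                "basket_id": event["basket_id"],
            }
        elif event["type"] == "ProductReserved":
            # The reservation happened, so that command is no longer pending.
            pending.pop(event["product_id"], None)
    return list(pending.values())


history = [
    {"type": "ItemAdded", "basket_id": "basket-1", "product_id": "p-1"},
    {"type": "ItemAdded", "basket_id": "basket-1", "product_id": "p-2"},
    {"type": "ProductReserved", "basket_id": "basket-1", "product_id": "p-1"},
]
todo = pending_commands(history)
```

Since the function is a pure fold over the history, it can be rerun from scratch at any time, or its result can be cached and updated like a snapshot.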

If I do, I need to save the events for the process manager.

No - the process manager is a manager. It doesn't do anything useful on its own; instead it tells aggregates to do work (ie, make changes to the book of record).

How do I know what basket the ProductReserved events are for? Is it OK to have a BasketId on those too, or is that leaking info

Sure. Note: in most "real" shopping domains, you wouldn't insist on reserving inventory before processing an order; it adds unnecessary contention to the business. It's more likely that your business would want to accept the order, then apologize in the rare case that the order can't be fulfilled in the required time.

How do I keep a relationship between events, how do I know which ItemAdded produced which ProductReserved event?

Messages have meta data - in particular, you can include a causationIdentifier (so you can identify which commands produced which events) and a correlationIdentifier, to generally track the conversation.

For instance, the process manager writes its own id as the correlationId in the command; the events produced by the aggregate copy the correlation id of the command, and your process manager subscribes to all events that have its own correlationId.
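A small sketch of that metadata, assuming a generic Message envelope. The field names and the caused_by helper are made up for illustration and are not taken from any particular library.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


def new_id() -> str:
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Message:
    type: str
    correlation_id: str                 # which conversation this belongs to
    causation_id: Optional[str] = None  # which message directly caused this one
    message_id: str = field(default_factory=new_id)


def caused_by(parent: Message, type: str) -> Message:
    """Build a follow-up message that copies the correlation id of its parent."""
    return Message(type=type, correlation_id=parent.correlation_id, causation_id=parent.message_id)


process_id = "basket-process-1"  # the process manager's own id
command = Message("ReserveProduct", correlation_id=process_id)
event = caused_by(command, "ProductReserved")
unrelated = Message("ProductReserved", correlation_id="some-other-process")

# The process manager only picks up events from its own conversation.
inbox = [m for m in (event, unrelated) if m.correlation_id == process_id]
```

The causation id answers "which ItemAdded led to which ProductReserved", and the correlation id answers "which basket conversation is this part of".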

Should I implement the Basket as a process manager instead of a simple aggregate?

My recommendation is no. But Udi Dahan has a different opinion that you should review; which is that CQRS only makes sense if your aggregates are sagas -- Udi used saga in the place where process manager has become the dominant spelling.

should process managers retrieve aggregates?

Not really? Process managers are primarily concerned with orchestration, not domain state. An instance of a process will have "state", in the form of a history of all of the events that they have observed -- the correct thing to do in response to event Z depends on whether or not we have seen events X and Y. So you may need to be able to store and load a representation of that state (which could be something flat, or could be the history of observed events).

(I say "not really" because aggregate is defined vaguely enough that it's not completely wrong to claim that list of observed events is an "aggregate". The differences are more semantic than implementation -- we load process state and then decide what messages to send to the parts of the system responsible for domain state. There's a bit of hand waving going on here.)

So the PM does not need to use one type of state management over another because it is only responsible for doing stuff live and never during replays?

Not quite - state management isn't a do-er, it's a keeper tracker of-er. In circumstances where the process manager shouldn't emit any signals, you give it inert connections to the world. In other words, dispatch(command) is a no-op.
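As a sketch of what "inert connections" could mean in practice: the same process manager class is used live and during replay, and only the injected dispatch function differs. The event names X, Y, Z and the DoTheThing command are placeholders.

```python
class ProcessManager:
    """Tracks the events it has observed and decides what to send next."""

    def __init__(self, dispatch):
        self.dispatch = dispatch
        self.seen = []

    def on_event(self, event):
        self.seen.append(event)
        # The right response to Z depends on whether X and Y were already seen.
        if event == "Z" and "X" in self.seen and "Y" in self.seen:
            self.dispatch("DoTheThing")


def inert_dispatch(command):
    """No-op connection for replays: state is rebuilt, nothing is sent."""


sent = []

# Replay: state is rebuilt from history, but no commands leave the building.
replaying = ProcessManager(inert_dispatch)
for e in ["X", "Y", "Z"]:
    replaying.on_event(e)

# Live: the same logic, wired to a real (here: list-backed) dispatcher.
live = ProcessManager(sent.append)
for e in ["X", "Y", "Z"]:
    live.on_event(e)
```

The state tracking is identical in both cases; only the side effects are switched off.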
