Domain-Driven Design – Understanding Aggregate Meaning

aggregate domain-driven-design

In a discussion about domain-driven design I learned that different people seem to mean different things when they use the word aggregate. The main difficulty is that some people use the word aggregate for what other people call an aggregate type.

It is quite difficult to have a discussion if people assume different meanings for the same words. For this reason I set out to clarify what most people and the literature agree on. If you answer this question, I would be very happy if you could provide a reference to the literature.


For one person, an aggregate is the boundary that groups a collection of entities. It is more of a conceptual clustering boundary.

For another person, an aggregate is a collection of entities loaded from a database through a repository (with transactional consistency). So an aggregate is something real and not just a concept. If, for example, I load two users from a database, then I have loaded two aggregates of the same aggregate type.

Yet another person also thinks of an aggregate as a collection of entities that are transactionally consistent, but holds that if you load data of a given aggregate type you can also load it partially (with some data just null, for example) and still call the whole thing one aggregate, while others would see this as two aggregates (with eventual consistency, meaning that consistency is reached after both aggregates are saved).


To find the true meaning of the word aggregate myself, I had a look at Martin Fowler's definition. There, an aggregate is something real, and there can be two aggregates of the same aggregate type. But when reading something like this article from Vaughn Vernon, I get the impression that he calls aggregate what, under the Martin Fowler-style understanding, should be called an aggregate type.

Best Answer

For terminology in Domain Driven Design, start from "the blue book" -- Domain Driven Design by Eric Evans.

AGGREGATE A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the aggregate, designated as the root. A set of consistency rules applies within the aggregate's boundaries.

That last sentence, I think you can turn around -- the boundaries of the aggregate are defined by the consistency rules.

It's definitely the case that an aggregate has state. Each time the domain model changes, an aggregate is taken from one consistent state to another. The data that we persist is used to reconstruct this state. So in that sense, it is a real thing.
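To make the instance-versus-type distinction concrete, here is a minimal sketch. The `User` class, the `ROWS` dictionary standing in for a database, and the email rule are all assumptions made up for illustration; none of them come from Evans or from the question.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical aggregate type: the class describes shape and rules."""
    user_id: str
    email: str

    def change_email(self, new_email: str) -> None:
        # Consistency rule (assumed for illustration): an email contains "@".
        if "@" not in new_email:
            raise ValueError("invalid email")
        # The aggregate moves from one consistent state to another.
        self.email = new_email


# Stand-in for persisted data; a real repository would query a database.
ROWS = {
    "u1": {"user_id": "u1", "email": "ann@example.com"},
    "u2": {"user_id": "u2", "email": "bob@example.com"},
}


def load_user(user_id: str) -> User:
    """Reconstruct one aggregate instance from its persisted state."""
    return User(**ROWS[user_id])


ann = load_user("u1")  # one aggregate
bob = load_user("u2")  # another aggregate of the same type
ann.change_email("ann@example.org")
```

In this reading, `ann` and `bob` are two aggregates, and `User` is what some people would call the aggregate type. Changing one does not touch the other.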

But the aggregate itself doesn't necessarily have a word in the ubiquitous language. It's a derived concept.

Broadly, we could put the entire domain model under a single aggregate that enforces all of the consistency rules. We don't, because that design doesn't scale: we can't change the domain model two different ways at the same time, even when the changes we are making don't share any consistency rules. It's a poor way to model a business that can do more than one thing at a time.

Instead, we decompose the consistency rules into sets, subject to the constraint that two rules that reference the same data must be part of the same set. (In doing this, we are also working with the ubiquitous language and the domain experts to determine if we are correctly describing the consistency rules).

To update the model, we identify the aggregate responsible for a piece of data and propose the change. If the aggregate verifies all of its local consistency rules, we know that the change is globally valid, and we can apply the change. This restores our ability to do more than one thing at a time - changes to data in different aggregates can't possibly conflict with each other, by construction.
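The "propose a change, let the aggregate check its local rules" flow can be sketched roughly as follows. The `Account` aggregate, the `RuleViolation` exception, and the no-negative-balance rule are hypothetical names chosen for the example, not part of any particular framework.

```python
class RuleViolation(Exception):
    """Raised when a proposed change would break a consistency rule."""


class Account:
    """Hypothetical aggregate guarding one rule: balance never goes negative."""

    def __init__(self, account_id: str, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        # Verify every local rule before touching any state.
        if amount <= 0:
            raise RuleViolation("amount must be positive")
        if amount > self.balance:
            raise RuleViolation("insufficient funds")
        # All local rules hold, so the change is applied.
        self.balance -= amount


# Two aggregates share no data, so these changes cannot conflict.
a = Account("a", 100)
b = Account("b", 20)
a.withdraw(30)
try:
    b.withdraw(50)  # rejected by b's own rule; a is unaffected
except RuleViolation:
    pass
```

Because each rule only reads data owned by one aggregate, the check inside `withdraw` is enough to decide whether the change is valid; nothing outside the aggregate needs to be consulted or locked.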

Best practices suggest that most aggregates should contain only the root entity. So you can conflate the aggregate with the entity without too much risk. But my guess is there won't usually be anything in the ubiquitous language to hang on the cluster when it includes more than one entity; so you end up with the ShoppingCart aggregate maintaining the consistency rules for the ShoppingCart entity and the CartItems entity collection and....
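As an illustration of a multi-entity aggregate, the sketch below has a `ShoppingCart` root owning a collection of `CartItem` entities. The quantity cap is an invented rule, picked only because it spans the root and its items and therefore has to be enforced inside one boundary.

```python
from dataclasses import dataclass


@dataclass
class CartItem:
    """Entity inside the aggregate; callers never hold a reference to it."""
    sku: str
    quantity: int


class ShoppingCart:
    """Aggregate root: all changes to the items go through this class."""

    def __init__(self, cart_id: str, max_total_quantity: int = 10) -> None:
        self.cart_id = cart_id
        self.max_total_quantity = max_total_quantity  # assumed rule
        self._items = {}  # sku -> CartItem, private to the aggregate

    def total_quantity(self) -> int:
        return sum(item.quantity for item in self._items.values())

    def add_item(self, sku: str, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        # The rule reads data from every item, so it lives on the root.
        if self.total_quantity() + quantity > self.max_total_quantity:
            raise ValueError("cart quantity limit exceeded")
        item = self._items.get(sku)
        if item is None:
            self._items[sku] = CartItem(sku, quantity)
        else:
            item.quantity += quantity

    def items(self) -> tuple:
        # Expose copies of the data rather than the entities themselves.
        return tuple((i.sku, i.quantity) for i in self._items.values())


cart = ShoppingCart("c1")
cart.add_item("apple", 3)
cart.add_item("apple", 2)
cart.add_item("pear", 4)
```

Returning plain tuples from `items()` is one way to keep external references restricted to the root; other designs hand out immutable views instead.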

Partial loading of an aggregate is broken when trying to apply a change -- how could a well designed aggregate possibly validate all of its consistency rules with a subset of the data? It's certainly the case that, if you have a requirement where this makes sense, your modeling is broken somewhere.

But if you are doing a read, loading only some of the data guarded by the aggregate can make sense. Command Query Responsibility Segregation (CQRS) takes this a step further; once the model has verified that the data satisfies the consistency rules, you can completely rearrange that data into whatever read-only form makes your life easiest. Put another way, if you aren't concerned with data changes, you don't need to worry about the aggregate boundary at all.
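A read side in this spirit might look like the sketch below. The row layout (`cart_id`, `sku`, `quantity`, `unit_price_cents`) and the `project_cart_summary` function are assumptions for illustration; the point is only that the projection ignores aggregate boundaries and enforces no rules.

```python
def project_cart_summary(rows):
    """Rearrange already-validated rows into a read-only summary per cart.

    No consistency rules are checked here: the write side did that before
    the rows were persisted.
    """
    summary = {}
    for row in rows:
        entry = summary.setdefault(
            row["cart_id"], {"line_count": 0, "total_cents": 0}
        )
        entry["line_count"] += 1
        entry["total_cents"] += row["quantity"] * row["unit_price_cents"]
    return summary


# Flat rows as they might come back from a reporting query (hypothetical).
rows = [
    {"cart_id": "c1", "sku": "apple", "quantity": 5, "unit_price_cents": 40},
    {"cart_id": "c1", "sku": "pear", "quantity": 4, "unit_price_cents": 60},
    {"cart_id": "c2", "sku": "apple", "quantity": 1, "unit_price_cents": 40},
]
summary = project_cart_summary(rows)
```

The function reads across several carts at once and never constructs a cart object, which is fine for a query but would be the wrong place to apply a change.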