Domain-Driven Design – Is Domain/Persistence Model Isolation Awkward?

domain-driven-design, domain-model, persistence

I'm diving into the concepts of Domain-Driven Design (DDD) and found some principles strange, especially regarding the isolation of the domain and persistence models. Here is my basic understanding (a rough sketch in code follows the list):

  1. A service on the application layer (providing a feature set) requests domain objects from a repository it needs to carry out its function.
  2. The concrete implementation of this repository fetches data from the storage it was implemented for.
  3. The service tells the domain object, which encapsulates business logic, to perform certain tasks, which modify its state.
  4. The service tells the repository to persist the modified domain object.
  5. The repository needs to map the domain object back to the corresponding representation in storage.
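
For illustration, here is roughly how I picture that flow in code; the Order, IOrderRepository and CancelOrderService names are made up for the example:

using System;

// Domain layer: the entity encapsulates business logic (step 3).
public class Order
{
    public string Id { get; private set; }
    public bool IsCancelled { get; private set; }

    public Order(string id) { Id = id; }

    public void Cancel()
    {
        if (IsCancelled)
            throw new InvalidOperationException("Order is already cancelled.");
        IsCancelled = true;
    }
}

// Domain layer: repository abstraction; concrete implementations live in the
// infrastructure layer (step 2).
public interface IOrderRepository
{
    Order GetById(string id);
    void Save(Order order);
}

// Application layer: the service orchestrates the use case (steps 1, 4, 5).
public class CancelOrderService
{
    private readonly IOrderRepository _orders;

    public CancelOrderService(IOrderRepository orders) { _orders = orders; }

    public void CancelOrder(string orderId)
    {
        var order = _orders.GetById(orderId); // 1/2: load via the repository
        order.Cancel();                       // 3: domain logic modifies state
        _orders.Save(order);                  // 4/5: repository maps it back to storage
    }
}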

Flow Illustration

Now, given the above assumptions, the following seems awkward:

Ad 2.:

The repository seems to load the entire domain object (including all fields and references), even if parts of it are not needed by the function that requested it. Loading it entirely might not even be possible if it references other domain objects, unless you load those domain objects as well, and everything they reference in turn, and so on. Lazy loading comes to mind, but that means the domain objects start issuing queries themselves, which should be the repository's responsibility in the first place.

Given this problem, the "correct" way of loading domain objects seems to be a dedicated loading function for each use case. These dedicated functions would then only load the data required by the use case they were designed for. Here is where the awkwardness comes in: first, I would have to maintain a considerable number of loading functions for each implementation of the repository, and second, domain objects would end up in incomplete states, carrying null in their fields. The latter should technically not be a problem, because if a value was not loaded, it should not be required by the functionality that requested it anyway. Still, it is awkward and a potential hazard.
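
For illustration, this is what I expect such a repository to end up looking like (the names are made up):

// Hypothetical domain object: fields stay null unless the use case loaded them.
public class User
{
    public string Id { get; set; }
    public string Name { get; set; }           // null unless loaded
    public string BillingAddress { get; set; } // null unless loaded
}

// Hypothetical repository with one loading function per use case.
public interface IUserRepository
{
    User LoadForRenaming(string userId);   // populates Id and Name only
    User LoadForBillingRun(string userId); // populates Id and BillingAddress only
    void Save(User user);
}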

Ad 3.:

How would a domain object verify uniqueness constraints upon construction if it has no notion of the repository? For instance, if I wanted to create a new User with a given, unique social security number, the earliest conflict would occur when asking the repository to save the object, and only if a uniqueness constraint is defined on the database. Otherwise, I could look for a User with the given social security number and report an error if one exists, before creating a new one. But then the constraint checks would live in the service and not in the domain object, where they belong. I just realised that the domain objects may very well be allowed to use (injected) repositories for validation.
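
Concretely, the check would end up in the application service, something like this (assuming the hypothetical IUserRepository above gained an ExistsBySocialSecurityNumber method and that User has a matching constructor):

using System;

public class RegisterUserService
{
    private readonly IUserRepository _users;

    public RegisterUserService(IUserRepository users) { _users = users; }

    public void Register(string socialSecurityNumber, string name)
    {
        // The uniqueness rule ends up here in the service,
        // not inside the User object itself.
        if (_users.ExistsBySocialSecurityNumber(socialSecurityNumber))
            throw new InvalidOperationException(
                "A User with this social security number already exists.");

        _users.Save(new User(socialSecurityNumber, name));
    }
}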

Ad 5.:

I perceive the mapping of domain objects to a storage backend as work-intensive compared to having the domain objects modify the underlying data directly. It is, of course, an essential prerequisite for decoupling the concrete storage implementation from the domain code. But does it really come at such a high cost?

You apparently have the option of using ORM tools to do the mapping for you. However, these often require you to design the domain model according to the ORM's restrictions, or even introduce a dependency from the domain layer to the infrastructure layer (by using ORM annotations in the domain objects, for instance). I've also read that ORMs introduce considerable computational overhead.
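
For instance, with .NET's data-annotation attributes, the mapping metadata sits directly on the domain class (the table and column names here are made up), which is exactly the kind of coupling I mean:

using System.ComponentModel.DataAnnotations;        // persistence concerns...
using System.ComponentModel.DataAnnotations.Schema; // ...pulled into the domain layer

[Table("users")]
public class User
{
    [Key]
    public string Id { get; set; }

    [Column("social_security_number")]
    public string SocialSecurityNumber { get; set; }

    public string Name { get; set; }
}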

In the case of NoSQL databases, for which hardly any ORM-like concepts exist, how do you keep track of which properties changed in the domain models upon save()?
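
The only approach I can think of (a sketch only, not tied to any particular driver, and assuming User carries an Id) is to snapshot the document on load and diff against it on save:

using System.Collections.Generic;

// Sketch of snapshot-based change tracking: the repository remembers the state
// it loaded and diffs against it on save(). Driver- and mapping-specific parts
// are left abstract.
public abstract class UserDocumentRepository
{
    private readonly Dictionary<string, Dictionary<string, object>> _snapshots =
        new Dictionary<string, Dictionary<string, object>>();

    public User GetById(string id)
    {
        var document = LoadDocumentFromStore(id);
        _snapshots[id] = new Dictionary<string, object>(document);
        return MapToDomainObject(id, document);
    }

    public void Save(User user)
    {
        var current = MapToDocument(user);
        var changed = new Dictionary<string, object>();

        foreach (var field in current)
        {
            object previous;
            if (!_snapshots[user.Id].TryGetValue(field.Key, out previous)
                || !Equals(previous, field.Value))
            {
                changed[field.Key] = field.Value; // only these need to be written
            }
        }

        WritePartialUpdate(user.Id, changed);
    }

    protected abstract Dictionary<string, object> LoadDocumentFromStore(string id);
    protected abstract User MapToDomainObject(string id, Dictionary<string, object> document);
    protected abstract Dictionary<string, object> MapToDocument(User user);
    protected abstract void WritePartialUpdate(string id, Dictionary<string, object> changedFields);
}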

Edit: Also, in order for a repository to access the domain object's state (i.e. the value of each field), the domain object needs to reveal its internal state, which breaks encapsulation.
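
To make that concrete: a mapper inside the repository implementation needs read access to every field, so the domain object ends up exposing them all (again, hypothetical names):

// Infrastructure layer: to build the storage representation, the mapper reads
// every field, so the domain object has to make them all publicly readable.
public class UserRecordMapper
{
    public UserRecord ToRecord(User user)
    {
        return new UserRecord
        {
            Id = user.Id,
            Name = user.Name,                                // public getter required
            SocialSecurityNumber = user.SocialSecurityNumber // public getter required
        };
    }
}

// Storage-side representation of a User.
public class UserRecord
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string SocialSecurityNumber { get; set; }
}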

In general:

  • Where would transactional logic go? This is certainly persistence-specific. Some storage infrastructure might not even support transactions at all (like in-memory mock repositories).
  • For bulk operations that modify multiple objects, would I have to load, modify and store each object individually in order to go through the object's encapsulated validation logic? That is opposed to executing a single query directly against the database.

I would appreciate some clarification on this topic. Are my assumptions correct? If not, what is the correct way of tackling these problems?

Best Answer

Your basic understanding is correct and the architecture you sketch out is good and works well.

Reading between the lines, it seems like you are coming from a more database-centric, Active Record style of programming? To get to a working implementation, I would say you need to keep the following in mind:

1: Domain objects don't have to include the whole object graph. For example I could have:

public class Customer
{
    public string AddressId {get;set;}
    public string Name {get;set;}
}

public class Address
{
    public string Id {get;set;}
    public string HouseNumber {get;set;}
}

Address and Customer need only be part of the same aggregate if you have some logic such as "the customer name can only start with the same letter as the house name". You are right to avoid lazy loading and 'Lite' versions of objects.

2: Uniqueness constraints are generally the purview of the repository, not the domain object. Don't inject repositories into Domain Objects; that's a move back to Active Record. Simply raise an error when the service attempts to save.

The business rule isn't "No two instances of User with the same SocialSecurityNumber can exist at the same time ever"

It's that they can't exist in the same repository.
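
As a sketch, with a deliberately simple in-memory repository (insert-only, and assuming User exposes its SocialSecurityNumber):

using System;
using System.Collections.Generic;

// The repository, not the User, enforces "unique within this store".
public class InMemoryUserRepository
{
    private readonly Dictionary<string, User> _usersBySsn = new Dictionary<string, User>();

    public void Save(User user)
    {
        // Sketch handles inserts only; a real implementation would also cope with updates.
        if (_usersBySsn.ContainsKey(user.SocialSecurityNumber))
            throw new InvalidOperationException(
                "A User with this social security number already exists in this repository.");

        _usersBySsn[user.SocialSecurityNumber] = user;
    }
}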

3: It's not hard to write repositories rather than individual property-update methods. In fact, you'll find that you have pretty much the same code either way; it's just a question of which class you put it in.

ORMs these days are easy to use and impose no extra constraints on your code. Having said that, I personally prefer to simply hand-crank the SQL. It's not that hard, you never run into issues with ORM features, and you can optimise where required.

There really is no need to keep track of which properties changed when you save. Keep your Domain Objects small and simply overwrite the old version.
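
For example, a hand-written save that just overwrites the whole row is not much code (parameterised ADO.NET; the table and column names are made up, and I'm assuming Customer also carries an Id):

using System.Data.SqlClient;

public class SqlCustomerRepository
{
    private readonly string _connectionString;

    public SqlCustomerRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void Save(Customer customer)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(
            "UPDATE Customer SET Name = @name, AddressId = @addressId WHERE Id = @id",
            connection))
        {
            command.Parameters.AddWithValue("@id", customer.Id);
            command.Parameters.AddWithValue("@name", customer.Name);
            command.Parameters.AddWithValue("@addressId", customer.AddressId);

            connection.Open();
            command.ExecuteNonQuery(); // overwrite the whole row, no change tracking
        }
    }
}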

General Questions

  1. Transaction logic goes in the repository. But you shouldn't have much, if any, of it. Sure, you need some if you have child tables into which you are putting the child objects of the aggregate, but that will be entirely encapsulated within the SaveMyObject repository method.

  2. Bulk updates. Yes, you should individually alter each object, then just add a SaveMyObjects(List<MyObject> objects) method to your repository to do the bulk update (see the sketch after this list).

    You want the Domain Object or Domain Service to contain the logic, not the database. That means you can't just do "update customer set name=x where y", because for all you know the Customer object, or the CustomerUpdateService, does 20-odd other things when you change the name.
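
To sketch both points together (TransactionScope is just one way to do it, and the Rename method on Customer is made up for the example):

using System.Collections.Generic;
using System.Transactions;

// The transaction lives entirely inside the repository method; the bulk variant
// simply loops over domain objects that have already run their own logic.
public class CustomerBulkRepository
{
    private readonly SqlCustomerRepository _inner; // the hand-written repository sketched earlier

    public CustomerBulkRepository(SqlCustomerRepository inner) { _inner = inner; }

    public void SaveCustomers(List<Customer> customers)
    {
        using (var scope = new TransactionScope())
        {
            foreach (var customer in customers)
                _inner.Save(customer);
            scope.Complete();
        }
    }
}

public class CustomerUpdateService
{
    private readonly CustomerBulkRepository _repository;

    public CustomerUpdateService(CustomerBulkRepository repository) { _repository = repository; }

    public void RenameAll(List<Customer> customers, string prefix)
    {
        foreach (var customer in customers)
            customer.Rename(prefix + customer.Name); // the "20 odd other things" happen in here

        _repository.SaveCustomers(customers);
    }
}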