How to handle business rules that are “uniqueness” constraints

domain-driven-design

In a hypothetical system that handles adding users, there are several business rules. Some of the rules can easily be checked in the model. For example, a user registration can only be saved if the user entered a 10-digit phone number.
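
As an illustration of a rule that lives comfortably in the model, a value object can enforce the format at construction time. This is only a sketch; the PhoneNumber class and its API are invented for the example:

```java
// Hypothetical value object: the "10 digit phone number" rule is enforced
// at construction, so an invalid number can never enter the model.
public final class PhoneNumber {
    private final String digits;

    public PhoneNumber(String digits) {
        if (digits == null || !digits.matches("\\d{10}")) {
            throw new IllegalArgumentException("A phone number must be exactly 10 digits");
        }
        this.digits = digits;
    }

    public String value() {
        return digits;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PhoneNumber && ((PhoneNumber) other).digits.equals(digits);
    }

    @Override
    public int hashCode() {
        return digits.hashCode();
    }
}
```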

But what if we want this phone number to be unique?

In most databases it's fairly easy to add a constraint that generates an error when trying to store a duplicate value. But when following that approach, the model doesn't explicitly make clear that a phone number should be unique. If the model is reused or the database is changed, these business rules could be overlooked.
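
A sketch of that first approach, assuming a JPA-style mapping (the entity and column names are invented): the uniqueness rule lives in the schema mapping, and nothing in the domain model itself states it.

```java
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;

// The constraint is declared on the persistence mapping / generated schema.
// A duplicate surfaces as a persistence exception at flush/commit time,
// not as an explicit domain rule.
@Entity
public class User {
    @Id
    private Long id;

    @Column(nullable = false, unique = true, length = 10)
    private String phoneNumber;
}
```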

If we want to encapsulate this knowledge in the domain, we could create a Domain Service (since Entities are not supposed to communicate with the outside world like the database). This UserService could use the UserRepository to check if the phone number already exists.
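
A minimal sketch of that Domain Service (all names are invented; the real model would define its own shapes):

```java
// Hypothetical repository abstraction used by the Domain Service.
interface UserRepository {
    boolean existsWithPhoneNumber(String phoneNumber);
    void add(User user);
}

// Hypothetical user entity, reduced to the one attribute that matters here.
class User {
    private final String phoneNumber;

    User(String phoneNumber) {
        this.phoneNumber = phoneNumber;
    }

    String phoneNumber() {
        return phoneNumber;
    }
}

// The Domain Service states the uniqueness rule explicitly in the domain layer.
class UserService {
    private final UserRepository users;

    UserService(UserRepository users) {
        this.users = users;
    }

    void register(User user) {
        if (users.existsWithPhoneNumber(user.phoneNumber())) {
            throw new IllegalStateException("Phone number is already in use");
        }
        users.add(user); // note: this check-then-add sequence is racy, as the answer below explains
    }
}
```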

The second approach requires more code and an additional round trip to the database, but it improves the domain model.

Are there (better) alternatives? Which approach would you choose and why?

Best Answer

But what if we want this phone number to be unique?

An important question is "what is it worth to the business?" How expensive are errors here? Will a best effort approximation keep you under the error budget?

In most databases it's fairly easy to add a constraint that generates an error when trying to store a duplicate value. But when following that approach, the model doesn't explicitly make clear that a phone number should be unique. If the model is reused or the database is changed, these business rules could be overlooked.

That's the most common approach -- the general term for the uniqueness problem is set validation, and RDBMS systems tend to be really really good at sets.

So what that answer might look like is using an RDBMS as your message store, and also having in the schema a table for the phone number mappings, with the appropriate uniqueness constraints.
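
For instance, the mapping table might look like this (table and column names are invented, plain JDBC assumed); the primary key on the phone number is the uniqueness constraint:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical schema for the phone number mappings; the storage engine
// rejects any attempt to map the same number twice.
class PhoneNumberMappingSchema {
    static void create(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(
                "CREATE TABLE phone_number_mappings ("
                    + " phone_number VARCHAR(10) PRIMARY KEY,"
                    + " user_id BIGINT NOT NULL"
                    + ")");
        }
    }
}
```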

But as you note, that solution isn't entirely satisfactory, because the responsibility for managing the constraints is split between the storage appliance and the domain model.

If you want to solve this in the model, well... the answer is to pull the set into the domain model. So far, I've only seen two variations.

One is to make "the set" into an aggregate - so all of the users would be part of a single aggregate. Checking the uniqueness of the phone number is easy, but contention for other kinds of edits goes up.
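
A rough sketch of that first variation (names invented; in practice the collection would likely be far too large to treat this way, which is part of the cost):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "all users" aggregate: one consistency boundary for the whole set.
// Uniqueness checks are trivial, but every registration, and many unrelated edits,
// now contend on this single aggregate root.
public class UserDirectory {
    private final Map<String, Long> userIdByPhoneNumber = new HashMap<>();

    public void register(Long userId, String phoneNumber) {
        if (userIdByPhoneNumber.containsKey(phoneNumber)) {
            throw new IllegalStateException("Phone number is already taken");
        }
        userIdByPhoneNumber.put(phoneNumber, userId);
    }
}
```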

Another is to make "the set of all users with this phone number" into an aggregate. Instead of "this user has a phone number", it's "this phone number has a user".
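
And a sketch of the second variation (again, names invented), where each phone number is its own, much smaller, aggregate:

```java
// Hypothetical aggregate: "this phone number has a user".
// Each number is its own consistency boundary, so two registrations only
// conflict when they actually claim the same number.
public class PhoneNumberRegistration {
    private final String phoneNumber;
    private Long ownerUserId; // null until someone claims the number

    public PhoneNumberRegistration(String phoneNumber) {
        this.phoneNumber = phoneNumber;
    }

    public void claim(Long userId) {
        if (ownerUserId != null && !ownerUserId.equals(userId)) {
            throw new IllegalStateException("Phone number " + phoneNumber + " is already claimed");
        }
        ownerUserId = userId;
    }
}
```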

A big consideration in these cases, and the reason you'll get a lot of advice not to go down this particular rabbit hole, is that your system isn't the authority for the relationship that you are trying to enforce. Unless you are a phone switchboard assigning numbers to accounts, it's not your data, and it can change without giving you notice. (Note: you get the same problems when using the database to enforce constraints on data you don't own.)

If you pull phone numbers into the model, how will you handle concurrency? Someone could add the same number just after you pulled the set from the database, and you would end up adding it a second time...

You'll handle concurrency the same way that you needed to anyway -- first-writer-wins semantics on your persistence store. You load an old value into your model, compute the new value, and then apply a compare-and-swap operation on your persistent store. The winner of the data race is done, the loser gets to compensate using whatever strategy you prefer (abort, retry, ignore)...
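
A sketch of that load / compute / compare-and-swap cycle (the store interface and versioning scheme are invented for illustration):

```java
// Hypothetical versioned view of a phone number registration.
record VersionedRegistration(long version, Long ownerUserId) {}

// Hypothetical store offering first-writer-wins semantics via compare-and-swap.
interface PhoneNumberStore {
    // Returns the registration as last persisted, or null if the number was never stored.
    VersionedRegistration load(String phoneNumber);

    // Writes only if the stored version still equals expectedVersion;
    // returns false when another writer got there first.
    boolean compareAndSwap(String phoneNumber, long expectedVersion, long newOwnerUserId);
}

class ClaimPhoneNumber {
    private final PhoneNumberStore store;

    ClaimPhoneNumber(PhoneNumberStore store) {
        this.store = store;
    }

    boolean claim(String phoneNumber, long userId) {
        // Load an old value into the model...
        VersionedRegistration current = store.load(phoneNumber);
        if (current != null && current.ownerUserId() != null) {
            return false; // already claimed; nothing to race for
        }
        long expectedVersion = current == null ? 0L : current.version();
        // ...compute the new value, then compare-and-swap it into the store.
        // On false, a concurrent writer won the race; compensate (abort, retry, ignore).
        return store.compareAndSwap(phoneNumber, expectedVersion, userId);
    }
}
```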

There's nothing about set validation that is special here if you remember that you need to model the set as a first-class entity (which means paying for the contention).

Why would you prefer this over creating a Domain Service to check uniqueness before inserting/updating?

Because if you aren't locking the set against modification, your validation has race bugs.

Fundamentally, all of the checks you run in local memory are working off of a copy of the data. If the underlying source changes while you are looking at the copy, then you haven't really checked what you think you have.

I only know of two patterns to address this; you either put a hard lock on the set (which is what we are effectively doing when we arrange that the database enforces the constraint), or you apply an atomic compare-and-swap to ensure that the copy of the data that you used for validation is still up to date.
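
For completeness, a sketch of the first pattern, where the database constraint is the lock and the losing writer compensates (it assumes a phone_number_mappings table like the one above; some drivers report the violation via SQL state 23xxx rather than this exception subclass):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// The unique/primary key constraint is the arbiter of the set;
// the second writer finds out by way of the constraint violation.
class PhoneNumberClaims {
    boolean tryClaim(Connection connection, String phoneNumber, long userId) throws SQLException {
        try (PreparedStatement insert = connection.prepareStatement(
                "INSERT INTO phone_number_mappings (phone_number, user_id) VALUES (?, ?)")) {
            insert.setString(1, phoneNumber);
            insert.setLong(2, userId);
            insert.executeUpdate();
            return true;  // first writer wins
        } catch (SQLIntegrityConstraintViolationException duplicate) {
            return false; // someone else already holds the number; compensate upstream
        }
    }
}
```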