Architecture – Addressing Primary Keys Outside Business Domain

In almost all circumstances, primary keys are not a part of your business domain. Sure, you may have some important user-facing objects with unique indices (UserName for users or OrderNumber for orders), but in most cases there is no business need to overtly identify domain objects by a single value or set of values to anyone but perhaps an administrative user. Even in those exceptional cases, especially if you are using globally unique identifiers (GUIDs), you will likely want to employ an alternate key rather than expose the primary key itself.

So, if my understanding of domain-driven design is accurate, primary keys need not and thus should not be exposed, and good riddance. They're ugly and cramp my style. But if we choose not to include primary keys in the domain model, there are consequences:

Naively, data transfer objects (DTOs) that derive exclusively from combinations of domain models will not have primary keys
Incoming DTOs will not have a primary key

So, is it safe to say that if you are really going to stay pure and eliminate primary keys from your domain model, you should be prepared to handle every request in terms of unique indices instead of the primary key?

Put another way, which of the following solutions is the correct approach to identifying particular objects after removing PKs from domain models?

Being able to identify the objects you need to deal with by other attributes
Getting the primary key back in the DTO; i.e., eliminating the PK when mapping from persistence to domain, then recombining the PK when mapping from domain to DTO

EDIT: Let's make this concrete.

Say my domain model is VoIPProvider which includes fields like Name, Description, URL, as well as references like ProviderType, PhysicalAddress, and Transactions.

Now let's say I want to build a web service that will allow privileged users to manage VoIPProviders.

Perhaps a user-friendly ID is useless in this case; after all, VoIP providers are companies whose names tend to be distinct in the computer sense and even distinct enough in the human sense for business reasons. So it may be enough to say that a unique VoIPProvider is completely determined by (Name, URL). So now let's say I need a method PUT api/providers/voip so that privileged users can update VoIP providers. They send up a VoIPProviderDTO, which includes many but not all of the fields from the VoIPProvider, including some flattening potentially. However, I can't read their minds, and they still need to tell me which provider we are talking about.

It seems I have 2 (maybe 3) options:

Include a primary key or alternate key in my domain model and send it to the DTO, and vice versa
Identify the provider we care about via the unique index, like (Name, URL)
Introduce some sort of intermediate object that can always map between persistence layer, domain, and DTO in a way that does not expose implementation details about the persistence layer, say by introducing an in-memory temporary identifier when going from domain to DTO and back
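To make the third option more tangible, here is a minimal Python sketch of an intermediate object that hands out opaque handles. The names HandleRegistry, VoIPProviderDTO.handle, and to_dto are hypothetical and not part of the question; the sketch assumes the persistence layer can supply an integer primary key when it loads a provider.

```python
import secrets
from dataclasses import dataclass


@dataclass
class VoIPProvider:
    """Domain model: deliberately carries no primary key."""
    name: str
    url: str
    description: str = ""


@dataclass
class VoIPProviderDTO:
    """DTO: carries an opaque handle instead of the database key."""
    handle: str
    name: str
    url: str


class HandleRegistry:
    """Hypothetical in-memory map from opaque handles to primary keys."""

    def __init__(self) -> None:
        self._pk_by_handle = {}

    def issue(self, pk: int) -> str:
        # A random token reveals nothing about the persistence layer.
        handle = secrets.token_urlsafe(8)
        self._pk_by_handle[handle] = pk
        return handle

    def resolve(self, handle: str) -> int:
        # Raises KeyError for unknown handles.
        return self._pk_by_handle[handle]


def to_dto(provider: VoIPProvider, pk: int, registry: HandleRegistry) -> VoIPProviderDTO:
    """Map domain -> DTO, attaching a temporary handle for the given key."""
    return VoIPProviderDTO(registry.issue(pk), provider.name, provider.url)
```

When an update comes back, the service would call registry.resolve(dto.handle) to recover the primary key before talking to the database. The cost of this design is that the registry has to live somewhere and survive for as long as clients hold handles.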

Best Answer

This is how we have solved this for more than 15 years (since before the term "domain-driven design" was even invented):

When mapping the domain model to a database implementation or a class model in a specific programming language, you have a simple, consistent rule like "for each domain object mapped to a relational table, the primary key is TablenameID".
This primary key is completely artificial; it always has the same type and no business meaning. It is just a surrogate key.
The "graphical version" of your domain model (the one you use to talk to your domain experts) does not contain primary keys. You don't expose them directly to the experts (but you do expose them to anyone who is actually implementing code for the system).

So whenever you need a primary key for technical purposes (like mapping relations to a database), you have one available, but whenever you don't want to "see it", switch your level of abstraction to the "domain experts' model". And you don't have to maintain two models (one with PKs and one without); instead, maintain only a model without PKs and use a code generator to create the DDL for your DB, which adds the PK automatically according to the mapping rules.
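As a rough illustration of such a generator, the following Python sketch takes a PK-free entity description and emits DDL that adds the surrogate key by the "TablenameID" rule. The Entity class and generate_ddl function are hypothetical names, and the column types are assumptions chosen for the example; a real generator would handle many more cases.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A domain-level entity description with no primary key."""
    name: str
    columns: dict                      # column name -> SQL type
    references: list = field(default_factory=list)  # referenced entity names


def generate_ddl(entity: Entity) -> str:
    """Emit CREATE TABLE, adding the surrogate key '<Table>ID' by rule."""
    lines = [f"{entity.name}ID INTEGER PRIMARY KEY"]
    lines += [f"{col} {sql_type}" for col, sql_type in entity.columns.items()]
    # References always point at the other table's surrogate key.
    lines += [
        f"{ref}ID INTEGER REFERENCES {ref}({ref}ID)"
        for ref in entity.references
    ]
    body = ",\n  ".join(lines)
    return f"CREATE TABLE {entity.name} (\n  {body}\n);"


provider = Entity(
    name="VoIPProvider",
    columns={"Name": "TEXT", "URL": "TEXT"},
    references=["ProviderType"],
)
print(generate_ddl(provider))
```

The model that domain experts see is only the Entity description; the key columns exist solely in the generated output.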

Note that this does not forbid adding "business keys" such as an additional "OrderNumber" besides the surrogate OrderID. Technically, these business keys become alternate keys when mapping to your database. Just avoid using them to create references to other tables; always prefer the surrogate keys if possible, as this will make things a hell of a lot easier.

To your comment: using a surrogate key to identify records is not a business-related operation; it is a purely technical one. To make this clear, look at your example: as long as you don't define additional unique constraints, it would be possible to have two VoIPProvider objects with the same combination of (Name, URL) but different VoIPProviderIDs.
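The point about duplicates can be checked with a small experiment. This sketch uses Python's built-in sqlite3 module with an in-memory database; the table layout and the index name ux_provider_name_url are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VoIPProvider ("
    " VoIPProviderID INTEGER PRIMARY KEY,"  # surrogate key, assigned by SQLite
    " Name TEXT NOT NULL,"
    " URL TEXT NOT NULL)"
)

insert_sql = "INSERT INTO VoIPProvider (Name, URL) VALUES (?, ?)"
row = ("Acme Voice", "https://acme.example")

# Without a unique constraint, the same (Name, URL) pair can be stored twice.
conn.execute(insert_sql, row)
conn.execute(insert_sql, row)
ids = [r[0] for r in conn.execute(
    "SELECT VoIPProviderID FROM VoIPProvider ORDER BY VoIPProviderID")]

# Remove the duplicate, then declare (Name, URL) as an alternate key.
conn.execute("DELETE FROM VoIPProvider WHERE VoIPProviderID = ?", (ids[1],))
conn.execute(
    "CREATE UNIQUE INDEX ux_provider_name_url ON VoIPProvider (Name, URL)")

try:
    conn.execute(insert_sql, row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Before the index exists, both inserts succeed and receive different VoIPProviderIDs; afterwards, the database itself refuses a second row with the same (Name, URL).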

Related Solutions

Architecture – Validation and data persistence in a domain model

I could be mistaken in my understanding, but I believe you're trying to redesign your Domain Model so that it isn't riddled with anemia. A few really great books on the matter are:

Martin Fowler: Patterns of Enterprise Application Architecture
Eric Evans: Domain-Driven Design: Tackling Complexity in the Heart of Software
Vaughn Vernon: Implementing Domain-Driven Design

They have a lot of content that answers quite a few questions about your goal. One example:

A rich Domain Model can look different from the database design, with inheritance, strategies, and other Gang of Four patterns, and complex webs of small interconnected objects. A rich Domain model is better for more complex logic, but is harder to map to the database.

A simple Domain Model can use Active Record, whereas a rich Domain Model requires a Data Mapper. Since the behavior of the business is subject to a lot of change, it's important to be able to modify, build, and test this layer easily.

That is an excerpt from Fowler. He also goes into detail about anemic and bloated models. The scope of the question appears quite large to me, so I'm not sure I could delve far into it. Those books should point you in the right direction; they will also let you gauge your current project and implementation so that you can refactor it quickly.

From a relatively basic understanding:

You should be able to abstract your model, which will allow a brief overlay of the UI and Data Access into your Domain Model. But otherwise the model should allow a clear, clean, concise implementation which will help separate responsibility.

Sorry I couldn't be more help.

Architecture – How to Maintain User Avatars in NoSQL Databases

One of the possible solutions would be to add a route such as https://example.com/user/<id>/avatar which would redirect the browser to the actual avatar.

For instance, if the real avatar is stored at https://linkedin.com/avatars/40bd001563085fc35165, your website will store this URI only once, in the user's document, associated with user 123. Everywhere in the user interface, i.e. in all entities such as comments, the avatar will be implemented like this:

<img src="https://example.com/user/123/avatar" alt="..." />

During an HTTP request to https://example.com/user/123/avatar, the server will load the stored URI and respond with:

HTTP/1.1 302 Found
Location: https://linkedin.com/avatars/40bd001563085fc35165

which would effectively force the browser to show the correct image.
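The redirect route described above could be sketched in Python as follows. This is only an illustration using the standard library's http.server; the AVATAR_URIS dictionary is a hypothetical stand-in for the users collection, and resolve_avatar and AvatarHandler are names invented for the example.

```python
import re
from http.server import BaseHTTPRequestHandler

# Hypothetical stand-in for the users collection: user id -> stored avatar URI.
AVATAR_URIS = {"123": "https://linkedin.com/avatars/40bd001563085fc35165"}

AVATAR_ROUTE = re.compile(r"^/user/(?P<user_id>[^/]+)/avatar$")


def resolve_avatar(path: str):
    """Return (status code, headers) for a request to the avatar route."""
    match = AVATAR_ROUTE.match(path)
    uri = AVATAR_URIS.get(match.group("user_id")) if match else None
    if uri is None:
        return 404, {}
    # 302 (temporary) rather than 301, so a changed avatar is picked up.
    return 302, {"Location": uri}


class AvatarHandler(BaseHTTPRequestHandler):
    """Minimal handler that answers GET requests with the redirect."""

    def do_GET(self) -> None:
        status, headers = resolve_avatar(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

To try it locally, the handler can be passed to http.server.HTTPServer, for example HTTPServer(("localhost", 8000), AvatarHandler).serve_forever(). Keeping the lookup in a plain function makes the routing logic easy to test without starting a server.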

Notes:

In terms of performance, there shouldn't be too many issues. The request is relatively fast to process and uses only marginal bandwidth (unlike serving images yourself).
It is essential to use HTTP 302 and not HTTP 301; otherwise, users who changed avatars will sometimes continue to see the old avatar, possibly for a long time.
Proper client-side caching can be implemented to prevent the browser from requesting the same avatar over and over (usually, when changing an avatar, one wouldn't be surprised to still see the old one for several minutes on some sites).

Note that if you're experiencing this difficulty with the avatar, you'll probably have the same issue with other pieces of information as well, due to improper normalization/denormalization. While some NoSQL databases encourage you to duplicate data in order to make queries faster and guarantee data consistency within a document, this comes at a cost of not being able to easily change the data scattered all over your database. Therefore:

Make sure you understand when to duplicate data and where to reference a single piece of data from other documents.
When applicable, rely on techniques such as the one I presented here, which make it possible to store a piece of data once, while not referencing it directly in other documents. For instance, on Stack Exchange, the zone which displays a user shows not only the user's name and avatar, but also the badges, the geographical location, the link to the website and a signature. Since those data are not crucial for the website, they can be queried through AJAX after the page is loaded, meaning that a question/answer doesn't need to contain this information or link to it in any way.
