CQRS Command – How to Validate and Transform to Domain Object

Tags: business-logic, cqrs, domain-driven-design

I have been using poor-man's CQRS [1] for quite some time now because I love its flexibility: granular data lives in one data store, which opens up great possibilities for analysis and thus increases business value, and, when needed, a second store holds denormalized data for faster reads.

Unfortunately, pretty much from the beginning I have been struggling with the question of where exactly to place business logic in this type of architecture.

From what I understand, a command is a means to communicate intent and does not have ties to a domain by itself. Commands are basically (dumb, if you wish) data transfer objects, which makes them easy to pass between different technologies. The same applies to events, which are responses to successfully completed commands.

In a typical DDD application the business logic resides within entities, value objects, and aggregate roots, which are rich in both data and behavior. But a command is not a domain object, so it should not be limited to domain representations of data, because that puts too much strain on it.

So the real question is: Where exactly is the logic?

I have found that I face this struggle most often when trying to construct a fairly complicated aggregate that sets rules about combinations of its values. Also, when modeling domain objects I like to follow the fail-fast paradigm: when an object reaches a method, I know it is in a valid state.

Let's say an aggregate Car uses two components:

  • Transmission,
  • Engine.

Both the Transmission and Engine value objects are represented as supertypes with corresponding subtypes: Automatic and Manual transmissions, and Petrol and Electric engines, respectively.

In this domain, a successfully created Transmission (Automatic or Manual) or Engine (Petrol or Electric) is completely fine on its own. But the Car aggregate introduces a few new rules that apply only when Transmission and Engine objects are used in the same context. Namely:

  • When a car uses an Electric engine, the only allowed transmission type is Automatic.
  • When a car uses a Petrol engine, it may have either type of Transmission.
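
To make the two rules concrete, here is a minimal Python sketch of how such a combination check might look if it lived in the aggregate itself. It is an illustration only: the enums stand in for the supertype/subtype hierarchy described above, and InvalidCarConfiguration is a hypothetical name.

```python
from dataclasses import dataclass
from enum import Enum


class Engine(Enum):
    # Simplification (assumption): enums instead of Petrol/Electric subtypes.
    PETROL = "petrol"
    ELECTRIC = "electric"


class Transmission(Enum):
    AUTOMATIC = "automatic"
    MANUAL = "manual"


class InvalidCarConfiguration(ValueError):
    """Hypothetical error for an engine/transmission pair the Car rejects."""


@dataclass(frozen=True)
class Car:
    engine: Engine
    transmission: Transmission

    def __post_init__(self) -> None:
        # Each component is valid on its own; only the combination is checked.
        if (
            self.engine is Engine.ELECTRIC
            and self.transmission is not Transmission.AUTOMATIC
        ):
            raise InvalidCarConfiguration(
                "An electric engine requires an automatic transmission."
            )
```

With this shape a Car instance cannot exist in an invalid combination, which is the fail-fast property mentioned earlier.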

I could catch this invalid combination of components when the command is created, but as I have stated before, from what I understand that should not be done, because the command would then contain business logic that should be limited to the domain layer.

One option is to move this business rule into the command validator itself, but this does not seem right either. It feels like I would be deconstructing the command: reading its properties through getters, comparing them within the validator, and inspecting the results. That screams violation of the Law of Demeter to me.

Discarding that option because it does not seem viable, it seems one should take the command and construct the aggregate from it. But where should this logic exist? Should it be within the command handler responsible for handling the concrete command? Or should it perhaps be within the command validator (I don't like this approach either)?

Currently I take the command and create the aggregate from it within the responsible command handler. But if I do this and also have a command validator, the validator would contain nothing at all: if a CreateCar command exists, it contains components that I know are valid individually, yet the aggregate may still reject the combination.


Let's imagine a different scenario mixing different validation processes – creating a new user using a CreateUser command.

The command contains the Id of the user to be created and their Email.

The system states the following rules for user's email address:

  • must be unique,
  • must not be empty,
  • must have at most 100 characters (max length of a db column).

In this case, even though having a unique email is a business rule, checking it in an aggregate makes very little sense, because I would need to load the entire set of emails currently in the system into memory and check the email in the command against it (Eeeek! Something, something, performance.). Because of that, I would move this check to the command validator, which would take UserRepository as a dependency and use the repository to check whether a user with the email present in the command already exists.
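
A minimal Python sketch of such a validator might look like the following. The method name exists_with_email and the in-memory repository are assumptions made for illustration; a real repository would presumably run a single indexed lookup instead of holding emails in memory.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CreateUser:
    user_id: str
    email: str


class UserRepository(Protocol):
    # Hypothetical query method; the name is an assumption.
    def exists_with_email(self, email: str) -> bool: ...


class InMemoryUserRepository:
    """Stand-in used only to keep this sketch runnable."""

    def __init__(self, emails: set[str]) -> None:
        self._emails = set(emails)

    def exists_with_email(self, email: str) -> bool:
        return email in self._emails


class CreateUserValidator:
    def __init__(self, users: UserRepository) -> None:
        self._users = users

    def validate(self, command: CreateUser) -> list[str]:
        # Only the cross-aggregate rule is checked here.
        errors: list[str] = []
        if self._users.exists_with_email(command.email):
            errors.append("email is already in use")
        return errors
```

Usage: `CreateUserValidator(repo).validate(command)` returns an empty list when the email is free, and a list of messages otherwise.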

At that point it suddenly seems sensible to put the other two email rules in the command validator as well. But I have a feeling those rules really belong within the User aggregate, and that the command validator should only check uniqueness. If validation succeeds, I would then create the User aggregate in the CreateUserCommandHandler and pass it to a repository to be saved.

I feel like this because the repository's save method is likely to accept an aggregate which ensures that once the aggregate is passed all invariants are fulfilled. When the logic (e.g. the non-emptiness) is only present within the command validation itself another programmer could completely skip this validation and call the save method in the UserRepository with a User object directly which could lead to a fatal database error, because the email might have been too long.
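
One way to get that guarantee, sketched in Python under the assumption that the email is modeled as its own value object (Email, User and MAX_EMAIL_LENGTH are illustrative names, not part of any framework):

```python
from dataclasses import dataclass

# Assumption: mirrors the 100-character column limit from the rules above.
MAX_EMAIL_LENGTH = 100


@dataclass(frozen=True)
class Email:
    value: str

    def __post_init__(self) -> None:
        # The value object refuses to exist in an invalid state.
        if not self.value.strip():
            raise ValueError("email must not be empty")
        if len(self.value) > MAX_EMAIL_LENGTH:
            raise ValueError(
                f"email must have at most {MAX_EMAIL_LENGTH} characters"
            )


@dataclass(frozen=True)
class User:
    user_id: str
    email: Email
```

Because a User can only be built from an Email that already passed these checks, a save method that accepts a User never sees an empty or over-long address, whoever calls it.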

How do you personally handle these complex validations and transformations? I am mostly happy with my solution, but I would like some confirmation that my ideas and approaches are not completely stupid before I fully commit to them. I am entirely open to completely different approaches. If you have something you have personally tried that worked very well for you, I would love to see your solution.


[1] Working as a PHP developer responsible for creating RESTful systems, my interpretation of CQRS deviates a little from the standard async-command-processing approach, such as sometimes returning results from commands due to the need to process commands synchronously.

Best Answer

The following answer is in the context of the CQRS style promoted by cqrs.nu, in which commands arrive directly at the aggregates. In this architectural style the application services are replaced by an infrastructure component (the CommandDispatcher) that identifies the aggregate, loads it, sends it the command and then persists the aggregate (as a series of events if event sourcing is used).

> So the real question is: Where exactly is the logic?

There are multiple kinds of (validation) logic. The general idea is to execute each check as early as possible (fail fast, if you will). So, the situations are as follows:

  • the structure of the command object itself: the command's constructor has required fields that must be present for the command to be created. This is the first and fastest validation, and it is obviously contained in the command.
  • low-level field validation, like the non-emptiness of some fields (like the username) or the format (a valid email address). This kind of validation should be contained inside the command itself, in the constructor. There is another style that uses an isValid method, but it seems pointless to me: someone would have to remember to call that method, when in fact successful command instantiation should suffice.
  • separate command validators, classes that have the responsibility to validate a command. I use this kind of validation when I need to check information from multiple aggregates or external sources. You could use this to check the uniqueness of a username. Command validators can have any dependencies injected, like repositories. Keep in mind that this validation is only eventually consistent with the aggregate (i.e. while the user is being created, another user with the same username could be created in the meantime)! Also, do not try to put logic here that should reside inside the aggregate! Command validators are different from the Sagas/Process managers, which generate commands based on events.
  • the aggregate methods that receive and process the commands. This is the last kind of validation that occurs. The aggregate extracts the data from the command and, using some core business logic, accepts the command (and changes its state) or rejects it. This logic is checked in a strongly consistent manner. This is the last line of defense. In your example, the rule "When a car uses Electric engine the only allowed transmission type is Automatic" should be checked here.
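
The first two kinds of validation can be sketched in a few lines of Python. This is only an illustration: RegisterUser is a hypothetical command, and the email pattern is a deliberately loose shape check, not a full address validator.

```python
import re
from dataclasses import dataclass

# Assumption: "something@something.tld" is enough for a format check here.
_EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass(frozen=True)
class RegisterUser:
    # Structure: all three fields are required to build the command.
    user_id: str
    username: str
    email: str

    def __post_init__(self) -> None:
        # Low-level field checks run at construction, so no isValid() call
        # is needed afterwards.
        if not self.user_id:
            raise ValueError("user_id is required")
        if not self.username.strip():
            raise ValueError("username must not be empty")
        if not _EMAIL_SHAPE.match(self.email):
            raise ValueError("email is not well formed")
```

If `RegisterUser(...)` returns at all, the command is structurally sound; rules that need other aggregates or the aggregate's own state are left to the later stages.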

> I feel like this because the repository's save method is likely to accept an aggregate which ensures that once the aggregate is passed all invariants are fulfilled. When the logic (e.g. the non-emptiness) is only present within the command validation itself another programmer could completely skip this validation and call the save method in the UserRepository with a User object directly which could lead to a fatal database error, because the email might have been too long.

Using the above techniques, nobody can create invalid commands or bypass the logic inside the aggregates. Command validators are automatically loaded and called by the CommandDispatcher, so nobody can send a command directly to the aggregate. One could call a method on the aggregate and pass it a command, but could not persist the changes, so doing so would be pointless and harmless.
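
A rough Python sketch of that flow, to show the ordering only. Everything here is hypothetical: the RenameUser command, the rule inside the aggregate, and a plain dict standing in for real persistence. It is not how any particular framework implements its dispatcher.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RenameUser:
    user_id: str
    new_name: str


class User:
    def __init__(self, user_id: str, name: str) -> None:
        self.user_id = user_id
        self.name = name

    def handle_rename_user(self, command: RenameUser) -> None:
        # Illustrative core rule, checked against the aggregate's own state.
        if command.new_name == self.name:
            raise ValueError("new name must differ from the current one")
        self.name = command.new_name


class CommandRejected(Exception):
    pass


Validator = Callable[[RenameUser], "list[str]"]


class CommandDispatcher:
    def __init__(self, store: dict[str, User], validators: list[Validator]) -> None:
        self._store = store  # assumption: a dict instead of a real repository
        self._validators = validators

    def dispatch(self, command: RenameUser) -> None:
        # 1. Validators always run first; callers cannot skip them.
        for validate in self._validators:
            errors = validate(command)
            if errors:
                raise CommandRejected("; ".join(errors))
        # 2. Load the aggregate, 3. let it handle the command, 4. persist it.
        user = self._store[command.user_id]
        user.handle_rename_user(command)
        self._store[command.user_id] = user
```

The only path to persistence goes through dispatch, so both the validators and the aggregate's own rule run on every command.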

> Working as a PHP developer responsible for creating RESTful systems my interpretation of CQRS deviates a little from the standard async-command-processing approach, such as sometimes returning results from commands due to the need of processing commands synchronously.

I'm also a PHP programmer and I don't return anything from my command handlers (aggregate methods of the form handleSomeCommand). I do, however, quite often return information to the client/browser in the HTTP response, for example the ID of the newly created aggregate root or something from a read model, but I never (really never) return anything from my aggregate command methods. The simple fact that the command was accepted (and processed: we are talking about synchronous PHP processing, right?!) is sufficient.

We can return something to the browser and still be doing CQRS by the book, because CQRS is not a high-level architecture.

An example of how command validators work:

[Figure: a command's path through command validators on its way to the aggregate]