CQRS – Where to Place Validation Logic?

c# domain-driven-design

I asked this question a while ago: What is the "best" way to approach validation from the perspective of a DDD purist?

At the time I decided to put the validation logic inside the domain object constructor like one of the answerers suggested.

The business have recently become interested in how data has arrived at its current state. Therefore I am familiarising myself with Event Sourcing and also CQRS. I am unsure where to put the validation logic when using CQRS. For example, please see the Command object below:

public class RequestBookingCommand : Command
{
    public RequestBookingCommand(int courtId, int hour, int length, string userName, string notes)
    {
        Name = "AddBooking";
        CourtId = courtId;
        Hour = hour;
        Length = length;
        UserName = userName;
        Notes = notes;
    }

    public int CourtId { get; private set; }
    public int Hour { get; private set; }
    public int Length { get; private set; }
    public string UserName { get; private set; }
    public string Notes { get; private set; }
}

There is also a domain object that accepts the RequestBookingCommand in the constructor like this:

public RequestBooking(RequestBookingCommand requestBookingCommand)
{
    Name = requestBookingCommand.Name;
    CourtId = requestBookingCommand.CourtId;
    Hour = requestBookingCommand.Hour;
    Length = requestBookingCommand.Length;
    UserName = requestBookingCommand.UserName;
    Notes = requestBookingCommand.Notes;
}

I believe I have three options for the validation:

Option 1 - Put the validation in the Command object only.
Option 2 - Put the validation in the Domain object only.
Option 3 - Put the validation in both the Command object and the Domain object.

An example of validation is (it is simple):

if (CourtId == 0)
{
    throw new ArgumentException();
}
//Ensure the rest of the members are populated with values.....

At first I thought option 2 or 3 was the most appropriate because the Domain object should be responsible for maintaining the invariants and also the business analysts have requested this validation. However everywhere I look suggests that the validation logic should go in the Command object e.g. here and here. Steven also suggests it here.

It appears to me that putting the validation logic in a command is like putting it in the application service (making the domain model anemic). Is that not the case?

I realise my application will work regardless of what option I choose. However, I am trying to follow the principle of least astonishment here.

Update

Following on from VoiceOfUnreason's answer. Say I have a class called Customer and also a class called CustomerCommand:

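using System;
using System.Collections.Generic;

public class CustomerCommand
{
    public Guid Id;
    public int Age;

    public CustomerCommand(Guid id, int age)
    {
        // Well-formedness checks on the message itself.
        if (age < 0)
        {
            throw new ArgumentException("Age cannot be negative.", nameof(age));
        }
        if (id == Guid.Empty)
        {
            throw new ArgumentException("Id must not be empty.", nameof(id));
        }
        Id = id;
        Age = age;
    }
}

public class Customer
{
    public Guid Id;
    public decimal Age;
    public IOffer Offer;

    public Customer(Guid id, decimal age)
    {
        // Domain rule: customers under 18 are rejected.
        if (age < 18)
        {
            throw new ArgumentException("Customer must be at least 18.", nameof(age));
        }
        Id = id;
        Age = age;
    }

    public void AssignOffer(List<IOffer> allOffers, IOfferCalculator offerCalculator)
    {
        Offer = offerCalculator.calculateOffer(this, allOffers);
    }
}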

Say the company only creates an offer for customers over the age of 18. In this example code I believe:

1) The command is validating that it is well formed e.g. a person cannot be aged -50 because this is impossible.

2) Given the customer is over 18, then assign an offer.

3) If the customer is over 18, then the domain object can satisfy all queries.

Does this satisfy your three parts?

Best Answer

One area that I found very confusing in the early going is that messages and domain values are not really the same thing.

In particular, messages are part of your API - they describe an agreed-upon protocol/representation for communicating semantic information across the boundary of the domain model. In situations where different organizations are sharing messages (for example, a client sending messages to a server), message schemas tend to change slowly, and with a great deal of care for backwards compatibility.

Value objects are not part of your API; they are domain semantics applied to data structures. So you can change those as aggressively as you like - because ultimately, the domain objects don't leak out of the boundaries of your service.

Over the lifetime of a successful project, you are likely to change your value objects much more often than you change your messages.

If you are familiar with Gary Bernhardt's talk Boundaries: messages are how we replicate information from one imperative shell to another. Our imperative shell turns the message into values which are used by the functional core.

Microservices don't share value objects. They share message schemas - that allows each team to change its domain model as much as it likes without forcing cascading maintenance across the entire system.

That means we might be expecting the version 2.1 representation of a message, but we are expected to be able to handle 2.0 (backward compatibility) and 2.2 (forward compatibility).

Note: messages are not limited to a single representation either. You might have one channel listening for messages expressed as byte arrays that can be decoded using a JSON parser, and another channel listening for messages expressed as byte arrays that are application/x-www-form-urlencoded.

It's our adapter that is responsible for ensuring that the bytes we have received can be sensibly understood as a message of the appropriate schema. Prior to that point, the data is untrusted -- so you certainly want to have validation in place as you take the domain agnostic bytes and create an in memory representation of the message semantics.

Converting the message to a value object - you are really going from one trusted representation to another. There's a lot less risk there; faults that are introduced at this point are much more likely to be programmer error than they are malicious attacks.

But there are likely to be cases where you aren't working with your adapter. For instance, the tests that you run between changes to ensure you haven't broken anything -- do you need validation there?

Furthermore, the concept of "untrusted" data gets fuzzier when dealing with standalone applications. For example, does your desktop calculator really need separate message and value object models?

Domain Modeling Made Functional has a lot of good material on this topic; but in short you can treat validation as a pipeline, and make "this data is untrusted" an explicit concept in your program.

But while "trust" is an important element of some domains, the dangers of malicious byte representations of messages are part of the "secure software domain", and should be modeled there - not in the core business logic (as a rule).

To the best of my knowledge, this isn't a topic well covered in the Blue Book -- I simply can't remember any detailed discussions of the problems of distributed messaging.

Are you saying that if I remove primitive obsession, then all of the validation is done in the domain? Is there any validation done on the command?

No. I'm saying that you have two separate logical translations

Untrusted Message Representation -> Trusted Message
Trusted Message -> Domain Values

To offer a contrived example: your message specification may say that Estimated Arrival Time is an ISO-8601 Local Date, where your domain model tracks that information as a System.DateTime.

Untrusted Representation -> ISO Local Date
ISO Local Date -> System.DateTime

Later on in the project, you realize that tracking Estimated Arrival Time as "seconds since epoch" meets your needs better, so you change the value objects in your domain to use the new in-memory representation.

Untrusted Representation -> ISO Local Date
ISO Local Date -> long

And so your Value Object changes, and the adapter that translates from Trusted Message to Value Object changes.

But the message schema that you use to communicate with other processes stays the same.
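The two translations above can be sketched roughly as follows, in Java to match the snippets further down. The class names ArrivalMessage and EstimatedArrival are hypothetical, and reading a local date as midnight UTC is an assumption made only so the example has a definite answer:

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Trusted message (hypothetical name): the schema is assumed to say the
// arrival time is an ISO-8601 local date such as "2024-03-15".
final class ArrivalMessage {
    final LocalDate estimatedArrival;

    private ArrivalMessage(LocalDate estimatedArrival) {
        this.estimatedArrival = estimatedArrival;
    }

    // Translation 1: untrusted representation -> trusted message.
    // Returns empty when the input does not match the schema.
    static Optional<ArrivalMessage> parse(String untrusted) {
        if (untrusted == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(new ArrivalMessage(LocalDate.parse(untrusted)));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}

// Domain value (hypothetical name): free to change its in-memory
// representation without touching the message schema.
final class EstimatedArrival {
    final long secondsSinceEpoch;

    private EstimatedArrival(long secondsSinceEpoch) {
        this.secondsSinceEpoch = secondsSinceEpoch;
    }

    // Translation 2: trusted message -> domain value.
    // Assumption: a local date is read as midnight UTC.
    static EstimatedArrival from(ArrivalMessage message) {
        long seconds = message.estimatedArrival
                .atStartOfDay(ZoneOffset.UTC)
                .toEpochSecond();
        return new EstimatedArrival(seconds);
    }
}
```

If the domain later wants a different representation again, only EstimatedArrival and its from method should need to change; ArrivalMessage.parse, and therefore the schema other processes rely on, stays as it is.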

The motivation for "validation" in the two cases is very different. In the first, we are using run time checks to protect the system from corrupt data, malicious data, and messaging errors by other parties (our partner sends a message to the wrong endpoint, a message gets misdirected by the routing middleware).

In the second, we're guarding against errors being introduced by the impedance mismatch between our domain semantics and the domain agnostic in memory representations that we use.

So, in short: ideally, you should have "validation" in both your messages and in your value objects. But the trade-offs in the two cases are not the same, and there are circumstances where you might reasonably choose one without the other.

Every single example I look at has only context validation.

Not a fan - I'm more comfortable with a different type in each context:

  • I prefer having the differences in semantics between validated and unvalidated representations be explicit in code
  • I often program in languages with type checking; using explicit types catches some of the errors I make

I don't mind re-using the same underlying data structure at different stages -- the expression of the semantics in memory is an implementation detail.

Note that to some degree, it's a quibble between

class UntrustedMessage {
    boolean isValid();
}

which I hate, and the almost equivalent

class UntrustedMessage {
    Optional<TrustedMessage> validate();
}
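To make the second form concrete, here is a minimal sketch. The fields (courtId, hour) are borrowed from the question, and the schema rules (a positive court id, an hour between 0 and 23) are assumptions chosen purely for illustration:

```java
import java.util.Optional;

// A message whose fields satisfy the (assumed) schema.
final class TrustedMessage {
    final int courtId;
    final int hour;

    // Package-private: only the adapter code in this package builds one.
    TrustedMessage(int courtId, int hour) {
        this.courtId = courtId;
        this.hour = hour;
    }
}

// Raw fields exactly as they arrived; nothing has been checked yet.
final class UntrustedMessage {
    private final String courtId;
    private final String hour;

    UntrustedMessage(String courtId, String hour) {
        this.courtId = courtId;
        this.hour = hour;
    }

    // Either produces a TrustedMessage or nothing. There is no state in
    // which a caller holds a "valid" flag and unchecked data side by side.
    Optional<TrustedMessage> validate() {
        try {
            int court = Integer.parseInt(courtId);
            int h = Integer.parseInt(hour);
            if (court <= 0 || h < 0 || h > 23) {
                return Optional.empty();
            }
            return Optional.of(new TrustedMessage(court, h));
        } catch (NumberFormatException e) {
            // Covers null and non-numeric input.
            return Optional.empty();
        }
    }
}
```

Code further in can then declare that it accepts a TrustedMessage, and the compiler rejects any attempt to hand it an UntrustedMessage. With isValid(), forgetting the check compiles fine.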

As a thumb rule: you use validation (in the command) to ensure the input data is in valid format, then the business rules (in the domain) decide how/if the input changes a model

This isn't quite right. I see three parts, rather than two

  • Is the "command" message well formed? This is basically a validation check against the messaging schema. Is Withdraw(200 USD) a correct message?

  • Given the current state of the domain, what should we do with this message? That's a business rule, it's the semantics of what's supposed to happen. I think of this as part of the problem domain. Somebody withdrew 200 USD - do they have funds to cover that? do we need to escalate? notify government authorities? offer them overdraft protection? All of this is variations on "figure out how this message changes the current domain state".

But the part that's missing, and the one I'm trying to demonstrate is separate from message validation, is this:

  • Are the domain objects in a valid state? Which is to say, are they in a condition that allows them to satisfy the postconditions of all queries that might be asked of them?
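As a rough illustration of where each of the three checks might live, using the withdrawal example: the names Withdraw and Account, the no-overdraft rule, and the use of integer cents are all assumptions made for this sketch, and currency handling inside the account is left out.

```java
// Part 1: is the command message well formed? A schema-level check only.
final class Withdraw {
    final long amountCents;
    final String currency;

    Withdraw(long amountCents, String currency) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (currency == null || currency.length() != 3) {
            throw new IllegalArgumentException("currency must be a 3-letter code");
        }
        this.amountCents = amountCents;
        this.currency = currency;
    }
}

final class Account {
    private long balanceCents;

    Account(long balanceCents) {
        // Part 3: the domain object refuses to exist in an invalid state,
        // so queries such as balanceCents() can keep their postconditions.
        if (balanceCents < 0) {
            throw new IllegalArgumentException("balance cannot be negative");
        }
        this.balanceCents = balanceCents;
    }

    // Part 2: business rule. Given the current state, what does this
    // message do? Assumption: no overdraft, so an uncovered withdrawal
    // is declined and the state is left unchanged.
    boolean apply(Withdraw command) {
        if (command.amountCents > balanceCents) {
            return false;
        }
        balanceCents -= command.amountCents;
        return true;
    }

    long balanceCents() {
        return balanceCents;
    }
}
```

Note that a declined withdrawal is not a validation failure: the message was perfectly well formed, and the domain simply decided not to act on it.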

Crazy thought experiment - suppose our domain model had SmallMoney and BigMoney; we've discovered in our experience of the domain that there are different processes in place for small amounts of money vs large amounts -- you can get SmallMoney from an ATM, or a cashier, but BigMoney requires talking to a bank officer, or whatever. So SmallMoney, the value object, has a post condition like SmallMoney::toInt returns a number in the range 0-100.

That has nothing to do with the message schema at all, and it is only loosely coupled to the domain logic (which is to say, we're pretending that SmallMoney means something to the domain experts), but we use validation on the object so that we can do it in one place, rather than everywhere/not at all.
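A minimal sketch of such a value object, assuming whole units and the 0-100 range from the thought experiment (the of and plus methods are hypothetical additions for illustration):

```java
// Value object whose invariant is checked in exactly one place.
final class SmallMoney {
    private final int amount;

    private SmallMoney(int amount) {
        this.amount = amount;
    }

    // The only way to obtain a SmallMoney, so every instance is in range.
    static SmallMoney of(int amount) {
        if (amount < 0 || amount > 100) {
            throw new IllegalArgumentException(
                    "SmallMoney must be between 0 and 100, got " + amount);
        }
        return new SmallMoney(amount);
    }

    // Postcondition: the result is always in the range 0-100.
    int toInt() {
        return amount;
    }

    // Operations go back through of(), so the invariant survives arithmetic.
    SmallMoney plus(SmallMoney other) {
        return of(amount + other.amount);
    }
}
```

Callers of toInt never need to re-check the range, whichever message or code path produced the value.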