C# – Using struct to enforce validation of built-in type


Commonly, domain objects have properties that can be represented by a built-in type but whose valid values are only a subset of the values that type can represent.

In these cases, the value can be stored using the built-in type, but it is necessary to ensure values are always validated at the point of entry; otherwise we might end up working with an invalid value.

One way to solve this is to store the value as a custom struct which has a single private readonly backing field of the built-in type and whose constructor validates the provided value. We can then always be sure of only using validated values by using this struct type.

We can also provide cast operators from and to the underlying built-in type so that values can seamlessly enter and exit as the underlying type.

Take as an example a situation where we need to represent the name of a domain object, and valid values are any string which is between 1 and 255 characters in length inclusive. We could represent this using the following struct:

using System;

// Wraps a string that has been checked to be 1 to 255 characters long.
public struct ValidatedName : IEquatable<ValidatedName>
{
    private readonly string _value;

    // Private: callers must go through the validating explicit cast below.
    private ValidatedName(string name)
    {
        _value = name;
    }

    public static bool IsValid(string name)
    {
        return !String.IsNullOrEmpty(name) && name.Length <= 255;
    }

    public bool Equals(ValidatedName other)
    {
        return _value == other._value;
    }

    public override bool Equals(object obj)
    {
        if (obj is ValidatedName)
        {
            return Equals((ValidatedName)obj);
        }
        return false;
    }

    public static implicit operator string(ValidatedName x)
    {
        return x.ToString();
    }

    public static explicit operator ValidatedName(string x)
    {
        if (IsValid(x))
        {
            return new ValidatedName(x);
        }
        throw new InvalidCastException();
    }

    public static bool operator ==(ValidatedName x, ValidatedName y)
    {
        return x.Equals(y);
    }

    public static bool operator !=(ValidatedName x, ValidatedName y)
    {
        return !x.Equals(y);
    }

    public override int GetHashCode()
    {
        // _value is null for default(ValidatedName), so guard against it.
        return _value == null ? 0 : _value.GetHashCode();
    }

    public override string ToString()
    {
        return _value;
    }
}

The example makes the to-string cast implicit, because it can never fail, and the from-string cast explicit, because it throws for invalid values. Either cast could of course be declared implicit or explicit instead.

Note also that, apart from the default value (default(ValidatedName), whose backing field is null and which bypasses validation), the only way to create an instance of this struct is a cast from string. One can test in advance whether such a cast will fail by using the static IsValid method.
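As a rough sketch of how a call site might combine IsValid with the explicit cast: the block below uses a trimmed copy of the struct (equality members left out for brevity) and a NameParser.TryParse helper that is hypothetical and not part of the original code.

```csharp
using System;

// Trimmed copy of the ValidatedName struct from the question; the equality
// members are omitted only to keep this sketch short.
public struct ValidatedName
{
    private readonly string _value;

    private ValidatedName(string name)
    {
        _value = name;
    }

    public static bool IsValid(string name)
    {
        return !String.IsNullOrEmpty(name) && name.Length <= 255;
    }

    public static implicit operator string(ValidatedName x)
    {
        return x._value;
    }

    public static explicit operator ValidatedName(string x)
    {
        if (IsValid(x))
        {
            return new ValidatedName(x);
        }
        throw new InvalidCastException();
    }
}

// Hypothetical helper (an assumption, not in the original): checks first and
// casts second, so invalid input is reported without an exception.
public static class NameParser
{
    public static bool TryParse(string input, out ValidatedName name)
    {
        if (ValidatedName.IsValid(input))
        {
            name = (ValidatedName)input;   // cannot throw: already checked
            return true;
        }
        name = default(ValidatedName);     // placeholder; callers should ignore it
        return false;
    }
}
```

Code that receives untrusted input would call NameParser.TryParse, while code that already holds a ValidatedName can pass it anywhere a string is expected through the implicit cast.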

This would seem to be a good pattern to enforce validation of domain values which can be represented by simple types, but I don't see it used often or suggested and I'm interested as to why.

So my question is: what do you see as being the advantages and disadvantages of using this pattern, and why?

If you feel that this is a bad pattern, I would like to understand why and also what you feel is the best alternative.

NB I originally asked this question on Stack Overflow but it was put on hold as primarily opinion-based (ironically subjective in itself) – hopefully it can enjoy more success here.

Above is the original text, below a couple more thoughts, partly in response to the answers received there before it went on hold:

  • One of the major points made by the answers concerned the amount of boilerplate code the above pattern requires, especially when many such types are needed. In defence of the pattern, however, this could be largely automated using templates, and to me it doesn't seem too bad anyway, but that is just my opinion.
  • From a conceptual point of view, does it not seem strange when working with a strongly-typed language such as C# to only apply the strongly-typed principle to composite values, rather than extending it to values which can be represented by an instance of a built-in type?
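On the boilerplate point, one possible way to write the wrapper only once is a generic struct parameterised by a rule type. This is a different technique from the code templates mentioned above, and every name in it (IValidator, Validated, NameRule) is an assumption made for illustration. Equality members are omitted to keep the sketch short.

```csharp
using System;

// Hypothetical rule interface: one small struct per domain constraint.
public interface IValidator<T>
{
    bool IsValid(T value);
}

// Generic wrapper written once; TValidator supplies the validation rule.
public struct Validated<T, TValidator>
    where TValidator : struct, IValidator<T>
{
    private readonly T _value;

    private Validated(T value)
    {
        _value = value;
    }

    public static bool IsValid(T value)
    {
        // The validator is a stateless struct, so its default value is usable.
        return default(TValidator).IsValid(value);
    }

    public static implicit operator T(Validated<T, TValidator> x)
    {
        return x._value;
    }

    public static explicit operator Validated<T, TValidator>(T value)
    {
        if (IsValid(value))
        {
            return new Validated<T, TValidator>(value);
        }
        throw new InvalidCastException();
    }
}

// The rule from the question: a string of 1 to 255 characters.
public struct NameRule : IValidator<string>
{
    public bool IsValid(string value)
    {
        return !String.IsNullOrEmpty(value) && value.Length <= 255;
    }
}
```

With this in place, a name would be written as (Validated<string, NameRule>)"Alice", and each further domain type costs only one small rule struct. The trade-off is a noisier type name at every use site.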

Best Answer

This is fairly common in ML-style languages like Standard ML/OCaml/F#/Haskell where it's much easier to create the wrapper types. It provides you with two benefits:

  • It allows a piece of code to enforce that a string has undergone validation, without having to take care of that validation itself.
  • It allows you to localize the validation code in one place. If a ValidatedName ever contains an invalid value, you know the error is in the IsValid method.

If you get the IsValid method right, you have a guarantee that any function that receives a ValidatedName is in fact receiving a validated name.

If you need to do string manipulations you can add a public method that accepts a function that takes a String (the value of the ValidatedName) and returns a String (the new value) and validates the result of applying the function. That eliminates the boilerplate of getting the underlying String value and re-wrapping it.
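A minimal sketch of such a method, assuming a trimmed copy of the struct from the question and a method name (Map) that is my own choice rather than anything the question defines:

```csharp
using System;

// Trimmed copy of ValidatedName with one added, hypothetical method: Map.
public struct ValidatedName
{
    private readonly string _value;

    private ValidatedName(string name)
    {
        _value = name;
    }

    public static bool IsValid(string name)
    {
        return !String.IsNullOrEmpty(name) && name.Length <= 255;
    }

    public static explicit operator ValidatedName(string x)
    {
        if (IsValid(x))
        {
            return new ValidatedName(x);
        }
        throw new InvalidCastException();
    }

    public override string ToString()
    {
        return _value;
    }

    // Applies transform to the wrapped string and re-validates the result,
    // so the caller never handles an unvalidated intermediate value.
    public ValidatedName Map(Func<string, string> transform)
    {
        if (transform == null)
        {
            throw new ArgumentNullException("transform");
        }
        return (ValidatedName)transform(_value);
    }
}
```

A caller could then write name.Map(s => s.Trim()) and get back either a new ValidatedName or an InvalidCastException if the transformed string no longer satisfies the rule.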

A related use for wrapping values is to track their provenance. For example, C-based OS APIs sometimes hand out resource handles as plain integers. You can wrap those APIs so that they use a Handle structure instead, and make its constructor accessible only to the wrapping code. If the code that produces the Handles is correct, then only valid handles will ever be used.
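The handle idea might look roughly like this in C#. ResourceApi is a hypothetical stand-in for a wrapped C-style API, with a dictionary playing the part of the operating system. The sketch assumes that Handle and the wrapper live in their own assembly, so that the internal constructor is out of reach of client code.

```csharp
using System;
using System.Collections.Generic;

// Opaque wrapper around a raw integer handle.
public struct Handle
{
    private readonly int _raw;

    // internal: only code in this assembly (the API wrapper) can create one.
    internal Handle(int raw)
    {
        _raw = raw;
    }

    internal int Raw
    {
        get { return _raw; }
    }
}

// Hypothetical wrapper; the dictionary simulates the OS resource table.
public static class ResourceApi
{
    private static readonly Dictionary<int, string> Table =
        new Dictionary<int, string>();
    private static int _next = 1; // 0 is never issued, so default(Handle) is unknown

    public static Handle Open(string resourceName)
    {
        int raw = _next++;
        Table[raw] = resourceName;
        return new Handle(raw);
    }

    // Returns the resource name, or null if the handle is closed or unknown.
    public static string Describe(Handle handle)
    {
        string name;
        return Table.TryGetValue(handle.Raw, out name) ? name : null;
    }

    public static void Close(Handle handle)
    {
        Table.Remove(handle.Raw);
    }
}
```

Client code can pass a Handle around and hand it back to ResourceApi, but it cannot build one from an arbitrary integer. The wrapper still has to cope with default(Handle), which C# always allows.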
