C# – When Should a .NET Class Override Equals()? When Should It Not?

c# equals gethashcode

The VS2005 documentation Guidelines for Overloading Equals() and Operator == (C# Programming Guide) states in part:

Overriding operator == in non-immutable types is not recommended.

The newer .NET Framework 4 documentation Guidelines for Implementing Equals and the Equality Operator (==) omits that statement, although one post in Community Content repeats the assertion and references the older documentation.

It seems reasonable to override Equals() for at least some trivial mutable classes, such as:

public class ImaginaryNumber
{
    public double RealPart { get; set; }
    public double ImaginaryPart { get; set; }
}

In math, two imaginary numbers that have the same real part and the same imaginary part are in fact equal at the point in time that equality is tested. It is incorrect to assert that they are not equal, which is what would happen when comparing separate objects with the same RealPart and ImaginaryPart if Equals() were not overridden.

On the other hand, if one overrides Equals() one should also override GetHashCode(). If an ImaginaryNumber that overrides Equals() and GetHashCode() is placed in a HashSet, and a mutable instance changes its value, that object would no longer be found in the HashSet.
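The lookup failure described above can be reproduced in a few lines. This is a minimal sketch, not a recommended design: the Equals() and GetHashCode() overrides are one assumed implementation (the question does not show any), and HashSetDemo is a hypothetical helper added for illustration.

```csharp
using System;
using System.Collections.Generic;

public class ImaginaryNumber
{
    public double RealPart { get; set; }
    public double ImaginaryPart { get; set; }

    // Assumed value-equality override; the question does not show one.
    public override bool Equals(object obj)
    {
        var other = obj as ImaginaryNumber;
        return other != null
            && RealPart == other.RealPart
            && ImaginaryPart == other.ImaginaryPart;
    }

    // The hash is derived from the same mutable properties that Equals() compares.
    public override int GetHashCode()
    {
        unchecked
        {
            return (RealPart.GetHashCode() * 397) ^ ImaginaryPart.GetHashCode();
        }
    }
}

// Hypothetical helper, for illustration only.
public static class HashSetDemo
{
    // Returns whether the set still finds the instance after it has been mutated.
    public static bool FoundAfterMutation()
    {
        var number = new ImaginaryNumber { RealPart = 1.0, ImaginaryPart = 2.0 };
        var set = new HashSet<ImaginaryNumber> { number };

        number.RealPart = 5.0;        // the hash code changes while the object is in the set

        return set.Contains(number);  // false: the entry was stored under the old hash code
    }
}
```

The object is still physically in the set (set.Count remains 1), but neither the mutated instance nor a fresh instance holding the old values will match it, which is the hazard the older guideline was warning about.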

Was MSDN incorrect to remove the guideline about not overriding Equals() and operator== for non-immutable types?

Is it reasonable to override Equals() for mutable types where "in the real world" equivalence of all properties means that the objects themselves are equal (as with ImaginaryNumber)?

If it is reasonable, how does one best deal with potential mutability while an object instance is participating in a HashSet or something else that relies on GetHashCode() not changing?

UPDATE

Just came across this in MSDN:

Typically, you implement value equality when objects of the type are
expected to be added to a collection of some sort, or when their
primary purpose is to store a set of fields or properties. You can
base your definition of value equality on a comparison of all the
fields and properties in the type, or you can base the definition on a
subset. But in either case, and in both classes and structs, your
implementation should follow the five guarantees of equivalence:

Best Answer

I came to realize that I wanted Equals to mean two different things, depending on the context. After weighing the input here as well as here, I have settled on the following for my particular situation:

I'm not overriding Equals() and GetHashCode(), but rather preserving the common but by no means ubiquitous convention that Equals() means identity equality for classes, and that Equals() means value equality for structs. The largest driver of this decision is the behavior of objects in hashed collections (Dictionary<T,U>, HashSet<T>, ...) if I stray from this convention.

That decision still left me without the concept of value equality (as discussed on MSDN):

When you define a class or struct, you decide whether it makes sense to create a custom definition of value equality (or equivalence) for the type. Typically, you implement value equality when objects of the type are expected to be added to a collection of some sort, or when their primary purpose is to store a set of fields or properties.

A typical case for wanting the concept of value equality (or, as I'm terming it, "equivalence") is in unit tests.

Given

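public class A
{
    // Public so that the test code can set and read them.
    public int P1 { get; set; }
    public int P2 { get; set; }
}

[TestMethod()]
public void ATest()
{
    // Expected values are set through an object initializer.
    A expected = new A { P1 = 42, P2 = 99 };
    A actual = SomeMethodThatReturnsAnA();
    Assert.AreEqual(expected, actual);
}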

the test will fail because Assert.AreEqual() relies on Equals(), which tests reference equality for a class that does not override it.
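A small sketch of why the assertion fails: without an override, Equals() on a class falls back to Object.Equals(), which compares references. The class below mirrors A with public properties (an assumption, so that the initializer compiles), and ReferenceDemo is a hypothetical helper.

```csharp
using System;

public class A
{
    public int P1 { get; set; }
    public int P2 { get; set; }
}

// Hypothetical helper, for illustration only.
public static class ReferenceDemo
{
    // Two distinct instances holding the same values.
    public static bool SameValuesAreEqual()
    {
        var expected = new A { P1 = 42, P2 = 99 };
        var actual = new A { P1 = 42, P2 = 99 };
        return expected.Equals(actual);   // false: two different references
    }

    // One instance compared with itself.
    public static bool SameReferenceIsEqual()
    {
        var a = new A { P1 = 42, P2 = 99 };
        return a.Equals(a);               // true: same reference
    }
}
```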

The unit test certainly could be modified to test each property individually, but that moves the concept of equivalence out of the class into the test code for the class.

To keep that knowledge encapsulated in the class, and to provide a consistent framework for testing equivalence, I defined an interface that my objects implement:

public interface IEquivalence<T>
{
    bool IsEquivalentTo(T other);
}

The implementation typically follows this pattern:

public bool IsEquivalentTo(A other)
{
    // Same instance: trivially equivalent.
    if (object.ReferenceEquals(this, other)) return true;

    if (other == null) return false;

    // Compare the inherited state first (this assumes A derives from SBase).
    bool baseEquivalent = base.IsEquivalentTo((SBase)other);

    return baseEquivalent && this.P1 == other.P1 && this.P2 == other.P2;
}
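To show the pattern above in a compilable context, here is a sketch that fills in the pieces the answer leaves out. SBase is named in the original but never shown, so its body and its Id property are assumptions made up for this example.

```csharp
using System;

public interface IEquivalence<T>
{
    bool IsEquivalentTo(T other);
}

// Hypothetical base class; the original names SBase but does not show it.
public class SBase : IEquivalence<SBase>
{
    public int Id { get; set; }   // assumed property, for illustration only

    public bool IsEquivalentTo(SBase other)
    {
        if (ReferenceEquals(this, other)) return true;
        if (ReferenceEquals(other, null)) return false;
        return Id == other.Id;
    }
}

public class A : SBase, IEquivalence<A>
{
    public int P1 { get; set; }
    public int P2 { get; set; }

    public bool IsEquivalentTo(A other)
    {
        if (ReferenceEquals(this, other)) return true;
        if (ReferenceEquals(other, null)) return false;

        // The cast selects the SBase overload, so the inherited state is compared first.
        bool baseEquivalent = base.IsEquivalentTo((SBase)other);
        return baseEquivalent && P1 == other.P1 && P2 == other.P2;
    }
}
```

Equals() and GetHashCode() are left untouched here, so instances of A still behave by reference identity in hashed collections while IsEquivalentTo() carries the value comparison.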

Certainly, if I had enough classes with enough properties, I could write a helper that builds an expression tree via reflection to implement IsEquivalentTo().
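One way such a helper might look is sketched below. It is not the author's code: EquivalenceBuilder and Sample are hypothetical names, and the choice to compare every public readable instance property through the static object.Equals(object, object) is an assumption made to keep the sketch short.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: builds and caches one compiled comparer per type T.
public static class EquivalenceBuilder<T> where T : class
{
    private static readonly Func<T, T, bool> Comparer = Build();

    public static bool AreEquivalent(T left, T right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (left == null || right == null) return false;
        return Comparer(left, right);
    }

    private static Func<T, T, bool> Build()
    {
        ParameterExpression left = Expression.Parameter(typeof(T), "left");
        ParameterExpression right = Expression.Parameter(typeof(T), "right");

        // Static object.Equals(object, object) copes with nulls and boxed value types.
        MethodInfo objectEquals = typeof(object).GetMethod(
            "Equals", new[] { typeof(object), typeof(object) });

        // Start from "true" and AND in one comparison per property.
        Expression body = Expression.Constant(true);
        foreach (PropertyInfo property in typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0))
        {
            Expression propertiesEqual = Expression.Call(
                objectEquals,
                Expression.Convert(Expression.Property(left, property), typeof(object)),
                Expression.Convert(Expression.Property(right, property), typeof(object)));
            body = Expression.AndAlso(body, propertiesEqual);
        }

        return Expression.Lambda<Func<T, T, bool>>(body, left, right).Compile();
    }
}

// Hypothetical type used only to exercise the helper.
public class Sample
{
    public int P1 { get; set; }
    public string Name { get; set; }
}
```

A class could then implement IsEquivalentTo(other) as a one-line call to EquivalenceBuilder<Sample>.AreEquivalent(this, other). Note that each property is compared with its own Equals(), so nested reference-type properties are compared by reference unless they override it.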

Finally, I implemented an extension method that tests the equivalence of two IEnumerable<T>:

public static bool IsEquivalentTo<T>
    (this IEnumerable<T> first, IEnumerable<T> second)

If T implements IEquivalence<T>, that interface is used to compare the elements of the sequences; otherwise Equals() is used. Allowing the fallback to Equals() lets it work with, for example, ObservableCollection<string> in addition to my business objects.
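The answer shows only the signature, so the body below is a guess at how such an extension method could be written. It assumes the two sequences must have the same length and pairwise-equivalent elements in the same order; EnumerableEquivalence and Tag are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public interface IEquivalence<T>
{
    bool IsEquivalentTo(T other);
}

public static class EnumerableEquivalence
{
    // Assumed semantics: same length, and elements equivalent position by position.
    public static bool IsEquivalentTo<T>(this IEnumerable<T> first, IEnumerable<T> second)
    {
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;

        using (IEnumerator<T> left = first.GetEnumerator())
        using (IEnumerator<T> right = second.GetEnumerator())
        {
            while (true)
            {
                bool hasLeft = left.MoveNext();
                bool hasRight = right.MoveNext();
                if (hasLeft != hasRight) return false;   // different lengths
                if (!hasLeft) return true;               // both sequences exhausted
                if (!ElementsEquivalent(left.Current, right.Current)) return false;
            }
        }
    }

    private static bool ElementsEquivalent<T>(T a, T b)
    {
        // Prefer IEquivalence<T> when the element implements it.
        var equivalence = a as IEquivalence<T>;
        if (equivalence != null) return equivalence.IsEquivalentTo(b);

        // Otherwise fall back to Equals() via the default comparer (null-safe).
        return EqualityComparer<T>.Default.Equals(a, b);
    }
}

// Hypothetical element type used only to exercise the IEquivalence<T> path.
public class Tag : IEquivalence<Tag>
{
    public string Name { get; set; }

    public bool IsEquivalentTo(Tag other)
    {
        return other != null && Name == other.Name;
    }
}
```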

Now, the assertion in my unit test is:

Assert.IsTrue(expected.IsEquivalentTo(actual));