C# – What kind of members should be used in a GetHashCode() implementation

c# hashing

We're building some Roslyn analyzers concerning GetHashCode(), including an analyzer that implements it for you in a given class.

While researching the subject we've found that there are many considerations to take into account, mainly on what kind of members should be avoided in a GetHashCode() implementation.

The main kind of member to avoid is the mutable member, so that the hash code cannot change while the object is stored in a hash-based collection.

In general, for mutable reference types, you should override GetHashCode only if:

  • You can compute the hash code from fields that are not mutable

[Source]
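To make the risk concrete, here is a minimal sketch of what goes wrong when a hash code depends on a mutable member. MutableKey and MutableKeyDemo are hypothetical names invented for this illustration; they are not from the linked source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type whose hash code is derived from a mutable property.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class MutableKeyDemo
{
    // Reports whether the set still finds the key after it was mutated in place.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2; // the hash code changes while the key sits in the set

        // The set stored the entry under the old hash code, so this lookup misses.
        return set.Contains(key);
    }
}
```

FoundAfterMutation() should return false: the object is still in the set, but it can no longer be found through its own hash code.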

Now the question that remains: what kind of members are we talking about, then? The member has to be immutable, but ideally it should also differ between instances; otherwise you'll just end up with the same hash code for every object.

This leads me to conclude that the only acceptable members for a GetHashCode() implementation are readonly immutable structs and readonly immutable classes. Note: not const. Likewise, no static fields, since they would be the same across instances anyway.

In practice, such a member would be declared as one of:

  • Field: readonly T myField
  • Property: T myProp { get; }

Since it is not really feasible to detect whether a class is immutable, we should probably exclude classes entirely, aside from string perhaps.

This narrows the list of applicable members for a GetHashCode() implementation to:

  • Readonly fields
  • And getter-only properties
  • That are immutable structs
  • And aren't static
  • And aren't interfaces
  • + string
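As a rough illustration of that filter, the sketch below annotates which members of a class would be picked up and which would be skipped. The Order class and all of its members are hypothetical, and HashCode.Combine assumes .NET Core 2.1 or later; this is one possible shape for the generated code, not the analyzer's actual output.

```csharp
using System;

// Hypothetical class used only to show which members pass the filter.
public sealed class Order
{
    public const int SchemaVersion = 1;    // const: identical for every instance, skipped
    private static int s_created;          // static: identical for every instance, skipped
    private int _timesViewed;              // mutable field, skipped
    private readonly int _id;              // readonly field of an immutable struct type, used

    public Guid CustomerId { get; }        // getter-only property of an immutable struct type, used
    public string Reference { get; }       // string, the one class type let through, used
    public decimal Total { get; set; }     // settable property, skipped

    public Order(int id, Guid customerId, string reference)
    {
        _id = id;
        CustomerId = customerId;
        Reference = reference;
        s_created++;
    }

    public void MarkViewed() => _timesViewed++;

    // Equality uses the same members as the hash code so the two stay consistent.
    public override bool Equals(object obj) =>
        obj is Order other
        && other._id == _id
        && other.CustomerId == CustomerId
        && other.Reference == Reference;

    public override int GetHashCode() => HashCode.Combine(_id, CustomerId, Reference);
}
```

With this selection, changing Total or calling MarkViewed() leaves the hash code untouched, which is the property the list is trying to guarantee.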

Does that seem correct, or have I made a mistake in the thought process somewhere? It seems like we're suddenly left with far fewer fields than people typically use. I realize that there is a pragmatic aspect to this that just says "add all the fields and let the dev figure it out", but I'm trying to avoid introducing hidden bugs through the analyzers.

Best Answer

Readonly fields

This goes without saying.

And getter-only properties

Yes, as long as by "getter-only properties" we strictly mean type prop { get; }, that is, a property with no set accessor of any accessibility.

That are immutable structs

Yes, in the sense that in C# primitives are also implemented as immutable structs.

And aren't static

This goes without saying.

And aren't interfaces

This goes without saying too, but do keep in mind that you can always involve the identity hash code of the object behind an interface-typed member if you need to.

+ string
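One way to read "identity hash code" here is RuntimeHelpers.GetHashCode, which ignores any GetHashCode() override on the implementing type. The sketch below assumes that reading; ILogger, NamedLogger and Service are hypothetical names made up for the example.

```csharp
using System.Runtime.CompilerServices;

public interface ILogger { }

// Hypothetical implementation with a value-based, mutable hash code.
public sealed class NamedLogger : ILogger
{
    public string Name { get; set; }

    public override bool Equals(object obj) => obj is NamedLogger other && other.Name == Name;

    public override int GetHashCode() => Name == null ? 0 : Name.GetHashCode();
}

public sealed class Service
{
    private readonly ILogger _logger;

    public Service(ILogger logger) => _logger = logger;

    // Two services are equal when they share the very same logger instance.
    public override bool Equals(object obj) =>
        obj is Service other && ReferenceEquals(other._logger, _logger);

    // Identity-based hash: unaffected by whatever GetHashCode the implementation supplies.
    public override int GetHashCode() => RuntimeHelpers.GetHashCode(_logger);
}
```

Because the hash is tied to the instance and not to its state, renaming the logger does not change the hash code of a Service that holds it.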

Again, goes without saying.

The fact that you are wondering so hard about how to implement GetHashCode() may be indicative of confusion about when to implement GetHashCode() and when not to. There are two types of objects:

  • Objects with value semantics
  • Objects with reference ("object") semantics.

Objects with value semantics are used for storing values. They should always be immutable and should always implement GetHashCode() taking into consideration every single one of their members.
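A minimal sketch of such a value object, assuming C# 7.2 or later for readonly struct and .NET Core 2.1 or later for HashCode.Combine. Money is a hypothetical type chosen for illustration.

```csharp
using System;

// Hypothetical immutable value object: every member takes part in equality and hashing.
public readonly struct Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public bool Equals(Money other) => Amount == other.Amount && Currency == other.Currency;

    public override bool Equals(object obj) => obj is Money other && Equals(other);

    public override int GetHashCode() => HashCode.Combine(Amount, Currency);
}
```

Since nothing in the struct can change after construction, two Money values that compare equal keep the same hash code for their whole lifetime, so they are safe to use as dictionary keys or set elements.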

Objects with reference semantics contain logic, and usually some state. They will usually not be immutable (though it is possible in rare circumstances), but one thing that is certain about them is that they will never be used as values. Therefore they never need to override the default GetHashCode() implementation of object, which returns the identity hash code of the object.
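For contrast, here is a small sketch of a class with reference semantics that leaves Equals() and GetHashCode() alone. Connection and ConnectionDemo are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;

// Hypothetical stateful class that relies on the default, identity-based hashing.
public sealed class Connection
{
    public string Host { get; set; }
    public bool IsOpen { get; set; }
}

public static class ConnectionDemo
{
    // Mutating the object does not affect lookups, because the hash is not state-based.
    public static bool StillFoundAfterMutation()
    {
        var connection = new Connection { Host = "a" };
        var set = new HashSet<Connection> { connection };

        connection.Host = "b";
        connection.IsOpen = true;

        return set.Contains(connection);
    }

    // A second instance with identical state is still a different element.
    public static bool TwinIsDistinct()
    {
        var set = new HashSet<Connection> { new Connection { Host = "a" } };
        return set.Add(new Connection { Host = "a" });
    }
}
```

Both methods should return true: the object stays findable no matter how its state changes, and instances are never confused with one another just because their fields happen to match.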
