C# – What kind of members should be used in a GetHashCode() implementation

c#, hashing

We're building some Roslyn analyzers concerning GetHashCode(), including an analyzer that implements it for you in a given class.

While researching the subject we've found that there are many considerations to take into account, mainly on what kind of members should be avoided in a GetHashCode() implementation.

The main kind of member to avoid is the mutable member, so that the hash code does not change while the object is used in a collection.

In general, for mutable reference types, you should override GetHashCode only if:

You can compute the hash code from fields that are not mutable

[Source]
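To make the risk concrete, here is a minimal sketch of what goes wrong when a hash code depends on a mutable field. MutableKey and MutableKeyDemo are hypothetical names made up for this illustration. The lookup is expected to fail because HashSet<T> files the object under the hash code it had when it was added.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: its hash code depends on a mutable field.
public sealed class MutableKey
{
    public int Value; // mutable on purpose

    public override bool Equals(object obj) => obj is MutableKey other && other.Value == Value;
    public override int GetHashCode() => Value;
}

public static class MutableKeyDemo
{
    // Returns whether the set still finds the key after the key was mutated.
    public static bool ContainsAfterMutation()
    {
        var key = new MutableKey { Value = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Value = 2; // the hash code changes while the object is in the set

        // The set stored the key under the old hash code, so a lookup that
        // now hashes differently does not find it, although it is the same instance.
        return set.Contains(key);
    }
}
```

The object is still physically in the set (set.Count stays 1), it just cannot be found through its current hash code.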

Now the question that remains: what kind of members are we talking about then? The member has to be immutable, but it should also (ideally) differ between instances; otherwise you'll just end up with the same hashcode for every object.

This leads me to conclude that the only acceptable members for a GetHashCode() implementation are readonly members whose types are immutable structs or immutable classes. Note: not const. Likewise no static fields, since they would be the same across instances anyway.

In practice, such a member would be declared as one of:

Field: readonly T myField
Property: T myProp { get; }
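As a rough sketch of those two declaration forms in use, the hypothetical Customer class below hashes only a readonly field and a getter-only property and leaves a settable property out. The class and member names are invented for the example, and HashCode.Combine assumes a target where System.HashCode is available (.NET Core 2.1 or later, or .NET Standard 2.1).

```csharp
using System;

// Hypothetical example type; the names are illustrative only.
public sealed class Customer
{
    private readonly int id;          // readonly field of an immutable struct type
    public string Name { get; }       // getter-only auto-property
    public string Notes { get; set; } // mutable, so left out of the hash code

    public Customer(int id, string name)
    {
        this.id = id;
        Name = name;
    }

    public override bool Equals(object obj) =>
        obj is Customer other && other.id == id && other.Name == Name;

    // Only members that cannot change after construction contribute.
    public override int GetHashCode() => HashCode.Combine(id, Name);
}
```

Changing Notes after construction leaves the hash code untouched, so an instance stays findable in a hash-based collection.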

Since it is hardly feasible to detect whether a class is immutable, we should probably eliminate classes entirely, aside from string perhaps.

This leaves the list of applicable members for a GetHashCode() implementation to:

Readonly fields
And getter-only properties
That are immutable structs
And aren't static
And aren't interfaces
+ string
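The list above could be approximated roughly as follows. This sketch uses runtime reflection as a stand-in, not the Roslyn symbol API the analyzer would actually use. It makes two loud assumptions: "immutable struct" is narrowed to primitive types via Type.IsPrimitive, and getter-only auto-properties are picked up through their compiler-generated readonly backing fields, which is current C# compiler behavior rather than a documented contract. HashCandidateFinder and Sample are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class HashCandidateFinder
{
    // Approximates the rules above with runtime reflection (not the Roslyn API).
    public static string[] Find(Type type) =>
        type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic) // Instance only: no statics
            .Where(f => f.IsInitOnly)                                                        // readonly fields only
            .Where(f => f.FieldType == typeof(string) || f.FieldType.IsPrimitive)            // string or primitive struct
            .Select(f => f.Name)
            .ToArray();
}

// Hypothetical type used to exercise the filter.
public sealed class Sample
{
    private readonly int id = 1;            // qualifies
    public string Name { get; } = "n";      // qualifies via its readonly backing field
    public int Counter { get; set; }        // settable: excluded
    private static readonly int Shared = 0; // static: excluded
    private readonly IDisposable resource;  // interface type: excluded
}
```

A real analyzer would have to decide separately how to treat non-primitive structs such as DateTime or Guid, which this filter drops.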

Does that seem correct, or have I made a mistake in the thought process somewhere? It seems like we're suddenly left with far fewer fields than people typically use. I realize that there is a pragmatic aspect to this that just says "add all the fields and let the dev figure it out", but I try to avoid introducing hidden bugs in the analyzers.

Best Answer

Readonly fields

This goes without saying.

And getter-only properties

Yes, as long as by "getter-only properties" we strictly mean type prop { get; }

That are immutable structs

Yes, in the sense that in C# primitives are also implemented as immutable structs.

And aren't static

This goes without saying.

And aren't interfaces

This goes without saying too, but do keep in mind that you can always involve the identity hashcode of the object behind an interface reference if you need to.
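One way to obtain such an identity hashcode is RuntimeHelpers.GetHashCode, which ignores any GetHashCode() override on the concrete type. The sketch below assumes a hypothetical ILabel interface and Label class, invented here only to show the call.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical interface and implementation, for illustration only.
public interface ILabel { string Text { get; } }

public sealed class Label : ILabel
{
    public string Text { get; }
    public Label(string text) => Text = text;

    public override bool Equals(object obj) => obj is Label l && l.Text == Text;
    public override int GetHashCode() => Text.GetHashCode();
}

public static class IdentityHash
{
    // Hash based on object identity, bypassing any GetHashCode override.
    public static int Of(ILabel label) => RuntimeHelpers.GetHashCode(label);
}
```

The identity hash is stable for the lifetime of an instance, but two distinct instances are not guaranteed to get different values.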

string

Again, goes without saying.

The fact that you are wondering so hard about how to implement GetHashCode() may be indicative of confusion about when to implement GetHashCode() and when not to. There are two types of objects:

Objects with value semantics
Objects with reference ("object") semantics.

Objects with value semantics are used for storing values. They should always be immutable and should always implement GetHashCode() taking into consideration every single one of their members.
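A minimal sketch of such a value object, assuming a hypothetical Money type: every member is immutable, and both Equals and GetHashCode() take every member into account. The readonly struct modifier needs C# 7.2 or later, and HashCode.Combine assumes System.HashCode is available on the target framework.

```csharp
using System;

// Hypothetical value type: immutable, with equality and hash covering every member.
public readonly struct Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public bool Equals(Money other) => Amount == other.Amount && Currency == other.Currency;
    public override bool Equals(object obj) => obj is Money other && Equals(other);
    public override int GetHashCode() => HashCode.Combine(Amount, Currency);
}
```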

Objects with reference semantics contain logic, and usually some state. They will usually not be immutable (though that is possible in rare circumstances), but one thing that is certain about them is that they will never be used as values. Therefore they never need to override the default GetHashCode() implementation of object, which returns the identity hashcode of the object.
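For illustration, here is a sketch of an object with reference semantics that keeps the default implementation. Connection is a hypothetical class invented for the example. Because the inherited hash code is based on identity and not on state, mutating the object does not disturb a hash-based collection that holds it.

```csharp
using System.Collections.Generic;

// Hypothetical class with mutable state and no Equals/GetHashCode override.
public sealed class Connection
{
    public string Host { get; set; }
}

public static class ReferenceSemanticsDemo
{
    // Returns whether the set still finds the object after its state changed.
    public static bool StillFoundAfterMutation()
    {
        var conn = new Connection { Host = "a" };
        var set = new HashSet<Connection> { conn };

        conn.Host = "b"; // the identity-based hash code is unaffected

        return set.Contains(conn);
    }
}
```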

Related Solutions

Why and When to Make a Class ‘Static’?

It makes it obvious to users how the class is used. For instance, it would be complete nonsense to write the following code:

Math m = new Math();

C# doesn’t have to forbid this but since it serves no purpose, might as well tell the user that. Certain people (including me) adhere to the philosophy that programming languages (and APIs …) should be as restrictive as possible to make them hard to use wrong: the only allowed operations are then those that are meaningful and (hopefully) correct.

C# Design – Managing Many Properties, Complex Constructors, and Equality

If you insist on your object being immutable, there is obviously no other way than providing the 20 strings through the constructor. And if you want to be able to leave some of the strings out, you must say which arguments you are providing and which you are not, which leads you to some form of named parameters. Of course, besides the possibilities you mentioned yourself, you can also

provide the 20 arguments by a list of strings
provide them by a dictionary (key=>value, where "key" is the attribute name)
provide them by an object of a helper class, which has the same 20 properties with getters and setters, so this one won't be immutable

It may also be a good idea to implement Equals and GetHashCode by utilizing reflection, looping over all public string properties of your class.
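A rough sketch of that reflection approach, under the assumption that a shared base class is acceptable. StringPropertyEquatable and Address are hypothetical names. The properties are looked up on every call for simplicity; a real implementation would probably cache the PropertyInfo array per type.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical base class: equality and hash code over all public string properties.
public abstract class StringPropertyEquatable
{
    private PropertyInfo[] StringProps() =>
        GetType().GetProperties(BindingFlags.Instance | BindingFlags.Public)
                 .Where(p => p.PropertyType == typeof(string)
                             && p.CanRead
                             && p.GetIndexParameters().Length == 0)
                 .OrderBy(p => p.Name, StringComparer.Ordinal) // fixed order for hashing
                 .ToArray();

    public override bool Equals(object obj)
    {
        if (obj is null || obj.GetType() != GetType()) return false;
        return StringProps().All(p => (string)p.GetValue(this) == (string)p.GetValue(obj));
    }

    public override int GetHashCode()
    {
        var hash = new HashCode();
        foreach (var p in StringProps()) hash.Add((string)p.GetValue(this));
        return hash.ToHashCode();
    }
}

// Hypothetical immutable class using the base implementation.
public sealed class Address : StringPropertyEquatable
{
    public string Street { get; }
    public string City { get; }
    public Address(string street, string city) { Street = street; City = city; }
}
```

Reflection is noticeably slower than hand-written comparisons, so this trades speed for not having to list 20 properties by hand.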
