Java – How to Override equals() Method

Short question: Why does Java allow overriding equals()? Why is it not final?

I am reading Effective Java 2nd edition by Joshua Bloch.
I am a bit baffled by the conclusion that

There is no way to extend an instantiable class and add a value component while preserving the equals contract

Isn't that the same as saying that equals() should be a final method?

The book gives an example of a class A with equals() method and then a class AX extending A having its own equals() method which is different from the equals() method in A.

I will not go into the details of equals() in A and equals() in AX; it suffices to say that they are different. We therefore have an interoperability problem: mixing instances of A and AX in the same context (especially in HashSet- or HashMap-style collections) violates the transitivity and/or symmetry (maybe even something else) of equals().

Thinking further, I don't think I can agree with the conclusion that having something like

public boolean equals(Object o) {
    if (o == null || o.getClass() != getClass())
        return false;
    ...
}

is wrong. I think this is precisely the proper way to deal with overriding equals().
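For concreteness, here is a minimal sketch of what the getClass()-based approach could look like when written out in full. The fields x, y, and z and the demo class are assumptions made for illustration; the book's classes are not reproduced here.

```java
import java.util.Objects;

class GetClassEqualsDemo {
    public static void main(String[] args) {
        A a = new A(1, 2);
        AX ax = new AX(1, 2, 3);
        System.out.println(a.equals(ax));               // false
        System.out.println(ax.equals(a));               // false
        System.out.println(ax.equals(new AX(1, 2, 3))); // true
    }
}

// Hypothetical value class with two components (assumed for illustration).
class A {
    private final int x;
    private final int y;

    A(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Reject null and any object whose runtime class differs from ours.
        if (o == null || o.getClass() != getClass()) return false;
        A other = (A) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

// Hypothetical subclass that adds a value component z.
class AX extends A {
    private final int z;

    AX(int x, int y, int z) {
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object o) {
        // super.equals already rejects null and objects of a different runtime class,
        // so the cast below is safe once it has returned true.
        if (!super.equals(o)) return false;
        return z == ((AX) o).z;
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + z;
    }
}
```

With this version an A and an AX are never equal in either direction, so symmetry and transitivity hold. The price is that a subclass instance can never stand in for an equal-valued A, which is the Liskov-style objection raised against this approach.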

Java allows overriding equals() for a reason. If it had taken the Liskov substitution principle into account in the strict sense, it would not have allowed overriding equals() and would implicitly have made every implementation of equals() final at the compiler level. What are your thoughts?

I can think of a case where composition is simply not suitable, and overriding equals() is the best option. This is the case where the class A is to be made persistent in a database and the context implies that there is no collection having both an implementation of A and subclasses of A such as AX.

Best Answer

equals() is a byproduct of the attempt to improve on C++ when Java was created. C++ has operator overloading, which lets you define custom behavior for otherwise built-in operators such as <, >, !=, ==, and even =.

The team made the decision (wisely so) to make equality a class method rather than having external static methods, as was done in C++. However, this also meant that equals(), coupled with hashCode(), defines how such classes are handled in the collection classes.

Since any class can in theory override equals() or hashCode(), having a collection of a certain type does not guarantee that its elements behave uniformly.

For instance, suppose class A has two members, x and y, used to determine equality. Along comes class B, which extends A and has x, y, and z. If an instance of B has the same x and y values as an instance of A, how would you go about inserting it into a Set that already contains that A? If you call equals() on the A instance, it reports the two as equal; if you call equals() on the B instance, it returns false, since the A is not an instance of B.
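The asymmetry described above can be sketched as follows. This assumes both classes use the common instanceof check in equals(); the int fields and the demo class are hypothetical names chosen for illustration.

```java
class SymmetryDemo {
    public static void main(String[] args) {
        A a = new A(1, 2);
        B b = new B(1, 2, 3);
        System.out.println(a.equals(b)); // true: A only looks at x and y
        System.out.println(b.equals(a)); // false: a is not an instance of B
    }
}

// Hypothetical parent class: equality is decided by x and y.
class A {
    private final int x;
    private final int y;

    A(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof A)) return false;
        A other = (A) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical subclass: adds z and only accepts other B instances.
class B extends A {
    private final int z;

    B(int x, int y, int z) {
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof B)) return false;
        return super.equals(o) && z == ((B) o).z;
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + z;
    }
}
```

Because the answer depends on which object's equals() gets called, a Set holding both kinds of object has no well-defined notion of "already present".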

To be perfectly correct, class B would have to treat member z as an additional condition only when the other object is also an instance of B, and otherwise defer to the equals() method of class A. If such a thing is not possible, class B should not override equals() or hashCode() at all. This creates a sticky situation: in theory you should not have to concern yourself with how the parent class works from an implementation standpoint (if done right, anyway), and yet here we are.
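A sketch of that "compare z only against another B" strategy shows why it is hard to get perfectly correct: it restores symmetry but gives up transitivity, which is the trade-off the quoted conclusion from Effective Java refers to. The class shapes and field names below are assumptions for illustration.

```java
class TransitivityDemo {
    public static void main(String[] args) {
        B b1 = new B(1, 2, 3);
        A a = new A(1, 2);
        B b2 = new B(1, 2, 4);
        System.out.println(b1.equals(a));  // true: z is ignored against a plain A
        System.out.println(a.equals(b2));  // true: A only looks at x and y
        System.out.println(b1.equals(b2)); // false: both are B, and z differs
    }
}

// Hypothetical parent class: equality is decided by x and y.
class A {
    private final int x;
    private final int y;

    A(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof A)) return false;
        A other = (A) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical subclass using the "mixed comparison" strategy.
class B extends A {
    private final int z;

    B(int x, int y, int z) {
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof A)) return false;
        // Plain A (or another A subtype): fall back to A's comparison, ignoring z.
        if (!(o instanceof B)) return super.equals(o);
        // Another B: z becomes an additional condition.
        return super.equals(o) && z == ((B) o).z;
    }

    // hashCode() is deliberately inherited from A: a B can equal a plain A,
    // so the two must produce the same hash code.
}
```

Here b1 equals a and a equals b2, yet b1 does not equal b2, so the equals contract is still broken even though each pairwise comparison is symmetric.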

You could make class A final to prevent such things from happening, but then of course you can never extend class A. Java makes a point of declaring certain standard classes, such as String, final to prevent complications of this nature (a very smart decision on their part). At the end of the day, I think what matters is that you are very careful in how you use equals() and hashCode(). I try to override them sparingly, and I am always mindful of which classes I mean to expose in a library and which are for internal use, so as not to create conditions where things could go horribly wrong.

The Liskov substitution principle is fine in theory, but in practice you can never quite manage it. Take Collection as an example. Collection is implemented by ArrayList, HashSet, and LinkedList, among others. While it is true that you could achieve the same ultimate goal by substituting, say, a HashSet wherever a Collection is expected, it is not an ideal implementation for performing operations on all the objects it contains (a LinkedHashSet would be better at that point). It wouldn't break existing code, but you may render it grossly inefficient depending on how that Collection is used. Consider that this is a rather clean example, too.

If you're lucky, only the implementation details change, but many implementations behave radically differently, with some methods throwing an UnsupportedOperationException.
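In Java, the exception thrown by an optional Collection operation that an implementation does not support is UnsupportedOperationException. As a small illustration, the sketch below passes three standard-library collections to the same code; tryAdd is a hypothetical helper written for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

class UnsupportedOperationDemo {
    // Hypothetical helper: code written purely against the Collection interface.
    // Returns false if the implementation refuses the operation.
    static boolean tryAdd(Collection<String> c, String s) {
        try {
            return c.add(s);
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Collection<String> growable = new ArrayList<>(Arrays.asList("a", "b"));
        // Arrays.asList returns a fixed-size list backed by the array.
        Collection<String> fixedSize = Arrays.asList("a", "b");
        // Read-only view over a copy of the same elements.
        Collection<String> readOnly =
                Collections.unmodifiableList(new ArrayList<>(Arrays.asList("a", "b")));

        System.out.println(tryAdd(growable, "c"));  // true
        System.out.println(tryAdd(fixedSize, "c")); // false
        System.out.println(tryAdd(readOnly, "c"));  // false
    }
}
```

All three objects satisfy the same static type, yet only one of them accepts add(), which is the kind of substitution gap described above.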

Thus, requiring that classes implementing equals() respect the Liskov substitution principle is asking a lot, and I suspect that the Java designers didn't want to alienate the majority of C++ programmers who were getting familiar with Java.