Do We Always Need to Override Equals/Hashcode in Java Classes?

collectionsjavaobject-oriented

When creating a new class, should we always override the equals and hashCode even if we don’t intent at that point to use the class with any Collection classes?
Or is it better to wait till such a need arises before overriding equals and hashCode so as to be sure on what the logical equality should be?

Best Answer

should we always override the equals and hashCode even if we don’t intent at that point to use the class with any Collection classes?

No and I would go even further as to say you probably don't need to override them even if you are going to put them into a collection. The default implementations are perfectly compatible with collections as is. For lists, it's irrelevant anyway.

You should override equals if and only if your objects have a logical identity that is independent of their physical identity. In other words, if you want to have multiple objects that represent the same entity. Integer, is a good example of this. The integer 2 is always 2 regardless of whether there are 100 instances or 1 of the Integer object with the value of 2.

You must override hashcode if you've overridden equals.

That's it. There are no other good reasons to modify these.

As a side note, the use of equals to implement algorithms is highly overused. I would only do this if your object has a true logical identity. In most cases, whether two objects represent the same thing is highly context dependent and it's easier and better to use a property of the object explicitly and leave equals alone. There are many pitfalls to overriding equals, especially if you are using inheritance.

An example:

Let's say you have an online store and you are writing a component that fulfills orders. You decide to override the equals() method of the Order object to be based on the order number. As far as you know, that's the it's identity representation. You query a DB every so often and create objects from the response. You take each Order object and keep it as a key in a set which is all the orders in process. But there's a problem, orders can be modified in a certain time frame. You now might get a response from the DB that contains the same Order number but with different properties.

You can't just write this object to your set because you'll lose track of what was in process. You might end up processing the same order twice. So, what are the options? You could update the equals() method to include a version number or something additional. But now you have to go through your code to figure out where you need to base the logic on the having the same order number and where you need it to be based on the new object identity. In other words, there's not just one answer to whether two objects represent the same thing. In some contexts it's the order, and in some contexts it's the order and the version.

Instead of the set, you should build a map and use the order number as the key. Now you can check for the existence of the key in the map and retrieve the other object for comparison. This is much more straightforward and easy to understand than trying to make sense of when the equals method works they way you need it to for different needs.

A good example of this kind of complexity can be found in the BigDecimal class. You might think that BigDecimal("2.0") and BigDecimal("2.00") are equal. But they are not.

Conclusion:

It can make sense to implement equals() (make sure you update hashcode() too if you do) and doing so doesn't prevent the use of any of the techniques I describe here.

But if you are doing this:

  • Make sure your object is immutable
  • Use every meaningful property of the object to define equals.
  • If you are using inheritance, either:
    1. make equals final on the base class OR
    2. you must check that the actual type matches in subclasses

If any of these aren't followed, you are in for trouble.