I have a domain model in which I use a few aggregation relations, i.e. an object of class A contains zero or more objects of class B.
I use Java for the implementation and I represent such an aggregation as a field of type List<B> in A, and a field of type A in B. In this way, each object can be the root of an aggregation tree. Classes A and B may also contain other shallow fields, i.e. fields of type int, float, String, and so on.
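For concreteness, the representation described above might look roughly like the sketch below. The wrapper class AggregationModel, the shallow fields id, name and label, and the add() helper are assumptions made up for illustration; only the List<B> field in A and the A field in B come from the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper so the sketch compiles as a single file.
class AggregationModel {

    static class B {
        A parent;            // back-reference to the containing A (null while detached)
        final String label;  // assumed shallow field

        B(String label) {
            this.label = label;
        }
    }

    static class A {
        private final List<B> children = new ArrayList<>(); // the aggregation itself
        final int id;        // assumed shallow field
        final String name;   // assumed shallow field

        A(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Keeps both sides of the relation consistent when a child is attached.
        void add(B child) {
            child.parent = this;
            children.add(child);
        }

        // Read-only view, so callers cannot bypass add().
        List<B> children() {
            return Collections.unmodifiableList(children);
        }
    }
}
```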
Now I need to define different kinds of equality methods on my model:
- Shallow equality: compare two instances of A by comparing their shallow fields only, i.e. leaving out references to other domain objects. In this case, I am only interested in knowing whether two nodes have the same contents.
- Deep equality: compare two instances of A by comparing their shallow fields and by recursively comparing their children. In this case, I want to check whether two complete trees are equal.
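The two notions can be sketched as explicitly named methods, which leaves equals() untouched until the choice is made. This is only a minimal illustration under some assumptions: the wrapper class TreeEquality, the fields id, name and label, and the method names shallowEquals and deepEquals are invented here; children are compared in list order; B is treated as a leaf; and the parent back-reference is left out so that the comparison cannot recurse in a cycle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class TreeEquality {

    static final class B {
        final String label; // assumed shallow field

        B(String label) {
            this.label = label;
        }

        // B is a leaf in this sketch, so shallow and deep equality coincide for it.
        boolean shallowEquals(B other) {
            return other != null && Objects.equals(label, other.label);
        }
    }

    static final class A {
        final int id;       // assumed shallow field
        final String name;  // assumed shallow field
        final List<B> children = new ArrayList<>();

        A(int id, String name) {
            this.id = id;
            this.name = name;
        }

        A add(String label) {
            children.add(new B(label));
            return this;
        }

        // Compares only the node's own fields and ignores the children.
        boolean shallowEquals(A other) {
            return other != null && id == other.id && Objects.equals(name, other.name);
        }

        // Compares the node's own fields, then the children pairwise in list order.
        boolean deepEquals(A other) {
            if (!shallowEquals(other) || children.size() != other.children.size()) {
                return false;
            }
            for (int i = 0; i < children.size(); i++) {
                if (!children.get(i).shallowEquals(other.children.get(i))) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        A x = new A(1, "root").add("left");
        A y = new A(1, "root").add("right");
        System.out.println(x.shallowEquals(y)); // true: same node contents
        System.out.println(x.deepEquals(y));    // false: the children differ
    }
}
```

If the order of children is not significant in the model, the pairwise loop would need to be replaced by an order-insensitive comparison.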
I considered overriding the hashCode() and equals() methods for class A, but I do not know if this should be the shallow equality or the deep equality method. Once I decide which of the two equality methods is implemented as A.equals(), I will define the other method with another name. This is an important choice because the equals() method determines such things as membership in a Set.
So, is one of the two possibilities (shallow versus deep equality) considered a more idiomatic choice for implementing the equals method in Java?
Best Answer
I prefer to think of "equals" this way: if a.equals(b), then you can replace all references to b with references to a and the program's behavior will not change. This is true for immutable value classes like String and should be true for quasi-value classes like Date. I think this is the way most Java programmers expect "equals" to behave.
Defining equals in some other way, so that things are sort-of equal, is likely to lead to subtle bugs when some future programmer puts these instances in a hash table or set. That programmer will then hate you forever.
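To illustrate the kind of bug meant here, the sketch below overrides equals() and hashCode() as a shallow comparison and then puts two different trees into a HashSet. The classes ShallowEqualsPitfall and Node, the field names, and the distinctCount helper are hypothetical; children are reduced to plain strings to keep the example short.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class ShallowEqualsPitfall {

    static final class Node {
        final String name;           // assumed shallow field
        final List<String> children; // stand-in for child objects

        Node(String name, List<String> children) {
            this.name = name;
            this.children = children;
        }

        // "Sort-of equal": only the shallow field takes part in equality.
        @Override
        public boolean equals(Object o) {
            return o instanceof Node && Objects.equals(name, ((Node) o).name);
        }

        // Kept consistent with equals(), as the Object contract requires.
        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Counts how many of the given nodes a HashSet regards as distinct.
    static int distinctCount(Node... nodes) {
        Set<Node> set = new HashSet<>();
        for (Node n : nodes) {
            set.add(n);
        }
        return set.size();
    }

    public static void main(String[] args) {
        Node a = new Node("root", Arrays.asList("left"));
        Node b = new Node("root", Arrays.asList("right"));
        // The set keeps only the first tree; the second one is silently dropped.
        System.out.println(distinctCount(a, b)); // 1
    }
}
```

Here a and b cannot be substituted for each other without changing what the program sees (their children differ), yet the set treats them as the same element.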