Saturday, 7 July 2012

Item 8: Obey the general contract when overriding equals


Overriding the equals method seems simple, but there are many ways to get it wrong, and consequences can be dire. The easiest way to avoid problems is not to override the equals method, in which case each instance of the class is equal only to itself.

This is the right thing to do if any of the following conditions apply:

Each instance of the class is inherently unique.
This is true for classes such as Thread that represent active entities rather than values.
You don’t care whether the class provides a “logical equality” test.
For example, java.util.Random could have overridden equals to check whether two Random instances would produce the same sequence of random numbers going forward, but the designers didn’t think that clients would need or want this functionality.
A superclass has already overridden equals, and the superclass behavior is appropriate for this class.
For example, most Set implementations inherit their equals implementation from AbstractSet, List implementations from AbstractList, and Map implementations from AbstractMap.
The class is private or package-private, and you are certain that its equals method will never be invoked.
Arguably, the equals method should be overridden under these circumstances, in case it is accidentally invoked:
@Override public boolean equals(Object o) {
    throw new AssertionError(); // Method is never called
}
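
To illustrate the third condition above, here is a minimal sketch showing that HashSet and TreeSet, which both inherit equals from AbstractSet, compare equal when they hold the same elements. The class name InheritedEqualsDemo and its helper methods are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class InheritedEqualsDemo {
    // Two different Set implementations holding the same elements.
    static Set<Integer> hashSet() {
        return new HashSet<>(Arrays.asList(1, 2, 3));
    }

    static Set<Integer> treeSet() {
        return new TreeSet<>(Arrays.asList(1, 2, 3));
    }

    public static void main(String[] args) {
        // Neither class declares its own equals; both use AbstractSet.equals,
        // which compares contents rather than implementation class.
        System.out.println(hashSet().equals(treeSet())); // true
        System.out.println(treeSet().equals(hashSet())); // true
    }
}
```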

So when is it appropriate to override Object.equals?
When a class has a notion of logical equality that differs from mere object identity, and a superclass has not already overridden equals to implement the desired behavior. This is generally the case for value classes. A value class is simply a class that represents a value, such as Integer or Date.

One kind of value class that does not require the equals method to be overridden is a class that uses instance control (Item 1) to ensure that at most one object exists with each value. Enum types (Item 30) fall into this category.
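
As a small sketch of why instance-controlled classes can keep the inherited equals, the example below uses a hypothetical Planet enum (not from the original text). Because only one object exists per constant, identity and logical equality coincide.

```java
class InstanceControlDemo {
    // Hypothetical enum; each constant is a single instance.
    enum Planet { MERCURY, VENUS, EARTH }

    public static void main(String[] args) {
        Planet a = Planet.valueOf("EARTH");
        Planet b = Planet.EARTH;
        System.out.println(a == b);      // true: both refer to the same instance
        System.out.println(a.equals(b)); // true: Enum.equals is an identity comparison
    }
}
```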

When you override the equals method, you must adhere to its general contract. Here is the contract, copied from the specification for Object [JavaSE6]:

The equals method implements an equivalence relation. It is:

Reflexive: For any non-null reference value x, x.equals(x) must return true.
An object must be equal to itself.
Symmetric: For any non-null reference values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
Any two objects must agree on whether they are equal.

// Broken - violates symmetry!
public final class CaseInsensitiveString {
    private final String s;

    public CaseInsensitiveString(String s) {
        if (s == null)
            throw new NullPointerException();
        this.s = s;
    }

    // Broken - violates symmetry!
    @Override public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveString)
            return s.equalsIgnoreCase(
                ((CaseInsensitiveString) o).s);
        if (o instanceof String) // One-way interoperability!
            return s.equalsIgnoreCase((String) o);
        return false;
    }
    ... // Remainder omitted
}

CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
String s = "polish";

As expected, cis.equals(s) returns true. The problem is that while the equals method in CaseInsensitiveString knows about ordinary strings, the equals method in String is oblivious to case-insensitive strings. Therefore s.equals(cis) returns false, a clear violation of symmetry. Suppose you put a case-insensitive string into a collection:
List<CaseInsensitiveString> list =
    new ArrayList<CaseInsensitiveString>();
list.add(cis);

What does list.contains(s) return at this point? Who knows? In Sun’s current implementation, it happens to return false.

Once you’ve violated the equals contract, you simply don’t know how other objects will behave when confronted with your object.
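
The asymmetry described above can be observed directly. The following sketch repeats the broken class from the text (made package-private so it fits in one file, with hashCode left out as in the original) and adds a hypothetical SymmetryDemo driver.

```java
import java.util.ArrayList;
import java.util.List;

// Same broken class as in the text, reproduced so the sketch is runnable.
final class CaseInsensitiveString {
    private final String s;

    CaseInsensitiveString(String s) {
        if (s == null)
            throw new NullPointerException();
        this.s = s;
    }

    // Broken - violates symmetry!
    @Override public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveString)
            return s.equalsIgnoreCase(((CaseInsensitiveString) o).s);
        if (o instanceof String) // One-way interoperability!
            return s.equalsIgnoreCase((String) o);
        return false;
    }
    // hashCode omitted here for brevity; a real class must override it too.
}

class SymmetryDemo {
    public static void main(String[] args) {
        CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
        String s = "polish";

        System.out.println(cis.equals(s)); // true
        System.out.println(s.equals(cis)); // false

        List<CaseInsensitiveString> list = new ArrayList<>();
        list.add(cis);
        // The result depends on which operand the library calls equals on,
        // so it is deliberately not asserted here.
        System.out.println(list.contains(s));
    }
}
```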

To eliminate the problem, merely remove the ill-conceived attempt to interoperate with String from the equals method:
@Override public boolean equals(Object o) {
    return o instanceof CaseInsensitiveString &&
        ((CaseInsensitiveString) o).s.equalsIgnoreCase(s);
}

Transitive: For any non-null reference values x, y, z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
If one object is equal to a second and the second object is equal to a third, then the first object must be equal to the third.

Consider a simple immutable two-dimensional integer point class:

public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point))
            return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }
    ... // Remainder omitted
}
Suppose you want to extend this class, adding the notion of color to a point:
public class ColorPoint extends Point {
    private final Color color;

    public ColorPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }
    ... // Remainder omitted
}

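How should the equals method look? You might write one that returns true only if its argument is another color point with the same position and color: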

// Broken - violates symmetry!
@Override public boolean equals(Object o) {
    if (!(o instanceof ColorPoint))
        return false;
    return super.equals(o) && ((ColorPoint) o).color == color;
}

The problem with this method is that you might get different results when comparing a point to a color point and vice versa.

To make this concrete, let’s create one point and one color point:

Point p = new Point(1, 2);
ColorPoint cp = new ColorPoint(1, 2, Color.RED);

Then p.equals(cp) returns true, while cp.equals(p) returns false. You might try to fix the problem by having ColorPoint.equals ignore color when doing “mixed comparisons”:

// Broken - violates transitivity!
@Override public boolean equals(Object o) {
    if (!(o instanceof Point))
        return false;
    // If o is a normal Point, do a color-blind comparison
    if (!(o instanceof ColorPoint))
        return o.equals(this);
    // o is a ColorPoint; do a full comparison
    return super.equals(o) && ((ColorPoint) o).color == color;
}

This approach does provide symmetry, but at the expense of transitivity:

ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
Point p2 = new Point(1, 2);
ColorPoint p3 = new ColorPoint(1, 2, Color.BLUE);

Now p1.equals(p2) and p2.equals(p3) return true, while p1.equals(p3) returns false, a clear violation of transitivity. The first two comparisons are “color-blind,” while the third takes color into account. So what’s the solution?
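
The three comparisons can be checked with a self-contained sketch. The classes are the ones from the text, nested inside a hypothetical TransitivityDemo wrapper; the Color enum is a stand-in, since the text does not say which Color type is used.

```java
class TransitivityDemo {
    // Stand-in color type (an assumption for this sketch).
    enum Color { RED, BLUE }

    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point))
                return false;
            Point p = (Point) o;
            return p.x == x && p.y == y;
        }

        @Override public int hashCode() {
            return 31 * x + y;
        }
    }

    static class ColorPoint extends Point {
        private final Color color;

        ColorPoint(int x, int y, Color color) {
            super(x, y);
            this.color = color;
        }

        // Broken - violates transitivity!
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point))
                return false;
            // Plain Point: color-blind comparison
            if (!(o instanceof ColorPoint))
                return o.equals(this);
            // ColorPoint: full comparison
            return super.equals(o) && ((ColorPoint) o).color == color;
        }
    }

    public static void main(String[] args) {
        ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
        Point p2 = new Point(1, 2);
        ColorPoint p3 = new ColorPoint(1, 2, Color.BLUE);

        System.out.println(p1.equals(p2)); // true  (color-blind)
        System.out.println(p2.equals(p3)); // true  (color-blind)
        System.out.println(p1.equals(p3)); // false (colors differ)
    }
}
```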

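There is no way to extend an instantiable class and add a value component while preserving the equals contract. You may hear it said that you can do so by using a getClass test in place of the instanceof test in the equals method: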

// Broken - violates Liskov substitution principle
@Override public boolean equals(Object o) {
    if (o == null || o.getClass() != getClass())
        return false;
    Point p = (Point) o;
    return p.x == x && p.y == y;
}

The Liskov substitution principle says that any important property of a type should also hold for its subtypes, so that any method written for the type should work equally well on its subtypes [Liskov87].
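
Since neither the instanceof variants nor the getClass variant works for a ColorPoint subclass, the workaround Effective Java suggests is composition instead of inheritance (its Item 16): give ColorPoint a private Point field and a view method that returns it. The sketch below follows that idea. The CompositionDemo wrapper and the Color enum are assumptions made for the example.

```java
class CompositionDemo {
    // Stand-in color type (an assumption for this sketch).
    enum Color { RED, BLUE }

    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point))
                return false;
            Point p = (Point) o;
            return p.x == x && p.y == y;
        }

        @Override public int hashCode() {
            return 31 * x + y;
        }
    }

    // Adds a value component without extending Point.
    static class ColorPoint {
        private final Point point;
        private final Color color;

        ColorPoint(int x, int y, Color color) {
            if (color == null)
                throw new NullPointerException();
            this.point = new Point(x, y);
            this.color = color;
        }

        // Returns the point view of this color point.
        Point asPoint() {
            return point;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ColorPoint))
                return false;
            ColorPoint cp = (ColorPoint) o;
            return cp.point.equals(point) && cp.color.equals(color);
        }

        @Override public int hashCode() {
            return 31 * point.hashCode() + color.hashCode();
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        ColorPoint cp = new ColorPoint(1, 2, Color.RED);

        System.out.println(p.equals(cp));           // false
        System.out.println(cp.equals(p));           // false (symmetric)
        System.out.println(cp.asPoint().equals(p)); // true, via the view
    }
}
```

A ColorPoint is no longer a Point, so mixed comparisons are consistently false, and callers who want a color-blind comparison ask for the view explicitly.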

There are some classes in the Java platform libraries that do extend an instantiable class and add a value component. For example, java.sql.Timestamp extends java.util.Date and adds a nanoseconds field. The equals implementation for Timestamp does violate symmetry and can cause erratic behavior if Timestamp and Date objects are used in the same collection or are otherwise intermixed.
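
The Timestamp asymmetry is easy to reproduce. This is a minimal sketch, assuming the java.sql module is available on the runtime; the class name TimestampSymmetryDemo is made up.

```java
import java.sql.Timestamp;
import java.util.Date;

class TimestampSymmetryDemo {
    public static void main(String[] args) {
        // Both objects represent the same instant: 1000 ms after the epoch.
        Date date = new Date(1000L);
        Timestamp stamp = new Timestamp(1000L);

        // Date.equals accepts any Date (including a Timestamp) with the same time.
        System.out.println(date.equals(stamp)); // true
        // Timestamp.equals returns false for anything that is not a Timestamp.
        System.out.println(stamp.equals(date)); // false
    }
}
```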

Consistent: For any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.

If two objects are equal, they must remain equal for all time unless one (or both) of them is modified. In other words, mutable objects can be equal to different objects at different times while immutable objects can’t.
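
A short sketch of what consistency means for a mutable type, using two ArrayList instances (the class name ConsistencyDemo is made up): repeated comparisons agree until one of the lists is modified.

```java
import java.util.ArrayList;
import java.util.List;

class ConsistencyDemo {
    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        a.add("x");
        b.add("x");

        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(b)); // true again: nothing was modified

        b.add("y");
        System.out.println(a.equals(b)); // false: b was modified in between
    }
}
```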

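Whether or not a class is immutable, do not write an equals method that depends on unreliable resources. For example, java.net.URL’s equals method relies on comparing the IP addresses of the hosts associated with the URLs. Translating a host name to an IP address can require network access, and it isn’t guaranteed to yield the same results over time.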

Non-nullity: For any non-null reference value x, x.equals(null) must return false.

The final requirement, which in the absence of a name I have taken the liberty of calling “non-nullity,” says that all objects must be unequal to null.

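While it is hard to imagine accidentally returning true for o.equals(null), it isn’t hard to imagine accidentally throwing a NullPointerException, which the contract does not allow. Many classes have equals methods that guard against this with an explicit test for null: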

@Override public boolean equals(Object o) {
    if (o == null)
        return false;
    ...
}

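This test is unnecessary. To test its argument for equality, the equals method must first cast its argument to an appropriate type so its accessors may be invoked or its fields accessed. Before doing the cast, the method must use the instanceof operator to check that its argument is of the correct type. Because instanceof is specified to return false when its first operand is null, whatever type appears as the second operand, the type check also rejects null: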

@Override public boolean equals(Object o) {
    if (!(o instanceof MyType))
        return false;
    MyType mt = (MyType) o;
    ...
}
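
A runnable version of the snippet above makes the point concrete. MyType is the placeholder name from the text; its single value field and the NullCheckDemo wrapper are assumptions added so the sketch compiles.

```java
class NullCheckDemo {
    static final class MyType {
        private final int value; // hypothetical field, for illustration only

        MyType(int value) {
            this.value = value;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof MyType)) // also false when o is null
                return false;
            MyType mt = (MyType) o;
            return mt.value == value;
        }

        @Override public int hashCode() {
            return value;
        }
    }

    public static void main(String[] args) {
        Object nothing = null;
        System.out.println(nothing instanceof MyType);           // false
        System.out.println(new MyType(1).equals(null));          // false, no exception
        System.out.println(new MyType(1).equals(new MyType(1))); // true
    }
}
```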

Putting it all together, here’s a recipe for a high-quality equals method:

1. Use the == operator to check if the argument is a reference to this object.
If so, return true. This is just a performance optimization, but one that is worth doing if the comparison is potentially expensive.
2. Use the instanceof operator to check if the argument has the correct type.
If not, return false. Typically, the correct type is the class in which the method occurs. Occasionally, it is some interface implemented by this class. Use an interface if the class implements an interface that refines the equals contract to permit comparisons across classes that implement the interface. Collection interfaces such as Set, List, Map, and Map.Entry have this property.
3. Cast the argument to the correct type.
Because this cast was preceded by an instanceof test, it is guaranteed to succeed.
4. For each “significant” field in the class, check if that field of the argument matches the corresponding field of this object.
If all these tests succeed, return true; otherwise, return false. If the type in step 2 is an interface, you must access the argument’s fields via interface methods; if the type is a class, you may be able to access the fields directly, depending on their accessibility.
5. When you are finished writing your equals method, ask yourself three questions:
Is it symmetric? Is it transitive? Is it consistent? And don’t just ask yourself; write unit tests to check that these properties hold! If they don’t, figure out why not, and modify the equals method accordingly. Of course your equals method also has to satisfy the other two properties (reflexivity and “non-nullity”), but these two usually take care of themselves.
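
The five steps can be sketched as follows. This is an illustration loosely modeled on a phone-number value class, not a copy of the code in Item 9; the field names, the int field types, and the EqualsRecipeCheck driver are assumptions.

```java
final class PhoneNumber {
    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    @Override public boolean equals(Object o) {
        if (o == this)                     // Step 1: same reference
            return true;
        if (!(o instanceof PhoneNumber))   // Step 2: correct type (rejects null too)
            return false;
        PhoneNumber pn = (PhoneNumber) o;  // Step 3: safe cast
        return pn.lineNumber == lineNumber // Step 4: compare significant fields
            && pn.prefix == prefix
            && pn.areaCode == areaCode;
    }

    // Always override hashCode together with equals.
    @Override public int hashCode() {
        int result = 17;
        result = 31 * result + areaCode;
        result = 31 * result + prefix;
        result = 31 * result + lineNumber;
        return result;
    }
}

class EqualsRecipeCheck {
    public static void main(String[] args) {
        PhoneNumber a = new PhoneNumber(707, 867, 5309);
        PhoneNumber b = new PhoneNumber(707, 867, 5309);
        PhoneNumber c = new PhoneNumber(707, 867, 5309);
        PhoneNumber other = new PhoneNumber(408, 867, 5309);

        // Step 5: each line prints true when the property holds.
        System.out.println(a.equals(b) == b.equals(a));                // symmetric
        System.out.println(a.equals(b) && b.equals(c) && a.equals(c)); // transitive
        System.out.println(a.equals(b) == a.equals(b));                // consistent
        System.out.println(a.equals(a) && !a.equals(null));            // reflexive, non-null
        System.out.println(!a.equals(other) && !other.equals(a));      // unequal values
    }
}
```

In a real project these checks would live in unit tests rather than in a main method.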

For a concrete example of an equals method constructed according to the above recipe, see PhoneNumber.equals in Item 9. Here are a few final caveats:

Always override hashCode when you override equals (Item 9).
Don’t try to be too clever.
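If you simply test fields for equality, it’s not hard to adhere to the equals contract. It is generally a bad idea to take any form of aliasing into account. For example, the File class shouldn’t attempt to equate symbolic links referring to the same file. Thankfully, it doesn’t.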
Don’t substitute another type for Object in the equals declaration.
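It is not uncommon for a programmer to write an equals method that looks like the one below, which overloads Object.equals instead of overriding it because its parameter type is not Object, and then spend hours puzzling over why it doesn’t work properly: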
public boolean equals(MyClass o) {
    ...
}
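
A minimal sketch of how this mistake shows up in practice. MyClass is the placeholder name from the text; the id field and the OverloadDemo wrapper are assumptions. The method below is selected only when the compile-time argument type is MyClass; everywhere else, including inside the collections framework, the inherited Object.equals runs.

```java
import java.util.ArrayList;
import java.util.List;

class OverloadDemo {
    static class MyClass {
        private final int id; // hypothetical field, for illustration only

        MyClass(int id) {
            this.id = id;
        }

        // Overloads rather than overrides: the parameter type is MyClass, not Object.
        // Adding @Override here would turn the mistake into a compile error.
        public boolean equals(MyClass o) {
            return o != null && o.id == id;
        }
    }

    public static void main(String[] args) {
        MyClass a = new MyClass(1);
        MyClass b = new MyClass(1);
        Object bAsObject = b;

        System.out.println(a.equals(b));         // true: the overload is chosen
        System.out.println(a.equals(bAsObject)); // false: Object.equals compares identity

        List<MyClass> list = new ArrayList<>();
        list.add(a);
        System.out.println(list.contains(b));    // false: collections call equals(Object)
    }
}
```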

Reference: Effective Java 2nd Edition by Joshua Bloch