A common
source of bugs is the failure to override the hashCode
method.
You
must override hashCode in every class
that overrides equals.
Here is the
contract, copied from the Object specification
[JavaSE6]:
• Whenever it
is invoked on the same object more than once during an execution of an
application, the hashCode method must
consistently return the same integer.
• If two
objects are equal according to the equals(Object)
method,
then calling the hashCode method on each
of the two objects must produce the same integer result.
• It is not
required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must
produce distinct integer results.
The key
provision that is violated when you fail to override hashCode is the second one: equal objects
must have equal hash codes.
For example,
consider the following simplistic PhoneNumber
class,
whose equals method is
constructed according to the recipe in Item 8, but doesn’t override hashCode
method. Suppose you attempt to use this class with a HashMap:
Map<PhoneNumber, String> m
= new HashMap<PhoneNumber, String>();
m.put(new PhoneNumber(707, 867, 5309),
"Jenny");
At this point,
you might expect m.get(new PhoneNumber(707, 867, 5309))
to
return "Jenny", but it
returns null. Notice that two PhoneNumber instances are involved: one is used for
insertion into the HashMap, and a
second, equal, instance is used for (attempted) retrieval. The PhoneNumber class’s failure to override hashCode causes the two equal instances to have
unequal hash codes, in violation of the hashCode
contract.
Fixing this
problem is as simple as providing a proper hashCode
method
for the PhoneNumber class. So what
should a hashCode method look
like? It’s trivial to write one that is legal but not good. This one, for
example, is always legal but should never be used:
// The worst possible legal hash function - never
use!
@Override public int hashCode() { return 42; }
It ensures
that every object has the same hash code. Therefore, every
object hashes to the same bucket.
Luckily it’s
not too difficult to achieve a fair approximation. Here is a simple recipe:
1. Store some
constant nonzero value, say, 17, in an int
variable
called result.
2. For each
significant field f in your object (each field taken
into account by the equals method, that
is), do the following:
a. Compute an int hash code c
for
the field:
i. If the field is a boolean, compute (f
? 1 : 0).
ii. If the field is a byte, char, short, or int, compute (int) f.
iii. If the field is a long, compute (int)
(f ^ (f >>> 32)).
iv. If the field is a float, compute Float.floatToIntBits(f).
v. If the field is a double, compute Double.doubleToLongBits(f), and then
hash the resulting long as in step 2.a.iii.
vi. If the field is an object reference and
this class’s equals method
compares the field by recursively invoking equals, recursively
invoke hashCode on the field.
If a more complex comparison is required, compute a “canonical representation”
for this field and
invoke hashCode
on
the canonical representation. If the value of the field is null, return 0
(or
some other constant, but 0 is traditional).
vii. If the field is an array, treat it as if
each element were a separate field. That is, compute a hash code for each
significant element by applying these rules recursively, and combine these
values per step 2.b. If every element in an array field is significant, you can
use one of the Arrays.hashCode methods added
in release 1.5.
b. Combine the
hash code c computed in step 2.a into result as follows:
result = 31 * result + c;
3. Return result.
4. When you
are finished writing the hashCode method, ask
yourself whether equal instances have equal hash codes. Write unit tests to
verify your intuition! If equal instances have unequal hash codes, figure out
why and fix the problem.
The value 17
is arbitrary. The value 31 was chosen because it is an odd prime. If it were
even and the multiplication overflowed, information would be lost, as
multiplication by 2 is equivalent to shifting. The advantage of using a prime
is less clear, but it is traditional. A nice property of 31 is that the
multiplication can be replaced by a shift and a subtraction for better
performance:
31 * i == (i << 5) - i. Modern VMs
do this sort of optimization automatically. Let’s apply the above recipe to the
PhoneNumber class. There
are three significant fields, all of type short:
@Override public int hashCode() {
int result = 17;
result = 31 * result + areaCode;
result = 31 * result + prefix;
result = 31 * result + lineNumber;
return result;
}
If a class is
immutable and the cost of computing the hash code is significant, you might
consider caching the hash code in the object rather than recalculating it each
time it is requested. You might choose to lazily initialize
it
the first time hashCode is invoked (Item
71).
// Lazily initialized, cached hashCode
private volatile int hashCode; // (See Item 71)
@Override public int hashCode() {
int result = hashCode;
if (result == 0) {
result = 17;
result = 31 * result + areaCode;
result = 31 * result + prefix;
result = 31 * result + lineNumber;
hashCode = result;
}
return result;
}
Do not be
tempted to exclude significant parts of an object from the hash code
computation to improve performance.
Reference: Effective Java 2nd Edition by Joshua Bloch