Saturday, 7 July 2012

Item 12: Consider implementing Comparable


By implementing Comparable, a class indicates that its instances have a natural ordering. Sorting an array of objects that implement Comparable is as simple as this:

Arrays.sort(a);

the following program, which relies on the fact that String implements Comparable, prints an
alphabetized list of its command-line arguments with duplicates eliminated:

public class WordList {
public static void main(String[] args) {
Set<String> s = new TreeSet<String>();
Collections.addAll(s, args);
System.out.println(s);
}
}

If you are writing a value class with an obvious natural ordering, such as alphabetical order, numerical order, or chronological order, you should strongly consider implementing the interface:

public interface Comparable<T> {
int compareTo(T t);
}

The general contract of the compareTo method is similar to that of equals:

Compares this object with the specified object for order. Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object. Throws ClassCastException if the specified object’s type prevents it from being compared to this object. In the following description, the notation sgn(expression) designates the mathematical signum function, which is defined to return -1, 0, or 1, according to whether the value of expression is negative, zero, or positive.

• The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compare-To(x)) for all x and y. (This implies that x.compareTo(y) must throw an exception if and only if y.compareTo(x) throws an exception.)

• The implementor must also ensure that the relation is transitive: (x.compareTo( y) > 0 && y.compareTo(z) > 0) implies x.compareTo(z) > 0.

• Finally, the implementor must ensure that x.compareTo(y) == 0 implies that sgn(x.compareTo(z)) == sgn(y.compareTo(z)), for all z.

• It is strongly recommended, but not strictly required, that (x.compareTo(y) == 0) == (x.equals(y)). Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is “Note: This class has a natural ordering that is inconsistent with equals.”

A class that violates the compareTo contract can break other classes that depend on comparison.Classes that depend on comparison include the sorted collections TreeSet and TreeMap, and the utility classes Collections and Arrays, which contain searching and sorting algorithms.

Let’s go over the provisions of the compareTo contract. The first provision says that if you reverse the direction of a comparison between two object references, the expected thing happens: if the first object is less than the second, then the second must be greater than the first; if the first object is equal to the second, then the second must be equal to the first; and if the first object is greater than the second, then the second must be less than the first. The second provision says that if one object is greater than a second, and the second is greater than a third, then the first must be greater than the third. The final provision says that all objects that compare as equal must yield the same results when compared to any other object.

One consequence of these three provisions is that the equality test imposed by a compareTo method must obey the same restrictions imposed by the equals contract: reflexivity, symmetry, and transitivity.

If this provision is obeyed, the ordering imposed by the compareTo method is said to be consistent with equals. If it’s violated, the ordering is said to be inconsistent with equals.

For example, consider the BigDecimal class, whose compareTo method is inconsistent with equals. If you create a HashSet instance and add new BigDecimal("1.0") and new BigDecimal("1.00"), the set will contain two elements because the two BigDecimal instances added to the set are unequal when compared using the equals method. If, however, you perform the same procedure using a TreeSet instead of a HashSet, the set will contain only one element because the two BigDecimal instances are equal when compared using the compareTo method.

public final class CaseInsensitiveString
implements Comparable<CaseInsensitiveString> {
public int compareTo(CaseInsensitiveString cis) {
return String.CASE_INSENSITIVE_ORDER.compare(s, cis.s);
}
... // Remainder omitted
}

Note that the CaseInsensitiveString class implements Comparable< CaseInsensitiveString>. This means that a CaseInsensitiveString reference can be compared only to other CaseInsensitiveString references.

If a class has multiple significant fields, the order in which you compare them is critical. You must start with the most significant field and work your way down. If a comparison results in anything other than zero (which represents equality), you’re done; just return the result. If the most significant fields are equal, go on to compare the next-most-significant fields, and so on. If all fields are equal, the objects are equal; return zero. The technique is demonstrated by this compareTo method for the PhoneNumber class in Item 9:

public int compareTo(PhoneNumber pn) {
// Compare area codes
if (areaCode < pn.areaCode)
return -1;
if (areaCode > pn.areaCode)
return 1;
// Area codes are equal, compare prefixes
if (prefix < pn.prefix)
return -1;
if (prefix > pn.prefix)
return 1;
// Area codes and prefixes are equal, compare line numbers
if (lineNumber < pn.lineNumber)
return -1;
if (lineNumber > pn.lineNumber)
return 1;
return 0; // All fields are equal
}

While this method works, it can be improved. Recall that the contract for compareTo does not specify the magnitude of the return value, only the sign. You can take advantage of this to simplify the code and probably make it run a bit faster:

public int compareTo(PhoneNumber pn) {
// Compare area codes
int areaCodeDiff = areaCode - pn.areaCode;
if (areaCodeDiff != 0)
return areaCodeDiff;
// Area codes are equal, compare prefixes
int prefixDiff = prefix - pn.prefix;
if (prefixDiff != 0)
return prefixDiff;
// Area codes and prefixes are equal, compare line numbers
return lineNumber - pn.lineNumber;
}

This trick works fine here but should be used with extreme caution. Don’t use it unless you’re certain the fields in question are non-negative or, more generally, that the difference between the lowest and highest possible field values is less than or equal to Integer.MAX_VALUE (231-1). The reason this trick doesn’t always work is that a signed 32-bit integer isn’t big enough to hold the difference between two arbitrary signed 32-bit integers.

Reference: Effective Java 2nd Edition by Joshua Bloch