Saturday, 7 July 2012

Item 49: Prefer primitive types to boxed primitives


Java has a two-part type system, consisting of primitives, such as int, double, and boolean, and reference types, such as String and List. Every primitive type has a corresponding reference type, called a boxed primitive. The boxed primitives corresponding to int, double, and boolean are Integer, Double, and Boolean.

There are three major differences between primitives and boxed primitives. First, primitives have only their values, whereas boxed primitives have identities distinct from their values. In other words, two boxed primitive instances can have the same value and different identities. Second, primitive types have only fully functional values, whereas each boxed primitive type has one nonfunctional value, which is null, in addition to all of the functional values of its corresponding primitive type. Last, primitives are generally more time- and space-efficient than boxed primitives. All three of these differences can get you into real trouble if you aren’t careful.

Consider the following comparator, which is designed to represent ascending numerical order on Integer values.

// Broken comparator - can you spot the flaw?
Comparator<Integer> naturalOrder = new Comparator<Integer>() {
public int compare(Integer first, Integer second) {
return first < second ? -1 : (first == second ? 0 : 1);
}
};

This comparator looks good on the face of it, and it will pass many tests. For example, it can be used with Collections.sort to correctly sort a million-element list, whether or not the list contains duplicate elements. But this comparator is deeply flawed. To convince yourself of this, merely print the value of natural- Order.compare(new Integer(42), new Integer(42)). Both Integer instances represent the same value (42), so the value of this expression should be 0, but it’s 1, which indicates that the first Integer value is greater than the second.

So what’s the problem? The first test in naturalOrder works fine. Evaluating the expression first < second causes the Integer instances referred to by first and second to be auto-unboxed; that is, it extracts their primitive values. The evaluation proceeds to check if the first of the resulting int values is less than the second. But suppose it is not. Then the next test evaluates the expression first == second, which performs an identity comparison on the two object references.
If first and second refer to distinct Integer instances that represent the same int value, this comparison will return false, and the comparator will incorrectly return 1, indicating that the first Integer value is greater than the second. Applying the == operator to boxed primitives is almost always wrong.

The clearest way to fix the problem is to add two local variables, to store the primitive int values corresponding to first and second, and to perform all of the comparisons on these variables. This avoids the erroneous identity comparison:

Comparator<Integer> naturalOrder = new Comparator<Integer>() {
public int compare(Integer first, Integer second) {
int f = first; // Auto-unboxing
int s = second; // Auto-unboxing
return f < s ? -1 : (f == s ? 0 : 1); // No unboxing
}
};

Next, consider this little program:

public class Unbelievable {
static Integer i;
public static void main(String[] args) {
if (i == 42)
System.out.println("Unbelievable");
}
}

No, it doesn’t print Unbelievable—but what it does is almost as strange. It throws a NullPointerException when evaluating the expression (i == 42). The problem is that i is an Integer, not an int, and like all object reference fields, its initial value is null. When the program evaluates the expression (i == 42), it is comparing an Integer to an int. In nearly every case when you mix primitives and boxed primitives in a single operation, the boxed primitive is auto unboxed, and this case is no exception. If a null object reference is auto-unboxed, you get a NullPointerException.

Finally, consider the program from Item 5:

// Hideously slow program! Can you spot the object creation?
public static void main(String[] args) {
Long sum = 0L;
for (long i = 0; i < Integer.MAX_VALUE; i++) {
sum += i;
}
System.out.println(sum);
}

This program is much slower than it should be because it accidentally declares a local variable (sum) to be of the boxed primitive type Long instead of the primitive type long. The program compiles without error or warning, and the variable is repeatedly boxed and unboxed, causing the observed performance degradation.

In summary, use primitives in preference to boxed primitives whenever you have the choice. Primitive types are simpler and faster. If you must use boxed primitives, be careful! Autoboxing reduces the verbosity, but not the danger, of using boxed primitives. When your program compares two boxed primitives with the == operator, it does an identity comparison, which is almost certainly not what you want. When your program does mixed-type computations involving boxed and unboxed primitives, it does unboxing, and when your program does unboxing, it can throw a NullPointerException. Finally, when your program boxes primitive values, it can result in costly and unnecessary object creations.


Reference: Effective Java 2nd Edition by Joshua Bloch