Saturday, 7 July 2012

Item 42: Use varargs judiciously


In release 1.5, varargs methods, formally known as variable arity methods [JLS, 8.4.1], were added to the language. Varargs methods accept zero or more arguments of a specified type. The varargs facility works by first creating an array whose size is the number of arguments passed at the call site, then putting the argument values into the array, and finally passing the array to the method.

For example, here is a varargs method that takes a sequence of int arguments and returns their sum. As you would expect, the value of sum(1, 2, 3) is 6, and the value of sum() is 0:

// Simple use of varargs
static int sum(int... args) {
    int sum = 0;
    for (int arg : args)
        sum += arg;
    return sum;
}
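To make the array-creation step concrete, here is a minimal sketch that calls the sum method above in three ways. The class name SumCallSites is made up for illustration; only sum comes from the text. In every case the method body sees an ordinary int array.

```java
// Hypothetical demo class; only the sum method is taken from the text above.
class SumCallSites {
    static int sum(int... args) {
        int sum = 0;
        for (int arg : args)
            sum += arg;
        return sum;
    }

    public static void main(String[] args) {
        // The compiler packages the three arguments into a new int[3].
        System.out.println(sum(1, 2, 3));               // prints 6
        // An explicit array is accepted too and is passed through as is.
        System.out.println(sum(new int[] { 1, 2, 3 })); // prints 6
        // With no arguments the method receives an empty array, not null.
        System.out.println(sum());                      // prints 0
    }
}
```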

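Sometimes it’s appropriate to write a method that requires one or more arguments of some type, rather than zero or more. For example, a method that computes the minimum of its int arguments is not well defined when the client passes no arguments. You could check the array length at runtime: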

// The WRONG way to use varargs to pass one or more arguments!
static int min(int... args) {
    if (args.length == 0)
        throw new IllegalArgumentException("Too few arguments");
    int min = args[0];
    for (int i = 1; i < args.length; i++)
        if (args[i] < min)
            min = args[i];
    return min;
}

This solution has several problems. The most serious is that if the client invokes this method with no arguments, it fails at runtime rather than compile time. Another problem is that it is ugly. You have to include an explicit validity check on args, and you can’t use a for-each loop unless you initialize min to Integer.MAX_VALUE, which is also ugly.

Luckily there’s a much better way to achieve the desired effect. Declare the method to take two parameters, one normal parameter of the specified type and one varargs parameter of this type. This solution corrects all the deficiencies of the previous one:

// The right way to use varargs to pass one or more arguments
static int min(int firstArg, int... remainingArgs) {
    int min = firstArg;
    for (int arg : remainingArgs)
        if (arg < min)
            min = arg;
    return min;
}
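The following sketch shows how this two-parameter min behaves at the call site. The class name MinCallSites is hypothetical; the method is the one shown above. The point is that the zero-argument call is now rejected by the compiler instead of failing at runtime.

```java
// Hypothetical demo class; only the min method is taken from the text above.
class MinCallSites {
    static int min(int firstArg, int... remainingArgs) {
        int min = firstArg;
        for (int arg : remainingArgs)
            if (arg < min)
                min = arg;
        return min;
    }

    public static void main(String[] args) {
        // One argument: remainingArgs is an empty array.
        System.out.println(min(7));       // prints 7
        // Several arguments: the first binds to firstArg, the rest go into the array.
        System.out.println(min(3, 1, 4)); // prints 1
        // min();  // does not compile: firstArg has no corresponding argument
    }
}
```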

As you can see from this example, varargs are effective in circumstances where you really do want a method with a variable number of arguments. Varargs were designed for printf, which was added to the platform in release 1.5, and for the core reflection facility (Item 53), which was retrofitted to take advantage of varargs in that release. Both printf and reflection benefit enormously from varargs.

You can retrofit an existing method that takes an array as its final parameter to take a varargs parameter instead with no effect on existing clients. But just because you can doesn’t mean that you should! Consider the case of Arrays.asList. This method was never designed to gather multiple arguments into a list, but it seemed like a good idea to retrofit it to do so when varargs were added to the platform. As a result, it became possible to do things like this:

List<String> homophones = Arrays.asList("to", "too", "two");

This usage works, but it was a big mistake to enable it. Prior to release 1.5, this was a common idiom to print the contents of an array:

// Obsolete idiom to print an array!
System.out.println(Arrays.asList(myArray));

The idiom was necessary because arrays inherit their toString implementation from Object, so calling toString directly on an array produces a useless string such as [Ljava.lang.Integer;@3e25a5. The idiom worked only on arrays of object reference types, but if you accidentally tried it on an array of primitives, the program wouldn’t compile. For example, this program:

public static void main(String[] args) {
    int[] digits = { 3, 1, 4, 1, 5, 9, 2, 6, 5, 4 };
    System.out.println(Arrays.asList(digits));
}

would generate this error message in release 1.4:

Va.java:6: asList(Object[]) in Arrays can't be applied to (int[])
System.out.println(Arrays.asList(digits));
^

Because of the unfortunate decision to retrofit Arrays.asList as a varargs method in release 1.5, this program now compiles without error or warning. Running the program, however, produces output that is both unintended and useless: [[I@3e25a5]. The Arrays.asList method, now “enhanced” to use varargs, gathers up the object reference to the int array digits into a one-element array of arrays and dutifully wraps it into a List<int[]> instance. Printing this list causes toString to be invoked on the list, which in turn causes toString to be invoked on its sole element, the int array, with the unfortunate result described above.
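The difference between the two inferred element types can be seen by comparing list sizes. This is a small sketch; the class AsListPitfall and its two helper methods are hypothetical names chosen for illustration, while Arrays.asList is the real library method.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class contrasting primitive and boxed arrays.
class AsListPitfall {
    // T is inferred as int[], so the whole array becomes the list's only element.
    static int wrappedSize(int[] primitives) {
        List<int[]> wrapped = Arrays.asList(primitives);
        return wrapped.size();
    }

    // T is inferred as Integer, so each array element becomes a list element.
    static int elementCount(Integer[] boxed) {
        List<Integer> elements = Arrays.asList(boxed);
        return elements.size();
    }

    public static void main(String[] args) {
        System.out.println(wrappedSize(new int[] { 3, 1, 4 }));      // prints 1
        System.out.println(elementCount(new Integer[] { 3, 1, 4 })); // prints 3
    }
}
```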

If you use Arrays.toString in place of Arrays.asList, the program produces the intended result:

// The right way to print an array
System.out.println(Arrays.toString(myArray));

Instead of retrofitting Arrays.asList, it would have been better to add a new method to Collections specifically for the purpose of gathering its arguments into a list:

public static <T> List<T> gather(T... args) {
    return Arrays.asList(args);
}

Such a method would have provided the capability to gather without compromising the type-checking of the existing Arrays.asList method. The lesson is clear. Don’t retrofit every method that has a final array parameter; use varargs only when a call really operates on a variable-length sequence of values. Two method signatures are particularly suspect:

ReturnType1 suspect1(Object... args) { }
<T> ReturnType2 suspect2(T... args) { }

Methods with either of these signatures will accept any parameter list. Any compile-time type-checking that you had prior to the retrofit will be lost, as demonstrated by what happened to Arrays.asList.
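As a rough illustration of how permissive such a signature is, consider the sketch below. The method count and the class SuspectSignatures are invented for this example and follow the shape of suspect1. Every call compiles, including the one where an int array silently becomes a single argument.

```java
// Hypothetical demo class; count has the same shape as suspect1 above.
class SuspectSignatures {
    static int count(Object... args) {
        return args.length;
    }

    public static void main(String[] args) {
        // Unrelated types are all accepted (the primitives are boxed).
        System.out.println(count("a", 1, 2.0));             // prints 3
        // An int[] is not an Object[], so it is treated as one Object argument.
        System.out.println(count(new int[] { 1, 2, 3 }));   // prints 1
        // No arguments at all is accepted as well.
        System.out.println(count());                        // prints 0
    }
}
```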

Exercise care when using the varargs facility in performance-critical situations. Every invocation of a varargs method causes an array allocation and initialization. If you have determined empirically that you can’t afford this cost but you need the flexibility of varargs, there is a pattern that lets you have your cake and eat it too. Suppose you’ve determined that 95 percent of the calls to a method have three or fewer parameters. Then declare five overloadings of the method, one each with zero through three ordinary parameters, and a single varargs method for use when the number of arguments exceeds three:

public void foo() { }
public void foo(int a1) { }
public void foo(int a1, int a2) { }
public void foo(int a1, int a2, int a3) { }
public void foo(int a1, int a2, int a3, int... rest) { }

Now you know that you’ll pay the cost of the array creation only in the 5 percent of all invocations where the number of parameters exceeds three. Like most performance optimizations, this technique usually isn’t appropriate, but when it is, it’s a lifesaver.
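The sketch below applies the overloading pattern to a summing method and uses a counter to show which overload the compiler selects. The class OverloadedSum and the varargsCalls field are hypothetical and exist only for the demonstration. It relies on the fact that Java's overload resolution considers fixed-arity methods before variable-arity ones, so a call with three or fewer arguments never reaches the varargs overload.

```java
// Hypothetical demo class; the counter exists only to show which overload ran.
class OverloadedSum {
    static int varargsCalls = 0;

    static int sum() { return 0; }
    static int sum(int a1) { return a1; }
    static int sum(int a1, int a2) { return a1 + a2; }
    static int sum(int a1, int a2, int a3) { return a1 + a2 + a3; }

    // Only this overload causes an array to be allocated at the call site.
    static int sum(int a1, int a2, int a3, int... rest) {
        varargsCalls++;
        int total = a1 + a2 + a3;
        for (int r : rest)
            total += r;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 2, 3));       // prints 6
        System.out.println(varargsCalls);       // prints 0
        System.out.println(sum(1, 2, 3, 4, 5)); // prints 15
        System.out.println(varargsCalls);       // prints 1
    }
}
```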

The EnumSet class uses this technique for its static factories to reduce the cost of creating enum sets to a bare minimum. It was appropriate to do this because it was critical that enum sets provide performance-competitive replacements for bit fields (Item 32).

In summary, varargs methods are a convenient way to define methods that require a variable number of arguments, but they should not be overused. They can produce confusing results if used inappropriately.



Reference: Effective Java 2nd Edition by Joshua Bloch