Saturday, 7 July 2012

Item 78: Consider serialization proxies instead of serialized instances


As mentioned in Item 74 and discussed throughout this chapter, the decision to implement Serializable increases the likelihood of bugs and security problems, because it causes instances to be created using an extralinguistic mechanism in place of ordinary constructors. There is, however, a technique that greatly reduces these risks. This technique is known as the serialization proxy pattern.

The serialization proxy pattern is reasonably straightforward. First, design a private static nested class of the serializable class that concisely represents the logical state of an instance of the enclosing class. This nested class, known as the serialization proxy, should have a single constructor, whose parameter type is the enclosing class. This constructor merely copies the data from its argument: it need not do any consistency checking or defensive copying. By design, the default serialized form of the serialization proxy is the perfect serialized form of the enclosing class. Both the enclosing class and its serialization proxy must be declared to implement Serializable.

For example, consider the immutable Period class written in Item 39 and made serializable in Item 76. Here is a serialization proxy for this class. Period is so simple that its serialization proxy has exactly the same fields as the class:

// Serialization proxy for Period class
private static class SerializationProxy implements Serializable {
private final Date start;
private final Date end;
SerializationProxy(Period p) {
this.start = p.start;
this.end = p.end;
}
private static final long serialVersionUID =
234098243823485285L; // Any number will do (Item 75)
}

Next, add the following writeReplace method to the enclosing class. This method can be copied verbatim into any class with a serialization proxy:

// writeReplace method for the serialization proxy pattern
private Object writeReplace() {
return new SerializationProxy(this);
}


The presence of this method causes the serialization system to emit a SerializationProxy instance instead of an instance of the enclosing class. In other words, the writeReplace method translates an instance of the enclosing class to its serialization proxy prior to serialization.

With this writeReplace method in place, the serialization system will never generate a serialized instance of the enclosing class, but an attacker might fabricate one in an attempt to violate the class’s invariants. To guarantee that such an attack would fail, merely add this readObject method to the enclosing class:

// readObject method for the serialization proxy pattern
private void readObject(ObjectInputStream stream)
throws InvalidObjectException {
throw new InvalidObjectException("Proxy required");
}

Finally, provide a readResolve method on the SerializationProxy class that returns a logically equivalent instance of the enclosing class. The presence of this method causes the serialization system to translate the serialization proxy back into an instance of the enclosing class upon deserialization.

Here is the readResolve method for Period.SerializationProxy above:

// readResolve method for Period.SerializationProxy
private Object readResolve() {
return new Period(start, end); // Uses public constructor
}

There is another way in which the serialization proxy pattern is more powerful than defensive copying. The serialization proxy pattern allows the deserialized instance to have a different class from the originally serialized instance. You might not think that this would be useful in practice, but it is.

Consider the case of EnumSet (Item 32). This class has no public constructors, only static factories. From the client’s perspective, they return EnumSet instances, but in fact, they return one of two subclasses, depending on the size of the underlying enum type (Item 1). If the underlying enum type has sixty-four or fewer elements, the static factories return a RegularEnumSet; otherwise, they return a JumboEnumSet. Now consider what happens if you serialize an enum set whose enum type has sixty elements, then add five more elements to the enum type, and then deserialize the enum set. It was a RegularEnumSet instance when it was serialized, but it had better be a JumboEnumSet instance once it is deserialized. In fact that’s exactly what happens, because EnumSet uses the serialization proxy pattern. In case you’re curious, here is EnumSet’s serialization proxy. It really is this simple:

// EnumSet's serialization proxy
private static class SerializationProxy <E extends Enum<E>>
implements Serializable {
// The element type of this enum set.
private final Class<E> elementType;
// The elements contained in this enum set.
private final Enum[] elements;
SerializationProxy(EnumSet<E> set) {
elementType = set.elementType;
elements = set.toArray(EMPTY_ENUM_ARRAY); // (Item 43)
}
private Object readResolve() {
EnumSet<E> result = EnumSet.noneOf(elementType);
for (Enum e : elements)
result.add((E)e);
return result;
}
private static final long serialVersionUID =
362491234563181265L;
}

The serialization proxy pattern has two limitations. It is not compatible with classes that are extendable by their clients (Item 17). Also, it is not compatible with some classes whose object graphs contain circularities: if you attempt to invoke a method on an object from within its serialization proxy’s readResolve method, you’ll get a ClassCastException, as you don’t have the object yet, only its serialization proxy.

Finally, the added power and safety of the serialization proxy pattern are not free. On my machine, it is 14 percent more expensive to serialize and deserialize Period instances with serialization proxies than it is with defensive copying.

In summary, consider the serialization proxy pattern whenever you find yourself having to write a readObject or writeObject method on a class that is not extendable by its clients. This pattern is perhaps the easiest way to robustly serialize objects with nontrivial invariants.


Reference: Effective Java 2nd Edition by Joshua Bloch