Saturday, 7 July 2012

Item 67: Avoid excessive synchronization


Item 66 warns of the dangers of insufficient synchronization. This item concerns the opposite problem. Depending on the situation, excessive synchronization can ause reduced performance, deadlock, or even nondeterministic behavior.

To avoid liveness and safety failures, never cede control to the client within a synchronized method or block. In other words, inside a synchronized region, do not invoke a method that is designed to be overridden, or one provided by a client in the form of a function object (Item 21). From the perspective of the class with the synchronized region, such methods are alien. The class has no knowledge of what the method does and has no control over it. Depending on what an alien method does, calling it from a synchronized region can cause exceptions, deadlocks, or data corruption.

To make this concrete, consider the following class, which implements an observable set wrapper. It allows clients to subscribe to notifications when elements are added to the set. This is the Observer pattern [Gamma95, p. 293]. For brevity’s sake, the class does not provide notifications when elements are removed from the set, but it would be a simple matter to provide them. This class is implemented atop the reusable ForwardingSet from Item 16:

// Broken - invokes alien method from synchronized block!
public class ObservableSet<E> extends ForwardingSet<E> {
public ObservableSet(Set<E> set) { super(set); }
private final List<SetObserver<E>> observers =
new ArrayList<SetObserver<E>>();
public void addObserver(SetObserver<E> observer) {
synchronized(observers) {
observers.add(observer);
}
}
public boolean removeObserver(SetObserver<E> observer) {
synchronized(observers) {
return observers.remove(observer);
}
}
private void notifyElementAdded(E element) {
synchronized(observers) {
for (SetObserver<E> observer : observers)
observer.added(this, element);
}
}
@Override public boolean add(E element) {
boolean added = super.add(element);
if (added)
notifyElementAdded(element);
return added;
}
@Override public boolean addAll(Collection<? extends E> c) {
boolean result = false;
for (E element : c)
result |= add(element); // calls notifyElementAdded
return result;
}
}

Observers subscribe to notifications by invoking the addObserver method and unsubscribe by invoking the removeObserver method. In both cases, an instance of this callback interface is passed to the method:

public interface SetObserver<E> {
// Invoked when an element is added to the observable set
void added(ObservableSet<E> set, E element);
}

On cursory inspection, ObservableSet appears to work. For example, the following program prints the numbers from 0 through 99:

public static void main(String[] args) {
ObservableSet<Integer> set =
new ObservableSet<Integer>(new HashSet<Integer>());
set.addObserver(new SetObserver<Integer>() {
public void added(ObservableSet<Integer> s, Integer e) {
System.out.println(e);
}
});
for (int i = 0; i < 100; i++)
set.add(i);
}

Now let’s try something a bit fancier. Suppose we replace the addObserver call with one that passes an observer that prints the Integer value that was added to the set and removes itself if the value is 23:

set.addObserver(new SetObserver<Integer>() {
public void added(ObservableSet<Integer> s, Integer e) {
System.out.println(e);
if (e == 23) s.removeObserver(this);
}
});

You might expect the program to print the numbers 0 through 23, after which the observer would unsubscribe and the program complete its work silently. What actually happens is that it prints the numbers 0 through 23, and then throws a ConcurrentModificati nException. The problem is that notifyElementAdded is in the process of iterating over the observers list when it invokes the observer’s added method. The added method calls the observable set’s removeObserver method, which in turn calls observers.remove. Now we are in trouble. We are trying to remove an element from a list in the midst of iterating over it, which is illegal. The iteration in the notifyElementAdded method is in a synchronized block to prevent concurrent modification, but it doesn’t prevent the iterating thread itself from calling back into the observable set and modifying its observers list.

Now let’s try something odd: let’s write an observer that attempts to unsubscribe, but instead of calling removeObserver directly, it engages the services of nother thread to do the deed. This observer uses an executor service (Item 68):

// Observer that uses a background thread needlessly
set.addObserver(new SetObserver<Integer>() {
public void added(final ObservableSet<Integer> s, Integer e) {
System.out.println(e);
if (e == 23) {
ExecutorService executor =
Executors.newSingleThreadExecutor();
final SetObserver<Integer> observer = this;
try {
executor.submit(new Runnable() {
public void run() {
s.removeObserver(observer);
}
}).get();
} catch (ExecutionException ex) {
throw new AssertionError(ex.getCause());
} catch (InterruptedException ex) {
throw new AssertionError(ex.getCause());
} finally {
executor.shutdown();
}
}
}
});

This time we don’t get an exception; we get a deadlock. The background thread calls s.removeObserver, which attempts to lock observers, but it can’t acquire the lock, because the main thread already has the lock. All the while, the main thread is waiting for the background thread to finish removing the observer, which explains the deadlock.

This example is contrived because there is no reason for the observer to use a background thread, but the problem is real. Invoking alien methods from synchronized regions has caused many deadlocks in real systems, such as GUI toolkits.

Luckily, it is usually not too hard to fix this sort of problem by moving alien method invocations out of synchronized blocks. For the notifyElementAdded method, this involves taking a “snapshot” of the observers list that can then be safely traversed without a lock. With this change, both of the previous examples run without exception or deadlock:

// Alien method moved outside of synchronized block - open calls
private void notifyElementAdded(E element) {
List<SetObserver<E>> snapshot = null;
synchronized(observers) {
snapshot = new ArrayList<SetObserver<E>>(observers);
}
for (SetObserver<E> observer : snapshot)
observer.added(this, element);
}

In fact, there’s a better way to move the alien method invocations out of the synchronized block. Since release 1.5, the Java libraries have provided a concurrent collection (Item 69) known as CopyOnWriteArrayList, which is tailor-made for this purpose. It is a variant of ArrayList in which all write operations are implemented by making a fresh copy of the entire underlying array. Because the internal array is never modified, iteration requires no locking and is very fast. For most uses, the performance of CopyOnWriteArrayList would be atrocious, but it’s perfect for observer lists, which are rarely modified and often traversed.

As a rule, you should do as little work as possible inside synchronized regions.

You should make a mutable class thread-safe (Item 70) if it is intended for concurrent use and you can achieve significantly higher concurrency by synchronizing internally than you could by locking the entire object externally. Otherwise, don’t synchronize internally.

If you do synchronize your class internally, you can use various techniques to achieve high concurrency, such as lock splitting, lock striping, and nonblocking concurrency control.

If a method modifies a static field, you must synchronize access to this field, even if the method is typically used only by a single thread. It is not possible for clients to perform external synchronization on such a method because there can be no guarantee that unrelated clients will do likewise.

In summary, to avoid deadlock and data corruption, never call an alien method from within a synchronized region. More generally, try to limit the amount of work that you do from within synchronized regions. When you are designing a mutable class, think about whether it should do its own synchronization. In the modern multicore era, it is more important than ever not to synchronize excessively. Synchronize your class internally only if there is a good reason to do so, and document your decision clearly (Item 70).


Reference: Effective Java 2nd Edition by Joshua Bloch