Wednesday, October 27, 2010

A little volatile

Time to refresh my memory: what is the volatile modifier for?

The answer given by most Java programmers I've interviewed recently is that it makes the value of the field visible to all threads. This is true but not the whole story.

When I first came to appreciate the Java Memory Model I was surprised, like almost everybody else, to see that the value of a single field may be different for two different threads. This is not something peculiar to Java. Any hardware that conforms to a Von Neumann architecture (which is pretty much every common processor) can run faster if it keeps values in its CPU's registers (or per-core caches) rather than going to main memory (RAM) on every access.

Making a field volatile means that every write to it is made visible to all threads, as if it were published to main memory, rather than sitting in a CPU's register where only the writing thread can see it. But it also means:
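To make the visibility point concrete, here is a minimal sketch of a stop flag shared between two threads. The class and method names (StopFlagDemo, workerStopsAfterFlagCleared) are my own, purely for illustration. With the volatile modifier the worker is guaranteed to see the write; without it, the JIT is permitted to hoist the read out of the loop and the worker may spin forever.

```java
// Hypothetical demo class; all names here are illustrative assumptions.
class StopFlagDemo {
    // volatile: a write by one thread is guaranteed to become visible to others.
    private static volatile boolean running = true;

    // Starts a busy-waiting worker, clears the flag, and reports whether
    // the worker noticed the change and terminated within two seconds.
    static boolean workerStopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait; each iteration performs a volatile read
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker hangs
        worker.start();
        try {
            Thread.sleep(50);   // let the worker spin for a moment
            running = false;    // volatile write
            worker.join(2000);  // wait up to 2 s for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStopsAfterFlagCleared());
    }
}
```

Removing volatile from the field does not reliably reproduce a hang; whether the stale read shows up depends on the JVM, the JIT and the hardware, which is exactly what makes this class of bug hard to pin down.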

1. Its published state is safe

While an object is being constructed and assigned to a field, other threads may already have access to that field. They may therefore see the reference before the constructing thread has left the constructor, or see the object in a partially initialized state. If the object is not immutable, this can lead to inconsistent data. Storing the reference in a volatile field is one way to eliminate this.

To ensure safe publication, Java code needs to do at least one of the following:
  • initialize the object from a static initializer
  • store the reference in a final field of a properly constructed object
  • store the reference in a volatile field
  • store the reference in a java.util.concurrent.atomic.AtomicReference
  • store the reference in a field that is properly guarded by a lock
(See Brian Goetz's excellent Java Concurrency in Practice, p52, for more information).
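The safe-publication options listed above can be sketched side by side. This is only an illustration under my own assumptions: Settings and Publisher are hypothetical classes, and the timeout values are arbitrary.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical mutable object that we want to publish safely.
class Settings {
    private int timeout;
    Settings(int timeout) { this.timeout = timeout; }
    int getTimeout() { return timeout; }
}

// Hypothetical holder showing one field per safe-publication idiom.
class Publisher {
    // 1. Reference initialized from a static initializer.
    static final Settings FROM_STATIC_INIT = new Settings(10);

    // 2. Reference stored in a final field (of a properly constructed object).
    private final Settings viaFinal = new Settings(20);

    // 3. Reference stored in a volatile field.
    private volatile Settings viaVolatile;

    // 4. Reference stored in an AtomicReference.
    private final AtomicReference<Settings> viaAtomic = new AtomicReference<>();

    // 5. Reference stored in a field guarded by a lock.
    private final Object lock = new Object();
    private Settings viaLock; // only read or written while holding 'lock'

    void publish() {
        viaVolatile = new Settings(30);
        viaAtomic.set(new Settings(40));
        synchronized (lock) {
            viaLock = new Settings(50);
        }
    }

    Settings readFinal()    { return viaFinal; }
    Settings readVolatile() { return viaVolatile; }
    Settings readAtomic()   { return viaAtomic.get(); }
    Settings readLocked() {
        synchronized (lock) { // readers must take the same lock as the writer
            return viaLock;
        }
    }
}
```

Note that these idioms only cover the moment of publication. Because Settings is mutable, any later change to it still needs its own synchronization.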

2. All other variables are flushed to main memory

"When thread A writes to a volatile variable and subsequently thread B reads that same variable, the value of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable."
(Java Concurrency in Practice, p38)
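A small sketch of what that quote buys you: an ordinary, non-volatile field can ride along on a volatile write. The Handoff class and its members are hypothetical names chosen for this example.

```java
// Hypothetical class: a plain field made visible via a volatile guard.
class Handoff {
    private int payload;            // deliberately NOT volatile
    private volatile boolean ready; // the volatile guard

    // Thread A: write the plain field first, then the volatile flag.
    void publish(int value) {
        payload = value; // ordinary write
        ready = true;    // volatile write; everything written before it goes along
    }

    // Thread B: spin on the volatile flag, then read the plain field.
    int awaitPayload() {
        while (!ready) {
            // busy-wait; each iteration is a volatile read
        }
        return payload; // sees the value written before 'ready = true'
    }

    // Runs a reader thread against a writer and returns what the reader saw.
    static int demo() {
        Handoff handoff = new Handoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = handoff.awaitPayload());
        reader.start();
        handoff.publish(42);
        try {
            reader.join(); // join() also makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The order of the two writes in publish() matters. If the flag were set before the payload, a reader could pass the loop and still see the old payload value.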


The reason I am re-reading Mr Goetz's excellent book is that I have been asked to diagnose a threading issue in some legacy code where I came across the Double Checked Locking (DCL) idiom (see this link for what it is and why it's pathological). It's been a long time since I saw DCL code or even the Singleton pattern (singletons are hard to test and, in this day of dependency injection, largely redundant). But one thing I read there that I had forgotten was:

"To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock"
(ibid, p37, emphasis mine).

DCL is fixed if the field is declared volatile (under the memory model of Java 5 and later), according to Jeremy Manson, co-author of JSR-133 and of the part of the JLS that deals with threads and synchronization. Since the old code does not use volatile, I suspect this may be the cause of our problem. However, now I need to prove it, which is not easy when you're dealing with Heisenbugs.
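For reference, this is roughly what DCL looks like with the volatile fix applied. It is a sketch, not the legacy code in question: LazyService is a hypothetical class name, and it assumes Java 5 or later, where JSR-133 gave volatile its current semantics.

```java
// Hypothetical lazily initialized singleton using double-checked locking.
class LazyService {
    // volatile is what makes the idiom correct: without it, another thread
    // could see a non-null reference to a partially constructed object.
    private static volatile LazyService instance;

    private LazyService() { }

    static LazyService getInstance() {
        LazyService local = instance;         // first check, no lock (volatile read)
        if (local == null) {
            synchronized (LazyService.class) {
                local = instance;             // second check, under the lock
                if (local == null) {
                    local = new LazyService();
                    instance = local;         // volatile write publishes the object
                }
            }
        }
        return local;
    }
}
```

The local variable is only there to keep the common path down to a single volatile read; the idiom is equally correct if the field is read directly each time.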
