Monday, May 6, 2013

The Racy Single-Check Idiom

I saw this idiom in some code and looked up what exactly it meant. From Joshua Bloch's Effective Java:

"If you don't care whether every thread recalculates the value of a field, and the type of the field is a primitive other than long or double, then you may choose to remove the volatile modifier from the field declaration in the single-check idiom [calculating the value of a volatile field if it's null]. This variant is known as the racy single-check idiom. It speeds up field access on some architectures, at the expense of additional initializations (up to one per thread that accesses the field). This is definitely an exotic technique, not for everyday use. It is, however, used by String instances to cache their hash codes."

So, it's a race condition that doesn't matter to the results but may increase performance.

Be warned: Jeremy Manson points out a gotcha with this idiom on his blog. The trick, he points out, is that you must read the non-volatile (but shared) reference only once. If you need to read it more than once, you must act upon a reference that points to that shared memory (that act of using a temporary variable meets the reading-the-reference-only-once condition. You can, however, use the temporary variable as often as you like).

The reason this is a problem is that multiple reads may be re-ordered as is the wont in the world of the Java Memory Model. This re-ordering can cause incorrect values to be calculated.

The use of a temporary variable also makes the code (slightly) more efficient as there are fewer main memory reads.

It is used in many places in the Java source code's collection classes (a quick peek produced ConcurrentHashMap.values(), ConcurrentSkipListMap.keySet() etc) where the authors have avoided making fields volatile. Typically, you see code like:


    public NavigableSet navigableKeySet() {
        KeySet ks = keySet;
        return (ks != null) ? ks : (keySet = new KeySet(this));
    }


and you may wonder why they used a temporary variable. It seems harmless but pointless. Really, they are avoiding Manson's gotcha.

Manson describes the pathological case (that is, when not using a temporary variable) as unlikely to happen. Indeed, I couldn't get it to happen on my computer when I tried to deliberately cause it. But the problem is very real. He should know, he wrote the JMM spec.


1 comment: