I l@ve RuBoard

6.9 Living in a Multithreaded World

Singleton has to deal with threads, too. Suppose we have an application that has just started, and two threads access the following Singleton:



Singleton& Singleton::Instance()


{


   if (!pInstance_)                     // 1


   {


      pInstance_ = new Singleton;       // 2


   }


   return *pInstance_;                  // 3


}

The first thread enters Instance and tests the if condition. Because it's the very first access, pInstance_ is null, and the thread reaches the line marked // 2 and is ready to invoke the new operator. It might just happen that the OS scheduler unwittingly interrupts the first thread at this point and passes control to the other thread.

The second thread enters the stage, invokes Singleton::Instance(), and finds a null pInstance_ as well because the first thread hasn't yet had a chance to modify it. So far, the first thread has managed only to test pInstance_. Now say the second thread manages to call the new operator, assigns pInstance_ in peace, and gets away with it.

Unfortunately, as the first thread becomes conscious again, it remembers it was just about to execute line 2, so it reassigns pInstance_ and gets away with it, too. When the dust settles, there are two Singleton objects instead of one, and one of them will leak for sure. Each thread holds a distinct instance of Singleton, and the application is surely slipping toward chaos. And that's only one possible situation—what if multiple threads scramble to access the singleton? (Imagine yourself debugging this.)

Experienced multithreaded programmers will recognize a classic race condition here. It is to be expected that the Singleton design pattern meets threads. A singleton object is a shared global resource, and all shared global resources are always suspect for race conditions and threading issues.

6.9.1 The Double-Checked Locking Pattern

A thorough discussion of multithreaded singletons was first presented by Douglas Schmidt (1996). The same article describes a very nifty solution, called the Double-Checked Locking pattern, devised by Doug Schmidt and Tim Harrison.

The obvious solution works but is not appealing:



Singleton& Singleton::Instance()


{


   // mutex_ is a mutex object


   // Lock manages the mutex


   Lock guard(mutex_);


   if (!pInstance_)


   {


      pInstance_ = new Singleton;


   }


   return *pInstance_;


}

Class Lock is a classic mutex handler (refer to the appendix for details on mutexes). Lock's constructor locks the mutex, and its destructor unlocks the mutex. While mutex_ is locked, other threads that try to lock the same mutex are waiting.

We got rid of the race condition: While a thread assigns to pInstance_, all others stop in guard's constructor. When another thread locks the lock, it will find pInstance_ already initialized, and everything works smoothly.

However, a correct solution is not always an appealing one. The inconvenience is its lack of efficiency. Each call to Instance incurs locking and unlocking the synchronization object, although the race condition occurs only, so to say, once in a lifetime. These operations are usually very costly, much more costly than the simple if (!pInstance_) test. (On today's systems, the times taken by a test-and-branch and a critical-section lock differ typically by a couple of orders of magnitude.)

A would-be solution that attempts to avoid the overhead is presented in the following code:



Singleton& Singleton::Instance()


{


   if (!pInstance_)


   {


      Lock guard(mutex_);


      pInstance_ = new Singleton;


   }


   return *pInstance_;


}

Now the overhead is gone, but the race condition is back. The first thread passes the if test, but just when it is about to enter the synchronized portion of code, the OS scheduler interrupts it and passes control to the other thread. The second thread passes the if test itself (and to no one's surprise, finds a null pointer), enters the synchronized portion, and completes it. When the first thread is reactivated, it also enters the synchronized portion, but it's just too late—again, two Singleton objects are constructed.

This seems like one of those brainteasers with no solution, but in fact there is a very simple and elegant one. It's called the Double-Checked Locking pattern.

The idea is simple: Check the condition, enter the synchronized code, and then check the condition again. By this time, rien ne va plus—the pointer is either initialized or null all right. The code that will help you to understand and savor the Double-Checked pattern follows. Indeed, there is beauty in computer engineering.



Singleton& Singleton::Instance()


{


   if (!pInstance_)           // 1


   {                          // 2


      Guard myGuard(lock_);   // 3


      if (!pInstance_)        // 4


      {


         pInstance_ = new Singleton;


      }


   }


   return *pInstance_;


}

Assume the flow of execution of a thread enters the twilight zone (commented line 2). Here several threads may enter at once. However, in the synchronized section only one thread makes it at a time. On line 3, there's no twilight anymore. All is clear: The pointer has been either fully initialized or not at all. The first thread that enters initializes the variable, and all others fail the test on line 4 and do not create anything.

The first test is quick and coarse. If the Singleton object is available, you get it. If not, further investigation is necessary. The second test is slow and accurate: It tells whether Singleton was indeed initialized or the thread is responsible for initializing it. This is the Double-Checked Locking pattern. Now we have the best of both worlds: Most of the time the access to the singleton is as fast as it gets. During construction, however, no race conditions occur. Awesome. But . . .

Very experienced multithreaded programmers know that even the Double-Checked Locking pattern, although correct on paper, is not always correct in practice. In certain symmetric multiprocessor environments (the ones featuring the so-called relaxed memory model), the writes are committed to the main memory in bursts, rather than one by one. The bursts occur in increasing order of addresses, not in chronological order. Due to this rearranging of writes, the memory as seen by one processor at a time might look as if the operations are not performed in the correct order by another processor. Concretely, the assignment to pInstance_ performed by a processor might occur before the Singleton object has been fully initialized! Thus, sadly, the Double-Checked Locking pattern is known to be defective for such systems.

In conclusion, you should check your compiler documentation before implementing the Double-Checked Locking pattern. (This makes it the Triple-Checked Locking pattern.) Usually the platform offers alternative, nonportable concurrency-solving primitives, such as memory barriers, which ensure ordered access to memory. At least, put a volatile qualifier next to pInstance_. A reasonable compiler should generate correct, nonspeculative code around volatile objects.

I l@ve RuBoard