Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson

	Chapter 11. Statics

11.3. Function-Local Static Objects

In the last two sections we looked at nonlocal static objects. In this one we'll look at local static objects, which are those defined at function scope, such as:



Local &GetLocal()
{
  static Local local;
  return local;
}

The crucial difference between nonlocal and local static objects is that local static objects are created when they are needed, that is to say the first time the function is called. Subsequent calls simply use the already-constructed instance. Naturally there needs to be some mechanism to record that an instance has been constructed, and so an implementation will use an unseen initialization flag.
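The on-demand construction is easy to observe. The following is a minimal sketch, assuming a hypothetical Local that counts its constructions (the counter and the demonstrate_lazy_construction() helper are illustrative names, not part of the original example):

```cpp
#include <cassert>

// Hypothetical stand-in for Local: it records how many instances were built.
struct Local
{
  static int constructed; // number of Local instances constructed so far
  Local() { ++constructed; }
};
int Local::constructed = 0;

Local &GetLocal()
{
  static Local local; // constructed the first time control passes through here
  return local;
}

// Checks that construction happens on the first call, and only then.
void demonstrate_lazy_construction()
{
  assert(0 == Local::constructed); // nothing is built before the first call

  Local &first = GetLocal();
  assert(1 == Local::constructed); // the first call constructs the instance

  Local &second = GetLocal();
  assert(1 == Local::constructed); // later calls reuse it
  assert(&first == &second);       // and return the same object
}
```

Calling demonstrate_lazy_construction() once from main(), before anything else has called GetLocal(), exercises all four checks.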

Although the mechanism of flagging the initialization is not specified by the standard, the implication is pretty clear. Section (C++-98: 6.7;4) states that "an object is initialized the first time control passes through its declaration; . . . If the initialization exits by throwing an exception, the initialization is not complete, so it will be tried again the next time control enters the declaration. If control re-enters the declaration (recursively) while the object is being initialized, the behaviour is undefined." Hence, the previous function actually looks much like the following, under the covers:

Listing 11.10.



Local &GetLocal()
{
  static bool  __bLocalInitialized__  = false;
  static byte  __localBytes__[sizeof(Local)];
  if(!__bLocalInitialized__)
  {
    new(__localBytes__) Local();
    __bLocalInitialized__ = true;
  }
  return *reinterpret_cast<Local*>(__localBytes__);
}
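Note that the flag is set only after the constructor returns, which is what yields the retry-on-exception behaviour quoted from the standard. A minimal sketch of that rule follows, assuming a hypothetical Flaky class whose constructor throws on its first attempt only (Flaky, GetFlaky() and demonstrate_retry() are illustrative names):

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class whose first construction attempt fails.
struct Flaky
{
  static int attempts; // number of times the constructor has been entered
  Flaky()
  {
    if(1 == ++attempts)
    {
      throw std::runtime_error("first construction fails");
    }
  }
};
int Flaky::attempts = 0;

Flaky &GetFlaky()
{
  static Flaky flaky; // initialization is retried if the constructor throws
  return flaky;
}

// Checks that a failed initialization is attempted again on the next call.
void demonstrate_retry()
{
  bool threw = false;
  try
  {
    GetFlaky();
  }
  catch(std::runtime_error const &)
  {
    threw = true;
  }
  assert(threw);
  assert(1 == Flaky::attempts); // one failed attempt so far

  GetFlaky();                   // initialization is tried again and succeeds
  assert(2 == Flaky::attempts);

  GetFlaky();                   // now initialized: no further construction
  assert(2 == Flaky::attempts);
}
```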

The problem with this situation is that in multithreaded environments it is subject to a race condition. Two or more threads could come in and see that __bLocalInitialized__ is false, and simultaneously go on to construct local. This could result in a leak, or it could crash the process, but whatever does happen it's undesirable.
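A real race cannot be reproduced on demand, but one bad interleaving can be replayed deterministically on a single thread. The sketch below is only a model: HiddenState and the helper functions are hypothetical names standing in for the hidden flag and the two steps (test, then construct and set) that a compiler might generate.

```cpp
#include <cassert>

// Hypothetical model of the hidden state behind a local static.
struct HiddenState
{
  bool initialized;   // stands in for __bLocalInitialized__
  int  constructions; // how many times the constructor would have run
};

// Step 1 of the hidden code: test the flag.
bool sees_uninitialized(HiddenState const &s)
{
  return !s.initialized;
}

// Step 2 of the hidden code: construct, then set the flag.
void construct_and_flag(HiddenState &s)
{
  ++s.constructions;
  s.initialized = true;
}

// Replays the order: A tests, B tests, A constructs, B constructs.
int replay_bad_interleaving()
{
  HiddenState s = { false, 0 };

  bool aWillConstruct = sees_uninitialized(s); // "thread A" tests the flag
  bool bWillConstruct = sees_uninitialized(s); // "thread B" tests before A sets it

  if(aWillConstruct) { construct_and_flag(s); }
  if(bWillConstruct) { construct_and_flag(s); }

  return s.constructions; // both "threads" constructed the object
}

void check_replay()
{
  assert(2 == replay_bad_interleaving());
}
```

With real threads the second construction would be a placement new over a live object, which is where the leak or crash comes from.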

One might naively^[10] suppose that using the volatile keyword on the static declaration might help. After all, the C standard says (C99-5.1.2.3;2) that "accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. . . . At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)" The C++ standard says much the same thing in (C++-98: 1.9;7).

^[10] I'm talking about myself here. Ah, bittersweet memories....

Alas, the fact that the standards say nothing about threading means that one cannot rely on an implementation supporting our presumption. In practice, the use of volatile does nothing to ensure that the use of an object is thread safe. Thus, volatile is essentially a mechanism for dealing with hardware, although it's occasionally useful for other things (see section 12.5).

Imperfection: Function-local static instances of classes with nontrivial constructors are not thread-safe.

11.3.1 Sacrifice Lazy Evaluation

One way to obviate this risk is to use the Schwarz counter to ensure that all local static instances are initialized prior to going into multithreaded mode (with the caveat being that threads are not initiated in any global object constructors, as mentioned in section 11.1). This is effective, but it does negate most of the purpose of having the object local. Furthermore, it is quite possible that some functions containing local statics would behave inappropriately if called too early; we could be back to the global object problems again.
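A minimal sketch of this approach follows, assuming everything lives in a single translation unit. LocalInitializer, its counter, and the construction count in Local are hypothetical names for illustration; in a real project the initializer instance would be declared in the header that declares GetLocal(), so that every including translation unit carries one.

```cpp
#include <cassert>

// Hypothetical stand-in for Local: it records how many instances were built.
struct Local
{
  static int constructed;
  Local() { ++constructed; }
};
int Local::constructed = 0;

Local &GetLocal()
{
  static Local local;
  return local;
}

// Schwarz-counter-style initializer: the first instance to be constructed
// forces the local static into existence.
struct LocalInitializer
{
  static int s_count; // number of initializer instances constructed so far
  LocalInitializer()
  {
    if(0 == s_count++)
    {
      GetLocal(); // construct the local static during static initialization
    }
  }
};
int LocalInitializer::s_count = 0;

// One instance per translation unit; constructed before main() is entered.
static LocalInitializer s_localInitializer;

// Checks, from main() or later, that the instance already exists.
void check_forced_initialization()
{
  assert(1 == Local::constructed); // already built by the initializer
  GetLocal();
  assert(1 == Local::constructed); // the call merely returns it
}
```

The price is exactly the one described above: the object is no longer constructed on demand, and GetLocal() now runs during static initialization.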

11.3.2 Spin Mutexes to the Rescue

The race condition inherent in the initialization of local static objects is very important, so it cannot be ignored. However, it is also very rare. We can trade on this rarity and use a wonderfully elegant solution,^[11] based on spin mutexes, which are themselves dependent on atomic operations, both of which were examined at length in Chapter 10.

^[11] Reader hint: Anytime I refer to a solution as wonderfully elegant, you can be sure it's one that I think I've invented.



Local &GetLocal()
{
  static int              guard; // Will be zeroed at load time
  spin_mutex              smx(&guard); // Spin on "guard"
  lock_scope<spin_mutex>  lock(smx);   // Scope lock of "smx"
  static Local            local;
  return local;
}

This all works because the static guard variable is set to zero during the zero-initialization phase of process initialization. The non-static spin_mutex instance smx operates on guard, and is itself locked by a parameterization of the lock_scope template, also non-static. Therefore, the only route to the test of the unseen initialization flag for local, and to local itself, passes through the spin mutex, which serializes competing threads cheaply.
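For readers without Chapter 10 to hand, the pieces can be sketched as follows. These are not the book's spin_mutex and lock_scope: they are minimal assumed versions built on C++11's std::atomic, so the guard is a std::atomic<int> rather than a plain int, and the construction counter in Local is again illustrative. Note also that C++11 itself requires concurrent callers to wait while a local static is being initialized, so on such compilers the sketch shows the mechanism rather than curing a live problem.

```cpp
#include <atomic>
#include <cassert>

// Assumed minimal spin mutex: spins on a guard word owned by the caller.
class spin_mutex
{
public:
  explicit spin_mutex(std::atomic<int> *guard) : m_guard(guard) {}
  void lock()
  {
    // Atomically write 1; reading back 0 means this thread took the lock.
    while(0 != m_guard->exchange(1, std::memory_order_acquire))
    {} // busy-wait: contention is expected to be rare and brief
  }
  void unlock()
  {
    m_guard->store(0, std::memory_order_release);
  }
private:
  std::atomic<int> *m_guard;
};

// Assumed minimal scoping lock: locks on construction, unlocks on destruction.
template <typename M>
class lock_scope
{
public:
  explicit lock_scope(M &mx) : m_mx(mx) { m_mx.lock(); }
  ~lock_scope() { m_mx.unlock(); }
  lock_scope(lock_scope const &) = delete;
  lock_scope &operator =(lock_scope const &) = delete;
private:
  M &m_mx;
};

// Hypothetical stand-in for Local: it records how many instances were built.
struct Local
{
  static int constructed;
  Local() { ++constructed; }
};
int Local::constructed = 0;

Local &GetLocal()
{
  static std::atomic<int> guard(0); // constant-initialized: needs no hidden flag
  spin_mutex              smx(&guard); // Spin on "guard"
  lock_scope<spin_mutex>  lock(smx);   // Scope lock of "smx"
  static Local            local;
  return local;
}

// Single-threaded checks of the lock protocol and of the guarded function.
void check_guarded_local()
{
  std::atomic<int> guard(0);
  {
    spin_mutex             smx(&guard);
    lock_scope<spin_mutex> lock(smx);
    assert(1 == guard.load()); // held while the scope is alive
  }
  assert(0 == guard.load());   // released when the scope ends

  Local &a = GetLocal();
  Local &b = GetLocal();
  assert(&a == &b);
  assert(1 == Local::constructed);
}
```

The checks are deliberately single-threaded so that they are deterministic; the same GetLocal() may be called from any number of threads.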

There's a cost to this, to be sure, but, as we saw in section 10.3, spin mutexes are very low cost except where there is a high degree of concurrent contention around the guarded section, or the guarded section is long, or both. Since in all but the first call the guarded section will consist of one compare (of the unseen initialization flag) and one move (returning the address of local), the section itself is very low cost. And it is hard to conceive of any client code where several threads will be contending for the Local singleton with such frequency that the likelihood of them wasting cycles in the spin will be appreciable. Therefore, this is a very good solution for guarding local statics against races.