
Imperfect C++: Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 10.  Threading


10.4. Multithreading Extensions

Now that we've looked at a few issues pertaining to multithreading, it may have occurred to you that it would be useful for the language to provide built-in support for multithreading operations. Indeed, several languages do provide multithreading constructs. The C++ tradition is to favor the addition of new libraries rather than new language elements. We'll take a look at a couple of potential areas, see how the language might provide them, and how we can implement them using libraries (and a little bit of preprocessor trickery).

10.4.1 synchronized

D and Java have the synchronized keyword, which can be used to guard a critical region, as in:



Object obj = new Object();

. . .

synchronized(obj)
{
  . . . // critical code
}



One way to incorporate a synchronized keyword into the language would be to automatically translate the above code as follows:



{ __lock_scope__<Object> __lock__(obj);
{
  . . . // critical code
}
}



The __lock_scope__ would be, to all intents and purposes, similar to the lock_scope template described in section 6.2. This would be pretty easy to do, and having an associated std::lock_traits template would enable an instance of any traits-able type to be synchronized in this way, which would not necessarily translate to a lock on a synchronization object.
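For reference, here's a minimal sketch of what such a lock_scope might look like. The lock_traits<T> policy used here, with static lock() and unlock() operations, is an assumption for the purposes of illustration; the actual template described in section 6.2 may differ in its details.

// Acquires the lock on construction, releases it on destruction.
// lock_traits<T> is assumed to expose static lock()/unlock() methods
// for the lockable type T; this interface is illustrative only.
template< typename T
        , typename Traits = lock_traits<T>
        >
class lock_scope
{
public:
  explicit lock_scope(T &t)
    : m_t(t)
  {
    Traits::lock(m_t);
  }
  ~lock_scope()
  {
    Traits::unlock(m_t);
  }
private:
  T &m_t;

// Not to be implemented
private:
  lock_scope(lock_scope const &);
  lock_scope &operator =(lock_scope const &);
};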

This one is not a really strong contender for a language extension, however, since with a modicum of preprocessor macros we can achieve the same thing. Basically, all that is needed is the following two macros:



#define SYNCHRONIZED_BEGIN(T, v) \
  {                              \
  lock_scope<T> __lock__(v);

#define SYNCHRONIZED_END() \
  }



The only slight losses are that the type of the object is not deduced for us, and that the code looks somewhat less pretty:[13]

[13] It could be argued that the uglification is actually a benefit, since it increases the profile of the synchronized status of the critical region, which is a pretty important thing for anyone reading the code to take notice of.



SYNCHRONIZED_BEGIN(Object, obj)
{
  . . . // critical code
}
SYNCHRONIZED_END()



If you don't like the SYNCHRONIZED_END() part, you can always get a little bit trickier with your macro and define a SYNCHRONIZED() macro as follows:



#define SYNCHRONIZED(T, v)                            \
  for(synchronized_lock<lock_scope<T> > __lock__(v);  \
      __lock__; __lock__.end_loop())



The synchronized_lock<> template class is only there to define a state[14] and to terminate the loop, since we can't declare a second, loop-control variable within the for statement (see section 17.3). It is a bolt-in class (see Chapter 22) and looks like:

[14] It doesn't really define an operator bool(). We'll see why operator bool() isn't used, and how to do such conversions properly, in Chapter 24.

Listing 10.5.


template <typename T>
struct synchronized_lock
  : public T
{
public:
  template <typename U>
  synchronized_lock(U &u)
    : T(u)                 // bolt-in: forward to the lock class's constructor
    , m_bEnded(false)
  {}

  operator bool () const   // true until end_loop() is called
  {
    return !m_bEnded;
  }
  void end_loop()          // called from the for statement's increment part
  {
    m_bEnded = true;
  }

private:
  bool  m_bEnded;
};



There's another complication (of course!). As was described in section 17.3, compilers have different reactions to for-loop declarations, and if we were to have two synchronized regions in the same scope, some of the older ones would complain.



SYNCHRONIZED(Object, obj)
{
  . . . // critical code
}

  . . . // non-critical code

SYNCHRONIZED(Object, obj) // Error: "redefinition of __lock__"
{
  . . . // more critical code
}



Thus, a portable solution needs to ensure that each __lock__ is distinct, so we have to get down and dirty with the preprocessor.[15]

[15] I'll leave it up to you to do a little research as to why the double concatenation is required.



#define concat__(x, y)            x ## y
#define concat_(x, y)             concat__(x, y)

#define SYNCHRONIZED(T, v)                     \
  for(synchronized_lock<lock_scope<T> >        \
      concat_(__lock__, __LINE__) (v);         \
      concat_(__lock__, __LINE__);             \
      concat_(__lock__, __LINE__).end_loop())
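To see how this makes each lock distinct, consider what the preprocessor produces for a use of the macro appearing on, say, line 42 of a source file (the expansion shown is illustrative):

// SYNCHRONIZED(Object, obj) on line 42 expands (reformatted) to:
for(synchronized_lock<lock_scope<Object> >
    __lock__42 (obj);
    __lock__42;
    __lock__42.end_loop())

A second use in the same scope, on a different line, declares a differently named variable (__lock__57, say), so the redefinition problem disappears.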



It's ugly, but it works for all the compilers tested. If you don't need to be concerned with anachronistic for behavior, then just stick to the simpler version. The full versions of these macros and the classes are included on the CD.

10.4.2 Anonymous synchronized

There's a twist on the object-controlled critical region, which is that sometimes you don't have an object that you want to use as a lock. In this case, you can either just declare a static one in the local scope or, preferably, one in (anonymous) namespace scope in the same file as the critical region. You could also build on the techniques for the SYNCHRONIZED() macro, and produce a SYNCHRONIZED_ANON() macro that incorporates a local static, but then you run into a potential race condition whereby two or more threads might attempt to perform the one-time construction of the static object simultaneously. There are techniques to obviate this, as we'll see when we discuss statics in the next chapter, but it's best to avoid the issue. The namespace scope object is the best option in these cases.
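As an illustration of the preferred approach, the lock object can live at anonymous namespace scope in the same file as the critical region. The thread_mutex type shown here is assumed for the purposes of the example; any lockable type that works with lock_scope would serve:

namespace
{
  // Constructed during static initialization, before main() begins
  // (and thus, typically, before any threads are spawned), so there
  // is no race on its construction.
  thread_mutex  s_mx;

} // anonymous namespace

void access_shared_state()
{
  SYNCHRONIZED(thread_mutex, s_mx)
  {
    . . . // critical code
  }
}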

10.4.3 atomic

Getting back to my favorite synchronization issue, atomic integer operations, one possible language extension would be to have an atomic keyword to support code such as the following:



atomic j = ++i; // Equivalent to j = atomic_preincrement(&i)



or, using the XOR exchange trick,[16]

[16] This is an old hacker's delight [Dewh2003], and a frequent interview question. Test it out—it works, although I think it's not guaranteed to be portable!



atomic j ^= i ^= j ^= i; // Equiv. to j = atomic_write(&i, j);
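If you'd like to test the trick for yourself, a quick check might look like the following. Bear in mind footnote 16: the multiple unsequenced modifications in a single expression are exactly why it's not guaranteed to be portable.

#include <assert.h>

int main()
{
  int i = 3;
  int j = 5;

  j ^= i ^= j ^= i; // exchanges i and j (distinct objects only!)

  assert(5 == i && 3 == j);

  return 0;
}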



It would be the compiler's responsibility to ensure that the code was translated into the appropriate atomic operation for the target architecture.[17] Unfortunately, the differences between processor instruction sets would mean that we'd either have to live with nonportable code, or that only a very few operations would be eligible for atomic decoration. We certainly would not want the compiler to use lightweight measures where it could and silently implement other operations by the locking and unlocking of a shared mutex: better to have these things expressly in the code as we do now.

[17] Note that I'm suggesting the keyword would apply to the operation, not the variable. Defining a variable atomic and then 50 lines down relying on that atomic behavior is hardly a win for maintainability. The clear intent, and grepability, of atomic_* functions is much preferable to that.

It would be nice to have the atomic keyword for C and C++, for the limited subset of atomic integer operations that would be common to all architectures. However, using the atomic_* functions is not exactly a hardship, and it's certainly as readable—possibly more so—than the keyword form. Their only real downside is that they're not mandatory for all platforms; hopefully a future version of the C/C++ standard(s) will prescribe them.
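For the curious, here's a sketch of how one such function might be implemented portably across two common platforms. The atomic_preincrement() name follows the convention used above; the Win32 InterlockedIncrement() API and the GCC __sync_add_and_fetch() builtin are two possible underlying mechanisms, and real libraries may well choose differently:

#if defined(_WIN32)
# include <windows.h>
#endif

// Atomically increments *pl and returns the new value.
inline long atomic_preincrement(long volatile *pl)
{
#if defined(_WIN32)
  return InterlockedIncrement(pl);      // Win32 interlocked API
#elif defined(__GNUC__)
  return __sync_add_and_fetch(pl, 1);   // GCC atomic builtin
#else
# error No atomic_preincrement() implementation for this platform
#endif
}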

