Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson
	Chapter 8. Objects Across Borders

8.2. Portable vtables

Okay, that's enough dubious hackery. Let's get back to trying to find a portable vtable approach. The problem with the approach developed so far is that compilers disagree on the members and packing of their vtables. This renders the technique platform-specific at best and useless at worst, since it is at the mercy of any change in a compiler's vtable representation scheme.

Fortunately, in one of those Eureka "why-didn't-I-think-of-that-five-years-ago?" moments, I worked out a simple way around this: instead of deducing the format of most compilers' vtables and working around that, why not define our own vtable format and make all compilers work with that? The result doesn't look too much different from what we've already seen, but it works for all compilers for a given platform. I should warn you, though: even though it's conceptually nice, it's not pretty to look at. Consider the fully portable version of our IObject interface in Listing 8.3.

Listing 8.3.




#include <poab/poab.h>
#include <poab/pack_warning_push.h>
#include <poab/pack_push_ambient.h>

struct IObject;

struct IObjectVTable
{
  void (*SetName)(struct IObject *obj, char const *s);
  char const *(*GetName)(struct IObject const *obj);
};

struct IObject
{
  struct IObjectVTable *const vtable;
#ifdef __cplusplus
protected:
  IObject(struct IObjectVTable *vt)
    : vtable(vt)
  {}
  ~IObject()
  {}
public:
  inline void SetName(char const *s)
  {
    assert(NULL != vtable);
    vtable->SetName(this, s);
  }
  inline char const *GetName() const
  {
    assert(NULL != vtable);
    return vtable->GetName(this);
  }
private:
  IObject(IObject const &rhs);
  IObject &operator =(IObject const &rhs);
#endif /* __cplusplus */
};

#include <poab/pack_pop_ambient.h>
#include <poab/pack_warning_pop.h>

Irrespective of whether the compilation unit is C or C++, the interface and its vtable are defined as C-compatible structures. This is where we gain control over the packing. The formats of the structures are precisely packed via the inclusion of the files <poab/pack_push_ambient.h> and <poab/pack_pop_ambient.h>. These files contain compiler-specific packing pragmas, for example, #pragma pack(4) for Borland. The two other warning inclusions are there to suppress and re-express some compilers' warnings about including files containing packing pragmas. These warnings are pretty important, so it's unwise to just switch them off wholesale. The solution is to surround such included files with warning suppressions. Naturally, the warning suppressions/expressions cannot be inside the "offending" files, as they would not then operate in the included file.
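To make the effect of the push/pop pair concrete, here is a rough sketch that uses #pragma pack(push, n) and #pragma pack(pop) directly, forms accepted by Visual C++, GCC and Clang. The real contents of the poab packing headers are not shown in the text, so the comments marking where they would go are an assumption, and the structure names PackedTo4 and DefaultPacked are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed role of <poab/pack_push_ambient.h>: save the current packing
// and force a known value rather than trusting the compiler's default.
#pragma pack(push, 4)

struct PackedTo4
{
  char   tag;    // offset 0
  double value;  // offset 4, because member alignment is capped at 4
};

// Assumed role of <poab/pack_pop_ambient.h>: restore whatever packing
// was in force before the push.
#pragma pack(pop)

// The same members under the compiler's default packing, for comparison.
struct DefaultPacked
{
  char   tag;
  double value;
};

// True if the explicitly packed structure has the layout we asked for.
inline bool packing_is_controlled()
{
  return offsetof(PackedTo4, value) == 4 && sizeof(PackedTo4) == 12;
}
```

The point is that PackedTo4 has one layout no matter what the compiler's default happens to be, whereas the layout of DefaultPacked is left to each compiler.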

There are three significant aspects to note. First, the interface is given a protected constructor. Since the C++ compiler will not be setting up our vtable, we need to do it ourselves. Naturally, this is something only a derived class can do, so the constructor forces the derived class to provide one. Note that the copy constructor and copy assignment operator are private. You might wonder why we would not provide a copy constructor like the following:



IObject(IObject const &rhs)
  : vtable(rhs.vtable)
{}

If the instance being copied lived in another module (i.e., a dynamic library) that might be subsequently unloaded during the lifetime of the copy, you could end up with a correctly constructed object that pointed to code that no longer existed in the process address space. Not pretty! This issue of disappearing code is dealt with in more detail in Chapter 9.

Second, the destructor is defined protected, to prevent client code from calling delete on an instance of the interface. It is also non-virtual. Both of these issues are discussed in more detail in section 8.2.6.

Third, as a convenience to C++ client code, there are two inline methods defined on IObject. This means that C++ client code can use a normal syntax, as in:



IObject *obj = . . .
obj->SetName("Scott Tracy");

Since it is usually the case that there is more client code than server implementation in such infrastructures, this is an important boon to usability. This is also important for another reason, which we'll see in the next section.

8.2.1 Simplifying Macros

The downside of this approach is that it's very verbose. Very verbose. In a sense, that's just hard luck; you play the cards you're dealt. Nonetheless, a practical objection to this is that it's too verbose to use. Frankly, I think that's a furphy.^[3] This stuff is not complex, and is easily cranked out by a code generator, or a wizard plug-in to your favorite IDDE. There are plenty of frameworks out there that require more complex and arcane necromancy than this. Furthermore, all the complexity lies on the server side. On the client side the code looks exactly as it would when calling virtual methods on a normal C++ class.

^[3] I'm always attempting to broaden my skills in the language of my adopted country: furphy is an Australian word meaning "an absurd or false report, or rumor."

Nonetheless, as an aid to acceptance, you can, if you choose, wrap this all up in macros.^[4] I've done an example in the code available on the CD, which looks like:

^[4] It also relies on some funky inclusion recursion, so you might check it out just to amuse yourself with the arcane nature of it all.



#include "poab_gen.h"





STRUCT_BEGIN(IObject)
  STRUCT_METHOD_1_VOID(IObject, SetName, char const *)
  STRUCT_METHOD_0_CONST(IObject, GetName, char const *)
STRUCT_END(IObject)

8.2.2 Compatible Compilers

We know that the vtable format we're using reflects that used by some compilers for their vtable implementations. For those compilers it is possible to refine the interface definition as:

Listing 8.4.



#if defined(__cplusplus) && \
    defined(POAB_COMPILER_HAS_COMPATIBLE_VTABLES)
struct IObject
{
  virtual void SetName(char const *s) = 0;
  virtual char const *GetName() const = 0;
};
#else /* ? __cplusplus */
. . .  // The previous definition (Listing 8.3)
#endif /* C++ && portable vtables */

When using such a compiler, the interface is a C++ virtual class. When using another compiler, it uses the portable definition. This is the case on client and/or server sides. This can also be encapsulated within the macros.

Now we can see why providing the convenient inlines in our interface is more than just a convenience. It enables client code to be written for compatible and incompatible compilers alike.

8.2.3 Portable Server Objects

We've seen the portable interface and some simple client code. Obviously the bulk of the complexity of this technique is going to lie on the server side, so let's look at just how bad it is. Listing 8.5 shows the first half of the implementation of a class Object, which implements the interface IObject.

Listing 8.5.



class Object
  : public IObject
{
public:
  virtual void SetName(char const *s)
  {
    m_name = s;
  }
  virtual char const *GetName() const
  {
    return m_name.c_str();
  }
#ifndef POAB_COMPILER_HAS_COMPATIBLE_VTABLES
  . . . // POAB gunk
#endif /* !POAB_COMPILER_HAS_COMPATIBLE_VTABLES */
private:
  std::string m_name;
};

For compilers that have a vtable layout that is compatible with our portable vtable format, that's the entirety of the implementation. Note that SetName() and GetName() are both defined virtual; we'll see why in a moment.

For compilers that require portable vtables we need to use the code between the pre-processor conditionals, which is shown in Listing 8.6.

Listing 8.6.



#ifndef POAB_COMPILER_HAS_COMPATIBLE_VTABLES
public:
  Object()
    : IObject(GetVTable())
  {}
  Object(Object const &rhs)
    : IObject(GetVTable())
    , m_name(rhs.m_name)
  {}
private:
  static void SetName_(IObject *this_, char const *s)
  {
    static_cast<Object*>(this_)->SetName(s);
  }
  static char const *GetName_(IObject const *this_)
  {
    return static_cast<Object const*>(this_)->GetName();
  }
  static vtable_t *GetVTable()
  {
    static vtable_t s_vt = MakeVTable();
    return &s_vt;
  }
  static vtable_t MakeVTable()
  {
    vtable_t vt = { SetName_, GetName_ };
    return vt;
  }
#endif /* !POAB_COMPILER_HAS_COMPATIBLE_VTABLES */

The first thing to note is that both the default constructor and the copy constructor initialize the vtable member by calling a static method, GetVTable(). This method contains a local static instance of the vtable_t member type, which for Object is IObjectVTable. The static instance is initialized by copying the instance of vtable_t returned by another static method, MakeVTable(). As we'll see in Chapter 11, such local static objects are a pretty bad idea, especially in multithreaded contexts. However, in this case there are no issues, because the pointer returned from GetVTable() will always be the same within any link unit (see Chapter 9). Because SetName_() and GetName_() have fixed addresses within a link unit, the return value from MakeVTable() will always contain the same values. However the concurrency works out, therefore, the static vtable s_vt will always have the same values; the only side effect of any race condition is that it may be initialized more than once, and this will be vanishingly rare.

The virtual methods themselves are actually the static methods SetName_() and GetName_(). They each take the this pointer of the given instance as their first argument, this_.^[5] It's important to note that this is then downcast to type Object. If this were not done, we'd find ourselves in an infinite loop, as they would call the inline methods defined in IObject.
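Pulling Listings 8.3, 8.5 and 8.6 together, the following is a minimal self-contained sketch of the incompatible-compiler path. It drops the packing includes, folds MakeVTable() into GetVTable() for brevity, and spells vtable_t out as IObjectVTable; the helper roundtrip_name() is a hypothetical addition used only to drive the object the way a client would.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct IObject;

// The hand-defined vtable: plain function pointers, identical for every compiler.
struct IObjectVTable
{
  void        (*SetName)(IObject *obj, char const *s);
  char const *(*GetName)(IObject const *obj);
};

struct IObject
{
  IObjectVTable *const vtable;

  // Inline forwarders, so that clients can write obj->SetName(...).
  void SetName(char const *s) { assert(NULL != vtable); vtable->SetName(this, s); }
  char const *GetName() const { assert(NULL != vtable); return vtable->GetName(this); }

protected:
  explicit IObject(IObjectVTable *vt) : vtable(vt) {}
  ~IObject() {}

private:
  IObject(IObject const &);             // not copyable, as in Listing 8.3
  IObject &operator =(IObject const &);
};

class Object : public IObject
{
public:
  Object() : IObject(GetVTable()) {}

  virtual void SetName(char const *s) { m_name = s; }
  virtual char const *GetName() const { return m_name.c_str(); }

private:
  // The entries actually stored in the vtable. The downcast to Object is
  // essential: calling this_->SetName(s) on the IObject pointer would
  // re-enter the inline forwarder above and recurse forever.
  static void SetName_(IObject *this_, char const *s)
  { static_cast<Object*>(this_)->SetName(s); }
  static char const *GetName_(IObject const *this_)
  { return static_cast<Object const*>(this_)->GetName(); }

  static IObjectVTable *GetVTable()
  {
    static IObjectVTable s_vt = { SetName_, GetName_ };
    return &s_vt;
  }

  std::string m_name;
};

// Hypothetical helper: use the server purely through the interface pointer.
inline std::string roundtrip_name(char const *name)
{
  Object   server;
  IObject *obj = &server;
  obj->SetName(name);
  return obj->GetName();
}
```

Every Object instance ends up pointing at the same static vtable, which is why the repeated initialization discussed above is harmless.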

^[5] I know, I know. Underscore crazy. Again. (I'm not being very "Chapter 17".)

Let's look now at the reason why the SetName() and GetName() accessor methods are defined virtual in the class. With compatible compilers, which basically inherit from the bona fide C++ abstract class, these methods would be virtual, since Object would inherit from an IObject that would define them as virtual. Failure to make them virtual for all compilers could represent a serious inconsistency if you are deriving subclasses from Object, which are then passed out to client code via the IObject interface. By having them virtual, we can implement the external virtual behavior (what the client sees) in terms of internal virtual behavior.

The downside is that if the compiler cannot determine whether optimizing out the virtual call of the accessor method into the static method is applicable, we pay the cost of two indirections rather than one. I think that in most cases it's a cost worth paying to get our compiler-independence. If it concerns you, you are free to implement your servers using a compatible compiler, or you can, based on your own judgment, make the accessor methods nonvirtual in which case the efficiency of the method call is identical to that of a normal virtual method.
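The subclassing scenario can be sketched as follows. The interface and Object are trimmed-down versions of the earlier listings (no copy control, no asserts), while PrefixedObject and name_via_interface() are hypothetical names invented for this illustration. Because the accessors are virtual, the override is reached even though the client only ever holds an IObject.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct IObject;

struct IObjectVTable
{
  void        (*SetName)(IObject *obj, char const *s);
  char const *(*GetName)(IObject const *obj);
};

struct IObject
{
  IObjectVTable *const vtable;

  void SetName(char const *s) { vtable->SetName(this, s); }   // first indirection
  char const *GetName() const { return vtable->GetName(this); }

protected:
  explicit IObject(IObjectVTable *vt) : vtable(vt) {}
  ~IObject() {}
};

class Object : public IObject
{
public:
  Object() : IObject(GetVTable()) {}

  virtual void SetName(char const *s) { m_name = s; }
  virtual char const *GetName() const { return m_name.c_str(); }

private:
  // The virtual call inside each thunk is the second indirection.
  static void SetName_(IObject *this_, char const *s)
  { static_cast<Object*>(this_)->SetName(s); }
  static char const *GetName_(IObject const *this_)
  { return static_cast<Object const*>(this_)->GetName(); }

  static IObjectVTable *GetVTable()
  {
    static IObjectVTable s_vt = { SetName_, GetName_ };
    return &s_vt;
  }

  std::string m_name;
};

// Hypothetical subclass: it reuses Object's portable vtable unchanged and
// customizes behavior through an ordinary C++ override.
class PrefixedObject : public Object
{
public:
  virtual void SetName(char const *s)
  {
    Object::SetName((std::string("Dr. ") + s).c_str());
  }
};

// Hypothetical client-side helper: sees nothing but the portable interface.
inline std::string name_via_interface(IObject &obj, char const *name)
{
  obj.SetName(name);
  return obj.GetName();
}
```

Had SetName() been non-virtual in Object, the thunk would bind statically to Object::SetName() and the PrefixedObject override would be silently bypassed on incompatible compilers, which is the inconsistency described above.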

8.2.4 Simplifying the Implementation of Portable Interfaces

The main problem with the approach so far outlined, at least in my opinion, is the appearance of the infrastructure gunk in the concrete classes implementing the portable interfaces. Thankfully, this is easily rectified by placing it in a related class, which is then used as the base for any concrete class implementations. Hence, we can define a class IObjectImpl in the same header file as IObject, which contains the functions and vtable code that we have placed in Object. For compatible compilers, IObjectImpl will just be a typedef to IObject. Now the whole picture is far more attractive:

Listing 8.7.



// In IObject.h
struct IObject
{
  . . .
};

#if defined(__cplusplus) && \
    defined(POAB_COMPILER_HAS_COMPATIBLE_VTABLES)
typedef IObject IObjectImpl;
#else /* ? __cplusplus */
. . .  // The previous definition (Listing 8.3)
class IObjectImpl
   : public IObject
{
  . . . // function definitions and vtables
};
#endif /* C++ && portable vtables */

Now the definition of any derived class is simple and uncluttered by infrastructure, as in:



class Object
  : public IObjectImpl
{
public:
  virtual void SetName(char const *s);
  virtual char const *GetName() const;
private:
  std::string m_name;
};

Since interfaces are very important things that are designed and implemented with care, and take considerable time, I personally don't think there's an issue with crafting an associated "impl" class, or enhancing your code generator to build one for you, so I think this represents an eminently practical technique.
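The body of IObjectImpl is elided in Listing 8.7, so what follows is only one plausible way to fill it in for the incompatible-compiler path, not necessarily the version on the CD. The assumption here is that IObjectImpl declares the accessors as pure virtual and that its static thunks dispatch through them; the concrete Object then contains no infrastructure at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct IObject;

struct IObjectVTable
{
  void        (*SetName)(IObject *obj, char const *s);
  char const *(*GetName)(IObject const *obj);
};

struct IObject
{
  IObjectVTable *const vtable;

  void SetName(char const *s) { vtable->SetName(this, s); }
  char const *GetName() const { return vtable->GetName(this); }

protected:
  explicit IObject(IObjectVTable *vt) : vtable(vt) {}
  ~IObject() {}

private:
  IObject(IObject const &);
  IObject &operator =(IObject const &);
};

// Assumed contents of IObjectImpl: all of the vtable plumbing lives here.
class IObjectImpl : public IObject
{
public:
  // Concrete classes override these; the thunks below dispatch to them.
  virtual void SetName(char const *s) = 0;
  virtual char const *GetName() const = 0;

protected:
  IObjectImpl() : IObject(GetVTable()) {}
  // A copy gets its vtable from this link unit, never from the source object.
  IObjectImpl(IObjectImpl const &) : IObject(GetVTable()) {}
  ~IObjectImpl() {}

private:
  static void SetName_(IObject *this_, char const *s)
  { static_cast<IObjectImpl*>(this_)->SetName(s); }
  static char const *GetName_(IObject const *this_)
  { return static_cast<IObjectImpl const*>(this_)->GetName(); }

  static IObjectVTable *GetVTable()
  {
    static IObjectVTable s_vt = { SetName_, GetName_ };
    return &s_vt;
  }
};

// The concrete class is now free of infrastructure.
class Object : public IObjectImpl
{
public:
  virtual void SetName(char const *s) { m_name = s; }
  virtual char const *GetName() const { return m_name.c_str(); }

private:
  std::string m_name;
};
```

One consequence of this arrangement is that the copy constructor from Listing 8.6 moves into IObjectImpl as well, so the compiler-generated copy constructor of Object does the right thing.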

8.2.5 C Client Code

Just as we saw in the last chapter with the handle (API + wrapper) approach, with the Objects across Borders technique we can allow C client code to manipulate our servers. There are a great many C programmers out there, and that picture's not likely to change for a very long time. Thus, anything that broadens the appeal of your libraries can't be a bad thing.

If you're a C++ type of person,^[6] it's pretty rare you'd need to call your interfaces from C, although it can happen. You may be enhancing some existing C code, and want to use a particular C++ library.

^[6] As I guess most of you are; I can't imagine many "++-ophobes" will have made it through the previous deferential chapters, to get to the C++ bashing herein.

Whatever the reason, calling your C++ ABI from C is very straightforward, although it's not terse:



// C client code
IObject *obj;

obj->vtable->SetName(obj, "Archie Goodwin");
printf("Name: %s\n", obj->vtable->GetName(obj));

8.2.6 OAB Constraints

So far I've painted a rosy picture, albeit that it's a non-trivial amount of effort. But before you dash off to take a sharp knife to all your existing projects, I need to confess the several drawbacks to the technique.

Naturally there's a raft of C++ features that are not covered. Since cooperating link-units in your executable may be built with different compilers, any aspects of C++ that are not explicitly covered in the technique are very likely to have different characteristics. For a start, any run time type information (RTTI) mechanism [Lipp1996] will probably be different. The same goes for exception throwing/catching, which relies on the RTTI mechanism. You cannot use typeid on a pointer to a portable interface, nor can you throw an exception from a call on a portable interface method.

Non C++-specific constraints regarding resource handling between link-units (discussed in detail in Chapter 9) all apply here. For example, if your interface allocates some memory for the client from its internal heap, say via operator new, then it is not valid for the client to delete it. It must be returned via some deallocation method on the same interface or on another object acquired via that interface. This is all standard resource-consumer good citizenship stuff.

The interface does not contain a virtual destructor. There is actually no problem with providing a virtual destructor method in and of itself. To do so in a portable way is eminently simple:



struct IObjectVTable
{
  void (*Destructor)(struct IObject *obj);
};

The problem comes when it is used. First, there is no way to provide an "inline" destructor method, so you have no means of calling it from the client side as part of a delete statement, except for compatible compilers that are seeing the interface as a C++ abstract class. But this is no bad thing really: We would not want delete to be able to be called in any circumstances anyway, because it is unlikely that the client code and the server code would share the same new/delete operator implementations and a crash would follow. So we don't want a destructor as part of the interface.

A further point is that we don't want a virtual destructor at all, whether client code can access it or not. The reason is that we may want to derive from a given interface,^[7] and having a virtual destructor would mean that the layouts for compatible and incompatible compilers would differ. We'd be required to insert a stub field (like those void*s in the Solaris vtable, from section 8.1.1) in order to keep them aligned.

^[7] Yes, that's right. You can also have inheritance in your C++ ABI, albeit that it's only single inheritance.

Speaking of inheritance, you can't have multiple inheritance with this technique. However, that rarely presents a problem, given the other C++ privations one must bear.

There are other issues with the specific example I've used. For instance, the notion of ownership is not specified, but these are merely the result of the simplicity of the example, rather than flaws in the portable vtables technique. The fundamental basis of the technique is sound, and you can use it to build any required level of complexity.

Regarding ownership, there are two approaches generally used. Both use a factory function, such as make_Object(). In one, a corresponding destroy_Object() function is called to return the instance when it is no longer needed. This is a simple and workable model, but it lends itself more to single-use/large-object scenarios, for example, a compiler plug-in. The other approach is to use reference counting. Generally, this is the best technique to go with the portable vtables.
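As a rough sketch of the reference-counting approach, the vtable below gains two extra entries. The names AddRef and Release are an assumption borrowed from COM convention rather than anything defined in this chapter, and g_liveObjects is a hypothetical counter that exists only so the example can be checked. The count is not thread-safe. The key property is that the instance is deleted inside the server, by the same module that allocated it in make_Object().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct IObject;

struct IObjectVTable
{
  void        (*AddRef)(IObject *obj);    // assumed name, COM-style
  void        (*Release)(IObject *obj);   // assumed name, COM-style
  void        (*SetName)(IObject *obj, char const *s);
  char const *(*GetName)(IObject const *obj);
};

struct IObject
{
  IObjectVTable *const vtable;

  void AddRef()                 { vtable->AddRef(this); }
  void Release()                { vtable->Release(this); }
  void SetName(char const *s)   { vtable->SetName(this, s); }
  char const *GetName() const   { return vtable->GetName(this); }

protected:
  explicit IObject(IObjectVTable *vt) : vtable(vt) {}
  ~IObject() {}
};

static int g_liveObjects = 0;   // demonstration only: counts live servers

class Object : public IObject
{
public:
  Object() : IObject(GetVTable()), m_refs(1) { ++g_liveObjects; }
  ~Object() { --g_liveObjects; }

private:
  static void AddRef_(IObject *this_)
  { ++static_cast<Object*>(this_)->m_refs; }
  static void Release_(IObject *this_)
  {
    Object *self = static_cast<Object*>(this_);
    if(0 == --self->m_refs)
    {
      delete self;   // freed by the module that called new
    }
  }
  static void SetName_(IObject *this_, char const *s)
  { static_cast<Object*>(this_)->m_name = s; }
  static char const *GetName_(IObject const *this_)
  { return static_cast<Object const*>(this_)->m_name.c_str(); }

  static IObjectVTable *GetVTable()
  {
    static IObjectVTable s_vt = { AddRef_, Release_, SetName_, GetName_ };
    return &s_vt;
  }

  long        m_refs;   // starts at 1: the factory's caller owns a reference
  std::string m_name;
};

// Factory function, as named in the text; the client never calls new or delete.
inline IObject *make_Object()
{
  return new Object();
}
```

A client balances each AddRef() with a Release() and releases the reference it received from make_Object() when done; the destroy_Object() model is the degenerate case in which there is only ever one reference.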