Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson

	Chapter 8. Objects Across Borders

8.1. Mostly Portable vtables?

Anyone who has done any COM programming is certain to know that polymorphism in a compiler-independent fashion is feasible. Indeed, COM is a language-independent technology, and it's common to write COM components in C++, D, VB,^[1] and even C. Doing it in C is the hard way, and can get you extra points in interviews as well as a "very good" for understanding how COM (and C++) works, but in the real world you're better off letting your compiler synthesize your virtual function tables (vtables) [Stro1994] if you can.

^[1] Not that one would choose to use VB, you understand.

Without any further preamble, I'll just show you how easy it can be to pass C++ across C ABIs:



#define OBJ_CALLCONV  . . . // consistent within each OS

struct IObject
{
  virtual void        OBJ_CALLCONV SetName(char const *s) = 0;
  virtual char const *OBJ_CALLCONV GetName() const = 0;
};

extern "C" int make_Object(IObject **pp);

Let's look at some client code (Listing 8.1):

Listing 8.1.



#include <iostream>

int main()
{
  IObject  *pObject;

  if(make_Object(&pObject))
  {
    pObject->SetName("Reginald Perrin");
    std::cout << pObject->GetName() << std::endl;
  }
  return 0;
}

Dynamic libraries containing an implementation of the factory [Gamm1995, Sutt2000] function make_Object() and executables containing main() were created for each of our six compilers. In this case there's no need of a compiler comparison table, because all 36 permutations ran perfectly. How's that for some C++ ABI?

It would appear that we are able to support runtime polymorphism by using an object via a fully abstract class: an interface. The extension of this is to select different make_Object() functions, or to have it return instances of different types dependent on arguments, or other criteria.
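To make the library side of this concrete, here is a minimal sketch of how a factory such as make_Object() might be implemented. The class NamedObject and the function destroy_Object() are hypothetical names that do not appear in the text, the OBJ_CALLCONV macro is left out for brevity, and the nonzero-on-success return convention is only inferred from the client code in Listing 8.1.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The interface from the text, with the calling-convention macro omitted.
struct IObject
{
  virtual void        SetName(char const *s) = 0;
  virtual char const *GetName() const = 0;
};

namespace
{
  // Hypothetical concrete class, visible only inside the dynamic library.
  class NamedObject final : public IObject
  {
  public:
    void SetName(char const *s) override { m_name = (s != nullptr) ? s : ""; }
    char const *GetName() const override { return m_name.c_str(); }
  private:
    std::string m_name;
  };
}

// Assumed convention: returns nonzero on success, zero on failure.
extern "C" int make_Object(IObject **pp)
{
  if(pp == nullptr)
  {
    return 0;
  }
  try
  {
    *pp = new NamedObject();
    return 1;
  }
  catch(...)  // exceptions must not escape across the C boundary
  {
    *pp = nullptr;
    return 0;
  }
}

// Hypothetical counterpart: the library that allocated the object frees it.
extern "C" void destroy_Object(IObject *p)
{
  delete static_cast<NamedObject*>(p);
}
```

Returning a different concrete class from make_Object(), depending on an argument or some other criterion, would not change the client at all, since it only ever sees IObject. The interface as given has no way to destroy the object, which is why the sketch pairs the factory with a separate release function.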

8.1.1 vtable Layouts

You're probably looking at the code and wondering about the assumptions on which it relies regarding the C++ object-model, and you'd be right to do so. There is no stipulation in the C++ standard as to how a class's vtable is accessed with respect to a given instance. This is not a problem, so long as we restrict ourselves to interfaces that contain no member data; whether the pointer to the vtable goes at the front or the back is immaterial since it will be the only member.
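The "only member" observation can be checked with a one-line size comparison. This is a sketch with a hypothetical function name; the result is what mainstream compilers are expected to produce for a data-free interface without virtual bases, not something the standard guarantees.

```cpp
#include <cassert>

// The interface from the text, with the calling-convention macro omitted.
struct IObject
{
  virtual void        SetName(char const *s) = 0;
  virtual char const *GetName() const = 0;
};

// Hypothetical helper: true when an IObject occupies exactly one pointer,
// which is what we expect if the vptr is its only (hidden) member.
bool interface_is_single_pointer()
{
  return sizeof(IObject) == sizeof(void*);
}
```

If this ever returned false for a given compiler, the assumption that the vptr's position within the object is immaterial would need revisiting for that compiler.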

More significantly, there is no stipulation as to how a class's vtable is itself represented. In fact, the standard does not even stipulate that virtual functions are to be implemented using a vtable. It's just that all compilers use them. There's nothing stopping some brilliant person inventing a completely different implementation. The issue seems clearer if we look at it expressed in terms of C structure layouts. The equivalent of IObject in C might look like that shown in Listing 8.2.

Listing 8.2.




struct IObject;

struct IObjectVTable
{
  void        (*SetName)(struct IObject *obj, char const *s);
  char const *(*GetName)(struct IObject const *obj);
};

struct IObject
{
  struct IObjectVTable *const vtable;
};

If you're not experienced with COM, this may look pretty foreign to you, but it's actually quite straightforward. It works because fundamentally a class is just a structure, and if it defines virtual functions (or inherits from one that does), then it contains a hidden member called a vptr. The vptr is a pointer to a table, called a vtable and usually shared between all instances of the class, that contains pointers to all the class's virtual member functions. In this case, the vtable is of type struct IObjectVTable, which contains pointers to the SetName() and GetName() methods. It's like any function pointer table except that the first parameter to all functions is a pointer to the interface structure: the this pointer in C++. The interface structure struct IObject has a single member, vtable, which points to its vtable, an instance of struct IObjectVTable.
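The mechanics described above can be exercised directly. The following sketch builds the Listing 8.2 layout by hand and supplies one implementation of it; it is written in the C-compatible subset of C++ apart from the casts. The names NamedObject, NamedObject_SetName, NamedObject_GetName, and g_namedObjectVTable are hypothetical, the fixed 64-character buffer is an arbitrary choice, and SetName() takes a non-const object pointer here because it modifies the object.

```cpp
#include <cassert>
#include <cstring>

struct IObject;

// The table of function pointers; every entry takes the object first.
struct IObjectVTable
{
  void        (*SetName)(IObject *obj, char const *s);
  char const *(*GetName)(IObject const *obj);
};

// The interface structure: nothing but the pointer to the table.
struct IObject
{
  IObjectVTable *const vtable;
};

// Hypothetical implementation. The interface structure is the first
// member, so a pointer to it is also a pointer to the whole object.
struct NamedObject
{
  IObject base;
  char    name[64];
};

static void NamedObject_SetName(IObject *obj, char const *s)
{
  NamedObject *self = reinterpret_cast<NamedObject*>(obj);
  std::strncpy(self->name, (s != 0) ? s : "", sizeof(self->name) - 1);
  self->name[sizeof(self->name) - 1] = '\0';  // always terminate
}

static char const *NamedObject_GetName(IObject const *obj)
{
  return reinterpret_cast<NamedObject const*>(obj)->name;
}

// One table, shared by every NamedObject instance.
IObjectVTable g_namedObjectVTable =
{
  NamedObject_SetName,
  NamedObject_GetName
};
```

A caller would initialize an instance with `NamedObject obj = { { &g_namedObjectVTable }, "" };` and then invoke a method as `obj.base.vtable->SetName(&obj.base, "Reginald Perrin")`, which is the hand-written form of what `pObject->SetName("Reginald Perrin")` does in C++.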

As I said, there is perfect adherence among our six Win32 compilers to this layout. It's probably not a coincidence that Win32 compilers support this layout, since that is the layout that COM uses, and no Win32 compiler is going to be very popular these days unless it can support COM.

While we can accept that this is certainly an obvious and efficient object layout model, there are compilers that choose to do things differently. One Win32 compiler that does not support this is GCC 2.95 (version 3.2 was used in the tests). A lot of messing around using unprintable techniques reveals that it uses a vtable layout like the following:



struct IObjectVTable
{
  uint32_t    v1;  /* Always 0 */
  void        *v2; /* Some unknown function */
  uint32_t    v3;  /* Always 0 */
  void        (*SetName)(struct IObject *, char const *s);
  uint32_t    v4;  /* Always 0 */
  char const  *(*GetName)(struct IObject const *);
};

The values of v1, v3, and v4 are zero, so I presume this is a packing issue. v2 appears to be a function whose actual address is very close to that of SetName() and GetName(), but I don't know its precise nature. GCC 2.95 is not the only compiler to differ. Sun's C++ compiler^[2] uses a layout along the lines of:

^[2] Many thanks to Gary here for being my Solaris avatar.



struct IObjectVTable
{
  void        *v1; /* Some unknown function */
  void        *v2; /* Some unknown function */
  void        (*SetName)(struct IObject *, char const *s);
  char const  *(*GetName)(struct IObject const *);
};

So we have a somewhat unpleasant choice. One option would be to accept this partial solution, at least on Win32, since every modern compiler appears to support it. Working on other platforms might reveal similar uniformities of representation, in which case we could take the same position.

Flummery! We're imperfect practitioners, and this just doesn't cut the mustard. We need to find a complete solution.

8.1.2 Dynamic Manipulation of vtables

Before we try to work out our fully portable solution, I want to play the irresponsible host for a moment, and show you some dodgy-but-informative techniques for messing around inside the C++ object layout.

You may be looking at all this stuff in horror, worrying about whether you have to define those C vtables yourself. Normally you don't. That's one of the things that the C++ compiler takes care of quite nicely.

It's a very bad thing to go messing around with the vtables of any C++ compiler-generated classes, since any changes are likely to be reflected in all instances of that class for the duration of your process, but there's nothing stopping you from manipulating your own vtables in C. This can, in very rare circumstances, be used to change the nature of objects at run time. It's not something I'd recommend, and it's mainly useful for learning about C++ implementations rather than a technique one would wish to use in production software, but it's good to know how to do it.

Basically, you need three things. First, you have to allocate your own vtable. That's done from within C, and is simply a matter of allocating memory to hold the vtable contents. Second, you need to copy an existing vtable from a valid object. This is done from within C, but on an object created in a C++ compilation unit. Finally, you can change the members of your new vtable, and set it onto the object you wish to mess around with. This is done from within C, but it can be done on an object that was created from within a C++ compilation unit. This can all be wrapped up in one function:



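void AlterObject(Thing *thing)
{
  typedef struct ThingVTable  vtable_t;
  vtable_t *vt  = (vtable_t*)malloc(sizeof(vtable_t)); /* 1. allocate a new vtable */
  *vt           = *thing->vtable;                      /* 2. copy the existing one */
  vt->Method    = someOtherFunction;                   /* 3. change an entry ...   */
  thing->vtable = vt;                                  /*    ... and install it    */
}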

I'll leave it as an exercise for your coding skills to fill in the error handling and the cleanup of the vtable, and an exercise for your judgment to determine whether you'd ever want to do such a thing. I do know of one vendor that uses these techniques to efficiently push dynamic behavior on variant types by switching vtables and vtable entries, but you didn't hear it from me!
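For readers who want to see one possible answer to the exercise, here is a minimal sketch that adds the allocation check and the cleanup. It applies only to hand-built vtables like the ones above, not to compiler-generated ones. Everything beyond the names Thing, ThingVTable, Method, someOtherFunction, and AlterObject is an assumption: the signature of Method, the function originalFunction, the shared table s_sharedVTable, the RestoreObject() helper, and the decision to have AlterObject() return the new table so that the caller can release it later.

```cpp
#include <cassert>
#include <cstdlib>

struct Thing;

struct ThingVTable
{
  int (*Method)(Thing const *thing);  // assumed signature
};

struct Thing
{
  ThingVTable *vtable;
};

static int originalFunction(Thing const *)  { return 1; }
static int someOtherFunction(Thing const *) { return 2; }

// The table normally shared by all Thing instances.
static ThingVTable s_sharedVTable = { originalFunction };

// Gives 'thing' a private copy of its vtable with Method replaced.
// Returns the private table, or NULL if allocation failed, in which
// case 'thing' is left untouched.
ThingVTable *AlterObject(Thing *thing)
{
  ThingVTable *vt = static_cast<ThingVTable*>(std::malloc(sizeof(ThingVTable)));

  if(vt == NULL)
  {
    return NULL;
  }
  *vt           = *thing->vtable;     // copy the existing entries
  vt->Method    = someOtherFunction;  // change the one we care about
  thing->vtable = vt;                 // install on this object only
  return vt;
}

// Puts the shared table back and releases the private one.
void RestoreObject(Thing *thing, ThingVTable *privateVTable)
{
  thing->vtable = &s_sharedVTable;
  std::free(privateVTable);
}
```

Because the altered table is a private copy, other Thing instances that still point at the shared table keep their original behavior. The private table must outlive every call made through the altered object, so RestoreObject() should be called only once no such call can be in flight.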