Gotcha #53: Virtual Base Default Initialization

A virtual base subobject is laid out differently from a nonvirtual base subobject. A nonvirtual base class is typically laid out as if it were a simple data member of a derived class object, as in Figure 5-1. It may therefore occur more than once within an object:



class A { /* members */ };
class B : public A { /* members */ };
class C : public A { /* members */ };
class D : public B, public C { /* members */ };

Figure 5-1. Likely object layout under multiple inheritance without virtual inheritance. A `D` object has two `A` subobjects.

graphics/05fig01.jpg

A virtual base class occurs only once within an object, even if it occurs many times in the class lattice (hierarchy structure) of the complete object (as in Figure 5-2):



class A { /* members */ };
class B : public virtual A { /* members */ };
class C : public virtual A { /* members */ };
class D : public B, public C { /* members */ };

Figure 5-2. Likely object layout under multiple inheritance with virtual inheritance. A `D` object has a single `A` subobject.

graphics/05fig02.jpg
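The difference between the two lattices can be observed directly. The following is a minimal sketch, assuming C++11 or later; the namespaces, the `value` member, and the helper function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes in the text; the int member is an assumption.
namespace nonvirt {
    struct A { int value = 0; };
    struct B : public A {};
    struct C : public A {};
    struct D : public B, public C {};
}

namespace virt {
    struct A { int value = 0; };
    struct B : public virtual A {};
    struct C : public virtual A {};
    struct D : public B, public C {};
}

// True if the A reached through B is the same subobject as the A reached through C.
bool nonvirtualSharesA() {
    nonvirt::D d;
    nonvirt::A *viaB = static_cast<nonvirt::B *>( &d );
    nonvirt::A *viaC = static_cast<nonvirt::C *>( &d );
    return viaB == viaC;
}

bool virtualSharesA() {
    virt::D d;
    virt::A *viaB = static_cast<virt::B *>( &d );
    virt::A *viaC = static_cast<virt::C *>( &d );
    return viaB == viaC;
}

// Writes through the B path and reads back through the C path.
int writeViaBReadViaC() {
    virt::D d;
    static_cast<virt::B &>( d ).value = 7;
    return static_cast<virt::C &>( d ).value;
}

void checkSharing() {
    assert( !nonvirtualSharesA() );       // two distinct A subobjects
    assert( virtualSharesA() );           // one shared A subobject
    assert( writeViaBReadViaC() == 7 );   // both paths see the same A
}
```

Note that `d.value` is ambiguous for a `nonvirt::D` and will not compile without a qualifying cast, whereas it is unambiguous for a `virt::D`.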

For ease of illustration, we've shown a rather outmoded pointer implementation of virtual base classes. In the location where a nonvirtual base class A would appear in the complete object, we have instead a pointer to the shared storage for a single A subobject. More typically, the link to the shared virtual base subobject would be accomplished with an offset or with information stored in the virtual function table. However, the discussion that follows applies to any implementation.

Typically, the storage for the shared virtual base subobject is appended to the complete object. In the example above, the complete object is of type D, and the storage for A is appended after any D data members. An object whose "most derived class" is B would have a different storage layout.

A moment's reflection will convince you that only the most derived class knows precisely where the storage for a virtual base subobject is located. An object of type B may be a complete object, or it may be embedded as a subobject in another object. For this reason, it's the task of the most derived class to initialize all the virtual base subobjects in the class lattice as well as the mechanism used to access those subobjects.

In the case of an object whose most derived type is B, as in Figure 5-3, the B constructors will initialize the A subobject and set the pointer to it:



B::B( int arg )
   : A( arg ) {}

Figure 5-3. Likely layout of an object under single inheritance with a virtual base class. A `B` object has a single `A` subobject, but it must still be referenced indirectly.

graphics/05fig03.jpg

In the case of an object whose most derived type is D, as in Figure 5-2, the D constructors will initialize the A subobject and the pointers to it in B and C, as well as D's immediate base classes B and C themselves:

D::D( int arg )
   : A( arg ), B( arg ), C( arg+1 ) {}

Once the A subobject is initialized by D's constructor, it will not be reinitialized by B's or C's constructor. (One way the compiler might accomplish this is to have the D constructor pass a flag or A pointer to the B and C constructors that says "Oh, by the way, don't initialize A." Nothing mystical here.) Let's look at another constructor for D:



D::D()
   : B( 11 ), C( 12 ) {}

This constructor is a common source of misunderstanding and bugs in the use of virtual base classes. The D constructor still initializes the virtual A subobject, but it does so implicitly, by calling A's default constructor. When D's constructor invokes the constructor for the B subobject, that constructor doesn't reinitialize A, so the explicit A( arg ) initialization in B's constructor, which would have passed 11 to A, never takes place.
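The effect is easy to reproduce. The following is a minimal sketch, not the book's class bodies: the `value_` member, the `value()` accessor, C's constructor, and the helper functions are assumptions added so that the result of each initialization can be observed.

```cpp
#include <cassert>

class A {
  public:
    A() : value_( 0 ) {}                    // default constructor
    explicit A( int v ) : value_( v ) {}    // nondefault constructor
    int value() const { return value_; }
  private:
    int value_;
};

class B : public virtual A {
  public:
    explicit B( int arg ) : A( arg ) {}     // A( arg ) runs only if B is most derived
};

class C : public virtual A {
  public:
    explicit C( int arg ) : A( arg ) {}     // likewise ignored inside a D
};

class D : public B, public C {
  public:
    D() : B( 11 ), C( 12 ) {}               // A() is called implicitly here
    explicit D( int arg ) : A( arg ), B( arg ), C( arg+1 ) {}
};

int valueOfCompleteB() { B b( 11 ); return b.value(); }
int valueOfDefaultD()  { D d;       return d.value(); }
int valueOfExplicitD() { D d( 5 );  return d.value(); }

void checkInitialization() {
    assert( valueOfCompleteB() == 11 );  // B is most derived: A( 11 ) runs
    assert( valueOfDefaultD() == 0 );    // D is most derived: A() runs, 11 and 12 are lost
    assert( valueOfExplicitD() == 5 );   // D names A( arg ) explicitly
}
```

In this sketch the arguments 11 and 12 still reach the B and C constructors; it is only their mem-initializers for A that are skipped when D is the most derived class.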

For simplicity, it's best to use virtual base classes only when a design clearly indicates their use. (By the same token, virtual bases should not be avoided when a design clearly indicates their use.) In addition, it's usually simplest to design classes used as virtual bases as "interface classes." An interface class has no data, generally declares all its member functions (except perhaps the destructor) pure virtual, and typically has either no declared constructor or a simple default constructor:



class A {
 public:
   virtual ~A();
   virtual void op1() = 0;
   virtual int op2( int src, int dest ) = 0;
   // . . .
};

inline A::~A()
   {}
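As a hedged illustration of why this design is convenient, the sketch below derives from the interface class virtually, assuming C++11 or later. The bodies of B, C, and D, the `calls_` counter, and the helper functions are hypothetical; the point is only that no constructor anywhere needs to mention A.

```cpp
#include <cassert>

class A {                                  // interface class used as a virtual base
  public:
    virtual ~A();
    virtual void op1() = 0;
    virtual int op2( int src, int dest ) = 0;
};

inline A::~A()
   {}

class B : public virtual A {               // hypothetical: implements op1 only
  public:
    void op1() override { ++calls_; }
    int calls() const { return calls_; }
  private:
    int calls_ = 0;
};

class C : public virtual A {               // hypothetical: implements op2 only
  public:
    int op2( int src, int dest ) override { return dest - src; }
};

class D : public B, public C {};           // concrete; no initializer for A is needed

int callsSeenByB() {
    D d;
    A &a = d;
    a.op1();
    a.op1();
    return d.calls();
}

int op2ThroughInterface() {
    D d;
    A &a = d;
    return a.op2( 3, 10 );
}

void checkInterface() {
    assert( callsSeenByB() == 2 );
    assert( op2ThroughInterface() == 7 );
}
```

Because A has no data and only a trivial default constructor, the question of which class initializes it, and with which arguments, has no observable consequence.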

Following this advice will help avoid bugs in the implementation not only of constructors but also of assignment. In particular, the standard specifies that a compiler-provided version of copy assignment may, or may not, assign multiple times to a virtual base subobject. If all virtual base classes are interface classes, then assignment is a no-op (remember that class mechanism, like virtual function table pointers, is not affected by assignment, only by initialization), and multiple assignment does not pose a problem.
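The repeated assignment can be made visible by instrumenting a virtual base that does carry data. This is a minimal sketch assuming C++11 or later; the `timesAssigned` counter and the helper functions are hypothetical. Because the standard leaves the count unspecified, the sketch checks only that the base was assigned at least once.

```cpp
#include <cassert>

struct A {
    int value = 0;
    int timesAssigned = 0;                 // hypothetical instrumentation
    A() = default;
    A( const A & ) = default;
    A &operator =( const A &rhs ) {
        value = rhs.value;
        ++timesAssigned;                   // count every assignment to this subobject
        return *this;
    }
};

struct B : public virtual A { int b = 0; };
struct C : public virtual A { int c = 0; };
struct D : public B, public C { int d = 0; };  // all copy assignment is compiler-provided

// Number of times the shared A subobject was assigned during one D assignment.
int assignmentsToVirtualBase() {
    D lhs, rhs;
    rhs.value = 42;
    lhs = rhs;
    return lhs.timesAssigned;              // unspecified: may be 1 or more
}

int valueAfterImplicitAssignment() {
    D lhs, rhs;
    rhs.value = 42;
    lhs = rhs;
    return lhs.value;
}

void checkImplicitAssignment() {
    assert( valueAfterImplicitAssignment() == 42 );
    assert( assignmentsToVirtualBase() >= 1 );
}
```

If A's assignment had side effects beyond copying data, such as the counter here, the difference between one and several assignments would become a portability problem.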

General solutions to implementing assignment in a hierarchy containing virtual base classes usually involve imitating, in some sense, the semantics of construction of objects that contain virtual base class subobjects.

Consider the first implementation of class D above, shown in Figure 5-1, which contains two (nonvirtual) A subobjects. In this case, as with D's constructor, a programmer-supplied copy assignment operator can be implemented entirely in terms of its immediate base classes:

gotcha53/virtassign.cpp



D &D::operator =( const D &rhs ) {
   if( this != &rhs ) {
       B::operator =( rhs ); // assign B subobject
       C::operator =( rhs ); // assign C subobject
       // assign any D-specific members . . .
   }
   return *this;
}

This assignment makes the reasonable assumption that the B and C base classes will perform an appropriate assignment of their (nonvirtual) A subobjects. As with construction, this simple, layered approach to assignment does not hold up under virtual inheritance. As with construction, the most derived class should assign the virtual base subobjects and somehow prevent intermediary base class subobjects from reassigning:

gotcha53/virtassign.cpp



D &D::operator =( const D &rhs ) {
   if( this != &rhs ) {
       A::operator =( rhs ); // assign virtual A
       B::nonvirtAssign( rhs ); // assign B, except A part
       C::nonvirtAssign( rhs ); // assign C, except A part
       // assign any D-specific members . . .
   }
   return *this;
}

Here, we've introduced special assignment-like member functions in B and C. They perform identically to their copy assignment operators but don't perform assignment on any virtual base subobjects. This is effective but clearly complex and requires that D be intimately aware of the structure of the hierarchy beyond its immediate base classes. Any change to that structure will require reimplementation of D. As mentioned above, it's generally best that classes used as virtual bases be interface classes.
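To make the scheme concrete, here is one possible sketch of the whole hierarchy. Only `nonvirtAssign` and the shape of D's assignment operator come from the text; the data members, accessors, constructors, and the `timesAssigned_` counter are assumptions added so the behavior can be checked.

```cpp
#include <cassert>

class A {
  public:
    explicit A( int v = 0 ) : a_( v ), timesAssigned_( 0 ) {}
    A &operator =( const A &rhs ) {
        a_ = rhs.a_;
        ++timesAssigned_;                  // hypothetical instrumentation
        return *this;
    }
    int a() const { return a_; }
    int timesAssigned() const { return timesAssigned_; }
  private:
    int a_;
    int timesAssigned_;
};

class B : public virtual A {
  public:
    explicit B( int v = 0 ) : A( v ), b_( v ) {}
    B &operator =( const B &rhs ) {        // used when B is the most derived class
        if( this != &rhs ) {
            A::operator =( rhs );
            nonvirtAssign( rhs );
        }
        return *this;
    }
    int b() const { return b_; }
  protected:
    void nonvirtAssign( const B &rhs ) { b_ = rhs.b_; }  // everything except A
  private:
    int b_;
};

class C : public virtual A {
  public:
    explicit C( int v = 0 ) : A( v ), c_( v ) {}
    C &operator =( const C &rhs ) {
        if( this != &rhs ) {
            A::operator =( rhs );
            nonvirtAssign( rhs );
        }
        return *this;
    }
    int c() const { return c_; }
  protected:
    void nonvirtAssign( const C &rhs ) { c_ = rhs.c_; }  // everything except A
  private:
    int c_;
};

class D : public B, public C {
  public:
    D( int a, int b, int c, int d ) : A( a ), B( b ), C( c ), d_( d ) {}
    D &operator =( const D &rhs ) {
        if( this != &rhs ) {
            A::operator =( rhs );          // assign virtual A exactly once
            B::nonvirtAssign( rhs );       // assign B, except A part
            C::nonvirtAssign( rhs );       // assign C, except A part
            d_ = rhs.d_;                   // D-specific members
        }
        return *this;
    }
    int d() const { return d_; }
  private:
    int d_;
};

int assignmentsAfterCopy() {
    D lhs( 1, 2, 3, 4 ), rhs( 5, 6, 7, 8 );
    lhs = rhs;
    return lhs.timesAssigned();
}

void checkNonvirtAssign() {
    D lhs( 1, 2, 3, 4 ), rhs( 5, 6, 7, 8 );
    lhs = rhs;
    assert( lhs.a() == 5 && lhs.b() == 6 && lhs.c() == 7 && lhs.d() == 8 );
    assert( assignmentsAfterCopy() == 1 );  // A assigned once, not once per path
}
```

Making `nonvirtAssign` protected is a design choice of this sketch: it keeps the partial assignment out of the public interface while still letting derived classes compose it.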

One implication of the layout of virtual base class subobjects is that it's illegal to perform a static downcast from a virtual base class to one of its derived classes:



A *ap = gimmeanA();
D *dp = static_cast<D *>(ap); // error!
dp = (D *)ap; // error!

It is possible to perform a reinterpret_cast from a virtual base to one of its derived classes. As shown in Figure 5-4, this will probably result in a bad address and so is not of much use. The only reliable way to perform a downcast from a virtual base pointer or reference is to use a dynamic_cast (but see Gotcha #45):



if( D *dp = dynamic_cast<D *>(ap) ) {
 // do something with dp . . .
}

Figure 5-4. Likely effect of static and dynamic casting under multiple inheritance with virtual base classes. Under this implementation, a `D` object has three valid addresses, and a correct cast depends on knowledge of the offsets of the various subobjects within the complete object.

graphics/05fig04.jpg

However, systematic use of dynamic_cast may indicate a poor design. (See Gotchas #98 and #99.)
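For completeness, the downcast from a virtual base can be exercised with a minimal sketch like the following. It assumes that A is polymorphic (here, through a virtual destructor), which dynamic_cast requires; the helper functions `refersToD` and `roundTrip` are hypothetical.

```cpp
#include <cassert>

class A { public: virtual ~A() {} };       // polymorphic, as dynamic_cast requires
class B : public virtual A {};
class C : public virtual A {};
class D : public B, public C {};

// True if ap points to the A subobject of a complete D object.
bool refersToD( A *ap ) {
    // static_cast<D *>( ap ) would not compile here: A is a virtual base of D.
    return dynamic_cast<D *>( ap ) != nullptr;
}

// True if casting down from the virtual base recovers the original D address.
bool roundTrip() {
    D d;
    A *ap = &d;
    return dynamic_cast<D *>( ap ) == &d;
}

void checkDowncast() {
    D d;
    B b;
    assert( refersToD( &d ) );             // the complete object is a D
    assert( !refersToD( &b ) );            // the complete object is only a B
    assert( roundTrip() );
}
```

The cast consults run-time type information in the complete object, which is why it can locate the D even though the offset from A to D is not known at compile time.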
