
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 7.  ABI


7.4. I Can C Clearly Now

Having looked at static and dynamic linking for both C and C++, it's pretty clear that the only thing that we could reasonably call compiler-independent entails dynamic linking to code with a C interface.

We're now going to look at a way we can use our objects in a compiler-independent fashion, but first we should do a quick recap of C and C++ compatibility.

7.4.1 extern "C"

Since C++ is a near superset of C, there naturally has to be a way for it to interact with C functions. Because C++ mangles the name of every function, in case it is overloaded, there needs to be a mechanism to tell it not to mangle functions that are implemented in C. This is done using extern "C", as in:



void cppfunc(int);
extern "C" void cfunc(int);



Now you can use cfunc() in your C++ source, and the compiler and linker will ensure that the references will be resolved to an unmangled symbol (e.g., _cfunc). Uses of the C++ function cppfunc() will resolve to the symbol ?cppfunc@@YAXH@Z for Visual C++ compatible compilers/linkers.

Naturally, once a function has been declared to have C linkage, it is illegal to overload it with another of C linkage:



void cppfunc(int);
extern "C" void cfunc(int);
extern "C" void cfunc(char);  // Error: "more than one instance of "cfunc" has "C" linkage"



But—and this surprises a lot of interview candidates[6]—you can overload a function declared as extern "C" as many times as you like, as long as none of the overloads is also declared extern "C". This can actually be very handy. Consider that you have a set of overloaded C++ functions, of which all but the first are forwarding functions for the user's convenience, as shown in Listing 7.2.

[6] And a few interviewers, at that!

Listing 7.2.


// ConnApi.h
typedef struct conn_info_t *conn_handle_t;
conn_handle_t CreateConnection( char const *host
                               , char const *source
                               , int flags, unsigned *pid);
conn_handle_t CreateConnection(char const *host, int flags);
conn_handle_t CreateConnection( char const *host, int flags
                               , unsigned *pid);

// ConnApi.cpp
conn_handle_t CreateConnection( char const *host
                               , char const *source
                               , int flags, unsigned *pid)
{
  . . .
}
conn_handle_t CreateConnection(char const *host, int flags)
{
  return CreateConnection(host, host, flags, NULL);
}
conn_handle_t CreateConnection(char const *host, int flags
                               , unsigned *pid)
{
  return CreateConnection(host, host, flags, pid);
}



In order to house this API in a compiler-agnostic dynamic library, you could simply declare the first variant of the function extern "C", and define the others as overloads within the C++ compilation units only, as shown in Listing 7.3.

Listing 7.3.


// ConnApi.h
typedef struct conn_info_t *conn_handle_t;
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
conn_handle_t CreateConnection( char const *host
                               , char const *source
                               , int flags, unsigned *pid);

#ifdef __cplusplus
} /* extern "C" */

inline conn_handle_t CreateConnection( char const *host
                                      , int flags)
{
  return CreateConnection(host, host, flags, NULL);
}
inline conn_handle_t CreateConnection( char const *host
                                      , int flags
                                      , unsigned *pid)
{
  return CreateConnection(host, host, flags, pid);
}
#endif /* __cplusplus */

// ConnApi.cpp
conn_handle_t CreateConnection(char const *host, char const *source, int flags, unsigned *pid)
{
  . . .
}
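To see the overloading rule in isolation, here is a minimal self-contained sketch. The name conn_demo_open and its body are hypothetical stand-ins (it just adds up string lengths and flags rather than opening a connection); the point is only that one function has C linkage while its overload has ordinary C++ linkage.

```cpp
#include <cstring>

// The single exported entry point. It has C linkage, so its symbol is
// not mangled. Hypothetical stand-in body: it returns the combined
// length of the two strings plus the flags.
extern "C" int conn_demo_open(char const *host, char const *source, int flags)
{
    return static_cast<int>(std::strlen(host) + std::strlen(source)) + flags;
}

// C++-only convenience overload. It has C++ linkage (it is mangled), so
// it does not clash with the C-linkage function of the same name.
inline int conn_demo_open(char const *host, int flags)
{
    return conn_demo_open(host, host, flags);
}
```

A C++ caller can use either form; a C caller sees only the three-parameter function.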



A nice side effect is that this is now usable by C programmers. Since there are still a lot of them out there, it's nice to be able to increase the potential user-base of your code.

Obviously, this only works when your library is using functions, and where all its types are expressible in C, or at least have an identical layout in all the potential C++ compilers for the given operating system.

Types that are expressible in C essentially mean POD types (see Prologue), which are defined in C++ in order to allow interconvertibility between C and C++. When using C-linkage, one normally tends to restrict one's parameter types to POD types. However, extern "C" really only means "no mangling"; it does not mean "C only." Thus it is quite feasible to define the following:



class CppSpecific
{
  int CppSpecific::*pm;
};
extern "C" void func(CppSpecific const &);



Even though it is feasible to build your shared library around C linkage functions that manipulate C++ classes, it is dangerous to do so, because many compilers have different object layout models [Lipp1996], as we'll see in Chapter 12. In practice, it is wise to stick to POD types.
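One way to enforce the "stick to POD types" advice with a modern compiler is a compile-time check. The following is a sketch only: it assumes C++11 or later for static_assert and the type traits, and the names conn_params, conn_params_sum, and WithVtable are invented for illustration.

```cpp
#include <type_traits>

// Hypothetical parameter block passed across a C-linkage boundary.
struct conn_params
{
    int      flags;
    unsigned timeout_ms;
    char     host[64];
};

// Fails to compile if someone later adds a virtual function, a
// non-trivial constructor, or a similar C++-only feature.
static_assert(std::is_trivial<conn_params>::value &&
              std::is_standard_layout<conn_params>::value,
              "conn_params must remain expressible in C");

// A type that should not cross the boundary: the virtual destructor
// makes its layout compiler-specific.
struct WithVtable
{
    virtual ~WithVtable() {}
    int value;
};
static_assert(!std::is_standard_layout<WithVtable>::value,
              "WithVtable is not standard-layout");

// C-linkage function restricted to the checked type.
extern "C" int conn_params_sum(conn_params const *p)
{
    return p->flags + static_cast<int>(p->timeout_ms);
}
```

The check does not prove that two compilers agree on the layout, but it does catch the accidental introduction of features that C cannot express.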

However, that's not the end of the portable C++ story, as we'll see later in this chapter.

7.4.2 Namespaces

When I mentioned earlier that we're not able to portably export overloaded functions or class methods, I didn't mention namespaces. That was not an oversight.

In [Stro1994] Bjarne Stroustrup discusses the restriction of a maximum of one instance of extern "C" functions per name in a link unit, irrespective of any namespace context. He describes it as a "compatibility hack," and indeed it is. However, it is a hack we can be thankful for, since it provides us with a degree of flexibility when defining portable functions.

Basically, if an extern "C" function is defined within a namespace, the namespace is omitted from the symbol name. Hence the following still has the symbol name ns_func:



namespace X
{
  extern "C" void ns_func();
}



This is how the standard library can place C standard library functions in the std namespace without breaking linkage to it from C++ programs.

We can turn this to our advantage, since we can define our portable functions within a namespace to be good C++ citizens, and yet still use them from within C code. However, it's a double-edged sword, since there can only be one binary form of a function with C linkage. In other words, if you're defining portable functions and you link to something that has done the same thing, you will have a linker clash, irrespective of whether you defined your functions in different C++ namespaces.

In practice, I've never encountered this problem, but it's not to be dismissed. It's better to opt for some C-style disambiguation in the form of the API_Function naming convention, for example, Connection_Create().

7.4.3 extern "C++"

A curious converse to extern "C", seldom used, is extern "C++". This declares a function, or a block of functions, as having C++ linkage. Since this is only valid within C++ code, you might wonder when you'd ever have cause to use such a thing.

Consider the following header file:



// extern_c.h
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
  int func1();
  int func2(int p1, int p2, int p3 /* = -1 */);
#ifdef __cplusplus
} /* extern "C" */
  inline int func2(int p1, int p2)
  {
    return func2(p1, p2, -1);
  }
#endif /* __cplusplus */



func2() has a third parameter that, when not given a meaningful value, should default to -1. Since C does not support default parameters, the appropriate thing to do is to provide a C++-only overload, which we do by overloading the function for C++ compilation (inside the second #ifdef __cplusplus block).

Now consider another header file, written purely for use by client code:



// no_extern_c.h
int func1();
int func2(int p1, int p2, int p3);



If this file provides the declarations for files compiled in C, then they will have C linkage, and be unmangled in the object/library file. Using them from within C++ would therefore result in a linker error, as the compiled code would tell the linker to look for mangled names. In order to guard against this, the advice [Stro1994] is to surround the inclusion with extern "C", as in:



// cpp_src.cpp
extern "C"
{
#include "no_extern_c.h"
}
int main()
{
  return func2(10, 20, -1);
}



Unfortunately, if you do the same with a file such as extern_c.h, you'll be informed that you cannot have a second overload of func2() with C linkage. Hence, it's not good enough to simply declare your C++-only overloads outside the extern "C" block; you also need to enclose them within an extern "C++" block, which tells the compiler to give them C++ linkage (i.e., to use mangling), as in Listing 7.4.

Listing 7.4.


// extern_c.h
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
  int func1();
  int func2(int p1, int p2, int p3 /* = -1 */);
#ifdef __cplusplus
  extern "C++"
  {
    inline int func2(int p1, int p2)
    {
      return func2(p1, p2, -1);
    }
  } /* extern "C++" */
} /* extern "C" */
#endif /* __cplusplus */
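The nesting can be tried out in a single translation unit. In this sketch the names demo_func2 and its summing body are hypothetical; the outer extern "C" block plays the role of both the header's own guard and a client's defensive wrapping of the #include.

```cpp
extern "C"
{
    // C linkage: only one function of this name may exist.
    int demo_func2(int p1, int p2, int p3);

    // Switch back to C++ linkage for the convenience overload, so it is
    // mangled and does not count as a second C-linkage demo_func2.
    extern "C++"
    {
        inline int demo_func2(int p1, int p2)
        {
            return demo_func2(p1, p2, -1);
        }
    }
}

// Stand-in definition so the sketch links; a real library would
// provide this in its own source file.
extern "C" int demo_func2(int p1, int p2, int p3)
{
    return p1 + p2 + p3;
}
```

Removing the extern "C++" block makes the two-parameter overload a second C-linkage function of the same name, which the compiler rejects.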



Etiquette requires one to refrain from enclosing any pure C++ headers (sometimes denoted by having .hpp or .hxx extensions, or all those daft no-extension standard library headers) in extern "C". But mixed headers are almost always given the .h extension, and so if you have C++ code in there, you should protect it accordingly. That's the way to write robust headers that are compatible with both C and C++.

There are occasions when you want to declare, and even define, C++ functions in contexts that are going to be surrounded with extern "C" automatically. A good example of this is when using COM IDL. It is not uncommon to define some C++ helper functions, or simple classes, inside the IDL, since in this way the separation between your code and the types and interfaces it uses is minimized. In this case, wrap the code inside a conditionally defined extern "C++" block so that it will survive the MIDL compiler, which surrounds the entire translation of the interface definitions in extern "C".

There's a related side note I should make before we finish this topic. Several old compilers have problems instantiating template parameterizations made inside functions that are declared extern "C", giving confusing error messages:



extern "C" void CreateSomeObject(SomeObject **pp)
{
  *pp = new Concrete<SomeObject>(); // Error: template Concrete cannot be defined extern "C"
}



The simple answer here is to provide forwarding functions:



SomeObject *makeSomeObject()
{
  return new Concrete<SomeObject>();
}
extern "C" void CreateSomeObject(SomeObject **pp)
{
  *pp = makeSomeObject();
}



7.4.4 Getting a Handle on C++ Classes

Thus far we've got a fair degree of portability, but at a severe sacrifice of C++ expressiveness. Thankfully that's not the end of the story. Few of us want to implement our client code in C. Having the option is nice, as there are plenty of C programmers out there who may want to use our libraries, even if they are written in that inefficient, new-fangled object-oriented stuff. But we want C++ on the client side, if only for our RAII (see section 3.5), so what can we do to make our portable code more C++-friendly?

Well, just as we can deconstruct the public interface of a class behind a C API we can reconstruct it on the client side, in the form of a wrapper class, which conveniently can handle our resource management for us via RAII. Of course, this will rankle the efficiency instincts, but it's often a sacrifice worth making. The wrapper class can also handle the initialization and release of the library, so it can be entirely self-contained.

In some circumstances you can enhance the technique to genuinely, albeit circuitously, export classes. Let's look at one of my favorite classes, the Synesis BufferStore class, which is implemented according to the technique I'll be describing. Logically, it has the following form:



class BufferStore
{
public:
  BufferStore(size_t cbBuffer, unsigned cBuffers);
  ~BufferStore();
public:
  unsigned Allocate(void **ppBuffers, unsigned cBuffers);
  unsigned Share( void const **ppSrcBuffers
                , void **ppDestBuffers, unsigned cBuffers);
  void Deallocate(void **ppBuffers, unsigned cBuffers);
};



It creates a set of shareable buffers, which it can then allocate, deallocate, and share in a highly efficient manner. It's ideal for implementing networking services. It is made portable behind a C API as follows:



// MLBfrStr.h
__SYNSOFT_GEN_OPAQUE(HBuffStr); // Generates a unique handle
HBuffStr BufferStore_Create(Size siBuffer, UInt32 cBuffers);
void     BufferStore_Destroy(HBuffStr hbs);
UInt32   BufferStore_Allocate( HBuffStr hbs, PPVoid buffers
                             , UInt32 cRequest);
UInt32   BufferStore_Share( HBuffStr hbs, PPVoid srcBuffers
                          , PPVoid destBuffers, UInt32 cShare);
void     BufferStore_Deallocate( HBuffStr hbs, PPVoid buffers
                               , UInt32 cReturn);
. . .



This API is implemented by translating HBuffStr handles to pointers to an internal class BufferStore_,[7] as in:

[7] It would be better named as BufferStoreImpl, but I'm an underscore addict. Don't copy me!



// In MLBfrStr.cpp
UInt32 BufferStore_Allocate( HBuffStr hbs, PPVoid buffers
                           , UInt32 cAllocate)
{
  BufferStore_ *bs = BufferStore_::HandleToPointer(hbs);
  return bs->Allocate(buffers, cAllocate);
}



This is a lot of brain-dead boilerplate, and it also has a slight cost in efficiency. But it allows us to implement the class in C++, while maintaining a portable C interface. We can also use it in C++ form, because the header file also contains the code shown in Listing 7.5.

Listing 7.5.


// MLBfrStr.h
#ifdef __cplusplus
extern "C++" {

class BufferStore
{
  . . .
  void  Deallocate(PPVoid buffers, UInt32 cReturn)
  {
    BufferStore_Deallocate(m_hbs, buffers, cReturn);
  }
  . . .
private:
  HBuffStr m_hbs;
};

} /* extern "C++" */
#endif /* __cplusplus */



Now we have C++ on both sides, communicating across a portable C interface. It's actually the Bridge pattern [Gamm1995], so you can convince yourself that you're being terribly modern if it makes up for the drudge of implementing it.
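The whole round trip can be sketched in miniature. Everything named here (HDemoCounter, the DemoCounter_ functions, DemoCounterImpl, and the DemoCounter wrapper) is hypothetical and far simpler than BufferStore; the sketch assumes C++11 for nullptr and deleted copy operations, and collapses what would be a header, a library source file, and client code into one block.

```cpp
// ---- Shared header: an opaque handle plus C-linkage functions ----
struct demo_counter_tag;                       // never defined
typedef struct demo_counter_tag *HDemoCounter;

extern "C"
{
    HDemoCounter DemoCounter_Create(int start);
    void         DemoCounter_Destroy(HDemoCounter h);
    int          DemoCounter_Next(HDemoCounter h);
}

// ---- Library side: the real class hides behind the handle ----
namespace
{
    class DemoCounterImpl
    {
    public:
        explicit DemoCounterImpl(int start) : m_value(start) {}
        int Next() { return m_value++; }
    private:
        int m_value;
    };

    DemoCounterImpl *HandleToPointer(HDemoCounter h)
    {
        return reinterpret_cast<DemoCounterImpl *>(h);
    }
}

extern "C" HDemoCounter DemoCounter_Create(int start)
{
    try
    {
        return reinterpret_cast<HDemoCounter>(new DemoCounterImpl(start));
    }
    catch (...)  // exceptions must not escape across the C boundary
    {
        return nullptr;
    }
}

extern "C" void DemoCounter_Destroy(HDemoCounter h)
{
    delete HandleToPointer(h);  // deleting a null pointer is harmless
}

extern "C" int DemoCounter_Next(HDemoCounter h)
{
    return HandleToPointer(h)->Next();
}

// ---- Client side: RAII wrapper rebuilt on top of the C API ----
class DemoCounter
{
public:
    explicit DemoCounter(int start) : m_h(DemoCounter_Create(start)) {}
    ~DemoCounter() { DemoCounter_Destroy(m_h); }
    DemoCounter(DemoCounter const &) = delete;
    DemoCounter &operator=(DemoCounter const &) = delete;

    bool Valid() const { return m_h != nullptr; }
    int  Next() { return DemoCounter_Next(m_h); }  // requires Valid()
private:
    HDemoCounter m_h;
};
```

Only the handle type and the three C-linkage functions cross the library boundary, so the wrapper and the implementation class may be built with different compilers.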

The cost is a small time penalty due to the indirection of the external class holding a handle to the internal class. However, in the main this technique is reserved for meatier classes, and therefore that small cost is not significant. (I accept that this may be a self-fulfilling prophecy; I must confess that the ABI issue has percolated a lot of my thinking on C++ over the last decade.)

7.4.5 Implementation-defined Pitfalls

So far, the picture painted is reasonably rosy, but there are still a couple of things that can spoil the party.

First, on some operating systems, there can be different calling conventions. A full discussion of these calling conventions is outside the scope of this book, but it should be clear that if a function has a different binary name, or uses the stack in a different way, it will break the ABI techniques that we've developed. Therefore, where necessary, the function calling conventions must form part of the ABI specification. Thus, you'd expect to see something like that shown in Listing 7.6.

Listing 7.6.


// extern_c.h
#ifdef WIN32
# define MY_CALLCONV  __cdecl
#else /* ? operating system */
# define MY_CALLCONV
#endif /* operating system */

#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
  int MY_CALLCONV func1();
  int MY_CALLCONV func2(int p1, int p2, int p3 /* = -1 */);
#ifdef __cplusplus
} /* extern "C" */
. . .
#endif /* __cplusplus */



A similar situation exists with respect to the sizes of the types used by your ABI functions. You have no guarantee that the sizes of types in one compiler will be the same as those for another. If you use a type, say long, that is interpreted differently by different compilers for the same operating system, you are in trouble. This is where fixed-sized types (see Chapter 13) are very useful, and my policy is to only use fixed-sized types in ABI functions.
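With a compiler that provides the standard fixed-width typedefs, this policy might look like the following sketch. It assumes C++11's cstdint header; the function Demo_SumBytes is hypothetical and exists only to show the types in an ABI signature.

```cpp
#include <climits>
#include <cstdint>

// State the size assumptions in the code, so that a compiler that
// disagrees fails the build rather than the link or the run.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be 4 bytes");

// Every parameter and the return value have a fixed size, unlike
// long or unsigned long, whose size may vary between compilers.
extern "C" std::uint32_t Demo_SumBytes(std::uint8_t const *data,
                                       std::uint32_t count)
{
    std::uint32_t sum = 0;
    for (std::uint32_t i = 0; i != count; ++i)
    {
        sum += data[i];
    }
    return sum;
}
```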

In practice, this problem is rare with integral types, but it is very common for compilers to differ with respect to floating-point or character types. For example, there is considerable disagreement between compilers on the size of the long double type. Some (e.g., Borland, Digital Mars, GCC, and Intel) conform to the IEEE 754 standard [Kaha1998] and define it as an 80-bit type, whereas others define it to be 64 bits (and thus the same size as double). In the absence of a guarantee, you are strongly advised to err on the side of caution.

A related problem which you're much more likely to fall foul of is the packing of structures. Since different compilers can have different default packing behaviour, we must explicitly stipulate the packing size for any structures that will be shared across our ABI. As with the calling conventions, this can involve a lot of pre-processor gunk, using common, but non-standard, packing #pragmas, as shown in Listing 7.7.

Listing 7.7.


#if defined(ACMELIB_COMPILER_IS_ABC)
# pragma packing 1
#elif defined(ACMELIB_COMPILER_IS_IJK)
# pragma pack(push, 1)
#elif . . .
  . . .
struct abi_struct
{
  int   i;
  short s;
  char  ar[5];
};
#if defined(ACMELIB_COMPILER_IS_ABC)
# pragma packing default
#elif defined(ACMELIB_COMPILER_IS_IJK)
# pragma pack(pop)
#elif . . .



A good way to get around this, if all compilers are suitably similar in packing semantics, is to define the pushing and popping pragmas in their own include files, which can help maintainability and readability:



#include <acmelib_pack_push_1.h>
struct abi_struct
{
  . . .
};
#include <acmelib_pack_pop_1.h>
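Whichever pragmas are used, it helps to have the compiler confirm the resulting layout. The sketch below assumes a compiler that accepts the common but non-standard #pragma pack(push, n) form, and C++11 for static_assert and the fixed-width types; demo_abi_struct is a hypothetical variant of abi_struct that uses fixed-size members.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)   // non-standard, but widely supported
struct demo_abi_struct
{
    std::int32_t i;      // offset 0, 4 bytes
    std::int16_t s;      // offset 4, 2 bytes
    char         ar[5];  // offset 6, 5 bytes
};
#pragma pack(pop)

// With 1-byte packing there is no trailing padding: 4 + 2 + 5 = 11.
// If the pragma is ignored or mistyped, these checks fail at compile time.
static_assert(sizeof(demo_abi_struct) == 11, "unexpected structure size");
static_assert(offsetof(demo_abi_struct, s) == 4, "unexpected offset of s");
static_assert(offsetof(demo_abi_struct, ar) == 6, "unexpected offset of ar");
```

Placing such assertions next to every shared structure turns a silent packing mismatch into a build error on the offending compiler.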



There are certainly a lot of practical problems when trying to get our ABI right, but we're imperfect practitioners, and we're not going to lie down in the face of a bit of adversity. In the next chapter we're going to see how we can support more of C++ in a portable fashion.

