Gotcha #32: Misunderstanding Pointer-to-Pointer-to-Const Conversion

The happily simple state of conversions that holds for pointer to const does not hold in the case of pointer to pointer to const. Consider an attempt to convert a pointer to a pointer to a char to a pointer to a pointer to a const char (that is, to convert char ** to const char **):



char **ppc;
const char **ppcc = ppc; // error!

It looks harmless, but, like many harmless-looking conversions, it opens the door to a subversion of the type system:



const T t = init;
T *pt;
const T **ppt = &pt; // error, fortunately
*ppt = &t;  // put a const T * into a T *!
*pt = value; // trash t!
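One way to probe which of these conversions a compiler accepts, without provoking a hard error, is to ask at compile time. The following is a minimal sketch, assuming a C++11 compiler and the standard std::is_convertible trait from <type_traits>; it is an illustration, not part of the original listing.

```cpp
#include <type_traits>

// The conversion discussed above is rejected: it would let a
// const char * be stored where a char * is expected.
static_assert( !std::is_convertible<char **, const char **>::value,
               "char ** must not convert to const char **" );

// Adding const at the intermediate level as well closes the hole,
// so this conversion is accepted.
static_assert( std::is_convertible<char **, const char * const *>::value,
               "char ** converts to const char * const *" );

// The single-level case stays simple.
static_assert( std::is_convertible<char *, const char *>::value,
               "char * converts to const char *" );
```

If any of these assumptions were wrong, the translation unit would fail to compile, with the message naming the offending conversion.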

This compelling subject is treated in section 4.4 of the standard, under "Qualification Conversions." (Technically, const and volatile are known in C as "type-qualifiers," but the C++ standard tends to refer to them as "cv-qualifiers." I tend to refer to them as type-qualifiers.) There we find the following simple rules for determining convertibility:

A conversion can add cv-qualifiers at levels other than the first in multi-level pointers, subject to the following rules:

Two pointer types T1 and T2 are similar if there exists a type T and integer n > 0 such that:

T1 is $cv_{1,0}$ pointer to $cv_{1,1}$ pointer to … $cv_{1,n-1}$ pointer to $cv_{1,n}$ T

and

T2 is $cv_{2,0}$ pointer to $cv_{2,1}$ pointer to … $cv_{2,n-1}$ pointer to $cv_{2,n}$ T

where each $cv_{i,j}$ is const, volatile, const volatile, or nothing.

In other words, two pointers are similar if they have the same base type and have the same number of *'s. So, for example, the types char * const ** and const char ***const are similar, but int * const * and int *** are not.

The n-tuple of cv-qualifiers after the first in a pointer type, e.g., $cv_{1,1}, cv_{1,2}, \ldots, cv_{1,n}$ in the pointer type T1, is called the cv-qualification signature of the pointer type. An expression of type T1 can be converted to type T2 if and only if the following conditions are satisfied:

The pointer types are similar.

For every j > 0, if const is in $cv_{1,j}$ then const is in $cv_{2,j}$, and similarly for volatile.

If $cv_{1,j}$ and $cv_{2,j}$ are different, then const is in every $cv_{2,k}$ for 0 < k < j.

Armed with these rules—and a little patience—we can determine the legality of pointer conversions such as the following:



int * * * const cnnn = 0;
   // n==3, signature == none, none, none
int * * const * ncnn = 0;
   // n==3, signature == const, none, none
int * const * * nncn = 0;
   // signature == none, const, none
int * const * const * nccn = 0;
   // signature == const, const, none
const int * * * nnnc = 0;
   // signature == none, none, const





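// examples of application of rules
ncnn = cnnn; // OK: const added at j==1 only
nncn = cnnn; // error! const added at j==2, but not at j==1
nccn = cnnn; // OK: const added at j==1 and j==2
nnnc = cnnn; // error! const added at j==3, but not at j==1 and j==2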
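The verdicts above can be cross-checked mechanically. The sketch below assumes C++11 and uses std::is_convertible; the helper converts and the alias AllConst are hypothetical names introduced only for this illustration.

```cpp
#include <type_traits>

int * * * const cnnn = 0;             // signature: none, none, none
int * * const * ncnn = 0;             // signature: const, none, none
int * const * * nncn = 0;             // signature: none, const, none
int * const * const * nccn = 0;       // signature: const, const, none
const int * * * nnnc = 0;             // signature: none, none, const

// Hypothetical helper (not from the text): true if an expression of
// type From converts implicitly to type To.
template <typename To, typename From>
constexpr bool converts() {
    return std::is_convertible<From, To>::value;
}

static_assert(  converts<decltype(ncnn), decltype(cnnn)>(), "ncnn = cnnn is OK" );
static_assert( !converts<decltype(nncn), decltype(cnnn)>(), "nncn = cnnn is an error" );
static_assert(  converts<decltype(nccn), decltype(cnnn)>(), "nccn = cnnn is OK" );
static_assert( !converts<decltype(nnnc), decltype(cnnn)>(), "nnnc = cnnn is an error" );

// Hypothetical extra case: const on the base type is acceptable once
// every intermediate level after the first is const as well.
using AllConst = const int * const * const *;
static_assert(  converts<AllConst, decltype(cnnn)>(), "all-const target is OK" );
```

The last assertion shows the usual repair for a rejected conversion: add const at each intermediate level until the third condition is met.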

These rules may seem esoteric, but their use arises fairly often. Consider the following common situation:



extern char *namesOfPeople[];
for( const char **currentName = namesOfPeople; // error!
        *currentName; currentName++ ) // . . .

In my experience, the typical response to this error is to file a bug report with the compiler vendor, cast away the error, and dump core later on. As usual, the compiler is right and the developer is not.
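A cast is not needed here. Under the rules above, the loop compiles if const is added at the intermediate level too, so that currentName has type const char * const *. The following is a minimal sketch: the array contents and the function countNames are hypothetical stand-ins, and the array is assumed to end with a null pointer, as the loop condition in the text implies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical definition of the array that the text only declares.
char alice[] = "Alice";
char bob[] = "Bob";
char *namesOfPeople[] = { alice, bob, nullptr };

// Walks the array without a cast. char ** converts to
// const char * const * because const appears at every level
// after the first in the target type.
std::size_t countNames() {
    std::size_t count = 0;
    for( const char * const *currentName = namesOfPeople;
            *currentName; ++currentName )
        ++count;
    return count;
}
```

Through currentName, neither the characters nor the pointers stored in the array can be modified, which is exactly the guarantee that makes the conversion safe.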

Let's reconsider a more specific version of our earlier example:



typedef int T;
const T t = 12345;
T *pt;
const T **ppt = (const T **)&pt; // an evil cast!
*ppt = &t;  // put a const T * into a T *!
*pt = 54321; // trash t!

The truly tragic aspect of this code is that the bug may remain undetected for years before manifesting itself under simple maintenance. For example, we can use the value of t:



cout << t; // output 12345, probably

Because the compiler may freely substitute the initializer of a constant for the constant itself, this statement is likely to output the value 12345 even after the value of the constant has been changed to 54321. Later, a slightly different use of t will unveil the bug:



const T *pct = &t;
// . . .
cout << t; // output 12345
cout << *pct; // output 54321!

It's often better design to avoid the complexities of pointers to pointers through use of references or the standard library. For example, it's common in C to pass the address of a pointer (that is, a pointer to a pointer) to modify the value of the pointer:

gotcha32/gettoken.cpp



// get_token returns a pointer to the next sequence of
// characters bounded by characters in ws.
// The argument pointer is updated to point past the
// returned token.
// ws is const char * because a string literal no longer
// converts to char * in current C++.
char *get_token( char **s, const char *ws = " \t\n" ) {
   const char *p;
   do
       for( p = ws; *p && **s != *p; p++ );
   while( *p ? *(*s)++ : 0 );
   char *ret = *s;
   do
       for( p = ws; *p && **s != *p; p++ );
   while( *p ? 0 : **s ? (*s)++ : 0 );
   if( **s ) {
       **s = '\0';
       ++*s;
   }
   return ret;
}





extern char *getInputBuffer();
char *tokens = getInputBuffer();
// . . .
while( *tokens )
   cout << get_token( &tokens ) << endl;

In C++, we prefer to pass the pointer argument as a reference to non-constant. This cleans up the implementation of the function somewhat and, more important, makes its use less clumsy:

gotcha32/gettoken.cpp



// ws is const char * because a string literal no longer
// converts to char * in current C++.
char *get_token( char *&s, const char *ws = " \t\n" ) {
   const char *p;
   do
       for( p = ws; *p && *s != *p; p++ );
   while( *p ? *s++ : 0 );
   char *ret = s;
   do
       for( p = ws; *p && *s != *p; p++ );
   while( *p ? 0 : *s ? s++ : 0 );
   if( *s ) *s++ = '\0';
   return ret;
}


// . . .
while( *tokens )
   cout << get_token( tokens ) << endl;

Our original example can be more safely rendered with standard library components:



extern vector<string> namesOfPeople;
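With this representation, read-only access is expressed by a reference to const, and no pointer-to-pointer conversion arises at all. The following is a small sketch, assuming C++11; the vector's contents and the function longestName are hypothetical and serve only to show the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical definition of the vector that the text only declares.
std::vector<std::string> namesOfPeople = { "Alice", "Bob" };

// Returns the length of the longest name. The const reference gives
// read-only access to both the container and its strings.
std::size_t longestName( const std::vector<std::string> &names ) {
    std::size_t longest = 0;
    for( const std::string &name : names )
        if( name.size() > longest )
            longest = name.size();
    return longest;
}
```

A call such as longestName( namesOfPeople ) needs no sentinel element and no cast, and an empty vector simply yields zero.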
