
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 1.  Enforcing Design: Constraints, Contracts, and Assertions


1.2. Compile-Time Contracts: Constraints

This section is devoted to compile-time enforcements, commonly called constraints. Unfortunately, C++ does not provide direct support for constraints.

Imperfection: C++ does not provide direct support for constraints.


Because C++ is an extremely powerful and flexible language, many proponents (including some of the most august C++ thinkers in the business) consider that the implementation of such constraints as described in this section is sufficient. However, as a big fan both of C++ and of constraints, I must demur, for an eminently prosaic reason. I don't buy many of the criticisms espoused by proponents of other languages, but I do think there's no running away from the (sometimes extreme) difficulty in ploughing through the varying and often agonizingly esoteric messages produced by compilers when constraints are violated. If you're the author of the code that's "fired" the constraint, you are often okay, but understanding a many-level template instantiation that has failed from the messages of even very good compilers is next to impossible. In the remainder of this section we look at some constraints, the messages produced when they fail, and some of the measures we can take to make those messages a bit more understandable.

1.2.1 must_have_base()

This one is lifted, almost verbatim, from a comp.lang.c++.moderated newsgroup post by Bjarne Stroustrup, although he called his constraint Has_base. It's also described in [Sutt2002], where it's called IsDerivedFrom. I like to name constraints beginning with must_, so I call it must_have_base.

Listing 1.1.


template< typename D
        , typename B
        >
struct must_have_base
{
  ~must_have_base()
  {
    void(*p)(D*, B*) = constraints;
  }
private:
  static void constraints(D* pd, B* pb)
  {
    pb = pd;
  }
};



It works by requiring that a pointer to the template parameter D can be assigned to a pointer to the template parameter B. It does this in a separate static function, constraints(), in order that it will never be called, so there's no generated code and no runtime cost. The destructor declares a pointer to this function, thereby requiring the compiler at least to evaluate whether the function, and the assignment within it, is valid.
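The mechanism can be seen in use with a minimal sketch. The constraint is repeated from Listing 1.1 (with casts to void added to quieten unused-variable warnings); the types Shape, Square, and Unrelated and the function count_sides are hypothetical names invented purely for illustration.

```cpp
#include <cassert>

// Constraint from Listing 1.1, repeated so that the sketch is self-contained.
template <typename D, typename B>
struct must_have_base
{
  ~must_have_base()
  {
    // Taking the address forces the compiler to check constraints(),
    // but the function itself is never called.
    void (*p)(D*, B*) = constraints;
    (void)p; // suppress unused-variable warnings
  }
private:
  static void constraints(D* pd, B* pb)
  {
    pb = pd; // valid only if D* converts implicitly to B*
    (void)pb;
  }
};

// Hypothetical types, invented for this illustration.
struct Shape
{
  explicit Shape(int s) : sides(s) {}
  int sides;
};
struct Square : public Shape
{
  Square() : Shape(4) {}
};
struct Unrelated
{
  int sides;
};

// Hypothetical client function that accepts only types derived from Shape.
template <typename T>
int count_sides(T const& t)
{
  must_have_base<T, Shape> constraint; // its destructor carries the check
  (void)constraint;
  return t.sides;
}

// Exercises the accepted case.
inline void demo_must_have_base()
{
  assert(count_sides(Square()) == 4);
  // count_sides(Unrelated()); // uncommenting this should fail to compile
}
```

Unrelated has a sides member of its own, so the commented-out call would be rejected by the constraint alone rather than by the body of count_sides.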

In fact, the constraint is somewhat ill named. If D and B are the same type, then the constraint is still satisfied, so it should probably be called must_have_base_or_be_same_type, or something like that. An alternative would be to further refine must_have_base to reject parameterization when D and B are the same type. Answers on a postcard, please.
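One possible refinement, offered only as a sketch, combines the pointer assignment with an invalid-array-size check. The names must_have_strict_base and is_same_type_sketch are hypothetical, and the helper assumes a compiler that supports partial specialization.

```cpp
// Hypothetical helper: value is 1 only when both parameters are the same type.
template <typename T1, typename T2>
struct is_same_type_sketch
{
  enum { value = 0 };
};
template <typename T>
struct is_same_type_sketch<T, T>
{
  enum { value = 1 };
};

// Hypothetical refinement of must_have_base that also rejects D == B.
template <typename D, typename B>
struct must_have_strict_base
{
  ~must_have_strict_base()
  {
    void (*p)(D*, B*) = constraints;
    (void)p;
  }
private:
  static void constraints(D* pd, B* pb)
  {
    pb = pd; // D* must convert to B*
    (void)pb;
    // The array size is -1 (ill-formed) when D and B are the same type.
    typedef char D_and_B_are_the_same_type[
        is_same_type_sketch<D, B>::value ? -1 : 1];
    (void)sizeof(D_and_B_are_the_same_type);
  }
};

// Hypothetical types for the demonstration.
struct Base {};
struct Derived : public Base {};

inline void demo_must_have_strict_base()
{
  must_have_strict_base<Derived, Base> ok; // accepted
  (void)ok;
  // must_have_strict_base<Base, Base> same; // should fail to compile
}
```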

Also, if D is not publicly derived from B, then the constraint will fail. In my opinion, the problem here is one of naming, not an inadequacy in the constraint, since the only time I've needed this constraint is for publicly inherited types.[3]

[3] This smells like a circular argument, but I promise it's not.

Because this constraint attempts, within its definition, to perform an action directly representing the semantic of the constraint, the error messages produced by a constraint failure are reasonably straightforward. In fact, all our compilers (see Appendix A) provide quite meaningful messages on failure, either mentioning that the types in the constraint are not related by inheritance or that the pointer types cannot be interconverted or similar.

1.2.2 must_be_subscriptable()

Another useful constraint is to require that a type be subscriptable (see section 14.2), and it's a no-brainer:



template< typename T>
struct must_be_subscriptable
{
  . . .
  static void constraints(T const &T_is_not_subscriptable)
  {
    sizeof(T_is_not_subscriptable[0]);
  }
  . . .
};



In an attempt to help out with readability, the variable is called T_is_not_subscriptable, which should hopefully give a clue to the hapless victim of the constraint failure. Consider the following example:



struct subs
{
public:
  int operator [](size_t index) const;
};

struct not_subs
{};

must_be_subscriptable<int[]>    a; // int* is subscriptable
must_be_subscriptable<int*>     b; // int* is subscriptable
must_be_subscriptable<subs>     c; // subs is subscriptable
must_be_subscriptable<not_subs> d; // not_subs isn't: compile error



Borland 5.6 gives the incredibly dissembling: "'operator+' not implemented in type '<type>' for arguments of type 'int' in function must_be_subscriptable<not_subs>::constraints(const not_subs &)". When you're fifteen levels deep in a template instantiation, you'd have precious little chance of surviving this without brain strain!

Digital Mars is more correct, but still not very helpful: "Error: array or pointer required before '['; Had: const not_subs".

Some of the others do include the variable name T_is_not_subscriptable. The best message is probably Visual C++, which offers: "binary '[' : 'const struct not_subs' does not define this operator or a conversion to a type acceptable to the predefined operator while compiling class-template member function 'void must_be_subscriptable<struct not_subs>::constraints (const struct not_subs &)".
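For readers who want to reproduce these messages, here is a compilable sketch of the constraint. The elided parts of the listing are not shown in this section, so the destructor below is an assumption that simply follows the pattern of Listing 1.1.

```cpp
#include <cstddef>

template <typename T>
struct must_be_subscriptable
{
  // Assumed trigger, modelled on Listing 1.1.
  ~must_be_subscriptable()
  {
    void (*p)(T const&) = constraints;
    (void)p;
  }
private:
  static void constraints(T const& T_is_not_subscriptable)
  {
    // sizeof does not evaluate its operand, so nothing runs here.
    (void)sizeof(T_is_not_subscriptable[0]);
  }
};

struct subs
{
  int operator [](std::size_t index) const
  {
    return static_cast<int>(index);
  }
};

struct not_subs
{};

inline void demo_must_be_subscriptable()
{
  must_be_subscriptable<int*> b; // pointers are subscriptable
  must_be_subscriptable<subs> c; // subs overloads operator []
  (void)b;
  (void)c;
  // must_be_subscriptable<not_subs> d; // should fail to compile
}
```

Uncommenting the last declaration and compiling with your own compiler is the quickest way to see how helpful (or otherwise) its message is.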

1.2.3 must_be_subscriptable_as_decayable_pointer()

In Chapter 14 we will take an in-depth look at the relationship between arrays and pointers, and learn that the arcane nature of pointer offsetting results in the legality of offset[pointer] syntax, which is entirely equivalent to the normal pointer[offset] syntax. (This may make Borland's perplexing error message for must_be_subscriptable seem that little bit less nonsensical, but that's not going to be of much help to us in tracking down and understanding the constraint violation.) Since this reversal is not valid for class types that overload the subscript operator, a refinement of the must_be_subscriptable type can be made, which can then be used to constrain templates to pointer types only.

Listing 1.2.


template <typename T>
struct must_be_subscriptable_as_decayable_pointer
{
  . . .
  static void constraints(T const &T_is_not_decay_subscriptable)
  {
    sizeof(0[T_is_not_decay_subscriptable]);
  }
  . . .
};



It is axiomatic that anything subscriptable by offset[pointer] will also be subscriptable by pointer[offset], so there's no need to incorporate must_be_subscriptable within must_be_subscriptable_as_decayable_pointer. Where the constraints have different ramifications, though, it can be appropriate to use inheritance to bring two constraints together.

Now we can discriminate between pointers and other subscriptable types:



must_be_subscriptable<subs>                      a; // ok
must_be_subscriptable_as_decayable_pointer<subs> b; // compile error
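The reversed subscript form is easy to try out. The sketch below shows that 2[p] and p[2] name the same element, and applies the constraint to a pointer type. As before, the destructor is an assumed trigger modelled on Listing 1.1, and third_element is a hypothetical helper.

```cpp
#include <cassert>

template <typename T>
struct must_be_subscriptable_as_decayable_pointer
{
  // Assumed trigger, modelled on Listing 1.1.
  ~must_be_subscriptable_as_decayable_pointer()
  {
    void (*p)(T const&) = constraints;
    (void)p;
  }
private:
  static void constraints(T const& T_is_not_decay_subscriptable)
  {
    // offset[pointer] is valid only for built-in subscripting.
    (void)sizeof(0[T_is_not_decay_subscriptable]);
  }
};

// Hypothetical helper: 2[p] means *(2 + p), the same element as p[2].
inline int third_element(int const* p)
{
  return 2[p];
}

inline void demo_decayable_pointer()
{
  int values[3] = { 10, 20, 30 };
  assert(third_element(values) == values[2]);

  must_be_subscriptable_as_decayable_pointer<int*> ok; // accepted
  (void)ok;
  // A class type that overloads operator [] would be rejected here.
}
```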



1.2.4 must_be_pod()

We'll see the use of must_be_pod() in a few places throughout the book (see sections 19.5, 19.7, 21.2.1, 32.2.3). This was my first constraint, and was written long before I ever knew what constraints were, or even what POD meant (see Prologue). It's very simple.

The standard (C++-98: 9.5;1) states that "an object of a class with a non-trivial constructor, a non-trivial copy constructor, a non-trivial destructor, or a non-trivial assignment operator cannot be a member of a union." This pretty much serves our requirements, and we could imagine that this would be similar to the constraints we've already seen, with a constraints() method containing a union:



template <typename T>
struct must_be_pod
{
  . . .
  static void constraints()
  {
    union
    {
      T   T_is_not_POD_type;
    };
  }
  . . .
};



Unfortunately, this is an area in which compilers tend to have slightly strange behaviour, so the real definition is not so simple, requiring a lot of preprocessor effort (see section 1.2.6). But the effect is the same.

In section 19.7 we see this constraint used in conjunction with a more specialized one: must_be_pod_or_void(), in order to be able to check that the base types of pointers are not nontrivial class types. This relies on specialization [Vand2003] of the must_be_pod_or_void template, whose general definition is identical to must_be_pod:



template <typename T>
struct must_be_pod_or_void
{
  . . . // Same as must_be_pod
};

template <>
struct must_be_pod_or_void<void>
{
  // Contains nothing, so doesn't trouble the compiler
};
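The simple union form and the void specialization can be put together in a compilable sketch. This is not the real definition, which needs the preprocessor effort mentioned above: the destructor trigger is an assumption modelled on Listing 1.1, the general must_be_pod_or_void inherits from must_be_pod purely to keep the sketch short, and plain is a hypothetical type. Only accepted cases are exercised, because compilers targeting C++11 or later relax the union-member rule quoted from C++-98, so rejected cases may behave differently there.

```cpp
template <typename T>
struct must_be_pod
{
  // Assumed trigger, modelled on Listing 1.1.
  ~must_be_pod()
  {
    void (*p)() = constraints;
    (void)p;
  }
private:
  static void constraints()
  {
    // Under the C++-98 rule, T may be a union member only if its
    // constructors, destructor, and assignment operator are trivial.
    union
    {
      T T_is_not_POD_type;
    };
    (void)T_is_not_POD_type;
  }
};

// General case: the same check as must_be_pod, inherited for brevity.
template <typename T>
struct must_be_pod_or_void
  : must_be_pod<T>
{};

// void cannot be a union member, so this specialization checks nothing.
template <>
struct must_be_pod_or_void<void>
{};

// Hypothetical POD type for the demonstration.
struct plain
{
  int    i;
  double d;
};

inline void demo_must_be_pod()
{
  must_be_pod<int>           a; // accepted
  must_be_pod<plain>         b; // accepted
  must_be_pod_or_void<void>  c; // accepted via the specialization
  must_be_pod_or_void<plain> d; // accepted via the general template
  (void)a;
  (void)b;
  (void)c;
  (void)d;
}
```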



Once again, the messages produced by the compilers when the must_be_pod / must_be_pod_or_void constraints are fired vary considerably:



class NonPOD
{
public:
  virtual ~NonPOD();
};

must_be_pod<int>      a; // int is POD (see Prologue)
must_be_pod<not_subs> b; // not_subs is POD (see Prologue)
must_be_pod<NonPOD>   c; // NonPOD isn't: compile error



In this case, Digital Mars's customary terseness lets us down, since all we get is "Error: union members cannot have ctors or dtors" reported for the offending line in the constraint. Used in a significant project, it would be extremely difficult to track down the location of the offending instantiations. Arguably the most information for the smallest error message in this case was Watcom, with: "Error! E183: col(10) unions cannot have members with constructors; Note! N633: col(10) template class instantiation for 'must_be_pod<NonPOD>' was in: ..\constraints_test.cpp(106) (col 48)".

1.2.5 must_be_same_size()

The last constraint, must_be_same_size(), is another one used later in the book (see sections 21.2.1 and 25.5.5). The constraint class just uses the static assertion invalid array size technique that we'll see shortly (section 1.4.7) to ensure that the sizes of the types are the same.

Listing 1.3.


template< typename T1
        , typename T2
        >
struct must_be_same_size
{
  . . .
private:
  static void constraints()
  {
    const int T1_not_same_size_as_T2
                           = sizeof(T1) == sizeof(T2);
    int       i[T1_not_same_size_as_T2];
  }
};



If the two sizes are not the same, then T1_not_same_size_as_T2 evaluates to the constant value 0, which is an illegal size for the array i.

We saw with must_be_pod_or_void that we need to be able to apply the constraint in circumstances where one or both types might be void. Since sizeof(void) is not a valid expression, we must provide some extra compile-time functionality.

If they are both void it's easy, since we can specialize the template thus:



template <>
struct must_be_same_size<void, void>
{};



To make it work where only one of the types is void, however, is less straightforward. One option would be to use partial specialization [Vand2003], but not all compilers currently in wide use support it. Furthermore, we'd then need to provide the primary template, one full specialization, and two partial specializations (one where the first type is specialized to void, and the other where the second type is), and we'd also have to dream up some way to provide even a half-meaningful compile-time error message. Rather than resort to that, I decided to make void size_of-able. This is extremely easy to do, and does not require partial specialization:

Listing 1.4.


template <typename T>
struct size_of
{
  enum { value = sizeof(T) };
};

template <>
struct size_of<void>
{
  enum { value = 0 };
};



All we now need do is use size_of instead of sizeof in must_be_same_size:



template< . . . >
struct must_be_same_size
{
  . . .
  static void constraints()
  {
    const int T1_must_be_same_size_as_T2
                     = size_of<T1>::value == size_of<T2>::value;
    int       i[T1_must_be_same_size_as_T2];
  }
};



Now we can verify the size of any types:



must_be_same_size<int, int>   a; // ok
must_be_same_size<int, long>  b; // depends on arch/compiler
must_be_same_size<void, void> c; // ok
must_be_same_size<void, int>  d; // compiler error: void "is" 0
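Putting size_of and the constraint together gives the following compilable sketch. Two details are assumptions rather than the definition used in the book: the destructor trigger follows Listing 1.1, and the array size on failure is -1 rather than 0, because some compilers accept zero-length arrays as an extension and would let a size of 0 through.

```cpp
template <typename T>
struct size_of
{
  enum { value = sizeof(T) };
};

template <>
struct size_of<void>
{
  enum { value = 0 };
};

template <typename T1, typename T2>
struct must_be_same_size
{
  // Assumed trigger, modelled on Listing 1.1.
  ~must_be_same_size()
  {
    void (*p)() = constraints;
    (void)p;
  }
private:
  static void constraints()
  {
    // Casting to int avoids comparing two unrelated enumeration types.
    const int T1_not_same_size_as_T2
        = int(size_of<T1>::value) == int(size_of<T2>::value);
    // A size of -1 is ill-formed, so a mismatch is always diagnosed.
    int i[T1_not_same_size_as_T2 ? 1 : -1];
    (void)i;
  }
};

inline void demo_must_be_same_size()
{
  must_be_same_size<int, int>      a; // accepted
  must_be_same_size<int, unsigned> b; // accepted: same size by definition
  must_be_same_size<void, void>    c; // accepted: both "are" 0
  (void)a;
  (void)b;
  (void)c;
  // must_be_same_size<void, int> d; // should fail to compile
}
```

With size_of in place, the void/void case needs no separate specialization of must_be_same_size in this sketch, since both sizes compare equal at 0.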



As with previous constraints, there is considerable equivocation between the compilers regarding the quantity of information provided to the programmer. Borland and Digital Mars strike out again, with little or no contextual information. In this case I think Intel provides the best output, stating that "zero-length bit field must be unnamed", showing the offending line and providing the two immediate call contexts including the actual types of T1 and T2, all in four lines of compiler output.

1.2.6 Using Constraints

I prefer to use my constraints via macros, whose names take the form constraint_<constraint_name>,[4] for example, constraint_must_have_base(). This is useful in several ways.

[4] For the reasons described in section 12.4.4, it's always a good idea to name macros in uppercase. The reason I didn't do so with the constraint macros is that I wanted to keep the case the same as the constraint types. In hindsight it seems less compelling, and you'd probably want to use uppercase for your own.

First, they're easy to search for unambiguously. To be sure, I reserve must_ for constraints, so it could be argued that this requirement is already met. But it's also a bit more self-documenting. Seeing constraint_must_be_pod() in some code is pretty unambiguous to the reader.

The second reason is that using the macro form provides consistency. Although I've not written any nontemplate constraints, there's nothing preventing anyone from doing so. Furthermore, I find the angle brackets don't add anything but eyestrain to the picture.

The third reason is that if the constraints are defined within a namespace, using them will require tedious qualification. This can be easily hidden within the macro, saving any users of the constraints from the temptation to use naughty using directives (see section 34.2.2).

The last reason is eminently practical. Different compilers do slightly different things with the constraints, which can require a lot of jiggery-pokery. For example, depending on the compiler, the constraint_must_be_pod() is defined in one of three forms:



do { must_be_pod<T>::func_ptr_type const pfn =
                     must_be_pod<T>::constraint(); } while(0)



or



do { int i = sizeof(must_be_pod<T>::constraint()); } while(0)



or



STATIC_ASSERT(sizeof(must_be_pod<T>::constraint()) != 0)



Rather than clutter constraint client code with a lot of evil-looking nonsense, it's easier and neater to just use a macro.
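As an illustration of the namespace point, here is a sketch of how such a macro might hide the qualification. The namespace constraints_sketch, the macro body, and the client class holder are all hypothetical; the actual macro definitions vary by compiler, as described above.

```cpp
namespace constraints_sketch // hypothetical namespace
{
  // Constraint from Listing 1.1, placed inside a namespace.
  template <typename D, typename B>
  struct must_have_base
  {
    ~must_have_base()
    {
      void (*p)(D*, B*) = constraints;
      (void)p;
    }
  private:
    static void constraints(D* pd, B* pb)
    {
      pb = pd;
      (void)pb;
    }
  };
}

// Hypothetical macro form: the qualification and the angle brackets are
// written once here, rather than at every point of use.
#define constraint_must_have_base(D, B)                               \
  do {                                                                \
    ::constraints_sketch::must_have_base<D, B> constraint_instance_;  \
    (void)constraint_instance_;                                       \
  } while (0)

struct Base {};
struct Derived : public Base {};

// Hypothetical client class that constrains its parameter.
template <typename T>
struct holder
{
  holder()
  {
    constraint_must_have_base(T, Base);
  }
};

inline void demo_constraint_macro()
{
  holder<Derived> h; // accepted: Derived has base Base
  (void)h;
}
```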

1.2.7 Constraints and TMP

One of my reviewers commented that some of these constraints could have been done via template meta-programming (TMP) techniques,[5] and he's quite correct. For example, the must_be_pointer constraint could be implemented as a static assertion (see section 1.4.7) coupled with the is_pointer_type trait template used in section 33.3.2, as follows:

[5] I've deliberately left TMP techniques out of this book as much as possible because it's a huge subject in itself, and is not directly related to any of the imperfections I've been discussing. You can find plenty of examples in the Boost, Rangelib, and STLSoft libraries included on the CD, as well as in several books [Vand2003, Alex2001].



#define constraint_must_be_pointer(T) \
  STATIC_ASSERT(0 != is_pointer_type<T>::value)
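To make that macro compilable in isolation, the sketch below supplies stand-ins for the two pieces it relies on. Both are assumptions: this is_pointer_type is a minimal trait written with partial specialization, and this STATIC_ASSERT uses an invalid array size. The definitions in sections 33.3.2 and 1.4.7 may well differ. The function first_element is a hypothetical client.

```cpp
#include <cassert>

// Hypothetical stand-in for the trait used in section 33.3.2.
template <typename T>
struct is_pointer_type
{
  enum { value = 0 };
};
template <typename T>
struct is_pointer_type<T*>
{
  enum { value = 1 };
};

// Hypothetical stand-in for STATIC_ASSERT: a false expression yields
// an array of size -1, which is ill-formed.
#define STATIC_ASSERT(expr)                                \
  do {                                                     \
    typedef int static_assertion_failed_[(expr) ? 1 : -1]; \
    (void)sizeof(static_assertion_failed_);                \
  } while (0)

#define constraint_must_be_pointer(T) \
  STATIC_ASSERT(0 != is_pointer_type<T>::value)

// Hypothetical client function constrained to pointer arguments.
template <typename P>
int first_element(P p)
{
  constraint_must_be_pointer(P);
  return p[0];
}

inline void demo_must_be_pointer()
{
  int values[2] = { 7, 9 };
  assert(first_element(values) == 7); // the array decays to int*
}
```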



There are several reasons why I do not take this approach. First, the codification of a constraint is always a straightforward matter, because a constraint is merely a simple emulation of the normal behavior to which the type is to be constrained. The same cannot be said of TMP traits, some of which can be exceedingly complex. So constraints are very easy to read.

Second, in many cases, though by no means all, it is easier to persuade compilers to produce moderately digestible messages with constraints than with traits and static assertions.

Finally, there are some cases in which a TMP trait is impossible, or is at least only definable on a small subset of compilers. Perversely, it can seem that the more simple the constraint, the more complex would be an equivalent implementation based on TMP traits—must_be_pod is an excellent example of this.

Herb Sutter demonstrates a combination of constraints and traits in [Sutt2002], and there's no reason why you should not do that in your own work for many concepts; I just prefer to keep them simple and separate.

1.2.8 Constraints: Coda

The constraints presented in this chapter in no way encompass the full range of constraints available. However, they should give you a good idea of the kinds of things that are achievable. The downside to the use of constraints is the same as it is for static assertions (see section 1.4.8), which is that the error messages produced are not particularly easy to understand. Depending on the specific mechanism of the constraint, you can have the reasonable "type cannot be converted" to the downright perplexing "Destructor for 'T' is not accessible in function <non-existent-function>".

Where possible you can ameliorate the distress of unhappy erroneous parameterizers of your code by choosing appropriately named variable and constant names. We've seen examples of this in this section—T_is_not_subscriptable, T_is_not_POD_type and T1_not_same_size_as_T2. Just make sure that the names you choose reflect the failure condition. Pity the poor soul who falls afoul of your constraint and is informed that T_is_valid_type_for_constraint!

There's one very important aspect about this that can't be stressed too much: we have the latitude to upgrade the constraints as we learn more about compile time and/or TMP. You'll probably notice that I'm not exactly a guru of TMP from some of the components described throughout the book, but the point is that, by designing the way in which constraints are represented and used in client classes, we can upgrade the constraints seamlessly when we learn new tricks. I'm not ashamed to admit that I've done this many times, although I probably would be ashamed to show you some of my earlier attempts at writing constraints. (None of these appear in Appendix B because they're not daft enough—not quite!)

