
Imperfect C++: Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 2.  Object Lifetime


2.3. MILs and Boon

There are seven different types of things that you might want to initialize when writing a constructor. They are:

  1. Immediate parent classes.

  2. Virtual base classes.[2]

    [2] But there's no way you'd ever have a virtual base that had any data or constructors, right? [Meye1998, Dewh2003, Stro1997].

  3. Constant member variables.

  4. Reference member variables.

  5. Non-const, nonreference member variables of user-defined type that have nondefault constructors.

  6. Non-const, nonreference member scalar variables; we would think of these as "normal" member variables.

  7. Array member variables.

Of these seven, only the last, array member variables, cannot be initialized in a member initializer list (or MIL), and the first five must be. Normal, non-const, nonreference, scalar member variables can be "initialized" either in the initializer list or in the constructor body. In the latter case they in fact undergo assignment rather than initialization. Although this may look like initialization to client code of the class, and incurs no additional instructions for scalar types, the assignments are performed in the order in which they are written, which need not be the order in which the members are declared. In the rare cases where you rely on member declaration/initialization ordering, which is a bad thing to do (see section 2.3.2), you could be in for a nasty surprise. Irrespective of correctness issues in such border cases, there are good reasons to prefer the initializer list [Stro1997]: for a member of class type, assigning in the body means expensively assigning to something that has already paid a nontrivial construction cost, wasting speed for no gain whatsoever.

If you share my preference for const variables, you'll no doubt be making much use of initializer lists as it is, but I would advise everyone to do so at all times. Not only does it help you to avoid making inappropriate use of member variables of nontrivial types (i.e., costly default-construction-plus-assignment), but also it improves consistency: a much undervalued aspect of software development, albeit a distinguishing feature of well-developed software [Kern1999]. It also helps in the presence of exceptions, as we'll see in section 32.2.
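The cost difference between the two styles can be made visible with a small instrumented type. The following is only a sketch: Tracked, UsesMil, UsesBody, and MustUseMil are hypothetical names invented for illustration, and std::string stands in for any member with a nontrivial constructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical instrumented type that counts how it is constructed and assigned.
struct Tracked
{
  static int defaults;  // default constructions
  static int converts;  // constructions from a string
  static int assigns;   // assignments from a string

  Tracked() { ++defaults; }
  explicit Tracked(std::string const &s) : value(s) { ++converts; }
  Tracked &operator =(std::string const &s) { value = s; ++assigns; return *this; }

  std::string value;
};
int Tracked::defaults = 0;
int Tracked::converts = 0;
int Tracked::assigns  = 0;

// Initializes the member in the MIL: a single constructor call.
struct UsesMil
{
  explicit UsesMil(std::string const &s) : m_t(s) {}
  Tracked m_t;
};

// "Initializes" in the body: default construction followed by assignment.
struct UsesBody
{
  explicit UsesBody(std::string const &s) { m_t = s; }
  Tracked m_t;
};

// Const and reference members leave no choice: they must appear in the MIL.
struct MustUseMil
{
  MustUseMil(int limit, int &counter) : m_limit(limit), m_counter(counter) {}
  int const m_limit;
  int       &m_counter;
};
```

Constructing a UsesMil bumps only the conversion counter, whereas constructing a UsesBody bumps both the default-construction and assignment counters. Removing either initializer from MustUseMil's list makes the class fail to compile.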

In terms of hairshirt programming, MILs also facilitate our preference for const members, as described in the previous subsection. Furthermore, you can avoid code such as the example shown in Listing 2.2, based on a real code-base that I was given to "improve."

Listing 2.2.


class Abc
  : public Base
{
// Members
protected:
  CString  m_str1;
  int      m_int1;
  int      m_int2;
  CString  m_str2;
  CString  m_str3;
  int      m_int3;
  CString  m_str4;
  int      m_int4;
  CString  m_str5;
  int      m_int5;
  int      m_int6;
  . . . // and so it went on
};

Abc::Abc(int i1, int i2, int i3, int i4, int i5, int i6
        , LPCTSTR pcsz1, LPCTSTR pcsz2
        , LPCTSTR pcsz3, LPCTSTR pcsz4)
  : Base(i1) // which argument Base actually received is not recorded; i1 is assumed
  , m_str1(pcsz1)
  , m_str2(pcsz2)
{
  m_str3 = pcsz3;
  m_int1 = i1;
  m_int2 = i2;
  m_int3 = i3;
  m_str2 = pcsz2; // redundant: already initialized in the MIL

  . . . // many lines later

  m_int3 = i3;    // redundant: assigned a second time
  m_str2 = pcsz4; // Eek!
  m_int2 = i2;    // redundant: assigned a second time
  m_int6 = i6;

  . . . // and on it went
}



There were over 20 redundant and potentially harmful assignments within the constructor body. In the remainder of the class's implementation, some of the member variables were never altered after construction. Changing them to const immediately found some of these issues. Having highlighted the problem, I then moved everything into the initializer list (using an automated tool), and the compiler nicely pointed out the duplicates. That took all of 10 minutes, and the class saved itself from a significant amount of accreted waste, not to mention a couple of errors.
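To give a feel for the end state, here is a cut-down sketch of such a class after the clean-up. It is not the real code: this Base is a hypothetical stand-in, std::string replaces CString so that the example is self-contained, only four members are shown, and they are public purely for brevity.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Base class of Listing 2.2.
class Base
{
public:
  explicit Base(int i) : m_i(i) {}
  int const m_i;
};

// A cut-down Abc: every member is const and initialized exactly once.
class Abc
  : public Base
{
public:
  Abc(int i1, int i2, char const *s1, char const *s2)
    : Base(i1)
    , m_str1(s1)     // initializers listed in declaration order
    , m_int1(i1)
    , m_int2(i2)
    , m_str2(s2)
    // , m_str2(s1)  // a second initializer for m_str2 would not compile
  {
    // m_int2 = i2;  // nor would any assignment to a const member
  }

  std::string const m_str1;
  int const         m_int1;
  int const         m_int2;
  std::string const m_str2;
};
```

With every member named once in the list and declared const, both kinds of mistake found in the original (repeated assignment and assignment from the wrong argument after initialization) become compile errors rather than latent bugs.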

2.3.1 Getting a Bigger Playpen

One argument against using member initializer lists is that it is a pretty small playpen within which to work. A regular complaint is that it is hard to carry out efficient and effective argument validation/manipulation within the restrictions of member initializer lists. This can be a justification for eschewing both initializer lists and const/reference members. However, I think this is largely incorrect, and with a little imagination we can provide robust, suitably constrained (i.e., we can wear our hairshirt), and efficient initialization. Consider the following class:



class String
{
// Construction
public:
  String(char const *s);

// Members
private:
  char *m_s;
};



In his book C++ Gotchas [Dewh2003], Steve Dewhurst distinguishes the following two possibilities for the implementation of this constructor (within a section where he otherwise broadly espouses the preference of member initialization over assignment):



String::String(char const *s)
  : m_s(strcpy(new char[strlen(s ? s : "") + 1], s ? s : ""))
{}

String::String(char const *s)
{
  if(s == NULL)
  {
    s = "";
  }

  m_s = strcpy(new char[strlen(s) + 1], s);
}



Steve states that the first form is taking things too far, with which I think most would agree, and favors the second. Frankly, I wouldn't write either. Why not put on our hairshirt (as the first makes an attempt to do), but give ourselves a break while we're at it? The solution (Listing 2.3) is very simple.

Listing 2.3.


class String
{
  . . .
// Implementation
private:
  static char *create_string_(char const *s);

// Members
private:
  char const *const m_s;
};

/* static */ char *String::create_string_(char const *s)
{
  if(s == NULL)
  {
    s = "";
  }
  return strcpy(new char[strlen(s) + 1], s);
}

String::String(char const *s)
  : m_s(create_string_(s))
{}



Rather than having an unintelligible mess of gobbledegook, or creeping into bad practice, we can achieve clarity of expression and correct initialization by placing the logic into a private static helper function. Seems simple now, doesn't it? Note also that the exception behavior of String's constructor is not changed. This is an important aspect of using this technique.

The String instance may be constructed from either a pointer to a string, or a literal empty string. (As we will see in section 15.4.3, literal constants may or may not be folded—identical literals are merged into one by the linker—so this implementation is partial, and a fuller one would have to address this potential problem.)

Also note the change to the definition of m_s. Since Steve's example didn't show any mutating operations of his String class, we can widen the scope of our "privations" to make m_s a constant pointer to a constant string. If we subsequently change the class's definition to include mutating operations on either the contained buffer or the buffer pointer, the compiler will remind us that we're violating our initial design decisions. That's perfectly okay; getting such an error doesn't mean we're doing something wrong, simply that we may be doing something that challenges the original design decisions. The point is that we will be forced to think about it, which can only be a good thing.

There are two other advantages to the new form. A minor one is that it looks cleaner; we can plainly see the intent of the constructor, which is to "create_string_()."

A more significant advantage is that it centralizes the creation of the string contents, which could conceivably be used elsewhere in an expanded definition of String. (Indeed, in real string classes that I've written there are often several constructors that use a single static creation function. Naturally this increases maintainability and size efficiency, without affecting speed efficiency.) It is all too common to see the same logic appear in numerous constructors of the same class. This is sometimes centralized into an Init() method that gets called from within each constructor body, but that then misses the efficiency of initializer lists and reduces the scope for using const/reference members. Using this static-based helper function technique facilitates the centralization of constructor-common operations, while sacrificing neither efficiency nor safety. This technique in effect can be used to simulate the Java facility for calling one constructor from within another.
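As a rough sketch of that centralization, the class below routes three constructors through one static creation function. The two-argument form of create_string_(), the extra constructors, the destructor, and c_str() are assumptions added for illustration and are not part of Listing 2.3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String
{
public:
  // From a C-style string; a null pointer is treated as "".
  explicit String(char const *s)
    : m_s(create_string_(s, s ? std::strlen(s) : 0))
  {}
  // From the first n characters of s; the caller guarantees s has at least n.
  String(char const *s, std::size_t n)
    : m_s(create_string_(s, n))
  {}
  // Copy construction allocates a fresh buffer.
  String(String const &rhs)
    : m_s(create_string_(rhs.m_s, std::strlen(rhs.m_s)))
  {}
  ~String() { delete [] m_s; }

  char const *c_str() const { return m_s; }

private:
  // The single point of creation shared by every constructor.
  static char *create_string_(char const *s, std::size_t n)
  {
    char *p = new char[n + 1];
    if(n != 0)
    {
      std::memcpy(p, s, n);
    }
    p[n] = '\0';
    return p;
  }

  String &operator =(String const &); // not implemented: m_s is const

  char const *const m_s;
};
```

Each constructor remains a one-line initializer list, m_s keeps its const qualification, and any change to the allocation policy is made in exactly one place.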

Note that the method does not have to be static in order for it to achieve the desired effect. However, if it is non-static, it opens the possibility of using member state in the method, such state being undefined because the instance is partway through construction. It is better, therefore, to use a little more hairshirt and always favor the static helper.

We will see some more sophisticated examples of its application in Chapter 11 when we look into adaptive code techniques.

2.3.2 Member-Ordering Dependencies

One of the caveats to using initializer lists is that member variables are initialized in the order of their declaration, regardless of their order in the list. Naturally, the advice ([Stro1997, Dewh2003]) is to list them in the same order as their declaration within the class. Indeed, you should expend diligent efforts in your maintenance work to ensure that changes in the declaration are reflected in the initializer list, and you should check those of other authors when asked to make changes. The possibilities for trouble are boundless [Dewh2003, Meye1998], and all guaranteed to bring unhappiness.



struct Fatal
{
  Fatal(int i)
    : y(i)
    , x(y * 2)
  {}
  int x;
  int y;
};



Despite the innocuous-looking initializer list, instances of Fatal will have arbitrary garbage in their x members, because members are initialized in declaration order: x is initialized first, from a y that has not yet been given a value. You should avoid such dependencies as a rule. At the time of writing, only GCC detects this and issues a warning (when the -Wall option is used).
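Two straightforward repairs are sketched below, using the hypothetical names Independent and Reordered. The first removes the dependency between members altogether; the second keeps it but makes the declaration order match the order the initializer list implies.

```cpp
#include <cassert>

// Option 1: derive every member from the constructor argument, so that
// no member's initializer reads another member.
struct Independent
{
  explicit Independent(int i)
    : x(i * 2)
    , y(i)
  {}
  int x;
  int y;
};

// Option 2: keep the dependency, but declare y first so that it really
// is initialized before x reads it.
struct Reordered
{
  explicit Reordered(int i)
    : y(i)
    , x(y * 2)
  {}
  int y;  // declared, and therefore initialized, first
  int x;
};
```

The first option is the more robust of the two, since it survives any later reshuffling of the member declarations.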

Notwithstanding that sound advice, this is a book about survival in the real world, wherein it is occasionally appropriate to do such dangerous things. What can the imperfect practitioner do to protect his/her code, when it must rely on the order of member variables, from dangerous changes in maintenance? The answer is to use compile-time assertions (section 1.4). An example of this protection of member ordering was to be found in the original implementation of the auto_buffer template, which we'll see in detail in section 32.2. The constructor contained the following protective assertion:



auto_buffer::auto_buffer(size_type cItems)
  : m_buffer((space < cItems)
               ? allocator_type::allocate(cItems, 0)
               : m_internal)
  , m_cItems((m_buffer != 0) ? cItems : 0)
{
  STATIC_ASSERT( offsetof(class_type, m_buffer)
                < offsetof(class_type, m_cItems));
  . . .



This code would not compile if the order of the m_buffer and m_cItems members were changed in the class definition. This means that the dubious practice of relying on member initialization order was rendered safe, and the class implementation robust and portable.
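The same guard can be sketched in a self-contained form. SmallBuffer below is a hypothetical, much-simplified class and not the real auto_buffer: it has no allocator, and it uses the C++11 static_assert keyword in place of the STATIC_ASSERT macro of section 1.4. It is a standard-layout type, so offsetof is well defined for it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity buffer whose second member depends on its first.
class SmallBuffer
{
public:
  explicit SmallBuffer(std::size_t cItems)
    : m_buffer((cItems <= capacity) ? m_internal : 0)  // must be initialized first
    , m_cItems((m_buffer != 0) ? cItems : 0)           // reads m_buffer
  {
    // Fails to compile if m_cItems is ever moved ahead of m_buffer.
    static_assert(offsetof(SmallBuffer, m_buffer) < offsetof(SmallBuffer, m_cItems),
                  "m_buffer must be declared before m_cItems");
  }

  int const   *data() const { return m_buffer; }
  std::size_t  size() const { return m_cItems; }

private:
  enum { capacity = 16 };

  int *const        m_buffer;
  std::size_t const m_cItems;
  int               m_internal[capacity];
};
```

Swapping the declarations of m_buffer and m_cItems reverses the relationship between the two offsets, so the assertion fires at compile time instead of m_cItems silently reading an uninitialized pointer.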

Alas, this particular class is not the best example of when it is appropriate to thumb our nose at the law, because it achieves const member variables by using the offsetof macro, which is itself, in this circumstance, nonstandard, as we discuss in section 2.3.3.

The latest version of this class is resizable, so the advantage of the constness of m_cItems is moot. It is therefore probably better to rewrite the constructor as:



auto_buffer::auto_buffer(size_type cItems)
  : m_buffer((space < cItems)
               ? allocator_type::allocate(cItems, 0)
               : m_internal)
{
  m_cItems = (m_buffer != 0) ? cItems : 0;
  . . .



Now there's no need to police a specific member ordering, and therefore no use of offsetof() in a less-than-legal guise.

2.3.3 offsetof()

The offsetof macro is used to deduce a compile-time constant representing the number of bytes offset of a structure member from the start of the structure. The canonical implementation is as follows:



#define offsetof(S, m)   (size_t)&(((S*)0)->m)



It's an extremely useful thing, and without it we'd have all kinds of difficulty doing many clever and useful things. For example, the technique for providing properties in C++ (see Chapter 35) would not be as efficient as it is.

Alas, its use is only legal when applied to POD types: the standard states that the type to which it is applied "shall be a POD structure or a POD union" (C++-98: 18.1). This means that using it with non-POD class types, as auto_buffer does, is not valid, and has potentially undefined behavior. Nonetheless, it is used widely, including in several popular libraries, and its use with types such as auto_buffer is perfectly reasonable. The main reason that the standard says it may only be used with POD types is that it would be impossible to have it yield a correct compile-time value when used in multiple virtual inheritance. The current rules are probably overstrict, but even then it is up to an implementation as to how it lays out the memory of types.
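For a type that does meet the requirement, a typical use looks like the sketch below. Record and record_from_id() are hypothetical names. The static_assert relies on C++11, which relaxed the POD requirement to standard-layout, and the pointer arithmetic is the widely used "container of" idiom rather than something the standard explicitly blesses.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical POD record.
struct Record
{
  char   tag;
  int    id;
  double weight;
};

// Guard the precondition: offsetof is only guaranteed for such types.
static_assert(std::is_standard_layout<Record>::value,
              "offsetof requires a standard-layout type");

// Recovers a pointer to the enclosing Record from a pointer to its id member.
inline Record *record_from_id(int *pid)
{
  return reinterpret_cast<Record *>(
      reinterpret_cast<char *>(pid) - offsetof(Record, id));
}
```

Given a Record r, record_from_id(&r.id) yields &r. Stepping back from a member to its enclosing object in this way is the kind of operation that would otherwise require storing an extra back pointer in each member.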

Since I'm a pragmatist, I'll continue to use it where I think it's necessary and appropriate, although I'll make sure to take defensive measures: runtime and static assertions (see Chapter 1) and testing. If you choose to do the same, just be mindful of the caveats, so that when a language lawyer trumpets your non-conformant code to your colleagues, you can disarm him with your acknowledgment of its non-conformance, and then bamboozle him with the rationale for your using it in the given instance and the list of tested environments on which you've proved your work correct.

2.3.4 MIL: Coda

Apart from cases where the degree of hoop jumping borders on the ridiculous, the advice is to prefer initializer lists wherever appropriate, which is, arrays aside, just about everywhere. I'm always astonished when engineers justify their inconsistent use of assignment over initialization by saying that they want to be consistent with their development-tool's (poorly written) wizard. It seems a great irony, and not a little sad, that a generation of software developers have grown up with a bad habit that has arisen out of a drawback in tools designed to simplify and enhance their practice.

