Gotcha #42: Temporary Initialization of Formal Arguments

Consider a String class with equality operators:



class String {
 public:
   String( const char * = "" );
   ~String();
   friend bool operator ==( const String &, const String & );
   friend bool operator !=( const String &, const String & );
   // . . .
 private:
   char *s_;
};

inline bool
operator ==( const String &a, const String &b )
   { return strcmp( a.s_, b.s_ ) == 0; }

inline bool
operator !=( const String &a, const String &b )
   { return !(a == b); }

Notice that this particular design employs a nonexplicit single-argument constructor and non-member equality operators. We are, therefore, inviting our users to take advantage of implicit conversions to simplify their code:



String s( "Hello, World!" );
String t( "Yo!" );
if( s == t ) {
   //  . . .
}
else if( s == "Howdy!" ) { // implicit conversion
   //  . . .
}

The first condition, s == t, is efficient. The two reference formal arguments of operator == are initialized with s and t, and strcmp is used to perform the comparison. If the compiler chooses to inline the call to operator == (it probably will, unless a heavy-duty debugging flag is turned on), the runtime effect will be a simple call to strcmp.

The second condition, s == "Howdy!", is less efficient, though correct. To initialize the second argument of the call to operator ==, the compiler must create a temporary String object and initialize it with the character string literal "Howdy!". This temporary is then used to initialize the argument. After the function returns, the temporary must be destroyed. The effect of the call is something like this:



String temp( "Howdy!" );
bool result = operator ==( s, temp );
temp.~String();
if( result ) {
   // . . .
}

In this case, the convenience of the implicit conversion may well be worth the extra expense, since its presence renders both the code that implements the String class and the user code short and clear.
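The cost described above can be made visible with a small instrumented class. The following is only a sketch: CountedString, its static counters, and the two helper functions are hypothetical names introduced for illustration, not part of the String class in the text. It assumes a C++11 compiler.

```cpp
#include <cstring>

// Hypothetical stand-in for String that counts constructions.
class CountedString {
 public:
   CountedString( const char *s = "" )          // nonexplicit, as in the text
      : s_( new char[ std::strlen( s ) + 1 ] )
      { std::strcpy( s_, s ); ++constructed; }
   CountedString( const CountedString & ) = delete;
   CountedString &operator =( const CountedString & ) = delete;
   ~CountedString()
      { delete [] s_; }
   friend bool operator ==( const CountedString &a, const CountedString &b )
      { return std::strcmp( a.s_, b.s_ ) == 0; }
   static int constructed;                      // objects built so far
 private:
   char *s_;
};
int CountedString::constructed = 0;

// Number of objects created by comparing two existing strings.
int temporariesForStringCompare() {
   CountedString s( "Hello, World!" );
   CountedString t( "Yo!" );
   int before = CountedString::constructed;
   bool same = ( s == t );                      // both references bind directly
   (void)same;
   return CountedString::constructed - before;
}

// Number of objects created by comparing a string with a literal.
int temporariesForLiteralCompare() {
   CountedString s( "Hello, World!" );
   int before = CountedString::constructed;
   bool same = ( s == "Howdy!" );               // implicit conversion
   (void)same;
   return CountedString::constructed - before;
}
```

Under these assumptions the first helper reports no new objects and the second reports one: the temporary built from the literal.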

However, the implicit conversion is not acceptable on at least two occasions. The first is, of course, the case where the conversions are heavily used and are causing significant size or speed problems. The second is when the availability of an implicit conversion from a const char * to a String is causing ambiguity and complexity elsewhere in the use of Strings, and the designer of the String class wishes to address these problems by making the String constructor explicit.

Overloading the String equality operators easily solves this problem:



class String {
 public:
   explicit String( const char * = "" );
   ~String();
   friend bool operator ==( const String &, const String & );
   friend bool operator !=( const String &, const String & );
   friend bool operator ==( const String &, const char * );
   friend bool operator !=( const String &, const char * );
   friend bool operator ==( const char *, const String & );
   friend bool operator !=( const char *, const String & );
   // . . .
};

Now any legal combination of arguments for the operation will result in an exact match, and the compiler will generate no temporary String objects. Unfortunately, the String class is now larger and harder to understand, so this approach to optimization is usually appropriate only after profiling reveals the need.
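As a rough check of the claim that no temporaries are generated, the overloads can be exercised on an instrumented class. This is a sketch under stated assumptions: ExplicitString, its counter, and temporariesWithOverloads are hypothetical names, only some of the six operators are shown, and a C++11 compiler is assumed.

```cpp
#include <cstring>

// Hypothetical stand-in for the explicit-constructor String, with a counter.
class ExplicitString {
 public:
   explicit ExplicitString( const char *s = "" )
      : s_( new char[ std::strlen( s ) + 1 ] )
      { std::strcpy( s_, s ); ++constructed; }
   ExplicitString( const ExplicitString & ) = delete;
   ExplicitString &operator =( const ExplicitString & ) = delete;
   ~ExplicitString()
      { delete [] s_; }
   friend bool operator ==( const ExplicitString &a, const ExplicitString &b )
      { return std::strcmp( a.s_, b.s_ ) == 0; }
   friend bool operator ==( const ExplicitString &a, const char *b )
      { return std::strcmp( a.s_, b ) == 0; }
   friend bool operator ==( const char *a, const ExplicitString &b )
      { return std::strcmp( a, b.s_ ) == 0; }
   friend bool operator !=( const ExplicitString &a, const char *b )
      { return !(a == b); }
   static int constructed;                      // objects built so far
 private:
   char *s_;
};
int ExplicitString::constructed = 0;

// Number of objects created by three mixed comparisons.
int temporariesWithOverloads() {
   ExplicitString s( "Howdy!" );
   int before = ExplicitString::constructed;
   bool a = ( s == "Howdy!" );                  // exact match: (String, char *)
   bool b = ( "Howdy!" == s );                  // exact match: (char *, String)
   bool c = ( s != "Yo!" );                     // exact match: (String, char *)
   (void)a; (void)b; (void)c;
   return ExplicitString::constructed - before;
}
```

Each comparison selects an overload that takes the literal directly as a const char *, so the helper should report zero new objects.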

A common error committed by C++ novices is to pass class objects by value when passing by reference would be preferable. Consider a function that takes a String argument:



String munge( String s ) {
   // munge s . . .
   return s;
}
// . . .
String t( "Munge Me" );
t = munge( t );

It's hard to find anything nice to say about this code, yet such code is common in many novice attempts to use C++. The call to munge requires a copy construction of the s formal argument as well as a copy construction of the return value and a destruction of the local s. Since we're assigning the munged t back to itself, we might expect that the assignment operator will recognize that and perform a no-op. No such luck. The compiler is required to dump the return value of munge into a temporary (which must be destroyed later), so the assignment will not be optimized. So we're looking at a total of six function calls: the call to munge itself, two copy constructions, the assignment, and two destructions.
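The tally can be checked with a traced class. What follows is a sketch, not the String class from the text: TracedString, Tally, and callsForMungeByValue are hypothetical names. The class deliberately declares no move operations so that the counts match the copy-only analysis above; a class with a move constructor would typically move the return value rather than copy it.

```cpp
// Hypothetical record of the special member functions that ran.
struct Tally {
   int copies;
   int assigns;
   int dtors;
};

// Hypothetical traced class; copy-only on purpose (no move operations).
class TracedString {
 public:
   TracedString() {}
   TracedString( const TracedString & )
      { ++tally.copies; }
   TracedString &operator =( const TracedString & )
      { ++tally.assigns; return *this; }
   ~TracedString()
      { ++tally.dtors; }
   static Tally tally;
};
Tally TracedString::tally = { 0, 0, 0 };

// Pass by value and return by value, as in the text.
TracedString munge( TracedString s ) {
   // munge s . . .
   return s;
}

// Counts the member functions invoked by the single statement t = munge( t ).
Tally callsForMungeByValue() {
   TracedString t;
   TracedString::tally = Tally();               // reset all counters to zero
   t = munge( t );
   return TracedString::tally;                  // snapshot taken before t dies
}
```

Under these assumptions the snapshot shows two copy constructions, one assignment, and two destructions, which together with the call to munge itself gives six calls.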

A better approach is to rewrite the munge function to use an alias for the String it will munge:



void munge( String &s ) {
   // munge s . . .
}
// . . .
munge( t );

One function call. The two functions have slightly different meanings, in that any munging performed on s is reflected immediately in the actual argument t rather than on return. (This difference might be noticeable if an exception or interrupt were to occur within munge or if munge should call another function that referenced t.) However, the overall complexity is reduced, and the code is smaller and faster.
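The difference in meaning shows up when munging fails partway through. The sketch below uses std::string in place of String, and the function names and the fail flag are hypothetical, introduced only to force an exception after the argument has been modified.

```cpp
#include <stdexcept>
#include <string>

// By value: the function works on a private copy of the argument.
std::string mungeByValue( std::string s, bool fail ) {
   s += "!";                                    // modify the copy
   if( fail )
      throw std::runtime_error( "munge failed" );
   return s;
}

// By reference: the function works directly on the caller's object.
void mungeByReference( std::string &s, bool fail ) {
   s += "!";                                    // modify the caller's string
   if( fail )
      throw std::runtime_error( "munge failed" );
}

// State of t after a failed by-value munge.
std::string afterFailedByValue() {
   std::string t( "Munge Me" );
   try { t = mungeByValue( t, true ); }
   catch( const std::runtime_error & ) {}
   return t;
}

// State of t after a failed by-reference munge.
std::string afterFailedByReference() {
   std::string t( "Munge Me" );
   try { mungeByReference( t, true ); }
   catch( const std::runtime_error & ) {}
   return t;
}
```

In the by-value version the exception prevents the assignment, so t is left untouched. In the by-reference version the partial modification is already visible in t when the exception propagates.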

Passing by reference is particularly important when implementing templates, since it's not possible to predict in advance the expense of argument passing for a particular instantiation:



template <typename T>
bool operator >( const T &a, const T &b )
   { return b < a; }

Passing an argument by reference has a low, fixed cost that doesn't vary from argument to argument. It may be that some arguments, such as predefined types and small, simple class types, are more efficiently passed by value. If these cases are important, the template can be overloaded (if it's a function template) or specialized (if it's a class template).
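One way to overload a function template for a cheap type is to add a nontemplate function that takes its arguments by value. This is a sketch only: greaterThan, byValueCalls, and overloadDemo are hypothetical names, and the counter exists purely to show which version overload resolution selects.

```cpp
#include <string>

int byValueCalls = 0;    // hypothetical counter: calls to the int overload

// General version: pass by reference to const, whatever T costs to copy.
template <typename T>
bool greaterThan( const T &a, const T &b )
   { return b < a; }

// Nontemplate overload for int: pass by value.
inline bool greaterThan( int a, int b )
   { ++byValueCalls; return b < a; }

// Calls both versions and reports how often the int overload ran.
int overloadDemo() {
   byValueCalls = 0;
   bool ints = greaterThan( 3, 2 );                       // nontemplate overload
   bool strs = greaterThan( std::string( "b" ),
                            std::string( "a" ) );         // template version
   (void)ints; (void)strs;
   return byValueCalls;
}
```

When both versions match equally well, as they do for two int arguments, overload resolution prefers the nontemplate function, so the by-value version is used only where it was requested.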

Additionally, convention sometimes encourages passing by value. For example, in the C++ standard template library, it's conventional to pass "function objects" by value. (A function object is an object of a class that overloads the function call operator. It's just a class object like any other class object, but it allows one to use it with function call syntax.)

For example, we can declare a function object to serve as a "predicate": a function that answers a yes-or-no question about its argument:



struct IsEven : public std::unary_function<int,bool> {
   bool operator ()( int a )
       { return !(a & 1); }
};

An IsEven object has no data members, no virtual functions, and no constructor or destructor. Passing such an object by value is inexpensive (and often free). In fact, it's considered good form when using the STL to pass function objects as anonymous temporaries:



extern int a[n];
int *thatsOdd = partition( a, a+n, IsEven() );

The expression IsEven() creates an anonymous temporary object of type IsEven, which is then passed by value to the partition algorithm (see Gotcha #43). Of course, this convention presumes the additional convention that function objects used with the STL will be small and efficiently passed by value.
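A self-contained version of this call might look like the sketch below. IsEvenSketch and countLeadingEvens are hypothetical names. The predicate omits the std::unary_function base class, which was removed from the standard library in C++17, and its function call operator is declared const; both are choices of this sketch, not of the original code.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical predicate: no data members, so passing it by value is cheap.
struct IsEvenSketch {
   bool operator ()( int a ) const
      { return !(a & 1); }
};

// Moves the even values to the front of [first, last) and returns their count.
std::ptrdiff_t countLeadingEvens( int *first, int *last ) {
   // An anonymous temporary is passed by value to the algorithm.
   int *thatsOdd = std::partition( first, last, IsEvenSketch() );
   return thatsOdd - first;
}
```

After the call, every element before the returned position satisfies the predicate and every element from that position on does not; std::partition makes no promise about the relative order within either group.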
