
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 17.  Syntax


17.4. Variable Notation

This section is much less a didactic diatribe than an opportunity to explain my class member notation, so that it'll not be an issue through the rest of the book.

17.4.1 Hungarian Notation

I'm going to be as brief as possible since this subject is like opening Pandora's box! I don't believe that decorating a variable with its type is particularly helpful, and I do think that it is a porting nightmare. Hungarian is intended to provide readers of code with rich information regarding the variable, as in:



short      sMaxHandleIndex; // s prefix denotes short
char const **ppcszEnvBlock; // ppcsz denotes pointer to pointer to const
                            // char representing a zero terminated string



What happens when the code containing sMaxHandleIndex is ported to another architecture on which the maximum number of handles can be a larger range than can be expressed with short? The type might be changed to long, in which case the variable name decoration is now an outright lie.



long sMaxHandleIndex;



Clearly porting undermines the whole raison d'être of Hungarian notation. The other problem is that knowing every little detail of a variable's type is often irrelevant, and can be quite detrimental to readability.



typedef map<string, map<string, int> >  string_2_string_2_int_map_map_t;

string_2_string_2_int_map_map_t         s2s2immIncludesDependencyTree;



Don't laugh; I've seen this (with actual names changed, of course) in real code! s2s2imm is not really a help here, is it? This is incredibly brittle, not to mention almost impossible to read. Much better as



typedef map<string, map<string, int> >  string_2_string_2_int_map_map_t;

string_2_string_2_int_map_map_t         includesDependencyTree;



or



typedef map<string, map<string, int> >  IncludeDependencyTree_t;

IncludeDependencyTree_t                 includesDependencyTree;



Most people now completely eschew any form of Hungarian notation as a consequence of these issues. However, being an upstream swimmer, I actually use a restricted form of prefixing in my own code, which you'll see throughout the book. I'm not going to recommend that you do the same, so this item is merely an explanation, and an attempt to save you the effort of e-mailing me to tell me how old-hat my code is.

Remember I said that knowing the type of a variable is superfluous and nonportable. However, I think it's often valuable to know the purpose of a variable. When dealing with character strings, one can be dealing with different character types, such as char and wchar_t. Most often, when quantifying the sizes of buffers of such types, functions take or return values measured in terms of the number of characters. However, sometimes they need to measure in terms of the number of bytes. Failure to be mindful of the difference between these two concepts can be very costly in terms of program (and career) crashes. Hence I use the prefixes cb and cch, which denote count-of-bytes and count-of-characters, respectively. The prefixes in no way indicate the type of such variables (they could be int, short, long, whatever), so there are no porting issues, but they do indicate their purpose, which improves readability.
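The difference between the two counts can be sketched as follows. This is only a minimal illustration: the helpers bytes_for_chars and copy_wide, and the function cb_cch_demo, are hypothetical names invented for the example, not functions from any library discussed in the book.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: the number of bytes occupied by cch wide characters.
inline std::size_t bytes_for_chars(std::size_t cch)
{
    return cch * sizeof(wchar_t);
}

// Hypothetical helper: copies src into dest, writing at most cchDest
// characters including the terminating null. Returns the number of
// characters copied, excluding the terminator.
inline std::size_t copy_wide(wchar_t* dest, std::size_t cchDest, wchar_t const* src)
{
    if (0 == cchDest)
    {
        return 0;
    }
    std::size_t const cchSrc  = std::wcslen(src);
    std::size_t const cchCopy = (cchSrc < cchDest - 1) ? cchSrc : cchDest - 1;
    std::wmemcpy(dest, src, cchCopy);
    dest[cchCopy] = L'\0';
    return cchCopy;
}

inline void cb_cch_demo()
{
    wchar_t           buffer[8];
    std::size_t const cchBuffer = sizeof(buffer) / sizeof(buffer[0]); // count of characters
    std::size_t const cbBuffer  = sizeof(buffer);                     // count of bytes

    assert(cbBuffer == bytes_for_chars(cchBuffer));
    // Passing cbBuffer here instead of cchBuffer would overstate the capacity.
    assert(5 == copy_wide(buffer, cchBuffer, L"hello"));
}
```

Because wchar_t is wider than one byte on most platforms, handing cbBuffer to a function that expects a character count would claim more room than the buffer has. The prefixes make that kind of mismatch visible at the call site.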

I don't expect you to agree with this viewpoint and adopt these conventions, and I'm not even suggesting that they are "the best approach." Wherever possible, it's better to have the variable's type itself denote its purpose (e.g., I prefer byte_t over unsigned char; section 13.1), but there are times when a bit more information is needed.

17.4.2 Member Variables

Most people decorate member variables in some fashion to distinguish them from nonmember variables. However, there are several schemes employed.

The most basic form is to not decorate at all, as in:



class X
{
public:
  void SetValue(int value)
  {
    this->value = value;
  }
private:
  int value;
};



I know several chaps who like this form, but I think it's an accident waiting to happen. It's far too easy to omit the this in code, and have situations such as:



void SetValue(int value)
{
  value = value; // Does nothing; "this" is unchanged
}



Since assignment of a variable to itself is allowed (except where it's prevented by proscribing access to the copy assignment operator; see section 14.2.4), the compiler will happily do what you say, and your code is buggy. Of course, it is possible to make the arguments const in the method implementation, as in:



void SetValue(int const value)
{
  value = value; // Compile error
}



But support for this from compilers is not universal, and it's really just answering an awful problem with an awkward solution.
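To make the hazard concrete, here is a minimal sketch that contrasts the buggy and the correct undecorated setters at run time. The class name Undecorated, the method SetValueBuggy, and the function undecorated_demo are hypothetical, introduced only for this illustration. Some compilers may warn about the self-assignment, which is rather the point.

```cpp
#include <cassert>

class Undecorated
{
public:
    Undecorated()
        : value(0)
    {}

    void SetValueBuggy(int value)
    {
        value = value; // the parameter hides the member, so this assigns the parameter to itself
    }

    void SetValue(int value)
    {
        this->value = value; // explicit this-> reaches the member
    }

    int GetValue() const
    {
        return value;
    }

private:
    int value;
};

inline void undecorated_demo()
{
    Undecorated u;

    u.SetValueBuggy(42);
    assert(0 == u.GetValue());  // the member still holds its initial value

    u.SetValue(42);
    assert(42 == u.GetValue());
}
```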

I occasionally use the undecorated form in trivial structures where all the members are public and there are very few and simple methods (perhaps just a constructor), but in class types of any sophistication I would not do so.
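A minimal sketch of the kind of trivial structure meant here follows, using a hypothetical Point type. In a constructor's member initializer list the name before the parentheses is looked up as a member and the name inside as the parameter, so the undecorated form is unambiguous in that one place.

```cpp
#include <cassert>

// Hypothetical trivial structure: all members public, only a constructor.
struct Point
{
    int x;
    int y;

    // x(x) initializes the member x from the parameter x; no this-> is needed here.
    Point(int x, int y)
        : x(x)
        , y(y)
    {}
};

inline void point_demo()
{
    Point p(3, 4);

    assert(3 == p.x);
    assert(4 == p.y);
}
```

Note that this safety applies only to the initializer list; an assignment written in the constructor body would suffer the same self-assignment problem as SetValue().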

There are four other schemes that I know of. The first is to prefix the member variables with an underscore, as in:



void SetValue(int value)
{
  _value = value;
}



However, the standard reserves identifiers with a leading underscore in the global namespace (C++-98: 17.4.3.1.2) and ::std, so I think using them anywhere just represents a bad habit (albeit one I've still not completely broken). An alternative to this is to postfix an underscore, as in:



void SetValue(int value)
{
  value_ = value;
}



which is legal. However, both these forms are too subtle for my tastes. I prefer the convention popularized by MFC,[7] which is to prefix with m_, as in:

[7] You see, there is something good about MFC!



void SetValue(int value)
{
  m_value = value;
}



This is similar to the old C struct member tagging, for example, struct tm's members tm_hour, tm_wday, and so on, which were introduced because in early versions of C all member names were in a single namespace. However, it is generic, and therefore consistent. In fact, I like it so much that I use variations on the theme, sm_ for static member, g_ for global variables,[8] and s_ for static variables, as in:

[8] Of course, global variables are so evil that I only get to use this one about once a year.



int InitOnce(int val)
{
  static int s_val = val;
  return s_val;
}

class Y
{
  . . .
  static int sm_value;
};



These are the notations you'll see used in code throughout the book.
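As a rough consolidation, the four prefixes might appear together like this. Every name in the sketch (g_instanceLimit, Counter, NextId, notation_demo) is hypothetical and chosen purely to show the prefixes side by side.

```cpp
#include <cassert>

int g_instanceLimit = 8;        // g_: global variable (hypothetical)

class Counter
{
public:
    explicit Counter(int value)
        : m_value(value)
    {
        ++sm_count;
    }
    ~Counter()
    {
        --sm_count;
    }

    void SetValue(int value)
    {
        m_value = value;        // no ambiguity between member and parameter
    }
    int GetValue() const
    {
        return m_value;
    }
    static int Count()
    {
        return sm_count;
    }

private:
    Counter(Counter const&);            // copying proscribed, to keep the count simple
    Counter& operator =(Counter const&);

private:
    int         m_value;        // m_: nonstatic member
    static int  sm_count;       // sm_: static member
};

int Counter::sm_count = 0;

int NextId()
{
    static int s_nextId = 0;    // s_: function-local static
    return ++s_nextId;
}

inline void notation_demo()
{
    Counter c(5);

    assert(1 == Counter::Count());
    assert(Counter::Count() <= g_instanceLimit);

    c.SetValue(7);
    assert(7 == c.GetValue());
}
```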

