Pointers and Arrays

Arrays are passed around as pointers. When an array is passed to a function, and in most other expressions, its name is converted to a pointer to the first element of the array. In practical terms it means that we can access a pointer as if it were an array, that is, using an index. Conversely, we can use a pointer to access elements of an array. We can increment a pointer to move from one element of the array to the next. Of course, in both cases it is our responsibility to make sure that the pointer actually points to an element of an array.

A string is a good example of an array. There is a function in the standard library called strlen that calculates the length of a null-terminated string. Let’s write our own implementation of this function, which we will call StrLen:

int StrLen (char const str [])
{
    int i;  // declared outside the loop so the return can see it
    for (i = 0; str [i] != '\0'; ++i)
        continue;
    return i;
}

The continue statement is used here in place of an empty loop body. It's less error-prone this way, since a lone semicolon is easy to overlook.

Here’s the main procedure that passes an array to StrLen:

#include <iostream>
using std::cout;

int main ()
{
    char aString [] = "the long string";
    int len = StrLen (aString);  // array passed as a pointer
    cout << "The length of " << aString << " is " << len;
}

We are scanning the string for a terminating null and returning the index of this null. Pretty obvious, isn’t it?

Here’s a more traditional "optimized" version:

int StrLen (char const * pStr)
{
    char const * p = pStr;
    while (*p++);
    return p - pStr - 1;
}

We initialize p to point to the beginning of the string. The while loop is a little cryptic. We dereference the pointer, test it for Boolean truth and post-increment it, all in one statement. If the character obtained by dereferencing the pointer is different from zero (zero being equivalent to Boolean false) we will continue looping. The post-increment operator moves the pointer to the next position in the array, but only after it has been used in the expression (yielding true or false).

Figure 2. The pointer p initially points at the first character of the string "Hi!" at address 0xe04 (I use hexadecimal notation for addresses). Subsequent increments move it through the characters of the string until the null character is reached and processed.

By the way, there is also a pre-increment operator that is written in front of a variable, ++p. It increments the variable before its value is used in the expression.

Increment Operators (acting on i)
    i++    post-increment
    ++i    pre-increment
Decrement Operators (acting on i)
    i--    post-decrement
    --i    pre-decrement

In the spirit of terseness, I haven’t bothered using the continue statement in the empty body of the loop. The semicolon (denoting the empty statement) is considered sufficient by old-school C programmers.

Finally, we subtract the two pointers to get the number of array elements that we have gone through; and subtract one, since we have overshot the null character (the last test accesses the null character, and then it increments the pointer anyway). By the way, I never get this part right the first time around. I had to pay the penalty of an additional edit-compile-run cycle. If you have problems understanding this second implementation, you’re lucky. I won’t have to convince you any further not to write code like this.

The question is: Will you prefer to write the simpler and more readable index implementation of procedures like StrLen after seeing both versions? If you have answered yes, you may go directly to the following paragraph. If you want more gory details, look at the assembly code generated for each version.

Soapbox

The art of programming is in a very peculiar situation. It is developing so fast that people who started programming when C was in its infancy are still very active in the field. In other sciences a lot of progress was made through natural attrition. The computer revolution happened well within one generation. Granted, a lot of programmers made enough money to be able to afford doing volunteer work for the rest of their lives. Still, many others are carrying around their old (although only a few years old) bag of tricks that they learned when they were programming XTs with 64K of memory.

New programmers learn programming from the classics like Kernighan and Ritchie’s "The C Programming Language." It’s a great book, don’t get me wrong, but it teaches the programming style of times long gone.

The highest authority in algorithms and data structures is Donald Knuth’s great classic "The Art of Computer Programming." It’s a beautiful and very thorough series of scientific books. However, I’ve seen C implementations of quicksort that were based on the algorithms from these books. They were pre-structured-programming monstrosities.

If your compiler is unable to optimize the human-readable, maintainable version of the algorithm, and you have to double as a human compiler, buy a new compiler! Nobody can afford human compilers any more. So, have mercy on yourself and your fellow programmers who will have to look at your code.

Don’t use pointers where an index will do.