Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson
	Table of Contents

	Chapter 14. Arrays and Pointers

14.7. Cannot Have Multidimensional Arrays

This one's an overt part of the language, so I'm not going to try and soften you up before declaring:

Imperfection: C++ does not support multidimensional arrays.^[10]

^[10] The C99 standard introduced Variable Length Arrays (VLAs) into the C language, which support multidimensional arrays whose dimension extents can be determination at run time. We look in detail at the reasons why VLAs are not terribly useful for C++ in section 32.2, not the least of which is that they are not part of the standard.

Okay, okay, I've done a tabloid-journalism-101 on you, omitting some of the truth for a good headline. To be precise, both C and C++ do support multidimensional arrays. The standard (C++-98 8.3.4.3) states, "[W]hen several array...specifications are adjacent a multidimensional array is created." However, it also says, "[T]he constant expressions that specify the bounds...can be omitted only for the first [dimension]," so all but the left-most dimensions must have a constant (at compile-time) extent. Hence:



void process3dArray(int ar[][10][20]); // Ok


void process3dArray(int ar[][][]);     // Error! Confused compiler

For me, this is a bona fide imperfection, since I don't think I've ever wanted to use a multidimensional array in a serious application where I didn't want flexibility in more than the most significant dimension. In such cases, one either has to find workarounds (manual pointer calculation hackery, in other words), or stipulate maximum fixed dimension extents within which the actual dynamic extents must lie, or use multidimensional array classes.

Built-in arrays in C and C++ use a contiguous layout scheme whereby the storage allocated for each element of a given dimension contains the elements of the next most significant dimension in ascending order. An array a3[2][3][4] actually has the following shown in Figure 14.1.

Figure 14.1.



Subscripts     Address


a3[0][0][0]    a3 + 0


a3[0][0][1]    a3 + 1


a3[0][0][2]    a3 + 2


a3[0][0][3]    a3 + 3


a3[0][1][0]    a3 + 4


a3[0][1][1]    a3 + 5


a3[0][1][3]    a3 + 7


a3[0][2][0]    a3 + 8


a3[0][2][1]    a3 + 9


a3[0][2][2]    a3 + 10


a3[0][2][3]    a3 + 11


a3[1][0][0]    a3 + 12


a3[1][0][1]    a3 + 13


a3[1][0][2]    a3 + 14


a3[1][0][3]    a3 + 15


a3[1][1][0]    a3 + 16


a3[1][1][1]    a3 + 17


a3[1][1][2]    a3 + 18


a3[1][1][3]    a3 + 19


a3[1][2][0]    a3 + 20


a3[1][2][1]    a3 + 21


a3[1][2][2]    a3 + 22


a3[1][2][3]    a3 + 23

This contiguous layout^[11] is very useful, because it allows array slices (subarrays) to be passed simply by taking the address of the requisite part. For example, some code that manipulates a two-dimensional array of dimensions [3][4] can also manipulate a3[0] or a3[1], as shown in Listing 14.7.

^[11] Note that this ordering is incompatible with Fortran ordering, which uses right-most-significant ordering. (See section 33.2.3.)

Listing 14.7.



void print_array(int (*pa2)[3][4]);





int    a3[2][3][4]; // A 3-D array of extents (2,3,4)


int    a2[3][4];    // A 2-D array of extents (3,4)


int (*pa2)[3][4]; // Ptr to a 2-D array of extents (3,4)





pa2 = &a2;        // Point to the 2-D array


pa2 = &a3[0];     // Point to a slice of the 3-D array





print_array(pa2);     // passing pointer-to-array


print_array(&a2);     // a2


print_array(&a3[0]);  // A slice of a3


print_array(&a3[1]);  // Another slice of a3

This example shows one case where an array does not break down into a pointer when being passed to a function (see section 14.4), but that's because the function parameter is de fined as a pointer to an array rather than an array that, as we now know, can be treated as a pointer. Confused yet?

The array layout scheme explains why multidimensional arrays must have all but the most significant dimension fixed: the compiler needs to know the sizes of all other dimensions, so that it can generate the correct offsets when translating from the subscript form (see section 14.2.1) to calculate the actual location to manipulate. For example, the address of element a3[1][0][3] is actually decomposed as follows, where the notional constants D1 and D2 represent the extents of the two least significant dimensions:



a3[1][0] + 3


(a3[1] + 0 * D2) + 3


((a3 + 1 * D1 * D2) + 0 * 4) + 3

Unless the values of these extents are known to the compiler, it cannot calculate the actual offset, which in this case is:



a3 + 15

We saw in section 14.2 that arrays decay into pointers. We need to refine that rule somewhat. The most significant dimension of an array is almost always interpreted as a pointer by the compiler. I grant you this is a bit chicken-and-egg, but that's our funky language(s).

Okay, so the most significant dimension decays, hence we can rewrite the first form as:



void process3dArray(int (*ar)[10][20]);

which says that process3dArray() takes a pointer to a two-dimensional array of dimensions 10 and 20. The compiler clearly still knows the dimensions of the second and third dimensions, and can use that array without getting confused. However, the second, illegal, form decays into:



void process3dArray(int (*ar)[][]);

which equally clearly contains no information about the extents of the two least significant dimensions. The compiler would have to be clairvoyant to be able to deduce the calling context's array dimensions and to manipulate ar appropriately. The only way to make use of it as a three-dimensional array is to provide process3dArray() with the extents of the three (or at least two, anyway) dimensions. This can be done at compile time, with constants, but then one might as well just use the first form with the explicit dimensions. Alternatively, it can be supplied by global variables, which is very poor form, so we'd use additional parameters, as in:



void process3dArray( int *ar, size_t extent0, size_t extent1


                   , size_t extent2);

Hardly succinct is it? Nor is it respectful of encapsulation, or particularly maintainable. Thankfully, it's reasonably straightforward to come up with a very nice solution. I won't say it's easy, because balancing expressiveness, flexibility, and efficiency for such things is no mean feat. Naturally, we'll define a template class. The details are to be found in Chapter 33—I've got to keep some powder dry for now, or you'll never make it to the back cover—but it's worth pointing out here that it relies on the ability to meaningfully take N dimensional slices from N+1 dimensional arrays, as shown earlier.