
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 33.  Multidimensional Arrays


33.4. Block Access

One thing we commonly do with built-in arrays is to treat them en bloc, performing many operations on the elements of an array in a single statement or in a small number of statements. For example, it's easy to initialize them to zero by using memset(), as in:

    byte_t  ar[10][10];

    memset(&ar[0][0], 0, sizeof(ar));

or the lazy

    memset(ar, 0, sizeof(ar));

or the dubious

    memset(&ar, 0, sizeof(ar));

or the very dubious

    memset(&ar[0], 0, sizeof(ar));



Unfortunately, it's all too easy to do this with class types that provide array semantics, as in:



    fixed_array_2d<byte_t, . . . > fa2(10, 10);
    boost::multi_array<byte_t, 3>  bma3(boost::extents[10][10][10]);

    memset(&fa2[0][0], 0, sizeof(fa2)); // Wrong size!
    memset(&bma3, 0, sizeof(bma3));     // Wrong ptr; wrong size!
    memset(&fa2[0], 0, sizeof(fa2));    // Wrong ptr; wrong size!



Table 33.1 shows the permutations of using memset() and sizeof() for various one- and two-dimensional array types. A blank entry denotes that compilation and execution were correct. DNC denotes that the combination did not compile, which is a good thing. E denotes that it compiled and ran, but exhibited erroneous behavior, either by writing too many or too few elements, or by overwriting the other member variables of the array instance; whatever the actual problem, E denotes a very bad thing. The suffix distinguishes the two failures: E(O) marks overwritten member variables, and E(S) a wrong size.

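Table 33.1. Compatibility of array types with memset() and sizeof().

                   | 1-dimension          | 2-dimensions
Array type         | ar   | &ar  | &ar[0] | ar   | &ar  | &ar[0] | &ar[0][0]
-------------------+------+------+--------+------+------+--------+----------
built-in           |      |      |        |      |      |        |
boost::array       | DNC  |      |        | DNC  |      |        |
static_array       | DNC  |      |        | DNC  |      | E(O)   |
boost::multi_array | DNC  | E(O) | E(S)   | DNC  | E(O) | E(O)   | E(S)
fixed_array        | DNC  | E(O) | E(S)   | DNC  | E(O) | E(O)   | E(S)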

There are two separate problems here. For one thing, it is possible to pass inappropriate things to memset(); since all it needs is a void*, it's all too easy to pass the wrong thing. This is especially dangerous since passing &ar actually works correctly in the case of a built-in array, but results in the overwriting of member data for class type instances.

It's no great surprise to learn that the dynamically sized array classes, boost::multi_array and fixed_array, experience run time problems when used with memset() and sizeof(). When their address is passed to memset(), the member variables are overwritten: E(O). When the address of their first element (&ar[0] for one dimension; &ar[0][0] for two dimensions) is passed, the wrong number of elements is written, because the size of the instance does not (in the majority of cases) match the size of the managed elements: E(S).
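The size mismatch behind E(S) can be made concrete with a small sketch. Here std::vector is only a stand-in for a dynamically sized array class (it is not one of the classes discussed in this chapter), and the names dynamic_bytes, element_bytes, and show_mismatch are hypothetical. The point is simply that sizeof() measures the bookkeeping members of the instance, not the elements it manages.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dynamically sized array class: the object
// itself holds only bookkeeping members; the elements live elsewhere.
typedef std::vector<unsigned char> dynamic_bytes;

// Number of bytes occupied by the managed elements.
std::size_t element_bytes(dynamic_bytes const &a)
{
  return a.size() * sizeof(unsigned char);
}

// Demonstrates that sizeof(instance) says nothing about the managed size.
void show_mismatch()
{
  dynamic_bytes a(1000);             // 1000 managed elements

  assert(element_bytes(a) == 1000);  // what a block write should cover
  assert(sizeof(a) != 1000);         // assumption: true on mainstream
                                     // implementations, where the object
                                     // is a few pointers wide
}
```

Passing sizeof(a) as the length to memset() would therefore write far too few bytes here, and with a small array it would write too many.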

If we want to write generic code that contains en-bloc manipulation of array types, we clearly have some work ahead of us.

33.4.1 Using std::fill_n()

Let's deal with the inappropriate pointer problem first. We can avoid this problem by using the more type-safe (and size-safe) standard library algorithm std::fill_n() instead of memset(). Rewriting our several examples in this way roots out many of the problematic expressions.



    byte_t                         ar[10][10];
    fixed_array_2d<byte_t, . . . > fa2(10, 10);
    boost::multi_array<byte_t, 3>  bma3(boost::extents[10][10][10]);

    fill_n(&ar[0][0], dimensionof(ar), 0);    // Ok
    fill_n(ar, dimensionof(ar), 0);           // Compile error!
    fill_n(&ar, dimensionof(ar), 0);          // Compile error!
    fill_n(&ar[0], dimensionof(ar), 0);       // Compile error!
    fill_n(&fa2[0][0], dimensionof(fa2), 0);  // Wrong size!
    fill_n(&bma3, dimensionof(bma3), 0);      // Compile error!
    fill_n(&fa2[0], dimensionof(fa2), 0);     // Compile error!



Table 33.2 shows the permutations of using std::fill_n() and dimensionof() for various one- and two-dimensional array types. Note that we use dimensionof() (see section 14.3) rather than sizeof() because fill_n() takes the number of elements to modify, rather than the number of bytes.
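The distinction between a byte count and an element count can be sketched for a one-dimensional built-in array. The definition of dimensionof() belongs to section 14.3 and is not reproduced here, so this sketch assumes the familiar sizeof(ar) / sizeof(ar[0]) idiom in its place; the function name fill_ints is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Fills a 10-element int array with value and returns the number of
// elements written.
std::size_t fill_ints(int (&ar)[10], int value)
{
  std::size_t const bytes    = sizeof(ar);                 // what memset() takes
  std::size_t const elements = sizeof(ar) / sizeof(ar[0]); // what fill_n() takes

  // The two counts differ by a factor of the element size.
  assert(bytes == elements * sizeof(int));

  std::fill_n(&ar[0], elements, value);

  return elements;
}
```

Passing the byte count to fill_n() by mistake would write past the end of the array for any element type larger than one byte.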

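Table 33.2. Compatibility of array types with fill_n() and dimensionof().

                   | 1-dimension          | 2-dimensions
Array type         | ar   | &ar  | &ar[0] | ar   | &ar  | &ar[0] | &ar[0][0]
-------------------+------+------+--------+------+------+--------+----------
built-in           |      | DNC  |        | DNC  | DNC  | DNC    |
boost::array       | DNC  | DNC  |        | DNC  | DNC  | DNC    |
static_array       | DNC  | DNC  |        | DNC  | DNC  | DNC    |
boost::multi_array | DNC  | DNC  | E(S)   | DNC  | DNC  | DNC    | E(S)
fixed_array        | DNC  | DNC  | E(S)   | DNC  | DNC  | DNC    | E(S)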


You shouldn't worry about a loss of efficiency with std::fill_n(), because good standard library implementations will specialize for use of memset() with single byte types, and we can't use memset()—except where setting all bytes to 0—for larger types anyway.
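The restriction on memset() for larger types is easy to see in a sketch. memset() writes a value into every byte, whereas std::fill_n() assigns a value to every element. The function names below are hypothetical, and the sketch assumes 8-bit bytes and an unsigned int wider than one byte.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// The value an unsigned int holds when each of its bytes is set to 1
// (assumes 8-bit bytes).
unsigned int all_bytes_one()
{
  unsigned int v = 0;

  for (std::size_t i = 0; i != sizeof(unsigned int); ++i)
  {
    v = (v << 8) | 1u;
  }

  return v;
}

// Sets every byte of the array to 1: the elements do NOT become 1.
void set_with_memset(unsigned int (&ar)[4])
{
  std::memset(&ar[0], 1, sizeof(ar));
}

// Sets every element of the array to 1.
void set_with_fill_n(unsigned int (&ar)[4])
{
  std::fill_n(&ar[0], 4, 1u);
}
```

Only the value 0 gives the same result either way, since an element whose bytes are all zero is itself zero for the integer types.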

With std::fill_n(), we turn almost all of the overwrite run time errors into compile-time errors. This is a compelling example of why we should prefer this to memset() as a general rule.

However, we are still specifying the wrong sizes for our dynamically sized arrays, boost::multi_array and fixed_array, which means that either too few or too many elements are likely to be overwritten, an error in either case.

33.4.2 array_size Shim

What we need is a single mechanism to determine the number of elements in any array type. The solution is an attribute shim (see section 20.2), the cunningly entitled array_size(). As with most shims, there are general definitions that handle most cases, along with specific definitions to handle the specific cases. The general definitions of array_size() are



    template <typename T>
    size_t array_size(T const &)
    {
      return 1; // Not an array, so only one element
    }

    template <typename T, size_t N>
    size_t array_size(T (&ar)[N])
    {
      return N * array_size(ar[0]); // N * number in next dimension
    }



These two handle the potentially infinite dimensionality of built-in arrays and nonarray types. Hence, applying array_size() to int ai[10][30][2][5][6] will result in five calls to the second overload, and one call, the terminating case, to the first overload.
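As a quick check of that recursion, the two general definitions can be exercised in isolation. This is a minimal sketch: the overloads are repeated (with std::size_t spelled out) so that the block stands alone, and the helper count_5d is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Terminating case: a non-array type counts as a single element.
template <typename T>
std::size_t array_size(T const &)
{
  return 1;
}

// Recursive case: N times the number of elements in the next dimension.
template <typename T, std::size_t N>
std::size_t array_size(T (&ar)[N])
{
  return N * array_size(ar[0]);
}

// Applies the shim to the five-dimensional array from the text.
std::size_t count_5d()
{
  int ai[10][30][2][5][6] = {};

  return array_size(ai); // 10 * 30 * 2 * 5 * 6
}
```

For this array the result is 18000, the product of the five extents; a plain int yields 1.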

You may be wondering why we're not using compile-time techniques to work out the number of elements. The answer is that we also need to be able to apply the shim to types whose dimensionality is not known until run time. Anyway, it's not a concern unless you want to use the value at compile time, because all decent compilers make mincemeat of the application of the shim to built-in arrays, and simply convert the run time result into a constant in optimized code.

So let's look at how we extend the shim to other types. Defined alongside the fixed_array and static_array classes are the requisite overloads of the shims:



    template <typename T, typename A, typename P, bool R>
    size_t array_size(fixed_array_4d<T, A, P, R> const &ar)
    {
      return ar.size(); // Total number of elements, known at run time
    }

    template <typename T, size_t N0, typename P, typename M>
    size_t array_size(static_array_1d<T, N0, P, M> const &ar)
    {
      return N0; // Extent is part of the type
    }



We can provide similar overloads for any other classes we wish to use, such as the Boost array classes:



    template <typename T, size_t N>
    size_t array_size(boost::array<T, N> const &ar)
    {
      return N * array_size(ar[0]);
    }

    template <typename T, size_t N>
    size_t array_size(boost::multi_array<T, N> const &ar)
    {
      // NOTE: size() only returns most significant dimension
      return ar.num_elements();
    }



Now we have a simple and general way to determine the number of elements in any array. The only gotcha is if you do not have a definition of the shim for your array type. But you'll pick this up in your comprehensive testing, won't you?
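Extending the shim to a class of your own follows the same pattern. The sketch below uses a hypothetical class, grid_2d, as a stand-in for a dynamically sized array type (it is not one of the classes discussed in this chapter), adds an array_size() overload for it, and then uses the shim with std::fill_n(). The helper zero_all is likewise hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dynamically sized 2-D array with contiguous storage.
class grid_2d
{
public:
  grid_2d(std::size_t rows, std::size_t cols)
    : m_cols(cols)
    , m_data(rows * cols, static_cast<unsigned char>(0xFF))
  {}

  unsigned char &at(std::size_t r, std::size_t c)
  {
    return m_data[r * m_cols + c];
  }

  // Total number of elements across both dimensions.
  std::size_t size() const
  {
    return m_data.size();
  }

private:
  std::size_t                m_cols;
  std::vector<unsigned char> m_data;
};

// General definition: a non-array type counts as a single element.
template <typename T>
std::size_t array_size(T const &)
{
  return 1;
}

// Specific definition for the hypothetical class.
inline std::size_t array_size(grid_2d const &g)
{
  return g.size();
}

// Zeroes every element, taking the count from the shim.
void zero_all(grid_2d &g)
{
  if (array_size(g) != 0)
  {
    std::fill_n(&g.at(0, 0), array_size(g), static_cast<unsigned char>(0));
  }
}
```

Without the specific overload, the general definition would be selected and silently report a single element, which is the gotcha described above.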

When used with memset(), we lose all the size-mismatch problems, although the address problems remain, as shown in Table 33.3.

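Table 33.3. Compatibility of array types with memset() and array_size().

                   | 1-dimension          | 2-dimensions
Array type         | ar   | &ar  | &ar[0] | ar   | &ar  | &ar[0] | &ar[0][0]
-------------------+------+------+--------+------+------+--------+----------
built-in           |      |      |        |      |      |        |
boost::array       | DNC  |      |        | DNC  |      |        |
static_array       | DNC  |      |        | DNC  |      | E(O)   |
boost::multi_array | DNC  | E(O) |        | DNC  | E(O) | E(O)   |
fixed_array        | DNC  | E(O) |        | DNC  | E(O) | E(O)   |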


But, as shown in Table 33.4, when used with std::fill_n(), we achieve a perfect set of permutations. None of the dubious pointer conversions compiles, and the expressions that do compile produce the correct results.

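Table 33.4. Compatibility of array types with std::fill_n() and array_size().

                   | 1-dimension          | 2-dimensions
Array type         | ar   | &ar  | &ar[0] | ar   | &ar  | &ar[0] | &ar[0][0]
-------------------+------+------+--------+------+------+--------+----------
built-in           |      | DNC  |        | DNC  | DNC  | DNC    |
boost::array       | DNC  | DNC  |        | DNC  | DNC  | DNC    |
static_array       | DNC  | DNC  |        | DNC  | DNC  | DNC    |
boost::multi_array | DNC  | DNC  |        | DNC  | DNC  | DNC    |
fixed_array        | DNC  | DNC  |        | DNC  | DNC  | DNC    |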


Recommendation: An array size shim should always be used when determining the sizes of arrays, as no other approach generalizes to both built-in and user-defined array types.


