Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson
	Table of Contents

	Chapter 33. Multidimensional Arrays

33.2. Sized at Run Time

Although we'll see some classes that provide multidimensional arrays of statically determined sizes later in the chapter, the need for such classes is not that great because built-in arrays provide most aspects of such classes for free. The most desirable types of multidimensional arrays are ones whose dimensions are specified at run time.

This is where C/C++'s array layout model serves us well, since all the intermediate instances can be implemented as slice proxies, which are quite efficient: there is no heap allocation, and only the initialization of a few member variables for each subdimension. The alternative would be for each intermediate instance to allocate their own storage, and copy in the values. Not only would this be horribly inefficient, it would also prevent the use of array elements as lvalues.

33.2.1 Variable Length Arrays

As was mentioned in section 32.2.2, Variable Length Arrays (VLAs) provide dynamically sized built-in arrays. However, we learned that only three of our ten compilers (see Appendix A) support VLAs in any form, and only one of those, GCC, supports them for non-POD types.

Unless and until VLAs become a part of standard C++, we cannot count on them to answer our requirements. Even when they do, we'll have a long period where backward compatibility will require that we do not rely on them.

33.2.2 vector< ... vector<T> ... >

One solution I've seen used is to use std::vector to emulate the multiple dimensions, as in:



typedef std::vector<std::vector<int > >   intarray_2d_t;

Unfortunately, this does not create a two-dimensional array; it actually creates an array of arrays, which is quite a different beast. To create an array of 2 x 3, we would have to do the following:



intarray_2d_t  ar(2);


ar[0].resize(3);


ar[1].resize(3);

If we forget to do the additional two resize() calls, then any attempt to access the elements will result in undefined behavior. Worse than this, there's nothing stopping anyone from using different sizes and creating a jagged array:



intarray_2d_t  ar(2);


ar[0].resize(3);


ar[1].resize(2); // Better not use ar[1][2]!!

Clearly, this is an unacceptably fragile solution. Furthermore, it should be equally clear that there are multiple memory allocations occurring: one for the array itself and one for each of its array elements. This can have an important effect on performance, as we'll see in section 33.4.

The final nail in its coffin is that it is not possible to provide STL-like iteration throughout the entire range of elements, making it very difficult to apply all our treasured STL algorithms.

If we wanted to enumerate the contents, we'd have to write a special for_each, like:



template<typename T, typename A1, typename A2, typename F>


F for_each( std::vector<std::vector<T, A1>, A2>::iterator from


          , std::vector<std::vector<T, A1>, A2>::iterator to


          , F                                             fn)


{


  for(; from != to; ++from)


  {


    for_each((*from).begin(), (*from).end(), fn);


  }


  return fn;


}

How d'ya like them apples?

Frankly, this is where the mathematician's instincts need to kick in very strongly: it's not beautiful^[2] and we can confidently deduce that it's bad. We're going to need some types that are specifically written for this purpose.

^[2] And we've not even written the const_iterator version, which would also have to be explicitly provided!

33.2.3 boost::multi_array

The Boost libraries have a dynamically sized array class, called multi_array.^[3] It uses some heavy-duty template mechanisms to provide the ability to represent multiple dimensions within a single template.

^[3] To be honest, I was going to submit the fixed and static arrays described in this chapter to Boost some time ago, but I dallied, and multi_array got in there first. This is a good lesson to anyone who wants to contribute to the development of open source libraries in C++, or indeed in any language; since it's just about impossible to have a novel thought in a community of about 3 million people, speed is of the essence.

Using multi_array requires that you instantiate the template with the element type and the number of dimensions, and then construct instances of your parameterization with the extent of the dimensions, as in:



boost::multi_array<int, 2>  intarray_2d_t(boost::extents[2][3]);

As we'll see in section 33.5, the performance of multi_array is less than optimal, and this probably is an inevitable cost of its complexity and generality.

multi_array suffers from the same range iteration problem as does the vector<…, vector<T> …> solution. For me, this is a serious flaw, although I should concede that from some perspectives it may be that one would wish to deal with a multidimensional array each dimension at a time. Nonetheless, I think it would be better to have a separate mechanism for this, and that begin() and end() would return iterators to the entire range of managed elements, as they do with all the standard library containers, not just the sequence ones.

The news is not all bad, of course. Parameterizing the template is conceptually pleasing from the perspective of client code, insofar as one merely specifies the element type and the number of dimensions. The specification of the array extents is a little less elegant, in that it requires the use of the global extents object, but since it's wrapped up in a namespace and use of it does not result in any race conditions by virtue of its per-instance state, the usual gripes with global variables don't trouble us. multi_array is also resizeable.

Furthermore, the template can support Fortran ordering. In C the left-most dimension represents the largest stride in the array element memory; in Fortran the left-most dimension represents the smallest stride in the array element memory [Lind1994]. This is useful where one may wish to interface with Fortran link-units. The custom array classes I introduce shortly do not do that, although it would be a pretty simple addition.

I've not tested the performance of the Fortran ordering because there's (currently) nothing to test it against, and I don't really think it's likely to bring up significant differences in any case; the costs of reversing the offset calculations should be the same for multi_array as for any other multi-dimensional array class, unless the authors of one of them had a brain-freeze.

33.2.4 fixed_array_1/2/3/4d

Not surprisingly, the motivating factors that made me originally choose to write my own multidimensional arrays as a suite of cooperating template classes, one for each dimensionality, are precisely the things I disagree with in the design of Boost's multi_array. Not that I'm not saying that the approach of multi_array is wrong, merely that for my multidimensional requirements, and for my way of thinking, it is suboptimal. No doubt the authors of multi_array might have equally legitimate misgivings about my approach.

There are four templates, with startlingly original names: fixed_array_1d, fixed_array_2d, fixed_array_3d, and fixed_array_4d.

They're called "fixed_array_*" because each type implements a specific, fixed dimensionality. Naturally, this lends itself to a much simpler implementation than writing a single template to cater for any numbers of dimensions. (Not to mention being more efficient, as we'll see in section 33.5.).

The other functionality that multi_array has over fixed_array is that it is resizable. Again, this is not something that would be too difficult to add to the fixed_array classes, and it's very unlikely it would affect the performance of non-resizing operations, but I've not done so because I've simply never had call to resize a multidimensional array at run time. I'd rather not have to address the conceptually challenging, albeit technically simple, issues of what to do with new or discarded elements and determining what represents a legal resize and what does not.

There are no higher templates because I've never needed any higher. To be honest, I've never used fixed_array_4d in anger, but thought it prudent to go one higher than my needs, in case it's needed; it wouldn't be terribly good to introduce these "great fast, multidimensional array classes" to a client's project, and then to extend the component suite on site, only to bring down the system with untested code. Such is not the way to further your reputation, or the cause of open-source software.

There are several significant features of the fixed_array templates. First, each template maintains a pointer to a single-dimension block that contains its N-dimensional contents. We'll see how this is made to work safely and efficiently shortly.

Second, each class has a dimension_element_type member type. For the one-dimension case, this is the same as the value_type. However, for higher dimensions, it is a type defined from the template of the next lower dimension, as shown in Listing 33.1.

Listing 33.1.



template< typename T          // Value type


        , typename A = . . .  // Allocator


        , typename P = . . .  // Construction policy


        , bool     R = true   // Owns data?


        >


class fixed_array_3d


  : public A


{


public:


  typedef fixed_array_3d<T, A, P, R>     class_type;


  typedef fixed_array_2d<T, A, P, false> dimension_element_type;


  typedef A                              allocator_type;


  . . .


private:


  fixed_array_3d( pointer data, index_type d0, index_type d1


                                             , index_type d2);


public:


  fixed_array_3d(index_type d0, index_type d1, index_type d2);


  fixed_array_3d(index_type d0, index_type d1, index_type d2


                                         , value_type const &t);


  ~fixed_array_3d();


  . . .

Third, the begin() and end() methods return iterators that represent the bounding range of the entire collection of controlled elements. That means that you can apply the same algorithm, for example, for_each(), to any of the array classes, irrespective of dimensionality. If you want to treat the arrays in per-dimension blocks that's easily achieved by taking a slice and applying standard algorithms to the slice.

Fourth, robustness and maintainability are maximized by using the same templates for the subdimensional slices as for the whole arrays. This means that all the subscript calculations, creation of new subdimensional slices, enumeration, and all other methods are shared. The only difference in behavior is how the memory is allocated and managed. If the ownership template parameter—R in the above declaration—is true then the standard constructor allocates the memory, and the destructor releases it. If it is false, then the slice constructor merely copies the pointer it's been given. Note that this behavior is selected at compile time, so there are no efficiency losses by testing for ownership at run time. Listing 33.2 shows two constructors and the destructor for fixed_array_3d, and also shows how static assertions are used to prevent misuse of the two flavors of the templates.

Listing 33.2.



template<typename T, typename A, typename P, bool R>


fixed_array_3d<T, A, P, R>::fixed_array_3d( pointer data


                                          , index_type d0


                                          , index_type d1


                                          , index_type d2)


    : m_data(data)


    , m_d0(d0), m_d1(d1), m_d2(d2)


{


  STATIC_ASSERT(!R);


}


template <typename T, typename A, typename P, bool R>


fixed_array_3d<T, A, P, R>::fixed_array_3d( index_type       d0


                                          , index_type       d1


                                          , index_type       d2


                                          , value_type const &t)


    : m_data(allocator_type::allocate(d0 * d1 * d2, NULL))


    , m_d0(d0), m_d1(d1), m_d2(d2)


{


  STATIC_ASSERT(R);


  array_initialiser<T, A, P>::construct(*this, m_data, size());


}


template <typename T, typename A, typename P, bool R>


fixed_array_3d<T, A, P, R>::~fixed_array_3d()


{


  if(R)


  {


    array_initialiser<T, A, P>::destroy(*this, m_data, size());


    allocator_type::deallocate(m_data, size());


  }


}

Another aid to performance is that the construction policy parameter—P in the above code—is responsible for the determination for how to initialize the memory. This improves performance for basic, that is, POD, types in two ways, by using memset() rather than std::fill_n() to perform the initialization and by omitting the call of "destructors" prior to deallocation of the memory. Both these decisions are carried out inside the array_initialiser helper class, which is shared with the static_array templates we'll meet in section 33.3.2.^[4]

^[4] The source for all the fixed_array classes, the helper class, and the static_array classes discussed in the next section are all included on the CD.

The final aid to efficiency recognizes that even though the overloaded subscript operators are very efficient, there is still the non-zero cost of creating slice proxy objects involved. Therefore, each template is equipped with at() and at_unchecked() methods with the requisite number of parameters for each dimension.



template< . . . >


class fixed_array_3d


{


  . . .


public:


  reference at(index_type i0, index_type i1, index_type i2);


  reference at_unchecked(index_type i0, index_type i1


                                      , index_type i2);


  . . . // and const overloads

The at() methods follow the standard library practice of validating the index and throwing std::out_of_range if invalid. The at_unchecked() methods, however, provide a single unchecked method for accessing an element.^[5] Rather than having three offset calculations and generating two intermediate slice object instances, these methods conduct one offset calculation. Naturally, this proves to be a very efficient mechanism, as we'll see in section 33.5.

^[5] Like operator [], it performs DbC (see section 12.3) precondition testing via assertions in debug builds.

Thus, all statements in the inner loop in Listing 33.3 are equivalent in semantics, but different in cost.

Listing 33.3.



fixed_array_2d<int, . . .>  ar(10, 20); // Construct 10x20 array


int                         v = 0;


for(size_t i = 0; i < ar.dim0(); ++i)


{


  for(size_t j = 0; j < ar.dim1(); ++j, ++v)


  {


    ar[i][j]              = v;


    ar.at(i, j)           = v;


    ar.at_unchecked(i, j) = v;


  }


}