
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 27.  Subscript Operators


27.2. Handling Errors

There are two things we need to do with indexing errors: detect them and handle them. But before we decide how to do either of these things, we need to decide whether to do them at all.

The standard library deque, basic_string, and vector sequence containers provide two ways of getting at individual elements: the subscript operator(s) and the at() method(s).[2] Given an invalid index, at() will throw a std::out_of_range exception, whereas the subscript operator has "undefined behavior," which usually means that it does no checking.[3] In other words, it's your responsibility to pass a valid index to the subscript operator.

[2] There are const and non-const versions of each. There are also front() and back() methods, which have the same behavior as the subscript operators.

[3] I'm not aware of a standard library implementation that provides equivalent behavior for subscript operator(s) and at() method(s), but since the behavior of the subscript operator is undefined for an invalid index, you'd be perfectly within your rights to do so in your own containers.
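The difference between the two access paths can be seen with std::vector. What follows is a minimal sketch; the helper name at_throws is hypothetical, introduced only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether v.at(index) throws std::out_of_range.
bool at_throws(std::vector<double> const &v, std::size_t index)
{
  try
  {
    (void)v.at(index);  // checked access: validates index against size()
    return false;
  }
  catch(std::out_of_range const &)
  {
    return true;
  }
}
```

For a three-element vector, at_throws(v, 2) returns false and at_throws(v, 3) returns true. The corresponding expression v[3] would compile just as happily, but its behavior is undefined, so it is deliberately not exercised here.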

Clearly, there is a trade-off here. Detecting the error will incur a performance cost. Not detecting it places the burden on users of the classes to ensure that they supply valid indexes. (We'll also see an interesting twist on this tale in Chapter 33 when we need to efficiently acquire the elements of multidimensional array classes.)

For my part, I agree with the approach taken by the standard library and prefer to have the subscript operator maximally efficient. Naturally, even if I didn't agree, structural conformance (see section 20.9) would have molded the expectations of users of my containers such that not conforming to the STL container semantics would only result in their misuse, followed swiftly by disuse.

But there's more to it than throwing in at() and doing nothing in operator [](). We can use an assertion to check in operator []() in debug mode, in which we're not concerned with the performance costs:



double &DoubleContainer::operator [](size_type index)
{
  assert(index < size());
  return m_buffer[index];
}
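To place that subscript operator in context, here is a minimal sketch of a container offering both access paths. Only the names DoubleContainer, size_type, size(), and m_buffer come from the listing above; the constructor, the use of std::vector for storage, and the exception message are assumptions made so the example compiles on its own:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class DoubleContainer
{
public:
  typedef std::size_t size_type;

  // Assumption: the container is created with n zero-initialized elements.
  explicit DoubleContainer(size_type n)
    : m_buffer(n, 0.0)
  {}

  size_type size() const { return m_buffer.size(); }

  // Checked only in debug builds; the assert vanishes when NDEBUG is defined.
  double &operator [](size_type index)
  {
    assert(index < size());
    return m_buffer[index];
  }

  // Checked in every build; throws on an invalid index.
  double &at(size_type index)
  {
    if(!(index < size()))
    {
      throw std::out_of_range("DoubleContainer::at: invalid index");
    }
    return m_buffer[index];
  }

private:
  std::vector<double> m_buffer; // assumption: storage type is not given in the text
};
```

With this arrangement, callers who want guaranteed detection use at() and pay for it in every build, while callers of the subscript operator pay only in debug builds.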



Look through the implementation of your favorite compiler vendor's standard library, however, and you're very unlikely to see such a thing. I think the reason for this is that one often wishes to pass an asymmetric range representing the managed sequence to an external function, as in:



void DumpDoubles(double const *first, double const *last);
. . .
DoubleContainer dc;



The most syntactically succinct and, in my opinion, elegant way to do this is as follows:



DumpDoubles(&dc[0], &dc[dc.size()]);



There are other ways to do it, such as the turgid:



DumpDoubles(&dc[0], &dc[0] + dc.size());



or, the indigestible:



DumpDoubles(&dc[0], &(&dc[0])[dc.size()]);



but I'd not wish to see either of those. Hence, the reason I suspect standard library containers eschew assertion-based index validation is to avoid mistakenly reporting errors in one-off-the-end indexing. Since the second address is never dereferenced, the index is not actually invalid.

You might wonder why they don't check that the index is either valid, or equal to the one-off-the-end index, and I've wondered this myself. When writing container libraries, I always do exactly that, as in:[4]

[4] The only reason I don't write index <= size() is that I'm obsessed with only using the < operator. Reducing one's dependence down to this one operator for all manner of comparison operations can be important in a minority of cases. Alas, I've just absorbed the rule and forgotten the motivating cases, as is my wont, so you'll have to research this yourself if you're motivated to find them.



double &DoubleContainer::operator [](size_type index)
{
  assert(!(size() < index));
  return m_buffer[index];
}



Tellingly, CodeWarrior's excellent MSL standard library implementation has a check in its basic_string::operator []() that provides this very semantic, so I figure I'm in good company.
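The relaxed check and the succinct range-passing idiom can be tried together in the following sketch. It rests on several assumptions not found in the text: the constructor, the std::vector storage, and in particular the one spare element allocated past the end, which is there so that forming a reference to the one-off-the-end slot is unambiguously well defined in this example. SumDoubles() is a hypothetical stand-in for DumpDoubles() that returns a value that can be checked:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class DoubleContainer
{
public:
  typedef std::size_t size_type;

  explicit DoubleContainer(size_type n)
    : m_size(n)
    , m_buffer(n + 1, 0.0) // assumption: one spare slot past the end
  {}

  size_type size() const { return m_size; }

  double &operator [](size_type index)
  {
    // Accepts valid indexes and the one-off-the-end index, using only <.
    assert(!(size() < index));
    return m_buffer[index];
  }

private:
  size_type           m_size;
  std::vector<double> m_buffer;
};

// Hypothetical stand-in for DumpDoubles(): sums the range [first, last).
double SumDoubles(double const *first, double const *last)
{
  double total = 0.0;
  for(; first != last; ++first)
  {
    total += *first; // last itself is never dereferenced
  }
  return total;
}
```

A call such as SumDoubles(&dc[0], &dc[dc.size()]) then passes the assertion in debug builds, whereas an index of dc.size() + 1 would trip it.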

27.2.1. Subscript Operators versus Iterators

As you hopefully well know [Meye2001], it is illegitimate to use the begin() method of a sequence container to attempt to obtain a pointer to the managed sequence of elements. Although it may work with some containers, for example, vector, on most implementations, it is not guaranteed to work with all, since the sequence containers are free to provide iterators of class types.

In the above example, the DumpDoubles() function takes a pointer range. This is a conceptually different beast from one that takes an iterator range, which, assuming DoubleContainer provided a const_iterator member type, would be expressed as:



void DumpDoubles( DoubleContainer::const_iterator first
                , DoubleContainer::const_iterator last);



The confusion arises when the const_iterator is defined as value_type const* (that is, double const*), since there will, in that case, be only one DumpDoubles(). This is an inconsistency that you, as the user of DoubleContainer, will have to live with. Just don't be seduced into using begin() (and end()) when you are conceptually dealing with pointers, rather than iterators. It will give you bad habits that will turn on you at inopportune moments (see section 14.2.2).

By the way, use of the end() method does not have the same potential problems as &c[c.size()], since end() is explicitly intended to refer to the one-off-the-end element.
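The distinction between the two kinds of range can be sketched with standard containers. The function names SumPointers and SumIterators are hypothetical, chosen for this illustration; the first accepts only a pointer range, while the second is a template that accepts any iterator type, including the class-type iterators of std::deque:

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Pointer range: callers must supply genuine pointers into contiguous storage.
double SumPointers(double const *first, double const *last)
{
  double total = 0.0;
  for(; first != last; ++first)
  {
    total += *first;
  }
  return total;
}

// Iterator range: works with whatever iterator type the container provides.
template <typename I>
double SumIterators(I first, I last)
{
  double total = 0.0;
  for(; first != last; ++first)
  {
    total += *first;
  }
  return total;
}
```

SumIterators(d.begin(), d.end()) works for both a std::vector<double> and a std::deque<double>. SumPointers(d.begin(), d.end()) does not compile for the deque, because its iterators are not pointers; for the vector, the pointer range is obtained from the elements themselves, as in SumPointers(&v[0], &v[0] + v.size()), rather than from begin() and end().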

