
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 31.  Return Value Lifetime


31.3. Solution 1—integer_to_string<>

The basis of all five techniques is the suite of integer_to_string() template functions,[2] which are overloaded to select the appropriate implementation template functions signed_integer_to_string() and unsigned_integer_to_string():

[2] These are part of the STLSoft libraries. The full implementation, along with the test programs, supporting headers and full results, is provided on the CD.



template <typename C>
C const *integer_to_string(C *buf, size_t cchBuf, sint8_t i)
{
  return signed_integer_to_string(buf, cchBuf, i);
}

. . . // and uint8_t, sint16_t, etc.

template <typename C>
C const *integer_to_string(C *buf, size_t cchBuf, uint64_t i)
{
  return unsigned_integer_to_string(buf, cchBuf, i);
}



Separate implementation functions are provided because signed conversion needs to take account of processing negative numbers and is therefore slightly less efficient than unsigned conversion, which does not. The unsigned version is shown in Listing 31.1.

Listing 31.1.


template< typename C
        , typename I
        >
const C *unsigned_integer_to_string(C       *buf,
                                    size_t  cchBuf,
                                    I       i)
{
  C *psz  = buf + cchBuf - 1;     // Set psz to last char
  *psz = 0;                       // Set terminating null
  do
  {
    unsigned    lsd = i % 10;     // Get least significant digit
    i /= 10;                      // Prepare for next most
                                  // significant digit
    --psz;                        // Move back
    *psz = get_digit_character<C>()[lsd]; // Write the digit
  } while(i != 0);
  return psz;
}



The functions work by writing backward into a caller-supplied character buffer and returning a pointer to the converted form within the buffer. The least significant digit is calculated and written to the current end point within the buffer. Each digit is converted into its equivalent character value via the lookup table contained within get_digit_character()[3] shown in Listing 31.2.

[3] As well as providing flexibility beyond the decimal digits (in anticipation of supporting the planned non-base-10 implementations), this implementation will also work with any character encoding scheme that does not have contiguous ordering of the characters '0' through '9'.

Listing 31.2.



template <typename C>
const C *get_digit_character()
{
  static const C  s_characters[19] =
  {
      '9', '8', '7', '6', '5', '4', '3', '2', '1'
    , '0'
    , '1', '2', '3', '4', '5', '6', '7', '8', '9'
  };
  static const C  *s_mid  =   s_characters + 9;
  return s_mid;
}



The value being converted is divided by 10, the end point is moved backward, and the cycle repeats until the value reaches 0, at which point the current end point is returned as a pointer to the converted string. Note that this may not be the start of the given buffer. For signed integers with a negative value, a minus sign is then prepended, and the function returns a pointer to that character. Using the functions is very simple:



uint64_t      i = . . .
wchar_t       buf[21];
wchar_t const *s = integer_to_string(buf, dimensionof(buf), i);



Because it does not use an internal buffer, the technique is thread-safe (see Chapter 10). It is type-safe and works with any integer type for which integer_to_string() overloads are defined, and with any character type. It is also very fast: as low as 10% of the cost of using sprintf() (see section 31.8).

However, there are two criticisms to be made. First, it is not very succinct: one needs to supply the length of the buffer along with the buffer pointer and the integer arguments.

Second, and more serious, it is possible to supply an incorrect value for the buffer length. This value, which represents length in characters rather than size in bytes, is used to determine the end point of the resultant string form, at which point the reverse writing begins. Since the implementation performs only a debug runtime assertion, it is possible for a buffer underrun to occur. It is incumbent on the programmer to provide a buffer of sufficient length; this is one of the ways in which the technique derives its extra speed. Naturally, you'll do like I do and use dimensionof() (see section 14.3) or an equivalent mechanism to avoid any problems, but the fragility is there nonetheless.
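The distinction between length in characters and size in bytes matters most for wide-character buffers. The sketch below uses a hypothetical element_count() helper as a stand-in for dimensionof() (whose actual definition is in section 14.3 and is not reproduced here) to show the two figures side by side for a wchar_t array.

```cpp
#include <cstddef>

// Hypothetical stand-in for dimensionof(): deduces the element count of a
// built-in array, and fails to compile if handed a pointer.
template <typename T, std::size_t N>
constexpr std::size_t element_count(T (&)[N])
{
  return N;
}

struct buffer_lengths
{
  std::size_t chars;  // what integer_to_string() expects as cchBuf
  std::size_t bytes;  // what sizeof reports
};

// Measures the same 21-element wide buffer both ways.
inline buffer_lengths measure_wide_buffer()
{
  wchar_t buf[21];
  return buffer_lengths{ element_count(buf), sizeof(buf) };
}
```

On any platform where wchar_t is wider than one byte, passing sizeof(buf) as the length would make the conversion start writing beyond the end of the array.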

Although the required sizes for buffers of the various types are both small and constant (see Table 31.1) and although developers who've used the function suite readily comprehend the idiom, it still has an uneasy sense of fragility. Furthermore, though no one who's used it has reported any error, there are many complaints regarding the verbosity of the client code.

Table 31.1. Required buffer sizes for integer conversion.

Type              Size (in chars), including null terminator   Example

8-bit signed      5                                            "-127"
8-bit unsigned    4                                            "255"
16-bit signed     7                                            "-32768"
16-bit unsigned   6                                            "65535"
32-bit signed     12                                           "-2147483648"
32-bit unsigned   11                                           "4294967295"
64-bit signed     21                                           "-9223372036854775808"
64-bit unsigned   21                                           "18446744073709551615"

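The sizes in Table 31.1 can be derived rather than memorized. As a sketch (the function name decimal_buffer_size is hypothetical and is not part of the library described here), std::numeric_limits<T>::digits10 gives the number of decimal digits that any value of the type can always hold, so one more digit, an optional sign, and the null terminator give the required length.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: number of characters, including the null terminator,
// needed to hold any value of integer type I in decimal form.
template <typename I>
constexpr std::size_t decimal_buffer_size()
{
  return static_cast<std::size_t>(
        std::numeric_limits<I>::digits10 + 1          // maximum digit count
      + (std::numeric_limits<I>::is_signed ? 1 : 0)   // room for '-'
      + 1);                                           // null terminator
}

// Compile-time checks against the rows of Table 31.1.
static_assert(decimal_buffer_size<std::int8_t>()   == 5,  "8-bit signed");
static_assert(decimal_buffer_size<std::uint8_t>()  == 4,  "8-bit unsigned");
static_assert(decimal_buffer_size<std::int16_t>()  == 7,  "16-bit signed");
static_assert(decimal_buffer_size<std::uint16_t>() == 6,  "16-bit unsigned");
static_assert(decimal_buffer_size<std::int32_t>()  == 12, "32-bit signed");
static_assert(decimal_buffer_size<std::uint32_t>() == 11, "32-bit unsigned");
static_assert(decimal_buffer_size<std::int64_t>()  == 21, "64-bit signed");
static_assert(decimal_buffer_size<std::uint64_t>() == 21, "64-bit unsigned");
```

A buffer could then be declared as char buf[decimal_buffer_size<std::int32_t>()], which keeps the length tied to the type being converted.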

There are a number of different options by which we can extend the technique to address these criticisms.

31.3.1 RVL

Solution 1 does not have any issues with respect to RVL-LS or RVL-PDP. It is susceptible to RVL-LV, but only in the same way as any other function—for example, memset(), strcpy(), and so forth—that returns a pointer that it is passed.
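The RVL-LV exposure arises because the returned pointer refers into the caller's buffer, so it is valid only for as long as that buffer is. A minimal sketch of the hazard, and of one way to sidestep it, follows. The names convert_sketch and to_std_string are hypothetical, and copying into a std::string gives up some of the speed that motivates the technique.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Simplified stand-in for the unsigned conversion (hypothetical name).
inline const char *convert_sketch(char *buf, std::size_t cchBuf,
                                  std::uint64_t i)
{
  char *psz = buf + cchBuf - 1;
  *psz = 0;
  do
  {
    --psz;
    *psz = static_cast<char>('0' + static_cast<unsigned>(i % 10));
    i /= 10;
  } while(i != 0);
  return psz;
}

// UNSAFE (shown only as a comment): the returned pointer would refer into
// 'buf', which ceases to exist when the function returns.
//
//   const char *dangling(std::uint64_t i)
//   {
//     char buf[21];
//     return convert_sketch(buf, sizeof(buf), i);
//   }

// Safe: the characters are copied out while the local buffer is still alive.
inline std::string to_std_string(std::uint64_t i)
{
  char buf[21];
  return std::string(convert_sketch(buf, sizeof(buf), i));
}
```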

