
Imperfect C++ Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 31.  Return Value Lifetime


31.5. Solution 3—Extending RVL

The advantages of the TSS-based approach of Solution 2 are that it is thread safe, works with any character type, and does not require a caller-supplied buffer. Naturally, it does not need a caller-supplied buffer length either, so there is no possibility of an insufficient buffer being passed through to integer_to_string().

One slight inconvenience is that there is no longer any character-based parameter from which the compiler can deduce the character type, which means that the template function must be explicitly parameterized (C++-98: 14.8.1):



uint64_t      i = . . .
wchar_t const *result = int_to_string<wchar_t>(i);



However, there is another, more significant, drawback, which we're going to examine, and attempt to address in Solution 3. Consider the following example, in light of the implementation of Solution 2:



printf("%s %s", int_to_string<char>(5)
              , int_to_string<char>(10));



With our current int_to_string() function implementation, we may get "5 5" or "10 10", depending on the order in which the compiler evaluates the two arguments, but there's no way we're going to get the intended "5 10". This is because the values returned by the two calls to int_to_string() are the same: a pointer to the thread-specific buffer for the particular combination of character and integer type.
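The aliasing can be seen in a minimal, portable sketch. It is not the Solution 2 implementation: the name single_buffer_to_string() is hypothetical, it handles only char and int, and it assumes C++11 thread_local in place of __declspec(thread).

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for Solution 2 (not the book's code): a single
// thread-specific buffer, declared with C++11 thread_local.
char const* single_buffer_to_string(int value)
{
    thread_local char s_buffer[21];   // ample for any int plus the terminator
    std::snprintf(s_buffer, sizeof(s_buffer), "%d", value);
    return s_buffer;                  // the same address on every call
}

// Returns true if the second call has clobbered the first call's result.
bool second_call_overwrites_first()
{
    char const* first  = single_buffer_to_string(5);
    char const* second = single_buffer_to_string(10);
    assert(first == second);          // both pointers name the one buffer
    return std::strcmp(first, "10") == 0;
}
```

Here the calls are sequenced, so the survivor is always "10". Inside a single printf() call the order of the two conversions is unspecified, which is why either value may be the one that is printed twice.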

This is a twist on the RVL-LS problem, which can also occur in subtler cases:



int some_func(int, char **);

printf("%s %d\n", int_to_string<char>(argc)
                , some_func(argc, argv));



If some_func() calls int_to_string<char>(int), whether directly or indirectly, then we're back to undefined behavior in our output.

For efficiency reasons the conversion functions return C-strings, rather than instances of std::basic_string or similar. The problem is that a C-string is not a value type; it is a pointer type whose value is the address of the integer's string representation laid out in memory. This problem is not unique to int_to_string(): any function that returns a pointer to a structure can suffer from this, irrespective of whether or not it is, like int_to_string(), thread safe.
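For contrast, here is a sketch of what a value-type return would look like. The names int_to_string_by_value() and format_pair() are hypothetical, and the sketch assumes C++11's std::to_string rather than the conversion functions of this chapter. Each call yields its own string object, and the temporaries live until the end of the full expression, so the two results cannot alias. The price is the allocation and copying that the C-string design was chosen to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical value-returning variant; std::to_string is C++11.
std::string int_to_string_by_value(int value)
{
    return std::to_string(value);
}

// Formats two integers within one expression. Each temporary std::string
// survives until the end of the snprintf() full expression.
std::string format_pair(int a, int b)
{
    char out[64];
    int const n = std::snprintf(out, sizeof(out), "%s %s",
                                int_to_string_by_value(a).c_str(),
                                int_to_string_by_value(b).c_str());
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(out));
    (void)n;                          // silences the unused warning under NDEBUG
    return out;
}
```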

31.5.1 Solving Intrathread RVL-LS?

So what can we do about it? Let's assume that we are adamant that we want to return pointers to C-strings. Clearly we want to be able to return distinct buffers from a parameterization of i2str_get_tss_buffer() when the corresponding parameterization of int_to_string() is called multiple times within a single expression. Unfortunately, I think that's pretty much impossible, or at least would have a heavy run-time cost.

However, we don't actually need to know whether successive calls are from within a single expression; one option is simply to make sure that the likelihood is very low. Because of the nature of integer to string conversion—that is, there are fixed maximum lengths to the converted string form—we can approximate "impossible" by changing the implementation of i2str_get_tss_buffer() to the following:[9]

[9] Note that this is only the implementation for the __declspec(thread) version. As described in section 31.4, __declspec(thread) is suitable only for a limited number of development scenarios, so you'd probably have to use another form of TSS. In performance terms, both the __declspec(thread) version of this solution and the one based on the Tss Library (see section 10.5.4) have performances that are indistinguishable from their single-buffer variants described in section 31.4.

Listing 31.7.


template< typename C
        , size_t   CCH
        >
C *i2str_get_tss_buffer()
{
  const size_t                      DEGREE  =   32;
  __declspec(thread) static C       s_buffers[DEGREE][CCH];
  __declspec(thread) static size_t  s_index;
  s_index = (s_index + 1) % DEGREE;
  return s_buffers[s_index];
}



By picking a number that we believe is large enough, we reduce the likelihood of overwrites. 32 buffers of thread-specific storage, each of size CCH (the size adequate for a converted integer), are declared along with a thread-specific indexing variable. Upon each call, the indexer is incremented, and is cycled back to 0 when it reaches 32. Thus, each of the 32 buffers is used in turn, this cycling occurring on a thread-specific basis.
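The cycling, and the point at which it stops protecting us, can be sketched portably. This is an illustration and not the listing's code: it assumes C++11 thread_local instead of __declspec(thread), fixes the character type to char, and makes DEGREE a template parameter so that a deliberately tiny degree of 4 shows the wraparound. The names ring_buffer_next(), ring_to_string() and wraps_after_degree_calls() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical portable analogue of i2str_get_tss_buffer(): hands out
// DEGREE thread-specific buffers in rotation.
template <std::size_t DEGREE>
char* ring_buffer_next()
{
    thread_local char        s_buffers[DEGREE][21];
    thread_local std::size_t s_index = 0;
    s_index = (s_index + 1) % DEGREE;
    return s_buffers[s_index];
}

template <std::size_t DEGREE>
char const* ring_to_string(int value)
{
    char* buffer = ring_buffer_next<DEGREE>();
    std::snprintf(buffer, 21, "%d", value);
    return buffer;
}

// Returns true if the first result survives DEGREE calls in total and is
// then overwritten by the call after that.
bool wraps_after_degree_calls()
{
    char const* first = ring_to_string<4>(1);
    ring_to_string<4>(2);
    ring_to_string<4>(3);
    ring_to_string<4>(4);
    bool const intact = std::strcmp(first, "1") == 0;
    char const* fifth = ring_to_string<4>(5);
    assert(fifth == first);           // the rotation has come full circle
    return intact && std::strcmp(first, "5") == 0;
}
```

With a degree of 32 the same overwrite happens on the thirty-third live result rather than the fifth; the failure is deferred, not removed.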

Naturally, 32 is a guess at the maximum number of integer-to-string conversions whose results are in use at the same time (remember that this is per integer type, i.e., there are 32 for uint32_t, 32 for int16_t, etc.), and represents a compromise between desired "safety" and the amount of thread-specific storage consumed. You would choose your own limit.

31.5.2 RVL

So we've ameliorated the RVL-LS problem. Sadly, we have not removed it, and I hope it is clear to you that it is theoretically impossible to do so. Of course we can practically remove it by selecting a sufficiently large degree of buffer array, but this is hackery gone nuts. Don't get me wrong; there are some circumstances where it's valid to go with practical, but not theoretical, correctness. I just don't think this is one of them. Interestingly, Bjarne Stroustrup discusses a similar use for this technique in a scenario in which failure to provide uniqueness would also represent a nonbenign condition. In his customary understated way, he observes that you'd be in trouble if you encountered conditions that precipitated nonuniqueness. I'd put it a lot more strongly than that.

I would suggest that this solution is actually less desirable than Solution 2, since at least in that case there is no attempt to give users of the function a false sense of security; any multiple use of the return value will produce erroneous results. That's preferable to using a library that promises that "you're unlikely to encounter an error." This solution must be rejected completely.[10]

[10] Actually, an approach such as this might be appropriate when the (rare) repeated use of an object would result in loss of efficiency, rather than breaking correctness, so it's not entirely without its place.

If we want to return a pointer to a buffer, it's looking like we're going to have to pass in the buffer ourselves.
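As a rough sketch of what passing in the buffer means for the problem discussed here: the function name, its signature, and the choice to return a null pointer on an insufficient buffer are all assumptions made for illustration, not the chapter's own interface. Because each caller owns its storage, two results within one expression cannot overwrite each other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical signature: converts into caller-supplied storage and
// returns nullptr if the buffer is too small for the result.
char const* int_to_string_into(char* buffer, std::size_t cch, int value)
{
    assert(buffer != nullptr);
    int const written = std::snprintf(buffer, cch, "%d", value);
    bool const fits = written >= 0 && static_cast<std::size_t>(written) < cch;
    return fits ? buffer : nullptr;
}

// Returns true if two conversions into separate buffers keep both values.
bool two_buffers_stay_distinct()
{
    char first[21];
    char second[21];
    char const* a = int_to_string_into(first, sizeof(first), 5);
    char const* b = int_to_string_into(second, sizeof(second), 10);
    return a != b && std::strcmp(a, "5") == 0 && std::strcmp(b, "10") == 0;
}
```

The cost is that the possibility of an insufficient buffer, which the TSS-based approach had removed, is back in the caller's hands.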

