
Imperfect C++: Practical Solutions for Real-Life Programming
By Matthew Wilson
Chapter 32.  Memory


32.1. A Taxonomy of Memory

We talked about some of the details of (local and nonlocal) static memory in Chapter 11, but we've not looked in any detail at the range of memory mechanisms supported by C and C++.

32.1.1 Stack and Static Memory

Stack variables are allocated from the executing thread's stack, by adjustment of the stack pointer on entry to the scope in which they are declared. Static variables are fixed in program global memory, allocated by reservation of space in the global memory area. For the purposes of this section, I focus on the implications of the use of stack memory, although some of the issues discussed also apply to global memory.

Because the allotment of memory for global and stack variables is carried out at compile time, there are both advantages and disadvantages. The main advantage is that there is no "allocation" actually happening, merely a manipulation of pointers or addresses; to all intents and purposes, the memory already exists. Consequently, this form of memory allocation is extremely efficient; in fact it is the most efficient form of memory allocation. An additional minor advantage is that one can determine, at compile time, the size of the allocated memory, by use of the sizeof operator.

The downside is that the memory can only be of fixed, predetermined, size. (The slight exception to this is the alloca() technique, which we cover in section 32.2.1.) This is often perfectly acceptable when, for example, dealing with file-system entity names, which have a fixed maximum on many platforms. When writing such code one may simply declare a buffer of the maximum potential size, confident that passing it to any function will not result in buffer overwrites. However, when dealing with APIs whose functions may use buffers of any lengths (e.g., the Win32 Registry API), one can never guarantee to have a fixed buffer of sufficient size.[1]

[1] I am sure most of us have written RegXxx() code passing buffers of _MAX_PATH size, nonetheless!

As was mentioned in Chapter 11, static variables have their storage initialized to 0. Stack variables are not automatically initialized and will contain random values until explicitly initialized.

32.1.2 Stack Expansion

I said above that stack memory already exists. However, this is only true as far as the C/C++ notional execution environment is concerned. In reality, the stack memory for a process may be as ephemeral as any other part of its virtual address space. It is up to the operating system to ensure that the stack memory exists when you need it.

On Win32 operating systems,[2] stack memory is committed on use on a per-page basis [Rich1997]. What this means is that if the current area of stack memory has not yet been committed, and an instruction touches that memory, the operating system will (attempt to) commit the page and then reexecute the instruction accessing the now valid memory. The next uncommitted page is called the guard page and bears a special guard attribute to facilitate stack expansion. However, the guard page attribute is only attached to the first uncommitted page, so that the touching of any other uncommitted pages beyond that results in a simple, terminal, access violation. Other operating systems operate analogous mechanisms.

[2] On Solaris, a similar scheme is operated: the stack is memory mapped using the MAP_NORESERVE flag; virtual memory is not allocated until the page is used. There is only one redzone per stack.

The real-world problem for stack-memory (including alloca(); see section 32.2.1) is when the combined size of all variables within the local scope exceeds, or may potentially exceed, the system page threshold. In that instance the compiler is required to insert code in order to ensure that the stack memory is valid. The reason for this can be demonstrated with a simple example. Consider the following code:



void stack_func(size_t index)
{
    char    stack_buffer[4097];

    stack_buffer[index] = '\0';
}


The code demonstrates the possibility of skipping a page in the manner described. If the page size is 4096 and the first byte of stack_buffer falls on the first byte of the guard page, then it is possible, with index equal to 4096, to skip the guard page and access the next uncommitted page. Since that page will not have the guard attribute, this will cause an access violation, and your process will be terminated. Although this example is contrived, it is common to see scenarios where several local buffers are declared whose combined size exceeds one or more pages, making the skipping of the guard page much more likely.

In order to ensure that all pages between the current last committed page and any required by the next stack frame are valid, compilers must step in and shoulder some of the burden. The Visual C++ compiler inserts calls to the _chkstk() function, which touches any pages that might otherwise slip through this window, in the correct order, thereby bringing them into committed memory in a coherent manner. This insertion has two related disadvantages: it forces linking to the C run time library (which may be undesirable), and it incurs a modest performance cost in the calling and execution of _chkstk().

32.1.3 Heap Memory

Heap memory is the opposite of stack memory: it is obtained from the heap (sometimes referred to as the free store [Stro1994]), or from one of a set of heaps, at run time. Every heap API requires a function to allocate memory (e.g., malloc()) and, except where garbage collection is used, a corresponding function to return it (e.g., free()). The advantage of heap memory is that the buffer can be any practical size, within the limitations of the run time system (although some older memory APIs restrict the maximum size of individual buffers).

There are several disadvantages to the use of heap memory. First, heap allocations are considerably slower than stack/global allocations (due to the complexity of implementing the memory allocation schemes to reclaim and defragment freed memory). Second, it is possible that the request may not actually be satisfiable at run time, requiring client code to manage that eventuality (whether through exception handling or through testing the returned value against NULL). Third, you must explicitly return your memory when you no longer need it. If you forget to free allocated chunks, your process will likely die a slow death through memory exhaustion.

Additionally, heavy use of any heap can lead to fragmentation, whereby the free parts of the heap are spread around among many allocated sections. This can increase the likelihood of allocation failure, and can degrade performance due to the need to search through the free list to find areas of the appropriate size to match requests.
