1.8.1 Analyzing the Prime Example

The nonmathematically inclined reader may want to skip the following analysis of primes (Figure 1.3), which demonstrates how to go about informally analyzing a program and shows how trade-offs arise. Even such simple programs do not necessarily have simple analyses.

When a statement is said to take constant time, it means that the statement is always executed in the same amount of time, whatever the current values of the variables it processes. Clearly, the execution time of primes is given by the sum of the execution times of its three components. In what follows it is assumed that insert, belongs, and omit all take constant time (notice that for the array implementation for candidates (Section 1.3.2) this will be true, but set takes time proportional to n).

Create executes set once and insert exactly n - 1 times. Similarly, print executes its if statement n - 1 times. The amount of time the if statement takes depends on whether belongs is true or false but requires, at most, the time for belongs plus the time for printf.

Remove is more complex. Its initial assignment statement takes constant time, to which must be added the loop time. The loop is executed times, so the loop condition is evaluated times. Assuming, for simplicity, that sqrt (n) takes constant time C, then is the total time for the evaluation of the condition. If the loop task took constant time, then the total loop time would be the product of this constant and . Unfortunately delete's time varies, depending on the value of factor. Nonetheless, the total time for delete can still be derived.

Delete's own loop is executed for each multiple of factor between 2 x factor and m x factor, where m is the largest number of times that factor divides into n. m is the integer part of n/factor denoted by ?/FONT>n/factor?/FONT>. Hence delete's loop is executed ?/FONT>n/factor?/FONT> times for this value of factor. Since factor takes on values , the total number of delete's loop executions is

or

But 1/2 + 1/3 + . . . + 1/n is roughly the natural logarithm of n, denoted by ln n.Thus the entire time for the loop of delete is a constant (for omit and the assignment statement) times . The condition of this loop contributes roughly its own constant times . The initialization of nextmultiple in delete contributes its constant times .

Adding everything, we get a good execution time estimate of

where C1, C2, C3, and C4 come from the times for condition evaluation, assignment statement execution, and set, insert, omit, and belongs.

For large n, this expression is dominated by the term C4n1n n. In fact, for large enough values of n, the expression will be no greater than C n1n n for an appropriate value of C. We describe this concisely by saying the expression is O(n1n n), read "order n1n n"; this notation will be used throughout the text.

By the storage requirements of primes is meant the total storage required for its data structures. Except for a few variables, the storage is required primarily for candidates and will be determined by its implementation. The analysis reveals that the bulk of the execution time for primes is spent executing the loop in delete. Consequently, to improve its time we speed up the time of the loop itself, or reduce the number of times it needs to be executed. For the array implementation, the storage required will be proportional to n, and this is where the bulk of the storage required is needed.

Primes can be improved in a number of obvious ways. Except for 2, no primes are even. Hence all other even numbers can be eliminated from consideration. This means that factor need never take on even values, so it can be increased by 2 instead of 1 in remove and can also be initialized to 3. Delete can be changed so that nextmultiple takes on only odd multiples of factor. With a little more work, candidates can be cut to roughly half its size by eliminating even integers. This will necessitate changes to set as well as changes to the for loops in create and print, but it reduces the storage needed by about one-half and the time by more than one-half.

Perhaps a less obvious improvement is to modify the increase of factor in remove so that, instead of increasing by 1 (or 2), factor increases to the value of the next prime in candidates (see Exercise 8). To realize this saving fully, it is necessary to use a new data structure, the list or the balanced binary tree, for the collection. Lists are discussed in Chapter 3, balanced binary trees in Chapters 7 and 9. These would save not only time but storage as well. Borrowing a result from mathematics about the distribution of prime numbers, we can state that any version of primes that stores all the primes from 2 to n in a collection will need n/ln n storage locations, since there are about that many such primes. Thus we will run out of storage before we run out of time!

Primes may be written so that its execution time increases but its storage needs decrease, or vice versa. Thus time is traded for storage, as is often done. Primes represents one extreme where all the primes are saved. A version of primes in which no primes are saved is at the other extreme (see Exercise 10). This version would require significantly more time.