12.5.1 Greedy Binary Search Trees

One such algorithm produces greedy binary search trees. These trees are constructed by an algorithm using the greedy method. Unlike the Huffman algorithm, which is also based on the greedy method, they are not guaranteed to be optimal, but they are guaranteed to be nearly optimal. Other distinct, nearly optimal, tree constructions are found in Mehlhorn [1975, 1977] and in Horibe and Nemetz [1979]. A comparison of all these and other algorithms for the generation of nearly optimal binary search trees appears in Korsh [1982]. The proof that the following greedy method yields nearly optimal trees is in Korsh [1981].

In constructing a nearly optimal binary search tree by the greedy method, the programmer must ensure that the solution is a binary search tree. Looking at the optimal tree (Figure 12.12) found by the preceding section's algorithm will provide the idea behind the greedy tree construction. Think of the tree being generated by creating the subtrees shown in Figure 12.13, one after another, starting with the subtree with k4 at its root and ending with the final optimal tree. Each subtree has its value (the sum of its weights) associated with it.

Figure 12.13 Trees Created from Subtrees (Circled Numbers Are the Trees' Values)

Each subtree is formed by taking for its root a key that has not yet appeared and adjoining a left and right subtree. These subtrees may correspond to an extended node (with weight some ai) or to an already produced subtree. The value of the subtree formed is given by the bi of its key plus the value of its left and right subtrees. These subtrees must be selected in a constrained fashion, or the resultant tree will not be a binary search tree. These constraints will be delineated shortly, but for now, the main point is that the weighted path of the optimal tree is the sum of the values of each of the subtrees. The greedy trees are built by always selecting the next subtree to generate as the one with smallest value from among all those satisfying the constraints. This is exactly how the Huffman construction works, except there are no constraints on the choice of subtrees since the resultant tree need not be a binary search tree.