12.2: Weighted Path Length

An important measure associated with binary trees is the weighted path length. This measure underlies the entire development in this chapter. Suppose a numeric weight is assigned to each node of a binary tree T. The meaning of these weights is dependent on the intended application. When the application of the binary tree is to text compression or text encoding, only the weights assigned to terminal nodes have meaning. Each terminal node corresponds to a different character that can appear in the text. The weight assigned to the node represents the relative number of occurrences of the character in the text. When the binary tree is used for record storage, each node stores a record with a particular key. The weight assigned to a node represents the frequency of search requests for the key corresponding to the node. These weights must be estimated or determined empirically for each specific application.

The weighted path length of a node in T is the product of its depth and its assigned weight. The weighted path length of T, W(T), is the sum of the weighted path lengths of all nodes of T. As an example, consider the binary tree in Figure 12.1(a), with assigned weights indicated for each node. Its weighted path length is

(1 ?3) + (2 ?17) + (2 ?10) + (3 ?15) + (3 ?0)

 + (3 ?20) + (4 ?31) + (4 ?4)

or 302. This was calculated directly from the definition of the weighted path length.

Another way to determine the weighted path length is to determine first the value of each node of the tree. The value of a node is the sum of the assigned weights of all nodes in the subtree with that node as root. Thus the root of a tree is assigned a value that is the sum of all the weights of the tree. The example would have values assigned to each node as indicated in Figure 12.1(b). The weighted path length of the tree is then calculated by adding up the values of the nodes.Thus, from Figure 12.1(b), the weighted path length is

100 + 67 + 30 + 15 + 35 + 20 + 31 + 4 = 302

It is easy to see that adding the node values must always yield the weighted path length, since the weight of any node is counted once for every subtree in which it appears. For example, the weight 31 is counted in the value attached to each node from the node to which it is assigned to the root. This is a total of four nodes, exactly the number of times that the node's weight should count in the weighted path length.

(a) Nodes Labeled with Assigned Weights

(b) Nodes Labeled with Values Derived by Adding Assigned Weights

Figure 12.1 Binary Tree T

The weighted path length can be viewed in still another way. It is given by the value of the root plus the weighted path length of the two subtrees of the root node. For the example shown in Figure 12.1(b), this is given by 100 + 152 + 50, where 100 is the root's value and 152 and 50 are the respective weighted path lengths of the left and right subtrees. Thus there are three ways in which the weighted path length of a binary tree can be calculated.