7.8.1 Decision Trees for Sorting Algorithms · Foundations of Algorithms Using C++ Pseudocod

## 7.8.1 Decision Trees for Sorting Algorithms Consider the following algorithm for sorting three keys. ``` void sortthree (keytype S[]) // S is indexed from 1 to 3. { keytype a, b, c; a = S [1]; b = S[2]; c = S[3]; if (a < b) if (b < c) S = a, b, c; // This means S[1] = a; S[2] = b; S[3] = c; else if (a < c) S = a, c, b; else S = c, a, b; else if (b < c) if (a < c) S = b, a, c; else S = b, c, a; else S = c, b, a; } ``` We can associate a binary tree with procedure *sortthree* as follows, We place the comparison of *a* and *b* at the root. The left child of the root contains the comparison that is made if *a < b*, whereas the right child contains the comparison that is made if *a ≤ b.* We proceed downward, creating nodes in the tree until all possible comparisons done by the algorithm are assigned nodes. The sorted keys are stored at the leaves. [Figure 7.11](#ch07fig11) shows the entire tree. This tree is called a ***decision tree*** because at each node a decision must be made as to which node to visit next. The action of procedure *sortthree* on a particular input corresponds to following the unique path from the root to a leaf, determined by that input. There is a leaf in the tree for every permutation of three keys, because the algorithm can sort every possible input of size 3. [![Click To expand](https://box.kancloud.cn/07890b472b82558a9d84d3423b2e12d7_350x213.jpg)](fig7-11_0.jpg) Figure 7.11: The decision tree corresponding to procedure sortthree. A decision tree is called ***valid*** for sorting *n* keys if, for each permutation of the *n* keys, there is a path from the root to a leaf that sorts that permutation. That is, it can sort every input of size *n.* For example, the decision tree in [Figure 7.11](#ch07fig11) is valid for sorting three keys, but would no longer be valid if we removed any branch from the tree. To every deterministic algorithm for sorting *n* keys, there corresponds at least one valid decision tree. The decision tree in [Figure 7.11](#ch07fig11) corresponds to procedure *sortthree*, and the decision tree in [Figure 7.12](#ch07fig12) corresponds to Exchange Sort when sorting three keys. (You are encouraged to verify this.) In that tree, *a, b*, and *c* are again the initial values of *S*\[1\], *S*\[2\], and *S*\[3\]. When a node contains, for example, the comparison "*c <* b," this does not mean that Exchange Sort compares *S* \[3\] with *S* \[2\] at that point; rather, it means that Exchange Sort compares the array item whose current value is *c* with the one whose current value is *b.* In the tree in [Figure 7.12](#ch07fig12), notice that the level–2 node containing the comparison "*b < a*" has no right child. The reason is that a "no" answer to that comparison, contradicts the answers obtained on the path leading to that node, which means that its right child could not be reached by making a consistent sequence of decisions starting at the root. Exchange Sort makes an unnecessary comparison at this [![Click To expand](https://box.kancloud.cn/0edbb33ab87059ec83836ea94db75806_350x188.jpg)](fig7-12_0.jpg) Figure 7.12: The decision tree corresponding to Exchange Sort when sorting three keys. [![Click To expand](https://box.kancloud.cn/8b2cc9b6e4ee569b57c32ceec1e8a1fd_350x297.jpg)](fig7-13_0.jpg) Figure 7.13: The trees in (a) and (b) have the same number of leaves, but the tree in (b) has a smaller *EPL* point, because Exchange Sort does not "know" that the answer to the question must be "yes." This often happens in suboptimal sorting algorithms. We say that a decision tree is ***pruned*** if every leaf can be reached from the root by making a consistent sequence of decisions. The decision tree in [Figure 7.12](#ch07fig12) is pruned, whereas it would not be pruned if we added a right child to the node just discussed, even though it would still be valid and would still correspond to Exchange Sort. Clearly, to every deterministic algorithm for sorting *n* keys there corresponds a pruned, valid decision tree. Therefore, we have the following lemma. Lemma 7.1**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**To every deterministic algorithm for sorting *n* distinct keys there corresponds a pruned, valid, binary decision tree containing exactly *n*! leaves. Proof: As just mentioned, there is a pruned, valid decision tree corresponding to any algorithm for sorting *n* keys. When all the keys are distinct, the result of a comparison is always "<" or ">" Therefore, each node in that tree has at most two children, which means that it is a binary tree. Next we show that it has *n*! leaves. Because there are *n*! different inputs that contain *n* distinct keys and because a decision tree is valid for sorting *n* distinct keys only if it has a leaf for every input, the tree has at least *n*! leaves. Because there is a unique path in the tree for each of the *n*! different inputs and because every leaf in a pruned decision tree must be reachable, the tree can have no more than *n*! leaves. Therefore, the tree has exactly *n*! leaves. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Using [Lemma 7.1](#ch07ex09), we can determine bounds for sorting *n* distinct keys by investigating binary trees with ! leaves. We do this next.