## 2.7 Determining Thresholds
As discussed in [Section 2.1](LiB0015.html#145), recursion requires a fair amount of overhead in terms of computer time. If, for example, we are sorting only eight keys, is it really worth this overhead just so we can use a Θ(*n* lg *n*) algorithm instead of a Θ(*n*²) algorithm? Or perhaps, for such a small *n*, would Exchange Sort ([Algorithm 1.3](LiB0008.html#32)) be faster than our recursive Mergesort? We develop a method that determines for what values of *n* it is at least as fast to call an alternative algorithm as it is to divide the instance further. These values depend on the divide-and-conquer algorithm, the alternative algorithm, and the computer on which they are implemented. Ideally, we would like to find an ***optimal threshold value*** of *n*. This would be an instance size such that for any smaller instance it would be at least as fast to call the other algorithm as it would be to divide the instance further, and for any larger instance size it would be faster to divide the instance again. However, as we shall see, an optimal threshold value does not always exist. Even if our analysis does not yield an optimal threshold value, we can use the results of the analysis to pick a threshold value. We then modify the divide-and-conquer algorithm so that the instance is no longer divided once *n* reaches that threshold value; instead, the alternative algorithm is called. We have already seen the use of thresholds in [Algorithms 2.8](LiB0019.html#201), [2.9](LiB0020.html#214), and [2.10](LiB0020.html#218).
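To make this modification concrete, the following is a minimal C++ sketch of a thresholded Mergesort. It is our own illustration rather than the text's pseudocode; the names `THRESHOLD`, `exchangeSort`, `merge`, and `mergesortT` are ours, and the threshold value 128 anticipates the result of Example 2.7 below.

```cpp
#include <utility>
#include <vector>

// Once a subarray is this small, stop dividing and call the alternative sort.
const int THRESHOLD = 128;  // value derived for a hypothetical computer in Example 2.7

// Exchange Sort (quadratic): exchange out-of-order pairs in S[low..high].
void exchangeSort(std::vector<int>& S, int low, int high) {
    for (int i = low; i < high; i++)
        for (int j = i + 1; j <= high; j++)
            if (S[j] < S[i]) std::swap(S[i], S[j]);
}

// Merge the sorted subarrays S[low..mid] and S[mid+1..high].
void merge(std::vector<int>& S, int low, int mid, int high) {
    std::vector<int> U;
    U.reserve(high - low + 1);
    int i = low, j = mid + 1;
    while (i <= mid && j <= high)
        U.push_back(S[i] <= S[j] ? S[i++] : S[j++]);
    while (i <= mid)  U.push_back(S[i++]);
    while (j <= high) U.push_back(S[j++]);
    for (int k = low; k <= high; k++) S[k] = U[k - low];
}

// Mergesort that divides the instance only while it is larger than the threshold.
void mergesortT(std::vector<int>& S, int low, int high) {
    if (high - low + 1 <= THRESHOLD) {
        exchangeSort(S, low, high);   // small instance: call the alternative algorithm
    } else {
        int mid = (low + high) / 2;   // large instance: divide further
        mergesortT(S, low, mid);
        mergesortT(S, mid + 1, high);
        merge(S, low, mid, high);
    }
}
```

The only change from an ordinary recursive Mergesort is the test against `THRESHOLD`; the analysis that follows shows how to choose that value for a particular computer.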
To determine a threshold, we must consider the computer on which the algorithm is implemented. This technique is illustrated using Mergesort and Exchange Sort. We use Mergesort's worst-case time complexity in this analysis. So we are actually trying to optimize the worst-case behavior. When analyzing Mergesort, we determined that the worst case is given by the following recurrence:

$$W(n) = W\left(\lfloor n/2 \rfloor\right) + W\left(\lceil n/2 \rceil\right) + n - 1$$
Let's assume that we are implementing Mergesort 2 ([Algorithm 2.4](LiB0016.html#169)). Suppose that on the computer of interest the time Mergesort 2 takes to divide and recombine an instance of size *n* is 32*n* μs, where μs stands for microseconds. The time to divide and recombine the instance includes the time to compute the value of *mid*, the time to do the stack operations for the two recursive calls, and the time to merge the two subarrays. Because there are several components to the division and recombination time, it is unlikely that the total time would simply be a constant times *n*. However, assume that this is the case to keep things as simple as possible. Because the term *n*−1 in the recurrence for *W*(*n*) is the recombination time, it is included in the time 32*n* μs. Therefore, for this computer, we have

$$W(n) = W\left(\lfloor n/2 \rfloor\right) + W\left(\lceil n/2 \rceil\right) + 32n \ \mu\text{s}$$
for Mergesort 2. Because only a terminal condition check is done when the input size is 1, we assume that *W*(1) is essentially 0. For simplicity, we initially limit our discussion to *n* being a power of 2. In this case we have the following recurrence:

$$W(n) = 2W(n/2) + 32n \qquad W(1) = 0$$
The techniques in [Appendix B](LiB0102.html#1369) can be used to solve this recurrence. The solution is

$$W(n) = 32n \lg n \ \mu\text{s}$$
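This solution can be checked by the substitution technique of [Appendix B](LiB0102.html#1369): unrolling the recurrence one level at a time shows that each of the lg *n* levels of the recursion contributes 32*n* μs, and the bottom level contributes nothing because *W*(1) = 0:

$$W(n) = 2W(n/2) + 32n = 4W(n/4) + 2 \cdot 32n = \cdots = 2^{\lg n}\,W(1) + 32n \lg n = 32n \lg n$$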
Suppose that on this same computer Exchange Sort takes exactly

$$\frac{n(n-1)}{2} \ \mu\text{s}$$
to sort an instance of size *n*. Sometimes students erroneously believe that the optimal point where Mergesort 2 should call Exchange Sort can now be found by solving the inequality

$$\frac{n(n-1)}{2} < 32n \lg n$$
The solution is

$$n < 591 \quad \text{(approximately)}$$
Students sometimes believe that it is optimal to call Exchange Sort when *n* < 591 and to call Mergesort 2 otherwise. This analysis is only approximate because we base it on *n* being a power of 2. But more importantly, it is *incorrect*, because it only tells us that if we use Mergesort 2 and keep dividing until *n* = 1, then Exchange Sort is better for *n* < 591. We want to use Mergesort 2 and keep dividing only until it is better to call Exchange Sort than to divide the instance further. This is not the same as dividing until *n* = 1, and therefore the point at which we call Exchange Sort should be less than 591. That this value should be less than 591 is a bit hard to grasp in the abstract, but the following concrete example, which determines the point at which it is more efficient to call Exchange Sort than to divide the instance further, should make the matter clear. From now on, we no longer limit our considerations to *n* being a power of 2.
**Example 2.7** We determine the optimal threshold for [Algorithm 2.4](LiB0016.html#169) (Mergesort 2) when calling [Algorithm 1.3](LiB0008.html#32) (Exchange Sort). Suppose we modify Mergesort 2 so that Exchange Sort is called when *n* ≤ *t* for some threshold *t*. Assuming the hypothetical computer just discussed, for this version of Mergesort 2,

$$W(n) = \begin{cases} \dfrac{n(n-1)}{2} & \text{if } n \le t \\[6pt] W\left(\lfloor n/2 \rfloor\right) + W\left(\lceil n/2 \rceil\right) + 32n & \text{if } n > t \end{cases} \tag{2.5}$$
We want to determine the optimal value of *t*. That value is the one for which the top and bottom expressions in [Equality 2.5](#ch02eqn05) are equal, because this is the point where calling Exchange Sort is as efficient as dividing the instance further. Therefore, to determine the optimal value of *t*, we must solve

$$\frac{t(t-1)}{2} = W\left(\lfloor t/2 \rfloor\right) + W\left(\lceil t/2 \rceil\right) + 32t \tag{2.6}$$
Because ⌊*t*/2⌋ and ⌈*t*/2⌉ are both less than or equal to *t*, the execution time is given by the top expression in [Equality 2.5](#ch02eqn05) if the instance has either of these input sizes. Therefore,

$$W\left(\lfloor t/2 \rfloor\right) = \frac{\lfloor t/2 \rfloor \left(\lfloor t/2 \rfloor - 1\right)}{2} \qquad \text{and} \qquad W\left(\lceil t/2 \rceil\right) = \frac{\lceil t/2 \rceil \left(\lceil t/2 \rceil - 1\right)}{2}$$
Substituting these equalities into [Equation 2.6](#ch02eqn06) yields

$$\frac{t(t-1)}{2} = \frac{\lfloor t/2 \rfloor \left(\lfloor t/2 \rfloor - 1\right)}{2} + \frac{\lceil t/2 \rceil \left(\lceil t/2 \rceil - 1\right)}{2} + 32t \tag{2.7}$$
In general, in an equation with floors and ceilings, we can obtain a different solution when we insert an odd value for *t* than when we insert an even value for *t*. This is the reason there is not always an optimal threshold value. Such a case is investigated next. In this case, however, if we insert an even value for *t*, which is accomplished by setting ⌊*t*/2⌋ and ⌈*t*/2⌉ both equal to *t*/2, and solve [Equation 2.7](#ch02eqn07), we obtain

$$\frac{t(t-1)}{2} = 2 \cdot \frac{(t/2)\left(t/2 - 1\right)}{2} + 32t, \quad \text{which simplifies to} \quad \frac{t^2}{4} = 32t, \quad \text{and so} \quad t = 128$$
If we insert an odd value for *t*, which is accomplished by setting ⌊*t*/2⌋ equal to (*t* − 1)/2 and ⌈*t*/2⌉ equal to (*t* + 1)/2, and solve [Equation 2.7](#ch02eqn07), we obtain

$$t^2 - 128t - 1 = 0, \quad \text{whose positive solution is} \quad t = 64 + \sqrt{4097} \approx 128.008$$
Because no integer lies strictly between 128 and 128.008, the even and odd cases agree, and we have an optimal threshold value of 128.
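A quick numeric check makes the crossover visible. The following sketch (our own illustration, using only the costs assumed in this example) compares, for candidate thresholds near 128, the cost of Exchange-Sorting *t* keys directly against the cost of dividing once more and Exchange-Sorting the two halves:

```cpp
#include <cstdio>

// Cost model from Example 2.7, in microseconds:
//   Exchange Sort on n keys:          n(n-1)/2
//   dividing and recombining n keys:  32n
double exchange(double n) { return n * (n - 1) / 2; }

int main() {
    for (int t = 125; t <= 131; t++) {
        double direct     = exchange(t);          // call Exchange Sort now
        double divideOnce = exchange(t / 2)       // sort floor(t/2) keys
                          + exchange(t - t / 2)   // sort ceil(t/2) keys
                          + 32.0 * t;             // divide and recombine
        std::printf("t = %3d   direct = %7.1f   divide once = %7.1f   -> %s\n",
                    t, direct, divideOnce,
                    direct <= divideOnce ? "call Exchange Sort" : "divide");
    }
}
```

Within the printed range, every *t* ≤ 128 favors the direct call and every *t* ≥ 129 favors dividing once more, in agreement with the optimal threshold value of 128.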
Next we give an example where there is no optimal threshold value.
**Example 2.8** Suppose for a given divide-and-conquer algorithm running on a particular computer we determine that

$$W(n) = 3W\left(\lceil n/2 \rceil\right) + 16n \ \mu\text{s}$$
where 16*n* μs is the time needed to divide and recombine an instance of size *n*. Suppose on the same computer a certain iterative algorithm takes *n*² μs to process an instance of size *n*. To determine the value *t* at which we should call the iterative algorithm, we need to solve

$$t^2 = 3W\left(\lceil t/2 \rceil\right) + 16t$$
Because ⌈*t*/2⌉ ≤ *t*, the iterative algorithm is called when the input has this size, which means that

$$W\left(\lceil t/2 \rceil\right) = \lceil t/2 \rceil^2$$
Therefore, we need to solve

$$t^2 = 3\lceil t/2 \rceil^2 + 16t$$
If we substitute an even value for *t* (by setting ⌈*t*/2⌉ = *t*/2) and solve, we get

$$t^2 = 3\left(\frac{t}{2}\right)^2 + 16t, \quad \text{which simplifies to} \quad \frac{t^2}{4} = 16t, \quad \text{and so} \quad t = 64$$
If we substitute an odd value for *t* (by setting ⌈*t*/2⌉ = (*t* + 1)/2) and solve, we get

$$t^2 = 3\left(\frac{t+1}{2}\right)^2 + 16t, \quad \text{which gives} \quad t^2 - 70t - 3 = 0, \quad \text{and so} \quad t \approx 70$$
Because the two values of *t* are not equal, there is no optimal threshold value. This means that if the size of an instance is an even integer between 64 and 70, it is more efficient to divide the instance one more time, whereas if the size is an odd integer between 64 and 70, it is more efficient to call the iterative algorithm. When the size is less than 64, it is always more efficient to call the iterative algorithm. When the size is greater than 70, it is always more efficient to divide the instance again. [Table 2.5](#ch02table05) illustrates that this is so.
**Table 2.5**: Various instance sizes illustrating that the threshold is 64 for *n* even and 70 for *n* odd in Example 2.8

| *n* | *n*² | 3⌈*n*/2⌉² + 16*n* |
| --- | --- | --- |
| 62 | 3844 | 3875 |
| 63 | 3969 | 4080 |
| 64 | 4096 | 4096 |
| 65 | 4225 | 4307 |
| 68 | 4624 | 4556 |
| 69 | 4761 | 4779 |
| 70 | 4900 | 4795 |
| 71 | 5041 | 5024 |
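The table's entries can be reproduced directly from the assumed costs. The following sketch (our own illustration) prints, for each size in [Table 2.5](#ch02table05), the iterative cost *n*² and the cost 3⌈*n*/2⌉² + 16*n* of dividing one more time:

```cpp
#include <cstdio>

// Reproduce Table 2.5 under the cost model of Example 2.8, in microseconds:
// the iterative algorithm costs n^2, and dividing one more time costs
// 3*ceil(n/2)^2 + 16n (three subinstances handled iteratively, plus overhead).
int main() {
    const int sizes[] = {62, 63, 64, 65, 68, 69, 70, 71};
    for (int n : sizes) {
        long iterative  = (long)n * n;
        long half       = (n + 1) / 2;                 // ceil(n/2)
        long divideOnce = 3 * half * half + 16L * n;
        std::printf("%2d   %4ld   %4ld   -> %s\n", n, iterative, divideOnce,
                    iterative <= divideOnce ? "iterative" : "divide");
    }
}
```

The output agrees with the table: sizes up to 63, the odd sizes 65 and 69, and the boundary case 64 favor the iterative algorithm, while 68, 70, and 71 favor dividing again.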