## 7.5 Quicksort Revisited

As a refresher, let's first repeat the algorithm for Quicksort:

```
void quicksort (index low, index high)
{
  index pivotpoint;

  if (high > low) {
     partition(low, high, pivotpoint);
     quicksort(low, pivotpoint - 1);
     quicksort(pivotpoint + 1, high);
  }
}
```

Although its worst-case time complexity is quadratic, we saw in [Section 2.4](LiB0066.html#752) that *the average-case time complexity of Quicksort's number of comparisons of keys* is given by

*A*(*n*) ≈ 1.38(*n* + 1) lg *n* ∈ Θ(*n* lg *n*),

which is not much worse than Mergesort. Quicksort has the advantage over Mergesort that no extra array is needed. However, it is still not an in-place sort because, while the algorithm is sorting the first subarray, the first and last indices of the other subarray must be stored on the stack of activation records. Unlike Mergesort, we have no guarantee that the array will always be split in the middle. In the worst case, *partition* may repeatedly split the array into an empty subarray on the right and a subarray with one less item on the left. In this way, *n* − 1 pairs of indices end up being stacked, which means that the worst-case extra space usage is in Θ(*n*). It is possible to modify Quicksort so that the extra space usage is at most about lg *n*.

Before showing this and other improvements of Quicksort, let's discuss the time complexity of the number of assignments of records done by Quicksort. In the exercises, you are asked to establish that the average number of exchanges performed by Quicksort is about 0.69(*n* + 1) lg *n*. Assuming that three assignments are done to perform one exchange, *the average-case time complexity of the number of assignments of records done by Quicksort* is given by

*A*(*n*) ≈ 3 × 0.69(*n* + 1) lg *n* ≈ 2.07(*n* + 1) lg *n* ∈ Θ(*n* lg *n*).

#### Improvements of the Basic Quicksort Algorithm

We can improve the basic Quicksort algorithm in the following five ways.
First, in procedure *quicksort*, we determine which subarray is larger and always stack that one while the other is sorted. The following is an analysis of the space used by this version of *quicksort*.

**Analysis of Extra Space Usage (Improved Quicksort)**

In this version, the worst-case space usage occurs when *partition* splits the array exactly in half each time, resulting in a stack depth about equal to lg *n*. Therefore, the worst-case space usage is in Θ(lg *n*) indices.

Second, as discussed in the exercises, there is a version of *partition* that cuts the average number of assignments of records significantly. For that version, *the average-case time complexity of the number of assignments of records done by Quicksort* is given by

*A*(*n*) ≈ 0.69(*n* + 1) lg *n* ∈ Θ(*n* lg *n*).

Third, each of the recursive calls in procedure *quicksort* causes *low*, *high*, and *pivotpoint* to be stacked. A good deal of this pushing and popping is unnecessary. While the first recursive call to *quicksort* is being processed, only the values of *pivotpoint* and *high* need to be saved on the stack. While the second recursive call is being processed, nothing needs to be saved. We can avoid the unnecessary operations by writing *quicksort* iteratively and manipulating the stack explicitly in the procedure. That is, we do explicit stacking instead of stacking by recursion. You are asked to do this in the exercises.

Fourth, as discussed in [Section 2.7](LiB0021.html#223), recursive algorithms such as Quicksort can be improved by determining a threshold value at which the algorithm calls an iterative algorithm instead of dividing an instance further.
Finally, as mentioned in the worst-case analysis of [Algorithm 2.6](LiB0018.html#177) (Quicksort), the algorithm is least efficient when the input array is already sorted. The closer the input array is to being sorted, the closer we are to this worst-case performance. Therefore, if there is reason to believe that the array may be close to already sorted, we can improve the performance by not always choosing the first item as the pivot item. One good strategy in this case is to choose the median of *S*\[*low*\], *S*\[⌊(*low* + *high*)/2⌋\], and *S*\[*high*\] as the pivot item. Of course, if we have no reason to believe that there is any particular structure in the input array, choosing any item as the pivot item is, on the average, just as good as choosing any other item. In this case, all we really gain by taking the median is the guarantee that one of the subarrays will not be empty (as long as the three values are distinct).