7.2 Insertion Sort and Selection Sort · Foundations of Algorithms Using C++ Pseudocod

## 7.2 Insertion Sort and Selection Sort An ***insertion sort*** algorithm is one that sorts by inserting records in an existing sorted array. An example of a simple insertion sort is as follows. Assume that the keys in the first *i* – 1 array slots are sorted. Let *x* be the value of the key in the *i*th slot. Compare *x* in sequence with the key in the (*i* – 1)st slot, the one in the (*i* – 2)nd slot, etc., until a key is found that is smaller than *x*. Let *j* be the slot where that key is located. Move the keys in slots *j* + 1 through *i* – 1 to slots *j* + 2 through *i*, and insert *x* in the (*j* + 1)st slot. Repeat this process for *i* = 2 through *i* = *n*. [Figure 7.1](#ch07fig01) illustrates this sort. An algorithm for this sort follows: Algorithm 7.1: Insertion Sort**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Sort *n* keys in nondecreasing order. Inputs: positive integer *n*; array of keys of keys *S* indexed from 1 to *n*. Outputs: the array *S* containing the keys in nondecreasing order. ``` void insertionsort (int n, keytype S[]) { index i, j; keytype x; for (i=2; i <= n; i++){ x = S[i]; j = i-1; while (j > 0 && S[j] > x){ S[j + 1] = S[j]; j--; } S[j +1] = x } } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** [![Click To expand](https://box.kancloud.cn/a277f5688bdcee8e1e18a2de2621d0be_350x236.jpg)](fig7-1_0.jpg) Figure 7.1: An example illustrating what Insertion Sort does when *i* = 6 and *j* = 2 (top). The array before inserting, and (bottom) the insertion step. **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of [Algorithm 7.1](#ch07ex01) Worst-Case Time Complexity Analysis of Number of Comparisons of Keys (Insertion Sort)Basic operation: the comparison of *S* \[*j*\] with *x*. Input size: *n*, the number of keys to be sorted. For a given *i*, the comparison of *S*\[*j*\] with *x* is done most often when the **while** loop is exited because *j* becomes equal to 0. Assuming the second condition in an **&&** expression is not evaluated when the first condition is false, the comparison of *S* \[*j*\] with *x* is not done when *j* is 0. Therefore, this comparison is done at most *i* – 1 times for a given *i*. Because *i* ranges in value from 2 to *n*, the total number of comparisons is at most ![](https://box.kancloud.cn/27509cd906d13c53fc171ba29e29a6db_185x54.jpg) It is left as an exercise to show that, if the keys are originally in nonincreasing order in the array, this bound is achieved. Therefore, ![](https://box.kancloud.cn/6a794053eaa2a2ecb7270233fc51feda_152x42.jpg) **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of [Algorithm 7.1](#ch07ex01) Average-Case Time Complexity Analysis of Number of Comparisons of Keys (Insertion Sort)For a given *i*, there are *i* slots in which *x* can be inserted. That is, *x* can stay in the *i*th slot, go in the (*i* – 1)st slot, go in the (*i* – 2)nd slot, etc. Because we have not previously inspected *x* or used it in the algorithm, we have no reason to believe that it is more likely to be inserted in any one slot than in any other. Therefore, we assign equal probabilities to each of the first *i* slots. This means that each slot has the probability 1/*i*. The following list shows the number of comparisons done when *x* is inserted in each slot. ![](https://box.kancloud.cn/0c30a202f4f32641bf799812f7f9420a_259x153.jpg) The reason the number of comparisons is *i* – 1 and not *i* when *x* is inserted in the first slot is that the first condition in the expression controlling the **while** loop is false when *j* = 0, which means that the second condition is not evaluated. For a given *i*, the average number of comparsions needed to insert *x* is ![](https://box.kancloud.cn/0c868c016ee1f23021fb11bc6d32a07e_400x105.jpg) Therefore, the average number of comparsions needed to sort the array is ![](https://box.kancloud.cn/b4591da371a9fc966510ad97887d6e32_400x48.jpg) The last equality is obtained using the results of Examples A.1 and A.9 in [Appendix A](LiB0093.html#1281) and doing some algebraic manipulations. We have shown that ![](https://box.kancloud.cn/77a5646e7b1b33e55bb430597d7c9c9e_288x45.jpg) **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Next we analyze the extra usage. **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of [Algorithm 7.1](#ch07ex01) Analysis of Extra Space Usage (Insertion Sort)The only space usage that increases with *n* is the size of the input array *S*. Therefore, the algorithm is an in-place sort, and the extra space is in Θ (1). **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** In the exercises, you are asked to show that *the worst-case and average-case time complexities for the number of assignments of records done by Insertion Sort* are given by ![](https://box.kancloud.cn/e0bef2309a19d490ba8d30c2825b07af_400x35.jpg) Next we compare Insertion Sort with the other quadratic-time algorithm encountered in this text—namely, Exchange Sort ([Algorithm 1.3](LiB0008.html#32)). Recall that the every-case time complexity of the number of comparisons of keys in Exchange Sort is given by ![](https://box.kancloud.cn/ffd26f03493ea45e704d5434afc902b4_146x42.jpg) In the exercises, you are asked to show that *the worst-case and average-case time complexities for the number of assignments of records done by Exchange Sort* are given by ![](https://box.kancloud.cn/1e4e953c65dff0a1309acc11184d5ae7_400x44.jpg) Clearly, *Exchange Sort is an in-place sort*. [Table 7.1](#ch07table01) summarizes our results concerning Exchange Sort and Insertion Sort. We see from this table that in terms of comparisons of keys, Insertion Sort always performs at least as well as Exchange Sort and, on, the average, performs better. In terms of assignments of records, Insertion Sort, performs better both in the worst-case and on the average. Because both are in-place sorts, Insertion Sort is the better algorithm. Notice that another algorithm, Selection Sort, is also included in [Table 7.1](#ch07table01). This algorithm is a slight modification of Exchange Sort and removes one of the disadvantages of Exchange Sort. We present it next. **![](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Table 7.1: Analysis summary for Exchange Sort, Insertion Sort, and Selection Sort\[[\*](#ftn.ch07table01fnt01)\][![Click To expand](https://box.kancloud.cn/7fa6ee9d780bcd548191d11f695f2fb6_350x172.jpg)](figu273_3_0.jpg) \[[\*](#ch07table01fnt01)\]Entries are approximate. **![](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Algorithm 7.2: Selection Sort**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Sort *n* keys in nondecreasing order. Inputs: positive integer *n*; array of keys *S* indexed from 1 to *n*. Outputs: the array *S* containing the keys in nondecreasing order. ``` void selectionsort (int n, keytype S[]) { index i, j, smallest; for (i = 1; i <= n - 1; i++){ smallest = i; for (j = i + 1; j <= n; j++) if (S[j] < S[ smallest]) smallest = j; exchange S[i] and S[ smallest]; } } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Clearly, this algorithm has the same time complexity as Exchange Sort in terms of comparisons of keys. However, the assignments of records are significantly different. Instead of exchanging *S* \[*i*\] and *S* \[*j*\] every time *S* \[*j*\] is found to be smaller than *S* \[*i*\], as Exchange Sort does ( see [Algorithm 1.3](LiB0008.html#32)), Selection Sort simply keeps track of the index of the current smallest key among the keys in the *i*th through the *n*th slots. After determining that record, it exchanges it with the record in the *i*th slot. In this way, the smallest key is placed in the first slot after the first pass through the **for**-*i* loop, the second smallest key is put in the second slot after the second pass, and so on. The result is the same as that of Exchange Sort. However, by doing only one exchange at the bottom of the **for**-*i* loop, we have made the number of exchanges exactly *n* – 1. Because three assignments are needed to do an exchange, *the every-case time complexity of the number of assignments of records done by Selection Sort* is given by ![](https://box.kancloud.cn/169e85df4ebdc3c05fd0f120221eaf1d_140x21.jpg) Recall that the average-case number of assignments of records for Exchange Sort is about 3*n*2/4. Therefore, on the average, we've replaced quadratic time with linear time. Exchange Sort sometimes does better than Selection Sort. For example, if the records are already sorted, Exchange Sort does no assignments of records. How does Selection Sort compare with Insertion Sort? Look again at [Table 7.1](#ch07table01). In terms of comparisons of keys, Insertion Sort always performs at least as well as Selection Sort and, on the average, performs better. However, Selection Sort's time complexity in terms of assignments of records is linear whereas Insertion Sort's is quadratic. Recall that linear time is much faster than quadratic time when *n* is large. Therefore, if *n* is large and the records are big (so that the time to assign a record is costly), Selection Sort should perform better. Any sorting algorithm that selects records in order and puts them in their proper positions is called a ***selection sort***. This means that Exchange Sort is also a selection sort. Another selection sort, called Heapsort, is presented in [Section 7.6](LiB0057.html#670). [Algorithm 7.2](#ch07ex02), however, has been honored with the name "Selection Sort." The purpose of comparing Exchange Sort, Insertion Sort, and Selection Sort was to introduce a complete comparison of sorting algorithms as simply as possible. In practice, none of these algorithms is practical for extremely large instances because all of them are quadratic-time in both the average case and the worst case. Next we show that as long as we limit ourselves to algorithms in the same class as these three algorithms, it is not possible to improve on quadratic time as far as comparisons of keys are concerned.