2.2 Mergesort · Foundations of Algorithms Using C++ Pseudocod

## 2.2 Mergesort A process related to sorting is merging. By ***two-way merging*** we mean combining two sorted arrays into one sorted array. By repeatedly applying the merging procedure, we can sort an array. For example, to sort an array of 16 items, we can divide it into two subarrays, each of size 8, sort the two subarrays, and then merge them to produce the sorted array. In the same way, each subarray of size 8 can be divided into two subarrays of size 4, and these subarrays can be sorted and merged. Eventually, the size of the subarrays will become 1, and an array of size 1 is trivially sorted. This procedure is called "Mergesort." Given an array with *n* items (for simplicity, let *n* be a power of 2), Mergesort involves the following steps: 1. *Divide* the array into two subarrays each with *n*/2 items. 2. *Conquer* (solve) each subarray by sorting it. Unless the array is sufficiently small, use recursion to do this. 3. *Combine* the solutions to the subarrays by merging them into a single sorted array. The following example illustrates these steps. Example 2.2**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose the array contains these numbers in sequence: > 27 10 12 25 13 15 22. 1. Divide the array: 27 10 12 20 and 25 13 15 22. 2. Sort each subarray: 10 12 20 27 and 13 15 22 25. 3. Merge the subarrays: 10 12 13 15 20 22 25 27. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** In Step 2 we think at the problem-solving level and assume that the solutions to the subarrays are available. To make matters more concrete, [Figure 2.2](#ch02fig02) illustrates the steps done by a human when sorting with Mergesort. The terminal condition occurs when an array of size 1 is reached; at that time, the merging begins. [![Click To expand](https://box.kancloud.cn/09491185f5e7402156d07592c28e549e_350x449.jpg)](fig2-2_0.jpg) Figure 2.2: The steps done by a human when searching with Binary Search. (*Note:x* = 18.) Algorithm 2.2: Mergesort**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Sort *n* keys in nondecreasing sequence. Inputs: positive integer *n*, array of keys *S* indexed from 1 to *n.* Outputs: the array *S* containing the keys in nondecreasing order. ``` void mergesort (int n, keytype S[]) { if (n>1) { const int h=[n/2], m = n - h; keytype U[1 ..h], V[1 ..m]; copy S[1] through S[h] to U[1] through U[h]; copy S[h+1] through S[n] to V[1] through V[m]; mergesort(h, U); mergesort(m, V); merge (h, m, U, V, S); } } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Before we can analyze Mergesort, we must write and analyze an algorithm that merges two sorted arrays. Algorithm 2.3: Merge**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Merge two sorted arrays into one sorted array. Inputs: positive integers *h* and *m*, array of sorted keys *U* indexed from 1 to *h*, array of sorted keys *V* indexed from 1 to *m.* Outputs: an array *S* indexed from 1 to *h + m* containing the keys in *U* and *V* in a single sorted array. ``` void merge (int h, int m, const keytype U[], const keytype V[], keytype S[]) { index i, j, k; i = 1; j = 1; k = 1; while (i = h && j <= m) { if (U[i] < V[j]) { S[k] = U[i]; i++; } else{ S[k] = V[j]; j++; } k++; } if (i>h) copy V[j] through V[m] to S[k] through S[h+m]; else copy U[i] through U[h] to S[k] through S[h+m]; } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** [Table 2.1](#ch02table01) illustrates how procedure *merge* works when merging two arrays of size 4. Table 2.1: An example of merging two arrays *U* and *V* into one array *S*\[[\*](#ftn.ch02fnt01)\]*k* *U* *V* *S* (Resutl) 1 **10** 12 20 27 **13** 15 22 25 10 2 10 **12** 20 27 **13** 15 22 25 10 12 3 10 12 **20** 27 **13** 15 22 25 10 12 13 4 10 12 **20** 27 13 **15** 22 25 10 12 13 15 5 10 12 **20** 27 13 15 **22** 25 10 12 13 15 20 6 10 12 20 **27** 13 15 **22** 25 10 12 13 15 20 22 7 10 12 20 **27** 13 15 22 **25** 10 12 13 15 20 22 25 — 10 12 20 27 13 15 22 25 10 12 13 15 20 22 25 27 ← Final values \[[\*](#ch02fnt01)\]Items compared are in boldface. Items just exchanged appear in squares. **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of [Algorithm 2.3](#ch02ex06) Worse-Case Time Complexity (Merge)As mentioned in [Section 1.3](LiB0017.html#172), in the case of algorithms that sort by comparing keys, the comparison instruction and the assignment instruction can each be considered the basic operation. Here we will consider the comparison instruction. When we discuss Mergesort further in [Chapter 7](LiB0052.html#627), we will consider the number of assignments. In this algorithm, the number of comparisons depends on both *h* and *m.* We therefore have the following: Basic operation: the comparison of *U\[i\]* with *V\[j\].* Input size: *h* and *m*, the number of items in each of the two input arrays. The worst case occurs when the loop is exited, because one of the indices—say, *i*—has reached its exit point *h*+1 whereas the other index *j* has reached *m*, 1 less than its exit point. For example, this can occur when the first *m* − 1 items in *V* are placed first in *S*, followed by all *h* items in *U*, at which time the loop is exited because *i* equals *h* + 1. Therefore, ![](https://box.kancloud.cn/d18581baa031cd4a188b570a4d77b88c_180x22.jpg) **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** We can now analyze Mergesort. **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of [Algorithm 2.2](#ch02ex05) Worst-Case Time Complexity (Mergesort)The basic operation is the comparison that takes place in *merge.* Because the number of comparisons increases with *h* and *m*, and *h* and *m* increase with *n*, we have the following: Basic operation: the comparison that takes place in *merge.* Input size: *n*, the number of items in the array *S.* The total number of comparisons is the sum of the number of comparisons in the recursive call to *mergesort* with *U* as the input, the number of comparisons in the recursive call to *mergesort* with *V* as the input, and the number of comparisons in the top-level call to *merge.* Therefore, ![](https://box.kancloud.cn/8f679c0a1455c83924280d26058b2d9a_400x48.jpg) We first analyze the case where *n* is a power of 2. In this case, ![](https://box.kancloud.cn/e064d0142e2cc0e082867f149d281863_227x116.jpg) Our expression for *W* (*n*) becomes ![](https://box.kancloud.cn/a9a375871ece17e41b463b0802d96a3a_276x78.jpg) When the input size is 1, the terminal condition is met and no merging is done. Therefore, *W*(1) is 0. We have established the recurrence ![](https://box.kancloud.cn/291d8bca36acfe712017877c606ae303_400x72.jpg) This recurrence is solved in Example B.19 in [Appendix B](LiB0102.html#1369). The solution is ![](https://box.kancloud.cn/09045a3b1c50f3d128721a8cf7ac1b59_295x22.jpg) For *n* not a power of 2, we will establish in the exercises that ![](https://box.kancloud.cn/2a295212423a7023f90ea5100586c930_319x38.jpg) where \[*y*\] and \[*y*\] are the smallest integer ≥ *y* and the largest integer ≤ *y*, respectively. It is hard to analyze this case exactly because of the floors (\[\]) and ceilings (\[\]). However, using an induction argument like the one in Example B.25 in [Appendix B](LiB0102.html#1369), it can be shown that *W(n)* is nondecreasing. Therefore, Theorem B.4 in that appendix implies that ![](https://box.kancloud.cn/e8203da0881339ff89b31ca0ed8150aa_153x22.jpg) **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** An ***in-place sort*** is a sorting algorithm that does not use any extra space beyond that needed to store the input. [Algorithm 2.2](#ch02ex05) is not an in-place sort because it uses the arrays *U* and *V* besides the input array *S.* If *U* and *V* are variable parameters (passed by address) in *merge*, a second copy of these arrays will not be created when *merge* is called. However, new arrays *U* and *V* will still be created each time *mergesort* is called. At the top level, the sum of the numbers of items in these two arrays is *n.* In the top-level recursive call, the sum of the numbers of items in the two arrays is about *n*/2; in the recursive call at the next level, the sum of the numbers of items in the two arrays is about *n*/4; and, in general, the sum of the numbers of items in the two arrays at each recursion level is about one-half of the sum at the previous level. Therefore, the total number of extra array items created is about *n*(1 + 1/2 + 1/4 + …) = 2*n.* [Algorithm 2.2](#ch02ex05) clearly illustrates the process of dividing an instance of a problem into smaller instances because two new arrays (smaller instances) are actually created from the input array (original instance). Therefore, this was a good way to introduce Mergesort and illustrate the divide-and-conquer approach. However, it is possible to reduce the amount of extra space to only one array containing *n* items. This is accomplished by doing much of the manipulation on the input array *S.* The following method for doing this is similar to the method used in [Algorithm 2.1](LiB0015.html#149) (Binary Search, Recursive). Algorithm 2.4: Mergesort 2**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Sort *n* keys in nondecreasing sequence. Inputs: positive integer *n*, array of keys *S* indexed from 1 to *n.* Outputs: the array *S* containing the keys in nondecreasing order. ``` void mergesort2 (index low, index high) { index mid; if (low < high) { mid = [(low + high)/2]; mergesort2(low, mid); mergesort2(mid + 1, high); merge2(low, mid, high); } } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Following our convention of making only variables, whose values can change in recursive calls, parameters to recursive routines, *n* and *S* are not parameters to procedure *mergesort2.* If the algorithm were implemented by defining *S* globally and *n* was the number of items in *S*, the top-level call to *mergesort2* would be as follows: *mergesort2*(1, *n*); The merging procedure that works with *mergesort2* follows. Algorithm 2.5: Merge 2**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Merge the two sorted subarrays of *S* created in Mergesort 2. Inputs: indices *low, mid*, and *high*, and the subarray of *S* indexed from *low* to *high.* The keys in array slots from *low* to *mid* are already sorted in nondecreasing order, as are the keys in array slots from *mid* + 1 to *high.* Outputs: the subarray of *S* indexed from *low* to *high* containing the keys in nondecreasing order. ``` void merge2 (index low, index mid, index high) { index i, j, k; keytype U[low .. high]; // A local array needed for the // merging i = low; j = mid + 1; k = low; while (i ≤ mid && j ≤ high){ if (S[i]S[j]){ U[k] = S[i]; i++; } else{ U[k] = S[j]; j++; } k++; } if (i > mid) move S[j] through S[high] to U[k] through U[high]; else move S[i] through S[mid] to U[k] through U[high]; move U[low] through U[high] to S[low] through S[high]; } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**