## 1.2 The Importance of Developing Efficient Algorithms
Previously we mentioned that, regardless of how fast computers become or how cheap memory gets, efficiency will always remain an important consideration. Next we show why this is so by comparing two algorithms for the same problem.
### 1.2.1 Sequential Search Versus Binary Search
Earlier we mentioned that the approach used to find a name in the phone book is a modified binary search, and it is usually much faster than a sequential search. Next we compare algorithms for the two approaches to show how much faster the binary search is.
We have already written an algorithm that does a sequential search—namely, [Algorithm 1.1](LiB0008.html#27). An algorithm for doing a binary search of an array that is sorted in nondecreasing order is similar to thumbing back and forth in a phone book. That is, given that we are searching for *x*, the algorithm first compares *x* with the middle item of the array. If they are equal, the algorithm is done. If *x* is smaller than the middle item, then *x* must be in the first half of the array (if it is present at all), and the algorithm repeats the searching procedure on the first half of the array. (That is, *x* is compared with the middle item of the first half of the array. If they are equal, the algorithm is done, etc.) If *x* is larger than the middle item of the array, the search is repeated on the second half of the array. This procedure is repeated until *x* is found or it is determined that *x* is not in the array. An algorithm for this method follows.
Algorithm 1.5: Binary Search

- Problem: Determine whether *x* is in the sorted array *S* of *n* keys.
- Inputs: positive integer *n*, sorted (nondecreasing order) array of keys *S* indexed from 1 to *n*, a key *x.*
- Outputs: *location*, the location of *x* in *S* (0 if *x* is not in *S*).
```
void binsearch (int n,
                const keytype S[],
                keytype x,
                index& location)
{
   index low, high, mid;

   low = 1; high = n;
   location = 0;                       // location = 0 means x not yet found
   while (low <= high && location == 0){
      mid = ⌊(low + high)/2⌋;
      if (x == S[mid])
         location = mid;
      else if (x < S[mid])
         high = mid - 1;
      else
         low = mid + 1;
   }
}
```
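The pseudocode above translates almost directly into real C++. A minimal runnable sketch, assuming `int` keys and a `std::vector`, using 0-based indexing (unlike the 1-based pseudocode) and returning −1 instead of setting *location* to 0:

```cpp
#include <vector>

// Iterative binary search over a vector sorted in nondecreasing order.
// Returns the 0-based index of x, or -1 if x is not present.
int binsearch(const std::vector<int>& S, int x)
{
    int low = 0, high = static_cast<int>(S.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // same as (low + high)/2, but safe from overflow
        if (x == S[mid])
            return mid;                    // found: report the location
        else if (x < S[mid])
            high = mid - 1;                // x can only be in the first half
        else
            low = mid + 1;                 // x can only be in the second half
    }
    return -1;                             // x is not in S
}
```

The loop maintains the invariant that if *x* is anywhere in the array, it lies between indices `low` and `high`.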
Let's compare the work done by Sequential Search and Binary Search. For focus we will determine the number of comparisons done by each algorithm. If the array *S* contains 32 items and *x* is not in the array, [Algorithm 1.1](LiB0008.html#27) (Sequential Search) compares *x* with all 32 items before determining that *x* is not in the array. In general, Sequential Search does *n* comparisons to determine that *x* is not in an array of size *n.* It should be clear that this is the largest number of comparisons Sequential Search ever makes when searching an array of size *n.* That is, if *x* is in the array, the number of comparisons is no greater than *n.*
Next consider [Algorithm 1.5](#ch01ex10) (Binary Search). There are two comparisons of *x* with *S*\[*mid*\] in each pass through the while loop (except when *x* is found). In an efficient assembler language implementation of the algorithm, *x* would be compared with *S*\[*mid*\] only once in each pass, the result of that comparison would set the condition code, and the appropriate branch would take place based on the value of the condition code. This means that there would be only one comparison of *x* with *S*\[*mid*\] in each pass through the while loop. We will assume the algorithm is implemented in this manner. With this assumption, [Figure 1.1](#ch01fig01) shows that the algorithm does six comparisons when *x* is larger than all the items in an array of size 32. Notice that 6 = lg 32 + 1. By "lg" we mean log2, which is encountered so often in the analysis of algorithms that we reserve the special symbol lg for it. You should convince yourself that this is the largest number of comparisons Binary Search ever does. That is, if *x* is in the array, or if *x* is smaller than all the array items, or if *x* is between two array items, the number of comparisons is no greater than when *x* is larger than all the array items.
[![Click To expand](https://box.kancloud.cn/2a950cc7ee5d79061a1f50723ba1b8bb_350x50.jpg)](fig1-1_0.jpg)
Figure 1.1: The array items that Binary Search compares with *x* when *x* is larger than all the items in an array of size 32. The items are numbered according to the order in which they are compared.
Suppose we double the size of the array so that it contains 64 items. Binary Search does only one comparison more because the first comparison cuts the array in half, resulting in a subarray of size 32 that is searched. Therefore, when *x* is larger than all the items in an array of size 64, Binary Search does seven comparisons. Notice that 7 = lg 64 + 1. In general, each time we double the size of the array we add only one comparison. Therefore, if *n* is a power of 2 and *x* is larger than all the items in an array of size *n*, the number of comparisons done by Binary Search is lg *n* + 1.
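The lg *n* + 1 count can be checked empirically by instrumenting a 0-based C++ version of Binary Search with a comparison counter (the `count` parameter is our addition, not part of Algorithm 1.5), assuming the one-comparison-per-pass implementation described above:

```cpp
#include <vector>

// Binary search that also counts comparisons of x with S[mid],
// charging one comparison per pass through the loop.
int binsearch_count(const std::vector<int>& S, int x, int& count)
{
    count = 0;
    int low = 0, high = static_cast<int>(S.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        ++count;                           // one comparison of x with S[mid]
        if (x == S[mid]) return mid;
        else if (x < S[mid]) high = mid - 1;
        else low = mid + 1;
    }
    return -1;
}
```

When *x* is larger than every item, an array of size 32 yields 6 comparisons and an array of size 64 yields 7, matching lg *n* + 1.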
[Table 1.1](#ch01table01) compares the number of comparisons done by Sequential Search and Binary Search for various values of *n*, when *x* is larger than all the items in the array. When the array contains around 4 billion items (about the number of people in the world), Binary Search does only 33 comparisons, whereas Sequential Search compares *x* with all 4 billion items. Even if the computer were capable of completing one pass through the while loop in a nanosecond (one billionth of a second), Sequential Search would take 4 seconds to determine that *x* is not in the array, whereas Binary Search would make that determination almost instantaneously. This difference would be significant in an online application or if we needed to search for many items.
Table 1.1: The number of comparisons done by Sequential Search and Binary Search when *x* is larger than all the array items

| Array Size | Number of Comparisons by Sequential Search | Number of Comparisons by Binary Search |
| --- | --- | --- |
| 128 | 128 | 8 |
| 1,024 | 1,024 | 11 |
| 1,048,576 | 1,048,576 | 21 |
| 4,294,967,296 | 4,294,967,296 | 33 |
For convenience, we considered only arrays whose sizes were powers of 2 in the previous discussion of Binary Search. In [Chapter 2](LiB0014.html#141) we will return to Binary Search as an example of the divide-and-conquer approach, which is the focus of that chapter. At that time we will consider arrays whose sizes can be any positive integer.
As impressive as the searching example is, it is not absolutely compelling because Sequential Search still gets the job done in an amount of time tolerable to a human life span. Next we will look at an inferior algorithm that does not get the job done in a tolerable amount of time.
### 1.2.2 The Fibonacci Sequence
The algorithms discussed here compute the *n*th term of the Fibonacci sequence, which is defined recursively as follows:
$$
f_0 = 0, \qquad f_1 = 1, \qquad f_n = f_{n-1} + f_{n-2} \quad \text{for } n \ge 2.
$$
Computing the first few terms, we have
$$
f_2 = f_1 + f_0 = 1, \quad f_3 = f_2 + f_1 = 2, \quad f_4 = f_3 + f_2 = 3, \quad f_5 = f_4 + f_3 = 5.
$$
There are various applications of the Fibonacci sequence in computer science and mathematics. Because the Fibonacci sequence is defined recursively, we obtain the following recursive algorithm from the definition.
Algorithm 1.6: *n*th Fibonacci Term (Recursive)

- Problem: Determine the *n*th term in the Fibonacci sequence.
- Inputs: a nonnegative integer *n.*
- Outputs: *fib*, the *n*th term of the Fibonacci sequence.
```
int fib (int n)
{
   if (n <= 1)
      return n;
   else
      return fib(n - 1) + fib(n - 2);
}
```
By "nonnegative integer" we mean an integer that is greater than or equal to 0, whereas by "positive integer" we mean an integer that is strictly greater than 0. We specify the input to the algorithm in this manner to make it clear what values the input can take. However, for the sake of avoiding clutter, we declare *n* simply as an integer in the expression of the algorithm. We will follow this convention throughout the text.
Although the algorithm was easy to create and is understandable, it is extremely inefficient. [Figure 1.2](#ch01fig02) shows the recursion tree corresponding to the algorithm when computing *fib*(5). The children of a node in the tree contain the recursive calls made by the call at the node. For example, to obtain *fib*(5) at the top level we need *fib*(4) and *fib*(3); then to obtain *fib*(3) we need *fib*(2) and *fib*(1), and so on. As the tree shows, the function is inefficient because values are computed over and over again. For example, *fib*(2) is computed three times.
[![Click To expand](https://box.kancloud.cn/d3d9d69571a28941892841d7f12035f4_350x260.jpg)](fig1-2_0.jpg)
Figure 1.2: The recursion tree corresponding to [Algorithm 1.6](#ch01ex11) when computing the fifth Fibonacci term.
How inefficient is this algorithm? The tree in [Figure 1.2](#ch01fig02) shows that the algorithm computes the following numbers of terms to determine *fib(n)* for 0 ≤ *n* ≤ 6:
| *n* | Number of Terms Computed |
| --- | --- |
| 0 | 1 |
| 1 | 1 |
| 2 | 3 |
| 3 | 5 |
| 4 | 9 |
| 5 | 15 |
| 6 | 25 |
The first six values can be obtained by counting the nodes in the subtree rooted at *fib*(*n*) for 0 ≤ *n* ≤ 5, whereas the number of terms for *fib*(6) is the sum of the nodes in the trees rooted at *fib*(5) and *fib*(4) plus the one node at the root. These numbers do not suggest a simple expression like the one obtained for Binary Search. Notice, however, that among the first seven values, the number of terms in the tree more than doubles every time *n* increases by 2. For example, there are nine terms in the tree when *n* = 4 and 25 terms when *n* = 6. Let's call *T*(*n*) the number of terms in the recursion tree for *n.* If the number of terms more than doubled every time *n* increased by 2, we would have the following chain of inequalities when *n* is even:
$$
T(n) > 2\,T(n-2) > 2 \cdot 2\,T(n-4) > 2 \cdot 2 \cdot 2\,T(n-6) > \cdots > 2^{n/2}\,T(0).
$$
Because *T*(0) = 1, this would mean *T*(*n*) > 2^(n/2). We use induction to show that this is true for *n* ≥ 2 even when *n* is odd. The inequality does not hold for *n* = 1 because *T*(1) = 1, which is less than 2^(1/2). Induction is reviewed in [Section A.3](LiB0095.html#1289) in [Appendix A](LiB0093.html#1281).
Theorem 1.1

If *T*(*n*) is the number of terms in the recursion tree corresponding to [Algorithm 1.6](#ch01ex11), then, for *n* ≥ 2,
$$
T(n) > 2^{n/2}.
$$
Proof: The proof is by induction on *n*.
Induction base: We need two base cases because the induction step assumes the results for two previous cases. For *n* = 2 and *n* = 3, the recursion tree in [Figure 1.2](#ch01fig02) shows that
$$
T(2) = 3 > 2 = 2^{2/2} \qquad \text{and} \qquad T(3) = 5 > 2\sqrt{2} = 2^{3/2}.
$$
Induction hypothesis: One way to make the induction hypothesis is to assume that the statement is true for all *m* < *n.* Then, in the induction step, show that this implies that the statement must be true for *n.* This technique is used in this proof. Suppose that, for all *m* such that 2 ≤ *m* < *n*,
$$
T(m) > 2^{m/2}.
$$
Induction step: We must show that *T*(*n*) > 2^(n/2). The value of *T*(*n*) is the sum of *T*(*n* − 1) and *T*(*n* − 2) plus the one node at the root. Therefore,
$$
T(n) = T(n-1) + T(n-2) + 1 > 2^{(n-1)/2} + 2^{(n-2)/2} + 1 > 2^{(n-2)/2} + 2^{(n-2)/2} = 2^{n/2}.
$$
We established that the number of terms computed by [Algorithm 1.6](#ch01ex11) to determine the *n*th Fibonacci term is greater than 2^(n/2). We will return to this result to show how inefficient the algorithm is. But first let's develop an efficient algorithm for computing the *n*th Fibonacci term. Recall that the problem with the recursive algorithm is that the same value is computed over and over. As [Figure 1.2](#ch01fig02) shows, *fib*(2) is computed three times in determining *fib*(5). If, when computing a value, we save it in an array, then whenever we need it later we do not need to recompute it. The following iterative algorithm uses this strategy.
Algorithm 1.7: *n*th Fibonacci Term (Iterative)

- Problem: Determine the *n*th term in the Fibonacci sequence.
- Inputs: a nonnegative integer *n.*
- Outputs: *fib*2, the *n*th term in the Fibonacci sequence.
```
int fib2 (int n)
{
   index i;
   int f[0..n];

   f[0] = 0;
   if (n > 0){
      f[1] = 1;
      for (i = 2; i <= n; i++)
         f[i] = f[i - 1] + f[i - 2];
   }
   return f[n];
}
```
[Algorithm 1.7](#ch01ex13) can be written without using the array *f* because only the two most recent terms are needed in each iteration of the loop. However, it is more clearly illustrated using the array.
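A sketch of that array-free variant in real C++, keeping only the two most recent terms in scalar variables (function name ours; `long long` is used because Fibonacci terms overflow `int` quickly):

```cpp
// Iterative Fibonacci using two variables instead of the array f.
// Computes the nth term with n - 1 additions and constant extra space.
long long fib2_two_vars(int n)
{
    if (n <= 1) return n;
    long long prev = 0, curr = 1;      // f[0] and f[1]
    for (int i = 2; i <= n; i++) {
        long long next = prev + curr;  // f[i] = f[i-1] + f[i-2]
        prev = curr;
        curr = next;
    }
    return curr;
}
```

Like Algorithm 1.7, this computes each of the first *n* + 1 terms exactly once.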
To determine *fib*2(*n*), the previous algorithm computes each of the first *n* + 1 terms just once. So it computes *n* + 1 terms to determine the *n*th Fibonacci term. Recall that [Algorithm 1.6](#ch01ex11) computes more than 2^(n/2) terms to determine the *n*th Fibonacci term. [Table 1.2](#ch01table02) compares these expressions for various values of *n.* The execution times are based on the simplifying assumption that one term can be computed in 10^(-9) second. The table shows the time it would take [Algorithm 1.7](#ch01ex13) to compute the *n*th term on a hypothetical computer that could compute each term in a nanosecond, and it shows a lower bound on the time it would take to execute [Algorithm 1.6](#ch01ex11). By the time *n* is 80, [Algorithm 1.6](#ch01ex11) takes at least 18 minutes. When *n* is 120, it takes more than 36 years, an amount of time intolerable compared with a human life span. Even if we could build a computer one billion times as fast, [Algorithm 1.6](#ch01ex11) would take over 40,000 years to compute the 200th term. This result can be obtained by dividing the time for the 200th term by one billion. We see that, regardless of how fast computers become, [Algorithm 1.6](#ch01ex11) will still take an intolerable amount of time unless *n* is small. On the other hand, [Algorithm 1.7](#ch01ex13) computes the *n*th Fibonacci term almost instantaneously. This comparison shows why the efficiency of an algorithm remains an important consideration regardless of how fast computers become.
Table 1.2: A comparison of [Algorithms 1.6](#ch01ex11) and [1.7](#ch01ex13)

| *n* | *n* + 1 | 2^(n/2) | Execution Time Using [Algorithm 1.7](#ch01ex13) | Lower Bound on Execution Time Using [Algorithm 1.6](#ch01ex11) |
| --- | --- | --- | --- | --- |
| 40 | 41 | 1,048,576 | 41 ns\[[\*](#ftn.ch01table02fnt01)\] | 1048 μs\[[†](#ftn.ch01table02fnt02)\] |
| 60 | 61 | 1.1 × 10^9 | 61 ns | 1 s |
| 80 | 81 | 1.1 × 10^12 | 81 ns | 18 min |
| 100 | 101 | 1.1 × 10^15 | 101 ns | 13 days |
| 120 | 121 | 1.2 × 10^18 | 121 ns | 36 years |
| 160 | 161 | 1.2 × 10^24 | 161 ns | 3.8 × 10^7 years |
| 200 | 201 | 1.3 × 10^30 | 201 ns | 4 × 10^13 years |
\[[\*](#ch01table02fnt01)\] 1 ns = 10^(-9) second.

\[[†](#ch01table02fnt02)\] 1 μs = 10^(-6) second.
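The entries in the lower-bound column can be reproduced by evaluating 2^(n/2) nanoseconds directly; a small sketch (function name ours, using `double` to avoid integer overflow for large *n*):

```cpp
#include <cmath>

// Lower bound, in seconds, on the time Algorithm 1.6 needs for the
// nth term, assuming one term is computed per nanosecond (1e-9 s).
double lower_bound_seconds(int n)
{
    return std::pow(2.0, n / 2.0) * 1e-9;
}
```

For example, *n* = 60 gives 2^30 ns ≈ 1 s, *n* = 80 gives 2^40 ns ≈ 18 min, and *n* = 120 gives 2^60 ns ≈ 36 years, as in the table.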
[Algorithm 1.6](#ch01ex11) is a divide-and-conquer algorithm. Recall that the divide-and-conquer approach produced a very efficient algorithm ([Algorithm 1.5](#ch01ex10): Binary Search) for the problem of searching a sorted array. As shown in [Chapter 2](LiB0014.html#141), the divide-and-conquer approach leads to very efficient algorithms for some problems, but very inefficient algorithms for other problems. Our efficient algorithm for computing the *n*th Fibonacci term ([Algorithm 1.7](#ch01ex13)) is an example of the *dynamic programming approach*, which is the focus of [Chapter 3](LiB0024.html#252). We see that choosing the best approach can be essential.
We showed that [Algorithm 1.6](#ch01ex11) computes at least an exponentially large number of terms, but could it be even worse? The answer is no. Using the techniques in [Appendix B](LiB0102.html#1369), it is possible to obtain an exact formula for the number of terms, and the formula is exponential in *n.* See [Examples B.5](LiB0103.html#1384) and [B.9](LiB0103.html#1394) in [Appendix B](LiB0102.html#1369) for further discussion of the Fibonacci sequence.