## 5.4 The Sum-of-Subsets Problem
Recall our thief and the 0–1 Knapsack problem from [Section 4.4.1](LiB0036.html#429). In this problem, there is a set of items the thief can steal, and each item has its own weight and profit. The thief's knapsack will break if the total weight of the items in it exceeds *W*. Therefore, the goal is to maximize the total profit of the stolen items without letting the total weight exceed *W*. Suppose here that the items all have the same profit per unit weight. Then an optimal solution for the thief is simply a set of items that maximizes the total weight, subject to the constraint that the total weight does not exceed *W*. The thief might first try to determine whether there is a set whose total weight equals *W*, because such a set would be best. The problem of determining such sets is called the Sum-of-Subsets problem.
Specifically, in the Sum-of-Subsets problem, there are *n* positive integers (weights) *wi* and a positive integer *W*. The goal is to find all subsets of the integers that sum to *W*. As mentioned earlier, we usually state our problems so as to find all solutions. For the purposes of the thief's application, however, only one solution need be found.
Example 5.2 Suppose that *n* = 5, *W* = 21, and

*w*1 = 5, *w*2 = 6, *w*3 = 10, *w*4 = 11, and *w*5 = 16.

Because

*w*1 + *w*2 + *w*3 = 5 + 6 + 10 = 21,
*w*1 + *w*5 = 5 + 16 = 21, and
*w*3 + *w*4 = 10 + 11 = 21,

the solutions are {*w*1, *w*2, *w*3}, {*w*1, *w*5}, and {*w*3, *w*4}.
This instance can be solved by inspection. For larger values of *n*, a systematic approach is necessary. One approach is to create a state space tree. A possible way to structure the tree appears in [Figure 5.7](#ch05fig07). For the sake of simplicity, the tree in this figure is for only three weights. We go to the left from the root to include *w*1, and we go to the right to exclude *w*1. Similarly, we go to the left from a node at level 1 to include *w*2, and we go to the right to exclude *w*2, etc. Each subset is represented by a path from the root to a leaf. When we include *wi*, we write *wi* on the edge where we include it. When we do not include *wi*, we write 0.
![Figure 5.7](fig5-7_0.jpg)
Figure 5.7: A state space tree for instances of the Sum-of-Subsets problem in which *n* = 3.
Example 5.3 [Figure 5.8](#ch05fig08) shows the state space tree for *n* = 3, *W* = 6, and

*w*1 = 2, *w*2 = 4, and *w*3 = 5.
![Figure 5.8](fig5-8_0.jpg)
Figure 5.8: A state space tree for the Sum-of-Subsets problem for the instance in [Example 5.3](#ch05ex06). Stored at each node is the total weight included up to that node.
At each node, we have written the sum of the weights that have been included up to that point. Therefore, each leaf contains the sum of the weights in the subset leading to that leaf. The second leaf from the left is the only one containing a 6. Because the path to this leaf represents the subset {*w*1, *w*2}, this subset is the only solution.
If we sort the weights in nondecreasing order before doing the search, there is an obvious sign telling us that a node is nonpromising. If the weights are sorted in this manner, then *wi*+1 is the lightest weight remaining when we are at the *i*th level. Let **weight** be the sum of the weights that have been included up to a node at level *i*. If *wi*+1 would bring the value of *weight* above *W*, then so would any other weight following it. Therefore, unless *weight* equals *W* (which means that there is a solution at the node), a node at the *i*th level is nonpromising if
*weight* + *wi*+1 > *W*.
There is another, less obvious sign telling us that a node is nonpromising. If, at a given node, adding all the weights of the remaining items to *weight* does not make *weight* at least equal to *W*, then *weight* could never become equal to *W* by expanding beyond the node. This means that if ***total*** is the total weight of the remaining weights, a node is nonpromising if
*weight* + *total* < *W*.
The following example illustrates these backtracking strategies.
Example 5.4 [Figure 5.9](#ch05fig09) shows the pruned state space tree when backtracking is used with *n* = 4, *W* = 13, and

*w*1 = 3, *w*2 = 4, *w*3 = 5, and *w*4 = 6.
![Figure 5.9](fig5-9_0.jpg)
Figure 5.9: The pruned state space tree produced using backtracking in [Example 5.4](#ch05ex07). Stored at each node is the total weight included up to that node. The only solution is found at the shaded node. Each nonpromising node is marked with a cross.
The only solution is found at the node shaded in color. The solution is {*w*1, *w*2, *w*4}. The nonpromising nodes are marked with crosses. The nodes containing 12, 8, and 9 are nonpromising because adding the next weight (6) would make the value of *weight* exceed *W*. The nodes containing 7, 3, 4, and 0 are nonpromising because there is not enough total weight remaining to bring the value of *weight* up to *W*. Notice that a leaf in the state space tree that does not contain a solution is automatically nonpromising because there are no weights remaining that could bring *weight* up to *W*. The leaf containing 7 illustrates this. There are only 15 nodes in the pruned state space tree, whereas the entire state space tree contains 31 nodes.
When the sum of the weights included up to a node equals *W*, there is a solution at that node. Therefore, we cannot get another solution by including more items. This means that if
*weight* = *W*,
we should print the solution and backtrack. This backtracking is provided automatically by our general procedure *checknode* because it never expands beyond a promising node where a solution is found. Recall that when we discussed *checknode* we mentioned that backtracking algorithms sometimes find a solution before reaching a leaf in the state space tree. This is one such algorithm.
Next we present the algorithm that employs these strategies. The algorithm uses an array *include*. It sets *include*\[*i*\] to "yes" if *w*\[*i*\] is to be included and to "no" if it is not.
Algorithm 5.4: The Backtracking Algorithm for the Sum-of-Subsets Problem
Problem: Given *n* positive integers (weights) and a positive integer *W*, determine all combinations of the integers that sum to *W*.
Inputs: positive integer *n*, sorted (nondecreasing order) array of positive integers *w* indexed from 1 to *n*, and a positive integer *W*.
Outputs: all combinations of the integers that sum to *W*.
```
void sum_of_subsets (index i,
                     int weight, int total)
{
  if (promising (i))
    if (weight == W)
      cout << include [1] through include [i];
    else {
      include [i + 1] = "yes";             // Include w[i + 1].
      sum_of_subsets (i + 1, weight + w[i + 1], total - w[i + 1]);
      include [i + 1] = "no";              // Do not include w[i + 1].
      sum_of_subsets (i + 1, weight, total - w[i + 1]);
    }
}

bool promising (index i)
{
  return (weight + total >= W) && (weight == W || weight + w[i + 1] <= W);
}
```
Following our usual convention, *n, w, W*, and *include* are not inputs to our routines. If these variables were defined globally, the top-level call to *sum\_of\_subsets* would be as follows:
`sum_of_subsets (0, 0, total);`
where initially
*total* = *w*\[1\] + *w*\[2\] + ⋯ + *w*\[*n*\].
Recall that a leaf in the state space tree that does not contain a solution is nonpromising because there are no weights left that could bring the value of *weight* up to *W*. This means that the algorithm should not need to check for the terminal condition *i* = *n*. Let's verify that the algorithm implements this correctly. When *i* = *n*, the value of *total* is 0 (because there are no weights remaining). Therefore, at this point
*weight* + *total* = *weight* + 0 = *weight*,
which means that
*weight* + *total* ≥ *W*

is true only if *weight* ≥ *W*. Because we always keep *weight* less than or equal to *W*, we must have *weight* = *W*. Therefore, when *i* = *n*, function *promising* returns true only if *weight* = *W*. But in this case there is no recursive call, because a solution has been found. Therefore, we do not need to check for the terminal condition *i* = *n*. Notice that there is never a reference to the nonexistent array item *w*\[*n* + 1\] in function *promising* because of our assumption that the second condition in an **or** expression is not evaluated when the first condition is true.
The number of nodes in the state space tree searched by [Algorithm 5.4](#ch05ex07) is equal to
1 + 2 + 2² + ⋯ + 2ⁿ = 2ⁿ⁺¹ − 1.
This equality is obtained in [Example A.3](LiB0095.html#1297) in [Appendix A](LiB0093.html#1281). Given only this result, the possibility exists that the worst case could be much better than this. That is, it could be that for every instance only a small portion of the state space tree is searched. This is not the case. For each *n*, it is possible to construct an instance for which the algorithm visits an exponentially large number of nodes. This is true even if we want only one solution. To this end, if we take
*w*1 = *w*2 = ⋯ = *w*n−1 = 1, *w*n = *n*, and *W* = *n*,

there is only one solution, {*w*n}, and it will not be found until an exponentially large number of nodes have been visited. As stressed before, even though the worst case is exponential, the algorithm can be efficient for many large instances. In the exercises you are asked to write programs using the Monte Carlo technique to estimate the efficiency of [Algorithm 5.4](#ch05ex08) on various instances.
Even if we state the problem so as to require only one solution, the Sum-of-Subsets problem, like the 0–1 Knapsack problem, is in the class of problems discussed in [Chapter 9](LiB0074.html#880).