# Chapter 5: Backtracking

If you were trying to find your way through the well-known maze of hedges by Hampton Court Palace in England, you would have no choice but to follow a hopeless path until you reached a dead end. When that happened, you'd go back to a fork and pursue another path. Anyone who has ever tried solving a maze puzzle has experienced the frustration of hitting dead ends. Think how much easier it would be if there were a sign, positioned a short way down a path, that told you that the path led to nothing but dead ends. If the sign were positioned near the beginning of the path, the time savings could be enormous, because all forks after that sign would be eliminated from consideration. This means that not one but many dead ends would be avoided. There are no such signs in the famous maze of hedges or in most maze puzzles. However, as we shall see, they do exist in backtracking algorithms.

Backtracking is very useful for problems such as the 0–1 Knapsack problem. Although in Section 4.4.3 we found a dynamic programming algorithm for this problem that is efficient if the capacity *W* of the knapsack is not large, the algorithm is still exponential-time in the worst case. The 0–1 Knapsack problem is in the class of problems discussed in [Chapter 9](LiB0074.html#880). No one has ever found algorithms for any of those problems whose worst-case time complexities are better than exponential, but no one has ever proved that such algorithms are not possible. One way to try to handle the 0–1 Knapsack problem would be to actually generate all the subsets, but this would be like following every path in a maze until a dead end is reached. Recall from [Section 4.4.1](LiB0036.html#429) that there are 2^*n* subsets, which means that this brute-force method is feasible only for small values of *n*. However, if while generating the subsets we can find signs that tell us that many of them need not be generated, we can often avoid much unnecessary labor. This is exactly what a backtracking algorithm does.

Backtracking algorithms for problems such as the 0–1 Knapsack problem are still exponential-time (or even worse) in the worst case. They are useful because they are efficient for many large instances, not because they are efficient for all large instances. We return to the 0–1 Knapsack problem in [Section 5.7](LiB0045.html#537). Before that, we introduce backtracking with a simple example in [Section 5.1](#ch05lev1sec1) and solve several other problems in the other sections.

## 5.1 The Backtracking Technique

Backtracking is used to solve problems in which a ***sequence*** of objects is chosen from a specified ***set*** so that the sequence satisfies some ***criterion***. The classic example of the use of backtracking is in the *n*-Queens problem. The goal in this problem is to position *n* queens on an *n* × *n* chessboard so that no two queens threaten each other. That is, no two queens may be in the same row, column, or diagonal. The ***sequence*** in this problem is the *n* positions in which the queens are placed, the *set* for each choice is the *n*^2 possible positions on the chessboard, and the *criterion* is that no two queens can threaten each other. The *n*-Queens problem is a generalization of its instance when *n* = 8, which is the instance using a standard chessboard. For the sake of brevity, we will illustrate backtracking using the instance when *n* = 4.

Backtracking is a modified depth-first search of a tree (here "tree" means a rooted tree).
So let's review the *depth-first search* before proceeding. Although the depth-first search is defined for graphs in general, we will discuss only searching of trees, because backtracking involves only a tree search. A ***preorder*** tree traversal is a ***depth-first search*** of the tree. This means that the root is visited first, and a visit to a node is followed immediately by visits to all descendants of the node. Although a depth-first search does not require that the children be visited in any particular order, we will visit the children of a node from left to right in the applications in this chapter. [Figure 5.1](#ch05fig01) shows a depth-first search of a tree performed in this manner. The nodes are numbered in the order in which they are visited. Notice that in a depth-first search a path is followed as deeply as possible until a dead end is reached. At a dead end we back up until we reach a node with an unvisited child, and then we again proceed to go as deep as possible.

[![Click To expand](https://box.kancloud.cn/10befabbb471bedd5b1ed34622b5d4cf_350x215.jpg)](fig5-1_0.jpg)
Figure 5.1: A tree with nodes numbered according to a depth-first search.

There is a simple recursive algorithm for doing a depth-first search. Because we are presently interested only in preorder traversals of trees, we give a version that specifically accomplishes this. The procedure is called by passing the root at the top level.

```
void depth_first_tree_search(node v)
{
  node u;

  visit v;
  for (each child u of v)
    depth_first_tree_search(u);
}
```

This general-purpose algorithm does not state that the children must be visited in any particular order. However, as mentioned previously, we visit them from left to right.

Now let's illustrate the backtracking technique with the instance of the *n*-Queens problem when *n* = 4. Our task is to position four queens on a 4 × 4 chessboard so that no two queens threaten each other. We can immediately simplify matters by realizing that no two queens can be in the same row. The instance can then be solved by assigning each queen a different row and checking which column combinations yield solutions. Because each queen can be placed in one of four columns, there are 4 × 4 × 4 × 4 = 256 candidate solutions.

We can create the candidate solutions by constructing a tree in which the column choices for the first queen (the queen in row 1) are stored in level-1 nodes in the tree (recall that the root is at level 0), the column choices for the second queen (the queen in row 2) are stored in level-2 nodes, and so on. A path from the root to a leaf is a candidate solution (recall that a ***leaf*** in a tree is a node with no children). This tree is called a ***state space tree***. A small portion of it appears in [Figure 5.2](#ch05fig02). The entire tree has 256 leaves, one for each candidate solution. Notice that an ordered pair <*i*, *j*> is stored at each node. This ordered pair means that the queen in the *i*th row is placed in the *j*th column.

[![Click To expand](https://box.kancloud.cn/cad8997584e49e511433ac8b341dbf7f_350x218.jpg)](fig5-2_0.jpg)
Figure 5.2: A portion of the state space tree for the instance of the *n*-Queens problem in which *n* = 4. The ordered pair <*i*, *j*> at each node means that the queen in the *i*th row is placed in the *j*th column. Each path from the root to a leaf is a candidate solution.

To determine the solutions, we check each candidate solution (each path from the root to a leaf) in sequence, starting with the leftmost path. The first few paths checked are as follows:

> \[<1, 1>, <2, 1>, <3, 1>, <4, 1>\]
> \[<1, 1>, <2, 1>, <3, 1>, <4, 2>\]
> \[<1, 1>, <2, 1>, <3, 1>, <4, 3>\]
> \[<1, 1>, <2, 1>, <3, 1>, <4, 4>\]
> \[<1, 1>, <2, 1>, <3, 2>, <4, 1>\]
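To make this order concrete, here is a minimal sketch (ours, not the text's) of a plain depth-first generation of the 4-Queens state space tree with no pruning; the names `generate`, `satisfiesCriterion`, and the array `col` are our own choices. It prints all 256 candidate paths, beginning with the ones listed above, and flags those that satisfy the criterion.

```
// A sketch, not the text's algorithm: depth-first generation of the 4-Queens
// state space tree with NO pruning; the criterion is checked only at leaves.
#include <cstdlib>
#include <iostream>

const int n = 4;
int col[n + 1];                          // col[i] = column chosen for the queen in row i

bool satisfiesCriterion() {              // no two queens share a column or a diagonal
    for (int i = 2; i <= n; i++)
        for (int k = 1; k < i; k++)
            if (col[i] == col[k] || std::abs(col[i] - col[k]) == i - k)
                return false;
    return true;
}

void generate(int row) {                 // depth first, children visited left to right
    if (row > n) {                       // a leaf: one of the 4^4 = 256 candidates
        for (int r = 1; r <= n; r++)
            std::cout << "<" << r << "," << col[r] << "> ";
        std::cout << (satisfiesCriterion() ? " <-- solution" : "") << "\n";
        return;
    }
    for (int c = 1; c <= n; c++) {       // the four column choices in this row
        col[row] = c;
        generate(row + 1);
    }
}

int main() {
    generate(1);                         // prints the 256 candidate paths in order
}
```

With no signs to prune branches, every one of the 256 leaves must be generated and checked.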
Notice that the nodes are visited according to a depth-first search in which the children of a node are visited from left to right. A simple depth-first search of a state space tree is like following every path in a maze until you reach a dead end. It does not take advantage of any signs along the way.

We can make the search more efficient by looking for such signs. For example, as illustrated in [Figure 5.3(a)](#ch05fig03), no two queens can be in the same column. Therefore, there is no point in constructing and checking any paths in the entire branch emanating from the node containing <2, 1> in [Figure 5.2](#ch05fig02). (Because we have already placed queen 1 in column 1, we cannot place queen 2 there.) This sign tells us that this node can lead to nothing but dead ends. Similarly, as illustrated in [Figure 5.3(b)](#ch05fig03), no two queens can be on the same diagonal. Therefore, there is no point in constructing and checking the entire branch emanating from the node containing <2, 2> in [Figure 5.2](#ch05fig02).

[![Click To expand](https://box.kancloud.cn/2e04c8e21e5408ce4d69960ce8b664b2_350x181.jpg)](fig5-3_0.jpg)
Figure 5.3: Diagram showing that if the first queen is placed in column 1, the second queen cannot be placed in column 1 (a) or column 2 (b).

***Backtracking*** is the procedure whereby, after determining that a node can lead to nothing but dead ends, we go back ("backtrack") to the node's parent and proceed with the search on the next child. We call a node ***nonpromising*** if when visiting the node we determine that it cannot possibly lead to a solution. Otherwise, we call it ***promising***. To summarize, backtracking consists of doing a depth-first search of a state space tree, checking whether each node is promising, and, if it is nonpromising, backtracking to the node's parent. This is called ***pruning*** the state space tree, and the subtree consisting of the visited nodes is called the ***pruned state space tree***. A general algorithm for the backtracking approach is as follows:

```
void checknode(node v)
{
  node u;

  if (promising(v))
    if (there is a solution at v)
      write the solution;
    else
      for (each child u of v)
        checknode(u);
}
```

The root of the state space tree is passed to *checknode* at the top level. A visit to a node consists of first checking whether it is promising. If it is promising and there is a solution at the node, the solution is printed. If there is not a solution at a promising node, the children of the node are visited. The function *promising* is different in each application of backtracking. We call it the ***promising function*** for the algorithm. A backtracking algorithm is identical to a depth-first search, except that the children of a node are visited only when the node is promising and there is not a solution at the node. (Unlike the algorithm for the *n*-Queens problem, in some backtracking algorithms a solution can be found before reaching a leaf in the state space tree.)

We have called the backtracking procedure *checknode* rather than *backtrack* because backtracking does not occur when the procedure is called. Rather, it occurs when we find that a node is nonpromising and proceed to the next child of the parent. A computer implementation of the recursive algorithm accomplishes backtracking by popping the activation record for a nonpromising node from the stack of activation records.
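To see how *checknode* and a promising function fit together, here is a rough C++ rendering for the *n*-Queens instance; it is a sketch of our own, not the algorithm developed later in the chapter. The array `col` holds the column choices along the current branch (level *i* of the tree corresponds to row *i*), and `promising` returns false when the queen in row *i* shares a column or diagonal with a queen in an earlier row.

```
// A sketch of the checknode pattern specialized to n-Queens (not the text's code).
#include <cstdlib>
#include <iostream>

const int n = 4;
int col[n + 1];                // col[1..i] holds the column choices along the current branch

// promising: false if the queen in row i shares a column or a diagonal
// with a queen in an earlier row (an ancestor in the state space tree).
bool promising(int i) {
    for (int k = 1; k < i; k++)
        if (col[i] == col[k] || std::abs(col[i] - col[k]) == i - k)
            return false;
    return true;
}

// checknode: visit the node at level i, check whether it is promising,
// and only then either report a solution (at a leaf) or visit its children.
void checknode(int i) {
    if (promising(i)) {
        if (i == n) {                               // there is a solution at this node
            for (int r = 1; r <= n; r++)
                std::cout << "<" << r << "," << col[r] << "> ";
            std::cout << "\n";
        } else {
            for (int c = 1; c <= n; c++) {          // each child of the node
                col[i + 1] = c;
                checknode(i + 1);
            }
        }
    }                                               // nonpromising: simply return (backtrack)
}

int main() {
    checknode(0);              // pass the root (level 0: no queens placed yet)
}
```

Calling `checknode(0)` passes the root at the top level, and the sketch prints every solution it reaches; the walk-through in Example 5.1 below traces the nodes such a search checks before reaching the first solution when *n* = 4.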
Next we use backtracking to solve the instance of the *n*-Queens problem when *n* = 4.

**Example 5.1**

Recall that the function *promising* is different for each application of backtracking. For the *n*-Queens problem, it must return false if a node and any of the node's ancestors place queens in the same column or diagonal. [Figure 5.4](#ch05fig04) shows a portion of the pruned state space tree produced when backtracking is used to solve the instance in which *n* = 4. Only the nodes checked to find the first solution are shown. [Figure 5.5](#ch05fig05) shows the actual chessboards. A node is marked with a cross in [Figure 5.4](#ch05fig04) if it is nonpromising. Similarly, there is a cross in a nonpromising position in [Figure 5.5](#ch05fig05). The color-shaded node in [Figure 5.4](#ch05fig04) is the one where the first solution is found. A walk-through of the traversal done by backtracking follows. We refer to a node by the ordered pair stored at that node. Some of the nodes contain the same ordered pair, but you can tell which node we mean by traversing the tree in [Figure 5.4](#ch05fig04) while we do the walk-through.

[![Click To expand](https://box.kancloud.cn/2e2666b1a005d3c368bbcf6a81503d6e_350x203.jpg)](fig5-4_0.jpg)
Figure 5.4: A portion of the pruned state space tree produced when backtracking is used to solve the instance of the *n*-Queens problem in which *n* = 4. Only the nodes checked to find the first solution are shown. That solution is found at the color-shaded node. Each nonpromising node is marked with a cross.

[![Click To expand](https://box.kancloud.cn/9da620d389411eacc8f9e9fdb9a60e69_350x306.jpg)](fig5-5_0.jpg)
Figure 5.5: The actual chessboard positions that are tried when backtracking is used to solve the instance of the *n*-Queens problem in which *n* = 4. Each nonpromising position is marked with a cross.

1. <1, 1> is promising. {because queen 1 is the first queen positioned}
2. <2, 1> is nonpromising. {because queen 1 is in column 1}
    <2, 2> is nonpromising. {because queen 1 is on left diagonal}
    <2, 3> is promising.
3. <3, 1> is nonpromising. {because queen 1 is in column 1}
    <3, 2> is nonpromising. {because queen 2 is on right diagonal}
    <3, 3> is nonpromising. {because queen 2 is in column 3}
    <3, 4> is nonpromising. {because queen 2 is on left diagonal}
4. Backtrack to <1, 1>. <2, 4> is promising.
5. <3, 1> is nonpromising. {because queen 1 is in column 1}
    <3, 2> is promising. {This is the second time we've tried <3, 2>.}
6. <4, 1> is nonpromising. {because queen 1 is in column 1}
    <4, 2> is nonpromising. {because queen 3 is in column 2}
    <4, 3> is nonpromising. {because queen 3 is on left diagonal}
    <4, 4> is nonpromising. {because queen 2 is in column 4}
7. Backtrack to <2, 4>. <3, 3> is nonpromising. {because queen 2 is on right diagonal}
    <3, 4> is nonpromising. {because queen 2 is in column 4}
8. Backtrack to root. <1, 2> is promising.
9. <2, 1> is nonpromising. {because queen 1 is on right diagonal}
    <2, 2> is nonpromising. {because queen 1 is in column 2}
    <2, 3> is nonpromising. {because queen 1 is on left diagonal}
    <2, 4> is promising.
10. <3, 1> is promising. {This is the third time we've tried <3, 1>.}
11. <4, 1> is nonpromising. {because queen 3 is in column 1}
    <4, 2> is nonpromising. {because queen 1 is in column 2}
    <4, 3> is promising.

At this point the first solution has been found. It appears in [Figure 5.5(k)](#ch05fig05), and the node at which it is found is shaded in [Figure 5.4](#ch05fig04).

Notice that a backtracking algorithm need not actually create a tree. Rather, it only needs to keep track of the values in the current branch being investigated. This is the way we implement backtracking algorithms. We say that the state space tree exists ***implicitly*** in the algorithm because it is not actually constructed.

A node count in [Figure 5.4](#ch05fig04) shows that the backtracking algorithm checks 27 nodes before finding a solution. In the exercises you will show that, without backtracking, a depth-first search of the state space tree checks 155 nodes before finding that same solution.

You may have observed that there is some inefficiency in our general algorithm for backtracking (procedure *checknode*). That is, we check whether a node is promising after passing it to the procedure. This means that activation records for nonpromising nodes are unnecessarily placed on the stack of activation records. We could avoid this by checking whether a node is promising before passing it. A general algorithm for backtracking that does this is as follows:

```
void expand(node v)
{
  node u;

  for (each child u of v)
    if (promising(u))
      if (there is a solution at u)
        write the solution;
      else
        expand(u);
}
```

The node passed at the top level to the procedure is again the root of the tree. We call this procedure *expand* because it is called when we expand a promising node. A computer implementation of this algorithm accomplishes backtracking from a nonpromising node by not pushing an activation record for the node onto the stack of activation records.

In explaining algorithms in this chapter we use the first version of the algorithm (procedure *checknode*) because we have found that this version typically produces algorithms that are easier to understand. The reason is that one execution of *checknode* consists of the steps done when visiting a single node. That is, it consists of the following steps: determine whether the node is promising; if it is promising, then if there is a solution at the node, print the solution; otherwise, visit its children. On the other hand, one execution of *expand* involves doing the same steps for all children of a node. After seeing the first version of an algorithm, it is not hard to write the second version.
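For comparison, here is the earlier *n*-Queens sketch recast in the *expand* style (again our own sketch, not the text's code). The only change is that *promising* is tested before the recursive call, so no activation record is ever pushed for a nonpromising node.

```
// The n-Queens sketch recast in the expand style: promising is tested
// BEFORE recursing, so nonpromising nodes never get activation records.
#include <cstdlib>
#include <iostream>

const int n = 4;
int col[n + 1];                                    // column choices along the current branch

bool promising(int i) {                            // same test as in the checknode sketch
    for (int k = 1; k < i; k++)
        if (col[i] == col[k] || std::abs(col[i] - col[k]) == i - k)
            return false;
    return true;
}

void expand(int i) {                               // expand the promising node at level i
    for (int c = 1; c <= n; c++) {                 // each child u of v
        col[i + 1] = c;
        if (promising(i + 1)) {                    // test before the recursive call
            if (i + 1 == n) {                      // there is a solution at u
                for (int r = 1; r <= n; r++)
                    std::cout << "<" << r << "," << col[r] << "> ";
                std::cout << "\n";
            } else {
                expand(i + 1);                     // recurse only on promising children
            }
        }                                          // nonpromising child: no call is made
    }
}

int main() {
    expand(0);                                     // the root is again passed at the top level
}
```

Both sketches examine nodes in the same order; they differ only in where the promising test is performed.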
Next we will develop backtracking algorithms for several problems, starting with the *n*-Queens problem. In all these problems, the state space tree contains an exponentially large or larger number of nodes. Backtracking is used to avoid unnecessary checking of nodes. Given two instances with the same value of *n*, a backtracking algorithm may check very few nodes for one of them but the entire state space tree for the other. This means that we will not obtain efficient time complexities for our backtracking algorithms as we did for the algorithms in the preceding chapters. Therefore, instead of the types of analyses done in those chapters, we will analyze our backtracking algorithms using the Monte Carlo technique. This technique enables us to determine whether we can expect a given backtracking algorithm to be efficient for a particular instance. The Monte Carlo technique is discussed in [Section 5.3](LiB0041.html#500).