6.2 The Traveling Salesperson Problem · Foundations of Algorithms Using C++ Pseudocod

## 6.2 The Traveling Salesperson Problem In [Example 3.12](LiB0030.html#345), Nancy won the sales position over Ralph because she found an optimal tour for the 20-city sales territory in 45 seconds using a Θ (*n*22*n*) dynamic programming algorithm to solve the the Traveling Salesperson problem. Ralph used the brute-force algorithm that generates all 19! tours. Because the brute-force algorithm takes over 3,800 years, it is still running. We last saw Nancy in [Section 5.6](LiB0044.html#530) when her sales territory was expanded to 40 cities. Because her dynamic programming algorithm would take more than six years to find an optimal tour for this territory, she became content with just finding any tour. She used the backtracking algorithm for the Hamiltonian Circuits problem to do this. Even if this algorithm did find a tour efficiently, that tour could be far from optimal. For example, if there were a long, winding road of 100 miles between two cities that were 2 miles apart, the algorithm could produce a tour containing that road even if it were possible to connect the two cities by a city that is a mile from each of them. This means Nancy could be covering her territory very inefficiently using the tour produced by the backtracking algorithm. Given this, she might decide that she better go back to looking for an optimal tour. If the 40 cities were highly connected, having the backtracking algorithm produce all the tours would not work, because there would be a worst-than-exponential number of tours. Let's assume that Nancy's instructor did not get to the branch-and-bound technique in her algorithms course (this is why Nancy settled for any tour in [Section 5.6](LiB0044.html#530)). After going back to her algorithms text and discoveing that the branch-and-bound technique is specifically designed for optimization problems, Nancy decides to apply it to the Traveling Salesperson problem. She might proceed as follows. Recall that the goal in this problem is to find the shortest path in a directed graph that starts at a given vertex, visits each vertex in the graph exactly once, and ends up back at the starting vertex. Such a path is called an ***optimal tour***. Because it does not matter where we start, the starting vertex can simply be the first vertex. [Figure 6.4](#ch06fig04) shows the adjacency matrix representation of a graph containing five vertices, in which there is an edge from every vertex to every other vertex, and an optimal tour for that graph. [![Click To expand](https://box.kancloud.cn/7759eeb4d7060a817d2b0ca3cc23a28f_350x158.jpg)](fig6-4_0.jpg) Figure 6.4: Adjacency matrix representation of a graph that has an edge from every vertex to every other vertex (left), and the nodes in the graph and the edges in an optimal tour (right). An obvious state space tree for this problem is one in which each vertex other than the starting one is tried as the first vertex (after the starting one) at level 1, each vertex other than the starting one and the one chosen at level 1 is tried as the second vertex at level 2, and so on. A portion of this state space tree, in which there are five vertices and in which there is an edge from every vertex to every other vertex, is shown in [Figure 6.5](#ch06fig05). In what follows, the term "node" means a node in the state space tree, and the term "vertex" means a vertex in the graph. At each node in [Figure 6.5](#ch06fig05), we have included the path chosen up to that node. For simplicity, we have denoted a vertex in the graph simply by its index. A node that is not a leaf represents all those tours that start with the path stored at that node. For example, the node containing \[1, 2, 3\] represents all those tours that start with the path \[1, 2, 3\]. That is, it represents the tours \[1, 2, 3, 4, 5, 1\] and \[1, 2, 3, 5, 4, 1\]. Each leaf represents a tour. We need to find a leaf that contains an optimal tour. We stop expanding the tree when there are four vertices in the path stored at a node because, at that time, the fifth one is uniquely determined. For example, the far-left leaf represents the tour \[1, 2, 3, 4, 5, 1\] because once we have specified the path \[1, 2, 3, 4\], the next vertex must be the fifth one. [![Click To expand](https://box.kancloud.cn/5a0fb8f67e67bcdb724942a9d2bb1c71_350x328.jpg)](fig6-5_0.jpg) Figure 6.5: A state space tree for an instance of the Traveling Salesperson problem in which there are five vertices. The indices of the vertices in the partial tour are stored at each node. To use best-first search, we need to be able to determine a bound for each node. Because of the objective in the 0–1 Knapsack problem (to maximize profit while keeping the total weight from exceeding *W*), we computed an upper bound on the amount of profit that could be obtained by expanding beyond a given node, and we called a node promising only if its bound was greater than the current maximum profit. In this problem, we need to determine a lower bound on the length of any tour that can be obtained by expanding beyond a given node, and we call the node promising only if its bound is less than the current minimum tour length. We can obtain a bound as follows. In any tour, the length of the edge taken when leaving a vertex must be at least as great as the length of the shortest edge emanating from that vertex. Therefore, a lower bound on the *cost* (length of the edge taken) of leaving vertex *v*1 is given by the minimum of all the nonzero entries in row 1 of the adjacency matrix, a lower bound on the cost of leaving vertex *v*2 is given by the minimum of all the nonzero entries in row 2, and so on. The lower bounds on the costs of leaving the five vertices in the graph represented in [Figure 6.4](#ch06fig04) are as follows: ![](https://box.kancloud.cn/7e70c5cd027838ce945fd8938b6c6544_247x122.jpg) Because a tour must leave every vertex exactly once, a lower bound on the length of a tour is the sum of these minimums. Therefore, a lower bound on the length of a tour is ![](https://box.kancloud.cn/f5017ae33f6639209cee6e690a172d03_181x17.jpg) This is not to say that there is a tour with this length. Rather, it says that there can be no tour with a shorter length. Suppose we have visited the node containing \[1, 2\] in [Figure 6.5](#ch06fig05). In that case we have already committed to making *v*2 the second vertex on the tour, and the cost of getting to *v*2 is the weight on the edge from *v*1 to *v*2, which is 14. Any tour obtained by expanding beyond this node, therefore, has the following lower bounds on the costs of leaving the vertices: ![](https://box.kancloud.cn/d54b433047beb9f0f6228b6e3443eb05_241x121.jpg) To obtain the minimum for *v*2 we do not include the edge to *v*1, because *v*2 cannot return to *v*1. To obtain the minimums for the other vertices we do not include the edge to *v*2, because we have already been at *v*2. A lower bound on the length of any tour, obtained by expanding beyond the node containing \[1, 2\], is the sum of these minimums, which is ![](https://box.kancloud.cn/02b7759cf9178c372127ee25f3fcf214_188x17.jpg) To further illustrate the technique for determining the bound, suppose we have visited the node containing \[1, 2, 3\] in [Figure 6.5](#ch06fig05). We have committed to making *v*2 the second vertex and *v*3 the third vertex. Any tour obtained by expanding beyond this node has the following lower bounds on the costs of leaving the vertices: ![](https://box.kancloud.cn/91a5a5b11592dc0afb3f4cdcfb501d3f_208x122.jpg) To obtain the minimums for *v*4 and *v*5 we do not consider the edges to *v*2 and *v*3, because we have already been to these vertices. The lower bound on the length of any tour we could obtain by expanding beyond the node containing \[1, 2, 3\] is ![](https://box.kancloud.cn/51603eb4cef6ab27ccda5b56269555e6_188x19.jpg) In the same way, we can obtain a lower bound on the length of a tour that can be obtained by expanding beyond any node in the state space tree, and we use these lower bounds in our best-first search. The following example illustrates this technique. We will not actually do any calculations in the example. They would be done as just illustrated. Example 6.3**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Given the graph in [Figure 6.4](#ch06fig04) and using the bounding considerations outlined previously, a best-first search with branch-and-bound pruning produces the tree in [Figure 6.6](#ch06fig06). The bound is stored at a nonleaf, whereas the length of the tour is stored at a leaf. We show the steps that produced the tree. We initialize the value of the best solution to ∞ (infinity) because there is no candidate solution at the root. (Candidate solutions exist only at leaves in the state space tree.) We do not compute bounds for leaves in the state space tree because the algorithm is written so as not to expand beyond leaves. When referring to a node, we refer to the partial tour stored at the node. This is different from the way we referred to a node when illustrating the 0–1 Knapsack problem. The steps are as follows: 1. Visit node containing \[1\] (the root). 1. Compute bound to be 21. {This is a lower bound on the} 2. Set *minlength* to ∞. {length of a tour.} 2. Visit node containing \[1, 2\]. 1. Compute bound to be 31. 3. Visit node containing \[1, 3\]. 1. Compute bound to be 22. 4. Visit node containing \[1, 4\]. 1. Compute bound to be 30 . 5. Visit node containing \[1, 5\]. 1. Compute bound to be 42. 6. Determine promising, unexpanded node with the smallest bound. 1. That node is the node containing \[1, 3\]. We visit its children next. 7. Visit node containing \[1, 3, 2\]. 1. Compute bound to be 22. 8. Visit node containing \[1, 3, 4\]. 1. Compute bound to be 27. 9. Visit node containing \[1, 3, 5\]. 1. Compute bound to be 39 . 10. Determine promising, unexpanded node with the smallest bound. 1. That node is the node containing \[1, 3, 2\]. We visit its children next. 11. Visit node containing \[1,3,2,4\]. 1. Because this node is a leaf, compute tour length to be 37. 2. Because its length 37 is less than ∞, the value of *minlength*, set *minlength* to 37. 3. The nodes containing \[1, 5\] and \[1, 3, 5\] become nonpromising because their bounds 42 and 39 are greater than or equal to 37, the new value of *minlength.* 12. Visit node containing \[1, 3, 2, 5\]. 1. Because this node is a leaf, compute tour length to be 31. 2. Because its length 31 is less than 37, the value of *minlength*, set *minlength* to 31. 3. The node containing \[1, 2\] becomes nonpromising because its bound 31 is greater than or equal to 31, the new value of *minlength*. 13. Determine promising, unexpanded node with the smallest bound. 1. That node is the node containing \[1, 3, 4\]. We visit its children next. 14. Visit node containing \[1, 3, 4, 2\]. 1. Because this node is a leaf, compute tour length to be 43. 15. Visit node containing \[1, 3, 4, 5\]. 1. Because this node is a leaf, compute tour length to be 34. 16. Determine promising, unexpanded node with the smallest bound. 1. The only promising, unexpanded node is the node containing \[1, 4\]. We visit its children next. 17. Visit node containing \[1, 4, 2\]. 1. Compute bound to be 45. 2. Determine that the node is nonpromising because its bound 45 is greater than or equal to 31, the value of *minlength*. 18. Visit node containing \[1, 4, 3\]. 1. Compute bound to be 38. 2. Determine that the node is nonpromising because its bound 38 is greater than or equal to 31, the value of *minlength*. 19. Visit node containing \[1, 4, 5\]. 1. Compute bound to be 30. 20. Determine promising, unexpanded node with the smallest bound. 1. The only promising, unexpanded node is the node containing \[1, 4, 5\]. We visit its children next . 21. Visit node containing \[1, 4, 5, 2\]. 1. Because this node is a leaf, compute tour length to be 30. 2. Because its length 30 is less than 31, the value of *minlength*, set *minlength* to 30. 22. Visit node containing \[1, 4, 5, 3\]. 1. Because this node is a leaf, compute tour length to be 48. 23. Determine promising, unexpanded node with the smallest bound. 1. There are no more promising, unexpanded nodes. We are done. We have determined that the node containing \[1, 4, 5, 2\], which represents the tour \[1, 4, 5, 2, 3, 1\], contains an optimal tour, and that the length of an optimal tour is 30. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** [![Click To expand](https://box.kancloud.cn/bfffd622e21031c405d13b6ae2c4a27e_350x277.jpg)](fig6-6_0.jpg) Figure 6.6: The pruned state space tree produced using best-first search with branch-and-bound pruning in [Example 6.3](#ch06ex05). At each node that is not a leaf in the state space tree, the partial tour is at the top and the bound on the length of any tour that could be obtained by expanding beyond the node is at the bottom. At each leaf in the state space tree, the tour is at the top and its length is at the bottom. The node shaded in color is the one at which an optimal tour is found. There are 17 nodes in the tree in [Figure 6.6](#ch06fig06), whereas the number of nodes in the entire state space tree is 1 + 4 + 4 x 3 + 4 x 3 x 2 = 41. We will use the following data type in the algorithm that implements the strategy used in the previous example: ``` struct node { int level; // the node's level in the tree ordered_set path; number bound; }; ``` The field *path* contains the partial tour stored at the node. For example, in [Figure 6.6](#ch06fig06) the value of *path* for the far left child of the root is \[1, 2\]. The algorithm follows. Algorithm 6.3**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)****The Best-First Search with Branch-and-Bound Pruning Algorithm for the Traveling Salesperson problem** Problem: Determine an optimal tour in a weighted, directed graph. The weights are nonnegative numbers. Inputs: a weighted, directed graph, and *n*, the number of vertices in the graph. The graph is represented by a two-dimensional array *W*, which has both its rows and columns indexed from 1 to *n*, where *W* \[i\] \[j\] is the weight on the edge from the *i*th vertex to the *j*th vertex. Outputs: variable *minlength*, whose value is the length of an optimal tour, and variable *opttour*, whose value is an optimal tour. ``` void travel 2 (int n, const number W[] [], ordered-set& opttour, number& minlength) { priority_queue_of_node PQ; node u, v; initialize (PQ); // Initialize PQ to be empty. v. level = 0; v. path = [1]; // Make first vertex the v. bound = bound (v); // starting one. minlength = ∞; insert (PQ, v); while (! empty (PQ)){ remove (PQ, v); // Remove node with best bound. if (v. bound < minlength){ u. level = v. level + 1; // Set u to a child of v. for (all i such that 2 ≤ i ≤ n && i is not in v. path){ u. path = v. path; put i at the end of u. path; if (u. level == n - 2){ // Check if next vertex put not put of only vertex // completes a tour. not in u. path at the end of u. path; put 1 at the end of u. path; // Make first vertex last one. if (length (u > minlength){ // Function length computes the minlength = length (u); // length of the tour. opttour = u. path; } } else{ u. bound = bound (u); if (u. bound < minlength) insert (PQ, u); } } } } ``` You are asked to write functions *length* and *bound* in the exercises. Function *length* returns the length of the tour *u. path*, and function *bound* returns the bound for a node using the considerations discussed. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** A problem does not necessarily have a unique bounding function. In the Traveling Salesperson problem, for example, we could observe that every vertex must be visited exactly once, and then use the minimums of the values in the columns in the adjacency matrix instead of the minimums of the values in the rows. Alternatively, we could take advantage of both the rows and the columns by noting that every vertex must be entered and exited exactly once. For a given edge, we could associate half of its weight with the vertex it leaves and the other half with the vertex it enters. The cost of visiting a vertex is then the sum of the weights associated with entering and exiting it. For example, suppose we are determining the initial bound on the length of a tour. The minimum cost of entering *v*2 is obtained by taking 1/2 of the minimum of the values in the second column. The minimum cost of exiting *v*2 is obtained by taking 1/2 of the minimum of the values in the second row. The minimum cost of visiting *v*2 is then given by ![](https://box.kancloud.cn/eb9a2150a4bf313d78ec1be790e99c2c_385x38.jpg) Using this bounding function, a branch-and-bound algorithm checks only 15 vertices in the instance in [Example 6.3](#ch06ex05). When two or more bounding functions are available, one bounding function may produce a better bound at one node whereas another produces a better bound at another node. Indeed, as you are asked to verify in the exercises, this is the case for our bounding functions for the Traveling Salesperson problem. When this is the case, the algorithm can compute bounds using all available bounding functions, and then use the best bound. However, as discussed in [Chapter 5](LiB0039.html#471), our goal is not to visit as few nodes as possible, but rather to maximize the overall efficiency of the algorithm. The extra computations done when using more than one bounding function may not be offset by the savings realized by visiting fewer nodes. Recall that a branch-and-bound algorithm might solve one large instance efficiently but check an exponential (or worse) number of nodes for another large instance, Returning to Nancy's dilemma, what is she to do if even the branch-and-bound algorithm cannot solve her 40-city instance efficiently? Another approach to handling problems such as the Traveling Salesperson problem is to develop approximation algorithms. ***Approximation algorithms*** are not guaranteed to yield optimal solutions, but rather yield solutions that are reasonably close to optimal. They are discussed in [Section 9.5](LiB0078.html#957). In that section we return to the Traveling Salesperson problem.