## 6.3 Abductive Inference (Diagnosis)

This section requires knowledge of discrete probability theory and Bayes' theorem.

An important problem in artificial intelligence and expert systems is determining the most probable explanation for some findings. For example, in medicine we want to determine the most probable set of diseases, given a set of symptoms. In the case of an electronic circuit, we want to find the most probable explanation for a failure at some point in the circuit. Another example is the determination of the most probable causes for the failure of an automobile to function properly. This process of determining the most probable explanation for a set of findings is called ***abductive inference***.

For the sake of focus, we use medical terminology. Assume that there are *n* diseases, *d*1, *d*2, …, *dn*, each of which may be present in a patient. We know that the patient has a certain set of symptoms *S.* Our goal is to find the set of diseases that is most probably present. Technically, there could be two or more sets that are equally most probable. However, we often discuss the problem as if a unique set is most probably present.

The ***Bayesian network*** has become a standard for representing probabilistic relationships such as those between diseases and symptoms. It is beyond our scope to discuss these networks here; they are discussed in detail in Neapolitan (1990, 2003) and Pearl (1988). For many Bayesian network applications, there exist efficient algorithms for determining the prior probability (before any symptoms are discovered) that a particular set of diseases contains the only diseases present in the patient. These algorithms are also discussed in Neapolitan (1990, 2003) and Pearl (1988). Here we simply assume that their results are available to us. For example, these algorithms can determine the prior probability that *d*1, *d*3, and *d*6 are the only diseases present in the patient. We denote this *probability* by

*p*(*D*), where *D* = {*d*1, *d*3, *d*6}.

These algorithms can also determine the probability that *d*1, *d*3, and *d*6 are the only diseases present, conditional on the information that the symptoms in *S* are present. We denote this *conditional probability* by

*p*(*D*|*S*), where *D* = {*d*1, *d*3, *d*6}.

Given that we can compute these probabilities (using the algorithms mentioned previously), we can solve the problem of determining the most probable set of diseases (conditional on the information that some symptoms are present) using a state space tree like the one in the 0–1 Knapsack problem. We go to the left of the root to include *d*1, and we go to the right to exclude it. Similarly, we go to the left of a node at level 1 to include *d*2, and we go to the right to exclude it, and so on. Each leaf in the state space tree represents a possible solution (that is, the set of diseases that have been included up to that leaf). To solve the problem, we compute the conditional probability of the set of diseases at each leaf and determine which one has the largest conditional probability. To prune using best-first search, we need a bounding function. The theorem following the sketch below accomplishes this for a large class of instances.
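To make the state space tree concrete, here is a minimal, self-contained C++ sketch, of our own devising and not part of the text, that enumerates the candidate disease sets by branching left to include each disease and right to exclude it; the function name `enumerate` is purely illustrative.

```
#include <iostream>
#include <vector>

// Walk the state space tree for n diseases: at each level, branch left
// to include disease d(level+1) and right to exclude it.  Each complete
// path (leaf) corresponds to one candidate set of diseases.
void enumerate(int level, int n, std::vector<int>& included) {
    if (level == n) {                  // a leaf: print one candidate set
        std::cout << "{ ";
        for (int d : included) std::cout << "d" << d << " ";
        std::cout << "}\n";
        return;
    }
    included.push_back(level + 1);     // left branch: include d(level+1)
    enumerate(level + 1, n, included);
    included.pop_back();               // right branch: exclude d(level+1)
    enumerate(level + 1, n, included);
}

int main() {
    std::vector<int> included;
    enumerate(0, 3, included);         // prints all 8 subsets of {d1, d2, d3}
}
```

Because the full tree has 2ⁿ leaves, computing a conditional probability at every leaf is infeasible for large *n*; the theorem below supplies the bound that lets best-first search prune most of the tree.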
Theorem 6.1

If *D* and *D′* are two sets of diseases such that

*p*(*D′*) ≤ *p*(*D*),

then

*p*(*D′*|*S*) ≤ *p*(*D*)/*p*(*S*).

Proof: According to Bayes' theorem,

*p*(*D′*|*S*) = *p*(*S*|*D′*) *p*(*D′*)/*p*(*S*) ≤ *p*(*S*|*D′*) *p*(*D*)/*p*(*S*) ≤ *p*(*D*)/*p*(*S*).

The first inequality is by the assumption in this theorem, and the second follows from the fact that any probability is less than or equal to 1. This proves the theorem.

For a given node, let *D* be the set of diseases that have been included up to that node, and for some descendant of that node, let *D′* be the set of diseases that have been included up to that descendant. Then *D* ⊆ *D′*. Often it is reasonable to assume that

*p*(*D′*) ≤ *p*(*D*) whenever *D* ⊆ *D′*.

The reason is that usually it is at least as probable that a patient has a given set of diseases as it is that the patient has that set plus even more diseases. (Recall that these are prior probabilities before any symptoms are observed.) If we make this assumption, then by [Theorem 6.1](#ch06ex07),

*p*(*D′*|*S*) ≤ *p*(*D*)/*p*(*S*).

Therefore, *p*(*D*)/*p*(*S*) is an upper bound on the conditional probability of the set of diseases in any descendant of the node. The following example illustrates how this bound is used to prune branches.

Example 6.4

Suppose there are four possible diseases *d*1, *d*2, *d*3, and *d*4 and a set of symptoms *S.* The input to this example would also include a Bayesian network containing the probabilistic relationships among the diseases and the symptoms. The probabilities used in this example would be computed from that Bayesian network using the methods discussed earlier; they are not computed elsewhere in this text. We simply assign them arbitrary values to illustrate the best-first search algorithm. When using the results of one algorithm (in this case, the one for doing inference in a Bayesian network) in another algorithm (in this case, the best-first search algorithm), it is important to recognize where the first algorithm supplies results that can simply be assumed in the second algorithm.

[Figure 6.7](#ch06fig07) is the pruned state space tree produced by a best-first search. Probabilities have been given arbitrary values in the tree. In each node, the conditional probability is at the top and the bound is at the bottom. The node shaded in color is the one at which the best solution is found. As was done in [Section 6.1](LiB0048.html#571), nodes are labeled according to their depth and position from the left in the tree. The variable *best* is the current best solution, whereas *p*(*best*|*S*) is its conditional probability. Our goal is to determine a value of *best* that maximizes this conditional probability. It is also assumed arbitrarily that

*p*(*S*) = 0.01.

The steps that produce the tree follow.

1. Visit node (0, 0) (the root).
   1. Compute its conditional probability: *p*(Ø|*S*) = 0.1. {Ø is the empty set. This means that no diseases are present.}
   2. Set *best* = Ø and *p*(*best*|*S*) = *p*(Ø|*S*) = 0.1.
   3. Compute its prior probability and bound: *p*(Ø) = 0.9 and *bound* = *p*(Ø)/*p*(*S*) = 0.9/0.01 = 90.
2. Visit node (1, 1).
   1. Compute its conditional probability: *p*(*d*1|*S*) = 0.4.
   2. Because 0.4 > *p*(*best*|*S*), set *best* = {*d*1} and *p*(*best*|*S*) = 0.4.
   3. Compute its prior probability and bound: *p*(*d*1) = 0.009 and *bound* = 0.009/0.01 = 0.9.
3. Visit node (1, 2).
   1. Its conditional probability is simply that of its parent—namely, 0.1.
   2. Its prior probability and bound are simply those of its parent—namely, 0.9 and 90.
4. Determine promising, unexpanded node with the largest bound.
   1. That node is node (1, 2). We visit its children next.
5. Visit node (2, 3).
   1. Compute its conditional probability: *p*(*d*2|*S*) = 0.1.
   2. Compute its prior probability and bound: *p*(*d*2) = 0.005 and *bound* = 0.005/0.01 = 0.5.
6. Visit node (2, 4).
   1. Its conditional probability is simply that of its parent—namely, 0.1.
   2. Its prior probability and bound are simply those of its parent—namely, 0.9 and 90.
7. Determine promising, unexpanded node with the largest bound.
   1. That node is node (2, 4). We visit its children next.
8. Visit node (3, 3).
   1. Compute its conditional probability, *p*(*d*3|*S*).
   2. Compute its prior probability and bound: *p*(*d*3) = 0.002 and *bound* = 0.002/0.01 = 0.2.
   3. Determine that it is nonpromising because its bound 0.2 is less than or equal to 0.4, the value of *p*(*best*|*S*).
9. Visit node (3, 4).
   1. Its conditional probability is simply that of its parent—namely, 0.1.
   2. Its prior probability and bound are simply those of its parent—namely, 0.9 and 90.
10. Determine promising, unexpanded node with the largest bound.
    1. That node is node (3, 4). We visit its children next.
11. Visit node (4, 3).
    1. Compute its conditional probability: *p*(*d*4|*S*) = 0.6.
    2. Because 0.6 > *p*(*best*|*S*), set *best* = {*d*4} and *p*(*best*|*S*) = 0.6.
    3. Set its bound to 0 because it is a leaf in the state space tree.
    4. At this point, node (2, 3) becomes nonpromising because its bound 0.5 is less than or equal to 0.6, the new value of *p*(*best*|*S*).
12. Visit node (4, 4).
    1. Its conditional probability is simply that of its parent—namely, 0.1.
    2. Set its bound to 0 because it is a leaf in the state space tree.
13. Determine promising, unexpanded node with the largest bound.
    1. That node is node (1, 1). We visit its children next.
14. Visit node (2, 1).
    1. Compute its conditional probability: *p*(*d*1, *d*2|*S*) = 0.1.
    2. Compute its prior probability and bound: *p*(*d*1, *d*2) = 0.003 and *bound* = 0.003/0.01 = 0.3.
    3. Determine that it is nonpromising because its bound 0.3 is less than or equal to 0.6, the value of *p*(*best*|*S*).
15. Visit node (2, 2).
    1. Its conditional probability is simply that of its parent—namely, 0.4.
    2. Its prior probability and bound are simply those of its parent—namely, 0.009 and 0.9.
16. Determine promising, unexpanded node with the largest bound.
    1. The only promising, unexpanded node is node (2, 2). We visit its children next.
17. Visit node (3, 1).
    1. Compute its conditional probability, *p*(*d*1, *d*3|*S*).
    2. Compute its prior probability and bound: *p*(*d*1, *d*3) = 0.001 and *bound* = 0.001/0.01 = 0.1.
    3. Determine that it is nonpromising because its bound 0.1 is less than or equal to 0.6, the value of *p*(*best*|*S*).
18. Visit node (3, 2).
    1. Its conditional probability is simply that of its parent—namely, 0.4.
    2. Its prior probability and bound are simply those of its parent—namely, 0.009 and 0.9.
19. Determine promising, unexpanded node with the largest bound.
    1. The only promising, unexpanded node is node (3, 2). We visit its children next.
20. Visit node (4, 1).
    1. Compute its conditional probability: *p*(*d*1, *d*4|*S*) = 0.65.
    2. Because 0.65 > *p*(*best*|*S*), set *best* = {*d*1, *d*4} and *p*(*best*|*S*) = 0.65.
    3. Set its bound to 0 because it is a leaf in the state space tree.
21. Visit node (4, 2).
    1. Its conditional probability is simply that of its parent—namely, 0.4.
    2. Set its bound to 0 because it is a leaf in the state space tree.
22. Determine promising, unexpanded node with the largest bound.
    1. There are no more promising, unexpanded nodes. We are done. We have determined that the most probable set of diseases is {*d*1, *d*4} and that *p*(*d*1, *d*4|*S*) = 0.65.

Figure 6.7: The pruned state space tree produced using best-first search with branch-and-bound pruning in [Example 6.4](#ch06ex08). At each node, the conditional probability of the diseases included up to that node is at the top, and the bound on the conditional probability that could be obtained by expanding beyond the node is at the bottom. The node shaded in color is the one at which an optimal set is found.

A reasonable strategy in this problem would be to initially sort the diseases in nonincreasing order according to their conditional probabilities. There is no guarantee, however, that this strategy will minimize the search time. We did not sort in [Example 6.4](#ch06ex08), and 15 nodes were checked. In the exercises, you establish that if the diseases were sorted, 23 nodes would be checked.

Next we present the algorithm. It uses the following declaration:

```
struct node
{
    int level;           // the node's level in the tree
    set_of_indices D;    // indices of the diseases included up to the node
    float bound;         // bound on p(D'|S) for any descendant's set D'
};
```

The field *D* contains the indices of the diseases included up to the node.
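The algorithm repeatedly removes the node whose *bound* field is largest. As one concrete (and purely illustrative) way to realize this in standard C++, the following sketch, which is ours and not part of the text's pseudocode, orders such nodes in a max-priority-queue keyed on *bound*:

```
#include <queue>
#include <set>
#include <vector>

// Illustrative stand-in for the text's set_of_indices type.
using set_of_indices = std::set<int>;

struct node {
    int level;          // the node's level in the tree
    set_of_indices D;   // indices of the diseases included up to the node
    float bound;        // upper bound on p(D'|S) for any descendant
};

// Order nodes so that the one with the LARGEST bound is removed first,
// as best-first search requires.
struct by_bound {
    bool operator()(const node& a, const node& b) const {
        return a.bound < b.bound;   // makes std::priority_queue a max-heap on bound
    }
};

using priority_queue_of_node =
    std::priority_queue<node, std::vector<node>, by_bound>;
```

With this ordering, the call remove(PQ, v) in the pseudocode below corresponds to copying PQ.top() into v and then popping it, so the node with the largest bound is always expanded next.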
One of the inputs to this algorithm is a Bayesian network *BN.* As mentioned previously, a Bayesian network represents the probabilistic relationships among diseases and symptoms. The algorithms referenced at the beginning of this section can compute the necessary probabilities from such a network. The following algorithm was developed by Cooper (1984):

Algorithm 6.4 **Cooper's Best-First Search with Branch-and-Bound Pruning Algorithm for Abductive Inference**

Problem: Determine a most probable set of diseases (explanation) given a set of symptoms. It is assumed that if a set of diseases *D* is a subset of a set of diseases *D′*, then *p*(*D′*) ≤ *p*(*D*).

Inputs: positive integer *n*, a Bayesian network *BN* representing the probabilistic relationships among *n* diseases and their symptoms, and a set of symptoms *S.*

Outputs: a set *best* containing the indices of the diseases in a most probable set (conditional on *S*), and a variable *pbest* that is the probability of *best* given *S*.

```
void cooper (int n,
             Bayesian_network_of_n_diseases BN,
             set_of_symptoms S,
             set_of_indices& best,
             float& pbest)
{
   priority_queue_of_node PQ;
   node u, v;

   v.level = 0;                      // Set v to the root.
   v.D = ∅;                          // Store empty set at root.
   best = ∅;
   pbest = p(∅|S);
   v.bound = bound(v);
   insert(PQ, v);
   while (! empty(PQ)){
      remove(PQ, v);                 // Remove node with largest bound.
      if (v.bound > pbest){          // Check if node is promising.
         u.level = v.level + 1;      // Set u to a child of v.
         u.D = v.D;                  // Set u to the child that
         put u.level in u.D;         //   includes the next disease.
         if (p(u.D|S) > pbest){
            best = u.D;
            pbest = p(u.D|S);
         }
         u.bound = bound(u);
         if (u.bound > pbest)
            insert(PQ, u);
         u.D = v.D;                  // Set u to the child that does
         u.bound = bound(u);         //   not include the next disease.
         if (u.bound > pbest)
            insert(PQ, u);
      }
   }
}

float bound (node u)
{
   if (u.level == n)                 // A leaf is nonpromising.
      return 0;
   else
      return p(u.D)/p(S);
}
```

The notation *p*(*D*) stands for the prior probability of *D*, *p*(*S*) stands for the prior probability of *S*, and *p*(*D*|*S*) stands for the conditional probability of *D* given *S*. These values would be computed from the Bayesian network *BN* using the algorithms referenced at the beginning of this section.

We have written the algorithm strictly according to our guidelines for writing best-first search algorithms. An improvement is possible. There is no need to call function *bound* for the right child of a node. The reason is that the right child contains the same set of diseases as the node itself, which means that its bound is the same. Therefore, the right child is pruned only if, at the left child, we change *pbest* to a value greater than or equal to this bound. We can modify our algorithm to prune the right child when this happens and to expand to the right child when it does not happen; a sketch of this modification appears at the end of the section.

Like the other problems described in this chapter, the problem of Abductive Inference is in the class of problems discussed in [Chapter 9](LiB0074.html#880). If there is more than one solution, the preceding algorithm produces only one of them. It is straightforward to modify the algorithm to produce all of the best solutions. It is also possible to modify it to produce the *m* most probable explanations, where *m* is any positive integer. This modification is discussed in Neapolitan (1990), which also analyzes the algorithm in detail.
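To make the improvement concrete, here is a sketch, ours rather than Cooper's published version, of how the expansion step inside the while loop could be rewritten so that *bound* is never called for the right child; it assumes the same pseudocode conventions as Algorithm 6.4:

```
// Modified expansion step: the right child contains the same diseases
// as v, so its bound equals v.bound and need not be recomputed.
u.level = v.level + 1;
u.D = v.D;                             // Left child: include the next disease.
put u.level in u.D;
if (p(u.D|S) > pbest){
   best = u.D;
   pbest = p(u.D|S);
}
u.bound = bound(u);
if (u.bound > pbest)
   insert(PQ, u);
if (u.level < n && v.bound > pbest){   // Right child: same disease set as v,
   u.D = v.D;                          //   so its bound is v.bound.  A leaf
   u.bound = v.bound;                  //   right child cannot improve best
   insert(PQ, u);                      //   and is therefore not inserted.
}
```

If *pbest* was raised at the left child to a value greater than or equal to *v.bound*, the final test fails and the right child is pruned without a bound computation, exactly as described above; otherwise the right child is inserted with its inherited bound.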