This paper appeared at the 1st International Joint Workshop on Artificial Intelligence and Operations Research, Timberline, Oregon, 1995.
Solving Problems with Hard and Soft Constraints Using a Stochastic Algorithm for MAX-SAT
Yuejun Jiang, Henry Kautz, and Bart Selman
AT&T Bell Laboratories
Direct correspondence to: Henry Kautz, 600 Mountain Ave., Room 2C-407, Murray Hill, NJ 07974, kautz@research.att.com
Stochastic local search is an effective technique for solving certain classes of large, hard propositional satisfiability problems, including propositional encodings of problems such as circuit synthesis and graph coloring (Selman, Levesque, and Mitchell 1992; Selman, Kautz, and Cohen 1994). Many problems of interest to AI and operations research cannot be conveniently encoded as simple satisfiability, because they involve both hard and soft constraints; that is, any solution may have to violate some of the less important constraints. We show how both kinds of constraints can be handled by encoding problems as instances of weighted MAX-SAT (finding a model that maximizes the sum of the weights of the satisfied clauses that make up a problem instance). We generalize our local-search algorithm for satisfiability (GSAT) to handle weighted MAX-SAT, and present experimental results on encodings of the Steiner tree problem, a well-studied hard combinatorial search problem. On many problems this approach turns out to be competitive with the best current specialized Steiner tree algorithms developed in operations research. Our positive results demonstrate that it is practical to use domain-independent logical representations with a general search procedure to solve interesting classes of hard combinatorial search problems.
1 Introduction
Traditional satisfiability-testing algorithms are based on backtracking search (Davis and Putnam 1960). Surprisingly few search heuristics have proven to be generally useful; increases in the size of problems that can be practically solved have come mainly from increases in machine speed and more efficient implementations (Trick and Johnson 1993). Selman, Levesque, and Mitchell (1992) introduced an alternative approach for satisfiability testing, based on stochastic local search. This algorithm, called GSAT, is only a partial decision procedure: it cannot be used to prove that a formula is unsatisfiable, but only to find models of satisfiable ones. It also does not work on problems where the structure of the local search space yields no information about the location of global optima (Ginsberg and McAllester 1994).

However, GSAT is very useful in practice. For example, it is the only approach that can solve certain very large, computationally hard formulas derived from circuit synthesis problems (Selman, Kautz, and Cohen 1994). It can also solve randomly generated Boolean formulas that are two orders of magnitude larger than the largest handled by any current backtracking algorithm (Selman and Kautz 1993a).

The success of stochastic local search in handling formulas that contain thousands of discrete variables has made it a viable approach for directly solving logical encodings of interesting problems in AI and operations research (OR), such as circuit diagnosis and planning (Selman and Kautz 1993b). Thus, at least on certain classes of problems, it provides a general model-finding technique that scales to realistically sized instances, demonstrating that the use of a purely declarative, logical representation is not necessarily in conflict with the need for computational efficiency. One issue that arises in studying this approach to problem solving is developing problem encodings where a solution corresponds to a satisfying model (Kautz and Selman 1992), instead of having a solution correspond to a refutation proof (Green 1969).
For some kinds of problems, however, no useful encoding in terms of propositional satisfiability can be found, in particular for problems that contain both hard and soft constraints. Each clause in a CNF (conjunctive normal form) formula can be viewed as a constraint on the values (true or false) assigned to each variable. For satisfiability, all clauses are equally important, and all clauses must evaluate to true in a satisfying model. Many problems, however, contain two classes of constraints: hard constraints that must be satisfied by any solution, and soft constraints, of different relative importance, that may or may not be satisfied. In the language of operations research, the hard constraints specify the set of feasible solutions, and the soft constraints specify a function to be optimized in choosing between the feasible solutions.

When both kinds of constraints are represented by clauses, the formula constructed by conjoining all the clauses is likely to be unsatisfiable. In order to find a solution to the original problem using an ordinary satisfiability procedure, it is necessary to repeatedly try to exclude different subsets of the soft constraints from the problem representation, until a satisfiable formula is found. Performing such a search through the space of soft constraints, taking into account their relative importance, can be complex and costly in a practical sense, even when the theoretical complexity of the entire process is the same as ordinary satisfiability.

A more natural representation for many problems involving hard and soft constraints is weighted maximum satisfiability (MAX-SAT). An instance of weighted MAX-SAT consists of a set of propositional clauses, each associated with a positive integer weight. If a clause is not satisfied by a truth assignment, its weight is added to the total cost of that assignment. A solution is a truth assignment that maximizes the sum of the weights of the satisfied clauses (or, equivalently, that minimizes the sum of the weights of the unsatisfied clauses). Note that if the sum of the weights of all clauses that correspond to the soft constraints in the encoding of some problem is $W$, and each hard constraint is represented by a clause of weight greater than $W$, then assignments that violate clauses of total weight $W$ or less exactly correspond to feasible solutions to the original problem.

The basic GSAT algorithm can be generalized, as we will show, to handle weighted MAX-SAT in an efficient manner. An important difference between simple SAT and weighted MAX-SAT problems is that for the latter, but not the former, near (approximate) solutions are generally of value.

The main experimental work described in this paper is on Boolean encodings of network Steiner tree problems. These problems have many applications in network design and routing, and have been intensively studied in operations research for several decades (Hwang et al. 1992). We worked on a well-known set of benchmark problems, and compared our performance with the best published results. One of our implicit goals in this work is to develop representations and algorithms that provide state-of-the-art performance, and advance research in both the AI and operations research communities (Ginsberg 1994).
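The cost definition for weighted MAX-SAT, and the rule that a hard clause must outweigh all soft clauses combined, can be illustrated with a minimal Python sketch. The clause representation (signed integers for literals), the helper names `is_satisfied` and `cost`, and the toy instance are assumptions made for illustration only; they do not come from the paper.

```python
from typing import Dict, List, Tuple

# A literal is a nonzero int: +v means variable v is true, -v means it is false.
Clause = Tuple[int, ...]
WeightedClause = Tuple[int, Clause]  # (positive integer weight, clause)


def is_satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    """A clause is satisfied when at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def cost(clauses: List[WeightedClause], assignment: Dict[int, bool]) -> int:
    """Sum of the weights of the clauses left unsatisfied by the assignment."""
    return sum(w for w, c in clauses if not is_satisfied(c, assignment))


# Hypothetical toy instance: two soft unit clauses and one hard clause.
soft = [(2, (-1,)), (3, (-2,))]   # prefer x1 false (weight 2) and x2 false (weight 3)
W = sum(w for w, _ in soft)       # total soft weight
hard = [(W + 1, (1, 2))]          # (x1 or x2) must hold; its weight exceeds W
instance = soft + hard

feasible = {1: True, 2: False}    # violates only the soft clause of weight 2
infeasible = {1: False, 2: False} # violates the hard clause

print(cost(instance, feasible), cost(instance, infeasible), W)
```

Any assignment whose cost is at most `W` satisfies every hard clause, which is the feasibility test described above.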
Not all possible MAX-SAT encodings of an optimization problem are equally good. For practical applications, the final size of the encoding is crucial, and even a low-order polynomial blowup in size may be unacceptable. The number of clauses in a straightforward propositional encoding of a Steiner tree problem is quadratic in the (possibly very large) number of edges in the given graph. We therefore developed an alternative encoding that is instead linear in the number of edges. This saving is not completely free, because the alternative representation only approximates the original problem instance; that is, theoretically it might not lead to an optimal solution. Nonetheless, the experimental results we have obtained using this encoding and our stochastic local search algorithm are competitive in terms of both solution quality and speed with the best specialized Steiner tree algorithms from the operations research literature.

The general approach used in our alternative representation of Steiner problems is to break the problem down into small, tractable subproblems, pre-compute a set of near-optimal solutions to each subproblem, and then use MAX-SAT to assemble a global solution by picking elements from the pre-computed sets. This general technique is applicable to other kinds of problems in AI and operations research.

In a sense this paper describes a line of research that has come full circle: much of the initial motivation for our earlier work on local search for satisfiability testing came from work by Adorf and Johnston (1990) and Minton et al. (1990) on using local search for scheduling problems that did involve both hard and soft constraints. Thus, we turned a method for optimization problems into one for decision problems, and now are returning to optimization problems. However, instead of creating different local search algorithms for each problem domain, we translate instances from different domains into weighted CNF, and use one general, highly optimized search algorithm. Thus we retain the use of purely propositional problem representations, and our finely tuned randomized techniques for escaping from local minima during search.
2 A Stochastic Search Algorithm
The GSAT procedure mentioned in the introduction solves satisfiability problems by searching through the space of truth assignments for one that satisfies all clauses (Selman, Levesque, and Mitchell 1992). The search begins at a random complete truth assignment. The neighborhood of a point in the search space is defined as the set of assignments that differ from that point by the value assigned to a single variable. Each step in the search thus corresponds to flipping the truth value assigned to a variable. The basic search heuristic is to move in the direction that maximizes the number of satisfied clauses. Similar local-search approaches to satisfiability testing have also been investigated by Hansen and Jaumard (1990) and Gu (1992). Thus GSAT can already be viewed as a special kind of MAX-SAT procedure, where all clauses are treated uniformly, and which is run until a completely satisfying model is found.

We have experimented with many modifications to the search heuristic, and currently obtain the best performance with the following specific strategy for picking a variable to change. First, a clause in the problem instance that is unsatisfied by the current assignment is chosen at random; the variable to be flipped will come from this clause. Next, a coin is flipped. If it comes up heads (with a probability that is one of the parameters to the procedure), then a variable that appears in the clause is chosen at random. This kind of choice is called a random walk. If the coin comes up tails instead, then the algorithm chooses a variable from the clause that, when flipped, will cause as few currently satisfied clauses as possible to become unsatisfied. This kind of choice is called a greedy move. Note that flipping a variable chosen in this manner will always make the chosen clause satisfied, and will tend to increase the overall number of satisfied clauses, but will sometimes in fact decrease it.
This refinement of GSAT was called WSAT (for "walksat") in Selman, Kautz, and Cohen (1994). The weighted MAX-SAT version of Walksat, shown in Fig. 1, uses the sum of the weights of the affected clauses in computing the greedy moves. The parameter HARD-LIMIT is set by the user to indicate that any clause with that weight or greater should be considered a hard constraint. The algorithm searches for MAX-FLIPS steps, or until the sum of the weights of the unsatisfied clauses is less than or equal to the TARGET weight. If the target is not reached, then a new initial assignment is chosen and the process repeats, up to MAX-TRIES times. The parameter NOISE controls the amount of stochastic noise in the search, by adjusting the ratio of random-walk to greedy moves. The best performance on the problems in this paper was found when NOISE = 0.2.

procedure Walksat(WEIGHTED-CLAUSES, HARD-LIMIT, MAX-FLIPS, TARGET, MAX-TRIES, NOISE)
  TOPLOOP:
  for I := 1 to MAX-TRIES do
    M := a random truth assignment over the variables that appear in WEIGHTED-CLAUSES;
    HARD-UNSAT := clauses not satisfied by M with weight >= HARD-LIMIT;
    SOFT-UNSAT := clauses not satisfied by M with weight < HARD-LIMIT;
    BAD := sum of the weights of HARD-UNSAT and SOFT-UNSAT;
    for J := 1 to MAX-FLIPS do
      if BAD <= TARGET then break from TOPLOOP; endif
      if HARD-UNSAT is not empty then
        C := a random member of HARD-UNSAT;
      else
        C := a random member of SOFT-UNSAT;
      endif
      Flip a coin that has probability NOISE of heads;
      if heads then
        P := a randomly chosen variable that appears in C;
      else
        for each proposition Q that appears in C do
          BREAKCOUNT[Q] := 0;
          for each clause C' that contains Q do
            if C' is satisfied by M, but not satisfied if Q is flipped then
              BREAKCOUNT[Q] += weight of C';
            endif
          endfor
        endfor
        P := a randomly chosen variable Q that appears in C and whose BREAKCOUNT[Q] value is minimal;
      endif
      Flip the value assigned to P by M;
      Update HARD-UNSAT, SOFT-UNSAT, and BAD;
    endfor
  endfor
  print "Weight of unsatisfied clauses is", BAD;
  print M;
end Walksat.

Figure 1: The Walksat procedure for weighted MAX-SAT problems.

Walksat is biased toward satisfying hard constraints before soft constraints. However, while working on the soft constraints, one or more hard constraints may again become unsatisfied. Thus, the search proceeds through a mixture of feasible and infeasible solutions. This is in sharp contrast with standard operations research methods, which generally work by stepping from feasible solution to feasible solution. Such methods are at least guaranteed (by definition) to find a local minimum in the space of feasible solutions. There is no such guarantee for our approach. It therefore becomes an empirical question whether local search on a weighted MAX-SAT encoding of problems with both hard and soft constraints works even moderately well.

Our initial test problems were encodings of airline scheduling problems that had been studied by researchers in constraint logic programming (CLP) (Lever and Richards 1994). The results were encouraging; we found solutions approximately 10 to 100 times faster than the CLP approach. However, for the purposes of this paper, we wished to work on a larger test set that had been studied more intensively over a longer period of time. We found such a set of benchmark problems in the operations research community, as we describe in the next section.
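As a minimal sketch of the procedure in Fig. 1 (not the authors' optimized implementation), the following Python version recomputes the set of unsatisfied clauses from scratch after every flip instead of updating it incrementally. The clause representation (signed integers for literals), the function name `walksat`, the `rng` parameter, the toy instance, and the choice to return the lowest-cost assignment visited rather than print the final one are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

Clause = Tuple[int, ...]             # literals: +v means v is true, -v means v is false
WeightedClause = Tuple[int, Clause]  # (positive integer weight, clause)


def walksat(clauses: List[WeightedClause], hard_limit: int, max_flips: int,
            target: int, max_tries: int, noise: float,
            rng: Optional[random.Random] = None
            ) -> Tuple[Optional[int], Dict[int, bool]]:
    """Return (cost, assignment) for the lowest-cost assignment visited."""
    rng = rng or random.Random()
    variables = sorted({abs(lit) for _, c in clauses for lit in c})
    # occurs[v] lists the indices of the clauses that mention variable v.
    occurs: Dict[int, List[int]] = {v: [] for v in variables}
    for i, (_, c) in enumerate(clauses):
        for v in {abs(lit) for lit in c}:
            occurs[v].append(i)

    def sat(i: int, m: Dict[int, bool]) -> bool:
        return any(m[abs(lit)] == (lit > 0) for lit in clauses[i][1])

    def break_count(q: int, m: Dict[int, bool]) -> int:
        """Weight of clauses satisfied now that become unsatisfied if q is flipped."""
        satisfied_now = [i for i in occurs[q] if sat(i, m)]
        m[q] = not m[q]                                   # tentative flip
        broken = sum(clauses[i][0] for i in satisfied_now if not sat(i, m))
        m[q] = not m[q]                                   # undo
        return broken

    best_cost: Optional[int] = None
    best: Dict[int, bool] = {}
    for _ in range(max_tries):
        m = {v: rng.random() < 0.5 for v in variables}    # new random start
        for flip in range(max_flips + 1):
            unsat = [i for i in range(len(clauses)) if not sat(i, m)]
            bad = sum(clauses[i][0] for i in unsat)
            if best_cost is None or bad < best_cost:
                best_cost, best = bad, dict(m)
            if bad <= target or not unsat:
                return best_cost, best
            if flip == max_flips:
                break                                     # flip budget used up
            # Prefer repairing a violated hard clause over a soft one.
            hard_unsat = [i for i in unsat if clauses[i][0] >= hard_limit]
            chosen = clauses[rng.choice(hard_unsat or unsat)][1]
            candidates = sorted({abs(lit) for lit in chosen})
            if rng.random() < noise:                      # random-walk move
                p = rng.choice(candidates)
            else:                                         # greedy move
                counts = {q: break_count(q, m) for q in candidates}
                lowest = min(counts.values())
                p = rng.choice([q for q in candidates if counts[q] == lowest])
            m[p] = not m[p]
    return best_cost, best


# Hypothetical toy instance: soft clauses (not x1), (not x2); hard clause (x1 or x2).
toy = [(2, (-1,)), (3, (-2,)), (6, (1, 2))]
cost, model = walksat(toy, hard_limit=6, max_flips=100, target=2,
                      max_tries=10, noise=0.2, rng=random.Random(0))
print(cost, model)
```

A practical implementation would maintain the unsatisfied-clause sets and break counts incrementally, as the "Update" step in Fig. 1 suggests; the full rescan here only keeps the sketch short.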
3 Steiner Tree Problems
Network Steiner tree problems have long been studied in operations research (Hwang et al. 1992), and many well-known, hard benchmark instances are available. The problems we used can be obtained by ftp from the OR Repository at Imperial College (mscmga.ms.ic.ac.uk). We ran our experiments on these problems so that our results could be readily compared against those of the best competing approaches.

A network Steiner tree problem consists of an undirected graph, where each edge is assigned a positive integer cost, and a subset of its nodes, called the Steiner nodes. The goal is to find a subtree of the graph that spans the Steiner nodes, such that the sum of the costs of the edges of the tree is minimal. Fig. 2 shows an example of a Steiner problem. The top figure shows the graph, where the Steiner nodes are nodes 1, 2, 3, 6, and 7. The weights are given along the edges. The bottom figure shows a Steiner tree connecting those nodes. Note that the solution involves two non-Steiner nodes (4 and 5). In general, finding such a minimal Steiner tree is NP-hard (the corresponding decision problem is NP-complete).

Figure 2: An example of a network Steiner problem and its solution.

There is a direct translation of Steiner problems into MAX-SAT. The encoding requires $2|E|^2$ variables, where $|E|$ is the number of edges in the entire graph. While this encoding is of theoretical interest, it is not practical for realistically sized problems: even a quadratic blowup in the number of variables relative to the number of edges in the original instance is simply too large. As we will see below, many of the problems we wish to handle contain over 10,000 edges, and we cannot hope to process a formula containing 100,000,000 variables! Therefore we developed an alternative encoding of Steiner tree problems that is only linearly dependent on the number of edges.

The intuition behind our encoding is that the original problem is broken down into a set of tractable subproblems; a range of near-optimal solutions to the subproblems is pre-computed; and then MAX-SAT is used to combine a selection of solutions to the subproblems to create a global solution. For Steiner tree problems, the subproblems are smaller Steiner trees that connect just pairs of nodes from the original Steiner set. Such two-node Steiner problems are tractable, because a solution is simply the shortest path between the nodes. A range of near-optimal solutions, i.e., the shortest path, the next shortest path, etc., can be generated using a modified version of Dijkstra's algorithm. This approach only approximates the original problem instance, because we do not generate all paths between pairs of nodes, but only the $k$ shortest paths for some fixed $k$. (We discuss the choice of $k$ below.) Pathological problem instances exist that require very non-optimal subproblem solutions. However, we shall see that the approach works quite well in practice.

We illustrate the encoding using the example from Fig. 2. First, we introduce a variable for each edge of the graph. For example, the edge between nodes 1 and 2 is represented by the variable $e_{1,2}$. The interpretation of the variable is that if the variable is true, then the corresponding edge is part of the Steiner tree. To capture the cost of including this edge in the tree, we include a unit clause of the form $\neg e_{1,2}$ with weight 2, the cost of the edge. This clause is a soft constraint. Note that when this edge is included in the solution, i.e., $e_{1,2}$ is true, this clause is unsatisfied, so the truth assignment incurs a cost of 2. Similarly, we have a clause for every edge.

Second, we list the Steiner nodes in an arbitrary order, and then for each successive pair of nodes in this list, we generate the $k$ shortest paths between the nodes. We associate a variable with each path. For example, if $k = 2$, then the two shortest paths between Steiner nodes 1 and 2 are 1-2 and 1-4-2. We name the variables $p_{1\text{-}2}$ and $p_{1\text{-}4\text{-}2}$.

Third, we introduce hard constraints that assert that a solution must contain a path between each pair of Steiner nodes. For example, the clause $(p_{1\text{-}2} \lor p_{1\text{-}4\text{-}2})$ is a hard constraint, and is therefore assigned a high weight (greater than the sum of the weights of all soft constraints). Hard constraints also assert that if a path appears in a solution, then the edges it contains appear. For example, for the path 1-4-2, we introduce the clauses $(p_{1\text{-}4\text{-}2} \rightarrow e_{1,4})$ and $(p_{1\text{-}4\text{-}2} \rightarrow e_{4,2})$, that is, $(\neg p_{1\text{-}4\text{-}2} \lor e_{1,4})$ and $(\neg p_{1\text{-}4\text{-}2} \lor e_{4,2})$. This concludes our encoding.
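The three encoding steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not spell out its modified Dijkstra's algorithm, so `k_shortest_paths` substitutes a simple best-first enumeration of simple paths (exponential in the worst case, adequate only for small graphs). The variable-naming scheme (`e_u_v`, `p_a-...-b`), the function names, and the example graph are hypothetical; the graph is not the one in Fig. 2.

```python
import heapq
from typing import Dict, List, Tuple

Edge = Tuple[int, int]                  # undirected edge stored as (smaller, larger)
Literal = Tuple[str, bool]              # (variable name, required truth value)
WeightedClause = Tuple[int, List[Literal]]


def norm(u: int, v: int) -> Edge:
    return (u, v) if u < v else (v, u)


def k_shortest_paths(edges: Dict[Edge, int], src: int, dst: int,
                     k: int) -> List[Tuple[int, ...]]:
    """Up to k cheapest simple paths from src to dst, by best-first enumeration."""
    adj: Dict[int, List[Tuple[int, int]]] = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    heap: List[Tuple[int, Tuple[int, ...]]] = [(0, (src,))]
    found: List[Tuple[int, ...]] = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        if path[-1] == dst:
            found.append(path)          # paths are popped in nondecreasing cost order
            continue
        for nxt, w in adj.get(path[-1], []):
            if nxt not in path:         # keep paths simple
                heapq.heappush(heap, (cost + w, path + (nxt,)))
    return found


def encode_steiner(edges: Dict[Edge, int], steiner: List[int],
                   k: int) -> List[WeightedClause]:
    clauses: List[WeightedClause] = []
    # Step 1: one soft unit clause (not e) per edge, weighted by the edge cost.
    for (u, v), w in edges.items():
        x, y = norm(u, v)
        clauses.append((w, [(f"e_{x}_{y}", False)]))
    hard = sum(edges.values()) + 1      # exceeds the total soft weight
    # Step 2: k shortest paths for each successive pair in the given node order.
    for a, b in zip(steiner, steiner[1:]):
        paths = k_shortest_paths(edges, a, b, k)
        names = ["p_" + "-".join(map(str, p)) for p in paths]
        # Step 3a: at least one pre-computed path must be chosen for the pair.
        clauses.append((hard, [(n, True) for n in names]))
        # Step 3b: a chosen path forces each of its edges (not p, or e).
        for n, p in zip(names, paths):
            for u, v in zip(p, p[1:]):
                x, y = norm(u, v)
                clauses.append((hard, [(n, False), (f"e_{x}_{y}", True)]))
    return clauses


# Hypothetical graph (not the one in Fig. 2): edge -> cost.
graph = {(1, 2): 2, (1, 4): 1, (2, 4): 1, (2, 3): 1}
encoding = encode_steiner(graph, steiner=[1, 2, 3], k=2)
for weight, clause in encoding:
    print(weight, clause)
```

Each clause is a weight together with a list of literals, so the output can be translated directly into whatever weighted CNF format a MAX-SAT solver expects.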
The encoding requires $|E| + k(|S| - 1)$ variables, where $|E|$ is the number of edges in the graph, $|S|$ is the number of Steiner nodes, and $k$ is the number of shortest paths pre-computed between each pair. The total number of clauses is $O(|E| + kL(|S| - 1))$, where $L$ is the maximum number of edges in any of the pre-computed paths.
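To give a feel for these sizes, consider a hypothetical instance with $|E| = 10{,}000$ edges, $|S| = 10$ Steiner nodes, $k = 20$ paths per pair, and paths of at most $L = 15$ edges. All four values are illustrative assumptions, not figures from the benchmark tables. Taking the direct encoding to need $2|E|^2$ variables, and counting the path-based encoding as one variable and one unit clause per edge, one variable per pre-computed path, one path-choice clause per successive pair, and at most $kL$ implication clauses per pair, gives:

```latex
\begin{align*}
\text{direct encoding, variables:}\quad & 2|E|^2 = 2 \times (10^4)^2 = 2 \times 10^8 \\
\text{path encoding, variables:}\quad & |E| + k(|S| - 1) = 10{,}000 + 20 \times 9 = 10{,}180 \\
\text{path encoding, clauses (upper bound):}\quad & |E| + (|S| - 1)(1 + kL) = 10{,}000 + 9 \times 301 = 12{,}709
\end{align*}
```

Under these assumptions the path-based encoding is roughly four orders of magnitude smaller than the direct one.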
4 Empirical Results
A good description of our benchmark problems appears in Beasley (1989). The set contains four classes (B, C, D, E) of problem instances of increasing size and complexity. We omitted class B because the problems are small and easy to solve. Each class has 20 instances. Tables 1, 2, and 3 contain our results, as well as those of the two best specialized Steiner tree algorithms, as reported in Beasley (1989) and Chopra et al. (1992). In the tables, $|V|$ denotes the number of nodes in the graph, $|E|$ the number of edges, and $|S|$ the number of Steiner nodes. The columns labeled "Soln" give the weight of the best Steiner tree found by each method. The solutions found by Chopra et al. are globally optimal, except for instance E18. For some problems we also give the second-best solution (labeled "Soln2") found by Walksat, to indicate how effective the procedure can be in practice, since it may locate a near-global optimum in a very short time.

Walksat ran on an SGI Challenge with a 150 MHz MIPS R4400 processor. Beasley's algorithm ran on a Cray XMP, and Chopra's on a Vax 8700. A hyphen in the table in the case of Beasley's algorithm indicates that the problem was not solved after 21,600 seconds; in the case of Chopra's algorithm, it indicates that the problem was not solved after 10 days.

We have not attempted to adjust the numbers for machine speed. Caution must be used in comparing different algorithms running on radically different kinds of hardware (the SGI has a RISC architecture, the Vax is CISC, and the Cray is a parallel vector processing machine). The SGI is rated at 136 MIPS, while the Vax is rated at 6 MIPS. This would indicate a ratio of about 22 in relative speed; however, at least one user of both machines (Johnson 1994) reports a maximum speedup factor of 15 on combinatorial algorithms, with as small a factor as 3 on large instances. The Cray is rated at 230 peak MIPS, which would appear to be faster than the SGI; however, Cray Research also reports that code that performs no vector processing at all runs at only 30 MIPS. Thus, differences in hardware could account for a speedup of between 3 and 22 when comparing Chopra's Vax to our SGI, and of between 0.6 and 4.5 when comparing Beasley's Cray to our SGI. In any case, this indicates that not all of the differences in performance described below can be attributed to differences in machine speed.

We found that we could obtain good solutions with a value of $k$, the number of pre-computed paths between pairs of nodes, of up to 150 for the smaller instances (at most 10 Steiner nodes), and up to 20 for the larger instances.
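The hardware speed ratios quoted above follow directly from the MIPS ratings given in the text; as a quick check:

```latex
\begin{align*}
\text{SGI vs. Vax (rated MIPS):}\quad & 136 / 6 \approx 22.7 \\
\text{SGI vs. Cray (peak MIPS):}\quad & 136 / 230 \approx 0.59 \\
\text{SGI vs. Cray (no vector processing):}\quad & 136 / 30 \approx 4.5
\end{align*}
```

The lower bound of 3 for the Vax comparison is not derived from the ratings; it is the smallest speedup reported by Johnson (1994) on large instances.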
The timing results for Walksat are averaged over 10 runs. The running times in the tables do not include the time to pre-compute the set of paths between successive Steiner nodes. This is reasonable because in practice one often deals with a fixed network, and wants to compute Steiner trees for many different subsets of nodes. For example, in teleconferencing applications, the network is fixed, and each problem instance involves finding a Steiner tree to connect a set of sites. Given a fixed network, one can pre-compute, using Dijkstra's algorithm, sets of paths between every pair of nodes.

From the tables we can see that for problems with up to 10 Steiner nodes, Walksat usually finds an optimal solution at least as fast as the other two approaches, even allowing for differences in machine speeds. For example, for D1 and D2, Walksat is about 100 times faster than the other two in reaching the global optimum. For D6, Walksat runs about 50 times faster than Beasley and 30 times faster than Chopra. The difference is particularly dramatic for E1, where Walksat finds the optimal solution in less than 1 second, and Beasley and Chopra both take over 1,000 seconds. On E2, Walksat takes about 800 seconds to reach the global optimum 214, which is comparable to Chopra's 6000 seconds (a ratio of 7.5). Walksat takes only about 28 seconds to reach a tree with weight 216, compared to Beasley, who takes 7000 seconds to reach only 231. On E6, Walksat takes less than 2 seconds, compared to over 670 seconds for Chopra. A near-optimal solution takes less than 1 second, compared to 1700 seconds for Beasley.

Surprisingly, Walksat can locate some of the optimal and near-optimal solutions for the large E-class instances that cannot be found by Beasley in a reasonable amount of time. For example, for E12, Walksat finds a local optimum of 68, which was not reached by Beasley within the time limit of 21,600 seconds. For E7, Walksat finds the global optimum of 145, while Beasley only reaches 157.

On problems with a larger number of Steiner nodes, Walksat usually produces less optimal solutions than the other two methods. The problem Walksat has on instances with a large number of Steiner nodes may be due to the fact that the MAX-SAT encodings simply become too large to be processed efficiently. (For example, the number of flips per second goes down significantly on very large formulas.) Nonetheless, given the fact that Walksat is a completely general algorithm, as opposed to the specialized algorithms of Beasley and Chopra, it performs surprisingly well on these hard benchmark problems.

It is important to note that Walksat scales up to problems based on large graphs, especially when the set of Steiner nodes is relatively small. This should be contrasted with some other local-search style approaches to solving Steiner trees using simulated annealing (Dowsland 1991) and genetic algorithms (Kapsalis et al. 1993). Despite the fact that these local search algorithms were designed specifically for solving Steiner problems, they can only handle the smallest instances in the B and C classes. This has led Hwang et al. (1992, page 172) to conclude that simulated annealing and hill climbing (a form of local search) are ill-suited for Steiner tree problems. However, our work demonstrates that local search can in fact be successful for Steiner problems.

In Tables 1 and 2, CPU times are in seconds (Beasley: Cray XMP; Chopra et al.: Vax 8700; Walksat: SGI). Soln2 is the second-best solution found by Walksat, where reported.

             Beasley            Chopra et al.       Walksat
ID      |S|    Soln   CPU secs    Soln    CPU secs   Soln1   CPU secs   Soln2
C1        5      85     113.57      85        27.3      85       1.11
C2       10     144       5.84     144       811.7     144      72.69     146
C3       83     766     152.78     754       543.4     808       0.05
C4      125    1094       3.61    1079       509.6    1128       0.09
C5      250    1594       2.73    1579       473.9    1654       0.12
C6        5      55      48.55      55        48.9      55       3.41
C7       10     106       4.44     102        83.2     102       3.02     103
C8       83     524       8.63     509       674.4     553       0.07
C9      125     722     198.97     707      1866.3     754       0.09
C10     250    1112       4.53    1093       245.6    1169       0.16
C11       5      34     188.02      32       333.3      32       0.44      34
C12      10      48      25.04      46       119.8      46      65.64      47
C13      83     265     166.53     258      9170.3     286       0.23
C14     125     336       8.67     323       211.7     349       0.25
C15     250     563       7.30     556       210.6     587       0.40
C16       5      11      32.37      11        10.1      11       6.25
C17      10      20      24.17      18        98.0      18      19.50      19
C18      83     123     104.34     113     45847.7     130       4.89
C19     125     155      86.48     146       116.9     165       5.25
C20     250     269     157.80     267        14.9     278       5.79

Table 1: Computation Results for Beasley's C class Steiner Tree Problems

             Beasley            Chopra et al.       Walksat
ID      |S|    Soln   CPU secs    Soln    CPU secs   Soln1   CPU secs   Soln2
D1        5     107     226.27     106       475.6     106       2.61     107
D2       10     228     252.47     220       283.5     220       1.54     227
D3      167    1599      21.85    1565      2290.1    1646       0.21
D4      250    2170      11.71    1935      3529.0    2044       0.28
D5      500    3360      11.76    3250       810.6    3419       0.53
D6        5      71    4065.69      67      2339.5      67      75.51      70
D7       10     103      18.71     103        99.7     103       0.47
D8      167    1108     475.14    1072      6984.5    1180       0.35
D9      250    1684     243.48    1448      4629.7    1585       0.41
D10     500    2235      20.21    2110      1312.1    2219       0.72
D11       5      31    3290.48      29      1374.4      29       2.78      30
D12      10      42      48.04      42       305.0      42       0.79
D13     167     520      36.06     500      1864.0     544       1.07
D14     250     688     443.26     667      3538.4     740       0.74
D15     500    1208      32.25    1116      1409.7    1193       1.70
D16       5      14     161.43      13       871.3      13      18.29
D17      10      25     277.20      23      6965.2      23        735      24
D18     167     247     222.15     223    245192.1     262      20.48
D19     250     384     256.15     310       878.3     359      21.52
D20     500     544    1023.60     537        47.1     558      24.45

Table 2: Computation Results for Beasley's D class Steiner Tree Problems
One Soln2 CPU time (SGI) of 30.57 seconds is also reported for this class; the instance it belongs to is not identified.
CPU (SGI) 0.85 0.98
Problem Parameters ID E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12 E13 E14 E15 E16 E17 E18 E19 E20
Soln 115 231 4131 5208 8413 78 157 2733 3721 5899 39 69 1336 1773 3008 15 26 840 923 1376
5 10 417 625 1250 5 10 417 625 1250 5 10 417 625 1250 5 10 417 625 1250
Beasley CPU secs (Cray XMP) 1116.80 7124.10 1346.05 378.66 98.22 1760.49 4459.30 18818,53 311.57 3061.45 457.98 7880.40 445.69 14037.13
Chopra et al. Soln CPU secs (Vax 8700) 111 1149.6 214 6251.2 4013 26468.4 5101 46007.6 8128 12564.1 73 678.0 145 27124.0 2640 118617.5 3604 24527.8 5600 39260.7 34 1900.6 67 7199.7 1280 207058.6 1732 29262.6 2784 7666.0 15 179.0 25 36039.9 (563.03) 758 6371.8 1342 272.2
Soln1 111 214 4282 5398 8518 73 145 2899 3913 5957 34 68 1417 1884 3125 15 27 667 853 1400
Walksat CPU Soln2 (SGI) 0.54 113 817.70 216 1.43 2.10 3.95 1.71 78 5170.50 149 2.05...