

Planning I
CPS 170
Ron Parr

An Example Planning Application
Space shuttle arm is currently controlled by a highly trained human.

Planning Application
•  Remove human from the control loop
•  Specific goals for system:
   –  Rearrange items in cargo bay
   –  Connect space station pieces
•  Assuming mechanical engineering issues can be resolved:
   –  Arm could work while astronauts sleep
   –  Complicated training could be eliminated

Some Actual Planning Applications
•  Used to fulfill mission objectives in NASA's Deep Space One (Remote Agent)
   –  Particularly important for space operations due to latency
   –  Also used for rovers
•  Aircraft assembly schedules
•  Logistics for the U.S. Navy
•  Observation schedules for Hubble space telescope
•  Scheduling of operations in an Australian beer factory

Scheduling
•  Many "planning" problems are scheduling problems
•  Scheduling can be viewed as a generalization of the planning problem to include resource constraints
   –  Time & Space
   –  Money & Energy
•  Many principles from regular planning generalize, but some special-purpose methods (not discussed here) are used

Characterizing Planning Problems
•  Start state (group of states)
•  Goal: almost always a group of states
•  Actions
•  Objective: Plan = A sequence of actions that is guaranteed to achieve the goal.
•  Like everything else, can view planning as search…
•  So, how is this different from search?

What makes planning special?
•  States typically specified by a set of relations or propositions:
   –  On(solar_panels, cargo_floor)
   –  arm_broken
•  Goal is almost always a set
   –  Typically care about a small number of things:
      •  at(Ron, airport)
      •  parked_in(X, car_of(Ron))
      •  airport_parking_stall(X)
   –  Many things are irrelevant
      •  parked_in(Y, car_of(Bill))
      •  adjacent(X,Y)
•  Branching factor is large

Planning Algorithms
•  Extremely active and rapidly changing area
•  Annual competitions pit different algorithms against each other on suites of challenge problems
•  Algorithms compete in different categories
   –  General
   –  Domain specific
•  No clearly superior method has emerged, though there are trends

Planning With Logic/Theorem Proving
•  Need to describe effects of actions with logic
•  Ask for the existence of plans that achieve our goals
•  Challenge: Talking about dynamic situations in logic
[Figure: blocks x, y, z]

Situations
•  Recall that we can't have contradictions in our knowledge base; otherwise, we can prove anything
•  Need to index our claims about the world with time in some way (otherwise changes would create contradictions)
•  Add an extra argument onto every predicate indicating when things are true:
   –  on(z, table, s)
   –  on(x, y, s)
•  result(s,a) = result of doing a in s
•  (result(s,a) = result(s',a')) iff ((s=s') AND (a=a'))

Describing Actions
•  Let's move A from B to C
•  applicable(move(A,B,C,S)) :-
   –  on(A,B,S)
   –  clear(C,S)
•  S'=result(move(A,B,C,S), S) :-
   –  applicable(move(A,B,C,S))
   –  clear(B,S')
   –  on(A,C,S')

Successor State Axioms
•  Action descriptions tell us what has changed, but how do we say what persists?
•  Successor state axioms:
   –  On(C,D,result(A,S)) iff applicable(A,S) AND
      •  On(C,D,S), A!=move, OR
      •  On(C,D,S), A=move(A,B,C), C!=A, D!=B, OR
      •  A=move(C,E,D)
•  Need one of these for every proposition!

Finding the Plan
•  Assume we have:
   –  Descriptions of all actions
   –  Successor-state axioms
   –  Description of the initial state (situation)
•  Q: How do we find the plan?
•  A: Ask theorem prover if there exists a situation in which the goal is true!
•  Theorem prover will return plan as a binding: result(move(X,Table,(result(move(Y,X,Z,S))

Planning via Theorem Proving: A Good Idea?
•  Pros:
   –  Very general
   –  Very powerful representation
   –  Access to theorem proving infrastructure
•  Cons:
   –  Awkward representation (unless you are a logician)
   –  Slow in practice (price of generality)

Overcoming Limitations of Planning via Theorem Proving
•  Simplify the representation
•  Avoid successor state axioms
•  Avoid generality of full first-order logic in hopes of allowing faster, special-purpose algorithms for planning

PDDL
•  Actions have a set of preconditions and effects
•  Think of the world as a database
   –  Preconditions specify what must be true in the database for the action to be applied
   –  Effects specify which things will be changed in the database if the action is applied
•  NB: PDDL supersedes an earlier, similar representation called STRIPS

move(obj,from,to)
•  Preconditions
   –  clear(obj)
   –  on(obj,from)
   –  clear(to)
•  Effects
   –  on(obj,to)
   –  clear(from)
   –  not(on(obj,from))
   –  not(clear(to))
[Figure: blocks x, y, z illustrating move(y,x,z)]

Limitations of PDDL
•  Assumes that a small number of things change with each action
   –  Dominoes
   –  Pulling out the bottom block from a stack
•  Preconditions and effects are conjunctions
•  No quantification

Planning Actions vs. Search Actions
•  Plan actions are really action schemata
•  Every STRIPS rule specifies a huge number of ground-level actions
•  Consider move(obj, from, to)
   –  Assume n objects in the world
   –  This action alone specifies O(n^3) ground actions
   –  Planning tends to have a very large action space
•  Compare with CSPs

Planning vs. CSPs
•  Both have large action spaces
•  CSPs are atemporal
•  Effects of actions (assignments) are implicit
•  Path matters: Knowing that a solution exists isn't sufficient

How hard is planning?
•  Planning is NP-hard
•  How can we prove this?
   –  Reduce graph coloring to planning
   –  Actions correspond to assigning colors
   –  Convenient to keep track of which nodes have colors assigned:
      •  Colored(node)
      •  Color(node,{R,G,B})

Graph Coloring Reduction
•  Given a graph coloring problem, what is our goal?
•  Goal is: colored(vi) for all nodes vi
•  Initial state is:
   –  not(colored(vi)) for all nodes vi
   –  color(vi,nil) for all nodes vi
   –  neighbor(vi, vj) for all neighbors in the graph
   –  Neighbors(v,i) for all nodes v with i neighbors
•  What are our actions?
   –  color(V,color)

Color(v,c)
•  Preconditions
   –  Not(colored(v))
   –  Neighbors(v,1)
   –  neighbor(v,u)
   –  Color(u,c')
   –  c!=c'
•  Effects
   –  Colored(v)
   –  Color(v,c)

Additional Actions
•  As described, we need different actions for different numbers of neighbors. Why?
•  No way to quantify over all neighbors in a PDDL rule
•  How expensive is this?

What this Does
•  Actions correspond to coloring graph nodes
•  Only legal assignments are allowed
•  Plan exists iff graph is colorable
•  Result: Planning is at least as hard as graph coloring, i.e., NP-hard

Is planning NP-complete?
•  NO!
•  Consider the towers of Hanoi:
   –  http://
   –  Actions are exactly the same as the block-moving actions
•  Requires exponential number of moves
•  Planning is actually PSPACE-complete
•  Planning with bounded plans is NP-complete

Should plan size worry us?
•  What if you have a problem with an exponential length solution?
•  Impractical to execute (or even write down) the solution, so maybe we shouldn't worry
•  Sometimes this may just be an artifact of our action representation
   –  Towers of Hanoi solution can be expressed as a simple recursive program
   –  Nice if planner could find such programs

Planning Using Search
•  Forward Search:
   –  As with theorem proving, blind forward search is problematic because of the huge branching factor
   –  Some success using this method with carefully chosen action pruning heuristics (not covered in class)
•  Backward Search:
   –  As with theorem proving, tends to focus search on relevant terms
   –  Called "Goal Regression" in the planning context

Goal Regression
•  Goal regression is a form of backward search from goals
•  Basic principle goes back to Aristotle
•  Embodied in earliest AI systems
   –  GPS: General Problem Solver by Newell & Simon
•  Cognitively plausible
•  Idea:
   –  Pick actions that achieve (some of) your goal
   –  Make preconditions of these actions your new goal
   –  Repeat until the goal set is satisfied by start state

Goal Regression Example
[Figure: blocks x, y, z]
Goal: on(z,x)
Regress on(z,x) through move(z,table,x)
New goal: clear(x)

Facts About Goal Regression
•  Elegant solution to the problem of backward search from multiple goal states
   –  In planning, goal state is usually a set of states
   –  Does backward search at the level of state sets
•  Goal regression is sound and complete
•  Need to be careful to avoid endless loops on problems like Sussman anomaly (coming up)

Heuristics in planning
•  Need heuristics for searching in planning, but…
•  Planning problems tend to defy natural efforts to develop good heuristics:
   –  Ignoring preconditions: Finding shortest path while ignoring preconditions is still an intractable problem
   –  Ignoring deletions: Also intractable to find shortest path
   –  (Above two difficulties mean that coming up with an admissible heuristic is non-trivial.)
   –  Counting number of conjuncts true (admissible but very weak)

The Sussman Anomaly
[Figure: block configurations of x, y, z]
Goal: on(x,y), on(y,z)

Problems with naïve subgoaling
•  The number of conjuncts satisfied may not be a good heuristic
•  Achieving individual conjuncts in isolation may actually make things harder
•  Causes simple planners to go into loops
[Figure: block configurations of x, y, z]

Planning Features & Challenges
•  State space is very large
•  Goals usually defined over state sets
•  Very large, implicitly defined action space
•  Difficult to come up with good heuristics
•  Path (plan) usually matters
•  We will see that plan graphs are a clever way of coming up with good heuristics for planners

Can our expertise in CSPs help?
•  Can planning be reduced to CSPs?
•  CSPs are a more restrictive language
•  Need to consider bounded-length plans
   –  In general, this isn't too much of a problem because extremely long plans are an indication that we need to reformulate the problem (Towers of Hanoi)
•  Our hope: Solve the planning problem as a CSP and let our CSP insights do the work for us (Doesn't quite work, but it helps…)

Formulating Planning as a CSP
•  Introduce Action(a,i) (binary) to indicate if action a is taken at step i.
   –  We introduce |Actions| x plan_length variables
•  We also need to represent the statements in our database using proposition(p,i) (binary) to indicate the truth of proposition p at time i
   –  This introduces |propositions| x plan_length variables
   –  But there's a catch…

Propositionalizing
•  Also called "grounding out"
•  Recall that domain descriptions and actions involve relations:
   –  on(x,table)
   –  clear(x)
•  Propositions don't take arguments
   –  arm_broken

Converting to Propositional Form
•  Consider on(x,y)
•  Note that we considered this type of issue before when thinking about plan branching factor
•  If there are n objects in the world, how many propositions do we need to express all possible realizations of on(x,y)?
•  What if there are k relations that each take d variables?
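The counting questions above can be checked by grounding a relation mechanically. The following is a minimal sketch, not part of the original slides: the helper names (ground, num_propositions, num_csp_variables) and the four-object world are assumptions made for illustration. It uses the same naming style as on_y_z and move_x_y_z. It counts every argument tuple, including ones like on(x,x) that a real planner might prune, so n^d per relation is an upper bound.

```python
from itertools import product


def ground(name, arity, objects):
    """List every ground instance of a relation or action schema.

    Each instance is flattened to a proposition-style name such as
    'on_x_table'. All argument tuples are kept (even on_x_x), so the
    result has len(objects) ** arity entries.
    """
    return ["_".join((name,) + args) for args in product(objects, repeat=arity)]


# Hypothetical world with n = 4 objects (an assumption for this sketch).
objects = ["x", "y", "z", "table"]

on_props = ground("on", 2, objects)        # n^2 ground propositions
move_actions = ground("move", 3, objects)  # n^3 ground actions


def num_propositions(k, d, n):
    """k relations, each over d variables, n objects: k * n^d propositions."""
    return k * n ** d


def num_csp_variables(n_actions, n_props, plan_length):
    """One binary variable per (ground action, step) and (proposition, step)."""
    return (n_actions + n_props) * plan_length


print(len(on_props), len(move_actions))
print(num_propositions(k=3, d=2, n=4))
print(num_csp_variables(len(move_actions), len(on_props), plan_length=5))
```

Even in this toy world a single three-argument schema yields 64 ground actions, which is why the bounded-length CSP grows so quickly.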
Digression on Propositionalizing
•  It turns out that in many planning domains the number of actions (k) is relatively low
•  The number of variables involved in each action is usually relatively low too
•  Hard to think of an action that involves six or more variables
•  In general, propositionalizing is viewed as an inelegant trick that people would like to avoid
•  Is fast planning possible w/o this?

Back to CSP formulation
•  We now have action(move_x_y_z, i) = t iff we move x from y to z at time i.
•  We also have proposition(on_y_z, i) = t iff y is on z at time i.
•  Now we need to set up our constraints so that the problem is satisfiable iff there exists a plan

Plan CSP Constraints
•  Actions must be sequential
   –  For all a != a': not(action(a,i) and action(a',i))
   –  Another quadratic factor!
•  Actions' effects on the world. If action(a,i)=t
   –  Proposition(p,i-1) = t for each p in preconditions
   –  Proposition(p,i) = t for each p in add list
   –  Proposition(p,i) = f for each p in delete list
   –  This is linear in the new action, proposition space

What's Missing?
•  We need to express that propositions persist
   –  Proposition(p,i) = f unless
      •  It was true in previous step and not deleted
      •  It was false in previous step but added
•  We need to assert initial and final states
   –  Easier than it sounds
   –  We just set these variables to have the right values and the CSP does the rest

This works, but…
•  The CSP is very large
•  It is very highly connected
   –  Variable elimination is hard
   –  Hard to do k-consistency
•  Turns a hard planning problem into a hard(er) CSP

Plan Graphs
•  High Level Idea:
   –  Avoid constructing the exact CSP
   –  Construct a sequence of simplified (trivially solvable) CSPs corresponding to different plan lengths
   –  Ignore preconditions and never delete things
   –  If plan of size k exists, then CSP of size k satisfiable
   –  Note this is if, not iff
•  Why this is useful:
   –  Did not have a good method of coming up with admissible heuristics for planning
   –  If simplified CSPs are cheap to solve, then we have a reasonable, admissible heuristic

Plan Graph Form
[Figure: columns from left to right: Initial, Possible t=1 actions, Possible t=1 configurations, Possible t=2 actions, …, Goal Or Level]
Consider a particular world configuration c.
Find the first phase containing all propositions in c.
Distance to goal will be an admissible heuristic for forward search and backward search.

Example Planning Graph
[Figure: planning graph; each column is listed with its label beneath it, and one edge is labeled "delete"]
on(x,y), on(y,z), on(z,table), clear(x), …
Starting configuration
Placeholder (persistence) actions, move(x,y,table), …
Possible t=1 actions
¬on(x,y), on(y,z), on(z,table), clear(x), on(x,table), …
Possible t=1 configurations

Facts About Planning Graphs
•  Similar to CSP constraint graph
•  The planning graph includes everything that might be true at a particular time
•  Includes all actions that might be possible at a particular time
•  Is a relaxation of the original problem

Why this is good
•  Relaxations are a good way of developing admissible heuristics
•  A major difficulty with planning is that we have trouble coming up with good heuristics
•  Note that plan graphs can provide admissible heuristics for either direction (forward or regression [means/ends] search)

Why this isn't good enough
•  Produces fairly weak heuristics
•  Propositions are never really deleted
•  Can take many actions simultaneously
•  The problem is now too relaxed
   –  Need to figure out a way to use structure more effectively without losing
      •  Problem independence
      •  Clarity, speed
      •  Admissibility

Mutual Exclusion Between Actions
•  Two real (non-persistence) actions can't be taken simultaneously; we mark these mutually exclusive
•  Types of mutual exclusion
   –  Inconsistent effects/Interference
      •  persist(on_x_y,1)
      •  action(move_x_y_z, 1)
   –  Competing needs
      •  Precondition appears positive in one action
      •  Appears negated in another

Extending graphs using mutex
For each planning phase:
1.  Generate all actions with non-mutex preconditions
2.  Mark as mutex all action/maintain pairs that conflict
3.  Mark as mutex all action/action pairs with mutex preconds
4.  Generate all potential propositions for next time step
5.  Mark pairs of propositions that can only be generated by mutex actions as mutex
We now think of everything in terms of mutually compatible sets of propositions.
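The five steps above can be sketched as a single level-expansion function. This is an illustrative sketch under stated assumptions, not the original Graphplan code. The Action record, the act, persist, and pair helpers, and the blocks-world encoding are all hypothetical. Propositions are plain strings with no negated literals, so "competing needs" is checked as "some pair of preconditions is already mutex". Following the slide's rule, any two real (non-persistence) actions are marked mutex.

```python
from itertools import combinations
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    real: bool = True  # False for persistence (maintain) actions


def act(name, pre, add, delete):
    """Build a real action from plain lists of proposition names."""
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


def persist(p):
    """Placeholder action that carries proposition p to the next level."""
    return Action("persist_" + p, frozenset([p]), frozenset([p]), frozenset(), False)


def pair(a, b):
    return frozenset((a, b))


def actions_mutex(a, b, prop_mutex):
    if a.real and b.real:  # slide rule: two real actions never share a step
        return True
    # Inconsistent effects / interference: one deletes what the other adds or needs.
    if a.delete & (b.pre | b.add) or b.delete & (a.pre | a.add):
        return True
    # Competing needs: some pair of preconditions is mutex at this level.
    return any(pair(p, q) in prop_mutex for p in a.pre for q in b.pre)


def next_level(props, prop_mutex, actions):
    # Step 1: actions whose preconditions are present and pairwise non-mutex.
    candidates = [persist(p) for p in sorted(props)] + list(actions)
    layer = [a for a in candidates
             if a.pre <= props
             and not any(pair(p, q) in prop_mutex for p, q in combinations(a.pre, 2))]
    # Steps 2-3: mutex action pairs (action/maintain and action/action).
    act_mutex = {pair(a.name, b.name) for a, b in combinations(layer, 2)
                 if actions_mutex(a, b, prop_mutex)}
    # Step 4: everything that might be true at the next time step.
    new_props = frozenset(p for a in layer for p in a.add)
    # Step 5: p, q are mutex if every way of producing both uses mutex actions.
    makers = {p: [a.name for a in layer if p in a.add] for p in new_props}
    new_mutex = {pair(p, q) for p, q in combinations(sorted(new_props), 2)
                 if all(pair(a, b) in act_mutex for a in makers[p] for b in makers[q])}
    return new_props, new_mutex, act_mutex


# Blocks world: C on A, A and B on the table (encoding assumed for this sketch).
props0 = frozenset({"on_C_A", "on_A_Table", "on_B_Table", "clear_C", "clear_B"})
actions = [
    act("move_C_A_B", ["on_C_A", "clear_C", "clear_B"],
        ["on_C_B", "clear_A"], ["on_C_A", "clear_B"]),
    act("movetotable_C_A", ["on_C_A", "clear_C"],
        ["on_C_Table", "clear_A"], ["on_C_A"]),
    act("move_B_Table_C", ["on_B_Table", "clear_B", "clear_C"],
        ["on_B_C"], ["on_B_Table", "clear_C"]),
]
props1, mutex1, act_mutex1 = next_level(props0, set(), actions)
print(sorted(props1))
```

Calling next_level repeatedly, feeding each level's propositions and mutex pairs into the next call, extends the graph one phase at a time.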
Plan Graphs with Mutex Constraints
[Figure: two literal levels connected by an action level]
Literals at first level: On(C, A), On(A, Table), Clear(C), On(B, Table), Clear(B)
Actions: Move(C,A,B), MoveToTable(C,A), Move(B,Table,C)
Literals at next level: On(C, A), On(A, Table), Clear(C), On(B, Table), Clear(B), Clear(A), On(C, B), On(C, Table), On(B, C)
•  Each level has literals that "could be true" at that level
•  Mutex (mutual exclusion) relations indicate incompatible actions/literals
Slide courtesy Vince Conitzer

Plan Graphs with Mutex Constraints
•  Extend forward until goal conjunctions appear non-mutex
•  This is still a relaxation of the problem
•  In essence, we have relaxed the original planning CSP so that we only worry about 2-consistency
•  We still have an admissible heuristic
•  For any configuration, we search for the earliest level in which the configuration propositions appear in non-mutex form

How do we use this?
•  Original Graphplan algorithm had a special planning algorithm that works with the plan graph
•  Modern approaches primarily use the plan graph in conjunction with some kind of search
•  Despite some apparent complexity, this turns out to be much, much cleaner, faster and easier to implement than planning algorithms from the 80s and early 90s

How well does it work?
•  The initial Graphplan algorithm was so much faster than competing algorithms it was hard to even compare them on the same scale.
•  There is a web page devoted to Graphplan:
   –  http://

Graphplan Summary
•  Graphplan combines two concepts:
   –  Constraint-based reasoning with a form of 2-consistency
   –  Basic search
•  More elaborate approaches are possible:
   –  Add more complicated constraints to plan graph
   –  Trade off: As plan graph becomes richer:
      •  Heuristic values get closer to true plan length
      •  Cost of building/using plan graph grows steeply
•  Graphplan combines our knowledge of good search methods with our knowledge of good CSP methods

Other Approaches: SAT
•  If we can convert planning to a CSP (kind of) and get some advantage from viewing it as a CSP, why not try converting to some other problem?
•  SATPlan converts (bounded length) planning problems to SAT problems
•  Uses off-the-shelf SAT solvers
•  As with plan graphs, this requires propositionalizing (grounding out) the problem

Interesting things about SATPlan
•  This actually works pretty well for some domains
•  Details of the transformation are somewhat tricky
•  As with the CSP formulation, it tends to produce very large problem instances
•  Can cause problems for domains with many items

Modern Planning Conclusion
•  Fast planning algorithms seem to rely on simple, fast underlying methods
•  Ruling out bad things quickly seems to help
   –  Heuristics used in SAT solvers (not covered here)
   –  Constraint propagation in Graphplan variants
•  Still a very open area, not as clean as search/CSPs

What's Missing?
•  As described, plans are "open loop"
•  No provisions for:
   –  Actions failing
   –  Uncertainty about initial state
   –  Observations
•  Solutions:
   –  Plan monitoring, replanning
   –  Conformant/Sensorless planning
   –  Contingency planning

Planning Under Uncertainty
•  What if there is a probability distribution over possible outcomes?
   –  Called: Planning under uncertainty, decision theoretic planning, Markov Decision Processes (MDPs)
   –  Much more robust: Solution is a "universal plan", i.e., a plan for all possible outcomes (monitoring and replanning are implicit)
   –  Much more difficult computationally
•  What if observations are unreliable?
   –  Called: "Partial Observability", Partially Observable MDPs (POMDPs)
   –  Applications to medical diagnosis, defense
   –  Way, way harder computationally
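To make the "universal plan" idea concrete, here is a small sketch of how an MDP solution maps every state to an action instead of committing to one action sequence. It uses value iteration, a standard MDP method that these slides do not cover. The three-state model, the rewards, and the action names are invented for illustration only.

```python
def value_iteration(states, actions, P, R, gamma=0.9, sweeps=100):
    """Return state values V and a policy (one action for every state).

    P[s][a] is a list of (probability, next_state) pairs and R[s] is the
    reward received in state s. Both are assumptions of this toy model.
    """
    def expected(V, s, a):
        return sum(prob * V[t] for prob, t in P[s][a])

    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        V = {s: R[s] + gamma * max(expected(V, s, a) for a in actions)
             for s in states}
    policy = {s: max(actions, key=lambda a: expected(V, s, a)) for s in states}
    return V, policy


states = ["start", "goal", "broken"]
actions = ["careful", "fast"]
stay = lambda s: {a: [(1.0, s)] for a in actions}  # absorbing state
P = {
    "start": {
        "careful": [(0.5, "goal"), (0.5, "start")],  # slow but safe
        "fast": [(0.9, "goal"), (0.1, "broken")],    # quick but risky
    },
    "goal": stay("goal"),
    "broken": stay("broken"),
}
R = {"start": 0.0, "goal": 1.0, "broken": -10.0}

V, policy = value_iteration(states, actions, P, R)
print(policy)
```

Because the policy covers every state, an unexpected outcome needs no separate replanning step: the agent simply looks up the action for whatever state it ends up in.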

This note was uploaded on 02/17/2012 for the course COMPSCI 170 taught by Professor Parr during the Spring '11 term at Duke.
