Probabilistic Robotics
Planning and Control: Markov Decision Processes

Problem Classes
- Deterministic vs. stochastic actions
- Full vs. partial observability
- Today: how to make decisions under uncertainty

Uncertainty and decisions
- Previously: how to do state estimation under uncertainty
- Uncertainty can affect how the robot makes decisions
- How to encode preferences between different outcomes of plans (e.g. getting to the airport: many options, risks)
- Utility theory: reasoning about preferences (utility = the quality of being useful); every state has some utility
- Decision theory = probability theory + utility theory
- Principle of maximum expected utility: an agent is rational if it chooses an action with the highest expected utility

Designing control systems
- Often, in addition to stability, observability, and controllability, we want some optimality: the trajectory should maximize a certain performance index (e.g. travel time, fuel cost, quadratic cost for trajectory tracking)
- Techniques from the calculus of variations are used to solve for functions that maximize the performance index V
- Special class of systems: n-stage decision processes. Find V and choices of actions such that V is maximal
- Blackboard example: recursive computation of V in the deterministic case (in a grid world, similar to the wavefront planner)
- Principle of dynamic programming: decompose the problem into n stages; solve by relaxation at each stage
- Next: what if the outcomes of actions are uncertain?

Making decisions under uncertainty
Suppose I believe the following:
- P(A25 gets me there on time | ...) = 0.04
- P(A90 gets me there on time | ...) = 0.70
- P(A120 gets me there on time | ...) = 0.95
- P(A1440 gets me there on time | ...) = 0.9999
Which action should I choose? That depends on my preferences for missing the flight vs. time spent waiting, etc.
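The airport example can be turned into a concrete maximum-expected-utility computation. A minimal Python sketch: the on-time probabilities come from the slide, but the utility values (the penalty for missing the flight and the cost per minute of waiting) are illustrative assumptions, not the lecture's numbers.

```python
# Maximum expected utility for the airport-departure example.
# Actions: leave this many minutes before the flight.
on_time_prob = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}

U_MISS = -1000.0   # utility of missing the flight (assumed value)
U_WAIT = -0.5      # utility per minute spent waiting at the airport (assumed)

def expected_utility(minutes_early: int) -> float:
    p = on_time_prob[minutes_early]
    # If on time, we still "pay" for the minutes spent waiting;
    # otherwise we incur the miss penalty.
    return p * (U_WAIT * minutes_early) + (1 - p) * U_MISS

best = max(on_time_prob, key=expected_utility)
for m in sorted(on_time_prob):
    print(f"leave {m:5d} min early: EU = {expected_utility(m):9.2f}")
print("MEU action: leave", best, "minutes early")
```

With these assumed utilities the rational choice is the 120-minute departure: leaving 25 minutes early almost certainly misses the flight, while leaving a day early wastes too much time. Changing U_MISS or U_WAIT changes the ranking, which is exactly the point of the slide: the decision depends on preferences, not probabilities alone.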
Utility theory is used to represent and infer preferences; decision theory = probability theory + utility theory.

A simple knowledge-based agent
The agent must be able to:
- Represent states, actions, etc.
- Incorporate new percepts
- Update internal representations of the world
- Deduce hidden properties of the world
- Deduce appropriate actions

Markov decision processes
- A framework for representing complex multi-stage decision problems in the presence of uncertainty, with efficient solutions
- Models the dynamics of the environment under different actions
- Markov assumption: the next state depends only on the current state and action, not on the past

Markov Decision Process: formal definition (CS 685)
- A 4-tuple (X, U, T, R)
- Set of states X (finite)
- Set of actions U (finite)
- Transition model T: X x U x X -> [0, 1], the transition probability for each (state, action, next state) triple
- Reward model R: X x U x X -> R

Example
- Robot navigating on a grid, with 4 actions: up, down, left, right
- Effects of moves are stochastic; we may end up in...
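The stochastic grid example can be solved with value iteration, the standard dynamic-programming solution for MDPs. Below is a minimal sketch; the 3x4 grid layout, the +1 goal and -1 pit, the step cost, the discount factor, and the 0.8/0.1/0.1 "slip" model are all illustrative assumptions, not the lecture's exact numbers.

```python
from typing import Dict, Tuple

State = Tuple[int, int]  # (row, col)

ROWS, COLS = 3, 4
WALL = {(1, 1)}                            # one blocked cell (assumed layout)
TERMINAL = {(0, 3): +1.0, (1, 3): -1.0}    # goal and pit (assumed)
STEP_REWARD = -0.04                        # small cost per move (assumed)
GAMMA = 0.9                                # discount factor (assumed)

ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
# Perpendicular "slip" directions for each intended action.
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}

def move(s: State, a: str) -> State:
    """Deterministic effect of action a; bumping into a wall or the
    grid boundary leaves the robot where it is."""
    r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) not in WALL:
        return (r, c)
    return s

def transitions(s: State, a: str):
    """T(x' | x, u): 0.8 intended direction, 0.1 each perpendicular."""
    p1, p2 = PERP[a]
    return [(0.8, move(s, a)), (0.1, move(s, p1)), (0.1, move(s, p2))]

states = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) not in WALL]
V: Dict[State, float] = {s: 0.0 for s in states}

# Bellman backups until the values stop changing.
while True:
    delta = 0.0
    for s in states:
        if s in TERMINAL:
            new_v = TERMINAL[s]
        else:
            new_v = max(sum(p * (STEP_REWARD + GAMMA * V[s2])
                            for p, s2 in transitions(s, a))
                        for a in ACTIONS)
        delta = max(delta, abs(new_v - V[s]))
        V[s] = new_v
    if delta < 1e-6:
        break

# Greedy policy extracted from the converged value function.
policy = {s: max(ACTIONS, key=lambda a: sum(p * V[s2]
                                            for p, s2 in transitions(s, a)))
          for s in states if s not in TERMINAL}
```

This is the stochastic generalization of the recursive computation of V from the deterministic blackboard example: instead of propagating a single cost per cell (as a wavefront planner does), each backup averages over the possible outcomes of an action, weighted by T.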
Fall '08, Luke, S.
