cs685-mdps

Probabilistic Robotics
Planning and Control: Markov Decision Processes

Problem Classes
- Deterministic vs. stochastic actions
- Full vs. partial observability
- Today: how to make decisions under uncertainty

Uncertainty and Decisions
- Previously: how to do state estimation under uncertainty
- Uncertainty can affect how the robot makes decisions
- How to encode preferences between different outcomes of plans (e.g. going to the airport: many options, many risks)
- Utility theory: reasoning about preferences (utility = quality of being useful)
- Every state has some utility
- Decision theory = probability theory + utility theory
- Principle of maximum expected utility: an agent is rational if it chooses an action with the highest expected utility

Designing Control Systems
- Often, in addition to stability, observability, and controllability, we want some notion of optimality
- That is, the trajectory should maximize a certain performance index (e.g. travel time, fuel cost, quadratic cost for trajectory tracking)
- Techniques from the calculus of variations can solve for functions that maximize the performance index V
- Special class of systems: n-stage decision processes. Find V and a choice of actions such that V is maximal
- Blackboard example: recursive computation of V in the deterministic case (in a grid world, similar to the wavefront planner)
- Principle of dynamic programming: decompose the problem into n stages; solve a relaxation at each stage
- Next: what if the outcomes of actions are uncertain?

Making Decisions Under Uncertainty
Suppose I believe the following (A_t = leave for the airport t minutes before the flight):
- P(A_25 gets me there on time | ...) = 0.04
- P(A_90 gets me there on time | ...) = 0.70
- P(A_120 gets me there on time | ...) = 0.95
- P(A_1440 gets me there on time | ...) = 0.9999
Which action should I choose? It depends on my preferences for missing the flight vs. time spent waiting, etc.
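The airport example can be made concrete by attaching utilities to the outcomes and applying the principle of maximum expected utility. The utility numbers below (value of making the flight, cost of missing it, per-minute waiting cost) are illustrative assumptions, not from the lecture; the probabilities are the ones given above.

```python
# Hypothetical utilities (assumed for illustration only):
# catching the flight is worth +1000, missing it costs -500,
# and every minute spent waiting at the airport costs 1.
U_ON_TIME, U_MISS, WAIT_COST = 1000.0, -500.0, -1.0

# P(action gets me there on time), keyed by minutes-before-flight.
actions = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}

def expected_utility(minutes_early, p_on_time):
    # EU(a) = P(on time | a) * U(on time)
    #       + P(miss | a)    * U(miss)
    #       + deterministic waiting cost of leaving early
    return (p_on_time * U_ON_TIME
            + (1.0 - p_on_time) * U_MISS
            + WAIT_COST * minutes_early)

# A rational agent picks the action with the highest expected utility.
best = max(actions, key=lambda t: expected_utility(t, actions[t]))
```

With these particular utilities the agent leaves 120 minutes early: A_1440 is nearly certain to succeed, but the waiting cost outweighs the gain in success probability. Different utilities yield different rational choices, which is exactly the point of combining probabilities with preferences.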
Utility theory is used to represent and infer preferences:
- Decision theory = probability theory + utility theory

A Simple Knowledge-Based Agent
The agent must be able to:
- Represent states, actions, etc.
- Incorporate new percepts
- Update internal representations of the world
- Deduce hidden properties of the world
- Deduce appropriate actions

Markov Decision Processes
- A framework for representing complex multi-stage decision problems in the presence of uncertainty
- Admits efficient solutions
- Models the dynamics of the environment under different actions
- Markov assumption: the next state depends only on the current state and action, not on the past

Markov Decision Process: Formal Definition
A 4-tuple (X, U, T, R):
- Set of states X (finite)
- Set of actions U (finite)
- Transition model T : X x U x X -> [0, 1], giving a transition probability for each (state, action, next state)
- Reward model R : X x U x X -> R

Example
Robot navigating on a grid:
- 4 actions: up, down, left, right
- Effects of moves are stochastic, we may end up in...
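The grid example above can be sketched as a small MDP solved by value iteration (the standard dynamic-programming solution, consistent with the n-stage recursion mentioned earlier). Everything numeric here is an assumption for illustration: grid size, goal cell, a 0.8/0.1/0.1 slip model for the stochastic moves, a +1 goal reward with a small living cost, and a discount factor of 0.9.

```python
# Minimal value-iteration sketch for a stochastic grid MDP.
# All parameters below are illustrative assumptions, not from the lecture.
GAMMA = 0.9                         # discount factor (assumed)
ROWS, COLS = 3, 3                   # small grid (assumed)
GOAL = (0, 2)                       # terminal goal cell (assumed)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}

def step(state, action):
    # Deterministic effect of a move; bumping a wall leaves state unchanged.
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    return (nr, nc) if 0 <= nr < ROWS and 0 <= nc < COLS else state

def transitions(state, action):
    # T(x' | x, u): intended move with prob 0.8,
    # slip to each perpendicular direction with prob 0.1 (assumed model).
    return ([(0.8, step(state, action))]
            + [(0.1, step(state, p)) for p in PERP[action]])

def reward(next_state):
    # R: +1 on entering the goal, small living cost otherwise (assumed).
    return 1.0 if next_state == GOAL else -0.04

def value_iteration(eps=1e-6):
    V = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
    while True:
        delta = 0.0
        for s in list(V):
            if s == GOAL:
                continue            # terminal state keeps value 0
            # Bellman backup: V(x) = max_u sum_x' T(x,u,x')[R(x') + g V(x')]
            best = max(sum(p * (reward(s2) + GAMMA * V[s2])
                           for p, s2 in transitions(s, a))
                       for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V
```

Under this model, states closer to the goal end up with higher values, and the greedy policy with respect to the converged V drives the robot toward the goal despite the stochastic slips.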