# mdp1 - Making Complex Decisions CSci 5512...

Making Complex Decisions (CSci 5512: Artificial Intelligence II)

## Sequential Decision Problems

- Search, with explicit actions and subgoals added, becomes planning.
- Search, with uncertainty and utility added, becomes Markov decision problems (MDPs).
- Planning with uncertainty and utility added, or MDPs with explicit actions and subgoals added, becomes decision-theoretic planning.
- MDPs with uncertain sensing (belief states) added become partially observable MDPs (POMDPs).

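## Markov Decision Process

[Figure: 4×3 grid world with columns 1 to 4 and rows 1 to 3, a START cell, terminal cells marked +1 and −1, and action outcome probabilities 0.8, 0.1, and 0.1.]

- States $s \in S$, actions $a \in A$
- Model $T(s, a, s') \equiv P(s' \mid s, a)$
- Reward function $R(s)$ (or $R(s, a)$, $R(s, a, s')$):

$$R(s) = \begin{cases} -0.04 & \text{(small penalty) for nonterminal states} \\ \pm 1 & \text{for terminal states} \end{cases}$$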
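
The definitions above can be sketched in code. The following is a minimal illustration, not the course's implementation. The probabilities 0.8/0.1/0.1 and the rewards −0.04 and ±1 come from the slide. Everything else is an assumption: the obstacle at (2, 2), the terminal cells at (4, 3) and (4, 2), the reading of 0.8 as "intended direction" and 0.1 as "each perpendicular direction", and all names (`T`, `R`, `WALL`, and so on).

```python
from typing import Dict, Tuple

State = Tuple[int, int]  # (column, row), 1-indexed

# ASSUMED layout: the standard textbook 4x3 grid world.
COLS, ROWS = 4, 3
WALL = {(2, 2)}                           # assumed obstacle cell
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}   # assumed positions of +1 and -1
STEP_REWARD = -0.04                       # from the slide: nonterminal penalty

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
# ASSUMED noise model: the agent may slip to either perpendicular direction.
SLIPS = {"U": "LR", "D": "LR", "L": "UD", "R": "UD"}


def R(s: State) -> float:
    """Reward function R(s): +1 or -1 in terminal states, -0.04 elsewhere."""
    return TERMINALS.get(s, STEP_REWARD)


def move(s: State, direction: str) -> State:
    """Deterministic move; hitting the wall or the border leaves s unchanged."""
    nxt = (s[0] + MOVES[direction][0], s[1] + MOVES[direction][1])
    inside = 1 <= nxt[0] <= COLS and 1 <= nxt[1] <= ROWS
    return nxt if inside and nxt not in WALL else s


def T(s: State, a: str) -> Dict[State, float]:
    """Transition model: maps each successor s' to P(s' | s, a)."""
    if s in TERMINALS:
        return {}  # no transitions out of a terminal state
    side_a, side_b = SLIPS[a]
    dist: Dict[State, float] = {}
    for direction, p in ((a, 0.8), (side_a, 0.1), (side_b, 0.1)):
        s2 = move(s, direction)
        dist[s2] = dist.get(s2, 0.0) + p
    return dist


print(T((1, 1), "U"))
```

Returning a dictionary from `T` keeps the sketch close to the notation $P(s' \mid s, a)$: each key is a successor state and the values sum to 1 for every nonterminal state.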

## Solving MDPs

- In search problems, the aim is to find an optimal sequence of actions.
- In MDPs, the aim is to find an optimal policy $\pi(s)$: the best action for every possible state $s$, since one cannot predict where one will end up.
- The optimal policy maximizes the expected sum of rewards.

Optimal policy when the state penalty $R(s)$ is $-0.04$:

[Figure: optimal policy on the 4×3 grid world, with terminal cells +1 and −1.]

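
To see why a fixed action sequence is not enough, one can push a probability distribution over states through a fixed plan and check where the agent ends up. This is a rough sketch under assumptions that are not stated on the slide: the obstacle at (2, 2), terminals at (4, 3) and (4, 2), START at (1, 1), the 0.8/0.1/0.1 split meaning intended versus perpendicular moves, and the hypothetical helper names `step_distribution` and `run_plan`.

```python
from collections import defaultdict

# ASSUMED layout: the standard textbook 4x3 grid world.
COLS, ROWS = 4, 3
WALL = {(2, 2)}                  # assumed obstacle cell
TERMINALS = {(4, 3), (4, 2)}     # assumed positions of +1 and -1
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
SLIPS = {"U": "LR", "D": "LR", "L": "UD", "R": "UD"}  # assumed noise model


def move(s, direction):
    """Deterministic move; hitting the wall or the border leaves s unchanged."""
    nxt = (s[0] + MOVES[direction][0], s[1] + MOVES[direction][1])
    inside = 1 <= nxt[0] <= COLS and 1 <= nxt[1] <= ROWS
    return nxt if inside and nxt not in WALL else s


def step_distribution(dist, action):
    """Push a distribution over states through one noisy action."""
    new = defaultdict(float)
    for s, p in dist.items():
        if s in TERMINALS:       # terminal states absorb probability mass
            new[s] += p
            continue
        side_a, side_b = SLIPS[action]
        for direction, q in ((action, 0.8), (side_a, 0.1), (side_b, 0.1)):
            new[move(s, direction)] += p * q
    return dict(new)


def run_plan(start, plan):
    """Distribution over states after executing a fixed action sequence."""
    dist = {start: 1.0}
    for action in plan:
        dist = step_distribution(dist, action)
    return dist


final = run_plan((1, 1), ["U", "U", "R", "R", "R"])
# Probability that the fixed plan ends in the +1 cell: about 0.32776
print(round(final.get((4, 3), 0.0), 5))
```

Under these assumptions the plan reaches the +1 cell only about a third of the time, and the rest of the probability mass is spread over other cells. A policy $\pi(s)$ avoids this problem because it specifies an action for whichever state the agent actually lands in.
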
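
## Reward and Optimal Policy

[Figure: four grid-world panels, each with terminal cells +1 and −1, showing the optimal policy for a different range of the nonterminal reward $r$: $[-0.4278 : -0.0850]$, $[-0.0480 : -0.0274]$, $[-0.0218 : 0.0000]$, and $[-\infty : -1.6284]$.]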

## Utility of State Sequences

- Need to understand preferences between sequences of states.
- Typically consider stationary preferences on reward sequences:

$$[r, r_0, r_1, r_2, \ldots] \succ [r, r'_0, r'_1, r'_2, \ldots] \;\Leftrightarrow\; [r_0, r_1, r_2, \ldots] \succ [r'_0, r'_1, r'_2, \ldots]$$

Theorem: Only two ways to combine rewards over time:

1) Additive utility function: $U([s_0, s_1, s_2, \ldots]) = R(s_0) + R(s_1) + R(s_2) + \cdots$
2) Discounted
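
The additive utility function and the stationarity property can be checked with a small sketch. The example sequences and the names `additive_utility` and `prefers` are illustrative assumptions. Only the reward values −0.04 and +1 come from the slides.

```python
def additive_utility(rewards):
    """Additive utility: the plain sum R(s0) + R(s1) + R(s2) + ..."""
    return sum(rewards)


def prefers(seq_a, seq_b):
    """True if seq_a is strictly preferred to seq_b under additive utility."""
    return additive_utility(seq_a) > additive_utility(seq_b)


STEP = -0.04                          # nonterminal reward from the slides
short_path = [STEP] * 5 + [1.0]       # hypothetical: 5 steps, then the +1 terminal
long_path = [STEP] * 9 + [1.0]        # hypothetical: 9 steps, then the +1 terminal

print(prefers(short_path, long_path))               # True

# Stationarity: prepending the same reward r to both sequences
# leaves the preference between them unchanged.
r = STEP
print(prefers([r] + short_path, [r] + long_path))   # True
```

Because the common first reward adds the same amount to both sums, the comparison is unaffected, which is what the stationarity condition requires.
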

## This note was uploaded on 02/07/2012 for the course CSCI 5512 taught by Professor Staff during the Spring '08 term at Minnesota.
