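State reached: $s_{t+1} = \delta(s_t, a_t)$.
Reward received: $r_{t+1} = r(s_t, a_t)$.
[Figure: the environment returns a reward and a new state after each action, giving the sequence $s_t \xrightarrow{a_t / r_{t+1}} s_{t+1} \xrightarrow{a_{t+1} / r_{t+2}} s_{t+2}$.]
Mehryar Mohri - Foundations of Machine Learning

MDPs - Properties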
Finite MDPs: A and S are finite sets.
Finite horizon when T < ∞.
Reward r(s, a): often a deterministic function.

Example - Robot Picking up Balls
[Figure: transition diagram with states "start" and "other"; edges labeled action/[probability, reward]: search/[.1, R1], search/[.9, R1], carry/[.5, R3], carry/[.5, 1], pickup/[1, R2].]

Policy
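Definition: a policy is a mapping $\pi : S \to A$.
Objective: find a policy $\pi$ maximizing the expected return.
• finite horizon: $\sum_{\tau=0}^{T-t} r(s_{t+\tau}, \pi(s_{t+\tau}))$.
• infinite horizon ($T = +\infty$): $\sum_{\tau=0}^{T-t} \gamma^{\tau} r(s_{t+\tau}, \pi(s_{t+\tau}))$, $\gamma \in [0, 1)$.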
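The two returns above can be sketched in a few lines of Python for the deterministic case. This is only an illustration: the two-state MDP, the dictionaries `delta`, `r`, and `policy`, and the helper names are all hypothetical, and the discounted sum is truncated after a fixed number of steps rather than taken to infinity.

```python
def rollout(delta, r, policy, s, steps):
    """Follow a deterministic policy for `steps` actions and collect the rewards."""
    rewards = []
    for _ in range(steps):
        a = policy[s]                # a_t = pi(s_t)
        rewards.append(r[(s, a)])    # r_{t+1} = r(s_t, a_t)
        s = delta[(s, a)]            # s_{t+1} = delta(s_t, a_t)
    return rewards


def finite_horizon_return(rewards):
    """Undiscounted sum of the collected rewards."""
    return sum(rewards)


def discounted_return(rewards, gamma):
    """Sum of gamma**tau * reward_tau over the collected rewards."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return sum(gamma ** tau * rew for tau, rew in enumerate(rewards))


# Hypothetical two-state MDP (not from the slides).
delta = {("A", "go"): "B", ("A", "stay"): "A",
         ("B", "go"): "A", ("B", "stay"): "B"}
r = {("A", "go"): 1.0, ("A", "stay"): 0.0,
     ("B", "go"): 0.0, ("B", "stay"): 2.0}
policy = {"A": "go", "B": "stay"}

rewards = rollout(delta, r, policy, "A", steps=3)
finite = finite_horizon_return(rewards)
discounted = discounted_return(rewards, gamma=0.5)
```

Because $\gamma < 1$, the terms dropped by the truncation shrink geometrically, so a longer rollout changes the discounted value less and less.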
Theorem: there exists an optimal policy from any start state.

Policy Value
Definition: the value of a policy $\pi$ at state $s$ is
• finite horizon: $V_\pi(s) = E\big[\sum_{\tau=0}^{T-t} r(s_{t+\tau}, \pi(s_{t+\tau})) \mid s_t = s\big]$.
• infinite horizon: discount factor $\gamma \in [0, 1)$, $V_\pi(s) = E\big[\sum_{\tau=0}^{T-t} \gamma^{\tau} r(s_{t+\tau}, \pi(s_{t+\tau})) \mid s_t = s\big]$.
Problem: find policy $\pi$ with maximum value for all states.
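When transitions are stochastic, the expectation in the definition can be approximated by averaging sampled returns. The sketch below is a plain Monte Carlo estimate, not a method taken from the slides; the two-state chain, the dictionary layout of `P`, `r`, and `policy`, and the truncation at `horizon` steps are all assumptions made for illustration.

```python
import random


def sample_next_state(transitions, rng):
    """Draw s' from a dict {next_state: probability}."""
    states = list(transitions)
    weights = [transitions[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def estimate_value(P, r, policy, gamma, s, n_rollouts=5000, horizon=40, seed=0):
    """Monte Carlo estimate of V_pi(s), with the sum truncated at `horizon`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        state, discount, ret = s, 1.0, 0.0
        for _ in range(horizon):
            a = policy[state]
            ret += discount * r[(state, a)]   # adds gamma**tau * r(s, pi(s))
            discount *= gamma
            state = sample_next_state(P[(state, a)], rng)
        total += ret
    return total / n_rollouts


# Hypothetical chain: one action per state, stochastic transitions.
P = {(0, "a"): {0: 0.9, 1: 0.1},
     (1, "a"): {0: 0.5, 1: 0.5}}
r = {(0, "a"): 1.0, (1, "a"): 0.0}
policy = {0: "a", 1: "a"}

v0_estimate = estimate_value(P, r, policy, gamma=0.5, s=0)
```

The estimate is noisy: its error shrinks roughly with the square root of `n_rollouts`, which is one reason an exact characterization of $V_\pi$ is useful.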

Policy Evaluation
Analysis of policy value:
\[
\begin{aligned}
V_\pi(s) &= E\Big[\sum_{\tau=0}^{T-t} \gamma^{\tau} r(s_{t+\tau}, \pi(s_{t+\tau})) \,\Big|\, s_t = s\Big] \\
&= E[r(s, \pi(s))] + \gamma\, E\Big[\sum_{\tau=0}^{T-t} \gamma^{\tau} r(s_{t+1+\tau}, \pi(s_{t+1+\tau})) \,\Big|\, s_t = s\Big] \\
&= E[r(s, \pi(s))] + \gamma\, E[V_\pi(\delta(s, \pi(s)))].
\end{aligned}
\]
Bellman equation (system of linear equations):
\[
V_\pi(s) = E[r(s, \pi(s))] + \gamma \sum_{s'} \Pr[s' \mid s, \pi(s)]\, V_\pi(s').
\]

Bellman Equation - Existence and Uniqueness
Notation:
• transition probability matrix: $P_{s,s'} = \Pr[s' \mid s, \pi(s)]$.
• value column matrix: $V = V_\pi(s)$.
• expected reward column matrix: $R = E[r(s, \pi(s))]$.
Theorem...
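
In the notation above, the Bellman equation reads $V = R + \gamma P V$, that is, $(I - \gamma P) V = R$. As a minimal sketch of policy evaluation as a linear solve, the Python below builds $I - \gamma P$ and solves the system with a small Gaussian elimination written out by hand. The two-state matrices `P` and `R` and both function names are hypothetical and chosen only to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def evaluate_policy(P, R, gamma):
    """Return V solving (I - gamma * P) V = R."""
    n = len(R)
    a = [[(1.0 if i == j else 0.0) - gamma * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve_linear_system(a, R)


# Hypothetical 2-state chain induced by some fixed policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
R = [1.0, 0.0]
gamma = 0.5

V = evaluate_policy(P, R, gamma)

# Bellman residual: should be numerically zero at the solution.
residual = [V[i] - (R[i] + gamma * sum(P[i][j] * V[j] for j in range(2)))
            for i in range(2)]
```

For this chain the solve gives V = [1.875, 0.625]. Where NumPy is available, `numpy.linalg.solve` would replace the hand-written elimination.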
This note was uploaded on 07/12/2012 for the course CSCI GA.256600 taught by Professor Mohri during the Spring '12 term at NYU.