
# lecture 13 - CS 188 Artificial Intelligence Spring 2010...


CS 188: Artificial Intelligence, Spring 2010
Lecture 13: Probability (3/2/2010)
Pieter Abbeel, UC Berkeley
Many slides adapted from Dan Klein.

## Announcements

- Upcoming
  - **new** Tomorrow/Wednesday: probability review session
    - 7:30-9:30pm in 306 Soda
  - P3 due on Thursday (3/4)
  - W4 going out on Thursday, due next week Thursday (3/11)
  - Midterm in evening of 3/18

## Today

- We’re almost done with search and planning!
  - MDP’s: policy search wrap-up
- Next, we’ll start studying how to reason with probabilities
  - Diagnosis
  - Tracking objects
  - Speech recognition
  - Robot mapping
  - … lots more!
- Third part of course: machine learning

## Policy Search
## MDPs recap

- MDP recap: $(S, A, T, R, s_0, \gamma)$
- In small MDPs: can find V(s) and/or Q(s,a)
  - Known T, R: value iteration, policy iteration
  - Unknown T, R: Q learning
- In large MDPs: cannot enumerate all states

## Function Approximation

- Q-learning with linear q-functions:
  - [Slide equations: update for exact Q’s; update for approximate Q’s]
- Intuitive interpretation:
  - Adjust weights of active features
  - E.g. if something unexpectedly bad happens, disprefer all states with that state’s features
- Formal justification: online least squares
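
The slide's update equations did not survive in this copy, so the following is only a sketch of the standard textbook form of Q-learning with a linear Q-function, not necessarily the exact notation used in lecture. It assumes Q(s,a) is a weighted sum of features, and that each weight moves in proportion to its feature's value times the difference between the sampled target and the current estimate. The names `q_value`, `q_learning_update`, and `example_features`, and the values of `alpha` and `gamma`, are illustrative assumptions.

```python
from typing import Callable, Dict, Hashable, Iterable

Features = Dict[str, float]


def q_value(weights: Dict[str, float], feats: Features) -> float:
    """Linear Q-function: the weighted sum of feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def q_learning_update(
    weights: Dict[str, float],
    features: Callable[[Hashable, Hashable], Features],
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    next_actions: Iterable[Hashable],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> Dict[str, float]:
    """Return new weights after observing one transition (s, a, r, s')."""
    feats = features(state, action)
    # Best estimated value in the next state; 0.0 if there are no actions (terminal).
    next_q = max(
        (q_value(weights, features(next_state, a)) for a in next_actions),
        default=0.0,
    )
    difference = (reward + gamma * next_q) - q_value(weights, feats)
    new_weights = dict(weights)
    # Only features that are active (non-zero) for (s, a) change their weight.
    for name, value in feats.items():
        new_weights[name] = new_weights.get(name, 0.0) + alpha * difference * value
    return new_weights


def example_features(state: Hashable, action: Hashable) -> Features:
    """Hypothetical feature function: a bias term plus a 'danger' indicator."""
    return {"bias": 1.0, "danger": 1.0 if action == "left" else 0.0}


# Something unexpectedly bad happens after moving left into a terminal state.
weights = q_learning_update({}, example_features, "s", "left", -10.0, "end", [])
```

After this single update both active features carry a negative weight, so every state-action pair sharing those features now looks worse. This matches the intuitive interpretation in the list above.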

## Policy Search Idea

- Problem: often the feature-based policies that work well aren’t the ones that approximate V / Q best
- Solution: learn the policy that maximizes rewards rather than the value that predicts rewards
- This is the idea behind policy search, such as what controlled the upside-down helicopter

## Policy Search

- Simplest policy search:
  - Start with an initial linear value function or Q-function
  - Nudge each feature weight up and down and see if
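
The slide text is cut off at this point, so the sketch below rests on an assumption: a nudge is kept only when it improves the policy's measured performance. It shows a simple coordinate-wise hill climb over feature weights. The function `nudge_policy_search` and its `evaluate` argument are hypothetical names; in practice `evaluate` would run the policy induced by the weights for some episodes and return the average reward, whereas here a toy scoring function stands in for it.

```python
from typing import Callable, List, Tuple


def nudge_policy_search(
    weights: List[float],
    evaluate: Callable[[List[float]], float],
    step: float = 0.1,
    sweeps: int = 50,
) -> Tuple[List[float], float]:
    """Nudge each weight up and down, keeping any change that scores higher."""
    best = list(weights)
    best_score = evaluate(best)
    for _ in range(sweeps):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                candidate = list(best)
                candidate[i] += delta
                score = evaluate(candidate)
                if score > best_score:
                    best, best_score = candidate, score
                    improved = True
                    break  # keep this nudge and move on to the next weight
        if not improved:
            break  # no single nudge helps any more
    return best, best_score


def toy_return(w: List[float]) -> float:
    """Toy stand-in for policy evaluation (assumption): peaks at w = [1.0, -0.5]."""
    return -((w[0] - 1.0) ** 2 + (w[1] + 0.5) ** 2)


final_weights, final_score = nudge_policy_search([0.0, 0.0], toy_return)
```

Each sweep calls `evaluate` up to twice per weight, so this approach becomes expensive when evaluation means running many episodes or when there are many features.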