SP10 cs188 lecture 9 -- MDPs (2PP)


CS 188: Artificial Intelligence, Spring 2010

Lecture 9: MDPs (2/16/2010)

Pieter Abbeel – UC Berkeley

Many slides adapted from Dan Klein

## Announcements

- Assignments
  - P2 due Thursday
  - We reserved Soda 271 on Wednesday Feb 17 from 4 to 6. One of the GSIs will periodically drop in to see if he can provide any clarifications/assistance. It's a great opportunity to meet other students who might still be looking for a partner.
- Readings:
  - For MDPs / reinforcement learning, we’re using an online reading
  - Lecture version is the standard for this class

## Example: Insurance

- Consider the lottery [0.5, \$1000; 0.5, \$0]
  - What is its expected monetary value (EMV)? (\$500)
  - What is its certainty equivalent?
    - Monetary value acceptable in lieu of lottery
    - \$400 for most people
  - Difference of \$100 is the insurance premium
    - There’s an insurance industry because people will pay to reduce their risk
    - If everyone were risk-neutral, no insurance needed!

## Example: Insurance

- Because people ascribe different utilities to different amounts of money, insurance agreements can increase both parties’ expected utility

You own a car. Your lottery: $L_Y = [0.8, \$0;\ 0.2, -\$200]$, i.e., 20% chance of crashing. You do not want -\$200!

$U_Y(L_Y) = 0.2 \cdot U_Y(-\$200) = -200$

$U_Y(-\$50) = -150$

| Amount | Your Utility $U_Y$ |
|--------|--------------------|
| \$0    | 0                  |
| -\$50  | -150               |
| -\$200 | -1000              |
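The arithmetic in the two insurance slides can be checked with a short Python sketch. It is only an illustration, not part of the lecture: the representation of a lottery as a list of `(probability, amount)` pairs and the helper names `expected_monetary_value` and `expected_utility` are assumptions made here, while the numbers (the \$400 certainty equivalent and the utility table) are taken from the slides.

```python
# A lottery is modeled as a list of (probability, monetary outcome) pairs.
# This representation and the helper names are assumptions for illustration.

def expected_monetary_value(lottery):
    """Probability-weighted average of the monetary outcomes."""
    return sum(p * amount for p, amount in lottery)

def expected_utility(lottery, utility):
    """Probability-weighted average of the utilities of the outcomes."""
    return sum(p * utility[amount] for p, amount in lottery)

# First slide: the lottery [0.5, $1000; 0.5, $0].
coin_flip = [(0.5, 1000), (0.5, 0)]
emv = expected_monetary_value(coin_flip)
certainty_equivalent = 400            # "$400 for most people" (from the slide)
premium = emv - certainty_equivalent  # the slide's insurance premium

# Second slide: your utility table and your car lottery L_Y.
U_Y = {0: 0, -50: -150, -200: -1000}
L_Y = [(0.8, 0), (0.2, -200)]
eu_uninsured = expected_utility(L_Y, U_Y)  # 0.8 * 0 + 0.2 * (-1000)
eu_insured = U_Y[-50]                      # pay $50 for certain

print(emv, premium, eu_uninsured, eu_insured)
```

Under the slide's utility table, paying \$50 for certain (utility -150) is preferred to facing the crash lottery (expected utility -200), even though the lottery's expected monetary loss is only \$40.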
Insurance company buys risk: $L_I = [0.8, \$50;\ 0.2, -\$150]$, i.e., \$50 revenue + your $L_Y$

Insurer is risk-neutral: $U(L) = U(\mathrm{EMV}(L))$

$U_I(L_I) = U(0.8 \cdot 50 + 0.2 \cdot (-150)) = U(\$10) > U(\$0)$

## Example: Human Rationality?
