# cs188_sp09_mt1_sol 8

The dribble success probability $y$ in fact depends on the presence or absence of a defending robot, $D$. $A$ has no way of detecting whether $D$ is present, but does know some statistical properties of its environment. $D$ is present $\frac{2}{3}$ of the time. When $D$ is absent, $y = \frac{3}{4}$. When $D$ is present, $y = \frac{1}{4}$.

(f) (4 pt) What is the posterior probability that $D$ is present, given that $A$ Dribbles twice successfully from 1 to 3, then Shoots from state 3 and scores?

We can use Bayes' rule, where $D$ is a random variable denoting the presence of $D$, and $e$ is the evidence that $A$ dribbled twice and scored.

$$P(d \mid e) = \frac{P(e \mid d) \cdot P(d)}{P(e)}$$

$$P(e) = P(e \mid d) \cdot P(d) + P(e \mid \neg d) \cdot P(\neg d)$$

$$P(e \mid d) = \frac{1}{4} \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{1}{32}$$

$$P(e \mid \neg d) = \frac{3}{4} \cdot \frac{3}{4} \cdot \frac{1}{2} = \frac{9}{32}$$

$$P(e) = \frac{1}{32} \cdot \frac{2}{3} + \frac{9}{32} \cdot \frac{1}{3} = \frac{11}{96}$$

$$P(d \mid e) = \frac{2}{96} \bigg/ \frac{11}{96} = \frac{2}{11}$$

(g) (3 pt) What transition model should $A$ use in order to correctly compute its maximum expected reward when it doesn't know whether or not $D$ is present?

To maximize expected total reward, the agent should model the situation as accurately as possible.
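The arithmetic in part (f) can be checked with exact fractions. The following is a minimal sketch, not part of the original solution: the function name `posterior_present` and its parameters are hypothetical, and the factor of 1/2 is assumed to be the probability of scoring from state 3, read off from the likelihood terms in the solution.

```python
from fractions import Fraction as F


def posterior_present(prior_present, y_present, y_absent, shot_success, dribbles=2):
    """Posterior P(d | e) by Bayes' rule for `dribbles` successful dribbles and one scored shot.

    All names here are illustrative; shot_success is assumed to be the
    probability of scoring from state 3 (the 1/2 factor in the solution).
    """
    # Likelihood of the evidence under each hypothesis about the defender.
    like_present = y_present ** dribbles * shot_success
    like_absent = y_absent ** dribbles * shot_success

    # Total probability of the evidence, marginalizing over the defender.
    evidence = like_present * prior_present + like_absent * (1 - prior_present)

    return like_present * prior_present / evidence


result = posterior_present(
    prior_present=F(2, 3),  # D is present 2/3 of the time
    y_present=F(1, 4),      # dribble success when D is present
    y_absent=F(3, 4),       # dribble success when D is absent
    shot_success=F(1, 2),   # assumed shot success from state 3
)
print(result)  # 2/11
```

Because the shot-success factor appears in both likelihoods, it cancels in the ratio, so the posterior depends only on the prior and the two dribble probabilities.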