CS 228, Winter 2006 Final (Handout #16)

1. [8 points] Influence Diagrams

Consider an influence diagram containing a single utility factor, in which all variables other than $D$ (a decision node) and its parents have been eliminated using standard VE. Show that the decision rule $\delta_D$ that maximizes
$$\sum_{D, \mathrm{Pa}_D} \delta_D \cdot \mu_{-D}(D, \mathrm{Pa}_D)$$
is given by
$$\delta_D(d, w) = \begin{cases} 1, & d = \arg\max_{d' \in \mathrm{Val}(D)} \mu_{-D}(d', w), \\ 0, & \text{otherwise,} \end{cases}$$
for every $w \in \mathrm{Val}(\mathrm{Pa}_D)$.

2. [18 points] Causality

(a) [5 points] For probabilistic queries, we have that $\min_x P(y \mid x) \le P(y) \le \max_x P(y \mid x)$. Show that the same property does not hold for intervention queries. Specifically, provide an example where it is not the case that
$$\min_x P(y \mid do(x)) \le P(y) \le \max_x P(y \mid do(x)).$$

(b) [6 points] As for probabilistic independence, we can define a notion of causal independence: $(X \perp_C Y \mid Z)$ holds if, for any values $x, x' \in \mathrm{Val}(X)$, we have that $P(Y \mid do(Z), do(x)) = P(Y \mid do(Z), do(x'))$. (Note that, unlike probabilistic independence $(X \perp Y \mid Z)$, causal independence is not symmetric in $X$, $Y$.) Is causal independence equivalent to the statement: for any value $x \in \mathrm{Val}(X)$, we have that $P(Y \mid do(Z), do(x)) = P(Y \mid do(Z))$? (Hint: use your result from (a).)

(c) [7 points] Prove that $(X \perp_C Y \mid Z, W)$ and $(W \perp_C Y \mid X, Z)$ imply $(X, W \perp_C Y \mid Z)$. Intuitively, this property states that if changing $X$ cannot affect $P(Y)$ when $W$ is fixed, and changing $W$ cannot affect $P(Y)$ when $X$ is fixed, then changing $X$ and $W$ together cannot affect $P(Y)$.

3. [10 points] Learning in DBNs

(a) [5 points] Suppose that we have fully observed sequences (from time 0 to $T$) of the variables in a DBN. What is the marginal likelihood $P(\mathcal{D} \mid \mathcal{G})$ of the DBN using Dirichlet$(1, \ldots, 1)$ priors on the parameters?

(b) [5 points] Now suppose that we want to learn the optimal DBN structure (a $\langle \mathcal{B}_0, \mathcal{B}_\rightarrow \rangle$ pair) for the data from part (a), given some initial structure. Can we do this by running the usual greedy structure search algorithm (p. 399) on the BN produced by unrolling our DBN $T$ time steps? If so, why? If not, what changes must be made?

4. [21 points] Inference

Consider a chain probabilistic network $X_1 - X_2 - \cdots - X_n$ and a corresponding clique tree of the form $C_1 - \cdots - C_{n-1}$, where $\mathrm{Scope}[C_i] = \{X_i, X_{i+1}\}$. Each variable ...
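As an aside (not part of the original handout), the chain clique tree described in Problem 4 can be calibrated with a single forward and backward sum-product pass. The sketch below is a minimal illustration under assumed conventions: the function name `chain_clique_marginals`, the potential list `phis`, and the assumption that every variable takes the same number of values `k` are all illustrative choices, not taken from the exam.

```python
import numpy as np

def chain_clique_marginals(phis):
    """Sum-product message passing on a chain clique tree.

    phis: list of k x k arrays; phis[i] is the potential of the clique with
    scope {X_i, X_{i+1}} (0-based variable indices), rows indexed by X_i and
    columns by X_{i+1}. Returns the normalized clique beliefs P(X_i, X_{i+1}).
    """
    m = len(phis)            # number of cliques in the chain
    k = phis[0].shape[0]     # number of values per variable (assumed equal)

    # Forward pass: fwd[i] is the message arriving at clique i from its left
    # neighbor, a vector over the sepset variable X_i.
    fwd = [np.ones(k)]
    for i in range(1, m):
        fwd.append(fwd[i - 1] @ phis[i - 1])   # sums out X_{i-1}

    # Backward pass: bwd[i] is the message arriving at clique i from its right
    # neighbor, a vector over the sepset variable X_{i+1}.
    bwd = [np.ones(k) for _ in range(m)]
    for i in range(m - 2, -1, -1):
        bwd[i] = phis[i + 1] @ bwd[i + 1]      # sums out X_{i+2}

    # Calibrated clique beliefs: local potential times both incoming messages,
    # normalized so each belief sums to one.
    beliefs = []
    for i in range(m):
        b = fwd[i][:, None] * phis[i] * bwd[i][None, :]
        beliefs.append(b / b.sum())
    return beliefs
```

For example, on a three-variable chain with pairwise potentials `psi01` and `psi12` (hypothetical arrays), `chain_clique_marginals([psi01, psi12])` returns the two clique beliefs over $(X_0, X_1)$ and $(X_1, X_2)$.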