CSci 5512: Artificial Intelligence II
Instructor: Arindam Banerjee
February 1, 2012

Inference by Stochastic Simulation

- Basic idea:
  - Draw N samples from a sampling distribution
  - Compute an approximate posterior probability P̂
  - Show this converges to the true probability P
- Sampling approaches:
  - Sampling from an empty network
  - Rejection sampling
  - Likelihood weighting
  - Markov chain Monte Carlo (MCMC)
Bayesian Networks with Loops

- A direct application of sum-product can be problematic
- The network can be converted to a junction tree, but its size can be exponential
- Focus on approximate inference techniques:
  - Stochastic inference, based on sampling
  - Deterministic inference, based on approximations

Sampling from an Empty Network

- Consider a Bayesian network over P(X1, ..., Xn)
- The joint distribution factorizes as
    P(X1, ..., Xn) = ∏_{i=1..n} P(Xi | Parents(Xi))
- For i = 1, ..., n:
  - Assume Parents(Xi) have already been instantiated
  - Draw a sample xi following P(Xi | Parents(Xi))
- (x1, ..., xn) then forms a sample from the Bayesian network

Rejection Sampling

- P̂(X | e) is estimated from the samples agreeing with the evidence e:
  - Draw a sample x from the Bayesian network
  - If x is consistent with e, increment N(x)
  - Obtain P̂(X | e) by normalization
- Example: estimate P(Rain | Sprinkler = true) using 100 samples
  - 27 samples have Sprinkler = true
  - Of these, 8 have Rain = true and 19 have Rain = false
  - P̂(Rain = true | Sprinkler = true) = 8/27 ≈ 0.296
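The two procedures above can be sketched together: prior sampling draws each variable in topological order from its conditional table, and rejection sampling keeps only the draws consistent with the evidence. The sketch below uses the Cloudy/Sprinkler/Rain/WetGrass network; the Sprinkler and Rain tables match the values shown in these slides, while P(Cloudy) = 0.5, the WetGrass table, and all function names are assumptions taken from the standard textbook version of this example.

```python
import random

# CPTs for the Cloudy/Sprinkler/Rain/WetGrass network. The Sprinkler and
# Rain tables match the slides; P(Cloudy) and the WetGrass table are
# assumed from the standard textbook version of this example.
P_CLOUDY = 0.5
def p_sprinkler(c): return 0.1 if c else 0.5           # P(S=true | C)
def p_rain(c):      return 0.8 if c else 0.2           # P(R=true | C)
def p_wetgrass(s, r):                                  # P(W=true | S, R)
    return {(True, True): 0.99, (True, False): 0.90,
            (False, True): 0.90, (False, False): 0.00}[(s, r)]

def prior_sample(rng):
    """Sample each variable in topological order, so every variable's
    parents are already instantiated when it is drawn."""
    c = rng.random() < P_CLOUDY
    s = rng.random() < p_sprinkler(c)
    r = rng.random() < p_rain(c)
    w = rng.random() < p_wetgrass(s, r)
    return c, s, r, w

def rejection_estimate(n, rng):
    """Estimate P(Rain=true | Sprinkler=true): draw n prior samples,
    discard those inconsistent with the evidence, normalize the rest."""
    agree = rain_true = 0
    for _ in range(n):
        _, s, r, _ = prior_sample(rng)
        if s:                        # sample agrees with evidence S = true
            agree += 1
            rain_true += r
    return rain_true / agree if agree else float("nan")

print(rejection_estimate(100_000, random.Random(0)))   # close to the exact 0.3
```

Note that about 70% of the draws are discarded here, since P(Sprinkler = true) = 0.3 under these CPTs; this waste is exactly what grows with more evidence variables.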
Sampling from an Empty Network (Contd.)

- Probability of generating (x1, ..., xn) equals P(x1, ..., xn)
- Sampling therefore follows the true prior probability
- How to estimate P(x1, ..., xn) from samples?
- Let N(x1, ..., xn) = number of samples equal to (x1, ..., xn); then
    lim_{N→∞} P̂(x1, ..., xn) = lim_{N→∞} N(x1, ..., xn) / N = P(x1, ..., xn)
- Estimates derived from samples are consistent: P̂(x1, ..., xn) ≈ P(x1, ..., xn)

Analysis of Rejection Sampling

- Rejection sampling estimates N(X, e) and N(e)
- The conditional probability estimate is
    P̂(X | e) = α N(X, e) = N(X, e) / N(e) ≈ P(X | e)
- It therefore yields consistent posterior estimates
- However, P(e) drops off exponentially with the number of evidence variables
- When P(e) is very small, a large number of samples is needed to get reliable estimates
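The last point can be made concrete with a back-of-the-envelope sketch. Suppose, purely for illustration, that each of k evidence variables independently matches a prior sample with probability p = 0.3 (the value of P(Sprinkler = true) in the sprinkler example); real networks need not have independent evidence, but the exponential trend is the same.

```python
# If each of k evidence variables independently matches a prior sample with
# probability p, only a fraction p**k of samples survive rejection, so the
# number of draws needed per accepted sample grows exponentially in k.
# (p = 0.3 and the independence assumption are illustrative only.)
p = 0.3
for k in (1, 2, 5, 10):
    accept = p ** k
    print(f"k={k:2d}  accept rate={accept:.2e}  draws per kept sample={1/accept:.1e}")
```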
Likelihood Weighting

- Main idea:
  - Fix the evidence variables; sample only the non-evidence variables
  - Weight each sample by the likelihood of the evidence
- Algorithm: set w = 1; for i = 1 to n:
  - If Xi is a non-evidence variable, sample xi from P(Xi | Parents(Xi))
  - If Xi is an evidence variable Ei, set w ← w × P(ei | Parents(Ei))
- (x, w) then forms a weighted sample

Likelihood Weighting Analysis

- Sampling probability for the non-evidence component z:
    S(z, e) = ∏_{i=1..l} P(zi | Parents(Zi))
- Weight contributed by the evidence component e:
    w(z, e) = ∏_{i=1..m} P(ei | Parents(Ei))
- The weighted sampling probability is therefore
    S(z, e) w(z, e) = ∏_{i=1..l} P(zi | Parents(Zi)) × ∏_{i=1..m} P(ei | Parents(Ei)) = P(z, e)
- Likelihood weighting thus returns consistent estimates
- Performance degrades with many evidence variables
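The algorithm and its analysis can be illustrated on the same sprinkler network, estimating P(Rain | Sprinkler = true, WetGrass = true): non-evidence variables are sampled from their CPTs, while each evidence variable multiplies the weight by its likelihood. The Sprinkler and Rain tables match the slides; the WetGrass table and the helper names are assumptions from the standard textbook version.

```python
import random

# CPTs as in the slides' sprinkler example (WetGrass table assumed from
# the standard textbook version of the network).
def p_sprinkler(c): return 0.1 if c else 0.5           # P(S=true | C)
def p_rain(c):      return 0.8 if c else 0.2           # P(R=true | C)
def p_wetgrass(s, r):                                  # P(W=true | S, R)
    return {(True, True): 0.99, (True, False): 0.90,
            (False, True): 0.90, (False, False): 0.00}[(s, r)]

def weighted_sample(rng):
    """One likelihood-weighted sample with evidence S = true, W = true.
    Non-evidence variables (Cloudy, Rain) are sampled from their CPTs;
    each evidence variable multiplies w by its likelihood instead."""
    w = 1.0
    c = rng.random() < 0.5                   # sample Cloudy
    w *= p_sprinkler(c)                      # evidence: Sprinkler = true
    r = rng.random() < p_rain(c)             # sample Rain
    w *= p_wetgrass(True, r)                 # evidence: WetGrass = true
    return r, w

def lw_estimate(n, rng):
    """Weighted fraction of samples with Rain = true."""
    num = den = 0.0
    for _ in range(n):
        r, w = weighted_sample(rng)
        den += w
        num += w * r
    return num / den

print(lw_estimate(100_000, random.Random(0)))   # exact answer is about 0.320
```

Every sample is used (none rejected), but the weights become small and highly variable as evidence accumulates, which is why performance still degrades with many evidence variables.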
Likelihood Weighting Example

- [Figure: the Cloudy/Sprinkler/Rain/WetGrass network with its conditional probability tables]
- With evidence Sprinkler = true, WetGrass = true, a sample that draws Cloudy = true and Rain = true gets weight
    w = 1.0 × 0.1 × 0.99 = 0.099
  (0.1 = P(Sprinkler = true | Cloudy = true), 0.99 = P(WetGrass = true | Sprinkler = true, Rain = true))

Approximate Inference using MCMC

- Construct a Markov chain based on the Bayesian network
- "State" of the network = current assignment to all variables
- Generate the next state by sampling one variable given its Markov blanket
- Sample each variable in turn, keeping the evidence fixed
- More general sampling schedules are admissible

The Markov chain

- With Sprinkler = true and WetGrass = true fixed, there are four states (the joint assignments to Cloudy and Rain)
- Wander about for a while, and average what you see

Markov Blanket Sampling

- The Markov blanket of Cloudy is {Sprinkler, Rain}
- The Markov blanket of Rain is {Cloudy, Sprinkler, WetGrass}
- The probability given the Markov blanket is calculated as
    P(xi | MB(Xi)) ∝ P(xi | Parents(Xi)) × ∏_{Zj ∈ Children(Xi)} P(zj | Parents(Zj))
- Main computational problems:
  - Difficult to tell whether convergence has been achieved
  - Can be wasteful if the Markov blanket is large
MCMC Example (Contd.)

- Problem: estimate P(Rain | Sprinkler = true, WetGrass = true)
- Sample Cloudy or Rain given its Markov blanket; repeat
- Count the number of times Rain is true and false in the samples
- Example: visit 100 states
  - 31 have Rain = true, 69 have Rain = false
  - P̂(Rain = true | Sprinkler = true, WetGrass = true) = 31/100 = 0.31
- Theorem: the Markov chain approaches its stationary distribution
- The long-run fraction of time spent in each state is proportional to its posterior probability
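The whole procedure can be sketched as a Gibbs sampler over the two non-evidence variables, each resampled in turn from its Markov-blanket conditional (the formula above) while the evidence stays fixed. CPT values again follow the slides' sprinkler example, with the WetGrass table assumed from the standard textbook version.

```python
import random

def p_sprinkler(c): return 0.1 if c else 0.5           # P(S=true | C)
def p_rain(c):      return 0.8 if c else 0.2           # P(R=true | C)
def p_wetgrass(s, r):                                  # P(W=true | S, R)
    return {(True, True): 0.99, (True, False): 0.90,
            (False, True): 0.90, (False, False): 0.00}[(s, r)]

def gibbs_estimate(n_steps, rng):
    """Estimate P(Rain=true | Sprinkler=true, WetGrass=true) by resampling
    Cloudy and Rain from their Markov-blanket conditionals:
    P(x | MB(X)) ∝ P(x | parents) * product of children's CPT entries."""
    c, r = True, True                      # arbitrary initial state
    rain_true = 0
    for _ in range(n_steps):
        # Cloudy's blanket is {Sprinkler, Rain}; S = true is evidence.
        t = 0.5 * p_sprinkler(True)  * (p_rain(True)  if r else 1 - p_rain(True))
        f = 0.5 * p_sprinkler(False) * (p_rain(False) if r else 1 - p_rain(False))
        c = rng.random() < t / (t + f)
        # Rain's blanket is {Cloudy, Sprinkler, WetGrass}; S = W = true.
        t = p_rain(c) * p_wetgrass(True, True)
        f = (1 - p_rain(c)) * p_wetgrass(True, False)
        r = rng.random() < t / (t + f)
        rain_true += r
    return rain_true / n_steps

print(gibbs_estimate(100_000, random.Random(0)))   # exact answer is about 0.320
```

Unlike rejection sampling, no draws are wasted on inconsistent evidence, but consecutive states are correlated, so more steps are needed than the slide's 100-state illustration for a reliable estimate.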