We define the transition kernel to be the probability


...tain

\[
p(Z, \theta, \phi \mid w, \alpha, \beta) \propto
\prod_{i=1}^{m} \prod_{j=1}^{n_i} \mathrm{Multinomial}(w_{i,j}; \phi_{z_{i,j}})
\times \prod_{i=1}^{m} \prod_{j=1}^{n_i} \mathrm{Multinomial}(z_{i,j}; \theta_i)
\times \prod_{i=1}^{m} \mathrm{Dirichlet}(\theta_i; \alpha)
\prod_{k=1}^{K} \mathrm{Dirichlet}(\phi_k; \beta). \tag{22}
\]

For any given Z, θ, φ, w, α, and β, we can easily evaluate (22). (All the normalizations are known, so it's easy.) We will see in the next section that this is sufficient to be able to simulate draws from the posterior.

Even using conjugate priors, in general it will not be possible to recover the posterior analytically for hierarchical models of any complexity. Instead we will rely on sampling methods (among a few other options) such as Markov chain Monte Carlo (MCMC), which we discuss in the next section.

What the statistics community calls Bayesian hierarchical models are often treated in the machine learning community as a special case of Bayesian graphical models (specifically, directed acyclic graphs). There is at least one entire course at MIT on inference in Bayesian graphical models (6.438).

6 Markov Chain Monte Carlo sampling

As we have se...
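
To illustrate the earlier claim that (22) is easy to evaluate for any given Z, θ, φ, w, α, and β, here is a minimal Python sketch that computes the logarithm of its right-hand side. It is not taken from the notes: the function names, the 0-based indexing of words and topics, and the choice to pass α and β as full parameter vectors are all assumptions made for illustration. Each Multinomial factor is treated as a single draw, so it contributes just the probability of the observed word or topic.

```python
import math


def log_dirichlet(x, alpha):
    """Log density of Dirichlet(x; alpha), including its normalizing constant."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xk) for a, xk in zip(alpha, x))


def log_unnormalized_posterior(z, theta, phi, w, alpha, beta):
    """Log of the right-hand side of (22). All names here are hypothetical.

    w[i][j]  : index of word j in document i   (0 .. V-1, assumed 0-based)
    z[i][j]  : topic assigned to that word     (0 .. K-1, assumed 0-based)
    theta[i] : topic proportions of document i (length K)
    phi[k]   : word distribution of topic k    (length V)
    alpha    : Dirichlet parameter vector of length K (assumption)
    beta     : Dirichlet parameter vector of length V (assumption)
    """
    total = 0.0
    for i, doc in enumerate(w):
        for j, word in enumerate(doc):
            topic = z[i][j]
            total += math.log(phi[topic][word])  # Multinomial(w_ij; phi_{z_ij})
            total += math.log(theta[i][topic])   # Multinomial(z_ij; theta_i)
        total += log_dirichlet(theta[i], alpha)  # Dirichlet(theta_i; alpha)
    for phi_k in phi:
        total += log_dirichlet(phi_k, beta)      # Dirichlet(phi_k; beta)
    return total


# Toy example: m = 2 documents, K = 2 topics, V = 3 vocabulary words.
w = [[0, 2], [1]]
z = [[0, 1], [1]]
theta = [[0.5, 0.5], [0.25, 0.75]]
phi = [[0.5, 0.25, 0.25], [0.2, 0.3, 0.5]]
value = log_unnormalized_posterior(z, theta, phi, w, [1.0, 1.0], [1.0, 1.0, 1.0])
```

Working in log space avoids numerical underflow when the products in (22) run over many words. The sketch assumes every entry of θ and φ is strictly positive, since a zero entry would make the logarithm undefined.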

This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthia Rudin during the Spring '12 term at MIT.
