we obtain

\[
p(Z, \theta, \phi \mid w, \alpha, \beta) \;\propto\;
\prod_{i=1}^{m} \prod_{j=1}^{n_i} \text{Multinomial}(w_{i,j};\, \phi_{z_{i,j}})
\times \prod_{i=1}^{m} \prod_{j=1}^{n_i} \text{Multinomial}(z_{i,j};\, \theta_i)
\times \prod_{i=1}^{m} \text{Dirichlet}(\theta_i;\, \alpha)
\times \prod_{k=1}^{K} \text{Dirichlet}(\phi_k;\, \beta). \tag{22}
\]

For any given Z, θ, φ, w, α, and β, we can easily evaluate (22). (All the normalizations are known, so it's easy.) We will see in the next section that this
is sufficient to be able to simulate draws from the posterior.
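Concretely, evaluating (22) amounts to summing log-probabilities of each factor. Below is a minimal sketch (not from the notes; the function name and the array layout are illustrative assumptions) that computes the log of the unnormalized posterior for a fixed topic assignment Z and fixed θ, φ, using `scipy.stats.dirichlet` for the prior terms:

```python
# Minimal sketch: log of the unnormalized LDA posterior (22) for fixed
# Z, theta, phi, given w, alpha, beta.  Array layout is an assumption
# made here for illustration.
import numpy as np
from scipy.stats import dirichlet

def log_unnormalized_posterior(Z, w, theta, phi, alpha, beta):
    """Z, w: length-m lists of int arrays (topic / word ids; doc i has n_i entries).
    theta: (m, K) document-topic proportions; phi: (K, V) topic-word distributions.
    alpha: (K,) and beta: (V,) Dirichlet hyperparameters."""
    logp = 0.0
    for i, (z_i, w_i) in enumerate(zip(Z, w)):
        logp += np.sum(np.log(phi[z_i, w_i]))      # prod_j Multinomial(w_ij; phi_{z_ij})
        logp += np.sum(np.log(theta[i][z_i]))      # prod_j Multinomial(z_ij; theta_i)
        logp += dirichlet.logpdf(theta[i], alpha)  # Dirichlet(theta_i; alpha)
    for k in range(phi.shape[0]):
        logp += dirichlet.logpdf(phi[k], beta)     # Dirichlet(phi_k; beta)
    return logp
```

Since the proportionality constant in (22) is dropped, this value is only meaningful up to an additive constant in log space, but that is exactly what MCMC methods need.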
Even using conjugate priors, in general it will not be possible to recover
the posterior analytically for hierarchical models of any complexity. We will
rely on (among a few other options) sampling methods like Markov chain Monte Carlo (MCMC), which we discuss in the next section. What the statistics community calls Bayesian hierarchical models, the machine learning community often treats as a special case of Bayesian graphical models (specifically, directed acyclic graphs). There is at least one entire
course at MIT on inference in Bayesian graphical models (6.438).

6 Markov Chain Monte Carlo sampling

As we have se...
This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthia Rudin during the Spring '12 term at MIT.