2. Let us use the mean-field variational technique for approximate inference. In particular, let us approximate the posterior by the following distribution:

\[
Q\left(\theta_d, \{z_{dn}\}_{n=1}^{N_d} \,\middle|\, \gamma_d, \{\phi_{dn}\}_{n=1}^{N_d}\right)
= \left\{ P(\theta_d \mid \gamma_d) \prod_{n=1}^{N_d} \prod_{k=1}^{K} (\phi_{dnk})^{z_{dnk}} \right\}
\tag{2}
\]

where \(\phi_{dn} = (\phi_{dn1}, \cdots, \phi_{dnK})\) is a variational multinomial distribution over topics for position \(n\) in document \(d\), and \(\gamma_d = (\gamma_{d1}, \cdots, \gamma_{dK})\) are the parameters of a variational Dirichlet distribution given by:

\[
P(\theta_d \mid \gamma_d)
= \frac{\Gamma\left(\sum_k \gamma_{dk}\right)}{\prod_k \Gamma(\gamma_{dk})}
\prod_{k=1}^{K} (\theta_{dk})^{\gamma_{dk} - 1}
\tag{3}
\]

The graphical representation of the variational distribution is given in Figure 1(b).
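To make the factorized structure of Eq. (2) concrete, here is a minimal Python sketch that evaluates the log-density of this variational distribution for one document. The function name `log_q` and the array conventions (integer topic indices `z_d` standing in for the one-hot indicators \(z_{dnk}\)) are illustrative assumptions, not part of the assignment.

```python
import numpy as np
from scipy.stats import dirichlet

def log_q(theta_d, z_d, gamma_d, phi_d):
    """Log-density of the mean-field distribution in Eq. (2) for one document.

    theta_d : (K,)     topic proportions (a point on the simplex)
    z_d     : (N_d,)   topic index per word position (one-hot form is z_dnk)
    gamma_d : (K,)     variational Dirichlet parameters, Eq. (3)
    phi_d   : (N_d, K) variational multinomials over topics, rows sum to 1
    """
    # Dirichlet factor P(theta_d | gamma_d) from Eq. (3)
    log_dir = dirichlet.logpdf(theta_d, gamma_d)
    # Multinomial factors: sum_n sum_k z_dnk * log(phi_dnk); with one-hot
    # z_dn, only the chosen topic contributes at each position n.
    log_mult = np.log(phi_d[np.arange(len(z_d)), z_d]).sum()
    return log_dir + log_mult

# Tiny usage example: K = 3 topics, N_d = 2 word positions
theta_d = np.array([0.2, 0.5, 0.3])
gamma_d = np.array([1.0, 2.0, 1.5])
phi_d = np.array([[0.1, 0.6, 0.3],
                  [0.5, 0.25, 0.25]])
print(log_q(theta_d, np.array([1, 0]), gamma_d, phi_d))
```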
Figure 1: Admixture model, graphical representation. (a) LDA graphical representation (plates \(M\), \(N\); variables \(\alpha, \theta, z, w, \beta\)). (b) Mean-field variational distribution for LDA (plates \(M\), \(N\); variables \(\gamma, \theta, \phi, z\)).

Starting with the result of mean-field variational inference given in Eq. (4) below, derive closed-form expressions for the variational parameters \(\gamma_{dk}\) and \(\phi_{dnk}\):

\[
Q(x) = \frac{1}{Z} \exp\left( \sum_{\Phi : X \in \mathrm{scope}(\Phi)} \mathbb{E}_{Q(U_\Phi)}\left[ \ln \Phi(U_\Phi, x) \right] \right)
\tag{4}
\]

where \(\Phi\) is a factor in the graphical model and \(U_\Phi = \mathrm{scope}(\Phi) \setminus \{X\}\). In your derivation, you will use the following property of the Dirichlet distribution:

\[
\mathbb{E}_{Q(\theta_d \mid \gamma_d)}[\log \theta_{dk}] = \Psi(\gamma_{dk}) - \Psi\left( \sum_{k'=1}^{K} \gamma_{dk'} \right)
\tag{5}
\]

where \(\Psi(\gamma_{dk}) = \frac{d}{d\gamma_{dk}} \log \Gamma(\gamma_{dk})\) is the digamma function. Prove the result above using the properties of the exponential family we discussed in class. (Hint: convert the Dirichlet to its natural parametrization and use the property that the expected value of a sufficient statistic equals the first derivative of the log-partition function w.r.t. the corresponding natural parameter.)

3. Now let us consider the Logistic Normal admixture model, known in common parlance as the Correlated Topic Model. It differs from LDA only in that the symmetric Dirichlet prior with parameter \(\alpha\) is replaced by a logistic normal distribution, which is given as follows:

\[
P(\eta_d \mid \mu, \Sigma) = \mathcal{N}(\mu, \Sigma)
\tag{6}
\]

where \(\eta_d = (\eta_{d1}, \cdots, \eta_{dK})\) with each \(\eta_{dk} \in \mathbb{R}\). Each \(\theta_{dk}\) is a simple logistic transformation of \(\eta_{dk}\):

\[
\theta_{dk} = \frac{\exp(\eta_{dk})}{\sum_{k'=1}^{K} \exp(\eta_{dk'})}
\tag{7}
\]

Using a logistic normal distribution as described above allows us to capture correlations between topics, given by the covariance matrix \(\Sigma\).
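For problem 2, working through Eq. (4) yields the standard closed-form coordinate-ascent updates for LDA (Blei, Ng and Jordan, 2003). The sketch below only shows the shape of those updates; deriving them from Eq. (4) is the assignment. The names `w_d` (word indices of document \(d\)), `beta` (\(K \times V\) topic-word distributions), and `alpha` (Dirichlet hyperparameter) are assumptions not defined in this excerpt.

```python
import numpy as np
from scipy.special import digamma

def mean_field_lda(w_d, alpha, beta, n_iters=50):
    """Coordinate-ascent mean-field updates for one document (sketch).

    w_d   : (N_d,)  word indices of document d  [assumed inputs]
    alpha : scalar or (K,) Dirichlet hyperparameter
    beta  : (K, V)  topic-word probabilities
    """
    N_d, K = len(w_d), beta.shape[0]
    gamma_d = alpha + np.full(K, N_d / K)   # common initialization
    phi_d = np.full((N_d, K), 1.0 / K)
    for _ in range(n_iters):
        # phi_dnk is proportional to beta[k, w_dn] * exp(Psi(gamma_dk))
        log_phi = np.log(beta[:, w_d]).T + digamma(gamma_d)
        phi_d = np.exp(log_phi - log_phi.max(axis=1, keepdims=True))
        phi_d /= phi_d.sum(axis=1, keepdims=True)
        # gamma_dk = alpha_k + sum_n phi_dnk
        gamma_d = alpha + phi_d.sum(axis=0)
    return gamma_d, phi_d
```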
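The identity in Eq. (5) is also easy to sanity-check numerically before proving it. A quick Monte Carlo sketch, with arbitrary example parameter values:

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)
gamma_d = np.array([0.7, 2.0, 3.5])          # arbitrary example parameters

# Monte Carlo estimate of E[log theta_dk] under Dirichlet(gamma_d)
samples = rng.dirichlet(gamma_d, size=200_000)
mc_estimate = np.log(samples).mean(axis=0)

# Closed form from Eq. (5): Psi(gamma_dk) - Psi(sum_k' gamma_dk')
closed_form = digamma(gamma_d) - digamma(gamma_d.sum())

print(mc_estimate)   # agrees with closed_form up to Monte Carlo error
print(closed_form)
```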
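For problem 3, Eqs. (6) and (7) amount to "draw \(\eta_d\) from a Gaussian, then apply a softmax". Below is a minimal sketch of sampling topic proportions from the logistic normal prior; the example \(\mu\) and \(\Sigma\) values (topics 1 and 2 positively correlated) are assumptions for illustration only.

```python
import numpy as np

def sample_theta_ctm(mu, Sigma, rng):
    """Draw topic proportions theta_d from the logistic normal prior."""
    eta = rng.multivariate_normal(mu, Sigma)   # eta_d ~ N(mu, Sigma), Eq. (6)
    # Logistic (softmax) transformation, Eq. (7); shift by max for stability
    e = np.exp(eta - eta.max())
    return e / e.sum()

# Example with K = 3: topics 1 and 2 positively correlated via Sigma
rng = np.random.default_rng(0)
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta_d = sample_theta_ctm(mu, Sigma, rng)   # lies on the simplex
```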