CS229 Lecture notes
Andrew Ng

Mixtures of Gaussians and the EM algorithm

In this set of notes, we discuss the EM (Expectation-Maximization) algorithm for density estimation.

Suppose that we are given a training set $\{x^{(1)}, \ldots, x^{(m)}\}$ as usual. Since we are in the unsupervised learning setting, these points do not come with any labels.

We wish to model the data by specifying a joint distribution $p(x^{(i)}, z^{(i)}) = p(x^{(i)} \mid z^{(i)})\, p(z^{(i)})$. Here, $z^{(i)} \sim \mathrm{Multinomial}(\phi)$ (where $\phi_j \geq 0$, $\sum_{j=1}^{k} \phi_j = 1$, and the parameter $\phi_j$ gives $p(z^{(i)} = j)$), and $x^{(i)} \mid z^{(i)} = j \sim \mathcal{N}(\mu_j, \Sigma_j)$. We let $k$ denote the number of values that the $z^{(i)}$'s can take on. Thus, our model posits that each $x^{(i)}$ was generated by randomly choosing $z^{(i)}$ from $\{1, \ldots, k\}$, and then $x^{(i)}$ was drawn from one of $k$ Gaussians depending on $z^{(i)}$. This is called the mixture of Gaussians model. Also, note that the $z^{(i)}$'s are latent random variables, meaning that they're hidden/unobserved. This is what will make our estimation problem difficult.

The parameters of our model are thus $\phi$, $\mu$ and $\Sigma$. To estimate them, we can write down the likelihood of our data:

$$
\ell(\phi, \mu, \Sigma) = \sum_{i=1}^{m} \log p(x^{(i)}; \phi, \mu, \Sigma)
= \sum_{i=1}^{m} \log \sum_{z^{(i)}=1}^{k} p(x^{(i)} \mid z^{(i)}; \mu, \Sigma)\, p(z^{(i)}; \phi).
$$
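To make the generative story and the log-likelihood concrete, here is a minimal sketch in Python, assuming NumPy and SciPy are available. The function names (sample_gmm, log_likelihood) and the toy parameters are illustrative choices, not part of the notes; the code simply draws $z^{(i)}$ from a multinomial, draws $x^{(i)}$ from the corresponding Gaussian, and evaluates $\ell(\phi, \mu, \Sigma)$ as defined above.

```python
# Sketch: sample from a mixture of Gaussians and evaluate its log-likelihood.
import numpy as np
from scipy.stats import multivariate_normal

def sample_gmm(m, phi, mu, Sigma, rng=None):
    """Draw m points: z ~ Multinomial(phi), then x | z = j ~ N(mu_j, Sigma_j)."""
    rng = rng or np.random.default_rng(0)
    k = len(phi)
    z = rng.choice(k, size=m, p=phi)  # latent component assignments
    x = np.array([rng.multivariate_normal(mu[j], Sigma[j]) for j in z])
    return x, z

def log_likelihood(x, phi, mu, Sigma):
    """ell(phi, mu, Sigma) = sum_i log sum_j phi_j * N(x_i; mu_j, Sigma_j)."""
    # densities[i, j] = phi_j * N(x_i; mu_j, Sigma_j)
    densities = np.column_stack([
        phi[j] * multivariate_normal.pdf(x, mean=mu[j], cov=Sigma[j])
        for j in range(len(phi))
    ])
    return np.sum(np.log(densities.sum(axis=1)))

# Toy 2-component example in two dimensions (illustrative parameters).
phi = np.array([0.6, 0.4])
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
Sigma = [np.eye(2), 0.5 * np.eye(2)]
x, z = sample_gmm(500, phi, mu, Sigma)
print(log_likelihood(x, phi, mu, Sigma))
```

Note that the log-likelihood sums over all $k$ possible values of the unobserved $z^{(i)}$ for each point; this marginalization inside the logarithm is what makes direct maximization hard and motivates the EM algorithm developed in the rest of the notes.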
