Clustering III
Lecture outline
- Soft (model-based) clustering and the EM algorithm
- Clustering aggregation [A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2004]
- Impossibility theorem for clustering [Jon Kleinberg: An impossibility theorem for clustering, NIPS 2002]
Expectation-maximization algorithm
An iterative procedure to compute the Maximum Likelihood (ML) estimate, even in the presence of missing or hidden data. EM alternates between two steps:
- Expectation step: the (missing) data are estimated given the observed data and the current estimates of the model parameters.
- Maximization step: the likelihood function is maximized under the assumption that the (missing) data are known.
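In symbols, one iteration of generic EM can be written as follows (a standard formulation, not spelled out on the slide; Z denotes the hidden data and \Theta^{(t)} the current parameter estimate):

% Generic EM iteration: X = observed data, Z = hidden/missing data
\begin{align*}
\text{E-step:}\quad & Q(\Theta \mid \Theta^{(t)}) = \mathbb{E}_{Z \mid X, \Theta^{(t)}}\big[\ln p(X, Z \mid \Theta)\big] \\
\text{M-step:}\quad & \Theta^{(t+1)} = \arg\max_{\Theta}\, Q(\Theta \mid \Theta^{(t)})
\end{align*}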
EM algorithm for mixture of Gaussians
What is a mixture of K Gaussians?

p(x) = \sum_{k=1}^{K} \pi_k F(x \mid \Theta_k), \qquad \text{with } \sum_{k=1}^{K} \pi_k = 1,

where F(x \mid \Theta) is the Gaussian distribution with parameters \Theta = \{\mu, \Sigma\}.
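To make the definition concrete, here is a minimal Python sketch (not from the lecture) that evaluates such a mixture density; the parameter values are made up for the example, and scipy is assumed to be available:

import numpy as np
from scipy.stats import multivariate_normal

# Illustrative parameters: K = 2 Gaussians in two dimensions.
pis = np.array([0.6, 0.4])                     # mixing coefficients pi_k, sum to 1
mus = [np.zeros(2), np.array([3.0, 3.0])]      # means mu_k
sigmas = [np.eye(2), 2.0 * np.eye(2)]          # covariance matrices Sigma_k

def mixture_density(x, pis, mus, sigmas):
    # p(x) = sum_k pi_k * F(x | mu_k, Sigma_k)
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=sigma)
               for pi, mu, sigma in zip(pis, mus, sigmas))

print(mixture_density(np.array([1.0, 1.0]), pis, mus, sigmas))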
EM algorithm for mixture of Gaussians
If all points x \in X are drawn from a mixture of K Gaussians, then

p(X) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k).

Goal: find \pi_1, \ldots, \pi_K and \Theta_1, \ldots, \Theta_K such that p(X) is maximized, or equivalently such that \ln p(X) is maximized:

L(\Theta) = \ln p(X) = \sum_{i=1}^{n} \ln \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k).
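A direct transcription of L(\Theta) in Python (again a sketch, not the lecture's code): working in log space with scipy's logsumexp avoids the underflow that multiplying n small densities would cause.

import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def log_likelihood(X, pis, mus, sigmas):
    # L(Theta) = sum_i ln sum_k pi_k F(x_i | Theta_k)
    # log_dens[i, k] = ln pi_k + ln F(x_i | mu_k, Sigma_k)
    log_dens = np.column_stack([
        np.log(pi) + multivariate_normal.logpdf(X, mean=mu, cov=sigma)
        for pi, mu, sigma in zip(pis, mus, sigmas)])
    return logsumexp(log_dens, axis=1).sum()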
Mixtures of Gaussians -- notes
Every point x_i is probabilistically assigned to (generated by) the k-th Gaussian. The probability that point x_i is generated by the k-th Gaussian is

w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}.
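A sketch of this computation in Python (the function name responsibilities is my choice; these w_{ik} are exactly the E-step weights used below):

import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pis, mus, sigmas):
    # dens[i, k] = pi_k * F(x_i | Theta_k)
    dens = np.column_stack([
        pi * multivariate_normal.pdf(X, mean=mu, cov=sigma)
        for pi, mu, sigma in zip(pis, mus, sigmas)])
    # w[i, k] = dens[i, k] / sum_j dens[i, j]  (each row sums to 1)
    return dens / dens.sum(axis=1, keepdims=True)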
Mixtures of Gaussians -- notes
Every Gaussian (cluster) C_k has an effective number of points assigned to it,

N_k = \sum_{i=1}^{n} w_{ik},

with mean

\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i

and covariance

\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k)(x_i - \mu_k)^T.
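These three statistics take only a few lines of numpy, given the responsibility matrix w from the previous sketch (again illustrative, not the lecture's code):

import numpy as np

def weighted_statistics(X, w):
    # N_k = sum_i w_ik  -- effective number of points in cluster k
    N = w.sum(axis=0)
    # mu_k = (1 / N_k) * sum_i w_ik x_i
    mus = (w.T @ X) / N[:, None]
    sigmas = []
    for k in range(w.shape[1]):
        diff = X - mus[k]                  # rows are x_i - mu_k
        # Sigma_k = (1 / N_k) * sum_i w_ik (x_i - mu_k)(x_i - mu_k)^T
        sigmas.append((w[:, k, None] * diff).T @ diff / N[k])
    return N, mus, sigmas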
EM for Gaussian Mixtures
Initialize the means \mu_k, covariances \Sigma_k (so \Theta_k = (\mu_k, \Sigma_k)) and mixing coefficients \pi_k, and evaluate the initial value of the log-likelihood.
Expectation step: evaluate the weights

w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}.
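The slide does not fix an initialization scheme; one common choice, shown here purely as an assumption, is to use K random data points as means, identity covariances, and uniform mixing coefficients:

import numpy as np

def initialize(X, K, seed=0):
    # One common initialization (an assumption, not specified on the slide):
    # K distinct data points as means, identity covariances, uniform weights.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mus = X[rng.choice(n, size=K, replace=False)]
    sigmas = [np.eye(d) for _ in range(K)]
    pis = np.full(K, 1.0 / K)
    return pis, mus, sigmas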
EM for Gaussian Mixtures
Maximization step: re-estimate the parameters using the current weights:

\mu_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i

\Sigma_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k^{new})(x_i - \mu_k^{new})^T

\pi_k^{new} = \frac{N_k}{N},

where N is the total number of points. Then re-evaluate the log-likelihood and repeat the E and M steps until it converges.
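Putting the pieces together, a minimal self-contained sketch of the whole loop (the function name em_gmm, the tolerance, and the random initialization are illustrative choices, not the lecture's code):

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, max_iter=100, tol=1e-6, seed=0):
    # EM for a mixture of K Gaussians, following the update rules above.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mus = X[rng.choice(n, size=K, replace=False)]   # initial means: random points
    sigmas = np.stack([np.eye(d)] * K)              # initial covariances: identity
    pis = np.full(K, 1.0 / K)                       # initial weights: uniform
    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: w[i, k] = pi_k F(x_i | Theta_k) / sum_j pi_j F(x_i | Theta_j)
        dens = np.column_stack([
            pis[k] * multivariate_normal.pdf(X, mean=mus[k], cov=sigmas[k])
            for k in range(K)])
        w = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate N_k, mu_k, Sigma_k, pi_k
        N = w.sum(axis=0)
        mus = (w.T @ X) / N[:, None]
        for k in range(K):
            diff = X - mus[k]
            sigmas[k] = (w[:, k, None] * diff).T @ diff / N[k]
        pis = N / n
        # Convergence check on the log-likelihood
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pis, mus, sigmas, w

# Example: two well-separated synthetic clusters in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
pis, mus, sigmas, w = em_gmm(X, K=2)
print(pis)
print(mus)

Each EM iteration provably does not decrease L(\Theta), which is why monitoring the change in log-likelihood is a natural stopping rule.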