Clustering III
Lecture outline
- Soft (model-based) clustering and the EM algorithm
- Clustering aggregation [A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2005]
- Impossibility theorem for clustering [Jon Kleinberg: An impossibility theorem for clustering, NIPS 2002]
Expectation-maximization algorithm
- An iterative procedure for computing the Maximum Likelihood (ML) estimate, even in the presence of missing or hidden data
- EM consists of two steps:
  - Expectation step: the (missing) data are estimated given the observed data and the current estimates of the model parameters
  - Maximization step: the likelihood function is maximized under the assumption that the (missing) data are known
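To make the alternation concrete, here is a minimal sketch of the generic EM loop; e_step, m_step, and log_likelihood are hypothetical placeholder names (Gaussian-mixture versions are sketched after the later slides), not notation from the lecture.

```python
# Minimal sketch of the generic EM loop; e_step, m_step and log_likelihood
# are hypothetical placeholders defined elsewhere.
def em(X, params, max_iter=100, tol=1e-6):
    ll_old = float("-inf")
    for _ in range(max_iter):
        hidden = e_step(X, params)     # E: estimate hidden data from observed data + params
        params = m_step(X, hidden)     # M: maximize likelihood as if hidden data were known
        ll = log_likelihood(X, params)
        if ll - ll_old < tol:          # EM never decreases the likelihood, so this converges
            break
        ll_old = ll
    return params
```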
EM algorithm for mixture of Gaussians
- What is a mixture of K Gaussians?
  $p(x) = \sum_{k=1}^{K} \pi_k F(x \mid \Theta_k)$, with $\sum_{k=1}^{K} \pi_k = 1$
- $F(x \mid \Theta)$ is the Gaussian distribution with parameters $\Theta = \{\mu, \Sigma\}$
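As an illustration (not part of the slides), the mixture density can be evaluated directly from this definition. The sketch below assumes NumPy/SciPy and the parameter names pis, mus, Sigmas:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Sketch: p(x) = sum_k pi_k * F(x | mu_k, Sigma_k).
# pis has shape (K,) and sums to 1; mus is (K, d); Sigmas is (K, d, d).
def mixture_density(x, pis, mus, Sigmas):
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=Sigma)
               for pi, mu, Sigma in zip(pis, mus, Sigmas))
```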
EM algorithm for mixture of Gaussians
- If all points $x \in X$ are generated by a mixture of K Gaussians, then
  $p(X) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k)$
- Goal: find $\pi_1, \ldots, \pi_K$ and $\Theta_1, \ldots, \Theta_K$ such that $p(X)$ is maximized
- Equivalently, maximize $\ln p(X)$:
  $L(\Theta) = \ln p(X) = \sum_{i=1}^{n} \ln \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k)$
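A sketch of the log-likelihood, reusing mixture_density from above; in practice one would work with log-densities and the log-sum-exp trick for numerical stability:

```python
# Sketch: L(Theta) = sum_i ln sum_k pi_k F(x_i | Theta_k); X has shape (n, d).
def log_likelihood(X, pis, mus, Sigmas):
    return sum(np.log(mixture_density(x, pis, mus, Sigmas)) for x in X)
```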
Mixtures of Gaussians -- notes
- Every point $x_i$ is probabilistically assigned to (generated by) the k-th Gaussian
- The probability that point $x_i$ is generated by the k-th Gaussian is
  $w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}$
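A vectorized sketch of this computation, under the same assumed parameter layout as the earlier snippets:

```python
# Sketch: w[i, k] = probability that point x_i was generated by the k-th Gaussian.
def responsibilities(X, pis, mus, Sigmas):
    # densities[i, k] = pi_k * F(x_i | mu_k, Sigma_k)
    densities = np.column_stack([
        pi * multivariate_normal.pdf(X, mean=mu, cov=Sigma)
        for pi, mu, Sigma in zip(pis, mus, Sigmas)
    ])
    return densities / densities.sum(axis=1, keepdims=True)  # normalize over k
```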
Mixtures of Gaussians -- notes
- Every Gaussian (cluster) $C_k$ has an effective number of points assigned to it,
  $N_k = \sum_{i=1}^{n} w_{ik}$
- with mean
  $\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i$
- and covariance
  $\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k)(x_i - \mu_k)^T$
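These statistics can be read off the responsibility matrix; a sketch, assuming w is the (n, K) weight matrix produced by responsibilities above:

```python
# Sketch: effective count, weighted mean, and weighted covariance of cluster k.
def cluster_stats(X, w, k):
    N_k = w[:, k].sum()                            # effective number of points
    mu_k = (w[:, k, None] * X).sum(axis=0) / N_k   # weighted mean
    diff = X - mu_k
    # sum_i w_ik (x_i - mu_k)(x_i - mu_k)^T / N_k, as a single matrix product
    Sigma_k = (w[:, k, None] * diff).T @ diff / N_k
    return N_k, mu_k, Sigma_k
```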
EM for Gaussian Mixtures
- Initialize the means $\mu_k$, covariances $\Sigma_k$ (so that $\Theta_k = (\mu_k, \Sigma_k)$) and mixing coefficients $\pi_k$, and evaluate the initial value of the log-likelihood
- Expectation step: evaluate the weights
  $w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}$
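The slides do not prescribe an initialization scheme; one simple illustrative choice (random data points as means, identity covariances, uniform mixing weights) is sketched below. Seeding the means with k-means output is also common in practice.

```python
# Sketch of one simple initialization (assumed, not prescribed by the slides).
def init_params(X, K, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mus = X[rng.choice(n, size=K, replace=False)]   # K random data points as means
    Sigmas = np.array([np.eye(d) for _ in range(K)])
    pis = np.full(K, 1.0 / K)                       # uniform mixing weights
    return pis, mus, Sigmas
```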
EM for Gaussian Mixtures
- Maximization step: re-evaluate the parameters
  $\mu_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i$
  $\Sigma_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k^{new})(x_i - \mu_k^{new})^T$
  $\pi_k^{new} = \frac{N_k}{N}$
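Putting the pieces together, a sketch of the M-step updates and a bare-bones EM loop, reusing init_params and responsibilities from the earlier snippets (a real implementation would also track the log-likelihood for a convergence test, as on the previous slides):

```python
# Sketch of the M-step: recompute pi_k, mu_k, Sigma_k from the weights w.
def m_step(X, w):
    n, K = w.shape
    N = w.sum(axis=0)                  # N_k for each cluster
    mus = (w.T @ X) / N[:, None]       # new means
    Sigmas = np.array([
        ((w[:, k, None] * (X - mus[k])).T @ (X - mus[k])) / N[k]
        for k in range(K)
    ])
    pis = N / n                        # new mixing coefficients
    return pis, mus, Sigmas

def em_gmm(X, K, max_iter=100):
    pis, mus, Sigmas = init_params(X, K)
    for _ in range(max_iter):
        w = responsibilities(X, pis, mus, Sigmas)   # E-step
        pis, mus, Sigmas = m_step(X, w)             # M-step
    return pis, mus, Sigmas
```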
Lecture outline
- Soft (model-based) clustering and EM algorithm
- Clustering aggregation [A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2005]