# lect6 - Clustering III


## Lecture outline

- Soft (model-based) clustering and the EM algorithm
- Clustering aggregation [A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2004]
- Impossibility theorem for clustering [Jon Kleinberg, An impossibility theorem for clustering, NIPS 2002]
## Expectation-maximization algorithm

An iterative procedure to compute the Maximum Likelihood (ML) estimate, even in the presence of missing or hidden data. EM consists of two steps:

- Expectation step: the (missing) data are estimated given the observed data and the current estimates of the model parameters.
- Maximization step: the likelihood function is maximized under the assumption that the (missing) data are known.

## EM algorithm for mixture of Gaussians

What is a mixture of K Gaussians?

$$p(x) = \sum_{k=1}^{K} \pi_k F(x \mid \Theta_k), \qquad \sum_{k=1}^{K} \pi_k = 1$$

where $F(x \mid \Theta)$ is the Gaussian distribution with parameters $\Theta = \{\mu, \Sigma\}$.
## EM algorithm for mixture of Gaussians

If all points $x \in X$ are generated by a mixture of K Gaussians, then

$$P(X) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k)$$

Goal: find $\pi_1, \ldots, \pi_K$ and $\Theta_1, \ldots, \Theta_K$ such that $P(X)$ is maximized, or equivalently such that $\ln P(X)$ is maximized:

$$L(\Theta) = \sum_{i=1}^{n} \ln \sum_{k=1}^{K} \pi_k F(x_i \mid \Theta_k)$$
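The log-likelihood above can be evaluated directly. A minimal numpy sketch (an illustration, not the lecture's code; the function names are my own):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Multivariate Gaussian density F(x | mu, cov)."""
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm

def log_likelihood(X, pis, mus, covs):
    """ln P(X) = sum_i ln sum_k pi_k F(x_i | Theta_k)."""
    total = 0.0
    for x in X:
        mix = sum(pi * gaussian_pdf(x, mu, cov)
                  for pi, mu, cov in zip(pis, mus, covs))
        total += np.log(mix)
    return total
```

Note the sum over components sits *inside* the logarithm, which is why the maximization has no closed form and EM iterates instead.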

## Mixtures of Gaussians -- notes

Every point $x_i$ is probabilistically assigned to (generated by) the k-th Gaussian. The probability that point $x_i$ is generated by the k-th Gaussian is

$$w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}$$
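These membership weights can be computed for all points at once. A small numpy sketch (my own illustration; `responsibilities` is a hypothetical name):

```python
import numpy as np

def responsibilities(X, pis, mus, covs):
    """w_ik = pi_k F(x_i | Theta_k) / sum_j pi_j F(x_i | Theta_j)."""
    n, d = X.shape
    K = len(pis)
    W = np.empty((n, K))
    for k in range(K):
        diff = X - mus[k]                      # (n, d)
        inv = np.linalg.inv(covs[k])
        norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(covs[k]))
        # row-wise Mahalanobis distance, then unnormalized weight
        W[:, k] = pis[k] * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1)) / norm
    return W / W.sum(axis=1, keepdims=True)    # each row sums to 1
```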
## Mixtures of Gaussians -- notes

Every Gaussian (cluster) $C_k$ has an effective number of points assigned to it,

$$N_k = \sum_{i=1}^{n} w_{ik}$$

with mean

$$\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i$$

and covariance

$$\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k)(x_i - \mu_k)^T$$
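Given the weight matrix, these three quantities fall out of a few matrix products. A numpy sketch (an illustration with hypothetical names, not part of the lecture):

```python
import numpy as np

def m_step_stats(X, W):
    """From points X (n x d) and weights W (n x K), compute
    N_k, mu_k, Sigma_k for each Gaussian k."""
    n, d = X.shape
    K = W.shape[1]
    N = W.sum(axis=0)                    # N_k = sum_i w_ik
    mus = (W.T @ X) / N[:, None]         # mu_k = (1/N_k) sum_i w_ik x_i
    covs = np.empty((K, d, d))
    for k in range(K):
        diff = X - mus[k]
        # Sigma_k = (1/N_k) sum_i w_ik (x_i - mu_k)(x_i - mu_k)^T
        covs[k] = (W[:, k, None] * diff).T @ diff / N[k]
    return N, mus, covs
```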

## EM for Gaussian Mixtures

Initialize the means $\mu_k$, covariances $\Sigma_k$ (with $\Theta_k = (\mu_k, \Sigma_k)$) and mixing coefficients $\pi_k$, and evaluate the initial value of the log-likelihood.

Expectation step: evaluate the weights

$$w_{ik} = \frac{\pi_k F(x_i \mid \Theta_k)}{\sum_{j=1}^{K} \pi_j F(x_i \mid \Theta_j)}$$
## EM for Gaussian Mixtures

Maximization step: re-evaluate the parameters

$$\mu_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} x_i$$

$$\Sigma_k^{new} = \frac{1}{N_k} \sum_{i=1}^{n} w_{ik} (x_i - \mu_k^{new})(x_i - \mu_k^{new})^T$$

$$\pi_k^{new} = \frac{N_k}{n}$$
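Putting the two steps together, a minimal end-to-end numpy sketch of the EM loop (my own illustration, not the lecture's code; the evenly spaced initialization of the means and the small covariance regularizer are arbitrary choices):

```python
import numpy as np

def em_gmm(X, K, n_iter=50):
    """EM for a mixture of K Gaussians: alternate the E and M updates."""
    n, d = X.shape
    # init: K evenly spaced data points as means, identity covariances,
    # uniform mixing coefficients (a simple heuristic, not the only choice)
    mus = X[np.linspace(0, n - 1, K).astype(int)].astype(float)
    covs = np.array([np.eye(d) for _ in range(K)])
    pis = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: w_ik proportional to pi_k F(x_i | mu_k, Sigma_k)
        W = np.empty((n, K))
        for k in range(K):
            diff = X - mus[k]
            inv = np.linalg.inv(covs[k])
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(covs[k]))
            W[:, k] = pis[k] * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1)) / norm
        W /= W.sum(axis=1, keepdims=True)
        # M-step: re-evaluate N_k, mu_k, Sigma_k, pi_k
        N = W.sum(axis=0)
        mus = (W.T @ X) / N[:, None]
        for k in range(K):
            diff = X - mus[k]
            covs[k] = (W[:, k, None] * diff).T @ diff / N[k] + 1e-6 * np.eye(d)
        pis = N / n
    return pis, mus, covs
```

On two well-separated 1-D clusters this recovers the cluster means and equal mixing weights; a production implementation would also monitor the log-likelihood for convergence and compute the E-step in log space for numerical stability.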


## This document was uploaded on 10/05/2010.
