chap9_advanced_cluster_analysis_sh

Data Mining
Cluster Analysis: Advanced Concepts and Algorithms
Lecture Notes for Chapter 9
Introduction to Data Mining by Tan, Steinbach, Kumar
Edited for STATS202, Stanford University, Fall 2010
© Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004

Fuzzy clustering
- Each point $x_i$ belongs to the $j$-th cluster with weight $w_{ij}$.
- Minimizes the weighted SSE below, subject to the constraint that each point's weights sum to 1 across clusters: $\sum_j w_{ij} = 1$.
- Advantages: uses all points for each cluster.
- Disadvantages: same as K-means.

$$\mathrm{SSE}(C) = \sum_{j} \sum_{i} w_{ij}^{p} \, \mathrm{dist}(x_i, c_j)^2$$
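
The weighted SSE above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the helper names `sq_dist`, `fuzzy_sse` and `update_weights` are hypothetical, Euclidean distance is assumed, and the weight update shown is the usual fuzzy c-means rule, which the slide itself does not state.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def fuzzy_sse(points, centroids, weights, p=2.0):
    """SSE(C) = sum_j sum_i w_ij^p * dist(x_i, c_j)^2."""
    return sum(
        weights[i][j] ** p * sq_dist(x, c)
        for i, x in enumerate(points)
        for j, c in enumerate(centroids)
    )


def update_weights(points, centroids, p=2.0, eps=1e-12):
    """Assumed fuzzy c-means update (not on the slide):
    w_ij proportional to (1 / dist(x_i, c_j)^2)^(1 / (p - 1)),
    normalised so that each point's weights sum to 1."""
    weights = []
    for x in points:
        # eps avoids division by zero when a point sits on a centroid.
        inv = [(1.0 / (sq_dist(x, c) + eps)) ** (1.0 / (p - 1.0)) for c in centroids]
        total = sum(inv)
        weights.append([v / total for v in inv])
    return weights


# Toy data: two well-separated groups on a line (illustrative values only).
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
centroids = [(0.5, 0.0), (9.5, 0.0)]
weights = update_weights(points, centroids)
sse = fuzzy_sse(points, centroids, weights)
```

Every point contributes to every cluster here, but points far from a centroid receive weights close to zero, so the result stays close to a hard K-means assignment when clusters are well separated.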

Mixture modelling
- Consider each observation as coming from a mixture of distributions.
- Common example: components are Gaussian.
- Algorithm to fit model: EM algorithm.

$$f(x) = \sum_{j} \pi_j \, f(x; \theta_j)$$
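
To make the mixture density concrete, here is a small sketch that evaluates it for one-dimensional Gaussian components. The function names `gaussian_pdf` and `mixture_pdf`, and the parameter values, are assumptions chosen for illustration; each $\theta_j$ is taken to be a (mean, standard deviation) pair.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, pis, params):
    """f(x) = sum_j pi_j * f(x; theta_j), with theta_j = (mean_j, std_j)."""
    return sum(pi * gaussian_pdf(x, mean, std) for pi, (mean, std) in zip(pis, params))


# Illustrative two-component mixture; the mixing proportions must sum to 1.
pis = [0.3, 0.7]
params = [(0.0, 1.0), (5.0, 2.0)]
density_at_zero = mixture_pdf(0.0, pis, params)
```

Because the mixing proportions sum to 1 and each component is a density, the mixture integrates to 1 as well.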

Mixture modelling [figure]

EM algorithm for mixtures
- Alternating algorithm to maximize the likelihood.
- E-step: compute responsibilities / weights $w_{ij}$, proportional to the “probability” (density) that point $i$ belongs to cluster $j$.
- M-step: estimate parameters $\theta_j$.
- Repeat.

$$L(\theta_1, \ldots, \theta_K, \pi_1, \ldots, \pi_K; x) = \prod_{i} \sum_{j} \pi_j \, f(x_i; \theta_j)$$
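
The alternation above can be sketched for a one-dimensional Gaussian mixture as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the names `em_step` and `log_likelihood`, the toy data, and the starting values are all hypothetical, and the M-step also re-estimates the mixing proportions $\pi_j$, which the slide does not spell out. The code tracks the log of the likelihood, which has the same maximizer.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_step(xs, pis, means, stds):
    """One E-step followed by one M-step for a 1-D Gaussian mixture."""
    k = len(pis)
    # E-step: responsibilities w_ij proportional to pi_j * f(x_i; theta_j).
    resp = []
    for x in xs:
        dens = [pis[j] * gaussian_pdf(x, means[j], stds[j]) for j in range(k)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: weighted estimates of each component's parameters.
    new_pis, new_means, new_stds = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, xs)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nj
        new_pis.append(nj / len(xs))
        new_means.append(mean)
        # Small floor keeps the standard deviation away from zero.
        new_stds.append(max(math.sqrt(var), 1e-6))
    return new_pis, new_means, new_stds


def log_likelihood(xs, pis, means, stds):
    """log L = sum_i log( sum_j pi_j * f(x_i; theta_j) )."""
    return sum(
        math.log(sum(p * gaussian_pdf(x, m, s) for p, m, s in zip(pis, means, stds)))
        for x in xs
    )


# Toy data with two obvious groups; the starting values are deliberately rough.
xs = [-0.2, 0.1, 0.3, -0.1, 4.8, 5.1, 5.3, 4.9, 5.0]
pis, means, stds = [0.5, 0.5], [1.0, 4.0], [1.0, 1.0]
history = [log_likelihood(xs, pis, means, stds)]
for _ in range(20):
    pis, means, stds = em_step(xs, pis, means, stds)
    history.append(log_likelihood(xs, pis, means, stds))
```

Each pass should leave the log-likelihood no lower than before, which makes `history` a convenient convergence check. EM can still stop at a local maximum, so the result depends on the starting values.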

Responsibilities / Mahalanobis
[Figure with labels A, B, C]
Weights are related to the Mahalanobis distance from each point to the center of each cluster, balanced against the size of the cluster.
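
For Gaussian components, the statement above can be written out explicitly. The following is a sketch under that assumption; the symbols $\mu_j$, $\Sigma_j$, $d$ (the dimension) and $D_j$ are introduced here for illustration and do not appear on the slide. Note that $\pi$ inside $(2\pi)^{-d/2}$ is the constant, not a mixing proportion.

```latex
\[
w_{ij} = \frac{\pi_j \, f(x_i; \theta_j)}{\sum_{l} \pi_l \, f(x_i; \theta_l)},
\qquad
f(x_i; \theta_j) = (2\pi)^{-d/2} \, |\Sigma_j|^{-1/2}
  \exp\!\left(-\tfrac{1}{2} D_j(x_i)^2\right),
\]
\[
D_j(x_i)^2 = (x_i - \mu_j)^{\top} \Sigma_j^{-1} (x_i - \mu_j).
\]
```

A smaller Mahalanobis distance $D_j(x_i)$ raises the weight, while the factors $\pi_j$ and $|\Sigma_j|^{-1/2}$ adjust it for how large and how spread out cluster $j$ is, which is one reading of "balanced against the size of the cluster".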