clustering2

Slide 1: Clustering Algorithms

- Hierarchical Clustering
- k-Means Algorithms
- CURE Algorithm

Slide 2: Methods of Clustering

- Hierarchical (Agglomerative):
  - Initially, each point in cluster by itself.
  - Repeatedly combine the two “nearest” clusters into one.
- Point Assignment:
  - Maintain a set of clusters.
  - Place points into their “nearest” cluster.

Slide 3: Hierarchical Clustering

- Two important questions:
  1. How do you determine the “nearness” of clusters?
  2. How do you represent a cluster of more than one point?

Slide 4: Hierarchical Clustering (2)

- Key problem: as you build clusters, how do you represent the location of each cluster, to tell which pair of clusters is closest?
- Euclidean case: each cluster has a centroid = average of its points.
  - Measure intercluster distances by distances of centroids.

Slide 5: Example

[Figure: data points (o) at (5,3), (1,2), (2,1), (4,1), (0,0), (5,0); centroids (x) at (1.5,1.5), (4.5,0.5), (1,1), (4.7,1.3).]

Slide 6: And in the Non-Euclidean Case?

- The only “locations” we can talk about are the points themselves.
  - I.e., there is no “average” of two points.
- Approach 1: clustroid = point “closest” to other points.
  - Treat clustroid as if it were centroid, when computing intercluster distances.

Slide 7: “Closest” Point?

- Possible meanings:
  1. Smallest maximum distance to the other points.
  2. Smallest average distance to other points.
  3. Smallest sum of squares of distances to other points.
  4. Etc., etc.

Slide 8: Example

[Figure: labels 1, 2, 3, 4, 5, 6; two clustroids and the intercluster distance are marked.]

Slide 9: Other Approaches to Defining “Nearness” of Clusters

- Approach 2: intercluster distance = minimum of the distances between any two points, one from each cluster.
- Approach 3: Pick a notion of “cohesion” of clusters, e.g., maximum distance from the clustroid.
  - Merge clusters whose union is most cohesive.

Slide 10: Return to Euclidean Case

- Approaches 2 and 3 are also used sometimes in Euclidean clustering....
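
The centroid-based merging described for the Euclidean case, and the "smallest sum of squares" meaning of a clustroid, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the slides' own code: the function names `centroid`, `agglomerate`, and `clustroid`, the stopping rule (stop when `k` clusters remain), and the tie-breaking (first closest pair found) are all hypothetical choices made for this example. The input is the set of six data points from the Euclidean example.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def agglomerate(points, k):
    """Merge the two clusters with the nearest centroids until k remain.

    Assumption: stopping at k clusters is one possible stopping rule.
    """
    clusters = [[p] for p in points]  # each point starts in a cluster by itself
    merges = []  # centroid of every merged cluster, in merge order
    while len(clusters) > k:
        # Pick the pair of clusters whose centroids are closest
        # (on ties, min() keeps the first pair it encounters).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]),
                                centroid(clusters[ij[1]])),
        )
        merged = clusters[i] + clusters[j]
        merges.append(centroid(merged))
        # Delete index j first (j > i) so that index i stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters, merges


def clustroid(points, distance=dist):
    """Point with the smallest sum of squared distances to the other points.

    Any distance function can be passed in; Euclidean distance is only
    the default used for this demonstration.
    """
    return min(points, key=lambda p: sum(distance(p, q) ** 2 for q in points))


# The six data points from the Euclidean example.
points = [(5, 3), (1, 2), (2, 1), (4, 1), (0, 0), (5, 0)]
clusters, merges = agglomerate(points, k=2)
for c in merges:
    print(tuple(round(x, 1) for x in c))
```

With this tie-breaking, the centroids of the successive merges, rounded to one decimal place, come out as the four centroids listed in the example: (1.5, 1.5), (4.5, 0.5), (1.0, 1.0), and (4.7, 1.3). The `clustroid` function takes the distance as a parameter because, in the non-Euclidean case, only pairwise distances are available and no average of points exists.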

This document was uploaded on 03/04/2012.

