**Clustering Algorithms**

- Hierarchical Clustering
- k-Means Algorithms
- CURE Algorithm

**Methods of Clustering**

- Hierarchical (Agglomerative): Initially, each point is in a cluster by itself. Repeatedly combine the two nearest clusters into one.
- Point Assignment: Maintain a set of clusters. Place points into their nearest cluster.

**Hierarchical Clustering**

Two important questions:

1. How do you determine the nearness of clusters?
2. How do you represent a cluster of more than one point?

**Hierarchical Clustering (2)**

- Key problem: as you build clusters, how do you represent the location of each cluster, to tell which pair of clusters is closest?
- Euclidean case: each cluster has a centroid = average of its points.
- Measure intercluster distances by distances of centroids.

**Example**

[Figure: data points (o) at (5,3), (1,2), (2,1), (4,1), (0,0), (5,0); centroids (x) at (1.5,1.5), (4.5,0.5), (1,1), (4.7,1.3).]

**And in the Non-Euclidean Case?**

- The only locations we can talk about are the points themselves, i.e., there is no average of two points.
- Approach 1: clustroid = point closest to other points.
- Treat the clustroid as if it were the centroid when computing intercluster distances.

**Closest Point?**

Possible meanings:

1. Smallest maximum distance to the other points.
2. Smallest average distance to other points.
3. Smallest sum of squares of distances to other points.
4. Etc., etc.

**Example**

[Figure: points labeled 1 to 6, with two clustroids marked and the intercluster distance drawn between them.]

**Other Approaches to Defining Nearness of Clusters**

- Approach 2: intercluster distance = minimum of the distances between any two points, one from each cluster.
- Approach 3: Pick a notion of cohesion of clusters, e.g., maximum distance from the clustroid. Merge clusters whose union is most cohesive.

**Return to Euclidean Case**

Approaches 2 and 3 are also used sometimes in Euclidean clustering....
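
The centroid-based merging described for the Euclidean case can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the function names `centroid` and `agglomerate`, the stopping rule (stop at `k` clusters), and the tie-breaking (the first closest pair found wins) are all assumptions made for the example. Run on the six data points from the first Example, it appears to reproduce the four centroids marked there.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Average of a list of points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def agglomerate(points, k):
    """Merge the two clusters with the nearest centroids until k remain.

    Returns the final clusters and the centroid of each merged cluster,
    in merge order. The stopping rule and tie-breaking are assumptions.
    """
    clusters = [[p] for p in points]  # each point starts in its own cluster
    merged_centroids = []
    while len(clusters) > k:
        # Pick the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]),
                                centroid(clusters[ij[1]])),
        )
        merged = clusters[i] + clusters[j]
        merged_centroids.append(centroid(merged))
        # Replace the two old clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merged_centroids


points = [(0, 0), (1, 2), (2, 1), (4, 1), (5, 0), (5, 3)]
clusters, merged_centroids = agglomerate(points, k=2)
for c in merged_centroids:
    print(tuple(round(x, 1) for x in c))
```

Recomputing every centroid on each pass keeps the sketch short but is wasteful for large inputs; a practical version would cache centroids and update them incrementally on each merge.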
