HierarCluster-L16



CSE 572: Data Mining
Lecture 16: Hierarchical Clustering

Hierarchical Clustering
● Produces a set of nested clusters organized as a hierarchical tree
● Can be visualized as a dendrogram
  – A tree-like diagram that records the sequence of merges or splits

[Figure: dendrogram over six points (leaf order 1, 3, 2, 5, 4, 6; merge heights from 0.05 to 0.2)]

Strengths of Hierarchical Clustering
● No particular number of clusters has to be assumed
  – Any desired number of clusters can be obtained by ‘cutting’ the dendrogram at the proper level
● The clusters may correspond to meaningful taxonomies
  – Examples in the biological sciences (e.g., animal kingdom, phylogeny reconstruction, …)

Hierarchical Clustering
● Two main types of hierarchical clustering
  – Agglomerative:
      Start with the points as individual clusters
      At each step, merge the closest pair of clusters until only one cluster (or k clusters) remains
  – Divisive:
      Start with one, all-inclusive cluster
      At each step, split a cluster until each cluster contains a single point (or there are k clusters)
● Traditional hierarchical algorithms use a similarity or distance matrix
  – Merge or split one cluster at a time

MST: Divisive Hierarchical Clustering
● Build MST (Minimum Spanning Tree)
  – Start with a tree that consists of any point
  – In successive steps, look for the closest pair of points (p, q) such that one point (p) is in the current tree but the other (q) is not
  – Add q to the tree and put an edge between p and q
● Use MST for constructing hierarchy of clusters

Agglomerative Clustering Algorithm
● More popular hierarchical clustering technique
● Basic algorithm is straightforward
  1. Compute the proximity matrix
  2. Let each data point be a cluster
  3. Repeat
       – Merge the two closest clusters
       – Update the proximity matrix
  4. Until only a single cluster remains
● ...
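The MST construction described above can be sketched in a few lines of Python. The slides do not spell out how the hierarchy is derived from the tree, so this sketch assumes one common choice: repeatedly break the longest remaining MST edge, which is equivalent to dropping the k-1 longest edges to obtain k clusters. The function names `build_mst` and `split_into_clusters`, the Euclidean distance, and the toy points are illustrative assumptions, not taken from the lecture.

```python
import math

def build_mst(points):
    """Grow a minimum spanning tree one point at a time, as on the slide."""
    in_tree = {0}   # start with a tree that consists of any point (here: point 0)
    edges = []      # (distance, p, q) for every edge added to the tree
    while len(in_tree) < len(points):
        # Closest pair (p, q) with p in the current tree and q outside it.
        d, p, q = min(
            (math.dist(points[p], points[q]), p, q)
            for p in in_tree
            for q in range(len(points))
            if q not in in_tree
        )
        in_tree.add(q)            # add q to the tree ...
        edges.append((d, p, q))   # ... and put an edge between p and q
    return edges

def split_into_clusters(n_points, edges, k):
    """Assumed divisive step: drop the k-1 longest MST edges and
    return the connected components that remain."""
    kept = sorted(edges)[: len(edges) - (k - 1)]
    parent = list(range(n_points))   # union-find over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, p, q in kept:
        parent[find(p)] = find(q)
    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy data (hypothetical): two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
edges = build_mst(points)
clusters = split_into_clusters(len(points), edges, k=3)
print(clusters)  # [[0, 1], [2, 3], [4]]
```

Calling `split_into_clusters` with k = 1, 2, ..., n on the same edge list yields the full divisive hierarchy, from one all-inclusive cluster down to singletons.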
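The basic agglomerative algorithm listed above can be illustrated with a minimal Python sketch. The visible part of the lecture does not say how the proximity between two merged clusters is defined, so this example assumes single link (the minimum distance between any two members) and Euclidean distance. The function name `agglomerative`, the parameter `k`, and the toy points are hypothetical choices made for illustration.

```python
import math

def agglomerative(points, k=1):
    """Merge the two closest clusters until only k clusters remain.
    Cluster proximity is single link (an assumption of this sketch)."""
    n = len(points)
    # Step 2: let each data point be a cluster (cluster id -> member indices).
    clusters = {i: [i] for i in range(n)}
    # Step 1: compute the proximity matrix, keyed by (smaller id, larger id).
    prox = {
        (i, j): math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    }
    merges = []   # (cluster a, cluster b, distance): the dendrogram record
    next_id = n   # id given to the next merged cluster
    # Steps 3-4: repeat until only k clusters remain.
    while len(clusters) > k:
        # Merge the two closest clusters.
        (a, b), d = min(prox.items(), key=lambda item: item[1])
        merges.append((a, b, d))
        # Update the proximity matrix: distance from every other cluster
        # to the merged one is the smaller of its distances to a and b.
        updated = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_a = prox[(min(a, c), max(a, c))]
            d_b = prox[(min(b, c), max(b, c))]
            updated[(c, next_id)] = min(d_a, d_b)
        prox = {pair: v for pair, v in prox.items()
                if a not in pair and b not in pair}
        prox.update(updated)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return sorted(sorted(members) for members in clusters.values()), merges

# Toy data (hypothetical): two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result, merges = agglomerative(points, k=2)
print(result)  # [[0, 1, 2, 3], [4]]
```

The `merges` list records the order and height of each merge, which is the information a dendrogram displays. Running with the default `k=1` continues until a single cluster remains, matching the stopping rule in the algorithm above.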