Clustering_2_march_2011

Clustering
T W S Chow, Jan 2010
We must acknowledge Carnegie Mellon University, as some of these slides are modified from the CMU AI Clustering ppt.
2 Outline
- What is Clustering?
- Distance Measure
- General Applications of Clustering
- Major Clustering Approaches
3 What is Clustering?
- Organizing data into clusters such that there is
  - high intra-cluster similarity
  - low inter-cluster similarity
- Clustering = unsupervised classification (no predefined classes)
- Informally, finding natural groupings among objects.
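The two criteria above can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the (inverse) similarity measure; the points, the cluster names, and both helper functions are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D points, already assigned to two clusters for illustration.
clusters = {
    "A": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)],
    "B": [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)],
}

def mean_intra_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [
        dist(p, q)
        for points in clusters.values()
        for p, q in combinations(points, 2)
    ]
    return sum(pairs) / len(pairs)

def mean_inter_distance(clusters):
    """Average distance between pairs of points from different clusters."""
    pairs = [
        dist(p, q)
        for (_, a), (_, b) in combinations(clusters.items(), 2)
        for p in a
        for q in b
    ]
    return sum(pairs) / len(pairs)

intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
print(f"mean intra-cluster distance: {intra:.2f}")
print(f"mean inter-cluster distance: {inter:.2f}")
```

A small intra-cluster distance together with a large inter-cluster distance corresponds to high intra-cluster similarity and low inter-cluster similarity.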
4 What is Clustering?
- What is a natural grouping among these objects?
- Clustering is subjective.
Figure labels: Simpson's Family, School Employees, Females, Males
5 What is Clustering?
Typical usage:
- As a stand-alone tool to get insight into data distribution
- As a preprocessing step for other algorithms
6 Distance Measure
What is Similarity? Not easy to answer, but important.
- "The quality or state of being similar; likeness; resemblance; as, a similarity of features."
- Similarity is hard to define, but "we know it when we see it".
- The real meaning of similarity is a philosophical question. We will take a more pragmatic approach.
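7 Cosine distance
Cosine distance is a widely used measure of how dissimilar two objects are, based on the angle between their feature vectors rather than on their magnitudes. For unit-length vectors it is directly related to Euclidean distance.
Turn to the cosine distance PDF.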
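The slide defers the formula to a separate PDF, so the following is only a minimal sketch under a common definition, assumed here: cosine distance = 1 - cosine similarity, where cosine similarity is the dot product divided by the product of the vector lengths. The function names and the vectors u, v, w are hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)

# Hypothetical feature vectors.
u = (1.0, 2.0, 3.0)
v = (2.0, 4.0, 6.0)   # same direction as u, twice the length
w = (3.0, 0.0, -1.0)  # orthogonal to u (dot product is 3 + 0 - 3 = 0)

print(cosine_distance(u, v))  # close to 0: same direction
print(cosine_distance(u, w))  # 1: orthogonal vectors
```

Because only the angle matters, u and v are treated as nearly identical even though their Euclidean distance is not zero.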
9 In this case, similarity can be measured using features such as dress, height, hair, age, and whether the person smokes. Choosing such features is the topic of feature extraction.
Dendrogram (for hierarchical clustering)
- A dendrogram is a cluster tree diagram.
- It shows where each split or merge occurs.
- It is a visualization of hierarchical clustering.
- It enables us to specify the cutting point that determines the number of clusters. E.g., in Fig. 1, we cut at 2 and obtain 2 clusters: 4 objects (3, 5, 6, 4) and 2 objects (1, 2). In Fig. 2, with the cut set to 1.2, we obtain 3 clusters.
Fig. 1
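Cutting a dendrogram at a given height can be sketched in a few lines. This is an illustration only: it assumes single-linkage agglomerative clustering (the slide does not say which linkage is used), and the 1-D values below are hypothetical, not the data behind Fig. 1 or Fig. 2. The function name cut_dendrogram is also an assumption.

```python
def cut_dendrogram(points, height):
    """Single-linkage clustering of 1-D points, cut at `height`.

    Repeatedly merges the two closest clusters and stops as soon as the
    next merge would happen above `height`. Returns clusters as sorted
    lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        # Closest pair of clusters under single linkage (minimum pairwise distance).
        d, i, j = min(
            (
                min(abs(points[a] - points[b]) for a in clusters[i] for b in clusters[j]),
                i,
                j,
            )
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > height:
            break  # the next merge lies above the cut line
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical 1-D values for six objects (indices 0..5).
values = [0.0, 0.5, 5.0, 5.8, 7.2, 7.6]

two_clusters = cut_dendrogram(values, height=2.0)
three_clusters = cut_dendrogram(values, height=1.2)
print(two_clusters)
print(three_clusters)
```

Lowering the cut height can only keep or increase the number of clusters, which matches the slide's observation that a lower cut yields more clusters.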