lec7-clustering - Machine Learning Lecture 8 Yang Yang...

Machine Learning Lecture 8 Yang Yang Department of Computer Science & Engineering Shanghai Jiao Tong University
Clustering and Distance Metrics
Reading: Chap. 9, 13 C.B book
What Is Clustering?
  Are there any groupings among them?
  What is each group?
  How many are there?
  How do we identify them?
What Is Clustering?
  Clustering: the process of grouping a set of objects into classes of similar objects
    high intra-class similarity
    low inter-class similarity
    it is the most common form of unsupervised learning
  Unsupervised learning = learning from raw (unlabeled, unannotated, etc.) data, as opposed to supervised learning, where a classification of examples is given
  A common and important task that finds many applications in science, engineering, information science, and other fields
    Group genes that perform the same function
    Group individuals who have similar political views
    Categorize documents on similar topics
    Identify similar objects in pictures
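The two criteria above (high intra-class similarity, low inter-class similarity) can be checked numerically. The following is only a minimal sketch: it assumes Euclidean distance as the dissimilarity measure, and the toy points and the helper names `mean_intra_distance` and `mean_inter_distance` are made up for illustration, not taken from the lecture.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_intra_distance(clusters):
    """Average distance over all pairs of points that share a cluster."""
    dists = [euclidean(p, q)
             for cluster in clusters
             for p, q in combinations(cluster, 2)]
    return sum(dists) / len(dists)

def mean_inter_distance(clusters):
    """Average distance over all pairs of points in different clusters."""
    dists = [euclidean(p, q)
             for c1, c2 in combinations(clusters, 2)
             for p in c1
             for q in c2]
    return sum(dists) / len(dists)

# Hypothetical toy data: two well-separated blobs in the plane.
good = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]
# The same six points, but grouped so that each cluster mixes both blobs.
bad = [
    [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0)],
    [(0.0, 1.0), (10.0, 11.0), (11.0, 10.0)],
]

for name, clusters in (("good", good), ("bad", bad)):
    print(name,
          "intra:", round(mean_intra_distance(clusters), 2),
          "inter:", round(mean_inter_distance(clusters), 2))
```

Under these assumptions, the grouping that respects the two blobs has a small within-cluster distance relative to its between-cluster distance, while the mixed grouping does not.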
Examples
Issues for Clustering
  What is a natural grouping among these objects?
    Definition of groupness
  What makes objects related?
    Definition of similarity/distance
  Representation for objects
    Vector space? Normalization?
  How many clusters?
    Fixed a priori?
    Completely data driven?
    Avoid trivial clusters: too large or too small
  Clustering algorithms
    Partitional algorithms
    Hierarchical algorithms
    Density-based algorithms
  Formal foundation and convergence
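To see why representation and normalization matter, here is a small sketch in which rescaling the features changes which object counts as the nearest neighbor. The data (age in years, income in dollars), the choice of z-score normalization, and the helper names `zscore` and `nearest` are assumptions for illustration only.

```python
import math
from statistics import mean, pstdev

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def zscore(points):
    """Rescale every feature (column) to zero mean and unit variance."""
    columns = list(zip(*points))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [tuple((x - m) / s for x, m, s in zip(p, mus, sigmas))
            for p in points]

def nearest(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: euclidean(points[i], points[j]))

# Hypothetical records: (age in years, income in dollars).
people = [(25, 50000), (60, 50500), (27, 56000), (62, 58000)]

# On raw values the income axis dominates the distance.
print("raw nearest to person 0:", nearest(people, 0))
# After z-scoring, both features contribute on a comparable scale.
print("normalized nearest to person 0:", nearest(zscore(people), 0))
```

With these made-up numbers, the raw distances pair person 0 with someone of similar income, while the normalized distances pair person 0 with someone of similar age. Neither answer is "the" correct one; the point is that the grouping depends on how the objects are represented.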
What Is a Natural Grouping Among These Objects?
What Is Similarity?
  Hard to define! But we know it when we see it
  The real meaning of similarity is a philosophical question. We will take a more pragmatic approach
  Depends on representation and algorithm
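As one pragmatic illustration, two standard measures can disagree about which objects are similar. The sketch below compares Euclidean distance with cosine similarity on hypothetical term-count vectors; the vectors and the document interpretation are assumptions made up for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine similarity: larger means more similar (ignores vector length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical term counts over a three-word vocabulary.
short_doc = (1, 1, 0)         # short document on topic A
long_doc = (10, 10, 0)        # long document on the same topic A
other_doc = (1, 0, 1)         # short document on a different topic

print("euclidean short-long: ", round(euclidean(short_doc, long_doc), 3))
print("euclidean short-other:", round(euclidean(short_doc, other_doc), 3))
print("cosine short-long:    ", round(cosine(short_doc, long_doc), 3))
print("cosine short-other:   ", round(cosine(short_doc, other_doc), 3))
```

Under Euclidean distance the short document is closer to the off-topic document of similar length, while under cosine similarity it is closer to the long document on the same topic. Which behavior is preferable depends on the representation and on the algorithm that consumes the measure.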