19-clustering

Clustering
CS273 - Data and Knowledge Bases
Xifeng Yan
Computer Science
University of California at Santa Barbara

Department of Computer Science
Unsupervised Learning
  Given n observations x1, …, xn of a random variable X having joint density Pr(X), infer the properties of Pr(X).
Data and Knowledge Bases | University of California at Santa Barbara
Supervised vs. Unsupervised
  Supervised: data are labeled
    Cross Validation
  Unsupervised: data are unlabeled
    No measure of success
    Heuristic arguments for judgments
    Lots of methods developed

Alternative View
  Unsupervised vs Supervised
Methods
  K-Means
  K-Medoids
  Graph Partition/Decomposition
  Hierarchical Clustering
  Principal Components
  Independent Components
  Self-Organizing Maps
  Multidimensional Scaling
  Spectral Clustering
  And many others

Cluster Analysis
  Grouping a collection of objects into clusters, such that those within each cluster are more closely related
  Core problem: distance definition
    Intra cluster distance
    Inter cluster distance
  But what if (figure labels: A, B, C)
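To make the two quantities named on this slide concrete, the sketch below computes an intra-cluster and an inter-cluster distance for a small hand-made partition. The slide does not define either quantity, so the function names, the choice of Euclidean distance, and the use of a mean over point pairs are all assumptions made for illustration.

```python
import math
from itertools import combinations, product


def euclidean(x, y):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def intra_cluster_distance(cluster, dist=euclidean):
    """Mean pairwise distance inside one cluster (an assumed definition)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:  # a single point has no pairs
        return 0.0
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def inter_cluster_distance(c1, c2, dist=euclidean):
    """Mean distance between points of two different clusters (an assumed definition)."""
    pairs = list(product(c1, c2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical toy data: two well-separated groups in the plane.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0)]

print(intra_cluster_distance(cluster_a))
print(intra_cluster_distance(cluster_b))
print(inter_cluster_distance(cluster_a, cluster_b))
```

Under this partition the intra-cluster values are small and the inter-cluster value is large, which is the situation a clustering method aims for. Swapping in a different `dist` function can change which partition looks best, which is why the slide calls the distance definition the core problem.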
Tricky Situations
  Claims: No Perfect Method

Distance Measures
  Metric function, which satisfies:
    1. $d(x, y) \ge 0$, and $d(x, y) = 0 \iff x = y$
    2. $d(x, z) \le d(x, y) + d(y, z)$
    3. $d(x, y) = d(y, x)$
  Non-Metric function, e.g., graph
    No Triangle Inequality
    No Symmetry
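The metric properties can be spot-checked numerically on a finite sample of points. The sketch below is only an illustration: `is_metric`, `abs_diff`, `squared_diff`, and `one_way` are hypothetical names, and passing the check on a sample is evidence rather than a proof that a function is a metric.

```python
from itertools import product


def is_metric(points, dist, tol=1e-9):
    """Check non-negativity, identity, symmetry, and the triangle inequality on a sample."""
    for x, y in product(points, repeat=2):
        d_xy = dist(x, y)
        if d_xy < -tol:                      # non-negativity
            return False
        if (d_xy <= tol) != (x == y):        # d(x, y) = 0 exactly when x = y
            return False
        if abs(d_xy - dist(y, x)) > tol:     # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:   # triangle inequality
            return False
    return True


def abs_diff(x, y):
    """Absolute difference on the real line: a metric."""
    return abs(x - y)


def squared_diff(x, y):
    """Squared difference: symmetric, but breaks the triangle inequality."""
    return (x - y) ** 2


def one_way(x, y):
    """Hypothetical asymmetric cost: moving 'up' costs twice as much as moving 'down'."""
    return 2 * (y - x) if y > x else x - y


points = [0.0, 1.0, 2.0]
print(is_metric(points, abs_diff))
print(is_metric(points, squared_diff))
print(is_metric(points, one_way))
```

The two failing functions correspond to the two non-metric cases named on the slide: `squared_diff` has no triangle inequality (the distance from 0 to 2 is 4, but going through 1 costs only 2), and `one_way` has no symmetry, as can happen with path costs in a directed graph.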
Distance Measures
  Euclidean Distance: $d(x, y) = \sqrt{(x - y)^T (x - y)} = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}$
  Manhattan Distance: $d(x, y) = |x - y| = \sum_{i=1}^{d} |x_i - y_i|$
  Infinity (Sup) Distance: $d(x, y) = \max_{1 \le i \le d} |x_i - y_i|$
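The three distances on this slide can be written directly from their definitions. The following is a minimal sketch in plain Python; the function names and the sample vectors are chosen for illustration and do not come from the slides.

```python
import math


def euclidean_distance(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def manhattan_distance(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def sup_distance(x, y):
    """Largest absolute coordinate difference (infinity distance)."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))


# Hypothetical sample vectors; the coordinate differences are 3, 4, and 0.
x = (1.0, 2.0, 3.0)
y = (4.0, 6.0, 3.0)

print(euclidean_distance(x, y))  # 5.0
print(manhattan_distance(x, y))  # 7.0
print(sup_distance(x, y))        # 4.0
```

For any pair of vectors the sup distance is at most the Euclidean distance, which is at most the Manhattan distance, so the choice among them changes how strongly a single large coordinate difference dominates the result.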

Intra-cluster Distance

This note was uploaded on 01/09/2012 for the course CS273 taught by Professor Xifeng Yan during the Spring '11 term at UCSB.