
# clustering1 - Clustering Preliminaries Applications...


## Clustering

- Preliminaries
- Applications
- Euclidean/Non-Euclidean Spaces
- Distance Measures

## The Problem of Clustering

- Given a set of points, with a notion of distance between points, group the points into some number of clusters, so that members of a cluster are in some sense as close to each other as possible.
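
One way to make "as close to each other as possible" concrete is to score a candidate grouping by the average distance between points that share a cluster. The sketch below is a minimal illustration under stated assumptions: Euclidean distance, hypothetical 2-D points, and a helper name (`mean_intra_cluster_distance`) chosen here for illustration. The slides do not prescribe this particular score.

```python
import math
from itertools import combinations


def mean_intra_cluster_distance(clusters):
    """Average Euclidean distance over all pairs of points that share a cluster."""
    dists = [
        math.dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    ]
    return sum(dists) / len(dists)


# Hypothetical 2-D points forming two well-separated groups.
left = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
right = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# A grouping that respects the two groups, and one that mixes them.
good = [left, right]
bad = [left[:2] + right[:1], left[2:] + right[1:]]

print(mean_intra_cluster_distance(good))
print(mean_intra_cluster_distance(bad))
```

The grouping that keeps nearby points together gets the smaller score, which matches the informal goal stated on the slide.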

## Example

[Figure: scatter plot of points, each marked with an x.]

## Problems With Clustering

- Clustering in two dimensions looks easy.
- Clustering small amounts of data looks easy.
- And in most cases, looks are not deceiving.

## The Curse of Dimensionality

- Many applications involve not 2, but 10 or 10,000 dimensions.
- High-dimensional spaces look different: almost all pairs of points are at about the same distance.
  - Example: assume random points within a bounding box, e.g., values between 0 and 1 in each dimension.
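
The claim that almost all pairs of points sit at about the same distance can be checked empirically. The sketch below is a minimal illustration, assuming uniformly random points with every coordinate between 0 and 1 and Euclidean distance. The function name `relative_spread`, the sample size, and the seed are illustrative choices, not part of the slides.

```python
import math
import random


def relative_spread(dim, n_points=50, seed=0):
    """Return (max - min) / mean over all pairwise Euclidean distances."""
    rng = random.Random(seed)
    # Random points in the unit box [0, 1]^dim.
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    mean = sum(dists) / len(dists)
    return (max(dists) - min(dists)) / mean


# Compare low, moderate, and high dimensionality.
for dim in (2, 10, 1000):
    print(dim, round(relative_spread(dim), 3))
```

As the dimension grows, the gap between the nearest and farthest pair shrinks relative to the typical distance, so distance alone separates "close" from "far" pairs less sharply.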

## Example: SkyCat

- A catalog of 2 billion “sky objects” represents objects by their radiation in 9 dimensions (frequency bands).
- Problem: cluster into similar objects, e.g., galaxies, nearby stars, quasars, etc.
- Sloan Sky Survey is a newer, better version.

## Example: Clustering CDs (Collaborative Filtering)

- Intuitively: music divides into categories, and customers prefer a few categories.
  - But what are categories really?
- Represent a CD by the customers who bought it.
- Similar CDs have similar sets of customers, and vice versa.
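
Representing a CD by the customers who bought it can be sketched with plain sets. The purchase records, customer names, and helper names below are hypothetical and only illustrate the idea. Counting shared buyers is one simple stand-in for "similar sets of customers", not a measure taken from the slides.

```python
from collections import defaultdict

# Hypothetical purchase records: (customer, CD) pairs.
purchases = [
    ("ann", "cd_jazz_1"), ("bob", "cd_jazz_1"), ("cat", "cd_jazz_1"),
    ("ann", "cd_jazz_2"), ("bob", "cd_jazz_2"),
    ("dan", "cd_metal_1"), ("eve", "cd_metal_1"),
]


def buyers_by_cd(records):
    """Map each CD to the set of customers who bought it."""
    buyers = defaultdict(set)
    for customer, cd in records:
        buyers[cd].add(customer)
    return dict(buyers)


def shared_buyers(buyers, cd_a, cd_b):
    """Customers who bought both CDs."""
    return buyers[cd_a] & buyers[cd_b]


buyers = buyers_by_cd(purchases)
print(shared_buyers(buyers, "cd_jazz_1", "cd_jazz_2"))
print(shared_buyers(buyers, "cd_jazz_1", "cd_metal_1"))
```

In this toy data the two jazz CDs share buyers while the jazz and metal CDs share none, which is the kind of signal a clustering of CDs would rely on.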


## The Space of CDs

- Think of a space with one dimension for each customer.
  - Values in a dimension may be 0 or 1 only.
- A CD’s point in this space is $(x_1, x_2, \ldots, x_k)$, where $x_i = 1$ iff the $i$-th customer bought the CD.
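
The mapping from a CD to its point in this space can be written directly from the definition. This is a minimal sketch: the customer list, its ordering, and the function name `cd_vector` are hypothetical and chosen only for illustration.

```python
def cd_vector(buyers_of_cd, customers):
    """Return (x_1, ..., x_k) with x_i = 1 iff the i-th customer bought the CD."""
    return tuple(1 if customer in buyers_of_cd else 0 for customer in customers)


# Hypothetical customer ordering; k = 4 dimensions.
customers = ["ann", "bob", "cat", "dan"]

# A CD bought by ann and cat only.
print(cd_vector({"ann", "cat"}, customers))  # (1, 0, 1, 0)
```

With many customers, almost every coordinate is 0, so in practice the set of buyers is a more compact way to store the same point.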