clustering1 - Clustering Preliminaries Applications...

1 Clustering
  Preliminaries
  Applications
  Euclidean/Non-Euclidean Spaces
  Distance Measures

2 The Problem of Clustering
• Given a set of points, with a notion of distance between points, group the points into some number of clusters, so that members of a cluster are in some sense as close to each other as possible.

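One way to make "as close to each other as possible" concrete is to score a grouping by the average distance between points that share a cluster. The sketch below is only an illustration under assumptions the slide does not state: it uses Euclidean distance, and the function name and sample points are hypothetical.

```python
import math
from itertools import combinations


def mean_intra_cluster_distance(clusters):
    """Average Euclidean distance over all pairs of points that share a cluster."""
    dists = [
        math.dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    ]
    return sum(dists) / len(dists) if dists else 0.0


# Hypothetical data: two tight pairs of points, far apart from each other.
left = [(0.0, 0.0), (0.0, 1.0)]
right = [(10.0, 0.0), (10.0, 1.0)]

good = [left, right]                # nearby points grouped together
bad = [[left[0], right[0]],         # far-apart points grouped together
       [left[1], right[1]]]

good_score = mean_intra_cluster_distance(good)
bad_score = mean_intra_cluster_distance(bad)
print(good_score, bad_score)
```

Under this score, lower is better, so the grouping that keeps nearby points together wins. Other notions of closeness (for example, distance to a cluster center) are equally reasonable choices.
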
3 Example
[Figure: scatter plot of points, each marked x.]

4 Problems With Clustering
• Clustering in two dimensions looks easy.
• Clustering small amounts of data looks easy.
• And in most cases, looks are not deceiving.

5 The Curse of Dimensionality
• Many applications involve not 2, but 10 or 10,000 dimensions.
• High-dimensional spaces look different: almost all pairs of points are at about the same distance.
  ◦ Example: assume random points within a bounding box, e.g., values between 0 and 1 in each dimension.

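The bounding-box example can be tried numerically. The sketch below draws random points with each coordinate uniform in [0, 1] and reports the ratio of the smallest to the largest pairwise Euclidean distance; a ratio near 1 means almost all pairs are at about the same distance. The function name, sample size, and seed are assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def distance_spread(dim, n_points=100, seed=0):
    """Ratio of the smallest to the largest pairwise distance among random points.

    Each coordinate is drawn uniformly from [0, 1), as in the bounding-box example.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return min(dists) / max(dists)


for dim in (2, 10, 1000):
    print(dim, round(distance_spread(dim), 3))
```

In two dimensions the closest pair is typically far closer than the farthest pair, so the ratio is small. As the dimension grows the ratio should climb toward 1, because the squared distance is a sum of many independent per-coordinate terms and concentrates around its mean.
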
6 Example: SkyCat
• A catalog of 2 billion “sky objects” represents objects by their radiation in 9 dimensions (frequency bands).
• Problem: cluster into similar objects, e.g., galaxies, nearby stars, quasars, etc.
• Sloan Sky Survey is a newer, better version.

7 Example: Clustering CDs (Collaborative Filtering)
• Intuitively: music divides into categories, and customers prefer a few categories.
  ◦ But what are categories really?
• Represent a CD by the customers who bought it.
• Similar CDs have similar sets of customers, and vice versa.

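To illustrate "similar sets of customers", the sketch below compares CDs by the overlap of their buyer sets. The slide does not say which similarity measure is intended; Jaccard similarity (intersection size over union size) is assumed here as one common choice, and the CD names and customers are hypothetical.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical purchase data: each CD is represented by the set of its buyers.
buyers = {
    "jazz_1": {"ann", "bob"},
    "jazz_2": {"ann", "bob", "cat"},
    "metal_1": {"dan"},
}

sim_jazz = jaccard_similarity(buyers["jazz_1"], buyers["jazz_2"])
sim_mixed = jaccard_similarity(buyers["jazz_1"], buyers["metal_1"])
print(sim_jazz, sim_mixed)
```

The two jazz CDs share most of their buyers and score high, while the jazz and metal CDs share none and score zero, which matches the intuition that categories emerge from who buys what.
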
8 The Space of CD’s
• Think of a space with one dimension for each customer.
  ◦ Values in a dimension may be 0 or 1 only.
• A CD’s point in this space is (x_1, x_2, …, x_k), where x_i = 1 iff the i-th customer bought the CD.

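The mapping from a CD to its point can be sketched directly: fix an ordering of the k customers, then emit a 1 in position i exactly when the i-th customer bought the CD. The customer list, CD names, and function name below are hypothetical.

```python
def cd_point(cd_buyers, customers):
    """Return (x_1, ..., x_k) with x_i = 1 iff the i-th customer bought the CD."""
    return tuple(1 if customer in cd_buyers else 0 for customer in customers)


# Hypothetical data: a fixed ordering of k = 4 customers and three CDs.
customers = ["ann", "bob", "cat", "dan"]
purchases = {
    "jazz_1": {"ann", "bob"},
    "jazz_2": {"ann", "bob", "cat"},
    "metal_1": {"dan"},
}

points = {cd: cd_point(b, customers) for cd, b in purchases.items()}
for cd, point in points.items():
    print(cd, point)
```

With many customers these vectors are very long and mostly zeros, so in practice storing the set of buyers is usually more compact than storing the full vector; the two representations carry the same information.
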