l22 - 6.896 Sublinear Time Algorithms April 26, 2007...

6.896 Sublinear Time Algorithms
April 26, 2007
Lecture 22
Lecturer: Ronitt Rubinfeld
Scribe: Brendan Juba

1 Overview

We will continue examining sublinear time algorithms for clustering. Last time, we considered a set of $n$ points and gave an algorithm to decide, in a constant number of queries, whether they are $(k, b)$-radius clusterable. Today we'll give an algorithm for a completely different notion of clustering, which examines the average distance of the points from their centers rather than the maximum distance. Our algorithm will make $O(\log n)$ queries, which is worse, but it outputs an approximation of how well the data can be clustered rather than simply testing whether or not a clustering exists. In fact, we'll even see how to find a concise representation of an approximate clustering in sublinear time. It's worth noting that the exact version of this problem is also NP-complete.

2 Notation and preliminaries

Let $X$ be a set of $n$ points such that for any two points $x, y \in X$, the distance between $x$ and $y$, $\mathrm{dist}(x, y)$, is at most $M$. Given $k$ centers, $c_1, \ldots, c_k \in X$, we define

\[ f_{c_1,\ldots,c_k}(x) = \min_i \mathrm{dist}(x, c_i). \]

We remark that, in clustering, one should always check whether the cluster centers are allowed to be arbitrary points, or whether they are restricted to come from the input data. In this case, notice that they must lie in $X$, the set we wish to cluster.

We will define the cost of a clustering to be the average over all points of the distance to the closest center, i.e., the cost of the clustering $f_{c_1,\ldots,c_k}$ is

\[ \mathbb{E}_X\left[ f_{c_1,\ldots,c_k}(x) \right] = \frac{1}{|X|} \sum_{x \in X} f_{c_1,\ldots,c_k}(x). \]

Our goal to choose centers $c_1, \ldots,$ c,....
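To make the two definitions in Section 2 concrete, here is a minimal Python sketch that evaluates the closest-center distance and the average cost exactly by reading every point. It is only an illustration of the definitions, not the sublinear algorithm from the lecture. The function names (`closest_center_distance`, `clustering_cost`, `abs_dist`) and the one-dimensional example data are assumptions made for this sketch and do not appear in the notes.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")  # type of a point; any type works given a distance function


def closest_center_distance(
    x: P, centers: Sequence[P], dist: Callable[[P, P], float]
) -> float:
    """f_{c_1,...,c_k}(x): distance from x to its nearest center."""
    return min(dist(x, c) for c in centers)


def clustering_cost(
    points: Sequence[P], centers: Sequence[P], dist: Callable[[P, P], float]
) -> float:
    """Average over all points of the distance to the closest center."""
    if not points:
        raise ValueError("the point set X must be non-empty")
    total = sum(closest_center_distance(x, centers, dist) for x in points)
    return total / len(points)


def abs_dist(a: float, b: float) -> float:
    """Hypothetical metric for the example: absolute difference on the line."""
    return abs(a - b)


# Illustrative data (assumed): five points on the line, with k = 2 centers
# drawn from the point set itself, as the notes require.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
centers = [1.0, 10.0]
cost = clustering_cost(points, centers, abs_dist)
```

In this example the per-point distances to the nearest center are 1, 0, 1, 0, 1, so the average cost is 3/5. This exact evaluation reads all $n$ points, which is precisely what a sublinear algorithm has to avoid.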

This note was uploaded on 04/02/2010 for the course CS 6.896 taught by Professor Ronitt Rubinfeld during the Fall '04 term at MIT.
