Problem Set 2
September 28, 2009
Due date: Wed, October 14, 2009 at 4pm, before class.
Exercise 1:
(20 points) Assume two $d$-dimensional real vectors $x$ and $y$, and denote by $x_i$ ($y_i$) the value in the $i$-th coordinate of $x$ ($y$). Prove or disprove the following statements:
1. Distance function $L_1(x, y) = \sum_{i=1}^{d} |x_i - y_i|$ is a metric. (5 points)
2. Distance function $L_2(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}$ is a metric. (5 points)
3. Distance function $L_2^2(x, y) = \sum_{i=1}^{d} (x_i - y_i)^2$ is a metric. (10 points)
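
For experimenting with the three distance functions before attempting a proof, a minimal Python sketch may help. The function names (`l1`, `l2`, `l2_squared`, `triangle_holds`) are hypothetical and not part of the problem set. A numerical check on a few points can only suggest a counterexample or fail to find one; it is not a proof.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def l1(x: Vector, y: Vector) -> float:
    # Sum of absolute coordinate differences.
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def l2(x: Vector, y: Vector) -> float:
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def l2_squared(x: Vector, y: Vector) -> float:
    # Sum of squared coordinate differences (no square root).
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def triangle_holds(dist: Callable[[Vector, Vector], float],
                   x: Vector, y: Vector, z: Vector,
                   tol: float = 1e-12) -> bool:
    # Checks dist(x, z) <= dist(x, y) + dist(y, z) for one triple of points.
    # The small tolerance absorbs floating-point rounding.
    return dist(x, z) <= dist(x, y) + dist(y, z) + tol
```

Calling `triangle_holds` with each distance function on a handful of hand-picked triples is a quick way to test the triangle inequality, which is typically the hardest metric axiom to verify.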
Exercise 2:
(30 points) In class, we have discussed Kleinberg's impossibility theorem for clustering. Show whether the $k$-means clustering function satisfies (a) richness, (b) scale invariance, and (c) consistency. (10 points for every axiom)
Exercise 3:
(20 points) The $k$-means clustering problem takes as input $n$ points $X$ in a $d$-dimensional space and asks for a partition of the points into $k$ parts $C_1, \ldots, C_k$. Each part $C_i$ is represented by a $d$-dimensional