
Boston University
Department of Computer Science
CS 565 Data Mining
Midterm Exam
Date: Oct 14, 2009
Time: 4:00 p.m. - 5:30 p.m.
Write Your University Number Here:
Answer all questions.
Good luck!
Problem 1 [25 points]
True or False:
1. Maximal frequent itemsets are sufficient to determine all frequent itemsets with their
supports.
2. The maximal frequent itemsets (and only those) constitute the positive border of a
frequent-set collection.
3. Let $D$ be the Euclidean distance between multidimensional points. Assume a set of
$n$ points $X = \{x_1, \ldots, x_n\}$ in a $d$-dimensional space and project them into a
lower-dimensional space $k \geq O(\log n)$. If $Y = \{y_1, \ldots, y_n\}$ is the new set of
$k$-dimensional points, then the Johnson-Lindenstrauss lemma states that for all pairs
$(i, j)$ it holds that $S(x_i, x_j) = D(y_i, y_j)$. (All points $x_i$ and $y_i$ are
normalized to have length 1.)
4. Computing the mean and the variance of a stream of numbers can be done using a single
pass over the data and constant (
