mid_sol - Boston University Department of Computer Science...


Boston University
Department of Computer Science
CS 565 Data Mining
Midterm Exam Solutions

Date: Oct 14, 2009
Time: 4:00 p.m. - 5:30 p.m.

Write Your University Number Here:

Answer all questions. Good luck!

Problem 1 [25 points] True or False:

1. Maximal frequent itemsets are sufficient to determine all frequent itemsets with their supports.

2. The maximal frequent itemsets (and only those) constitute the positive border of a frequent-set collection.

3. Let D be the Euclidean distance between multidimensional points. Assume a set of n points X = {x_1, ..., x_n} in a d-dimensional space and project them into a lower-dimensional space with k = O(log n) dimensions. If Y = {y_1, ..., y_n} is the new set of k-dimensional points, then, the Johnson-Lindenstrauss lemma states that for all pairs (i, j) it holds that S(x_i, x_j) = D(y_i, y_j). (All points x_i and y_i are normalized to have length 1.)

4. Computing the mean and a variance of a stream of numbers can be done using a single pass over the data and constant (O(1)) space.

5. The disagreement distance between two clusterings is a metric.

Answers: false, true, false, true, true

Problem 2 [10 points]

Consider a dictionary of n terms (words) T = {t_1, ..., t_n}. Each term t_i is associated with its importance w(t_i) (a positive real value). Additionally, assume a collection of m documents D = {d_1, ..., d_m}, such that each document d_i uses a subset of terms in the dictionary (i.e., d_i ⊆ T). You are asked to give a polynomial-time algorithm that finds a collection of ...
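Statement 4 of Problem 1 (single-pass, constant-space mean and variance) can be illustrated with a short sketch. The exam does not say which method it has in mind; the code below assumes Welford's incremental update, and the names RunningStats and stream_mean_variance are hypothetical, chosen only for this example. It reports the population variance (division by n); a sample variance would divide by n - 1 instead.

```python
class RunningStats:
    """Mean and variance of a stream, kept in three numbers (O(1) space)."""

    def __init__(self):
        self.count = 0    # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def push(self, x):
        # Welford's update: adjust the mean, then accumulate the squared
        # deviation using both the old and the new mean.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance; undefined for an empty stream.
        if self.count == 0:
            raise ValueError("variance of an empty stream is undefined")
        return self.m2 / self.count


def stream_mean_variance(stream):
    """Consume any iterable once and return (mean, population variance)."""
    stats = RunningStats()
    for x in stream:
        stats.push(x)
    return stats.mean, stats.variance()
```

Because only count, mean, and m2 are stored, memory use does not grow with the length of the stream, and each value is read exactly once.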

This document was uploaded on 10/05/2010.
