lecture12-clustering-handout-6-per

For each cluster we have a cost c thus for a


{C,F}
- Disjoint and exhaustive
- Doesn't have a notion of outliers by default
- But can add outlier filtering
Dhillon et al. ICDM 2002: variation to fix some issues with small document clusters

Introduction to Information Retrieval

How Many Clusters?
- Number of clusters K is given
  - Partition n docs into predetermined number of clusters
- Finding the right number of clusters is part of the problem
  - Given docs, partition into an appropriate number of subsets.
  - E.g., for query results: ideal value of K not known up front, though UI may impose limits.
- Can usually take an algorithm for one flavor and convert to the other.

K not specified in advance
- Say, the results of a query.
- Solve an optimization problem: penalize having lots of clusters
  - application dependent, e.g.,...
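One way to picture "penalize having lots of clusters" is the minimal Python sketch below. It rests on assumptions that are not in the slides: fit is measured by the residual sum of squares (RSS) of a plain k-means run, each cluster carries a fixed cost, and K is chosen to minimize RSS plus K times that cost. The names `kmeans_rss`, `choose_k`, and `cost_per_cluster`, as well as the toy `points`, are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Deterministic farthest-point seeding (an assumption, chosen for repeatability)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_rss(points, k, iters=20):
    """Run a fixed number of k-means iterations and return the final RSS."""
    centers = init_centers(points, k)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its previous center.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def choose_k(points, max_k, cost_per_cluster):
    """Return the K in 1..max_k that minimizes RSS + K * cost_per_cluster."""
    best_k, best_total = None, float("inf")
    for k in range(1, max_k + 1):
        total = kmeans_rss(points, k) + cost_per_cluster * k
        if total < best_total:
            best_k, best_total = k, total
    return best_k


# Toy data: three well-separated groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

print(choose_k(points, 6, 5.0))  # 3
```

With a per-cluster cost of 5.0, splitting any of the three tight groups lowers the RSS by less than the extra cost, so the penalized total is smallest at K = 3. The right cost is application dependent; a smaller value favors more, tighter clusters.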

This document was uploaded on 02/26/2014.
