lecture12-clustering-handout-6-per



Introduction to Information Retrieval

Sec. 16.1
Scatter/Gather: Cutting, Karger, and Pedersen

Sec. 16.1
For visualizing a document collection and its themes
  - Wise et al., Visualizing the non-visual, PNNL
  - ThemeScapes, Cartia
  - [Mountain height = cluster size]

Sec. 16.1
For improving search recall
  - Cluster hypothesis: documents in the same cluster behave similarly with respect to relevance to information needs
  - Therefore, to improve search recall:
      - Cluster docs in corpus a priori
      - When a query matches a doc D, also return other docs in the cluster containing D
  - Hope if we do this: the query "car" will also return docs containing "automobile"
      - Because clustering grouped together docs containing "car" with those containing "automobile"
  - Why might this happen?

Sec. 16.2
Issues for clustering
  - Representation for clustering
      - Document representation
          - Vector space? Normalization?
      - Centroids a...
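The recall-improvement idea from Sec. 16.1 (cluster the corpus ahead of time, then return the whole cluster of any document the query matches) can be sketched roughly as below. This is a minimal illustration, not the lecture's implementation: the toy corpus, the hand-assigned `cluster_of` labels, and the function names are all assumptions, and the cluster labels stand in for whatever an offline clustering step would produce.

```python
from collections import defaultdict

# Toy corpus (hypothetical, for illustration only).
docs = {
    0: "used car prices fell this year",
    1: "automobile dealers report lower prices",
    2: "new pasta recipe with tomato sauce",
    3: "car and automobile insurance rates",
}

# Assumed result of an a priori clustering step: doc id -> cluster id.
# The labels are hand-assigned here; any clustering algorithm could supply them.
cluster_of = {0: 0, 1: 0, 2: 1, 3: 0}


def build_cluster_index(cluster_of):
    """Invert the doc -> cluster map into cluster -> set of doc ids."""
    members = defaultdict(set)
    for doc_id, cluster_id in cluster_of.items():
        members[cluster_id].add(doc_id)
    return members


def keyword_match(query, docs):
    """Return ids of docs that contain the query term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if query in text.split()}


def search_with_cluster_expansion(query, docs, cluster_of):
    """Return (direct matches, matches plus all docs sharing a cluster with a match)."""
    members = build_cluster_index(cluster_of)
    direct = keyword_match(query, docs)
    expanded = set(direct)
    for doc_id in direct:
        expanded |= members[cluster_of[doc_id]]
    return direct, expanded


direct, expanded = search_with_cluster_expansion("car", docs, cluster_of)
print(sorted(direct))    # [0, 3]
print(sorted(expanded))  # [0, 1, 3]
```

In this toy setup, doc 1 never mentions "car" but is returned because it shares a cluster with docs 0 and 3, which is the effect the cluster hypothesis hopes for. The gain in recall depends entirely on how well the clusters track relevance; a poor clustering would add irrelevant docs instead.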

This document was uploaded on 02/26/2014.
