lecture9-queryexpansion-handout-6-per

# Introduction to Information Retrieval, Sec. 9.1.3


... important

- Positive feedback is more valuable than negative feedback (so, set γ < β; e.g. γ = 0.25, β = 0.75).
- Many systems only allow positive feedback (γ = 0). Why?
- Users can be expected to review results and to take time to iterate.

Aside: Vector Space can be Counterintuitive

[Figure: scatter plot of documents in vector space around the document "J. Snow & Cholera". Legend: q1 = query "cholera"; o = www.ph.ucla.edu/epi/snow.html; x = other documents.]

High-dimensional Vector Spaces

- The queries "cholera" and "john snow" are far from each other in vector space.
- How can the document "John Snow and Cholera" be close to both of them?
- Our intuitions for 2- and 3-dimensional space don't work in >10,000 dimensions.
- 3 dimensions: If a document is close to many queries, then some of these queries must be close to each other.
- This doesn't hold for a high-dimensional space.

Sec. 9.1.3

- A1: User has sufficient knowledge for initial query.
- A2: Relevance prototypes are "well-behaved".
  - Term distribution in relevant documents will be similar.
  - Term distribution in non-relevant documents will be different from that in relevant documents.
  - Either: All relevant documents are tightly clustered around a single prototype.
  - Or: There are different prototypes, but they have significant vocabulary overlap.
  - Similarities between relevant and irrelevant documents are small...
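The weighting advice (γ < β) and the "cholera" / "john snow" aside can be made concrete with a small sketch. It assumes the standard Rocchio update, in which the new query is α times the original query, plus β times the centroid of the relevant documents, minus γ times the centroid of the non-relevant documents. The weight α and its default of 1.0 do not appear in the text above and are assumptions here, as are the function names, the three-term toy vocabulary, and the convention of clipping negative term weights to zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(docs, dim):
    """Component-wise mean of a list of vectors; the zero vector if the list is empty."""
    if not docs:
        return [0.0] * dim
    return [sum(d[i] for d in docs) / len(docs) for i in range(dim)]


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style query update (alpha is an assumed weight, not from the text).

    gamma < beta reflects the advice that positive feedback is more valuable;
    gamma=0 gives a positive-feedback-only system.
    """
    dim = len(query)
    rel_c = centroid(relevant, dim)
    non_c = centroid(nonrelevant, dim)
    updated = [alpha * q + beta * r - gamma * n
               for q, r, n in zip(query, rel_c, non_c)]
    # Assumed convention: negative term weights are set to zero.
    return [max(0.0, w) for w in updated]


# Toy vocabulary (an assumption for illustration): ["cholera", "john", "snow"]
q_cholera = [1.0, 0.0, 0.0]
q_john_snow = [0.0, 1.0, 1.0]
doc_snow_cholera = [1.0, 1.0, 1.0]  # "John Snow and Cholera"

# The two queries share no terms, yet the document overlaps with both.
print(cosine(q_cholera, q_john_snow))       # 0.0
print(cosine(doc_snow_cholera, q_cholera))  # about 0.58
print(cosine(doc_snow_cholera, q_john_snow))  # about 0.82

# One feedback round: the document is marked relevant, a "cholera"-only
# document is marked non-relevant.
print(rocchio(q_cholera, [doc_snow_cholera], [[1.0, 0.0, 0.0]]))
# [1.5, 0.75, 0.75]
```

Passing `gamma=0` to `rocchio` models a system that only allows positive feedback. The toy example covers only two queries, so it illustrates how one document can be close to two dissimilar queries rather than the full high-dimensional argument.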

## This document was uploaded on 02/26/2014.
