lecture9-queryexpansion-handout-6-per

Introduction to Information Retrieval, Sec. 9.1.3



... important
- Positive feedback is more valuable than negative feedback (so, set γ < β; e.g. γ = 0.25, β = 0.75).
- Many systems only allow positive feedback (γ = 0). Why?
- Users can be expected to review results and to take time to iterate

Introduction to Information Retrieval

Aside: Vector Space can be Counterintuitive.

[Figure: documents (x) scattered around the query q1 "cholera", with the document "J. Snow & Cholera" marked. Legend: q1 = query "cholera"; o = www.ph.ucla.edu/epi/snow.html; x = other documents]

High-dimensional Vector Spaces

- The queries "cholera" and "john snow" are far from each other in vector space.
- How can the document "John Snow and Cholera" be close to both of them?
- Our intuitions for 2- and 3-dimensional space don't work in >10,000 dimensions.
- 3 dimensions: If a document is close to many queries, then some of these queries must be close to each other.
- Doesn't hold for a high-dimensional space.

Sec. 9.1.3

- A1: User has sufficient knowledge for initial query.
- A2: Relevance prototypes are "well-behaved".
- Term distribution in relevant documents will be similar
- Term distribution in non-relevant documents will be different from those in relevant documents
- Either: All relevant documents are tightly clustered around a single prototype.
- Or: There are different prototypes, but they have significant vocabulary overlap.
- Similarities between relevant and irrelevant documents are small...
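The feedback weights γ and β mentioned at the top of this excerpt can be illustrated with a small sketch. It assumes the standard Rocchio update, q_m = α·q_0 + β·centroid(relevant) − γ·centroid(non-relevant), which is not shown in this excerpt. The value α = 1.0, the clipping of negative weights to zero, the function names, and the toy term vectors are all assumptions made for illustration.

```python
def centroid(docs, terms):
    """Average weight of each term over a list of sparse document vectors."""
    if not docs:
        return {t: 0.0 for t in terms}
    return {t: sum(d.get(t, 0.0) for d in docs) / len(docs) for t in terms}


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style query update (assumed form, see lead-in).

    beta weights positive feedback, gamma weights negative feedback.
    Negative term weights are clipped to 0, an assumption of this sketch.
    """
    terms = set(query)
    for d in relevant + nonrelevant:
        terms.update(d)
    rel_c = centroid(relevant, terms)
    non_c = centroid(nonrelevant, terms)
    modified = {}
    for t in terms:
        w = alpha * query.get(t, 0.0) + beta * rel_c[t] - gamma * non_c[t]
        modified[t] = max(w, 0.0)
    return modified


# Hypothetical toy data: raw term counts, no tf-idf weighting.
query = {"cholera": 1.0}
relevant = [{"cholera": 1.0, "snow": 1.0}, {"cholera": 1.0, "epidemic": 1.0}]
nonrelevant = [{"snow": 1.0, "weather": 1.0}]

new_query = rocchio(query, relevant, nonrelevant)                 # gamma = 0.25
positive_only = rocchio(query, relevant, nonrelevant, gamma=0.0)  # gamma = 0
```

With γ = 0.25 the term "snow" is pulled up by the relevant documents and pushed down slightly by the non-relevant one. With γ = 0 only the relevant documents influence the new query.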
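The aside on high-dimensional vector spaces can be checked numerically. The sketch below assumes raw term-count vectors and cosine similarity, drops the stopword "and" from the document title, and uses hypothetical helper names (`bag`, `cosine`). It shows only the term-overlap effect, not a full tf-idf weighting.

```python
import math


def bag(text):
    """Build a sparse term-count vector from whitespace-separated text."""
    vec = {}
    for term in text.lower().split():
        vec[term] = vec.get(term, 0.0) + 1.0
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u.get(t, 0.0) * w for t, w in v.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


q1 = bag("cholera")
q2 = bag("john snow")
doc = bag("john snow cholera")  # title with the stopword "and" removed (assumption)

sim_q1_q2 = cosine(q1, q2)    # the queries share no terms
sim_doc_q1 = cosine(doc, q1)  # the document shares "cholera" with q1
sim_doc_q2 = cosine(doc, q2)  # the document shares "john" and "snow" with q2
```

The two queries are orthogonal because they have no term in common, yet the document has a substantial cosine similarity to each of them. Each query lives on its own set of dimensions, and the document has weight on both sets.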