Introduction to Information Retrieval

- Only about 4% of user query sessions used the relevance feedback option
  - Expressed as a “More like this” link next to each result
- But about 70% of users only looked at the first page of results and didn’t pursue things further
  - So 4% is about 1/8 of people extending their search
- Relevance feedback improved results about 2/3 of the time

- Pseudo-relevance feedback automates the “manual” part of true relevance feedback.
- Pseudo-relevance algorithm:
  - Retrieve a ranked list of hits for the user’s query
  - Assume that the top k documents are relevant.
  - Do relevance feedback (e.g., Rocchio)
- Works very well on average
- But can go horribly wrong for some queries.
- Several iterations can cause query drift.
- Why?

Query Expansion (Sec. 9.2.2)

- In relevance feedback, users give additional input (relevant/non-relevant) on documents, which is used to reweight terms in the documents
- In query expansion, users give additional input (good/bad search term) on words or phrases

Query assist

Would you expect such a feature to increase the query volume at a search engine?

How do we augment the user query? (Sec. 9.2.2)

- Manual thesaurus
  - E.g. MedLine: physician, syn: doc, doctor, MD, medico
  - Can be query rather than just synonyms
- Global Analysis: (static; of all documents in collection)
  - Automatically derived thesaurus (co-occurrence statistics)
  - Refinements based on query log mining
    - Common on the web
- Local Analysis: (dynamic)
  - Analysis of documents in result set

Example of manual thesaurus (Sec. 9.2.2)

Thesaurus-based query expansion (Sec. 9.2.2)

- For each term, t, in a query, expand the query with synonyms and related words of t from the thesaurus
  - feline → feline cat
- May weight added terms less than original query terms.
- Generally increases recall
- Widely used in many science/engineering fields
- May significantly decrease precision, particularly with ambiguous terms.
  - “interest rate” → “interest rate fascinate evaluate”
- There is a high cost of manually producing a thesaurus ...
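The thesaurus-based expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy `THESAURUS` dictionary, the function name `expand_query`, and the `added_weight` value of 0.5 are all hypothetical and do not come from the slides, which only say that added terms may be weighted less than the original query terms.

```python
# Hypothetical toy thesaurus; a real one (e.g. a manually curated
# medical thesaurus) would be far larger and costly to build.
THESAURUS = {
    "feline": ["cat"],
    "interest": ["fascinate"],
    "rate": ["evaluate"],
}


def expand_query(query, thesaurus, added_weight=0.5):
    """Return a {term: weight} dict for the expanded query.

    Original query terms get weight 1.0; terms pulled in from the
    thesaurus get the (assumed) lower weight `added_weight`.
    """
    original_terms = query.lower().split()
    weights = {term: 1.0 for term in original_terms}
    for term in original_terms:
        for related in thesaurus.get(term, []):
            # setdefault never lowers the weight of an original term
            weights.setdefault(related, added_weight)
    return weights


print(expand_query("feline", THESAURUS))
print(expand_query("interest rate", THESAURUS))
```

The second call reproduces the precision problem from the slide: “interest rate” picks up “fascinate” and “evaluate”, which belong to other senses of the two words. Down-weighting the added terms softens this effect but does not remove it.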
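The pseudo-relevance feedback steps listed earlier in these slides (retrieve a ranked list, assume the top k hits are relevant, apply Rocchio) can also be sketched in Python. This is a minimal sketch, not a reference implementation: it assumes raw term-frequency vectors and cosine ranking, uses only the positive part of the Rocchio update (no non-relevant documents are known), and the names `pseudo_relevance_feedback`, `alpha`, `beta`, and the default values are illustrative assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (an assumption; tf-idf is more usual)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Document indices by descending cosine score, ties by index."""
    scored = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


def pseudo_relevance_feedback(query, docs, k=2, alpha=1.0, beta=0.75):
    doc_vecs = [tf_vector(d) for d in docs]
    q = {t: float(w) for t, w in tf_vector(query).items()}
    # Step 1: retrieve a ranked list for the original query.
    # Step 2: assume the top k documents are relevant.
    top = rank(q, doc_vecs)[:k]
    # Step 3: Rocchio update with positive feedback only:
    # q_new = alpha * q + beta * centroid(top documents)
    new_q = {t: alpha * w for t, w in q.items()}
    for i in top:
        for t, w in doc_vecs[i].items():
            new_q[t] = new_q.get(t, 0.0) + beta * w / len(top)
    return new_q, rank(new_q, doc_vecs)


docs = [
    "jaguar cat feline habitat",
    "jaguar car engine speed",
    "cat feline pet food",
    "stock market interest rate",
]
new_query, ranking = pseudo_relevance_feedback("feline", docs)
print(new_query)
print(ranking)
```

In this toy collection the expanded query gains “jaguar” from the first document, so the unrelated car document now receives a non-zero score. Repeating the loop on such a query is one way the drift mentioned in the slides can arise.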
