Information Retrieval: Evaluation
Fernando Diaz
Yahoo! Labs
April 5, 2011
Outline
- Introduction to Evaluation
- Batch Evaluation
- Production Test
Introduction to Evaluation
Information Retrieval

Given a query and a corpus, find relevant documents.

- query: the user's expression of the information need
- corpus: the repository of retrievable items
- relevance: satisfaction of the information need
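To make the abstraction concrete, here is a minimal sketch (not from the slides): a retrieval system scores every document in the corpus against the query and returns a ranked list. The retrieve function and its term-overlap scorer are hypothetical stand-ins for a real ranking model such as BM25.

    # Minimal sketch of the retrieval abstraction: score each document in the
    # corpus against the query, then return the top-k by score. Term overlap
    # is a toy stand-in for a real relevance model (e.g., BM25).
    def retrieve(query, corpus, k=10):
        q_terms = set(query.lower().split())
        def score(doc):
            return len(q_terms & set(doc.lower().split()))  # shared terms
        return sorted(corpus, key=score, reverse=True)[:k]

    docs = ["information retrieval evaluation",
            "cooking recipes for beginners",
            "evaluating search systems with test queries"]
    print(retrieve("evaluation of retrieval systems", docs, k=2))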
Evaluation

Evaluation is a fundamental issue in information retrieval: we can spend days discussing algorithms, but we need to quantify whether they are good. Given a task, is system A better than system B?

There are many methods of information retrieval evaluation:
- User study
- Batch study
- Production test

Each kind of evaluation experiment has its own benefits, and usually we conduct many types of evaluations before making a claim about a system.
User Study

Method: Provide a small group of users with several retrieval systems and ask them to complete several search tasks; interview the users afterward to learn about system performance.

Advantages
- Very detailed data about users' reactions to the systems.
- Can leverage experimental methodology from psychology.

Disadvantages
- Costly to run (paying users, scientist time, data coding).
- Difficult to generalize from small studies to broad populations.
- Laboratory experiments are often not representative of the normal user context.
- Need to rerun the experiment whenever a new system is being considered.
Batch Study

Method: Gather a small pool of 'test queries' and judge the relevance of documents in the corpus; compare systems on their ability to rank relevant documents above non-relevant documents.

Advantages
- Allows repeatable experiments; we can compare systems on the same queries and judgments.
- Can construct data sets large enough to conduct significance tests on performance metrics.

Drawbacks
- Costly to get judgments (paying editors).
- Judgments are gathered in a synthetic environment, often not by the users who generated the queries.
- Assumes relevance is the same across users.

Batch studies underlie the majority of information retrieval evaluation (the core evaluation in much of TREC).
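The slides do not fix a particular performance metric here; as a hypothetical illustration, the sketch below computes average precision (AP), one common batch metric, from a system's ranking and a set of editorial judgments, and uses it to compare two systems on the same query and judgments (the repeatable-experiment property noted above).

    # Hypothetical sketch of batch scoring: average precision (AP) of a ranked
    # list, given a set of judged-relevant document ids.
    def average_precision(ranking, relevant):
        hits, precision_sum = 0, 0.0
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in relevant:
                hits += 1
                precision_sum += hits / rank  # precision at this relevant doc
        return precision_sum / len(relevant) if relevant else 0.0

    # Same query and judgments, two systems: a repeatable comparison.
    judgments = {"d1", "d4"}
    print(average_precision(["d1", "d2", "d3", "d4"], judgments))  # system A: 0.75
    print(average_precision(["d2", "d1", "d4", "d3"], judgments))  # system B: ~0.58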
Production Test

Method: In a production system, have x% of the traffic use system A and y% use system B; compare the systems' effects on logged user interaction.

Advantages
- System usage is naturalistic; users are not situated in a lab and often are not aware that a test is being conducted.
- Can construct very large data sets.

Drawbacks
- Requires a very good understanding of how to interpret positive and negative user experience from logging data (do we want to just measure user retention? clicks?).
- Experiments are very difficult to repeat.

Increasingly, this is how real-world systems are evaluated.
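The slide does not prescribe how traffic is split; a common scheme (shown here as a hypothetical sketch, with assign_system and the bucket percentages invented for illustration) is to hash a stable user identifier into buckets, so each user consistently sees the same system for the life of the test.

    # Hypothetical sketch of traffic splitting for a production test: hash a
    # stable user id into 100 buckets so each user always sees the same system.
    import hashlib

    def assign_system(user_id, pct_a=5, pct_b=5):
        bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
        if bucket < pct_a:
            return "A"        # x% of traffic uses system A
        if bucket < pct_a + pct_b:
            return "B"        # y% of traffic uses system B
        return "control"      # everyone else sees the current system

    # Deterministic: the same user always lands in the same bucket.
    print(assign_system("user-42"), assign_system("user-42"))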