class06-eval

Today: Evaluation
• Why evaluation?
• Evaluating a search engine
• Unranked and Ranked evaluation
• Evaluation benchmarks

IR is an experimental science
• Formulate a research question: the hypothesis
• Design an experiment to answer the question
• Perform the experiment
• Compare with a baseline “control”
• Does the experiment answer the question?
• Are the results significant? Or is it just luck?
• Report the results!

Research questions
• Does stemming improve retrieval performance?
  • Experiment: Build a “stemmed” index and compare against an “unstemmed” baseline
• Does expanding the query with synonyms improve retrieval performance?
  • Experiment: Expand queries with synonyms and compare against baseline unexpanded queries

Research questions
• Does keyword highlighting help users evaluate document relevance?
  • Experiment: Build two different interfaces, one with highlighting, one without; run a user study
• Is letting users weight search terms a good idea?
  • Experiment: Build two different interfaces, one with term weighting, one without; run a user study

The importance of evaluation
• The ability to measure differences underlies experimental science
  • How well do our systems work?
  • Is A better than B?
  • Really?
  • Under what conditions?
• Evaluation drives what to research
  • Identify techniques that work and those that don’t

What we look for in evaluations ...
• Insightful
• Affordable
• Repeatable
• Explainable

Evaluating a Search Engine

Measures for a search engine
• How fast does it index
  • Number of documents/hour
  • (Average document size)
• How fast does it search
  • Latency as a function of index size
• Expressiveness of query language
  • Ability to express complex information needs
  • Speed on complex queries

Measures for a search engine
• All of the preceding criteria are measurable: we can quantify speed/size; we can make expressiveness precise
• The key measure: user happiness
  • What is this?
  • Speed of response/size of index are factors
  • But blindingly fast, useless answers won’t make a user happy
• Need a way of quantifying user happiness

Measuring user happiness
• Issue: who is the user we are trying to make happy?
  • Depends on the setting
• Web engine: user finds what they want and returns to the engine
  • Can measure rate of return users
• eCommerce site: user finds what they want and makes a purchase
  • Is it the end-user, or the eCommerce site, whose happiness we measure?
  • Measure time to purchase, or fraction of searchers who become buyers?

Measuring user happiness
• Enterprise (company/govt/academic): Care about “user productivity”
  • How much time do my users save when looking for information?...
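
The slides ask “Are the results significant? Or is it just luck?” but this preview does not say which test the course uses. As one possible illustration, the sketch below applies a paired randomization (sign-flip) test to per-query scores from a system and its baseline, for example a stemmed versus an unstemmed index. The function name `paired_randomization_test`, the choice of test, and the scores are all assumptions made for illustration; none of them come from the slides.

```python
import random


def paired_randomization_test(system_scores, baseline_scores, trials=10000, seed=0):
    """Estimate a two-sided p-value for the mean per-query score difference.

    Hypothetical helper, not from the slides. Under the null hypothesis
    that the two systems are interchangeable, the sign of each per-query
    difference is arbitrary, so we flip signs at random and count how
    often the absolute mean difference is at least as large as observed.
    """
    if len(system_scores) != len(baseline_scores):
        raise ValueError("both systems must be scored on the same queries")
    if not system_scores:
        raise ValueError("need at least one query")

    diffs = [s - b for s, b in zip(system_scores, baseline_scores)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    at_least_as_extreme = 0
    for _ in range(trials):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed:
            at_least_as_extreme += 1

    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


# Made-up per-query effectiveness scores for ten queries (illustrative only).
baseline = [0.20, 0.35, 0.10, 0.50, 0.42, 0.31, 0.25, 0.60, 0.15, 0.45]
stemmed = [0.28, 0.41, 0.19, 0.55, 0.50, 0.40, 0.33, 0.64, 0.27, 0.52]

p_value = paired_randomization_test(stemmed, baseline)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests the improvement over the baseline is unlikely to be luck alone. The test is paired because both systems are scored on the same queries, which matches the “compare with a baseline” step in the experimental recipe.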