lecture8-evaluation-handout-6-per

Most sophisticated: NLP used to synthesize a summary



rank 10 right.

Sec. 8.6.3
NDCG (Normalized Cumulative Discounted Gain)

Search engines also use non-relevance-based measures.
  Clickthrough on first result
    Not very reliable if you look at a single clickthrough … but pretty reliable in the aggregate.
  Studies of user behavior in the lab
  A/B testing

Purpose: Test a single innovation
Prerequisite: You have a large search engine up and running.
  Have most users use old system
  Divert a small proportion of traffic (e.g., 1%) to the new system that includes the innovation
  Evaluate with an “automatic” measure like clickthrough on first result
  Now we can directly see if the innovation does improve user happiness.
  Probably the evaluation methodology that large search engines trust most
  In principle less powerful than doing a multivariate regression analysis, but easier to understand

Introduction to Information Retrieval
Sec. 8.7

Result Summaries
  Having ranked the documents matching a query, we wish to present a results list
  Most commonly, a list of the document titles plus a short summary, aka “10 blue links”

RESULTS PRESENTATION
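The A/B testing procedure described earlier (divert a small proportion of traffic to the new system, then compare an automatic measure such as clickthrough on the first result) can be sketched roughly as follows. This is a minimal illustration, not how any particular search engine does it. The function names, the hash-based bucketing scheme, and the use of a two-proportion z statistic to compare the two clickthrough rates are all assumptions added here; the slides only name the measure, not the comparison method.

```python
import hashlib
import math


def assign_bucket(user_id: str, new_fraction: float = 0.01) -> str:
    """Deterministically assign a user to the 'old' or 'new' system.

    Hypothetical scheme: hash the user id into 10,000 slots and send
    the lowest `new_fraction` share of slots to the new system.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    slot = int(digest, 16) % 10_000
    threshold = round(new_fraction * 10_000)
    return "new" if slot < threshold else "old"


def clickthrough_rate(clicks: int, impressions: int) -> float:
    """Fraction of result pages on which the first result was clicked."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def two_proportion_z(clicks_old: int, n_old: int,
                     clicks_new: int, n_new: int) -> float:
    """z statistic for the difference in clickthrough rates (new minus old).

    Assumed comparison method: a pooled two-proportion z-test.
    A large positive value suggests the new system gets more first-result
    clicks than chance alone would explain.
    """
    p_old = clickthrough_rate(clicks_old, n_old)
    p_new = clickthrough_rate(clicks_new, n_new)
    pooled = (clicks_old + clicks_new) / (n_old + n_new)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if std_err == 0:
        return 0.0  # both rates are 0 or 1; no measurable difference
    return (p_new - p_old) / std_err
```

In use, each incoming query would be routed with `assign_bucket`, first-result clicks would be logged per bucket, and `two_proportion_z` would be computed over the aggregated counts. Aggregation matters here because, as noted above, a single clickthrough is not very reliable on its own.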

This document was uploaded on 02/26/2014.
