lecture8-evaluation-handout-6-per

# Impact on absolute performance measure can be

Introduction to Information Retrieval

Kappa Example (Sec. 8.5)

- P(A) = 370/400 = 0.925
- P(nonrelevant) = (10 + 20 + 70 + 70)/800 = 0.2125
- P(relevant) = (10 + 20 + 300 + 300)/800 = 0.7875
- P(E) = 0.2125^2 + 0.7875^2 = 0.665
- Kappa = (0.925 - 0.665)/(1 - 0.665) = 0.776
- Kappa > 0.8: good agreement
- 0.67 < Kappa < 0.8: "tentative conclusions" (Carletta '96)
- Depends on purpose of study
- For >2 judges: average pairwise kappas

TREC

- The TREC Ad Hoc task from the first 8 TRECs is the standard IR task
- 50 detailed information needs a year
- Human evaluation of pooled results returned
- More recently other related things: Web track, HARD

A TREC query (TREC 5)

<top>
<num> Number: 225
<desc> Description: What is the main function of the Federal Emergency Management Agency (FEMA) and the funding level provided to meet emergencies? Also, what resources are available to FEMA such as people, equipment, facilities?
</top>

Standard relevance benchmarks: Others (Sec. 8.2)
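The arithmetic in the Kappa Example can be reproduced with a short sketch. It assumes the 2x2 judgment table implied by the sums above: 300 documents both judges marked relevant, 70 both marked nonrelevant, and 20 and 10 on which they disagreed. Which judge accounts for the 20 and which for the 10 is not stated here and does not affect the result, because the marginals are pooled over both judges. The function and parameter names are illustrative only.

```python
def pooled_kappa(both_rel, both_nonrel, only_judge1_rel, only_judge2_rel):
    """Kappa for two judges using marginals pooled across both judges."""
    total = both_rel + both_nonrel + only_judge1_rel + only_judge2_rel
    disagreements = only_judge1_rel + only_judge2_rel

    # P(A): observed proportion of documents on which the judges agree.
    p_agree = (both_rel + both_nonrel) / total

    # Pooled marginals: each document contributes two judgments,
    # so the denominator is 2 * total (800 in the example).
    p_rel = (2 * both_rel + disagreements) / (2 * total)
    p_nonrel = (2 * both_nonrel + disagreements) / (2 * total)

    # P(E): agreement expected by chance.
    p_chance = p_rel ** 2 + p_nonrel ** 2

    return (p_agree - p_chance) / (1 - p_chance)


# Counts assumed from the sums in the example (which judge has 20 vs 10 is a guess).
kappa = pooled_kappa(both_rel=300, both_nonrel=70,
                     only_judge1_rel=20, only_judge2_rel=10)
print(round(kappa, 3))
```

With these counts the result rounds to 0.776, which falls in the 0.67 to 0.8 "tentative conclusions" band. For more than two judges, the same function could be applied to each pair of judges and the results averaged.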
