
lesson7.2_notes

Course: INLS 490, Fall 2008
School: UNC


INLS 490-154: Information Retrieval Systems Design & Implementation. Spring 2009.

7.2. Evaluation-1

Chirag Shah
School of Information & Library Science (SILS)
UNC Chapel Hill NC 27599
chirag@unc.edu

These notes for INLS 490-154 Spring 2009 by Chirag Shah (http://www.unc.edu/chirags) are licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

1 Introduction

While an ideal IR system should retrieve the relevant information for any information request by any user in any situation, no system satisfies all of these requirements. The question is then how well a given system performs relative to expectations or, better yet, how well it performs compared to some other system. Measuring the retrieval performance of an IR system has been one of the biggest challenges for decades. In most situations, it is hard to evaluate an IR system without user judgments.

Here we will look at some of the ways in which we can talk about the goodness of an IR system. To keep things simple, we will only consider objective relevance, i.e., whether a retrieved document is relevant or not. Also, the only context we will consider is the topic of the information need, and not the situation or other such factors.

2 Recall and precision revisited

To begin our discussion, let us review the notions of recall and precision that we have seen before. Figure 1 gives a Venn diagram showing the set of relevant documents (R) for an information need and the set of documents retrieved (R') by some IR system. Recall is the proportion of the relevant documents that are returned, and precision is the proportion of the returned documents that are relevant. Using Figure 1, this can be formulated as

$$\text{Recall} = \frac{|R \cap R'|}{|R|} \tag{1}$$

$$\text{Precision} = \frac{|R \cap R'|}{|R'|} \tag{2}$$

Figure 1: A model to understand recall and precision in IR

Now the question is how we can extend these definitions, which are based on sets, to the ranked lists that we usually get at the end of a retrieval process.

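The set-based definitions in Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `recall_precision` and the document IDs are made up for the example and do not come from the notes.

```python
def recall_precision(relevant, retrieved):
    """Set-based recall and precision, following Equations (1) and (2)."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = relevant & retrieved  # the intersection of R and R'
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical document IDs, for illustration only.
relevant = {"d1", "d2", "d3", "d4", "d5"}
retrieved = {"d1", "d3", "d7", "d9"}

recall, precision = recall_precision(relevant, retrieved)
print(recall, precision)  # 0.4 0.5
```

Here two of the five relevant documents are retrieved (recall 2/5), and two of the four retrieved documents are relevant (precision 2/4).
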
We can create sets from the ranked lists, for which we have several options, and measure recall and precision on each set. These options are given below.

1. At every new document.
2. At every new relevant document.
3. At a fixed rank cutoff, such as measuring precision at rank 10.
4. At fixed recall points, such as measuring precision at 20% recall.

Let us understand this through an example. Figure 2 shows rankings produced by two different systems or algorithms. For each of these rankings, recall and precision are calculated at every document. Similarly, we can compute these values at the other points listed above.

3 Single value measures

Looking at Figure 2, it is not clear which ranking is better, as both have the same recall and precision values at the end of the list. Of course, which one is better depends on the task at hand, but it is often useful to come up with one final number indicating retrieval effectiveness. One simple way of doing this is to average precision values. Average precision is calculated by averaging the precision values at the points where recall increases. In Figure 2, these points are indicated by circles on the recall values. If we take the precision values at those points and average them, we get 62.2% for Ranking #1 and 52.0% for Ranking #2 as average precision. Thus, using this measure we can immediately say that Ranking #1 is better than Ranking #2.

Figure 2: Calculating recall and precision with rank lists (Courtesy: James Allan, UMass Amherst)

Often we have a number of queries to evaluate for a given system. For each query, we can calculate average precision, and if we take the average of those averages for a given system, we get Mean Average Precision (MAP), which is a very popular measure for comparing two systems.

Another such single value measure is R-precision. It is defined as the precision after R documents have been retrieved, where R is the total number of relevant documents for a given query. Average precision and R-precision have been shown to be highly correlated.

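The ranked-list measures above can be sketched as follows. Figure 2 itself is not reproduced here, so the two rankings below are hypothetical: they were chosen so that the results agree, up to rounding, with the values quoted in the text. The function names are also assumptions made for this example. Note that `average_precision` divides by the total number of relevant documents, which equals the number of points where recall increases only when every relevant document appears in the ranking.

```python
def precision_recall_at_ranks(ranking, num_relevant):
    """Return (recall, precision) after each rank of a binary relevance list."""
    points, hits = [], 0
    for rank, is_relevant in enumerate(ranking, start=1):
        hits += is_relevant
        points.append((hits / num_relevant, hits / rank))
    return points


def precision_at_k(ranking, k):
    """Precision at a fixed rank cutoff k."""
    return sum(ranking[:k]) / k


def average_precision(ranking, num_relevant):
    """Average the precision values at the ranks where recall increases."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / num_relevant if num_relevant else 0.0


def r_precision(ranking, num_relevant):
    """Precision after R documents, where R is the number of relevant documents."""
    return sum(ranking[:num_relevant]) / num_relevant if num_relevant else 0.0


def mean_average_precision(runs):
    """Mean of per-query average precision; runs holds (ranking, num_relevant) pairs."""
    return sum(average_precision(r, n) for r, n in runs) / len(runs)


# Hypothetical rankings (1 = relevant, 0 = non-relevant), NOT copied from Figure 2.
ranking_1 = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]
ranking_2 = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0]

print(round(average_precision(ranking_1, 5), 3))  # 0.622
print(round(average_precision(ranking_2, 5), 3))  # 0.519
print(r_precision(ranking_1, 5), r_precision(ranking_2, 5))  # 0.4 0.4
```

Both hypothetical rankings end at recall 1.0 and precision 0.5, yet their average precision differs, which is the point the single value measure is meant to capture.
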
In Figure 2, since the number of relevant documents (R) is 5, R-precision for both rankings is 0.4 (the value of precision after 5 documents have been retrieved).

4 Evaluating using trec_eval

Let us now see how we can use trec_eval, a utility developed by NIST, to compute the above measures. One important requirement for computing any of these measures is the availability of relevance judgments. As a part of the TREC runs every year, NIST provides such relevance judgments for a large number of documents, made by human assessors. This file is formatted as

<topic id> 0 <doc id> <relevance>

where the first column has the topic number, the second column is redundant (but still preserved for historical reasons) and carries the value 0, the third column has the document ID, and the fourth column has the relevance judgment: 0 for non-relevant and 1 for relevant.

Topics are also provided by NIST. Each topic has several components: title, description, and narrative. These componen...
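
The four-column relevance judgment format described above can be read with a short Python sketch. The function name `parse_qrels` and the sample topic and document IDs are hypothetical, chosen only to show the column layout; this is not part of trec_eval itself.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Map each topic ID to the set of document IDs judged relevant."""
    relevant = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        topic_id, _unused, doc_id, relevance = fields
        if int(relevance) > 0:
            relevant[topic_id].add(doc_id)
    return dict(relevant)


# Made-up judgments in the "<topic id> 0 <doc id> <relevance>" layout.
sample = [
    "301 0 DOC-A 1",
    "301 0 DOC-B 0",
    "301 0 DOC-C 1",
    "302 0 DOC-B 1",
]

qrels = parse_qrels(sample)
print(sorted(qrels["301"]))  # ['DOC-A', 'DOC-C']
```

In this sketch a topic with no relevant documents does not appear in the result, so callers would need to handle a missing key.
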
