Evaluation_And_Redundancy

1 Practical Remarks: the problem of near duplicates or exact duplicates

We are making some changes this semester in the way that we ask you to evaluate systems for your term paper. These changes are intended to recognize the fact that a search engine may return several snippets which link to different instances of exactly [or essentially] the same information. Each instance is, in the most technical sense, "relevant". But only the first one is useful to you.

Here is an example. We use capital letters to represent relevant web pages, and lower case letters to represent non-relevant web pages [as represented by their snippets]. If the first relevant item in the list is called "A", we can call the second instance of it A1, and the third A2. Here is what might happen when we are comparing just two systems.

System S1 S2 x A3 A d g e A2 B2 h A4 B g m C A3 A B2 m x2 g

Calculation rules. For computing precision, assume that only the first instance of each relevant page is counted as relevant for each search engine. Let's count all the others as "not relevant". So it is as if the table really looked like this:

System S1 S2 x A3 A d g e A2 B2 h A4 B g M C A3 A B2 m x2 g
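The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (base_page, is_relevant, first_instance_flags, precision) and the ranked list ranked_example are hypothetical and are not taken from the table, and precision is assumed to be the number of counted items divided by the length of the ranked list.

```python
import re

def base_page(label: str) -> str:
    """Strip a trailing instance number, e.g. 'A2' -> 'A', 'x' -> 'x'."""
    return re.sub(r"\d+$", "", label)

def is_relevant(label: str) -> bool:
    """Capital letters mark relevant pages, lower case marks non-relevant ones."""
    return base_page(label).isupper()

def first_instance_flags(ranked):
    """For each ranked item, True only if it is the first instance of a relevant page."""
    seen = set()
    flags = []
    for label in ranked:
        page = base_page(label)
        counts = is_relevant(label) and page not in seen
        if counts:
            seen.add(page)
        flags.append(counts)
    return flags

def precision(ranked, dedupe=True):
    """Precision over the whole list, with or without the first-instance rule."""
    if not ranked:
        return 0.0
    if dedupe:
        hits = sum(first_instance_flags(ranked))
    else:
        hits = sum(is_relevant(label) for label in ranked)
    return hits / len(ranked)

# Hypothetical ranked list for one search engine (not the table above).
ranked_example = ["x", "A", "g", "A1", "B", "h", "A2", "B1"]
print(precision(ranked_example, dedupe=False))  # every instance counted
print(precision(ranked_example))                # only first instances counted
```

In this made-up list, five of the eight items are relevant in the technical sense, but only two of them (the first A and the first B) count under the rule.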
We would like to calculate the set-based similarity (sometimes called "overlap" or "Dice coefficient") for the relevant items that have been retrieved. We have used strikethrough
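As a rough illustration of a set-based similarity between two systems, the sketch below assumes the standard Dice coefficient, 2|X ∩ Y| / (|X| + |Y|), applied to the sets of distinct relevant pages each system retrieved. The exact formula used in the course is not visible in this excerpt, so treat the definition, the function names (relevant_pages, dice), and the two ranked lists as assumptions.

```python
import re

def relevant_pages(ranked):
    """Set of distinct relevant pages in a ranked list (duplicates collapse to one)."""
    pages = set()
    for label in ranked:
        page = re.sub(r"\d+$", "", label)  # 'A2' -> 'A'
        if page.isupper():                 # capital letter = relevant
            pages.add(page)
    return pages

def dice(set_a, set_b):
    """Standard Dice coefficient; defined here as 0.0 when both sets are empty."""
    if not set_a and not set_b:
        return 0.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))

# Hypothetical result lists for two systems (not the table above).
s1 = ["x", "A", "g", "A1", "B", "h"]
s2 = ["A2", "d", "C", "B1", "e", "A"]

rel1 = relevant_pages(s1)  # {'A', 'B'}
rel2 = relevant_pages(s2)  # {'A', 'B', 'C'}
print(dice(rel1, rel2))
```

Because sets ignore repeats, a duplicate such as A2 contributes nothing beyond the first instance of A, which matches the first-instance counting rule described earlier.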