jurafsky&martin_3rdEd_17 (1).pdf

The merging and ranking is actually run iteratively: first the candidates are ranked by the classifier, giving a rough first value for each candidate answer; then that value is used to decide which of the variants of a name to select as the merged answer; then the merged answers are re-ranked.

In summary, we've seen in the four stages of DeepQA that it draws on the intuitions of both the IR-based and knowledge-based paradigms. Indeed, Watson's architectural innovation is its reliance on proposing a very large number of candidate answers from both text-based and knowledge-based sources and then developing a wide variety of evidence features for scoring these candidates, again both text-based and knowledge-based. Of course the Watson system has many more components for dealing with rare and complex questions, and for strategic decisions in playing Jeopardy!; see the papers mentioned at the end of the chapter for many more details.

27.4 Evaluation of Factoid Answers

A common evaluation metric for factoid question answering, introduced in the TREC Q/A track in 1999, is mean reciprocal rank, or MRR. MRR assumes a test set of questions that have been human-labeled with correct answers. MRR also assumes that systems are returning a short ranked list of answers or passages containing answers. Each question is then scored according to the reciprocal of the rank of the first correct answer. For example, if the system returned five answers but the first three are wrong, and hence the highest-ranked correct answer is ranked fourth, the reciprocal rank score for that question would be 1/4. Questions with return sets that do not contain any correct answers are assigned a zero. The score of a system is then the average of the score for each question in the set. More formally, for an evaluation of a system returning a set of ranked answers for a test set consisting of
N questions, the MRR is defined as

MRR = \frac{1}{N} \sum_{\substack{i=1 \\ \mathrm{rank}_i \neq 0}}^{N} \frac{1}{\mathrm{rank}_i} \qquad (27.9)

A number of test sets are available for question answering. Early systems used the TREC QA dataset; questions and hand-written answers for TREC competitions from 1999 to 2004 are publicly available. FREE917 (Cai and Yates, 2013) has 917 questions manually created by annotators, each paired with a meaning representation; example questions include:

How many people survived the sinking of the Titanic?
What is the average temperature in Sydney in August?
When did Mount Fuji last erupt?

WEBQUESTIONS (Berant et al., 2013) contains 5,810 questions asked by web users, each beginning with a wh-word and containing exactly one entity. Questions are paired with hand-written answers drawn from the Freebase page of the question's entity, and were extracted from Google Suggest by breadth-first search (start with a seed question, remove some words, use Google Suggest to suggest likely alternative question candidates, remove some words, etc.). Some examples:

What character did Natalie Portman play in Star Wars?
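The MRR computation above is straightforward to implement. The following is a minimal sketch (the function name and the `is_correct` callback are our own, not from the text): for each question it finds the rank of the first correct answer in the system's ranked list, adds its reciprocal (or 0 if no answer is correct), and averages over the test set.

```python
def mean_reciprocal_rank(ranked_answer_lists, is_correct):
    """Compute MRR per Eq. 27.9.

    ranked_answer_lists: one ranked list of candidate answers per question.
    is_correct: callable (question_index, answer) -> bool, checking an
        answer against the human-labeled gold answers for that question.
    """
    total = 0.0
    for i, answers in enumerate(ranked_answer_lists):
        for rank, answer in enumerate(answers, start=1):
            if is_correct(i, answer):
                total += 1.0 / rank  # reciprocal rank of first correct answer
                break                # only the first correct answer counts
        # questions with no correct answer in the return set contribute 0
    return total / len(ranked_answer_lists)
```

On the textbook's example (five returned answers, the first three wrong, the fourth correct), this yields a per-question score of 1/4.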