jurafsky&martin_3rdEd_17 (1).pdf

Therefore, the next stage is to extract a set of potential answer passages from the retrieved set of documents. The definition of a passage is necessarily system dependent, but the typical units include sections, paragraphs, and sentences. We might run a paragraph segmentation algorithm on all the returned documents and treat each paragraph as a segment.

We next perform passage retrieval. In this stage, we first filter out passages in the returned documents that don’t contain potential answers and then rank the rest according to how likely they are to contain an answer to the question. The first step in this process is to run a named entity or answer type classification on the retrieved passages. The answer type that we determined from the question tells us the possible answer types we expect to see in the answer. We can therefore filter out documents that don’t contain any entities of the right type.

The remaining passages are then ranked, usually by supervised machine learning, relying on a small set of features that can be easily extracted from a potentially large number of answer passages, such as:

  • The number of named entities of the right type in the passage
  • The number of question keywords in the passage
  • The longest exact sequence of question keywords that occurs in the passage
  • The rank of the document from which the passage was extracted
  • The proximity of the keywords from the original query to each other: For each passage identify the shortest span that covers the keywords contained in that passage. Prefer smaller spans that include more keywords (Pasca 2003, Monz 2004).
  • The N-gram overlap between the passage and the question: Count the N-grams in the question and the N-grams in the answer passages. Prefer the passages with higher N-gram overlap with the question (Brill et al., 2002).

For question answering from the Web, instead of extracting passages from all returned documents, we can rely on the Web search to do passage extraction for us. We do this by using snippets produced by the Web search engine as the returned passages. For example, Fig. 27.5 shows snippets for the first five documents returned from Google for the query When was movable type metal printing invented in Korea?

Figure 27.5 Five snippets from Google in response to the query When was movable type metal printing invented in Korea?

27.1.5 Answer Processing

The final stage of question answering is to extract a specific answer from the passage so as to be able to present the user with an answer like 29,029 feet to the question “How tall is Mt. Everest?”
Two classes of algorithms have been applied to the answer-extraction task, one based on answer-type pattern extraction and one based on N-gram tiling.
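
To make the passage-ranking features from the passage retrieval stage more concrete, here is a minimal Python sketch of four of them: keyword count, longest exact keyword sequence, shortest keyword span, and N-gram overlap. It is an illustration under stated assumptions, not the method of any particular system. The tokenizer, the STOPWORDS set, the function names, and the example passage are all hypothetical, and the "longest exact sequence" feature is interpreted here as the longest contiguous run of tokens shared by the keyword list and the passage. The named-entity count and the document rank are omitted because they depend on an entity tagger and a retrieval engine.

```python
from collections import Counter

# Hypothetical stopword list, used only to turn a question into keywords.
STOPWORDS = {"when", "was", "in", "the", "a", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def keyword_count(keywords, passage_tokens):
    """Number of distinct question keywords that occur in the passage."""
    return len(set(keywords) & set(passage_tokens))


def longest_keyword_sequence(keywords, passage_tokens):
    """Length of the longest contiguous token run shared by both sequences."""
    best = 0
    prev = [0] * (len(passage_tokens) + 1)
    for kw in keywords:
        cur = [0] * (len(passage_tokens) + 1)
        for j, tok in enumerate(passage_tokens, start=1):
            if kw == tok:
                cur[j] = prev[j - 1] + 1  # extend the run ending just before
                best = max(best, cur[j])
        prev = cur
    return best


def shortest_keyword_span(keywords, passage_tokens):
    """Token length of the shortest window covering every keyword that
    occurs in the passage; 0 if none occurs."""
    present = set(keywords) & set(passage_tokens)
    if not present:
        return 0
    counts = Counter()
    covered = 0
    left = 0
    best = len(passage_tokens)
    for right, tok in enumerate(passage_tokens):
        if tok in present:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Shrink the window from the left while it still covers everything.
        while covered == len(present):
            best = min(best, right - left + 1)
            dropped = passage_tokens[left]
            if dropped in present:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    covered -= 1
            left += 1
    return best


def ngram_overlap(question_tokens, passage_tokens, n=2):
    """Number of n-grams shared by question and passage (clipped counts)."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return sum((ngrams(question_tokens) & ngrams(passage_tokens)).values())


question = "When was movable type metal printing invented in Korea?"
passage = "Historians say movable type metal printing was invented in Korea."  # made-up passage

q_tokens = tokenize(question)
p_tokens = tokenize(passage)
keywords = [tok for tok in q_tokens if tok not in STOPWORDS]

features = {
    "keyword_count": keyword_count(keywords, p_tokens),
    "longest_keyword_sequence": longest_keyword_sequence(keywords, p_tokens),
    "shortest_keyword_span": shortest_keyword_span(keywords, p_tokens),
    "bigram_overlap": ngram_overlap(q_tokens, p_tokens, n=2),
}
print(features)
```

For this made-up passage the sketch yields a keyword count of 6, a longest keyword sequence of 4, a shortest span of 8 tokens, and a bigram overlap of 5. In a supervised ranker, values like these would be assembled into one feature vector per passage; since the text prefers smaller spans that include more keywords, the span length would normally be considered together with the keyword count rather than on its own.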