jurafsky&martin_3rdEd_17 (1).pdf

Finally, many texts describe recurring stereotypical situations. The task of template filling is to find such situations in documents and fill the template slots with appropriate material. These slot-fillers may consist of text segments extracted directly from the text, or concepts like times, amounts, or ontology entities that have been inferred from text elements through additional processing.

Our airline text is an example of this kind of stereotypical situation since airlines often raise fares and then wait to see if competitors follow along. In this situation, we can identify United as a lead airline that initially raised its fares, $6 as the amount, Thursday as the increase date, and American as an airline that followed along, leading to a filled template like the following.

FARE-RAISE ATTEMPT:
    LEAD AIRLINE:    UNITED AIRLINES
    AMOUNT:          $6
    EFFECTIVE DATE:  2006-10-26
    FOLLOWER:        AMERICAN AIRLINES

The following sections review current approaches to each of these problems.

21.1 Named Entity Recognition

The first step in information extraction is to detect the entities in the text. A named entity is, roughly speaking, anything that can be referred to with a proper name: a person, a location, an organization. The term is commonly extended to include things that aren’t entities per se, including dates, times, and other kinds of temporal expressions, and even numerical expressions like prices. Here’s the sample text introduced earlier with the named entities marked:

Citing high fuel prices, [ORG United Airlines] said [TIME Friday] it has increased fares by [MONEY $6] per round trip on flights to some cities also served by lower-cost carriers. [ORG American Airlines], a unit of [ORG AMR Corp.], immediately matched the move, spokesman [PER Tim Wagner] said. [ORG United], a unit of [ORG UAL Corp.], said the increase took effect [TIME Thursday] and applies to most routes where it competes against discount carriers, such as [LOC Chicago] to [LOC Dallas] and [LOC Denver] to [LOC San Francisco].
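The bracketed annotation in the sample text can be read mechanically. The following is a minimal sketch in Python, not a named entity recognizer: it assumes the entities have already been marked in the [TYPE text] style shown above, and it only recovers the typed mentions and tallies them by type. The names ANNOTATED, MENTION, and extract_mentions are illustrative and do not come from the text.

```python
import re
from collections import Counter

# The sample text, with entities already marked as [TYPE mention text].
ANNOTATED = (
    "Citing high fuel prices, [ORG United Airlines] said [TIME Friday] it has "
    "increased fares by [MONEY $6] per round trip on flights to some cities "
    "also served by lower-cost carriers. [ORG American Airlines], a unit of "
    "[ORG AMR Corp.], immediately matched the move, spokesman "
    "[PER Tim Wagner] said. [ORG United], a unit of [ORG UAL Corp.], said the "
    "increase took effect [TIME Thursday] and applies to most routes where it "
    "competes against discount carriers, such as [LOC Chicago] to [LOC Dallas] "
    "and [LOC Denver] to [LOC San Francisco]."
)

# An uppercase type tag, whitespace, then the mention text, inside brackets.
MENTION = re.compile(r"\[\s*([A-Z]+)\s+([^\]]+?)\s*\]")


def extract_mentions(text):
    """Return (entity_type, mention_text) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in MENTION.finditer(text)]


mentions = extract_mentions(ANNOTATED)
counts = Counter(entity_type for entity_type, _ in mentions)

for entity_type, n in counts.most_common():
    print(entity_type, n)
print("total mentions:", len(mentions))
```

Each mention keeps its type and its surface string, so repeated references such as "United Airlines" and "United" are counted as separate mentions rather than merged into one entity.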
Chapter 21: Information Extraction

The text contains 13 mentions of named entities including 5 organizations, 4 locations, 2 times, 1 person, and 1 mention of money.

In addition to their use in extracting events and the relationship between participants, named entities are useful for many other language processing tasks. In sentiment analysis we might want to know a consumer’s sentiment toward a particular entity. Entities are a useful first stage in question answering, or for linking text to information in structured knowledge sources like Wikipedia.

Figure 21.1 shows typical generic named entity types. Many applications will also need to use specific entity types like proteins, genes, commercial products, or works of art.