jurafsky&martin_3rdEd_17 (1).pdf

Chapter 17: Computing with Word Senses

The task of selecting the correct sense for a word is called word sense disambiguation, or WSD. Disambiguating word senses has the potential to improve many natural language processing tasks, including machine translation, question answering, and information retrieval.

WSD algorithms take as input a word in context along with a fixed inventory of potential word senses and return as output the correct word sense for that use. The input and the senses depend on the task. For machine translation from English to Spanish, the sense-tag inventory for an English word might be the set of different Spanish translations. If our task is automatic indexing of medical articles, the sense-tag inventory might be the set of MeSH (Medical Subject Headings) thesaurus entries. When we are evaluating WSD in isolation, we can use the set of senses from a dictionary/thesaurus resource like WordNet. Figure 17.5 shows an example for the word bass, which can refer to a musical instrument or a kind of fish. [2]

WordNet   Spanish       Roget
Sense     Translation   Category      Target Word in Context
bass4     lubina        FISH/INSECT   ...fish as Pacific salmon and striped bass and...
bass4     lubina        FISH/INSECT   ...produce filets of smoked bass or sturgeon...
bass7     bajo          MUSIC         ...exciting jazz bass player since Ray Brown...
bass7     bajo          MUSIC         ...play bass because he doesn't have to solo...

Figure 17.5 Possible definitions for the inventory of sense tags for bass.

[2] The WordNet database includes eight senses; we have arbitrarily selected two for this example; we have also arbitrarily selected one of the many Spanish fishes that could translate English sea bass.

It is useful to distinguish two variants of the generic WSD task. In the lexical sample task, a small pre-selected set of target words is chosen, along with an inventory of senses for each word from some lexicon. Since the set of words and the set of senses are small, supervised machine learning approaches are often used to handle lexical sample tasks. For each word, a number of corpus instances (context sentences) can be selected and hand-labeled with the correct sense of the target word in each. Classifier systems can then be trained with these labeled examples. Unlabeled target words in context can then be labeled using such a trained classifier. Early work in word sense disambiguation focused solely on lexical sample tasks of this sort, building word-specific algorithms for disambiguating single words like line, interest, or plant.

In contrast, in the all-words task, systems are given entire texts and a lexicon with an inventory of senses for each entry and are required to disambiguate every content word in the text. The all-words task is similar to part-of-speech tagging, except with a much larger set of tags, since each lemma has its own set. A consequence of this larger set of tags is a serious data sparseness problem; it is unlikely that adequate training data for every word in the test set will be available. Moreover, given
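The supervised lexical sample setup described above (hand-labeled contexts for one target word, a trained classifier, then labeling of unseen contexts) can be sketched in a few lines. This is only an illustrative sketch, not the method the chapter goes on to present: it assumes a bag-of-words Naive Bayes classifier with add-one smoothing, uses the four contexts from Figure 17.5 as its entire training set, and the names TRAINING, features, train, and classify are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hand-labeled contexts for the target word "bass", taken from Figure 17.5.
# Sense labels follow the WordNet sense numbers shown in that figure.
TRAINING = [
    ("bass4", "fish as Pacific salmon and striped bass and"),
    ("bass4", "produce filets of smoked bass or sturgeon"),
    ("bass7", "exciting jazz bass player since Ray Brown"),
    ("bass7", "play bass because he doesn't have to solo"),
]
TARGET = "bass"


def features(context):
    """Lowercased bag of context words, excluding the target word itself."""
    return [w for w in context.lower().split() if w != TARGET]


def train(examples):
    """Count senses and per-sense context words (assumed Naive Bayes model)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        for w in features(context):
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def classify(context, model):
    """Return the sense with the highest add-one-smoothed log probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in features(context):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAINING)
print(classify("grilled striped bass and salmon", model))  # bass4
print(classify("the jazz bass player", model))             # bass7
```

With only four training contexts the classifier relies entirely on a handful of overlapping words, which is a small-scale picture of the data sparseness problem: in the all-words task, most lemmas would have no labeled contexts at all.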