CONLL14 - Acquiring Knowledge from the Web to be used as...

CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning, pages 105–112, Manchester, August 2008

Acquiring Knowledge from the Web to be used as Selectors for Noun Sense Disambiguation

Hansen A. Schwartz and Fernando Gomez
School of Electrical Engineering and Computer Science
University of Central Florida
{hschwartz, gomez}

© 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license. Some rights reserved.

Abstract

This paper presents a method of acquiring knowledge from the Web for noun sense disambiguation. Words, called selectors, are acquired which take the place of an instance of a target word in its local context. The selectors serve for the system to essentially learn the areas or concepts of WordNet that the sense of a target word should be a part of. The correct sense is chosen based on a combination of the strength given from similarity and relatedness measures over WordNet and the probability of a selector occurring within the local context. Our method is evaluated using the coarse-grained all-words task from SemEval 2007. Experiments reveal that path-based similarity measures perform just as well as information content similarity measures within our system. Overall, the results show our system is out-performed only by systems utilizing training data or substantially more annotated data.

1 Introduction

Recently, the Web has become the focus for many word sense disambiguation (WSD) systems. Due to the limited amount of sense tagged data available for supervised approaches, systems which are typically referred to as unsupervised have turned to the use of unannotated corpora including the Web. The advantage of these systems is that they can disambiguate all words, and not just a set of words for which training data has been provided. In this paper we present an unsupervised system which uses the Web in a novel fashion to perform sense disambiguation of any noun, incorporating both similarity and relatedness measures.

As explained in (Brody et al., 2006), there are generally two approaches to unsupervised WSD. The first is referred to as token based, which compares the relatedness of a target word to other words in its context. The second approach is type based, which uses or identifies the most common sense of a word over a discourse or corpus, and annotates all instances of a word with the most common sense. Although the type based approach is clearly bound to fail occasionally, it is commonly found to produce the strongest results, rivaling supervised systems (McCarthy et al., 2004). We identify a third approach through the use of selectors, first introduced by (Lin, 1997), which help to disambiguate a word by comparing it to other words that may replace it within the same local context.

We approach the problem of word sense dis-

This note was uploaded on 03/06/2012 for the course CIS 630 taught by Professor Cis630 during the Spring '08 term at UPenn.
