on problem for the labels of each word,
with the surrounding words as input feature vector. A frequent way of forming
the feature vector is a binary encoding scheme. Each feature component can be
considered as a test that asserts whether a certain pattern occurs at a specific
position or not. For example, a feature component takes the value 1 if the
previous word is the word "John" and 0 otherwise. Of course, we may test not
only for the presence of specific words but also whether a word starts with a
capital letter, has a specific suffix, or has a specific part of speech. In this
way, the results of earlier analysis steps can be used as features.
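
The binary encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (extract_features, to_dense), the feature-name strings, the boundary markers "<START>" and "<END>", and the example sentence are all hypothetical and do not come from the original text. Only the active features (value 1) are stored, since every other component is implicitly 0.

```python
from typing import Dict, List


def extract_features(tokens: List[str], i: int) -> Dict[str, int]:
    """Return the active (value 1) binary features for the token at position i.

    Feature names are hypothetical; absent features are implicitly 0.
    """
    word = tokens[i]
    # Boundary markers are an assumption made for this sketch.
    prev_word = tokens[i - 1] if i > 0 else "<START>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "<END>"

    features = {
        f"prev_word={prev_word}": 1,          # e.g. previous word is "John"
        f"word={word.lower()}": 1,            # identity of the current word
        f"next_word={next_word}": 1,          # identity of the following word
        f"suffix3={word[-3:].lower()}": 1,    # last three characters
    }
    if word[:1].isupper():
        features["is_capitalized"] = 1        # word starts with a capital letter
    return features


def to_dense(features: Dict[str, int], feature_index: Dict[str, int]) -> List[int]:
    """Expand the sparse features into a 0/1 vector over a fixed feature index."""
    vec = [0] * len(feature_index)
    for name in features:
        j = feature_index.get(name)
        if j is not None:
            vec[j] = 1
    return vec


tokens = ["Yesterday", "John", "Smith", "visited", "Berlin"]
feats = extract_features(tokens, 2)  # features for the token "Smith"
```

Here the component "prev_word=John" is 1 for the token "Smith", while a component such as "prev_word=Mary" would be 0. With a realistic vocabulary the dense vector has a very large number of components, almost all of them 0, which is why the sparse form is usually the one kept in practice.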
Now we may employ any efficient classification method to predict the word
labels from the input feature vector. A good candidate is the Support Vector
Machine because of its ability to handle large sparse feature vectors efficiently.
Takeuchi & Collier (2002) used it to extract entities in the molecular biology

3.3.2 Hidden Markov Models

One problem of standard classification approaches is that...
This note was uploaded on 06/19/2011 for the course IT 2258 taught by Professor Aymenali during the Summer '11 term at Abu Dhabi University.