jurafsky&martin_3rdEd_17 (1).pdf


[Figure 12.8: the sentence "The morning flight from Denver has arrived" with a classifier window positioned over it. Corresponding feature representation: The, DT, B_NP, morning, NN, I_NP, flight, NN, from, IN, Denver, NNP. Label: I_NP.]

Figure 12.8 A sequential-classifier-based approach to chunking. The chunker slides a context window over the sentence, classifying words as it proceeds. At this point, the classifier is attempting to label flight. Features derived from the context typically include the words and part-of-speech tags, as well as the previously assigned chunk tags.

Figure 12.8 illustrates this scheme with the example given earlier. During training, the classifier would be provided with a training vector consisting of the values of 13 features: the two words to the left of the decision point, their parts-of-speech and chunk tags, the word to be tagged along with its part-of-speech, the two words that follow along with their parts-of-speech, and finally the correct chunk tag, in this case I_NP. During classification, the classifier is given the same vector without the answer and assigns the most appropriate tag from its tagset.

12.3.2 Chunking-System Evaluations

As with the evaluation of part-of-speech taggers, the evaluation of chunkers proceeds by comparing chunker output with gold-standard answers provided by human annotators. However, unlike part-of-speech tagging, word-by-word accuracy measures are not appropriate. Instead, chunkers are evaluated according to the notions of precision, recall, and the F-measure borrowed from the field of information retrieval.

Precision measures the percentage of system-provided chunks that were correct. Correct here means that both the boundaries of the chunk and the chunk's label are correct. Precision is therefore defined as

$$\text{Precision} = \frac{\text{Number of correct chunks given by system}}{\text{Total number of chunks given by system}}$$

Recall measures the percentage of chunks actually present in the input that were correctly identified by the system.
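Because a chunk counts as correct only when both its boundaries and its label match, scoring is done over whole labeled spans rather than over individual tags. The following is a minimal sketch of that idea, not a reference scorer. The function names, the tag sequences other than the first three gold tags, and the lenient treatment of a stray I_ tag as the start of a chunk are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# A chunk is (label, start index, end index exclusive).
Chunk = Tuple[str, int, int]


def iob_to_chunks(tags: List[str]) -> Set[Chunk]:
    """Convert IOB tags such as B_NP, I_NP, O into labeled spans."""
    chunks: Set[Chunk] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a final chunk
        prefix, _, kind = tag.partition("_")
        continues = prefix == "I" and kind == label
        if label is not None and not continues:
            chunks.add((label, start, i))
            start, label = None, None
        # Assumption: an I_ tag with no open chunk starts a new chunk.
        if prefix == "B" or (prefix == "I" and label is None):
            start, label = i, kind
    return chunks


def precision_recall(gold_tags: List[str], system_tags: List[str]):
    """Chunk-level precision and recall; 0.0 when a denominator is empty."""
    gold = iob_to_chunks(gold_tags)
    system = iob_to_chunks(system_tags)
    correct = len(gold & system)  # same label and same boundaries
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# "The morning flight from Denver has arrived"
# Hypothetical gold and system tags, chosen only to illustrate scoring.
gold = ["B_NP", "I_NP", "I_NP", "B_PP", "B_NP", "B_VP", "I_VP"]
system = ["B_NP", "I_NP", "B_NP", "B_PP", "B_NP", "B_VP", "I_VP"]

p, r = precision_recall(gold, system)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.60 recall=0.75
```

In this made-up case the system gets six of the seven tags right, yet one wrong tag splits a gold chunk into two incorrect system chunks. This is why word-by-word accuracy gives a misleading picture of chunker quality.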
Recall is defined as

$$\text{Recall} = \frac{\text{Number of correct chunks given by system}}{\text{Total number of actual chunks in the text}}$$

The F-measure (van Rijsbergen, 1975) provides a way to combine these two measures into a single metric. The F-measure is defined as

$$F_\beta = \frac{(\beta^2 + 1)PR}{\beta^2 P + R}$$

The $\beta$ parameter differentially weights the importance of recall and precision, based perhaps on the needs of an application. Values of $\beta > 1$ favor recall, while values of $\beta < 1$ favor precision. When $\beta = 1$, precision and recall are equally balanced; this is sometimes called $F_{\beta=1}$ or just $F_1$:

$$F_1 = \frac{2PR}{P + R} \qquad (12.9)$$

The F-measure comes from a weighted harmonic mean of precision and recall. The harmonic mean of a set of numbers is the reciprocal of the arithmetic mean of reciprocals:

$$\text{HarmonicMean}(a_1, a_2, a_3, a_4, \ldots, a_n) = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \frac{1}{a_3} + \cdots + \frac{1}{a_n}} \qquad (12.10)$$

and hence the F-measure is

$$F = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} \quad \text{or, with } \beta^2 = \frac{1 - \alpha}{\alpha}, \quad F = \frac{(\beta^2 + 1)PR}{\beta^2 P + R} \qquad (12.11)$$

Statistical significance results on sequence labeling tasks such as chunking can be computed using matched-pair tests such as McNemar's test, or variants such
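Returning to Equations 12.9 to 12.11, the effect of $\beta$ is easiest to see with numbers. The sketch below implements both forms of the F-measure and checks that they agree. The values P = 0.6 and R = 0.75 are purely illustrative and do not come from any experiment. The function names and the convention of returning 0.0 when the denominator is zero are assumptions.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F_beta = (beta^2 + 1) P R / (beta^2 P + R)."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0  # assumed convention when there is nothing to score
    return (b2 + 1) * precision * recall / denom


def f_from_alpha(precision: float, recall: float, alpha: float) -> float:
    """Weighted harmonic mean form: 1 / (alpha / P + (1 - alpha) / R)."""
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


P, R = 0.6, 0.75  # illustrative values only

f1 = f_measure(P, R)             # about 0.667
f2 = f_measure(P, R, beta=2.0)   # about 0.714, pulled toward recall
f05 = f_measure(P, R, beta=0.5)  # 0.625, pulled toward precision

# beta^2 = (1 - alpha) / alpha  implies  alpha = 1 / (beta^2 + 1)
alpha = 1.0 / (2.0 ** 2 + 1.0)
assert abs(f_from_alpha(P, R, alpha) - f2) < 1e-12

print(round(f1, 3), round(f2, 3), round(f05, 3))
```

Since recall is the larger of the two values here, raising $\beta$ above 1 moves the score up toward recall and lowering it below 1 moves the score down toward precision.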