jurafsky&martin_3rdEd_17 (1).pdf

A common way to use lexicons in the classifier is to add one feature for the total count of occurrences of any words in the positive lexicon, and a second feature for the total count of occurrences of words in the negative lexicon. Using just two features gives the classifier a far less sparse representation, which helps when training data is limited and may generalize better.

6.5 Naive Bayes as a Language Model

Naive Bayes classifiers can use any sort of feature: dictionaries, URLs, email addresses, network features, phrases, parse trees, and so on. But if, as in the previous section, we use only individual word features, and we use all of the words in the text (not a subset), then naive Bayes has an important similarity to language modeling. Specifically, a naive Bayes model can be viewed as a set of class-specific unigram language models, one for each class.

Since the likelihood features from the naive Bayes model assign a probability to each word, $P(\text{word} \mid c)$, the model also assigns a probability to each sentence:

$$P(s \mid c) = \prod_{i \in \text{positions}} P(w_i \mid c) \qquad (6.15)$$

Thus consider a naive Bayes model with the classes positive (+) and negative (-) and the following model parameters:

w       P(w | +)   P(w | -)
I       0.1        0.2
love    0.1        0.001
this    0.01       0.01
fun     0.05       0.005
film    0.1        0.1
...     ...        ...

Each of the two columns above instantiates a language model that can assign a probability to the sentence “I love this fun film”:

$$P(\text{“I love this fun film”} \mid +) = 0.1 \times 0.1 \times 0.01 \times 0.05 \times 0.1 = 0.0000005$$

$$P(\text{“I love this fun film”} \mid -) = 0.2 \times 0.001 \times 0.01 \times 0.005 \times 0.1 = 0.0000000010$$

As it happens, the positive model assigns a higher probability to the sentence: $P(s \mid \text{pos}) > P(s \mid \text{neg})$. Note that this is just the likelihood part of the naive Bayes model; once we multiply in the prior, a full naive Bayes model might well make a different classification decision.

6.6 Evaluation: Precision, Recall, F-measure

To introduce the methods for evaluating text classification, let’s first consider some simple binary detection tasks. For example, in spam detection, our goal is to label every text as being in the spam category (“positive”) or not in the spam category (“negative”). For each item (email document) we therefore need to know whether our system called it spam or not. We also need to know whether the email is actually spam or not, i.e. the human-defined labels for each document that we are trying to match. We will refer to these human labels as the gold labels.

Or imagine you’re the CEO of the Delicious Pie Company and you need to know what people are saying about your pies on social media, so you build a system that detects tweets concerning Delicious Pie. Here the positive class is tweets about Delicious Pie and the negative class is all other tweets.
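
Looking back at the Section 6.5 worked example, the sentence likelihoods can be checked with a short Python sketch. It is only an illustration, not code from the text: the names `LIKELIHOODS` and `sentence_log_likelihood` are hypothetical, only the five words listed in the table are included (the elided rows are left out), and the products are accumulated in log space, which is an assumption made here to avoid underflow on longer sentences.

```python
import math

# Per-class unigram likelihoods P(w | c), copied from the five table rows
# shown in Section 6.5. Rows elided in the table ("...") are not included.
LIKELIHOODS = {
    "+": {"I": 0.1, "love": 0.1, "this": 0.01, "fun": 0.05, "film": 0.1},
    "-": {"I": 0.2, "love": 0.001, "this": 0.01, "fun": 0.005, "film": 0.1},
}


def sentence_log_likelihood(words, cls):
    """Return log P(s | cls) as the sum of log P(w_i | cls) over positions.

    Raises KeyError for a word that is not in the table; handling unseen
    words is outside the scope of this sketch.
    """
    return sum(math.log(LIKELIHOODS[cls][w]) for w in words)


sentence = "I love this fun film".split()
log_pos = sentence_log_likelihood(sentence, "+")
log_neg = sentence_log_likelihood(sentence, "-")

# Convert back to probabilities only for display.
p_pos = math.exp(log_pos)
p_neg = math.exp(log_neg)
print(f"P(s | +) = {p_pos:.1e}")
print(f"P(s | -) = {p_neg:.1e}")
print("higher likelihood:", "+" if log_pos > log_neg else "-")
```

As in the text, this compares likelihoods only; a full naive Bayes decision would also add the log prior of each class before comparing.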