jurafsky&martin_3rdEd_17 (1).pdf

... but are instead a property of both the observation $x$ and the candidate output class $c$. Thus, in MaxEnt, instead of the notation $f_i$ or $f_i(x)$, we use the notation $f_i(c, x)$, meaning feature $i$ for a particular class $c$ for a given observation $x$:

$$p(c \mid x) = \frac{1}{Z} \exp\left( \sum_{i} w_i f_i(c, x) \right) \tag{7.6}$$

Fleshing out the normalization factor $Z$, and specifying the number of features as $N$, gives us the final equation for computing the probability of $y$ being of class $c$ given $x$ in MaxEnt:

$$p(c \mid x) = \frac{\exp\left( \sum_{i=1}^{N} w_i f_i(c, x) \right)}{\sum_{c' \in C} \exp\left( \sum_{i=1}^{N} w_i f_i(c', x) \right)} \tag{7.7}$$

7.1 Features in Multinomial Logistic Regression

Let’s look at some sample features for a few NLP tasks to help understand this perhaps unintuitive use of features that are functions of both the observation $x$ and the class $c$. Suppose we are doing text classification, and we would like to know whether to assign the sentiment class $+$, $-$, or $0$ (neutral) to a document. Here are four potential features, representing that the document $x$ contains the word great and the class is $+$ ($f_1$), contains the word second-rate and the class is $-$ ($f_2$), contains the word no and the class is $-$ ($f_3$), and contains the word enjoy and the class is $-$ ($f_4$).

$$
\begin{aligned}
f_1(c, x) &= \begin{cases} 1 & \text{if ``great''} \in x \text{ and } c = + \\ 0 & \text{otherwise} \end{cases} \\
f_2(c, x) &= \begin{cases} 1 & \text{if ``second-rate''} \in x \text{ and } c = - \\ 0 & \text{otherwise} \end{cases} \\
f_3(c, x) &= \begin{cases} 1 & \text{if ``no''} \in x \text{ and } c = - \\ 0 & \text{otherwise} \end{cases} \\
f_4(c, x) &= \begin{cases} 1 & \text{if ``enjoy''} \in x \text{ and } c = - \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
$$

Each of these features has a corresponding weight, which can be positive or negative. Weight $w_1$ indicates the strength of great as a cue for class $+$, and $w_2$ and $w_3$ the strength of second-rate and no as cues for the class $-$. These weights would likely be positive: logically negative words like no or nothing turn out to be more likely to occur in documents with negative sentiment (Potts, 2011). Weight $w_4$, the strength of enjoy as a cue for $-$, would likely be negative. We’ll discuss in the following section how these weights are learned.

Since each feature is dependent on both a property of the observation and the class being labeled, we would have additional features for the links between great and the negative class $-$, or no and the neutral class $0$, and so on.

Similar features could be designed for other language processing classification tasks. For period disambiguation (deciding if a period is the end of a sentence or part of a word), we might have the two classes EOS (end-of-sentence) and not-EOS, and features like $f_1$ below expressing that the current word is lower case and the class is EOS (perhaps with a positive weight), or that the current word is in our abbreviations dictionary (“Prof.”) and the class is EOS (perhaps with a negative weight). A feature can also express a quite complex combination of properties. For example, a period following an uppercase word is likely to be an EOS, but if the word itself is St.
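Returning to the sentiment example, the following is a minimal Python sketch of how equation 7.7 could be evaluated with the four indicator features $f_1$ to $f_4$. The weight values, the representation of a document as a set of tokens, and the names `feature`, `score`, and `maxent_probs` are all assumptions made for illustration. In practice the weights would be learned rather than set by hand.

```python
import math

CLASSES = ["+", "-", "0"]

# (word, class) pairs for f_1..f_4, taken from the text.
FEATURES = [
    ("great", "+"),
    ("second-rate", "-"),
    ("no", "-"),
    ("enjoy", "-"),
]

# Hypothetical weights w_1..w_4; only their signs follow the text
# (w_1, w_2, w_3 positive, w_4 negative). The magnitudes are made up.
WEIGHTS = [1.9, 0.9, 0.7, -0.8]


def feature(i, c, x):
    """Indicator f_i(c, x): 1 if the feature's word is in x and c is its class."""
    word, cls = FEATURES[i]
    return 1 if (word in x and c == cls) else 0


def score(c, x):
    """Weighted feature sum for class c: sum_i w_i * f_i(c, x)."""
    return sum(w * feature(i, c, x) for i, w in enumerate(WEIGHTS))


def maxent_probs(x):
    """p(c | x) for every class, following equation 7.7."""
    exps = {c: math.exp(score(c, x)) for c in CLASSES}
    z = sum(exps.values())  # normalization factor Z
    return {c: e / z for c, e in exps.items()}


# A document is modeled here as a set of tokens (an assumption).
doc = {"no", "plot", "but", "great", "acting"}
for c, p in maxent_probs(doc).items():
    print(c, round(p, 3))
```

With these assumed weights, a document containing both great and no gets its highest probability for the class $+$, because the score for $+$ (1.9) exceeds the scores for $-$ (0.7) and for the neutral class (0). A document that triggers no feature gets a uniform distribution over the three classes.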