124.11.lec5

CS 124/LINGUIST 180: From Languages to Information
Dan Jurafsky
Lecture 5: Sentiment Analysis

IP notice: many slides for today from Chris Potts and Janyce Wiebe, plus some from Marti Hearst and Marta Tatu

Sentiment, Style, Identity Classification
- Identity:
  - Authorship identification
  - Age/gender identification
- Sentiment Analysis
  - Movie: is a review positive or negative
  - Products (new MacBook Pro)
  - Sentiments over time (is anger increasing or decreasing?)
  - Politics (is this editorial left or right?)
  - Prediction (election outcomes, market trends). Will stock go up after this news report?
- Style/Emotion
  - Is this conversation (or blog) friendly, aggressive, polite, flirtatious

Part I: Author Identification (Stylometry)
Slide from Marti Hearst

Author Identification
- Also called Stylometry in the humanities
- An example of a Classification Problem
- Classifiers:
  - Decide which of N buckets to put an item in
  - (Some classifiers allow for multiple buckets)
Slide from Marti Hearst

The Disputed Federalist Papers
- In 1787-1788, Jay, Madison, and Hamilton wrote a series of anonymous essays to convince the voters of New York to ratify the new U.S. Constitution.
- Scholars have consensus that:
  - 5 authored by Jay
  - 51 authored by Hamilton
  - 14 authored by Madison
  - 3 jointly by Hamilton and Madison
  - 12 remain in dispute … Hamilton or Madison?
Slide from Marti Hearst

Author identification
- Federalist papers
  - In 1963 Mosteller and Wallace solved the problem
  - They identified function words as good candidates for authorship analysis
  - Using statistical inference they concluded the author was Madison
  - Since then, other statistical techniques have supported this conclusion.
Slide from Marti Hearst

Function vs. Content Words
- High rates for “by” favor M, low favor H
- High rates for “from” favor M, low says little
- High rates for “to” favor H, low favor M
Slide from Marti Hearst

Function vs. Content Words
- No consistent pattern for “war”
Slide from Marti Hearst

Federalist Papers Problem
Fung, The Disputed Federalist Papers: SVM Feature Selection Via Concave Minimization, ACM TAPIA’03
Slide from Marti Hearst

Part II: Sentiment Analysis
- Extraction of opinions and subjective feelings from text and speech
- More generally, extraction of social meaning
  - Opinions (positive or negative) on a topic or generally
    - “Twitter mood predicts the stock market.”
  - Social identity (Democrat, Republican, etc.)
  - Uncertainty (students in tutoring)
  - Annoyance and other emotions (telephone dialogues)
  - Deception
  - Intoxication
  - Flirtation, Romantic interest

Example: Political Sentiment
- Two examples of classifiers
  - Using words as features
  - And a Naïve Bayes or SVM classifier
  - To make a binary decision
  - About the political stance of a text

Political Sentiment
- Lin, Wilson, Wiebe, Hauptmann (2006)
- Bitterlemon.com
  - A website designed to “contribute to mutual understanding [between Palestinians and Israelis] through the open exchange of ideas”
- Can we label Israeli & Palestinian perspective:
  1. “The inadvertent killing by Israeli forces of Palestinian civilians – usually in the course of shooting at Palestinian terrorists – is considered no different at the moral and ethical level than the deliberate targeting of Israeli civilians by Palestinian suicide bombers.”
  2. “In the first weeks of the Intifada, for example, Palestinian public protests and civilian demonstrations were answered brutally by Israel, which killed tens of unarmed protesters.”

Lin et al on Political Perspective
- 594 articles from 2001-2005
- Naïve Bayes classifier
- Accuracy 89%-99%

Naïve Bayes: Top 20 words
- Palestinian
  - palestinian, israel, state, politics, peace, international, people, settle, occupation, sharon, right, govern, two, secure, end, conflict, process, side, negotiate
- Israeli
  - israel, palestinian, state, settle, sharon, peace, arafat, arab, politics, two, process, secure, conflict, lead, america, agree, right, gaza, govern

Thomas, Pang, Lee: Get out the vote: Determining support or opposition from Congressional floor-debate transcripts
- Goal: label a speech as pro or con a bill
- Data: transcripts of all debates in House of Representatives in 2005
  - From GovTrack (http://govtrack.us) website
- Each speech segment (sequence of uninterrupted utterances by speaker)
  - Labeled by the vote (“yea” or “nay”) cast
- Labeled by SVM classifier, using all word unigrams as features

Results
  majority baseline              58.37
  #(“support”) − #(“oppos”)      62.67
  SVM classifier                 66.05
  Add network of agreements      70.81

Sentiment datasets on the Web
- IMDB (slide from Chris Potts)
- Amazon (slide from Chris Potts)
- OpenTable (slide from Chris Potts)
- TripAdvisor (slide from Chris Potts)

Richer sentiment on the web (not just positive/negative)
- Experience Project
  - http://www.experienceproject.com/confessions.php?cid=184000
- FMyLife
  - http://www.fmylife.com/miscellaneous/14613102
- My Life is Average
  - http://mylifeisaverage.com/
- It Made My Day
  - http://immd.icanhascheezburger.com/

Pang and Lee’s (2004) movie review data from IMDB
- Polarity data 2.0:
  - http://www.cs.cornell.edu/people/pabo/movie-review-data

Pang and Lee IMDB data
- Rating: pos
  when _star wars_ came out some twenty years ago , the image of traveling throughout the stars has become a commonplace image . … when han solo goes light speed , the stars change to bright lines , going towards the viewer in lines that converge at an invisible point . cool . _october sky_ offers a much simpler image–that of a single white dot , traveling horizontally across the night sky . [. . . ]
- Rating: neg
  “ snake eyes ” is the most aggravating kind of movie : the kind that shows so much potential then becomes unbelievably disappointing . it’s not just because this is a brian depalma film , and since he’s a great director and one who’s films are always greeted with at least some fanfare . and it’s not even because this was a film starring nicolas cage and since he gives a brauvara performance , this film is hardly worth his talents .

Pang and Lee Algorithm
- Classification using different classifiers
  - Naïve Bayes
  - MaxEnt
  - SVM
- Cross-validation (they did 3 folds; you’ll do 10)
  - Break up data into 10 folds
  - For each fold
    - Choose the fold as a temporary “test set”
    - Train on 9 folds, compute performance on the test fold
  - Report the average performance of the 10 runs.

Negation in Sentiment Analysis
They have not succeeded, and will never succeed, in breaking the will of this valiant people.
Slide from Janyce Wiebe

Pang and Lee on Negation
- Added the tag NOT to every word between a negation word (“not”, “isn’t”, “didn’t”, etc.) and the first punctuation mark following the negation word.
  - Before: didn’t like this movie, but I
  - After: didn’t NOT_like NOT_this NOT_movie, but I

Pang and Lee interesting Observation
- “Feature presence”
  - i.e. 1 if a word occurred in a document, 0 if it didn’t
  - worked better than unigram probability
- Why might this be?

Other difficulties in movie review classification
- What makes movies hard to classify?
- Sentiment can be subtle:
  - Perfume review in “Perfumes: the Guide”:
    - “If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.”
  - “She runs the gamut of emotions from A to B” (Dorothy Parker on Katharine Hepburn)
- Order effects
  - This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.

Advanced sentiment: Using a lexicon
- Key task: Vocabulary
  - The previous work (and your homework 3) uses all the words in a document
  - Can we do better by focusing on a subset of words?
  - How to find words, phrases, patterns that express sentiment or polarity?

Use a pre-built dictionary
- Harvard General Inquirer Database
  - Contains 3627 negative and positive word-strings:
  - http://www.wjh.harvard.edu/~inquirer/
- Linguistic Inquiry and Word Count
  - Pennebaker, Francis, & Booth, 2001
  - dictionary of 2300 words grouped into > 70 classes
    - negative emotion (bad, weird, hate, problem, tough)
    - sexual (love, loves, lover, passion, passionate, sex,)
    - 1st person pronouns (I me mine myself I’d I’ll I’m…)

But could we build vocab automatically? Adjectives
- positive: honest important mature large patient
  - Ron Paul is the only honest man in Washington.
  - Kitchell’s writing is unbelievably mature and is only likely to get better.
  - To humour me my patient father agrees yet again to my choice of film
- negative: harmful hypocritical inefficient insecure
  - It was a macabre and hypocritical circus.
  - Why are they being so inefficient?
Slide from Janyce Wiebe

Other parts of speech
- Verbs
  - positive: praise, love
  - negative: blame, criticize
- Nouns
  - positive: pleasure, enjoyment
  - negative: pain, criticism
Slide from Janyce Wiebe

Phrases
- Phrases containing adjectives and adverbs
  - positive: high intelligence, low cost
  - negative: little variation, many troubles
Slide adapted from Janyce Wiebe

Intuition for identifying polarity words
- Assume that contexts are coherent
  - Fair and legitimate, corrupt and brutal
  - *fair and brutal, *corrupt and legitimate
Slide adapted from Janyce Wiebe

Hatzivassiloglou & McKeown 1997: Predicting the semantic orientation of adjectives
Step 1
- From 21-million word WSJ corpus
- For every adjective with frequency > 20
  - Label for polarity
- Total of 1336 adjectives
  - 657 positive
  - 679 negative

Hatzivassiloglou & McKeown 1997
Step 2: Extract all conjoined adjectives
- nice and comfortable
- nice and scenic
ICWSM 2008. Slide adapted from Janyce Wiebe

Hatzivassiloglou & McKeown 1997
Step 3: A supervised learning algorithm builds a graph of adjectives linked by the same or different semantic orientation
[Figure: graph over the adjectives scenic, nice, painful, handsome, terrible, fun, expensive, comfortable]
Slide adapted from Janyce Wiebe

Hatzivassiloglou & McKeown 1997
Step 4: A clustering algorithm partitions the adjectives into two subsets
[Figure: the adjectives slow, scenic, nice, terrible, handsome, painful, fun, expensive, comfortable split into a "+" subset and its complement]
Slide from Janyce Wiebe

Turney (2002): Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
Input: review
- Identify phrases that contain adjectives or adverbs by using a part-of-speech tagger
- Estimate the semantic orientation of each phrase
- Assign a class to the given review based on the average semantic orientation of its phrases
Output: classification (thumbs up or thumbs down)
Slide from Marta Tatu

Turney Step 1
- Extract all two-word phrases including an adjective

     First Word          Second Word            Third Word (not extracted)
  1. JJ                  NN or NNS              Anything
  2. RB, RBR, or RBS     JJ                     Not NN nor NNS
  3. JJ                  JJ                     Not NN nor NNS
  4. NN or NNS           JJ                     Not NN nor NNS
  5. RB, RBR, or RBS     VB, VBD, VBN, or VBG   Anything

Slide from Marta Tatu

Turney Step 2
- Estimate the semantic orientation of the extracted phrases using Pointwise Mutual Information
Slide from Marta Tatu

Pointwise Mutual Information
- Mutual information: between 2 random variables X and Y
- Pointwise mutual information: measure of how often two events x and y occur, compared with what we would expect if they were independent:

  \mathrm{PMI}(x, y) = \log_2 \frac{p(x, y)}{p(x)\, p(y)}

- PMI between two words: how much more often they occur together than we would expect if they were independent

  \mathrm{PMI}(word_1, word_2) = \log_2 \frac{p(word_1, word_2)}{p(word_1)\, p(word_2)}

Turney Step 2
- Semantic Orientation of a phrase defined as:

  \mathrm{SO}(phrase) = \mathrm{PMI}(phrase, \text{“excellent”}) - \mathrm{PMI}(phrase, \text{“poor”})

- Estimate PMI by issuing queries to a search engine (Altavista, ~350 million pages)
Slide from Marta Tatu

Turney Step 3
- Calculate average semantic orientation of phrases in review
  - Positive: thumbs up
  - Negative: thumbs down

  Phrase                   POS tags   SO
  direct deposit           JJ NN       1.288
  local branch             JJ NN       0.421
  small part               JJ NN       0.053
  online service           JJ NN       2.780
  well other               RB JJ       0.237
  low fees                 JJ NNS      0.333
  …
  true service             JJ NN      -0.732
  other bank               JJ NN      -0.850
  inconveniently located   RB VBN     -1.541
  Average Semantic Orientation         0.322

Slide adapted from Marta Tatu

Experiments
- 410 reviews from Epinions
  - 170 (41%) (thumbs down)
  - 240 (59%) (thumbs up)
  - Average phrases per review: 26
- Baseline accuracy: 59%

  Domain                Accuracy   Correlation
  Automobiles           84.00%     0.4618
  Banks                 80.00%     0.6167
  Movies                65.83%     0.3608
  Travel Destinations   70.53%     0.4155
  All                   74.39%     0.5174

Slide from Marta Tatu

It's Not You, It's Me: Automatically Extracting Social Meaning from Speed Dates
Dan Jurafsky, Rajesh Ranganath, Dan McFarland
- Dan Jurafsky, Rajesh Ranganath, and Dan McFarland. 2009. Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation. Proceedings of NAACL HLT 2009.
- Rajesh Ranganath, Dan Jurafsky, and Dan McFarland. 2009. It's Not You, it's Me: Detecting Flirting and its Misperception in Speed-Dates. EMNLP-2009

Detecting social meaning
- Given speech and text from a conversation
- Can we detect `styles’, like whether a speaker is
  - Awkward?
  - Flirtatious?
  - Friendly?
- Can we tell if the speakers like each other?
- Dataset:
  - 991 4-minute “speed-dates”
  - Each participant rated their partner and themselves for these styles

Speed dating

Our speed date setup

What do you do for fun? Dance? Uh, dance, uh, I like to go, like camping. Uh, snowboarding, but I'm not good, but I like to go anyway. You like boarding. Yeah. I like to do anything. Like I, I'm up for anything. Really? Yeah. Are you open-minded about most everything? Not everything, but a lot of stuff- What is not everything [laugh] I don't know. Think of something, and I'll say if I do it or not. [laugh] Okay. [unintelligible]. Skydiving. I wouldn't do skydiving I don't think. Yeah I'm afraid of heights.
F: Yeah, yeah, me too.
M: [laugh] Are you afraid of heights?
F: [laugh] Yeah [laugh]

The SpeedDate corpus
- 991 4-minute dates
  - 3 events, each with ~20x20=400 dates, some data loss
  - Participants: graduate student volunteers in 2005
    - participated in return for the chance to date
- Speech
  - ~60 hours, from shoulder sash recorders; high noise
- Transcripts
  - ~800K words, hand-transcribed, w/turn boundary times
- Surveys
  - (Pre-test surveys, event scorecards, post-test surveys)
  - Date perceptions and follow-up interest
  - General attitudes, preferences, demographics
- Largest experiment with audio, text, + survey info

What we attempted to predict
- Conversational style:
  - How often did they behave in the following ways on this date?
  - On a scale of 1-10 (1=never, 10=constantly)
    1. awkward
    2. friendly
    3. flirtatious
    4. assertive

Features
- pitch (min, mean, max, std)
- intensity (min, max, mean, std)
- duration of turn
- rate of speech (words per second)
- laughter
- questions
- negative emotion (bad, weird, crazy, hate) words
- storytelling words (past tense) + food words (eat, dinner)
- love and sexual/emotional words (love, passionate, screw)
- personal pronouns (I, you, we, us)
- backchannels (“uh-huh”, “yeah”)
- appreciations (“Wow!”, “That’s great!”)

Features extracted within turns
[Figure: F0 max and F0 min marked within each turn]

Features: Pitch
- F0 min, max, mean
  - Thus to compute, e.g., F0 min for a conversation side
    - Take F0 min of each turn (not counting zero values)
    - Average over all turns in the side
  - “F0 min, F0 max, F0 mean”
- We also compute measures of variation
  - Standard deviation, pitch range
    - F0 min sd, F0 max sd, F0 mean sd
    - pitch range = (f0 max – f0 min)

LIWC
- Linguistic Inquiry and Word Count
  - Pennebaker, Francis, & Booth, 2001
  - dictionary of 2300 words grouped into > 70 classes
    - negative emotion (bad, weird, hate, problem, tough)
    - sexual (love, loves, lover, passion, passionate, sex,)
    - 1st person pronouns (I me mine myself I’d I’ll I’m…)
    - 2nd person pronouns (you, you’d you’ll your you’ve…)
    - ingest (food, eat, eats, cook, dinner, drink, restaurant…)
    - swear (hell, sucks, damn, fuck,…)
    - …

New word lists and regular expressions
- Questions
- Academic words (research, advisor, lab)
- Partying (party, wine, drunk, bar)
- Sympathy
    (that's|that is|that seems|it is|that sounds) (very|really|a little|sort of)? (terrible|awful|weird|sucks|a problem|tough|too bad)
- Positive feedback
    (Oh)? (Awesome|Great|All right|Man|No kidding|wow|my god)
    That ('s|is|sounds|would be) (so|really)? (great|funny|good|interesting|neat|amazing|nice|not bad|fun)
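
The sympathy and positive-feedback patterns above can be tried out with a short Python sketch. This is only an illustration, not the authors' code: the function names, case-insensitive matching, whitespace handling, straight apostrophes, and the choice to match the interjection pattern only at the start of a turn are all assumptions added here.

```python
import re

# Sympathy pattern from the slide; optional intensifier in the middle.
# Assumption: case-insensitive, straight apostrophes, flexible whitespace.
SYMPATHY = re.compile(
    r"\b(?:that's|that is|that seems|it is|that sounds)\s+"
    r"(?:(?:very|really|a little|sort of)\s+)?"
    r"(?:terrible|awful|weird|sucks|a problem|tough|too bad)\b",
    re.IGNORECASE,
)

# Positive feedback, part 1: "(Oh)? (Awesome|Great|...)".
# Assumption: only counted when it opens the turn (used with .match()).
INTERJECTION = re.compile(
    r"(?:oh,?\s+)?(?:awesome|great|all right|man|no kidding|wow|my god)\b",
    re.IGNORECASE,
)

# Positive feedback, part 2: "That ('s|is|sounds|would be) (so|really)? (great|...)".
THAT_POSITIVE = re.compile(
    r"\bthat(?:'s| is| sounds| would be)\s+"
    r"(?:(?:so|really)\s+)?"
    r"(?:great|funny|good|interesting|neat|amazing|nice|not bad|fun)\b",
    re.IGNORECASE,
)


def is_sympathy(turn):
    """True if the turn contains a sympathy expression."""
    return SYMPATHY.search(turn) is not None


def is_positive_feedback(turn):
    """True if the turn opens with an interjection or contains 'that's great'-style praise."""
    text = turn.strip()
    return bool(INTERJECTION.match(text) or THAT_POSITIVE.search(text))


def count_features(turns):
    """Count how many turns on one conversation side trigger each pattern."""
    return {
        "sympathy": sum(is_sympathy(t) for t in turns),
        "positive_feedback": sum(is_positive_feedback(t) for t in turns),
    }


if __name__ == "__main__":
    side = ["Oh wow, that's terrible", "That is awful", "That's great!", "I work in a lab"]
    print(count_features(side))
```

Counting matching turns per conversation side is one plausible way to turn these patterns into classifier features; the slides do not say how the counts were normalized.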
Architecture: 6 binary classifiers
- Female ±Awkward, Male ±Awkward
- Female ±Friendly, Male ±Friendly
- Female ±Flirtatious, Male ±Flirtatious
- Multiple classifier experiments
  - L1-regularized logistic regression
  - SVM w/RBF kernel

Results with SVM: predicting flirt intention
- Using my speech to predict whether I say I am flirting

                       Male speaker   Female speaker
  I say I’m flirting   72%            76%

Results with SVM: Predicting flirt perception
- Using my speech to predict whether partner says I am flirting

                                Male speaker   Female speaker
  Partner says I’m flirting     80%            68%

Summary: flirt detection
- Using my speech to predict whether I am flirting

                                Male speaker   Female speaker
  I say I’m flirting            72%            76%
  Partner says I’m flirting     80%            68%

Fine, but how good is 72 or 76?
- In NLP we use human performance as a “ceiling”
- Checking human performance:
  - If John says Jane is flirting
  - And Jane says Jane is flirting
  - Then we say John is right.
- Male speaker (female perceiver): 64%
- Female speaker (male perceiver): 57%

Implication #1
- Females are better than males at detecting flirting
- or males give off clearer flirting cues
- Male speaker (female perceiver): 64%
- Female speaker (male perceiver): 57%

Implication #2: Machines are better than humans at detecting flirting

                      Overall   Male speaker   Female speaker
  Computer detector   74%       72%            76%
  Human detector      61%       64%            57%

How can this be?
- Why are humans so bad at detecting flirtation?
- Intuition:

                     I am flirting   Other is flirting
  Male 101 says:     8               7
  Female 127 says:   1               1

What correlates with my perception of others flirting
- Pearson correlation coefficients

  Variable                                                        ρ
  How I see other flirting & How other sees themself flirting    .15
  How I see other flirting & How I see myself flirting           .73

What correlates with my perception of others style
- Pearson correlation coefficients

  Variable    My perception of other & self-intention   My perception of other & other-intention
  Flirting    .73                                       .15
  Friendly    .77                                       .05
  Awkward     .58                                       .07
  Assertive   .58                                       .09

“It’s not you, it’s me”
- My perception of whether my date is flirting
  - Is the same as my perception of whether I am flirting
- Why?
  - Speakers aren’t very good at capturing intentions of others in 4 minutes
  - Speakers instead base judgments on their own behavior/intentions

Gender differences in flirt intention
- Both genders when flirting:
  - raise their minimum pitch
  - say “you know” and “like” and “I mean”
  - use the word “I” more
- Women when flirting:
  - don’t ask questions
  - talk about bars and drinking
- Men when flirting:
  - raise their pitch floor
  - don’t use words related to academics

Likely (positive or negative) words for flirting

  More likely to flirt:   Less likely to flirt:
  party                   academia
  wine                    interview
  bar                     teacher
  alcohol                 phd
  drinks                  advisor
  beer                    lab
  bars                    research
  drunk                   management

Detecting awkward and friendly speakers
- Using what I do & what my date does to predict what my date calls me
- Simpler (logistic regression) classifier

                                 Awkward     Friendly
                                 M     F     M     F
  Using speaker words/speech     63%   51    72    68
  + partner words/speech         64    64    73    75

What makes someone seem friendly? “Collaborative conversational style”
- Repeat questions
  - F: I'm working at Pottery Barn this summer.
  - M: I'm sorry, who?
- Other questions
- You
- Laughter
- Appreciations (for women)
- Overlaps (for men)

Work in progress: Can we predict liking?
- That is, can we predict the binary variable:
  - ‘willing to give this person my email’
- Either for a single speaker (baseline 53% = no)
- Or for a dyad (baseline 81% = no)

What you do when you like someone: Preliminary results
- Men when they like their date
  - Are sympathetic
    - M: “Oh wow, that’s terrible”
    - M: “That is awful”
    - M: “Wow, are you serious?”
  - Don’t talk about academics
- Women when they like their date
  - vary their pitch and loudness more
  - raise their max pitch
  - use “you know”, “I mean”, and “like”

Who do you say yes to? (Preliminary)
- Men say yes to women who:
  - don’t talk about academics
  - give positive feedback (“that’s great!”)
  - talk about themselves
  - use more “no” and “not”
  - repeat the man’s words
- Women say yes to men who:
  - are sympathetic
  - talk about themselves
  - use more “no” and “not”
  - repeat the woman’s words
  - vary their pitch more on hedges
- http://blog.okcupid.com/index.php/online-dating-advice-exactly-what-to-say-in-a-first-message/

Conclusions – for daters
- Talking about your advisor is a bad idea on a date
- Sympathy is a good idea, if you’re a guy
- Listen to the other person and follow up their topics
- Be negative

Conclusions – for psychology
- Humans project their internal state on others
- Men and women (at least in 4 minutes) seem to focus on the wrong verbal cues to flirtation

Conclusions – for computer science
- We can do automatic extraction of rich social variables from speech and text.
- For at least one variable (“does speaker intend to flirt”) we beat human performance

Conclusion
- For Machines:
  - Automatic extraction of rich social variables from speech and text.
  - For some variables (“does speaker intend to flirt”) we beat human performance
  - This is rare in AI, especially rare in NLP.
- For humans
  - Speakers project their internal state on others
  - Men and women (in 4 minutes) seem to focus on the wrong cues to flirtation

Summary on Sentiment and Style
- Function words are a good cue to identity
- All words work well for some tasks
- Finding subsets of words may help in other tasks
- Other features may also help
  - Questions
  - Length of sentences
  - Speech features ...
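
As a recap of the "all words" approach from the Pang and Lee part of the lecture, the following Python sketch shows their negation tagging together with binary "feature presence" features. It is a minimal illustration under stated assumptions, not their implementation: the negation word list beyond "not", "isn't", and "didn't", the tokenizer, and all function names are made up for this example.

```python
import re

# Assumption: an illustrative cue list; the lecture names only "not", "isn't", "didn't", etc.
NEGATION_WORDS = {"not", "no", "never", "isn't", "didn't", "don't", "can't", "wasn't"}

# A token made up entirely of punctuation ends the negation scope.
PUNCTUATION = re.compile(r"^[.,;:!?]+$")


def tokenize(text):
    """Split text into words (keeping internal apostrophes) and single punctuation marks."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*|[.,;:!?]", text)


def add_not_tags(tokens):
    """Prefix NOT_ to every word between a negation word and the next punctuation mark."""
    tagged = []
    negating = False
    for tok in tokens:
        if PUNCTUATION.match(tok):
            negating = False          # punctuation closes the negated span
            tagged.append(tok)
        elif tok.lower() in NEGATION_WORDS:
            negating = True           # the negation word itself is left untagged
            tagged.append(tok)
        elif negating:
            tagged.append("NOT_" + tok)
        else:
            tagged.append(tok)
    return tagged


def presence_features(tokens):
    """Feature presence: 1 if a word occurs in the document, regardless of how often."""
    return {tok.lower(): 1 for tok in tokens if not PUNCTUATION.match(tok)}


if __name__ == "__main__":
    tokens = add_not_tags(tokenize("didn't like this movie, but I"))
    print(" ".join(tokens))
    print(presence_features(tokens))
```

The resulting feature dictionaries could be fed to any of the classifiers mentioned in the lecture (Naïve Bayes, MaxEnt, SVM) inside a 10-fold cross-validation loop.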

This document was uploaded on 06/01/2011.
