33 sentiment analysis a baseline algorithm sentiment


... Third Conference on Email and Anti-Spam.
K.-M. Schneider. 2004. On word frequency information and negative evidence in Naive Bayes text classification. ICANLP, 474-485.
JD Rennie, L Shih, J Teevan. 2003. Tackling the poor assumptions of naive bayes text classifiers. ICML 2003.

•  Binary seems to work better than full word counts
   •  This is not the same as Multivariate Bernoulli Naïve Bayes
   •  MBNB doesn’t work well for sentiment or other text tasks
•  Other possibility: log(freq(w))

Dan Jurafsky

Cross-Validation

•  Break up data into 10 folds
   •  (Equal positive and negative inside each fold?)
•  For each fold
   •  Choose the fold as a temporary test set
   •  Train on 9 folds, compute performance on the test fold
•  Report average performance of the 10 runs

[Figure: iterations 1 to 5, each holding out a different fold as Test and using the remaining folds for Training]

Other issues in Classification

•  MaxEnt and SVM tend to do better than Naïve Bayes

Problems: What makes reviews hard to classify?

•  Subtlety:
   •  Perfume review in Perfumes: the Guide:
      •  “If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.”
   •  Dorothy Parker on Katharine Hepburn
      •  “She runs the gamut of emotions from A to B”

Thwarted Expectations and Ordering Effects

•  “This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.”
•  Well as usual Keanu Reeves is nothing special, but surprisingly, the very talented La...
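
The cross-validation procedure from the slides (split into folds, hold one fold out as the test set, train on the rest, average the scores) can be sketched as follows. This is a minimal illustration, not the course's implementation: the `(text, label)` data layout, the function names `make_folds`, `cross_validate`, and `majority_baseline`, and the round-robin way of balancing labels across folds are all assumptions made for the example. The data is not shuffled here so that the result stays deterministic; a real experiment would normally shuffle first.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, str]  # (text, label); hypothetical data layout


def make_folds(data: Sequence[Example], k: int = 10) -> List[List[Example]]:
    """Split data into k folds, dealing examples round-robin label by label
    so that each fold gets a near-equal share of every class."""
    by_label: Dict[str, List[Example]] = {}
    for ex in data:
        by_label.setdefault(ex[1], []).append(ex)

    folds: List[List[Example]] = [[] for _ in range(k)]
    i = 0
    for label in sorted(by_label):
        for ex in by_label[label]:
            folds[i % k].append(ex)
            i += 1
    return folds


def cross_validate(
    data: Sequence[Example],
    train_and_score: Callable[[List[Example], List[Example]], float],
    k: int = 10,
) -> float:
    """Use each fold once as the temporary test set, train on the other
    k - 1 folds, and return the average score over the k runs."""
    folds = make_folds(data, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        scores.append(train_and_score(train, test))
    return sum(scores) / k


def majority_baseline(train: List[Example], test: List[Example]) -> float:
    """Stand-in 'classifier' (an assumption for this sketch): always predict
    the most frequent training label and return accuracy on the test fold."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label == majority for _, label in test) / len(test)


# Toy data: 10 positive and 10 negative reviews (placeholder texts).
toy_data = [(f"review {n}", "pos") for n in range(10)] + [
    (f"review {n}", "neg") for n in range(10, 20)
]
average_accuracy = cross_validate(toy_data, majority_baseline, k=5)
```

Any training routine can be passed in place of `majority_baseline`, as long as it takes a training list and a test list and returns a score; the slides' setting would correspond to `k=10` with a Naïve Bayes, MaxEnt, or SVM classifier.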