
# Language Modeling: Add-one (Laplace) Smoothing, Backoff, and Interpolation


Dan Jurafsky

When we have sparse statistics:

• 3 allegations
• 2 reports
• 1 claims
• 1 request
• 7 total

### Add-one estimation

• Also called Laplace smoothing
• Pretend we saw each word one more time than we did
• Just add one to all the counts!
• MLE estimate:

$$P_{\text{MLE}}(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i)}{c(w_{i-1})}$$

• Add-1 estimate:

$$P_{\text{Add-1}}(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i) + 1}{c(w_{i-1}) + V}$$

### Maximum Likelihood Estimates

• The maximum likelihood estimate
  • of some parameter of a model M from a training set T
  • maximizes the likelihood of the training set T given the model M
• Suppose the word “bagel” occurs 400 times in a corpus of a million words
• What is the probability that a random word from some other text will be “bagel”?
• MLE estimate is 400/1,000,000 = .0004
• This may be a bad estimate for some other corpus
  • But it is the estimate that makes it most likely that “bagel” will occur 400 times in a million word corpus.

Slides whose tables are not included in this preview:

• Berkeley Restaurant Corpus: Laplace smoothed bigram counts
• Laplace-smoothed bigrams
• Reconstituted counts
• Compare with raw bigram counts

### Add-1 estimation is a blunt instrument

• So add-1 isn’t used for N-grams:
  • We’ll see better methods
• But add-1 is used to smooth other NLP models
  • For text classification
  • In domains where the number of zeros isn’t so huge.

Language Modeling. Smoothing: Add-one (Laplace) smoothing

## Language Modeling: Interpolation, Backoff, and Web-Scale LMs

### Backoff and Interpolation

• Sometimes it helps to use less context
  • Condition on less context for contexts you haven’t learned much about
• Backoff:
  • use trigram if you have goo...
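
The MLE and add-1 bigram estimates from the add-one estimation slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the toy corpus, the function names (`bigram_counts`, `p_mle`, `p_add1`), and the choice to take V as the number of distinct word types in the toy corpus are all assumptions made for the example.

```python
from collections import Counter


def bigram_counts(tokens):
    """Return (context counts c(w_{i-1}), bigram counts c(w_{i-1}, w_i))."""
    # Count a word as a context only when something follows it, so that
    # the bigram counts for a given context sum to its context count.
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))
    return contexts, bigrams


def p_mle(prev, word, contexts, bigrams):
    """MLE estimate: c(prev, word) / c(prev)."""
    if contexts[prev] == 0:
        return 0.0  # unseen context: the MLE is undefined, report 0 here
    return bigrams[(prev, word)] / contexts[prev]


def p_add1(prev, word, contexts, bigrams, vocab_size):
    """Add-1 (Laplace) estimate: (c(prev, word) + 1) / (c(prev) + V)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)


# Hypothetical toy corpus, invented for this sketch.
tokens = "i want chinese food i want english food i want to eat".split()
contexts, bigrams = bigram_counts(tokens)
vocab = sorted(set(tokens))
V = len(vocab)  # assumption: V = number of distinct word types

# A seen bigram loses some probability mass under add-1 ...
print(p_mle("want", "chinese", contexts, bigrams))
print(p_add1("want", "chinese", contexts, bigrams, V))

# ... and an unseen bigram moves from zero to a small positive value.
print(p_mle("want", "food", contexts, bigrams))
print(p_add1("want", "food", contexts, bigrams, V))
```

Because every word in the vocabulary receives one extra count, the add-1 estimates for a fixed context still sum to one over the vocabulary, which is why V appears in the denominator.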
