Language Modeling



Dan Jurafsky

When we have sparse statistics: 3 allegations, 2 reports, 1 claims, 1 request (7 total)

Add-one estimation
•  Also called Laplace smoothing
•  Pretend we saw each word one more time than we did
•  Just add one to all the counts!
•  MLE estimate: $P_{\text{MLE}}(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i)}{c(w_{i-1})}$
•  Add-1 estimate: $P_{\text{Add-1}}(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i) + 1}{c(w_{i-1}) + V}$, where V is the vocabulary size

Maximum Likelihood Estimates
•  The maximum likelihood estimate
    •  of some parameter of a model M from a training set T
    •  maximizes the likelihood of the training set T given the model M
•  Suppose the word “bagel” occurs 400 times in a corpus of a million words
•  What is the probability that a random word from some other text will be “bagel”?
•  MLE estimate is 400/1,000,000 = .0004
•  This may be a bad estimate for some other corpus
    •  But it is the estimate that makes it most likely that “bagel” will occur 400 times in a million-word corpus.

Berkeley Restaurant Corpus: Laplace-smoothed bigram counts [table not included in preview]

Laplace-smoothed bigrams [table not included in preview]

Reconstituted counts [table not included in preview]

Compare with raw bigram counts [table not included in preview]

Add-1 estimation is a blunt instrument
•  So add-1 isn’t used for N-grams:
    •  We’ll see better methods
•  But add-1 is used to smooth other NLP models
    •  For text classification
    •  In domains where the number of zeros isn’t so huge.

Language Modeling
Smoothing: Add-one (Laplace) smoothing

Language Modeling
Interpolation, Backoff, and Web-Scale LMs

Backoff and Interpolation
•  Sometimes it helps to use less context
    •  Condition on less context for contexts you haven’t learned much about
•  Backoff:
    •  use trigram if you have goo...
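
As a rough illustration of the MLE and add-1 bigram estimates from the add-one estimation slide, here is a minimal Python sketch. The toy corpus, the function names (`count_bigrams`, `p_mle`, `p_add1`), and the choice to count c(w_{i-1}) as the number of times a word starts a bigram are all assumptions made for this example; they do not come from the slides.

```python
from collections import Counter


def count_bigrams(sentences):
    """Collect context counts, bigram counts, and the vocabulary.

    Assumption: c(prev) is the number of bigrams that start with prev,
    so the estimates for a given context sum to 1 over the vocabulary.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def p_mle(prev, word, context_counts, bigram_counts):
    """MLE estimate: c(prev, word) / c(prev); 0.0 for an unseen context."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


def p_add1(prev, word, context_counts, bigram_counts, vocab_size):
    """Add-1 (Laplace) estimate: (c(prev, word) + 1) / (c(prev) + V)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)


# Hypothetical toy corpus, already tokenized.
corpus = [
    ["i", "want", "food"],
    ["i", "want", "tea"],
    ["i", "like", "tea"],
]
contexts, bigrams, vocab = count_bigrams(corpus)
V = len(vocab)

seen_mle = p_mle("i", "want", contexts, bigrams)
seen_add1 = p_add1("i", "want", contexts, bigrams, V)
unseen_mle = p_mle("i", "tea", contexts, bigrams)
unseen_add1 = p_add1("i", "tea", contexts, bigrams, V)

print(f"P_MLE(want | i)   = {seen_mle:.3f}")
print(f"P_Add-1(want | i) = {seen_add1:.3f}")
print(f"P_MLE(tea | i)    = {unseen_mle:.3f}")
print(f"P_Add-1(tea | i)  = {unseen_add1:.3f}")
```

In this sketch the unseen bigram "i tea" moves from probability zero to a small positive value, while the seen bigram "i want" loses mass. That shift is the "blunt instrument" effect the slides describe: with a large vocabulary, add-1 takes a lot of probability away from events that were actually observed.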