Lecture09-lmintro

CS 479, section 1: Natural Language Processing
Lecture #9: Language Modeling Intro (9/19/2011)
Thanks to Dan Klein of UC Berkeley for many of the materials used in this lecture. This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Announcements
- HW #0, Part 2 is due today by 5pm. Questions?
- Project #1, Part 1: build an interpolated language model. Pair programming! Pair up as teams by Wednesday. Help session: Thursday (9/22) at 4pm in 1066 TMCB (Windows Help Lab) – new location.
- Reading Report #5: M&S 6.3 to the end of the chapter. Due: next Monday.

Goals
- Introduce language modeling
- Understand language models as simple graphical models
- See examples of text generated from such models
- Discuss metrics for language models
- Motivate smoothing of language models

How'd they do that?
- http://draft.blogger.com
- Hit the blue "pencil"/scribe button.
- Based on the former Google Labs "Scribe" project.

Probabilistic Language Models
- Goal: build models that assign scores to sentences, e.g. P(I saw a van) >> P(eyes awe of an).
- Not really grammaticality: P(artichokes intimidate zippers) ≈ 0.
- One option: an empirical distribution over sentences? What's that?

Empirical Distribution
- Example corpus of N sentences, e.g. "the dog ran home" </s> and "the dog ran to work" </s>.
- Probabilities:
  P("the dog ran home" </s>) = C("the dog ran home" </s>) / N
  P("the dog ran to work" </s>) = C("the dog ran to work" </s>) / N
- Using this model, what is P("the dog ran to the boy" </s>)?
- Also called the Maximum Likelihood Estimate (MLE); see the sketch below.
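A minimal sketch (not part of the lecture; the toy corpus is invented for illustration) of the whole-sentence MLE above: count each complete sentence and divide by the corpus size N.

    from collections import Counter

    # Hypothetical toy corpus of N = 4 sentences; </s> marks sentence end.
    corpus = [
        "the dog ran home </s>",
        "the dog ran to work </s>",
        "the dog ran home </s>",
        "the cat slept </s>",
    ]

    counts = Counter(corpus)   # C(sentence): how often each sentence occurs
    N = len(corpus)            # total number of sentences

    def p_mle(sentence):
        """Whole-sentence maximum likelihood estimate: P(s) = C(s) / N."""
        return counts[sentence] / N

    print(p_mle("the dog ran home </s>"))        # 2/4 = 0.5
    print(p_mle("the dog ran to work </s>"))     # 1/4 = 0.25
    print(p_mle("the dog ran to the boy </s>"))  # 0.0 -- unseen sentence

The last line answers the slide's question: any sentence that never occurred in the corpus gets probability zero, which is exactly the generalization problem raised on the next slide.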
Probabilistic Language Models (continued)
- Recap: the goal is to build models that assign scores to sentences, e.g. P(I saw a van) >> P(eyes awe of an); not really grammaticality: P(artichokes intimidate zippers) ≈ 0.
- One option: an empirical distribution over sentences? Problem: it doesn't generalize (at all).
- How to generalize?
  - Decomposition: sentences are generated in small steps (e.g., individual words); the steps can be recombined in other ways.
  - Smoothing: allow for the possibility of unseen events.
  - Other ideas? Let's see ... (a small sketch combining these two ideas appears after the chain rule below.)

N-Gram Language Models
- There is no loss of generality in breaking the sentence probability down with the chain rule (reconstructed below).
- Histories are too long and probably too infrequent!
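The chain-rule formula on this slide did not survive extraction; the standard decomposition it refers to, with w_1, ..., w_n the words of the sentence, is

    P(w_1 w_2 \cdots w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})

An n-gram model then truncates each history to the previous n-1 words, precisely because full histories are too long and too infrequent to estimate reliably.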
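A minimal sketch (not from the lecture; add-one smoothing is chosen here only for illustration) of the two generalization ideas above: decompose the sentence into word-by-word steps and smooth each step so unseen events keep some probability.

    from collections import Counter

    # Hypothetical toy corpus; <s> and </s> mark sentence boundaries.
    sentences = [
        "<s> the dog ran home </s>".split(),
        "<s> the dog ran to work </s>".split(),
    ]

    bigrams = Counter()
    histories = Counter()
    for sent in sentences:
        for prev, word in zip(sent, sent[1:]):
            bigrams[(prev, word)] += 1
            histories[prev] += 1

    vocab = {w for sent in sentences for w in sent}
    V = len(vocab)

    def p_bigram(word, prev):
        """Add-one (Laplace) smoothed step probability P(word | prev)."""
        return (bigrams[(prev, word)] + 1) / (histories[prev] + V)

    def p_sentence(words):
        """Decompose the sentence into bigram steps and multiply them."""
        p = 1.0
        for prev, word in zip(words, words[1:]):
            p *= p_bigram(word, prev)
        return p

    # Unlike the whole-sentence MLE, an unseen sentence now gets nonzero probability.
    print(p_sentence("<s> the dog ran to the boy </s>".split()))

Project #1 asks for an interpolated language model, which combines higher- and lower-order estimates; the add-one scheme above is only the simplest stand-in to show where smoothing enters the decomposition.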

This course (CS 479) was taught by Professor Eric Ringger during the Fall 2011 term at BYU.
