Language Modeling

Dan Jurafsky: Intuition of Perplexity, the Shannon Game


...d is our model?
•  Does our language model prefer good sentences to bad ones?
   •  Assign higher probability to “real” or “frequently observed” sentences
   •  Than “ungrammatical” or “rarely observed” sentences?
•  We train parameters of our model on a training set.
•  We test the model’s performance on data we haven’t seen.
   •  A test set is an unseen dataset that is different from our training set, totally unused.
   •  An evaluation metric tells us how well our model does on the test set.

Extrinsic evaluation of N-gram models
•  Best evaluation for comparing models A and B
   •  Put each model in a task
      •  spelling corrector, speech recognizer, MT system
   •  Run the task, get an accuracy for A and for B
      •  How many misspelled words corrected properly
      •  How many words translated correctly
   •  Compare accuracy for A and B

Difficulty of extrinsic (in-vivo) evaluation of N-gram models
•  Extrinsic evaluation
   •  Time-consuming; can take days or weeks
•  So
   •  Sometimes use intrinsic evaluation: perplexity
   •  Bad approximation
      •  unless the test data looks just like the training data
      •  So generally only useful in pilot experiments
   •  But is helpful to think about.

Intuition of Perplexity
•  The Shannon Game:
   •  How well can we predict the next word?
         I always order pizza with cheese and ____
         The 33rd President of the US was ____
         I saw a ____
   •  Unigrams are terrible at this game. (Why?)
         mushrooms    0.1
         pepperoni    0.1
         anchovies    0.01
         ….
         fried rice   0.0001
         ….
         and          1e-100
   [Photo: Claude Shannon]
•  A better model of a text
   •  is one which assigns a higher probability to the word that actu...
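
To make the Shannon Game concrete, here is a minimal sketch that plays one round with a unigram model and a bigram model. The toy corpus, the held-out sentence, the helper names (`train_unigram`, `train_bigram`), and the use of unsmoothed maximum-likelihood estimates are all assumptions made for illustration; none of them come from the slides.

```python
from collections import Counter, defaultdict

# Toy training set (invented for illustration; not from the lecture).
TRAIN = [
    "i always order pizza with cheese and mushrooms",
    "i always order pizza with cheese and pepperoni",
    "she ordered pizza with cheese and mushrooms",
    "the cat and the dog saw the bird",
]

# Held-out test sentence: it does not appear in TRAIN.
TEST = "we like pizza with cheese and mushrooms"


def train_unigram(sentences):
    """Unsmoothed unigram model: P(w) = count(w) / total tokens."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def train_bigram(sentences):
    """Unsmoothed bigram model: P(w | prev) = count(prev, w) / count(prev as a left context)."""
    following = defaultdict(Counter)
    for s in sentences:
        words = s.split()
        for prev, w in zip(words, words[1:]):
            following[prev][w] += 1
    return {
        prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
        for prev, ctr in following.items()
    }


unigram = train_unigram(TRAIN)
bigram = train_bigram(TRAIN)

# Play the game on the last word of the test sentence.
*context, actual_next = TEST.split()
prev_word = context[-1]

# Probability each model assigns to the word that actually occurred.
p_unigram = unigram.get(actual_next, 0.0)
p_bigram = bigram.get(prev_word, {}).get(actual_next, 0.0)

# Each model's single best guess for the blank.
unigram_guess = max(unigram, key=unigram.get)  # ignores the context entirely
bigram_guess = max(bigram[prev_word], key=bigram[prev_word].get)

print(f"unigram: P({actual_next}) = {p_unigram:.4f}, best guess = {unigram_guess!r}")
print(f"bigram:  P({actual_next} | {prev_word}) = {p_bigram:.4f}, best guess = {bigram_guess!r}")
```

On this toy data the unigram model always guesses the most frequent word in the corpus, whatever the context, which is one answer to the "(Why?)" on the slide. The bigram model conditions on the previous word and so assigns a higher probability to the word that actually follows. Because both models are unsmoothed, any word or context absent from the training set would receive probability zero.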