Dan Jurafsky: Faithfulness, P(F|E)


Dan Jurafsky

Fluency: P(E)
•  We need a metric that ranks this sentence:
   That car almost crash to me!
   as less fluent than this one:
   That car almost hit me.
•  Answer: language models (N-grams!)
   P(me|hit) > P(to|crash)
•  And we can use any other, more sophisticated model of grammar
•  Advantage: this is monolingual knowledge!

Faithfulness: P(F|E)
•  Spanish:
   •  Maria no dió una bofetada a la bruja verde
•  English candidate translations:
   •  Mary didn't slap the green witch
   •  Mary not give a slap to the witch green
   •  The green witch didn't slap Mary
   •  Mary slapped the green witch
•  More faithful translations will be composed of phrases that are high-probability translations
   •  How often was "slapped" translated as "dió una bofetada" in a large bitext (parallel English-Spanish corpus)?
•  We'll need to align phrases and words to each other in the bitext

We treat Faithfulness and Fluency as independent factors
•  P(F|E)'s job is to model the "bag of words": which words come from English to Spanish.
•  P(F|E) doesn't have to worry about internal facts about English word order.
•  P(E)'s job is to do bag generation: put the following words in order:
   •  a ground there in the hobbit hole lived a in

Three Problems for Statistical MT
•  Language Model: given E, compute P(E)
   good English string → high P(E)
   random word sequence → low P(E)
•  Translation Model: given (F,E), compute P(F | E)
   (F,E) look like translations → high P(F | E)
   (F,E) don't look like translations → low P(F | E)
•  Decoding algorithm: given LM, TM, F, find Ê
   Find the translation E that maximizes P(E) * P(F | E)

Language Model
•  Use a standard n-gram language model for P(E).
•  Can be trained on a large monolingual corpus
   •  5-gram grammar of English from terabytes of web data
•  More sophisticated parser-based language models can also help

Machine Translation
Introduction to Statistical MT

Machine Translation
Alignment and IBM Model 1
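
To make the decoding rule from "Three Problems for Statistical MT" concrete, here is a minimal Python sketch that picks the candidate E maximizing P(E) * P(F | E) for the Maria/bruja verde example. Everything numeric in it is an assumption for illustration: the six-sentence corpus is invented, the language model is an add-one-smoothed bigram model rather than the 5-gram model mentioned on the slide, and the P(F|E) values in TM_PROB are made up by hand instead of being estimated from an aligned bitext. The helper names (tokens, bigram_prob, lm_logprob, best) are hypothetical.

```python
import math
from collections import Counter

# Toy monolingual English corpus, invented for illustration only.
CORPUS = [
    "that car almost hit me",
    "the car hit me",
    "mary didn't slap the green witch",
    "the green witch didn't slap mary",
    "mary slapped the green witch",
    "the witch didn't give mary a slap",
]

def tokens(sentence):
    """Lowercase, split on whitespace, and add sentence-boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]

context_counts = Counter()  # how often each word appears as a bigram history
bigram_counts = Counter()   # how often each (previous word, word) pair appears
vocab = set()
for line in CORPUS:
    toks = tokens(line)
    vocab.update(toks)
    context_counts.update(toks[:-1])
    bigram_counts.update(zip(toks, toks[1:]))

V = len(vocab) + 1  # one extra type reserved for words never seen in CORPUS

def bigram_prob(word, prev):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + V)

def lm_logprob(sentence):
    """Fluency: log P(E) under the toy bigram language model."""
    toks = tokens(sentence)
    return sum(math.log(bigram_prob(w, p)) for p, w in zip(toks, toks[1:]))

# Faithfulness: hypothetical P(F|E) for
# F = "Maria no dió una bofetada a la bruja verde".
# These numbers are made up; a real system would estimate them from a bitext.
TM_PROB = {
    "Mary didn't slap the green witch": 0.02,
    "Mary not give a slap to the witch green": 0.04,
    "The green witch didn't slap Mary": 0.0002,
    "Mary slapped the green witch": 0.0002,
}

def best(score):
    """Return the candidate translation with the highest score."""
    return max(TM_PROB, key=score)

by_fluency = best(lm_logprob)                           # P(E) only
by_faithfulness = best(lambda e: math.log(TM_PROB[e]))  # P(F|E) only
decoded = best(lambda e: lm_logprob(e) + math.log(TM_PROB[e]))  # the product

print("fluency only:     ", by_fluency)
print("faithfulness only:", by_faithfulness)
print("P(E) * P(F|E):    ", decoded)
```

With these toy numbers, the language model alone prefers the fluent but unfaithful "Mary slapped the green witch", the translation model alone prefers the word-for-word "Mary not give a slap to the witch green", and only the product selects "Mary didn't slap the green witch". The same bigram model also gives P(me|hit) > P(to|crash), as on the Fluency slide. Scores are summed in log space to avoid underflow on longer sentences.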