casa verde    casa verde    la casa    la casa
   1/2           1/2          1/2        1/2

Compute weighted translation counts: C(f_j, e_{a(j)}) += P(a | e, f)

          verde    casa         la
green     1/2      1/2          0
house     1/2      1/2 + 1/2    1/2
the       0        1/2          1/2

Normalize rows to sum to one to estimate P(f | e)

          verde    casa    la
green     1/2      1/2     0
house     1/4      1/2     1/4
the       0        1/2     1/2

Dan Jurafsky

EM example continued

Translation probabilities

          verde    casa    la
green     1/2      1/2     0
house     1/4      1/2     1/4
the       0        1/2     1/2

Recompute alignment probabilities

$P(A, F \mid E) = \prod_{j=1}^{J} t(f_j \mid e_{a_j})$

$P(A \mid E, F) = \frac{P(A, F \mid E)}{\sum_{A} P(A, F \mid E)}$

Sentence pair               Alignment                  P(A, F | E)        Normalize to get P(A | F, E)
green house / casa verde    green-casa, house-verde    1/2 × 1/4 = 1/8    (1/8) / (3/8) = 1/3
green house / casa verde    green-verde, house-casa    1/2 × 1/2 = 1/4    (1/4) / (3/8) = 2/3
the house / la casa         the-la, house-casa         1/2 × 1/2 = 1/4    (1/4) / (3/8) = 2/3
the house / la casa         the-casa, house-la         1/2 × 1/4 = 1/8    (1/8) / (3/8) = 1/3

Continue EM iterations until translation parameters converge

Machine Translation
Learning Word Alignments in IBM Model 1

Machine Translation
Phrase Alignments and the Phrase Table

The Translation Phrase Table

Philipp Koehn’s phrase translations for den Vorschlag
Learned from the Europarl corpus (this table is φ(ē|f); normally we want φ(f|ē)):

English            φ(ē|f)     English            φ(ē|f)
the proposal       0.6227     the suggestions    0.0114
‘s proposal        0.1068     the proposed       0.0114
a proposal         0.0341     the motion         0.0091
the idea           0.0250     the idea of        0.0091
this proposal      0.0227     the proposal ,     0.0068
proposal           0.0205     its proposal       0.0068
of the proposal    0.0159     it                 0.0068
the proposals      0.0159     …                  …

Learning the Translation Phrase Table

1. Get a bitext (a parallel corpus)
2. Align the sentences → E-F sentence pairs
3. Use IBM Model 1 to learn word alignments E→F and F→E
4. Symmetrize the alignments
5. Extract phrases
6. Assign scores

Step 1: Parallel corpora...
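Looking back at the EM example for IBM Model 1, the arithmetic on those slides can be checked with a short script. This is a minimal sketch, not the lecture's own code. The function names (alignment_posteriors, em_iteration) are hypothetical. The uniform 1/3 starting table is an assumption, since the start of the example is not shown here. Following the toy example, only one-to-one alignments are enumerated, whereas full IBM Model 1 also allows NULL and many-to-one alignments.

```python
from fractions import Fraction
from itertools import permutations

# Toy bitext from the slides: (English words, foreign words).
corpus = [
    (["green", "house"], ["casa", "verde"]),
    (["the", "house"], ["la", "casa"]),
]

def alignment_posteriors(t, e_words, f_words):
    """E-step for one sentence pair: P(A | E, F) over one-to-one alignments.

    Simplifying assumption (as in the toy example): alignments are permutations.
    """
    joint = {}
    for a in permutations(range(len(e_words))):
        # a[j] is the index of the English word aligned to foreign word j.
        p = Fraction(1)
        for j, f in enumerate(f_words):
            p *= t[(f, e_words[a[j]])]
        joint[a] = p  # P(A, F | E) = prod_j t(f_j | e_{a_j})
    total = sum(joint.values())
    return {a: p / total for a, p in joint.items()}

def em_iteration(t):
    """One EM iteration: weighted counts C(f, e), then row-normalize per e."""
    counts = {}
    for e_words, f_words in corpus:
        for a, post in alignment_posteriors(t, e_words, f_words).items():
            for j, f in enumerate(f_words):
                key = (f, e_words[a[j]])
                counts[key] = counts.get(key, Fraction(0)) + post
    row_totals = {}
    for (f, e), c in counts.items():
        row_totals[e] = row_totals.get(e, Fraction(0)) + c
    new_t = {key: Fraction(0) for key in t}
    for (f, e), c in counts.items():
        new_t[(f, e)] = c / row_totals[e]
    return new_t

e_vocab = ["green", "house", "the"]
f_vocab = ["verde", "casa", "la"]

# Assumed starting point: uniform t(f | e) = 1/3 for every word pair.
t0 = {(f, e): Fraction(1, 3) for f in f_vocab for e in e_vocab}
t1 = em_iteration(t0)

# Recompute alignment posteriors for "green house" / "casa verde".
post = alignment_posteriors(t1, *corpus[0])
for a, p in post.items():
    links = [f"{corpus[0][0][a[j]]}-{f}" for j, f in enumerate(corpus[0][1])]
    print(links, p)
```

Exact fractions are used instead of floats so the results can be compared directly with the 1/8, 1/4, 1/3 and 2/3 values in the example. Calling em_iteration repeatedly on its own output continues the iterations.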
