# ldc.upenn.edu: large amounts of parallel English-Chinese


Alignment probabilities: each of the two alignments for *casa verde* and each of the two alignments for *la casa* has probability 1/2.

Compute weighted translation counts, $C(f_j, e_{a(j)}) \mathrel{+}= P(a \mid e, f)$:

|       | verde | casa      | la  |
|-------|-------|-----------|-----|
| green | 1/2   | 1/2       | 0   |
| house | 1/2   | 1/2 + 1/2 | 1/2 |
| the   | 0     | 1/2       | 1/2 |

Normalize rows to sum to one to estimate $P(f \mid e)$:

|       | verde | casa | la  |
|-------|-------|------|-----|
| green | 1/2   | 1/2  | 0   |
| house | 1/4   | 1/2  | 1/4 |
| the   | 0     | 1/2  | 1/2 |

Dan Jurafsky

## EM example continued

$$P(A, F \mid E) = \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$

$$P(A \mid E, F) = \frac{P(A, F \mid E)}{\sum_{A} P(A, F \mid E)}$$

Translation probabilities:

|       | verde | casa | la  |
|-------|-------|------|-----|
| green | 1/2   | 1/2  | 0   |
| house | 1/4   | 1/2  | 1/4 |
| the   | 0     | 1/2  | 1/2 |

Recompute alignment probabilities $P(A, F \mid E)$, then normalize to get $P(A \mid F, E)$:

| English     | Foreign    | Alignment                 | $P(A, F \mid E)$ | $P(A \mid F, E)$    |
|-------------|------------|---------------------------|------------------|---------------------|
| green house | casa verde | green→casa, house→verde   | ½ × ¼ = ⅛        | (1/8) / (3/8) = 1/3 |
| green house | casa verde | green→verde, house→casa   | ½ × ½ = ¼        | (1/4) / (3/8) = 2/3 |
| the house   | la casa    | the→la, house→casa        | ½ × ½ = ¼        | (1/4) / (3/8) = 2/3 |
| the house   | la casa    | the→casa, house→la        | ½ × ¼ = ⅛        | (1/8) / (3/8) = 1/3 |

Continue EM iterations until translation parameters converge.

## Machine Translation: Learning Word Alignments in IBM Model 1

## Machine Translation: Phrase Alignments and the Phrase Table

## The Translation Phrase Table

Philipp Koehn's phrase translations for *den Vorschlag*, learned from the Europarl corpus (this table is φ(ē|f); normally we want φ(f|ē)):

| English         | φ(ē\|f) |
|-----------------|---------|
| the proposal    | 0.6227  |
| 's proposal     | 0.1068  |
| a proposal      | 0.0341  |
| the idea        | 0.0250  |
| this proposal   | 0.0227  |
| proposal        | 0.0205  |
| of the proposal | 0.0159  |
| the proposals   | 0.0159  |
| the suggestions | 0.0114  |
| the proposed    | 0.0114  |
| the motion      | 0.0091  |
| the idea of     | 0.0091  |
| the proposal    | 0.0068  |
| its proposal    | 0.0068  |
| it              | 0.0068  |
| …               | …       |

## Learning the Translation Phrase Table

1. Get a bitext (a parallel corpus)
2. Align the sentences → E-F sentence pairs
3. Use IBM Model 1 to learn word alignments E→F and F→E
4. Symmetrize the alignments
5. Extract phrases
6. Assign scores

## Step 1: Parallel corpora...
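The EM iteration in the *green house* / *casa verde* example above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code. It assumes, as the worked example appears to, that only one-to-one alignments are considered (two per sentence pair) and that there is no NULL word; full IBM Model 1 lets every foreign word align independently to any English word. The names `corpus`, `alignment_posteriors`, and `em_step` are hypothetical, and exact fractions are used so the results can be compared with the tables directly.

```python
from fractions import Fraction
from itertools import permutations

# Toy bitext from the example: (English words, foreign words).
corpus = [
    (["green", "house"], ["casa", "verde"]),
    (["the", "house"], ["la", "casa"]),
]


def alignment_posteriors(e_words, f_words, t):
    """Return [(alignment, P(A | E, F))] over one-to-one alignments.

    An alignment is a tuple a in which f_words[j] is generated by
    e_words[a[j]]. Assumption: only permutations are enumerated.
    """
    scored = []
    for a in permutations(range(len(e_words))):
        # P(A, F | E) = product over j of t(f_j | e_{a_j})
        p = Fraction(1)
        for j, f in enumerate(f_words):
            p *= t[(f, e_words[a[j]])]
        scored.append((a, p))
    total = sum(p for _, p in scored)
    # Normalize over the alignments of this sentence pair.
    return [(a, p / total) for a, p in scored]


def em_step(corpus, t):
    """One EM iteration: weighted counts, then row normalization."""
    counts = {}
    for e_words, f_words in corpus:
        for a, posterior in alignment_posteriors(e_words, f_words, t):
            for j, f in enumerate(f_words):
                key = (f, e_words[a[j]])
                # C(f_j, e_{a(j)}) += P(a | e, f)
                counts[key] = counts.get(key, Fraction(0)) + posterior
    # Sum the counts for each English word (one row of the table).
    totals = {}
    for (f, e), c in counts.items():
        totals[e] = totals.get(e, Fraction(0)) + c
    # Normalize each row to sum to one, giving the new t(f | e).
    return {(f, e): c / totals[e] for (f, e), c in counts.items()}


# Uniform initialization of t(f | e) over the foreign vocabulary.
f_vocab = sorted({f for _, fs in corpus for f in fs})
e_vocab = sorted({e for es, _ in corpus for e in es})
t0 = {(f, e): Fraction(1, len(f_vocab)) for f in f_vocab for e in e_vocab}

t1 = em_step(corpus, t0)
posteriors = dict(alignment_posteriors(*corpus[0], t1))

print(t1[("casa", "house")], t1[("verde", "house")])
print(posteriors)
```

Starting from uniform probabilities, one call to `em_step` reproduces the normalized table (for example, `t1[("verde", "house")]` is 1/4), and recomputing the posteriors for the first sentence pair gives 1/3 and 2/3. Pairs that never co-occur, such as *la* and *green*, simply do not appear in `t1`, which corresponds to the zeros in the table. Calling `em_step` repeatedly continues the iterations.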

## This document was uploaded on 02/14/2014.
