VogelEtAl_HMMAlignment - HMM-Based Word Alignment in...

HMM-Based Word Alignment in Statistical Translation

Stephan Vogel, Hermann Ney, Christoph Tillmann
Lehrstuhl für Informatik V, RWTH Aachen
D-52056 Aachen, Germany
{vogel, ney, t [email protected] ormatik.rwth-aachen.de

Abstract

In this paper, we describe a new model for word alignment in statistical translation and present experimental results. The idea of the model is to make the alignment probabilities dependent on the differences in the alignment positions rather than on the absolute positions. To achieve this goal, the approach uses a first-order Hidden Markov model (HMM) for the word alignment problem, as such models are used successfully in speech recognition for the time alignment problem. The difference from the time alignment HMM is that there is no monotonicity constraint on the possible word orderings. We describe the details of the model and test the model on several bilingual corpora.

1 Introduction

In this paper, we address the problem of word alignments for a bilingual corpus. In recent years, there have been a number of papers considering this or similar problems: (Brown et al., 1990), (Dagan et al., 1993), (Kay et al., 1993), (Fung et al., 1993).

In our approach, we use a first-order Hidden Markov model (HMM) (Jelinek, 1976), which is similar, but not identical, to those used in speech recognition. The key component of this approach is to make the alignment probabilities dependent not on the absolute position of the word alignment, but on its relative position; i.e. we consider the differences in the index of the word positions rather than the index itself.

The organization of the paper is as follows. After reviewing the statistical approach to machine translation, we first describe the conventional model (mixture model). We then present our first-order HMM approach in full detail. Finally, we present some experimental results and compare our model with the conventional model.
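The idea of conditioning alignment probabilities on position differences can be illustrated with a minimal sketch. The function name `jump_probabilities`, the `jump_weights` table, and the normalization over the target positions are assumptions made for illustration; the text above does not fix a parametrization.

```python
def jump_probabilities(jump_weights, prev_pos, sentence_len):
    """Return p(i | prev_pos) for i = 1..sentence_len.

    The probability depends only on the jump width i - prev_pos,
    not on the absolute position i. Weights are renormalized over
    the positions available in this sentence (an assumption).
    """
    weights = [jump_weights.get(i - prev_pos, 0.0)
               for i in range(1, sentence_len + 1)]
    total = sum(weights)
    if total == 0.0:
        # No known jump applies: fall back to a uniform distribution.
        return [1.0 / sentence_len] * sentence_len
    return [w / total for w in weights]


# Hypothetical, non-negative weights indexed by jump width.
# Backward jumps (negative widths) are allowed, since there is
# no monotonicity constraint on the word order.
jump_weights = {-1: 1.0, 0: 2.0, 1: 6.0, 2: 1.0}

probs = jump_probabilities(jump_weights, prev_pos=2, sentence_len=4)
for position, p in enumerate(probs, start=1):
    print(position, round(p, 3))
```

Because the table is indexed by jump width, the same weights can be reused for any previous position and any sentence length, which is what distinguishes this scheme from one keyed on absolute positions.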
2 Review: Translation Model

The goal is the translation of a text given in some language F into a target language E. For convenience, we choose French and English as the language pair for the following exposition, i.e. we are given a French string $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into an English string $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible English strings, we will choose the one with the highest probability, which is given by Bayes' decision rule:

$$\hat{e}_1^I = \arg\max_{e_1^I} \{ Pr(e_1^I \mid f_1^J) \} = \arg\max_{e_1^I} \{ Pr(e_1^I) \cdot Pr(f_1^J \mid e_1^I) \}$$

$Pr(e_1^I)$ is the language model of the target language, whereas $Pr(f_1^J \mid e_1^I)$ is the string translation model. The argmax operation denotes the search problem. In this paper, we address the problem of introducing structures into the probabilistic dependencies in order to model the string translation probability $Pr(f_1^J \mid e_1^I)$.

3 Alignment Models

A key issue in modeling the string translation probability $Pr(f_1^J \mid e_1^I)$ is the question of how we define the correspondence between the words of the English sentence and the words of the French sentence. In typical cases, we can assume a sort of pairwise dependence by considering all word pairs $(f_j, e_i)$ for a given sentence pair $[f_1^J; e_1^I]$. We fur-
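The decision rule of Section 2 can be sketched in a few lines. This is only an illustration: the probability tables, the candidate list, and the name `best_translation` are hypothetical, and a real system has to search over all English strings rather than enumerate a short list.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
LANGUAGE_MODEL = {          # Pr(e): language model of the target language
    "the house": 0.6,
    "house the": 0.1,
    "the home": 0.3,
}
TRANSLATION_MODEL = {       # Pr(f | e): string translation model
    ("la maison", "the house"): 0.5,
    ("la maison", "house the"): 0.5,
    ("la maison", "the home"): 0.4,
}


def best_translation(french, candidates):
    """Return the candidate e that maximizes Pr(e) * Pr(f | e)."""
    def score(english):
        # Sum log probabilities instead of multiplying probabilities,
        # which avoids underflow for long strings.
        return (math.log(LANGUAGE_MODEL[english])
                + math.log(TRANSLATION_MODEL[(french, english)]))
    return max(candidates, key=score)


best = best_translation("la maison", list(LANGUAGE_MODEL))
print(best)
```

In this toy setting the language model breaks the tie between the two candidates that the translation model scores equally, which is the division of labor the decision rule describes.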

This note was uploaded on 10/18/2011 for the course CS 479 taught by Professor Eric Ringger during the Fall '11 term at BYU.
