Computational Linguistics

Quarterly (March, June, September, December)
160 pp. per issue
6 3/4 x 10
Founded: 1974
ISSN 0891-2017
E-ISSN 1530-9312
2008 ISI Impact Factor: 2.656

March 2003, Vol. 29, No. 1, Pages 19-51
Posted Online March 13, 2006.
(doi:10.1162/089120103321337421)
© 2003 Association for Computational Linguistics
A Systematic Comparison of Various Statistical Alignment Models

Franz Josef Och

University of Southern California, Information Science Institute (USC/ISI), 4029 Via Marina, Suite 1001, Marina del Rey, CA 90292.

Hermann Ney

RWTH Aachen, Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen-University of Technology, D-52056 Aachen, Germany.

We present and compare various methods for computing word alignments using statistical or heuristic models. We consider the five alignment models presented in Brown, Della Pietra, Della Pietra, and Mercer (1993), the hidden Markov alignment model, smoothing techniques, and refinements. These statistical models are compared with two heuristic models based on the Dice coefficient. We present different methods for combining word alignments to perform a symmetrization of directed statistical alignment models. As the evaluation criterion, we use the quality of the resulting Viterbi alignment compared to a manually produced reference alignment. We evaluate the models on the German-English Verbmobil task and the French-English Hansards task. We perform a detailed analysis of various design decisions of our statistical alignment system and evaluate these on training corpora of various sizes. An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models. In the Appendix, we present an efficient training algorithm for the alignment models presented.
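
The heuristic baseline and the symmetrization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard Dice coefficient over sentence-level co-occurrence counts, a simple per-word argmax linking rule, and intersection and union as the combination methods. The function names and the toy corpus are hypothetical and are not taken from the authors' system, which may differ in all of these choices.

```python
from collections import Counter
from itertools import product


def dice_scores(bitext):
    """Dice coefficient for every co-occurring (source, target) word pair.

    bitext is a list of (source_tokens, target_tokens) sentence pairs.
    Counts are per sentence pair: a word is counted once per sentence.
    dice(s, t) = 2 * C(s, t) / (C(s) + C(t))
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_types, tgt_types = set(src), set(tgt)
        src_count.update(src_types)
        tgt_count.update(tgt_types)
        pair_count.update(product(src_types, tgt_types))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def best_source_per_target(src, tgt, scores):
    """Directed alignment: link each target position j to its best source i."""
    links = set()
    for j, t in enumerate(tgt):
        i = max(range(len(src)), key=lambda k: scores.get((src[k], t), 0.0))
        links.add((i, j))
    return links


def best_target_per_source(src, tgt, scores):
    """Directed alignment: link each source position i to its best target j."""
    links = set()
    for i, s in enumerate(src):
        j = max(range(len(tgt)), key=lambda k: scores.get((s, tgt[k]), 0.0))
        links.add((i, j))
    return links


def symmetrize(a_one, a_two):
    """Combine two directed alignments; returns (intersection, union)."""
    return a_one & a_two, a_one | a_two


if __name__ == "__main__":
    # Hypothetical three-sentence German-English toy corpus.
    toy_bitext = [
        (["das", "haus"], ["the", "house"]),
        (["das", "buch"], ["the", "book"]),
        (["ein", "buch"], ["a", "book"]),
    ]
    scores = dice_scores(toy_bitext)
    src, tgt = toy_bitext[0]
    a_one = best_source_per_target(src, tgt, scores)
    a_two = best_target_per_source(src, tgt, scores)
    intersection, union = symmetrize(a_one, a_two)
    print(sorted(intersection), sorted(union))
```

The intersection keeps only links that both directions agree on, which favors precision, while the union keeps every link proposed by either direction, which favors recall.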
