2014
Author(s):
Yin-Wen Chang
Alexander M. Rush
John DeNero
Michael Collins

2021
Vol 11 (4)
pp. 1868
Author(s):  
Sari Dewi Budiwati
Al Hafiz Akbar Maulana Siagian
Tirana Noor Fatyanosa
Masayoshi Aritsugi

Phrase table combination in pivot approaches can be an effective method for dealing with low-resource language pairs. The common practice for generating phrase tables in pivot approaches is to use standard symmetrization, i.e., grow-diag-final-and. Although some researchers have found that non-standard symmetrization can improve bilingual evaluation understudy (BLEU) scores, it has not been commonly employed in pivot approaches. In this study, we propose a strategy that uses non-standard symmetrization of word alignments in phrase table combination. The appropriate symmetrization is selected based on the highest BLEU score in each direct translation of source–target, source–pivot, and pivot–target for Kazakh–English (Kk–En) and Japanese–Indonesian (Ja–Id). Our experiments show that the proposed strategy outperforms direct translation in Kk–En, with absolute improvements of 0.35 (an 11.3% relative improvement) and 0.22 (a 6.4% relative improvement) BLEU points for 3-gram and 5-gram, respectively. For 3-gram in Ja–Id, the proposed strategy shows an absolute gain of up to 0.11 BLEU points (a 0.9% relative improvement) over direct translation. Our proposed strategy using a small phrase table obtains better BLEU scores than a strategy using a large phrase table. The size of the target-side monolingual corpus and the feature-function weight of the language model (LM) could reduce perplexity scores.
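As a rough illustration of the symmetrization step this abstract turns on, here is a minimal Python sketch, assuming forward and backward word alignments are given as sets of (source, target) index pairs. The heuristic names follow the Moses conventions the paper builds on; the `bleu_of` selection callback is a hypothetical stand-in for the paper's per-direction BLEU-based choice, not the authors' code.

```python
# Minimal sketch of alignment symmetrization heuristics (Moses-style).
# Alignments are sets of (src_index, tgt_index) pairs.

def intersect(fwd, bwd):
    """Intersection: high precision, low recall."""
    return fwd & bwd

def union(fwd, bwd):
    """Union: high recall, low precision."""
    return fwd | bwd

def grow_diag(fwd, bwd):
    """Start from the intersection and grow toward the union,
    adding neighbouring points whose source or target word is
    still unaligned."""
    alignment = intersect(fwd, bwd)
    candidates = union(fwd, bwd)
    neighbours = [(-1, 0), (0, -1), (1, 0), (0, 1),
                  (-1, -1), (-1, 1), (1, -1), (1, 1)]
    added = True
    while added:
        added = False
        for (i, j) in sorted(alignment):
            for (di, dj) in neighbours:
                p = (i + di, j + dj)
                src_free = all(a != p[0] for a, _ in alignment)
                tgt_free = all(b != p[1] for _, b in alignment)
                if p in candidates and p not in alignment and (src_free or tgt_free):
                    alignment.add(p)
                    added = True
    return alignment

HEURISTICS = {"intersection": intersect, "union": union, "grow-diag": grow_diag}

def pick_symmetrization(fwd, bwd, bleu_of):
    """Hypothetical selection step: keep the heuristic whose resulting
    phrase table scores highest on a dev set, applied separately to the
    source-pivot, pivot-target, and source-target directions."""
    return max(HEURISTICS, key=lambda name: bleu_of(HEURISTICS[name](fwd, bwd)))
```

Intersection favours precision and union favours recall; the standard grow-diag-final-and default sits in between, which is why letting each translation direction pick its own variant can pay off in a pivot setting.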


Author(s):  
Peng Yin
Zhou Shu
Yingjun Xia
Tianmei Shen
Xiao Guan
...  

2010
Vol 36 (3)
pp. 295-302
Author(s):  
Sujith Ravi
Kevin Knight

Word alignment is a critical procedure within statistical machine translation (SMT). Brown et al. (1993) provided the most popular word alignment algorithm to date, one that has been implemented in the GIZA (Al-Onaizan et al., 1999) and GIZA++ (Och and Ney, 2003) software and adopted by nearly every SMT project. In this article, we investigate whether this algorithm makes search errors when it computes Viterbi alignments, that is, whether it returns alignments that are sub-optimal according to a trained model.
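To make the search-error question concrete: under IBM Model 1 the Viterbi alignment is exactly computable, since the model factorizes over target words, whereas for the fertility-based models (Model 3 and beyond) exact search is intractable and GIZA++ falls back on hill-climbing from a seed alignment, which is where sub-optimal outputs can slip in. Below is a minimal sketch of the tractable Model 1 case; the `t_table` dictionary of translation probabilities is an assumed input format, not GIZA++'s API.

```python
# Minimal sketch: exact Viterbi alignment under IBM Model 1.
# t_table[(f, e)] holds the learned translation probability t(f | e).

def viterbi_model1(src_words, tgt_words, t_table):
    """Each target word independently picks its most probable source
    word (index 0 is the NULL token), which is exact for Model 1."""
    src = ["NULL"] + src_words
    alignment = []
    for f in tgt_words:
        best_j = max(range(len(src)),
                     key=lambda j: t_table.get((f, src[j]), 1e-12))
        alignment.append(best_j)  # 0 means aligned to NULL
    return alignment
```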


2016
Vol 23 (4)
pp. 327-351
Author(s):  
Hidetaka Kamigaito
Taro Watanabe
Hiroya Takamura
Manabu Okumura
Eiichiro Sumita

2010
Vol 36 (3)
pp. 481-504
Author(s):  
João V. Graça
Kuzman Ganchev
Ben Taskar

Word-level alignment of bilingual text is a critical resource for a growing variety of tasks. Probabilistic models for word alignment present a fundamental trade-off between the richness of captured constraints and correlations on the one hand, and the efficiency and tractability of inference on the other. In this article, we use the Posterior Regularization framework (Graça, Ganchev, and Taskar 2007) to incorporate complex constraints into probabilistic models during learning without changing the efficiency of the underlying model. We focus on the simple and tractable hidden Markov model and present an efficient learning algorithm for incorporating approximate bijectivity and symmetry constraints. Models estimated with these constraints produce a significant boost in performance, as measured by both precision and recall against manually annotated alignments, for six language pairs. We also report experiments on two tasks where word alignments are required, phrase-based machine translation and syntax transfer, and show promising improvements over standard methods.
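A minimal sketch of the constrained E-step at the heart of this approach, assuming the HMM's alignment posteriors arrive as a J x I matrix of probabilities p(a_j = i): the posteriors are projected (in KL divergence) onto the set of distributions whose expected usage of each source word is at most one, an approximate bijectivity constraint, by gradient ascent on the dual. The function name, step size, and iteration count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pr_project(post, max_use=1.0, lr=0.5, iters=100):
    """Project alignment posteriors onto {q : E_q[count(i)] <= max_use}
    by dual (projected) gradient ascent; post has shape (J, I)."""
    J, I = post.shape
    lam = np.zeros(I)  # one dual variable per source word
    for _ in range(iters):
        # Optimal primal for fixed duals: q(a_j = i) proportional to
        # p(a_j = i) * exp(-lam_i), renormalized per target word.
        q = post * np.exp(-lam)[None, :]
        q /= q.sum(axis=1, keepdims=True)
        usage = q.sum(axis=0)  # expected count of each source word
        lam = np.maximum(0.0, lam + lr * (usage - max_use))  # dual ascent
    return q
```

The projected posteriors q then stand in for p in the M-step, so the underlying HMM and its dynamic-programming inference are untouched, which is how the constraints enter "without changing the efficiency of the underlying model."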

