Transformation from Discontinuous to Continuous Word Alignment Improves Translation Quality

Alignment-Enhanced Transformer for Constraining NMT with Pre-Specified Translations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6418 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8886-8893

Author(s):

Kai Song ◽

Kun Wang ◽

Heng Yu ◽

Yue Zhang ◽

Zhongqiang Huang ◽

...

Keyword(s):

Industrial Applications ◽

Practical Significance ◽

Word Alignment ◽

Translation Quality ◽

Highly Effective ◽

Word Alignments ◽

Lexical Constraints

We investigate the task of constraining NMT with pre-specified translations, which has practical significance for a number of research and industrial applications. Existing works impose pre-specified translations as lexical constraints during decoding, which are based on word alignments derived from target-to-source attention weights. However, multiple recent studies have found that word alignment derived from generic attention heads in the Transformer is unreliable. We address this problem by introducing a dedicated head in the multi-head Transformer architecture to capture external supervision signals. Results on five language pairs show that our method is highly effective in constraining NMT with pre-specified translations, consistently outperforming previous methods in translation quality.


Corpus Augmentation for Neural Machine Translation with Chinese-Japanese Parallel Corpora

Applied Sciences ◽

10.3390/app9102036 ◽

2019 ◽

Vol 9 (10) ◽

pp. 2036

Author(s):

Jinyi Zhang ◽

Tadahiro Matsumoto

Keyword(s):

Machine Translation ◽

Scientific Paper ◽

Training Data ◽

Word Alignment ◽

Sentence Pair ◽

Neural Machine Translation ◽

Parallel Corpora ◽

Translation Quality ◽

Parallel Data ◽

Source Sentence

The translation quality of Neural Machine Translation (NMT) systems depends strongly on the training data size. Sufficient amounts of parallel data are, however, not available for many language pairs. This paper presents a corpus augmentation method, which has two variations: one is for all language pairs, and the other is for the Chinese-Japanese language pair. The method uses both source and target sentences of the existing parallel corpus and generates multiple pseudo-parallel sentence pairs from a long parallel sentence pair containing punctuation marks as follows: (1) split the sentence pair into parallel partial sentences; (2) back-translate the target partial sentences; and (3) replace each partial sentence in the source sentence with the back-translated target partial sentence to generate pseudo-source sentences. The word alignment information, which is used to determine the split points, is modified with “shared Chinese character rates” in segments of the sentence pairs. Experimental results on Japanese-Chinese and Chinese-Japanese translation with ASPEC-JC (Asian Scientific Paper Excerpt Corpus, Japanese-Chinese) show that the method substantially improves translation performance. We also supply the code (see Supplementary Materials) that can reproduce our proposed method.
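
The three steps in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' released code: the names `split_partial` and `augment` are hypothetical, the split is done on punctuation alone (the paper additionally uses word alignment information and shared Chinese character rates to choose split points), `back_translate` stands in for any target-to-source translation system, and replacing one partial sentence at a time is one possible reading of step (3).

```python
import re
from typing import Callable, List, Tuple

# Assumption: split points are placed after common ASCII and CJK punctuation.
_SPLIT = re.compile(r"(?<=[,;,;、])\s*")


def split_partial(sentence: str) -> List[str]:
    """Step 1: split a sentence into partial sentences at punctuation marks."""
    return [part for part in _SPLIT.split(sentence) if part]


def augment(
    src: str,
    tgt: str,
    back_translate: Callable[[str], str],
    sep: str = " ",
) -> List[Tuple[str, str]]:
    """Generate pseudo-parallel pairs (pseudo-source, original target)."""
    src_parts = split_partial(src)
    tgt_parts = split_partial(tgt)
    # Assumption: partial sentences are treated as parallel only when both
    # sides split into the same number of segments (at least two).
    if len(src_parts) != len(tgt_parts) or len(src_parts) < 2:
        return []
    pairs = []
    for i, tgt_part in enumerate(tgt_parts):
        # Step 2: back-translate the i-th target partial sentence.
        replacement = back_translate(tgt_part)
        # Step 3: substitute it for the i-th source partial sentence.
        pseudo_src = src_parts[:i] + [replacement] + src_parts[i + 1:]
        pairs.append((sep.join(pseudo_src), tgt))
    return pairs


if __name__ == "__main__":
    # str.upper is a stand-in for a real back-translation model.
    for pseudo, target in augment(
        "the cat sat, the dog ran", "le chat, le chien", str.upper
    ):
        print(pseudo, "|||", target)
```

For Chinese or Japanese text, which is written without spaces, `sep=""` would be the more natural choice when rejoining the partial sentences.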


Large-scale Word Alignment Using Soft Dependency Cohesion Constraints

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00228 ◽

2013 ◽

Vol 1 ◽

pp. 291-300 ◽

Cited By ~ 1

Author(s):

Zhiguo Wang ◽

Chengqing Zong

Keyword(s):

Large Scale ◽

Target Language ◽

Model Parameters ◽

Word Alignment ◽

Soft Constraint ◽

Alignment Quality ◽

Source Language ◽

Discriminative Models ◽

Translation Quality ◽

Gibbs Sampling Algorithm

Dependency cohesion refers to the observation that phrases dominated by disjoint dependency subtrees in the source language generally do not overlap in the target language. It has been verified to be a useful constraint for word alignment. However, previous work either treats this as a hard constraint or uses it as a feature in discriminative models, which is ineffective for large-scale tasks. In this paper, we take dependency cohesion as a soft constraint, and integrate it into a generative model for large-scale word alignment experiments. We also propose an approximate EM algorithm and a Gibbs sampling algorithm to estimate model parameters in an unsupervised manner. Experiments on large-scale Chinese-English translation tasks demonstrate that our model achieves improvements in both alignment quality and translation quality.


A Relationship: Word Alignment, Phrase Table, and Translation Quality

The Scientific World Journal ◽

10.1155/2014/438106 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Liang Tian ◽

Derek F. Wong ◽

Lidia S. Chao ◽

Francisco Oliveira

Keyword(s):

Machine Translation ◽

Ad Hoc ◽

Significant Loss ◽

The Other ◽

Word Alignment ◽

Translation Quality ◽

Theoretical Support ◽

Word Alignments ◽

The Relationship ◽

Pruning Technique

In recent years, researchers have conducted several studies evaluating machine translation quality based on the relationship between word alignments and the phrase table. However, existing methods usually employ ad hoc heuristics without theoretical support. So far, no work has provided a formula describing the relationship among word alignments, the phrase table, and machine translation performance. In this paper, we first formulate such a relationship to estimate the number of extracted phrase pairs given one or more word alignment points. We then propose a corpus-motivated pruning technique to prune the default large phrase table. Experiments show that the derived formula is feasible: it can be used to predict the size of the phrase table, and it serves as a valuable reference for investigating the relationship between translation performance and phrase tables built from different word alignment links. The corpus-motivated pruning results show that nearly 98% of phrases can be removed without any significant loss in translation quality.
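
To make the dependence of phrase table size on alignment points concrete, the following Python sketch counts the phrase pairs that are consistent with a given word alignment by brute-force enumeration. It uses the standard consistency criterion from phrase-based SMT and is not the formula derived in the paper; the function name `count_phrase_pairs` and the `max_len` limit are assumptions made for illustration.

```python
from typing import Set, Tuple


def count_phrase_pairs(
    src_len: int,
    tgt_len: int,
    links: Set[Tuple[int, int]],
    max_len: int = 7,
) -> int:
    """Count (source span, target span) pairs consistent with the alignment.

    A pair is consistent if it contains at least one alignment link and no
    link connects a word inside the pair to a word outside it.
    """
    count = 0
    for i1 in range(src_len):
        for i2 in range(i1, min(src_len, i1 + max_len)):
            for j1 in range(tgt_len):
                for j2 in range(j1, min(tgt_len, j1 + max_len)):
                    inside = 0
                    consistent = True
                    for i, j in links:
                        in_src = i1 <= i <= i2
                        in_tgt = j1 <= j <= j2
                        if in_src and in_tgt:
                            inside += 1
                        elif in_src or in_tgt:
                            # The link crosses the phrase-pair boundary.
                            consistent = False
                            break
                    if consistent and inside > 0:
                        count += 1
    return count


if __name__ == "__main__":
    # Two-word sentences: a full monotone alignment vs. a single link.
    print(count_phrase_pairs(2, 2, {(0, 0), (1, 1)}))
    print(count_phrase_pairs(2, 2, {(0, 0)}))
```

In this toy case, the single-link alignment licenses more phrase pairs than the fully aligned one, because unaligned words can be attached freely to neighboring phrases.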


Efficient Word Alignment with Markov Chain Monte Carlo

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2016-0013 ◽

2016 ◽

Vol 106 (1) ◽

pp. 125-146 ◽

Cited By ~ 1

Author(s):

Robert Östling ◽

Jörg Tiedemann

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Critical Word ◽

Statistical Machine Translation ◽

Monte Carlo Sampling ◽

Word Alignment ◽

Translation Quality ◽

Word Alignments ◽

Selection Of

We present EFMARAL, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and model architecture we are able to surpass the fast_align system, commonly used for performance-critical word alignment, both in computational efficiency and alignment accuracy. Our evaluation shows that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word alignments from EFMARAL than from fast_align, and that translation quality is on par with what is obtained using GIZA++, a tool requiring orders of magnitude more processing time. More generally we hope to convince the reader that Monte Carlo sampling, rather than being viewed as a slow method of last resort, should actually be the method of choice for the SMT practitioner and others interested in word alignment.
