Improving Context-Aware Neural Machine Translation with Target-Side Context

Neural machine translation (NMT) is one of the text generation tasks which has achieved significant improvement with the rise of deep neural networks. However, language-specific problems such as handling the translation of honorifics received little attention. In this paper, we propose a context-aware NMT to promote translation improvements of Korean honorifics. By exploiting the information such as the relationship between speakers from the surrounding sentences, our proposed model effectively manages the use of honorific expressions. Specifically, we utilize a novel encoder architecture that can represent the contextual information of the given input sentences. Furthermore, a context-aware post-editing (CAPE) technique is adopted to refine a set of inconsistent sentence-level honorific translations. To demonstrate the efficacy of the proposed method, honorific-labeled test data is required. Thus, we also design a heuristic that labels Korean sentences to distinguish between honorific and non-honorific styles. Experimental results show that our proposed method outperforms sentence-level NMT baselines both in overall translation quality and honorific translations.

Download Full-text

Effective Use of Target-side Context for Neural Machine Translation

Journal of Natural Language Processing ◽

10.5715/jnlp.28.731 ◽

2021 ◽

Vol 28 (2) ◽

pp. 731-735

Author(s):

Hideya Mino

Keyword(s):

Machine Translation ◽

Neural Machine Translation ◽

Effective Use ◽

Target Side

Download Full-text

Context-Aware Neural Machine Translation Learns Anaphora Resolution

10.18653/v1/p18-1117 ◽

2018 ◽

Cited By ~ 3

Author(s):

Elena Voita ◽

Pavel Serdyukov ◽

Rico Sennrich ◽

Ivan Titov

Keyword(s):

Machine Translation ◽

Anaphora Resolution ◽

Context Aware ◽

Neural Machine Translation

Download Full-text

Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/544 ◽

2020 ◽

Author(s):

Hongfei Xu ◽

Deyi Xiong ◽

Josef van Genabith ◽

Qiuhui Liu

Keyword(s):

Machine Translation ◽

Contextual Information ◽

Computational Cost ◽

Representation Learning ◽

Vital Role ◽

Context Aware ◽

Neural Machine Translation ◽

Gating Mechanism ◽

Sentence Level ◽

Parallel Data

Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and during prediction sentences are independently translated, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. In order to address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware models based on a pre-trained frozen sentence-level translation model in a two-step training manner. The computational cost of these approaches is usually high. In this paper, we propose to make the most of layers pre-trained on sentence-level data in contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts in translation. We find that representations from shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding upon weighted combinations of pre-trained encoder layers' outputs. Instead of separately performing source context and input encoding, we propose to iteratively and jointly encode the source input and its contexts and to generate input-aware context representations with a cross-attention layer and a gating mechanism, which resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on the English-Russian subtitle data and is about twice as fast in training and decoding.

Download Full-text

Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015466 ◽

2019 ◽

Vol 33 ◽

pp. 5466-5473 ◽

Cited By ~ 2

Author(s):

Yingce Xia ◽

Tianyu He ◽

Xu Tan ◽

Fei Tian ◽

Di He ◽

...

Keyword(s):

Machine Translation ◽

English Translation ◽

State Of The Art ◽

Compact Model ◽

Word Embeddings ◽

Simple Method ◽

Neural Machine Translation ◽

German Translation ◽

One Step ◽

Target Side

Sharing source and target side vocabularies and word embeddings has been a popular practice in neural machine translation (briefly, NMT) for similar languages (e.g., English to French or German translation). The success of such wordlevel sharing motivates us to move one step further: we consider model-level sharing and tie the whole parts of the encoder and decoder of an NMT model. We share the encoder and decoder of Transformer (Vaswani et al. 2017), the state-of-the-art NMT model, and obtain a compact model named Tied Transformer. Experimental results demonstrate that such a simple method works well for both similar and dissimilar language pairs. We empirically verify our framework for both supervised NMT and unsupervised NMT: we achieve a 35.52 BLEU score on IWSLT 2014 German to English translation, 28.98/29.89 BLEU scores on WMT 2014 English to German translation without/with monolingual data, and a 22.05 BLEU score on WMT 2016 unsupervised German to English translation.

Download Full-text

Synchronous Bidirectional Neural Machine Translation

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00256 ◽

2019 ◽

Vol 7 ◽

pp. 91-105 ◽

Cited By ~ 8

Author(s):

Long Zhou ◽

Jiajun Zhang ◽

Chengqing Zong

Keyword(s):

Machine Translation ◽

Large Scale ◽

State Of The Art ◽

Target Language ◽

Single Model ◽

Neural Machine Translation ◽

German Translation ◽

Transformer Model ◽

Target Side ◽

Future Information

Existing approaches to neural machine translation (NMT) generate the target language sequence token-by-token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts which can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional–neural machine translation (SB-NMT) that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both of the history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation does not only depend on its previously generated outputs, but also relies on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49, and 1.04 BLEU points, respectively, and obtains the state-of-the-art per- formance on Chinese-English and English- German translation tasks. 1

Download Full-text

A study of BERT for context-aware neural machine translation

Machine Learning ◽

10.1007/s10994-021-06070-y ◽

2022 ◽

Author(s):

Xueqing Wu ◽

Yingce Xia ◽

Jinhua Zhu ◽

Lijun Wu ◽

Shufang Xie ◽

...

Keyword(s):

Machine Translation ◽

Context Aware ◽

Neural Machine Translation

Download Full-text

Tag-less Back-Translation

10.21203/rs.3.rs-465941/v1 ◽

2021 ◽

Author(s):

Idris Abdulmumin ◽

Bashir Shehu Galadanci ◽

Aliyu Garba

Keyword(s):

Machine Translation ◽

Domain Adaptation ◽

Fine Tuning ◽

Huge Amount ◽

Neural Machine Translation ◽

Translation Model ◽

Parallel Data ◽

Back Translation ◽

Authentic Data ◽

Target Side

Abstract An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of the back-translations of the target-side monolingual data. The standard back-translation method has been shown to be unable to efficiently utilize the available huge amount of existing monolingual data because of the inability of translation models to differentiate between the authentic and synthetic parallel data during training. Tagging, or using gates, has been used to enable translation models to distinguish between synthetic and authentic data, improving standard back-translation and also enabling the use of iterative back-translation on language pairs that underperformed using standard back-translation. In this work, we approach back-translation as a domain adaptation problem, eliminating the need for explicit tagging. In the approach - tag-less back-translation - the synthetic and authentic parallel data are treated as out-of-domain and in-domain data respectively and, through pre-training and fine-tuning, the translation model is shown to be able to learn more efficiently from them during training. Experimental results have shown that the approach outperforms the standard and tagged back-translation approaches on low resource English-Vietnamese and English-German neural machine translation.

Download Full-text

Breaking the Corpus Bottleneck for Context-Aware Neural Machine Translation with Cross-Task Pre-training

10.18653/v1/2021.acl-long.222 ◽

2021 ◽

Author(s):

Linqing Chen ◽

Junhui Li ◽

Zhengxian Gong ◽

Boxing Chen ◽

Weihua Luo ◽

...

Keyword(s):

Machine Translation ◽

Context Aware ◽

Neural Machine Translation

Download Full-text