A study of BERT for context-aware neural machine translation

2022 ◽  
Author(s):  
Xueqing Wu ◽  
Yingce Xia ◽  
Jinhua Zhu ◽  
Lijun Wu ◽  
Shufang Xie ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1589 ◽  
Author(s):  
Yongkeun Hwang ◽  
Yanghoon Kim ◽  
Kyomin Jung

Neural machine translation (NMT) is a text generation task that has improved significantly with the rise of deep neural networks. However, language-specific problems such as the translation of honorifics have received little attention. In this paper, we propose a context-aware NMT model to improve the translation of Korean honorifics. By exploiting information from the surrounding sentences, such as the relationship between speakers, the proposed model effectively manages the use of honorific expressions. Specifically, we use a novel encoder architecture that represents the contextual information of the given input sentences. Furthermore, a context-aware post-editing (CAPE) technique is adopted to refine inconsistent sentence-level honorific translations. Demonstrating the efficacy of the proposed method requires honorific-labeled test data, so we also design a heuristic that labels Korean sentences as honorific or non-honorific in style. Experimental results show that the proposed method outperforms sentence-level NMT baselines in both overall translation quality and honorific translation.
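
To make the labeling step concrete, the following is a minimal Python sketch of a rule-based honorific labeler of the kind described, not the paper's actual heuristic: the suffix list and the function name label_honorific are assumptions for the sake of the example, checking only the sentence-final ending.

import re

# Illustrative (assumed) set of polite sentence endings covering the
# hasipsio-che (-습니다/-ㅂ니까/-십시오) and haeyo-che (-세요/-요) styles.
HONORIFIC_ENDINGS = ("니다", "니까", "세요", "셔요", "십시오", "요")

def label_honorific(sentence: str) -> str:
    # Strip trailing punctuation and whitespace before checking the ending.
    core = re.sub(r"[\s.!?…\"']+$", "", sentence)
    # Label as honorific if the sentence ends with a polite ending.
    return "honorific" if core.endswith(HONORIFIC_ENDINGS) else "non-honorific"

print(label_honorific("안녕하세요?"))   # honorific
print(label_honorific("빨리 와."))      # non-honorific

A surface heuristic like this only inspects the final ending, which is why the surrounding-sentence context (speaker relationship) matters for deciding which style the translation should use in the first place.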


Author(s):  
Hongfei Xu ◽  
Deyi Xiong ◽  
Josef van Genabith ◽  
Qiuhui Liu

Existing Neural Machine Translation (NMT) systems are generally trained on large amounts of sentence-level parallel data, and during prediction, sentences are translated independently, ignoring cross-sentence contextual information. This leads to inconsistencies between translated sentences. Context-aware models have been proposed to address this issue. However, document-level parallel data constitutes only a small fraction of the available parallel data, and many approaches build context-aware models on top of a pre-trained, frozen sentence-level translation model in a two-step training procedure, usually at high computational cost. In this paper, we propose to make the most of layers pre-trained on sentence-level data for contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts in translation. We find that representations from the shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding on weighted combinations of the pre-trained encoder layers' outputs. Instead of encoding the source context and the input separately, we propose to encode the source input and its contexts iteratively and jointly, and to generate input-aware context representations with a cross-attention layer and a gating mechanism that resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on English-Russian subtitle data and is about twice as fast in training and decoding.
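
The two ingredients named in this abstract, a learned weighted combination of pre-trained encoder layer outputs and a gated cross-attention over the context, can be sketched as follows. This PyTorch-style module is only an illustration under assumed shapes and names (GatedContextEncoder, layer_weights, and so on are hypothetical), not the authors' released code.

import torch
import torch.nn as nn

class GatedContextEncoder(nn.Module):
    def __init__(self, d_model: int, n_layers: int, n_heads: int = 8):
        super().__init__()
        # One scalar weight per pre-trained sentence-level encoder layer
        # (softmax-normalized below), realizing the weighted layer combination.
        self.layer_weights = nn.Parameter(torch.zeros(n_layers))
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, input_states, context_layer_outputs):
        # input_states: [batch, src_len, d_model] encoding of the current sentence.
        # context_layer_outputs: list of [batch, ctx_len, d_model] tensors, one per
        # pre-trained encoder layer run over the context sentences.
        stacked = torch.stack(context_layer_outputs, dim=0)         # [L, B, T, D]
        weights = torch.softmax(self.layer_weights, dim=0)          # [L]
        context = (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # [B, T, D]

        # Input-aware context: the current sentence attends over the context.
        attended, _ = self.cross_attn(query=input_states, key=context, value=context)

        # Gate decides, per position, how much context to keep and how much to reset.
        g = torch.sigmoid(self.gate(torch.cat([input_states, attended], dim=-1)))
        return input_states + g * attended

In the setting described above, such a block would sit on top of reused sentence-level encoder layers rather than a separately trained context encoder; the sketch is only meant to make the weighted-combination and gating ideas concrete.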


2021 ◽  
Author(s):  
Linqing Chen ◽  
Junhui Li ◽  
Zhengxian Gong ◽  
Boxing Chen ◽  
Weihua Luo ◽  
...  

2019 ◽  
Author(s):  
Takumi Ohtani ◽  
Hidetaka Kamigaito ◽  
Masaaki Nagata ◽  
Manabu Okumura

2020 ◽  
Author(s):  
Bei Li ◽  
Hui Liu ◽  
Ziyang Wang ◽  
Yufan Jiang ◽  
Tong Xiao ◽  
...  

2017 ◽  
Vol 25 (12) ◽  
pp. 2424-2432 ◽  
Author(s):  
Biao Zhang ◽  
Deyi Xiong ◽  
Jinsong Su ◽  
Hong Duan
