A Document-Level Neural Machine Translation Model with Dynamic Caching Guided by Theme-Rheme Information

Author(s):  
Yiqi Tong ◽  
Jiangbin Zheng ◽  
Hongkang Zhu ◽  
Yidong Chen ◽  
Xiaodong Shi
Author(s):  
Xiaomian Kang ◽  
Yang Zhao ◽  
Jiajun Zhang ◽  
Chengqing Zong

Document-level neural machine translation (DocNMT) has yielded attractive improvements. In this article, we systematically analyze the discourse phenomena in Chinese-to-English translation and focus on the most prominent one, namely lexical translation consistency. To alleviate lexical inconsistency, we propose an effective approach that is aware of the words that need to be translated consistently and constrains the model to produce more consistent translations. Specifically, we first introduce a global context extractor to extract the document context and the consistency context. Then, the two types of global context are integrated into an encoder enhancer and a decoder enhancer to improve lexical translation consistency. We create a test set to evaluate lexical consistency automatically. Experiments demonstrate that our approach significantly alleviates lexical translation inconsistency. In addition, it also substantially improves translation quality compared to the sentence-level Transformer.
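A minimal sketch of the enhancer idea described above, assuming PyTorch and a hypothetical module name (ContextEnhancer): sentence-level hidden states attend to an externally extracted global context and are fused with it through a learned gate. This illustrates gated context fusion in general, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ContextEnhancer(nn.Module):
    """Gated fusion of a global (document or consistency) context into hidden states."""
    def __init__(self, d_model: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, states: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # states:  (batch, seq_len, d_model) per-sentence encoder/decoder states
        # context: (batch, ctx_len, d_model) global context from the extractor
        ctx, _ = self.attn(states, context, context)            # attend to global context
        g = torch.sigmoid(self.gate(torch.cat([states, ctx], dim=-1)))
        return g * states + (1.0 - g) * ctx                     # gated combination

enhancer = ContextEnhancer(d_model=512)
out = enhancer(torch.randn(2, 10, 512), torch.randn(2, 30, 512))  # (2, 10, 512)
```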


2018 ◽  
Author(s):  
Sachith Sri Ram Kothur ◽  
Rebecca Knowles ◽  
Philipp Koehn

2018 ◽  
Author(s):  
Lesly Miculicich ◽  
Dhananjay Ram ◽  
Nikolaos Pappas ◽  
James Henderson

Author(s):  
Binh Nguyen ◽  
Binh Le ◽  
Long H.B. Nguyen ◽  
Dien Dinh

Word representation plays a vital role in most Natural Language Processing systems, especially for Neural Machine Translation. It tends to capture the semantics of and similarity between individual words well, but struggles to represent the meaning of phrases or multi-word expressions. In this paper, we investigate a method to generate and use phrase information in a translation model. To generate phrase representations, a Primary Phrase Capsule network is first employed, and the representations are then iteratively enhanced with a Slot Attention mechanism. Experiments on the IWSLT English to Vietnamese, French, and German datasets show that our proposed method consistently outperforms the baseline Transformer and attains competitive results against the scaled Transformer with half as many parameters.
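As a rough illustration of the slot-attention step mentioned above (the capsule-based initialization is omitted, and the module name and hyperparameters are assumptions), the sketch below iteratively pools word vectors into a small set of phrase-level slots:

```python
import torch
import torch.nn as nn

class PhraseSlots(nn.Module):
    """Slot-attention-style iterative pooling of word vectors into phrase slots."""
    def __init__(self, d_model: int, num_slots: int = 4, iters: int = 3):
        super().__init__()
        self.iters = iters
        self.slots_init = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.to_q = nn.Linear(d_model, d_model)
        self.to_k = nn.Linear(d_model, d_model)
        self.to_v = nn.Linear(d_model, d_model)
        self.update = nn.GRUCell(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, words: torch.Tensor) -> torch.Tensor:
        # words: (batch, seq_len, d_model) word-level representations
        b = words.size(0)
        slots = self.slots_init.unsqueeze(0).expand(b, -1, -1).contiguous()
        k, v = self.to_k(words), self.to_v(words)
        for _ in range(self.iters):
            q = self.to_q(slots)
            attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=1)   # slots compete for words
            attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-8)      # weighted mean over words
            updates = attn @ v                                                # (batch, num_slots, d_model)
            slots = self.update(
                updates.reshape(-1, updates.size(-1)),
                slots.reshape(-1, slots.size(-1)),
            ).view_as(slots)
        return slots   # phrase-level representations

phrases = PhraseSlots(256)(torch.randn(2, 12, 256))   # (2, 4, 256)
```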


2018 ◽  
Vol 9 (28) ◽  
pp. 6091-6098 ◽  
Author(s):  
Philippe Schwaller ◽  
Théophile Gaudin ◽  
Dávid Lányi ◽  
Costas Bekas ◽  
Teodoro Laino

Using a text-based representation of molecules, chemical reactions are predicted with a neural machine translation model borrowed from language processing.
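A minimal sketch of this framing, assuming SMILES strings and a commonly used regex-style tokenizer (not necessarily the exact one from the paper): the reactant SMILES become the source "sentence" and the product SMILES the target, so a standard NMT model can be trained on such pairs.

```python
import re

# Simple SMILES tokenizer: bracket atoms, two-letter elements, ring-bond labels,
# single characters, digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|@@|@|[A-Za-z]|\d|[().=#+\-\\/:~*>$])"
)

def tokenize(smiles: str):
    return SMILES_TOKEN.findall(smiles)

# Reactants (and reagents) on the source side, product on the target side,
# exactly as a parallel corpus pairs source and target sentences.
source = tokenize("CC(=O)O.OCC")   # acetic acid + ethanol
target = tokenize("CC(=O)OCC")     # ethyl acetate
print(source)  # ['C', 'C', '(', '=', 'O', ')', 'O', '.', 'O', 'C', 'C']
print(target)  # ['C', 'C', '(', '=', 'O', ')', 'O', 'C', 'C']
```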


2019 ◽  
Vol 1237 ◽  
pp. 052020 ◽
Author(s):  
Mengyao Chen ◽  
Yong Li ◽  
Runqi Li

Author(s):  
Zi-Yi Dou ◽  
Zhaopeng Tu ◽  
Xing Wang ◽  
Longyue Wang ◽  
Shuming Shi ◽  
...  

With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation. However, most of the previous methods combine layers in a static fashion, in that their aggregation strategy is independent of specific hidden states. Inspired by recent progress on capsule networks, in this paper we propose to use routing-by-agreement strategies to aggregate layers dynamically. Specifically, the algorithm learns the probability of a part (individual layer representations) being assigned to a whole (aggregated representations) in an iterative way and combines parts accordingly. We implement our algorithm on top of the state-of-the-art neural machine translation model TRANSFORMER and conduct experiments on the widely used WMT14 English⇒German and WMT17 Chinese⇒English translation datasets. Experimental results across language pairs show that the proposed approach consistently outperforms the strong baseline model and a representative static aggregation model.
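The sketch below is a simplified, assumed illustration of routing-by-agreement applied to stacked layer outputs (a single aggregated "whole" per position, with assignment probabilities normalized across layers); the paper's actual formulation may differ in detail.

```python
import torch
import torch.nn.functional as F

def squash(s: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    # Capsule-style non-linearity that preserves direction and bounds the norm.
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

def route_layers(layer_outputs: torch.Tensor, iters: int = 3) -> torch.Tensor:
    # layer_outputs: (num_layers, batch, seq, d_model) stacked hidden states (the "parts")
    u = layer_outputs
    b = torch.zeros(u.shape[:-1], device=u.device)     # routing logits per part
    for _ in range(iters):
        c = F.softmax(b, dim=0)                        # assignment probabilities over layers
        s = (c.unsqueeze(-1) * u).sum(dim=0)           # weighted sum of parts
        v = squash(s)                                  # aggregated "whole"
        b = b + (u * v.unsqueeze(0)).sum(dim=-1)       # agreement update
    return v                                           # (batch, seq, d_model)

layers = torch.randn(6, 2, 10, 512)   # e.g. outputs of 6 Transformer encoder layers
agg = route_layers(layers)            # (2, 10, 512)
```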

