Replacement of Unknown Words Using an Attention Model in Japanese to English Neural Machine Translation

2018 ◽  
Vol 25 (5) ◽  
pp. 511-525
Author(s):  
Saki Ibe ◽  
Yoshitatsu Matsuda ◽  
Kazunori Yamaguchi


Author(s):  
Jie Zhou ◽  
Ying Cao ◽  
Xuguang Wang ◽  
Peng Li ◽  
Wei Xu

Neural machine translation (NMT) aims to solve machine translation (MT) problems using neural networks and has exhibited promising results in recent years. However, most existing NMT models are shallow, and there is still a performance gap between a single NMT model and the best conventional MT system. In this work, we introduce a new type of linear connection, named fast-forward connections, based on deep Long Short-Term Memory (LSTM) networks, and an interleaved bi-directional architecture for stacking the LSTM layers. Fast-forward connections play an essential role in propagating gradients and building a deep topology of depth 16. On the WMT’14 English-to-French task, we achieve BLEU=37.7 with a single attention model, which outperforms the corresponding single shallow model by 6.2 BLEU points. This is the first time that a single NMT model achieves state-of-the-art performance, outperforming the best conventional model by 0.7 BLEU points. We can still achieve BLEU=36.3 even without using an attention mechanism. After special handling of unknown words and model ensembling, we obtain the best score reported to date on this task, BLEU=40.4. Our models are also validated on the more difficult WMT’14 English-to-German task.
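
For a concrete picture of the architecture described above, the following is a minimal PyTorch sketch: each stacked layer pairs an LSTM with a purely linear shortcut, and layer directions alternate to mimic the interleaved bi-directional stacking. The dimensions, depth, and exact placement of the linear path are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

class FastForwardLSTMLayer(nn.Module):
    # One stacked layer: an LSTM plus a purely linear "fast-forward"
    # path that lets gradients bypass the recurrent nonlinearity.
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.ff = nn.Linear(input_dim, hidden_dim, bias=False)  # linear shortcut

    def forward(self, x):                  # x: (batch, time, input_dim)
        h, _ = self.lstm(x)
        f = self.ff(x)                     # fast-forward path, no nonlinearity
        # The next layer sees both the recurrent output and the shortcut,
        # which is what keeps gradients healthy in a very deep stack.
        return torch.cat([h, f], dim=-1)

class DeepEncoder(nn.Module):
    def __init__(self, emb_dim=256, hidden_dim=256, depth=4):
        super().__init__()
        layers, in_dim = [], emb_dim
        for _ in range(depth):
            layers.append(FastForwardLSTMLayer(in_dim, hidden_dim))
            in_dim = 2 * hidden_dim        # hidden state + shortcut, concatenated
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            if i % 2 == 1:                         # interleaved bi-directional stacking:
                x = torch.flip(x, dims=[1])        # odd layers read right-to-left
                x = torch.flip(layer(x), dims=[1])
            else:
                x = layer(x)
        return x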


2018 ◽  
Vol 6 ◽  
pp. 145-157 ◽  
Author(s):  
Zaixiang Zheng ◽  
Hao Zhou ◽  
Shujian Huang ◽  
Lili Mou ◽  
Xinyu Dai ◽  
...  

Existing Neural Machine Translation (NMT) systems do not explicitly model what has and has not been translated during decoding. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated Past contents and untranslated Future contents, which are modeled by two additional recurrent layers. The Past and Future contents are fed to both the attention model and the decoder states, providing the NMT system with knowledge of the translated and untranslated contents. Experimental results show that the proposed approach significantly improves performance on Chinese-English, German-English, and English-German translation tasks. Specifically, the proposed model outperforms the conventional coverage model in terms of both translation quality and alignment error rate.
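
A minimal sketch of the Past-Future mechanism as described in this abstract, in PyTorch; the choice of GRU cells, the initialization scheme, and all names are assumptions for illustration.

import torch
import torch.nn as nn

class PastFutureTracker(nn.Module):
    # Two extra recurrent layers: Past accumulates what the attention
    # has already translated; Future tracks what remains.
    def __init__(self, ctx_dim, state_dim):
        super().__init__()
        self.past_rnn = nn.GRUCell(ctx_dim, state_dim)
        self.future_rnn = nn.GRUCell(ctx_dim, state_dim)

    def init_states(self, src_summary):
        # Past starts empty (nothing translated yet); Future starts as a
        # summary of the whole source sentence (everything untranslated).
        return torch.zeros_like(src_summary), src_summary

    def step(self, context, past, future):
        # context is the attention vector c_t: the source content consumed
        # at this decoding step. Add it to Past; let Future "spend" it.
        past = self.past_rnn(context, past)
        future = self.future_rnn(context, future)
        return past, future

At each decoding step, the concatenation of the Past and Future states would be fed into both the attention query and the decoder state, so the model always has explicit, coverage-like knowledge of what remains to be translated.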


2020 ◽  
Vol 21 (3) ◽  
Author(s):  
Benyamin Ahmadnia ◽  
Bonnie J. Dorr ◽  
Parisa Kordjamshidi

Maintaining semantic relations between words during the translation process yields more accurate target-language output from Neural Machine Translation (NMT). Although difficult to achieve from training data alone, it is possible to leverage Knowledge Graphs (KGs) to retain source-language semantic relations in the corresponding target-language translation. The core idea is to use KG entity relations as embedding constraints to improve the mapping from source to target. This paper describes two embedding constraints, both of which employ Entity Linking (EL), the assignment of a unique identity to entities, to associate words in training sentences with those in the KG: (1) a monolingual embedding constraint that supports an enhanced semantic representation of the source words through access to relations between entities in a KG; and (2) a bilingual embedding constraint that forces entity relations in the source language to be carried over to the corresponding entities in the target-language translation. The method is evaluated for English-Spanish translation using Freebase as the source of knowledge. Our experimental results show that exploiting KG information not only decreases the number of unknown words in the translation but also improves translation quality.
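
A hypothetical sketch of how such embedding constraints could enter training. The cosine-similarity penalties and the interpolation weights below are assumptions in the spirit of the abstract; the paper's actual formulation may differ.

import torch.nn.functional as F

def kg_constraint_losses(src_emb, tgt_emb, related_pairs, linked_pairs):
    # related_pairs: (i, j) source-word pairs whose entities are related
    # in the KG (monolingual constraint); linked_pairs: (i, k) pairs that
    # EL ties from a source word i to its target-side counterpart k
    # (bilingual constraint). Both penalty forms are assumptions.
    mono = sum(1.0 - F.cosine_similarity(src_emb[i], src_emb[j], dim=0)
               for i, j in related_pairs)
    bi = sum(1.0 - F.cosine_similarity(src_emb[i], tgt_emb[k], dim=0)
             for i, k in linked_pairs)
    return mono, bi

# Training would then minimize something like
#   total = nmt_loss + lam_mono * mono + lam_bi * bi
# where lam_mono and lam_bi are assumed interpolation weights.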


2020 ◽  
Vol E103.D (3) ◽  
pp. 684-694
Author(s):  
Mingming YANG ◽  
Min ZHANG ◽  
Kehai CHEN ◽  
Rui WANG ◽  
Tiejun ZHAO

Author(s):  
Ren Qing-Dao-Er-Ji ◽  
Yila Su ◽  
Nier Wu

With the development of natural language processing and neural machine translation, end-to-end (E2E) neural network models have gradually become the focus of machine translation research because of their high translation accuracy and semantically fluent output. However, problems remain, such as a limited vocabulary and low translation fidelity. In this paper, a discriminant method and a Conditional Random Field (CRF) model were used to segment and label Mongolian stems and affixes when preprocessing the Mongolian-Chinese bilingual corpus. To address the low-fidelity problem, a decoding model combining a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) was constructed, with the GRU performing the target-language decoding. A global attention model was used to obtain bilingual word-alignment information during alignment processing. Finally, translation quality was evaluated with Bilingual Evaluation Understudy (BLEU) and Perplexity (PPL) values. The improved model yields a BLEU value of 25.13 and a PPL value of [Formula: see text]. The experimental results show that the E2E Mongolian-Chinese neural machine translation model improves translation quality and reduces semantic confusion compared with traditional statistical methods and machine translation models based on Recurrent Neural Networks (RNNs).
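
A minimal PyTorch sketch of a GRU decoder with global (Luong-style) attention of the kind used for the bilingual word alignment described above; the dot-product scoring function, the dimensions, and the encoder interface are assumptions, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalAttentionGRUDecoder(nn.Module):
    # GRU decoder with global attention over all encoder states; the
    # attention weights double as soft bilingual word alignments.
    def __init__(self, emb_dim, hidden_dim, vocab_size):
        super().__init__()
        self.gru = nn.GRUCell(emb_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def step(self, y_emb, h, enc_states):
        # y_emb: (batch, emb_dim); enc_states: (batch, src_len, hidden)
        h = self.gru(y_emb, h)                                # decoder state
        scores = torch.bmm(enc_states, self.score(h).unsqueeze(2))
        align = F.softmax(scores.squeeze(2), dim=1)           # alignment weights
        context = torch.bmm(align.unsqueeze(1), enc_states).squeeze(1)
        logits = self.out(torch.cat([h, context], dim=1))     # next-word scores
        return logits, h, align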


2020 ◽  
Vol 31 (11) ◽  
pp. 4688-4698 ◽  
Author(s):  
Biao Zhang ◽  
Deyi Xiong ◽  
Jun Xie ◽  
Jinsong Su
