Efficient and High-Quality Neural Machine Translation with OpenNMT

We present PARABANK, a large-scale English paraphrase dataset that surpasses prior work in both quantity and quality. Following the approach of PARANMT (Wieting and Gimpel, 2018), we train a Czech-English neural machine translation (NMT) system to generate novel paraphrases of English reference sentences. By adding lexical constraints to the NMT decoding procedure, however, we are able to produce multiple high-quality sentential paraphrases per source sentence, yielding an English paraphrase resource with more than 4 billion generated tokens and exhibiting greater lexical diversity. Using human judgments, we also demonstrate that PARABANK’s paraphrases improve over PARANMT on both semantic similarity and fluency. Finally, we use PARABANK to train a monolingual NMT model with the same support for lexically-constrained decoding for sentence rewriting tasks.

Download Full-text

Generating Alignments Using Target Foresight in Attention-Based Neural Machine Translation

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0006 ◽

2017 ◽

Vol 108 (1) ◽

pp. 27-36

Author(s):

Jan-Thorsten Peter ◽

Arne Nix ◽

Hermann Ney

Keyword(s):

Target Word ◽

Machine Translation ◽

Explicit Knowledge ◽

Attention Mechanism ◽

Target Information ◽

High Quality ◽

Neural Machine Translation ◽

Translation Quality ◽

Successful Approach

AbstractNeural machine translation (NMT) has shown large improvements in recent years. The currently most successful approach in this area relies on the attention mechanism, which is often interpreted as an alignment, even though it is computed without explicit knowledge of the target word. This limitation is the most likely reason that the quality of attention-based alignments is inferior to the quality of traditional alignment methods. Guided alignment training has shown that alignments are still capable of improving translation quality. In this work, we propose an extension of the attention-based NMT model that introduces target information into the attention mechanism to produce high-quality alignments. In comparison to the conventional attention-based alignments, our model halves the Aer with an absolute improvement of 19.1% Aer. Compared to GIZA++ it shows an absolute improvement of 2.0% Aer.

Download Full-text

Reinforced Curriculum Learning on Pre-Trained Neural Machine Translation Models

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6513 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9652-9659

Author(s):

Mingjun Zhao ◽

Haijiang Wu ◽

Di Niu ◽

Xiaoli Wang

Keyword(s):

Machine Translation ◽

Expert Knowledge ◽

Model Performance ◽

Training Data ◽

Original Training ◽

Training Set ◽

Learning Problem ◽

High Quality ◽

Neural Machine Translation ◽

Selection Framework

The competitive performance of neural machine translation (NMT) critically relies on large amounts of training data. However, acquiring high-quality translation pairs requires expert knowledge and is costly. Therefore, how to best utilize a given dataset of samples with diverse quality and characteristics becomes an important yet understudied question in NMT. Curriculum learning methods have been introduced to NMT to optimize a model's performance by prescribing the data input order, based on heuristics such as the assessment of noise and difficulty levels. However, existing methods require training from scratch, while in practice most NMT models are pre-trained on big data already. Moreover, as heuristics, they do not generalize well. In this paper, we aim to learn a curriculum for improving a pre-trained NMT model by re-selecting influential data samples from the original training set and formulate this task as a reinforcement learning problem. Specifically, we propose a data selection framework based on Deterministic Actor-Critic, in which a critic network predicts the expected change of model performance due to a certain sample, while an actor network learns to select the best sample out of a random batch of samples presented to it. Experiments on several translation datasets show that our method can further improve the performance of NMT when original batch training reaches its ceiling, without using additional new training data, and significantly outperforms several strong baseline methods.

Download Full-text

METHOD OF SYSTEM ENGINEERING OF NEURAL MACHINE TRANSLATION SYSTEMS

KPI Science News ◽

10.20535/kpisn.2021.2.236939 ◽

2021 ◽

Author(s):

Pavlo P. Maslianko ◽

Yevhenii P. Sielskyi

Keyword(s):

Machine Translation ◽

Data Science ◽

System Engineering ◽

Translation System ◽

High Quality ◽

Neural Machine Translation ◽

Meta Level ◽

Machine Translation System ◽

Productive Process ◽

Translation Systems

Background. There are not many machine translation companies on the market whose products are in demand. These are, for example, free and commercial products such as “GoogleTranslate”, “DeepLTranslator”, “ModernMT”, “Apertium”, “Trident”, to name a few. To implement a more efficient and productive process for developing high-quality neural machine translation systems (NMTS), appropriate scientifically based methods of NMTS engineering are needed in order to get a high-quality and competitive product as quickly as possible. Objective. The purpose of this article is to apply the Eriksson-Penker business profile to the development and formalization of a method for system engineering of NMTS. Methods. The idea behind the neural machine translation system engineering method is to apply the Eriksson-Penker system engineering methodology and business profile to formalize an ordered way to develop NMT systems. Results. The method of developing NMT systems based on the use of system engineering techniques consists of three main stages. At the first stage, the structure of the NMT system is modelled in the form of an Eriksson-Penker business profile. At the second stage, a set of processes is determined that is specific to the class of Data Science systems, and the international CRISP-DM standard. At the third stage, verification and validation of the developed NMTS is carried out. Conclusions. The article proposes a method of system engineering of NMTS based on the modified Erickson-Penker business profile representation of the system at the meta-level, as well as international process standards of Data Science and Data Mining. The effectiveness of using this method was studied on the example of developing a bidirectional English-Ukrainian NMTS EUMT (English-Ukrainian Machine Translator) and it was found that the EUMT system is at least as good as the quality of English-Ukrainian translation of the popular Google Translate translator. The full version code of the EUMT system is published on the GitHub platform and is available at: https://github.com/EugeneSel/EUMT.

Download Full-text