Evaluation of the impact of controlled language on neural machine translation compared to other MT architectures

2019 ◽  
Vol 33 (1-2) ◽  
pp. 179-203 ◽  
Author(s):  
Shaimaa Marzouk ◽  
Silvia Hansen-Schirra
Informatics ◽  
2021 ◽  
Vol 8 (1) ◽  
pp. 7
Author(s):  
Arda Tezcan ◽  
Bram Bulté ◽  
Bram Vanroy

We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.
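The NFR input-augmentation step, retrieving a translation memory match for the input sentence and feeding its target side to the NMT system alongside the source, can be illustrated with a minimal sketch. This is our own simplification, not the article's implementation: the token-level `difflib` similarity ratio stands in for the fuzzy matching techniques the article actually tests, and the `@@@` separator token and threshold are arbitrary choices.

```python
from difflib import SequenceMatcher

def fuzzy_score(src, candidate):
    """Token-level similarity ratio between two sentences (0..1)."""
    return SequenceMatcher(None, src.split(), candidate.split()).ratio()

def retrieve_match(src, memory, threshold=0.5):
    """Return the best (source, target) TM pair above the fuzzy threshold."""
    best = max(memory, key=lambda pair: fuzzy_score(src, pair[0]))
    return best if fuzzy_score(src, best[0]) >= threshold else None

def augment_input(src, memory, sep=" @@@ "):
    """NFR-style input: the source sentence concatenated with the target
    side of its best fuzzy match, separated by a special token."""
    match = retrieve_match(src, memory)
    return src + sep + match[1] if match else src

tm = [("the cat sat on the mat", "die Katze sass auf der Matte"),
      ("dogs bark loudly", "Hunde bellen laut")]
print(augment_input("the cat sat on a mat", tm))
# → the cat sat on a mat @@@ die Katze sass auf der Matte
```

When no match clears the threshold, the source sentence is passed through unchanged, so the NMT system falls back to ordinary translation.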


2017 ◽  
Vol 108 (1) ◽  
pp. 171-182 ◽  
Author(s):  
Jinhua Du ◽  
Andy Way

Abstract Pre-reordering, a preprocessing step that brings source-side word order closer to that of the target side, has proven very helpful for statistical machine translation (SMT) in improving translation quality. But is this also the case in neural machine translation (NMT)? In this paper, we first investigate the impact of pre-reordered source-side data on NMT, and then propose to incorporate the features of the SMT pre-reordering model as input factors into NMT (factored NMT). The features, namely part-of-speech (POS) tags, word classes and reordered indices, are encoded as feature vectors and concatenated to the word embeddings to provide extra knowledge for NMT. Pre-reordering experiments conducted on Japanese↔English and Chinese↔English show that pre-reordering the source-side data for NMT is redundant, and that NMT models trained on pre-reordered data suffer degraded translation performance. However, factored NMT using SMT-based pre-reordering features on Japanese→English and Chinese→English is beneficial, improving on the baseline NMT system by 4.48 and 5.89 relative BLEU points, respectively.
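The factored input described above, feature vectors concatenated to the word embeddings, can be sketched as follows. All vocabulary sizes, embedding dimensions and indices here are invented for illustration; the article's actual feature encodings and dimensions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding tables (sizes chosen only for illustration).
word_emb = rng.normal(size=(1000, 512))  # word embeddings
pos_emb  = rng.normal(size=(50, 16))     # part-of-speech embeddings
cls_emb  = rng.normal(size=(100, 16))    # word-class embeddings
idx_emb  = rng.normal(size=(200, 16))    # reordered-index embeddings

def factored_input(words, pos, classes, reordered_idx):
    """Concatenate feature vectors to word embeddings, as in factored NMT:
    each input position carries its word plus POS, class and index features."""
    return np.concatenate([word_emb[words],
                           pos_emb[pos],
                           cls_emb[classes],
                           idx_emb[reordered_idx]], axis=-1)

x = factored_input([3, 7, 42], [1, 2, 1], [5, 5, 9], [0, 2, 1])
print(x.shape)  # (3, 560): 512 + 3 * 16 per position
```

The encoder then consumes these wider vectors in place of plain word embeddings, so the extra knowledge is available at every position without changing the rest of the architecture.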


2021 ◽  
pp. 1-10
Author(s):  
Payal Khullar

Abstract This article describes an experiment to evaluate the impact of different types of ellipsis discussed in theoretical linguistics on Neural Machine Translation (NMT), using English as the source language and Hindi/Telugu as target languages. Manual evaluation shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, that the frequency of such errors is slightly higher for Telugu than for Hindi, and that translation adequacy improves when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipsis and its resolution for MT, but also hint at a possible correlation between the translation of discourse devices such as ellipsis and the morphological incongruity of the source and target languages. We also observe that not all ellipses are translated poorly or benefit from reconstruction, which argues for differentiated treatment of different ellipsis types in MT research.


2020 ◽  
Vol 46 (2) ◽  
pp. 387-424 ◽  
Author(s):  
Raúl Vázquez ◽  
Alessandro Raganato ◽  
Mathias Creutz ◽  
Jörg Tiedemann

Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate crosslingual shared layer, which we refer to as attention bridge. This layer exploits the semantics from each language and develops into a language-agnostic meaning representation that can be efficiently used for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. Nevertheless, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance in non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we also include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information that is captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
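A minimal sketch of such an attention bridge, following the structured self-attention formulation this line of work builds on: k attention heads each compute a weighted average over the encoder states, so a sentence of any length maps to a fixed-size k x d representation. The dimensions, weights and softmax details here are illustrative assumptions, not the article's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_bridge(H, W1, W2):
    """Map variable-length encoder states H (n x d) to a fixed-size (k x d)
    representation via k attention heads; k is the bridge size studied."""
    A = softmax(W2 @ np.tanh(W1 @ H.T), axis=-1)  # (k x n) attention weights
    return A @ H                                   # (k x d), independent of n

d, d_a, k = 8, 4, 3                 # illustrative dimensions
W1 = rng.normal(size=(d_a, d))
W2 = rng.normal(size=(k, d_a))
for n in (5, 12):                   # two sentence lengths, same output size
    H = rng.normal(size=(n, d))
    print(attention_bridge(H, W1, W2).shape)  # (3, 8) both times
```

Because the output shape depends only on k and d, the bridge output can be fed directly to downstream classifiers or similarity measures, which is what enables the transfer-learning experiments described above.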


2020 ◽  

The article is devoted to studying the impact of the neural machine translation system Google Translate on the quality of translations of texts in the field of pharmacognosy. Today, a translator's work is hard to imagine without information and communication technologies, among which machine translation occupies an important place. Neural machine translation systems are considered to translate at a fairly high level, so their use by a human translator can have a positive impact. The aim of the study was therefore to conduct an experiment determining the impact of a neural machine translation system on the quality of translations of pharmacognosy texts, measured by the number of errors and the correctness of translated terminology. The article formulates the research hypothesis, describes the text and the neural machine translation system selected for the study, explains the procedure for counting errors in the translations and for calculating the percentage of correctly translated terminology, and provides quantitative experimental data, with the results illustrated in tables and figures. The experiment was conducted in the first semester of the 2020/2021 academic year (September) on an excerpt from a pharmacognosy text, which was translated by Google Translate and by a bachelor-level translation student. Both translations were checked for the quantity and types of errors, as well as for the correctness of domain-specific terminology.
The results refuted our hypothesis: the translation produced by Google Translate was worse than the student's, both in the number of errors and in the percentage of correctly translated terminology.
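The terminology-correctness percentage used in the study can be approximated with a simple automatic check. This is only a stand-in for the article's manual procedure, and the glossary and sentence below are invented examples:

```python
def terminology_correctness(term_pairs, translation):
    """Percentage of glossary terms whose expected target rendering
    appears in the translation (simplified substring check)."""
    hits = sum(1 for _, target in term_pairs
               if target.lower() in translation.lower())
    return 100.0 * hits / len(term_pairs)

# Hypothetical pharmacognosy glossary of (source term, expected rendering).
glossary = [("alkaloid", "alkaloid"), ("crude drug", "raw material"),
            ("tincture", "tincture"), ("decoction", "decoction")]
mt_output = "The alkaloid content of the tincture and the decoction was measured."
print(terminology_correctness(glossary, mt_output))  # → 75.0
```

In the study itself the comparison was done by hand, which additionally catches terms that appear in the output but are used incorrectly; a substring check cannot do that.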

