An Oblivious Approach to Machine Translation Quality Estimation

Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2090
Author(s):  
Itamar Elmakias ◽  
Dan Vilenchik

Machine translation (MT) is used by millions of people daily, so evaluating the quality of such systems is an important task. While human expert evaluation of MT output remains the most accurate method, it is not scalable by any means. Automatic procedures that perform the task of Machine Translation Quality Estimation (MT-QE) are typically trained on a large corpus of source–target sentence pairs labeled with human judgment scores, and the test set is typically drawn from the same distribution as the training set. Recently, however, interest in low-resource and unsupervised MT-QE has gained momentum. In this paper, we define and study a further restriction of the unsupervised MT-QE setting that we call oblivious MT-QE. Besides having no access to human judgment scores, the algorithm has no access to the test text’s distribution. We propose an oblivious MT-QE system based on a new notion of sentence cohesiveness that we introduce. We tested our system on standard competition datasets for various language pairs. In all cases, the performance of our system was comparable to that of the non-oblivious baseline system provided by the competition organizers. Our results suggest that reasonable MT-QE can be carried out even in the restrictive oblivious setting.
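The abstract does not spell out how sentence cohesiveness is computed, so the following is a minimal, hypothetical sketch of an oblivious score: the mean pairwise cosine similarity of word vectors within the translated sentence, using no human judgment scores and no statistics of the test distribution. The toy embedding table stands in for real pretrained vectors.

```python
# Hypothetical oblivious QE score: mean pairwise cosine similarity of the
# word vectors in a translated sentence. The paper's exact definition of
# cohesiveness is not given in the abstract; this is one plausible proxy.
import numpy as np

def cohesiveness(sentence, embed):
    """embed: maps a token to a fixed-size vector (any pretrained table works)."""
    vecs = [embed(tok) for tok in sentence.split()]
    if len(vecs) < 2:
        return 0.0
    sims = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            a, b = vecs[i], vecs[j]
            sims.append(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)))
    return sum(sims) / len(sims)

# Toy embedding table standing in for real pretrained vectors.
rng = np.random.default_rng(0)
table = {}
def toy_embed(tok):
    return table.setdefault(tok, rng.normal(size=50))

print(cohesiveness("the cat sat on the mat", toy_embed))
```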

Author(s):  
A.V. Kozina ◽  
Yu.S. Belov

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality estimation is understood as predicting the quality of a translation without access to a reference translation. Translation quality depends on the specific machine translation system and often requires post-editing. Manual editing is a long and expensive process, and as the need to determine translation quality quickly grows, automation is required. In this paper, we propose a quality assessment method based on ensemble supervised machine learning. The bilingual WMT 2019 corpus for the English-Russian language pair was used as data: 17,089 sentences in total, with 85% used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used to train the system, since these characteristics can most accurately characterize the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford POS Tagger part-of-speech tagger. The text was preprocessed before training. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest, implemented in Python with the scikit-learn library. The parameters of the Random Forest method were optimized using a grid search. Model performance was assessed by the mean absolute error (MAE), the root mean square error (RMSE), and the Pearson coefficient, which measures correlation with human judgment. Testing was carried out using three machine translation systems: the Google and Bing neural systems, and Moses statistical machine translation, phrase-based and syntax-based. Overall, the Extra Trees method performed best, and for all categories of indicators considered, the best results were achieved with the Google machine translation system. The developed method showed good results close to human judgment and can be used for further research on the task of assessing translation quality.
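The pipeline described above maps directly onto scikit-learn. In the sketch below, a random feature matrix stands in for the extracted linguistic features (the SRILM- and POS-based indicators) and synthetic labels stand in for the WMT 2019 quality scores; the three regressors, the 85/15 split, the grid search for Random Forest, and the MAE/RMSE/Pearson evaluation follow the abstract.

```python
# Sketch of the training/evaluation pipeline. X stands in for the linguistic
# features extracted in the paper; y stands in for the quality labels.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=1000)

# 85% / 15% split, as in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.15, random_state=0)

models = {
    "Bagging": BaggingRegressor(random_state=0),
    "Extra Trees": ExtraTreesRegressor(random_state=0),
    # Random Forest hyperparameters optimized by grid search, as in the paper.
    "Random Forest": GridSearchCV(
        RandomForestRegressor(random_state=0),
        {"n_estimators": [100, 300], "max_depth": [None, 10]},
    ),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    mae = mean_absolute_error(y_te, pred)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    r, _ = pearsonr(y_te, pred)
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f} Pearson={r:.3f}")
```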


Informatics ◽  
2021 ◽  
Vol 8 (3) ◽  
pp. 61
Author(s):  
Hannah Béchara ◽  
Constantin Orăsan ◽  
Carla Parra Escartín ◽  
Marcos Zampieri ◽  
William Lowe

As Machine Translation (MT) becomes increasingly ubiquitous, so does its use in professional translation workflows. However, its proliferation in the translation industry has brought about new challenges in the field of Post-Editing (PE). We are now faced with a need to find effective tools to assess the quality of MT systems to avoid underpayments and mistrust by professional translators. In this scenario, one promising field of study is MT Quality Estimation (MTQE), as this aims to determine the quality of an automatic translation and, indirectly, its degree of post-editing difficulty. However, its impact on the translation workflows and the translators’ cognitive load is still to be fully explored. We report on the results of an impact study engaging professional translators in PE tasks using MTQE. To assess the translators’ cognitive load we measure their productivity both in terms of time and effort (keystrokes) in three different scenarios: translating from scratch, post-editing without using MTQE, and post-editing using MTQE. Our results show that good MTQE information can improve post-editing efficiency and decrease the cognitive load on translators. This is especially true for cases with low MT quality.
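As a rough illustration of the productivity comparison, the sketch below aggregates time and keystrokes per scenario; the record layout and numbers are invented stand-ins, not the study’s data.

```python
# Hypothetical aggregation of per-segment productivity measurements across
# the three scenarios described above.
from statistics import mean

records = [  # (scenario, seconds per segment, keystrokes)
    ("from_scratch", 95.0, 410),
    ("pe_without_mtqe", 70.0, 260),
    ("pe_with_mtqe", 55.0, 180),
    ("from_scratch", 88.0, 395),
    ("pe_with_mtqe", 60.0, 200),
]

for scenario in ("from_scratch", "pe_without_mtqe", "pe_with_mtqe"):
    rows = [r for r in records if r[0] == scenario]
    print(scenario,
          f"time={mean(r[1] for r in rows):.1f}s",
          f"keystrokes={mean(r[2] for r in rows):.0f}")
```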


2017 ◽  
Vol 108 (1) ◽  
pp. 343-354 ◽  
Author(s):  
Carla Parra Escartín ◽  
Hanna Béchara ◽  
Constantin Orăsan

Post-Editing of Machine Translation (MT) has become a reality in professional translation workflows. In order to optimize the management of projects that use post-editing and avoid underpayments and mistrust from professional translators, effective tools to assess the quality of MT systems need to be put in place. One field of study that could address this problem is Machine Translation Quality Estimation (MTQE), which aims to determine the quality of MT without an existing reference. Accurate and reliable MTQE can help project managers and translators alike, as it allows a more precise estimate of the cost of post-editing projects, in terms of both time and adequate rates, by discarding those segments that are not worth post-editing (PE) and must be translated from scratch. In this paper, we report on the results of an impact study which engages professional translators in PE tasks using MTQE. We measured translators’ productivity in different scenarios: translating from scratch, post-editing without using MTQE, and post-editing using MTQE. Our results show that QE information, when accurate, improves post-editing efficiency.
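One way to picture the cost estimation described above is a threshold rule over per-segment QE scores: segments predicted to fall below some quality floor are routed to translation from scratch, the rest to post-editing. The sketch below is hypothetical; the threshold and per-word rates are illustrative placeholders.

```python
# Hypothetical project-cost estimate driven by per-segment QE scores
# (assumed here to be on a 0-100 scale).
def estimate_cost(segments, threshold=40.0, pe_rate=0.05, scratch_rate=0.12):
    """segments: list of (word_count, qe_score) pairs; rates are per word."""
    cost = 0.0
    for words, score in segments:
        # Segments below the threshold are not worth post-editing.
        cost += words * (pe_rate if score >= threshold else scratch_rate)
    return cost

print(estimate_cost([(12, 75.0), (20, 30.0), (8, 55.0)]))
```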


2013 ◽  
Vol 27 (3-4) ◽  
pp. 281-301 ◽  
Author(s):  
Jesús González-Rubio ◽  
J. Ramón Navarro-Cerdán ◽  
Francisco Casacuberta

2021 ◽  
pp. 1-10
Author(s):  
Zhiqiang Yu ◽  
Yuxin Huang ◽  
Junjun Guo

It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions. Thai-Lao is a typical low-resource language pair with only a tiny parallel corpus, leading to suboptimal NMT performance. However, Thai and Lao have considerable similarities in linguistic morphology, and a bilingual lexicon for them is relatively easy to obtain. To exploit this, we first build a bilingual similarity lexicon composed of pairs of similar words. We then propose a novel NMT architecture to leverage the similarity between Thai and Lao. Specifically, besides the prevailing sentence encoder, we introduce an extra similarity lexicon encoder into the conventional encoder-decoder architecture, by which the semantic information carried by the similarity lexicon can be represented. We further provide a simple mechanism in the decoder to balance the information representations delivered from the input sentence and the similarity lexicon. Our approach can fully exploit the linguistic similarity captured by the similarity lexicon to improve translation quality. Experimental results demonstrate that our approach achieves significant improvements over the state-of-the-art Transformer baseline system and previous similar works.
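The abstract does not give the exact formulation of the balancing mechanism; one plausible reading, sketched below in PyTorch, is a sigmoid context gate that mixes, at each decoding step, the context vector from the sentence encoder with the one from the similarity lexicon encoder.

```python
# One plausible reading of the decoder's balancing mechanism, not the
# paper's exact formulation: a per-dimension sigmoid gate over the two
# context vectors.
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(3 * dim, dim)

    def forward(self, dec_state, sent_ctx, lex_ctx):
        # g in (0, 1) decides, per dimension, how much of each context to keep.
        g = torch.sigmoid(self.gate(torch.cat([dec_state, sent_ctx, lex_ctx], dim=-1)))
        return g * sent_ctx + (1.0 - g) * lex_ctx

gate = ContextGate(dim=512)
s, c_sent, c_lex = (torch.randn(2, 512) for _ in range(3))
fused = gate(s, c_sent, c_lex)
print(fused.shape)  # torch.Size([2, 512])
```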


Author(s):  
Raj Dabre ◽  
Atsushi Fujita

In encoder-decoder based sequence-to-sequence modeling, the most common practice is to stack a number of recurrent, convolutional, or feed-forward layers in the encoder and decoder. While the addition of each new layer improves the sequence generation quality, it also leads to a significant increase in the number of parameters. In this paper, we propose to share parameters across all layers, leading to a recurrently stacked sequence-to-sequence model. We report on an extensive case study on neural machine translation (NMT) using our proposed method, experimenting with a variety of datasets. We empirically show that the translation quality of a model that recurrently stacks a single layer 6 times, despite its significantly fewer parameters, approaches that of a model that stacks 6 different layers. We also show how our method can benefit from a prevalent way of improving NMT, i.e., extending the training data with pseudo-parallel corpora generated by back-translation. We then analyze the effects of recurrently stacked layers by visualizing the attention of models that use recurrently stacked layers and models that do not. Finally, we explore the limits of parameter sharing, where we share even the parameters between the encoder and decoder in addition to recurrent stacking of layers.
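Recurrent stacking itself is easy to picture: instead of instantiating six distinct layers, one layer is created and applied six times, so the parameter count stays that of a single layer. A minimal PyTorch sketch, with illustrative hyperparameters rather than the paper’s:

```python
# Recurrent stacking: one shared encoder layer applied repeatedly, so the
# model is "6 layers deep" with the parameters of a single layer.
import torch
import torch.nn as nn

class RecurrentlyStackedEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, n_steps=6):
        super().__init__()
        # A single layer; the same weights are reused at every step.
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.n_steps = n_steps

    def forward(self, x):
        for _ in range(self.n_steps):
            x = self.layer(x)
        return x

enc = RecurrentlyStackedEncoder()
out = enc(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 512])
```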


2018 ◽  
Vol 8 (6) ◽  
pp. 3512-3514
Author(s):  
D. Chopra ◽  
N. Joshi ◽  
I. Mathur

Machine translation (MT) has been a topic of intense research during the last sixty years, but improving its quality is still considered an open problem. In the current paper, we discuss improvements in MT quality through the use of an ensemble approach. We performed MT from English to Hindi using the 6 different MT engines described in this paper. We found that the quality of MT is improved by using a combination of various approaches, compared to the simple baseline approach for performing MT from source to target text.
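The combination strategy is not detailed in the abstract; as one hypothetical illustration of ensemble selection, the sketch below picks, among the candidate translations produced by different engines, the one with the highest average word overlap with the others (a crude minimum-Bayes-risk-style consensus).

```python
# Hypothetical consensus selection among candidate translations from
# multiple MT engines; not necessarily the paper's combination method.
def overlap(a, b):
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / max(len(wa | wb), 1)

def pick_consensus(candidates):
    best, best_score = None, -1.0
    for i, c in enumerate(candidates):
        others = [o for j, o in enumerate(candidates) if j != i]
        # Average similarity to the other engines' outputs.
        score = sum(overlap(c, o) for o in others) / max(len(others), 1)
        if score > best_score:
            best, best_score = c, score
    return best

outputs = [
    "vah ghar ja raha hai",
    "vah ghar ja raha hai",
    "vah makaan ja raha tha",
]
print(pick_consensus(outputs))
```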


Author(s):  
Yang Zhao ◽  
Jiajun Zhang ◽  
Yu Zhou ◽  
Chengqing Zong

Knowledge graphs (KGs) store a wealth of structured information on various entities, many of which are not covered by the parallel sentence pairs used to train neural machine translation (NMT). To improve the translation quality of these entities, in this paper we propose a novel KG-enhanced NMT method. Specifically, we first induce new translations of these entities by transforming the source and target KGs into a unified semantic space. We then generate adequate pseudo parallel sentence pairs that contain these induced entity pairs. Finally, the NMT model is jointly trained on the original and pseudo sentence pairs. Extensive experiments on Chinese-to-English and English-to-Japanese translation tasks demonstrate that our method significantly outperforms strong baseline models in translation quality, especially in handling the induced entities.
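A hypothetical sketch of the pseudo-parallel-pair step: induced entity translations are substituted into placeholder template pairs to yield extra training sentences containing the new entities. The templates and entity pairs below are invented for illustration.

```python
# Hypothetical generation of pseudo parallel pairs from induced entity
# translations; templates and entities are illustrative placeholders.
templates = [
    ("I visited <ENT> last year.", "Wo qunian fangwen le <ENT>."),
]
induced_entities = [
    ("Mount Tai", "Taishan"),
    ("West Lake", "Xihu"),
]

pseudo_pairs = [
    (src.replace("<ENT>", e_src), tgt.replace("<ENT>", e_tgt))
    for src, tgt in templates
    for e_src, e_tgt in induced_entities
]
for pair in pseudo_pairs:
    print(pair)
```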


2020 ◽  
Vol 30 (01) ◽  
pp. 2050002
Author(s):  
Taichi Aida ◽  
Kazuhide Yamamoto

Current methods of neural machine translation may generate sentences with different levels of quality. Methods for automatically evaluating machine translation output can be broadly classified into two types: methods that use human post-edited translations to train an evaluation model, and methods that use a reference translation as the correct answer during evaluation. On the one hand, post-edited translations are difficult to prepare because each word must be tagged in comparison with the original translated sentences. On the other hand, users who actually employ a machine translation system do not have a correct reference translation. Therefore, we propose a method that trains the evaluation model without human post-edited sentences and, at test time, estimates the quality of output sentences without reference translations. We define several indices and predict translation quality with a regression model. As the quality of a translated sentence, we employ the BLEU score calculated from the number of word n-gram matches between the translated sentence and the reference translation. We then compute the correlation between the quality scores predicted by our method and BLEU actually computed from references. According to the experimental results, the correlation with BLEU is highest when XGBoost uses all the indices. Moreover, looking at each index, we find that the sentence log-likelihood and the model uncertainty, which are based on the joint probability of generating the translated sentence, are important in BLEU estimation.
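The regression step can be sketched as follows, assuming the xgboost package is available; the two indices (sentence log-likelihood and a model uncertainty score) are random stand-ins for values that would come from the NMT model, and the target is BLEU computed from references on the training portion only.

```python
# Sketch of reference-free BLEU estimation: fit a regressor from
# per-sentence indices to BLEU, then check correlation on held-out data.
# The indices here are synthetic stand-ins for model-derived values.
import numpy as np
from scipy.stats import pearsonr
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
log_likelihood = rng.normal(-20, 5, size=500)
uncertainty = rng.normal(0.3, 0.1, size=500)
X = np.column_stack([log_likelihood, uncertainty])
bleu = 0.04 * log_likelihood - 0.5 * uncertainty + rng.normal(0, 0.05, 500)

model = XGBRegressor(n_estimators=200).fit(X[:400], bleu[:400])
pred = model.predict(X[400:])
r, _ = pearsonr(bleu[400:], pred)
print(f"Pearson correlation with BLEU: {r:.3f}")
```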

