scholarly journals Quality assessment of machine translation output

2020 ◽  
Vol 13 (2) ◽  
pp. 271-285
Author(s):  
Ramunė Kasperavičienė ◽  
Jurgita Motiejūnienė ◽  
Irena Patašienė

Despite fast development of machine translation, the output quality is less than acceptable in certain language pairs. The aim of this paper is to determine the types of errors in machine translation output that cause comprehension problems to potential readers. The study is based on a reading task experiment using eye tracking and a retrospective survey as a complementary method to add more value to the research as eye tracking as a method is considered to be problematic and challenging (O’BRIEN, 2009; ALVES et al., 2009). The cognitive evaluation approach is used in an eye tracking experiment to determine the complexity of the errors in the English–Lithuanian language pair from easiest to hardest as seen by the readers of a machine-translated text. The tested parameters – gaze time and fixation count – demonstrate that a different amount of cognitive effort is required to process different types of errors in machine-translated texts. The current work aims at contributing to other research in the Translation Studies field by providing the analysis of error assessment of machine translation output.

2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine Translation is a machine that is going to automatically translate given sentences in a language to other particular language. This paper aims to test the effectiveness of a new model of machine translation which is factored machine translation. We compare the performance of the unfactored system as our baseline compared to the factored model in terms of BLEU score. We test the model in German-English language pair using Europarl corpus. The tools we are using is called MOSES. It is freely downloadable and use. We found, however, that the unfactored model scored over 24 in BLEU and outperforms the factored model which scored below 24 in BLEU for all cases. In terms of words being translated, however, all of factored models outperforms the unfactored model.


2016 ◽  
Vol 1 (1) ◽  
pp. 45-49
Author(s):  
Avinash Singh ◽  
Asmeet Kour ◽  
Shubhnandan S. Jamwal

The objective behind this paper is to analyze the English-Dogri parallel corpus translation. Machine translation is the translation from one language into another language. Machine translation is the biggest application of the Natural Language Processing (NLP). Moses is statistical machine translation system allow to train translation models for any language pair. We have developed translation system using Statistical based approach which helps in translating English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system gives accuracy of 80% in translating English to Dogri and the system gives accuracy of 87% in translating Dogri to English system.


2021 ◽  
pp. 1-10
Author(s):  
Zhiqiang Yu ◽  
Yuxin Huang ◽  
Junjun Guo

It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions. Thai-Lao is a typical low-resource language pair of tiny parallel corpus, leading to suboptimal NMT performance on it. However, Thai and Lao have considerable similarities in linguistic morphology and have bilingual lexicon which is relatively easy to obtain. To use this feature, we first build a bilingual similarity lexicon composed of pairs of similar words. Then we propose a novel NMT architecture to leverage the similarity between Thai and Lao. Specifically, besides the prevailing sentence encoder, we introduce an extra similarity lexicon encoder into the conventional encoder-decoder architecture, by which the semantic information carried by the similarity lexicon can be represented. We further provide a simple mechanism in the decoder to balance the information representations delivered from the input sentence and the similarity lexicon. Our approach can fully exploit linguistic similarity carried by the similarity lexicon to improve translation quality. Experimental results demonstrate that our approach achieves significant improvements over the state-of-the-art Transformer baseline system and previous similar works.


2009 ◽  
Vol 54 (3) ◽  
pp. 588-604 ◽  
Author(s):  
Barbara Dragsted ◽  
Inge Gorm Hansen

Abstract This article reports on a comparative study of written translation and sight translation, drawing on experimental data combining keystroke logging, eye-tracking and quality ratings of spoken and written output produced by professional translators and interpreters. Major differences in output rate were observed when comparing oral and written modalities. Evaluation of the translation products showed that the lower output rate in the written condition was not justified by significantly higher quality in the written products. Observations from the combination of data sources point to fundamental behavioural differences between interpreters and translators. Overall, working in the oral modality seems to have a lot to offer in terms of saving time and effort without compromising the output quality, and there seems to be a case for increasing the role of oral translation in translator training, incorporating it as a deliberate practice activity.


2016 ◽  
Vol 42 (2) ◽  
pp. 163-205 ◽  
Author(s):  
Arianna Bisazza ◽  
Marcello Federico

Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orient the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.


Sign in / Sign up

Export Citation Format

Share Document