Elimination of Machine Translation Errors in English Language Transformation

Author(s):  
Panpan Du
2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine translation is the automatic translation of sentences from one language into another. This paper tests the effectiveness of a new model of machine translation, factored machine translation. We compare the unfactored system, our baseline, with the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus. The tool we use is MOSES, which is free to download and use. We found, however, that the unfactored model scored over 24 BLEU and outperformed the factored models, which scored below 24 BLEU in all cases. In terms of words being translated, however, all of the factored models outperform the unfactored model.
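
For readers unfamiliar with the metric, the following is a minimal sketch of a corpus-level BLEU computation (clipped n-gram precision up to 4-grams, one reference per sentence, brevity penalty, no smoothing). It is an illustration only and is not the scoring script the authors used; the function names `ngrams` and `corpus_bleu` and the whitespace tokenization are assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # n-grams produced by the system
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()  # assumption: whitespace tokenization
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)
```

Comparing two systems then amounts to calling `corpus_bleu` once per system on the same reference set and comparing the two numbers.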


Relay Journal ◽  
2019 ◽  
pp. 228-235
Author(s):  
Paul J. Moore ◽  
Phil Murphy ◽  
Luann Pascucci ◽  
Scott Sustenance

This paper reports on an ongoing study into the affordances of free online machine translation for students learning English as a foreign language (EFL) at the tertiary level in Japan. The researchers are currently collecting data from a questionnaire, task performance, and interviews with 10-15 EFL learners in an English Language Institute at a university in Japan. The paper provides some background on the changing role of translation in language learning theory and pedagogy, before focusing on literature related to technical developments in machine translation technology and its application to foreign language learning. An overview of the research methodology is provided, along with some insights into potential findings. Findings will be presented in subsequent publications.


2010 ◽  
Vol 7 (2) ◽  
pp. 57-73
Author(s):  
Primož Jurko

Phraseology is seen as one of the key elements and arguably the most productive part of any language. The paper is focused on collocations and separates them from other phraseological units, such as idioms or compounds. Highlighting the difference between a monolingual and a bilingual (i.e. contrastive) approach to collocation, the article presents two distinct classes of collocations: grammatical and lexical. The latter, treated contrastively, represent the focal point of the paper, since they are an unending source of translation errors to both students of translation and professional translators. The author introduces a methodology of systematic classification of lexical collocations applied on the Slovene-English language pair and based on structural (lexical congruence) and semantic (translational predictability) criteria.


2020 ◽  
Vol 34 (2-3) ◽  
pp. 149-195
Author(s):  
Rejwanul Haque ◽  
Mohammed Hasanuzzaman ◽  
Andy Way

Babel ◽  
2020 ◽  
Vol 66 (4-5) ◽  
pp. 867-881
Author(s):  
Yanlin Guo

Since entering the new era, the translation model has gradually changed with the widespread application of machine translation technology and the rapid development of the translation industry. The mismatch between the demand of employers and the talents trained by universities has become a major problem facing the translation major nowadays. To this end, we should attach more importance to the readjustment of the existing curriculum; students’ practical ability in translation; their grasp of the skill of detecting and correcting machine translation errors; and the combination of translation and relevant professional knowledge.


Entropy ◽  
2019 ◽  
Vol 21 (12) ◽  
pp. 1213
Author(s):  
Guanghao Xu ◽  
Youngjoong Ko ◽  
Jungyun Seo

Synthetic data has been shown to be effective in training state-of-the-art neural machine translation (NMT) systems. Because the synthetic data is often generated by back-translating monolingual data from the target language into the source language, it potentially contains a lot of noise: weakly paired sentences or translation errors. In this paper, we propose a novel approach to filter this noise from synthetic data. For each sentence pair of the synthetic data, we compute a semantic similarity score using bilingual word embeddings. By selecting sentence pairs according to these scores, we obtain better synthetic parallel data. Experimental results on the IWSLT 2017 Korean→English translation task show that despite using much less data, our method outperforms the baseline NMT system with back-translation by up to 0.72 and 0.62 BLEU points for tst2016 and tst2017, respectively.
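
The filtering idea described above can be sketched as follows. The abstract does not say how the similarity score is computed, so this example makes several assumptions: each sentence is represented by the average of its word vectors, similarity is cosine similarity, and pairs are kept when the score reaches a fixed threshold. The names `sentence_vector`, `cosine`, and `filter_pairs`, and the idea of two lookup tables mapped into one shared space, are hypothetical and not taken from the paper.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens present in the embedding table."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None  # no known word, so no sentence vector
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_pairs(pairs, src_emb, tgt_emb, threshold):
    """Keep (source, target) pairs whose similarity is at least threshold.

    Assumption: src_emb and tgt_emb are bilingual embeddings that live in
    the same vector space, so vectors can be compared across languages.
    """
    kept = []
    for src, tgt in pairs:
        u = sentence_vector(src.split(), src_emb)
        v = sentence_vector(tgt.split(), tgt_emb)
        if u is None or v is None:
            continue  # a pair that cannot be scored is treated as noise
        if cosine(u, v) >= threshold:
            kept.append((src, tgt))
    return kept
```

With toy two-dimensional embeddings in which "고양이" lies close to "cat" and far from "dog", a threshold of 0.5 keeps the pair ("고양이", "cat") and drops ("고양이", "dog"). In practice the threshold, or the fraction of top-scoring pairs to keep, would need tuning on held-out data.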

