Errors in machine translation from German into Russian (based on articles of German-language media and their translations)

Author(s):  
T. A. Ivanchenko

The article examines the errors and inaccuracies produced by machine translation systems. It analyzes why errors of various types appear when popular translation programs render German-language articles from well-known mass media into Russian. A classification of errors is given, distinguishing lexical-semantic, lexical-stylistic, normative-usual, grammatical, punctuation and spelling errors. The study identifies typical “weaknesses” of machine translation from German into Russian that deserve attention during post-editing. It also notes that developers of machine translation systems can use the analysis of typical errors to improve their algorithms.

Author(s):  
Carlos Eduardo Silva ◽  
Lincoln Fernandes

This paper describes COPA-TRAD Version 2.0, a parallel corpus-based system developed at the Universidade Federal de Santa Catarina (UFSC) for translation research, teaching and practice. COPA-TRAD enables the user to investigate the practices of professional translators by identifying translational patterns related to a particular element or linguistic pattern. In addition, the system allows human translation to be compared with the automatic translation provided by three well-known machine translation systems available on the Internet (Google Translate, Microsoft Translator and Yandex). Currently, COPA-TRAD incorporates five subcorpora (Children's Literature, Literary Texts, Meta-Discourse in Translation, Subtitles and Legal Texts) and provides the following tools: a parallel concordancer, a monolingual concordancer, a wordlist and a DIY Tool that enables users to create their own disposable parallel corpus. The system also provides a POS-tagging tool interface for analyzing and classifying the parts of speech of a text.


2021 ◽  
Vol 10 (34) ◽  
Author(s):  
A. N. Sak ◽  
E. V. Bessonova

When constructing machine translation systems, an important task is to represent data as graphs, where words act as vertices and relations between words in a sentence act as edges. The first stage of the analysis classifies words by part of speech, and the next stage determines which sentence-member class each word belongs to. The article discusses parsing methods based on rules defined in advance and implemented with traditional object-oriented programming, as well as methods based on graph convolutional neural networks that are subsequently trained. Online dictionaries serve as the thesaurus.
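
The graph representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_sentence_graph`, the toy sentence and the relation labels are assumptions made only for illustration. Word positions serve as vertices and labelled relations serve as directed edges.

```python
from collections import defaultdict


def build_sentence_graph(relations):
    """Build an adjacency list: vertex = word index, edge = (target index, label)."""
    graph = defaultdict(list)
    for head, dependent, label in relations:
        graph[head].append((dependent, label))
    return dict(graph)


# Hypothetical toy sentence and relations, not taken from the article.
words = ["cat", "chased", "mouse"]
relations = [(1, 0, "subject"), (1, 2, "object")]

sentence_graph = build_sentence_graph(relations)

# Print each edge with the words it connects.
for head, edges in sentence_graph.items():
    for dependent, label in edges:
        print(f"{words[head]} -[{label}]-> {words[dependent]}")
```

Indexing vertices by position rather than by word form keeps repeated words in a sentence distinct.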


2017 ◽  
Author(s):  
AWEJ for Translation & Literary Studies ◽  
Zakaryia Mustafa Almahasees

Machine translation (MT) systems are widely used throughout the world, free of charge or at low cost. The spread of MT calls for a thorough analysis of the translations such systems produce. The present study evaluates the capacity of two MT systems, Google Translate and Microsoft Bing Translator, to translate Khalil Gibran’s literary masterpiece The Prophet (2000) from Arabic into English. The study asks whether MT can be trusted to translate literary masterpieces across languages, particularly from Arabic into English, and how close MT output is to human translation. To answer these questions, the study adopts the Bilingual Evaluation Understudy (BLEU) metric of Papineni (2000). Analysis of the MT output showed that MT is not accurate, intelligible or natural when translating literary texts, because such texts are difficult and full of metaphors and culture-specific references. There are also linguistic errors: lexical errors, syntactic errors and misinformation. The study further found that both systems produced similar translations for the same input, owing either to a similar MT approach or to learning from previously translated texts. Moreover, both systems in some instances achieve good results at the word level but poor results at the level of collocations. The study also showed that automatic evaluation is insufficient for a full analysis of MT output, because all automatic metrics are misleading owing to their dependence on text similarity to a reference human translation. For future research, the study recommends a correlative study that combines manual and automatic evaluation methods to ensure the best analysis of MT output. MT is still far from reaching fully automatic translation of the quality obtained by human translators.
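
To make the dependence of BLEU on a reference translation concrete, the following is a simplified sentence-level sketch, assuming a single reference, n-grams up to length 2 and no smoothing. It is not the evaluation setup used in the study, and the function names are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: clipped n-gram precision with a brevity penalty."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalise candidates that are shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
candidate = "the cat sat on mat".split()
score = bleu(candidate, reference)
```

Because the score only counts surface overlap with the reference, a valid paraphrase that uses different words is penalised, which is the limitation the abstract points to.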


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Jing Ning ◽  
Haidong Ban

With the development of linguistics and the improvement of computer performance, machine translation quality keeps improving and the technology is widely used. The phrase-based automatic Chinese-English translation method takes phrases as the basic translation unit and makes full use of the word order within each phrase. Compared with word-based statistical machine translation methods, the results are greatly improved. This article studies the design of phrase-based automatic machine translation systems: it introduces machine translation methods and Chinese-English phrase translation, explores the design and testing of automatic translation systems based on combinations of Chinese-English phrase translations, and explains the role of automatic machine translation in promoting the development of translation. By combining machine translation experiments with system design methods, the article studies the design and testing of such systems in order to deepen the understanding of language, knowledge and intelligence, help solve other language processing problems, and promote the development of corpus linguistics. The experimental results show that when the Chinese-English phrase translation probability table is changed from 82% to 51%, the BLEU evaluation of the combined Chinese-English phrases improves. Automatic machine translation saves translators time and energy, and it shows its advantages in its short development cycle and easy processing of large-scale corpora.


2021 ◽  
Vol 284 ◽  
pp. 08001
Author(s):  
Ilya Ulitkin ◽  
Irina Filippova ◽  
Natalia Ivanova ◽  
Alexey Poroykov

We report on various approaches to the automatic evaluation of machine translation quality and describe three widely used methods. These methods, which are based on string matching and n-gram models, make it possible to compare machine translation output with a reference translation. We employ modern metrics for automatic evaluation of machine translation quality, such as BLEU, F-measure and TER, to compare translations made by the Google and PROMT neural machine translation systems with translations obtained 5 years ago, when statistical and rule-based machine translation algorithms were the main algorithms employed by Google and PROMT, respectively [6]. Evaluating candidate texts generated by Google and PROMT against reference translations with an automatic translation evaluation program reveals significant qualitative changes compared with the results obtained 5 years ago, which indicate a dramatic improvement in these online translation systems. Ways to improve the quality of machine translation are discussed. It is shown that modern systems for automatic evaluation of translation quality allow errors made by machine translation systems to be identified and systematized, which will help improve the translation quality of these systems in the future.
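
As an illustration of the string-matching family of metrics, here is a rough sketch of a TER-like score: word-level edit distance divided by reference length. This is a deliberate simplification, since full TER also counts block shifts as single edits, which this sketch omits; the function names are assumptions and not part of any tool used in the paper.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    previous = list(range(len(reference) + 1))
    for i, hyp_word in enumerate(hypothesis, start=1):
        current = [i]
        for j, ref_word in enumerate(reference, start=1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,                      # delete a hypothesis word
                current[j - 1] + 1,                   # insert a reference word
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def simplified_ter(hypothesis, reference):
    """Edits per reference word; lower is better. Shifts are NOT modelled."""
    if not reference:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hypothesis, reference) / len(reference)


reference = "the quality has improved".split()
hypothesis = "the quality is improved".split()
rate = simplified_ter(hypothesis, reference)  # one substitution over four words
```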


Author(s):  
Yana Fedorko ◽  
Tetiana Yablonskaya

The article focuses on the peculiarities of translating English and Chinese political discourse into Ukrainian. The advantages and disadvantages of machine translation are described on the basis of a linguistic analysis of the online systems Google Translate and M-Translate. The reasons for translation errors are identified, and the need for post-editing to improve translation quality is noted. Key words: political discourse, automatic translation, online machine translation systems, machine translation quality assessment.


2017 ◽  
Vol 9 (2) ◽  
pp. 86
Author(s):  
Tomáš Svoboda

The article deals with the concept of borders and barriers by considering scenarios in which technology eventually lifts the linguistic barrier. It begins with reflections on the biblical narrative of the Tower of Babel as an ancient representation of the concept of linguistic barriers between language communities. It gives numerous examples of the uptake of this narrative, from Translation Studies to project calls and the marketing statements of machine translation technology. In the following section, examples of existing technology are presented, which could be considered a first generation of automatic translation/interpretation systems. In the main section, several trends are predicted for both the translators’ profession and general economic, business, political and societal developments. The article anticipates the consequences of a situation in which ordinary cross-language communication has been almost fully taken over by automated systems. It points to the technology’s positive potential and, by showing the eventual risks involved, equally rejects an uncritical uptake of the technology. The article closes by pointing to the ethical dimension of machine translation systems, which is linked with their types of use and the choices reserved for their users.


2021 ◽  
Vol 263 ◽  
pp. 03013
Author(s):  
Alexander Sak

When designing machine translation systems, an important task is to represent data as graphs, where words act as vertices and relations between words in a sentence act as edges. The first stage of the analysis classifies words by part of speech, and the next stage determines which sentence-member class each word belongs to. A robust approach to such classification is to compute word embeddings with Graph Convolutional Networks at the beginning of the analysis and then apply k-means clustering, an algorithm that splits objects (words) into groups. To determine the weights, an ordinary network is applied to the obtained hidden layers so that these weights can be used in subsequent analysis.
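
The clustering step can be sketched as follows. This is a plain k-means loop over small hand-made vectors that stand in for word embeddings; the vectors, the deterministic initialisation from the first k points and the function name are assumptions for illustration, and the Graph Convolutional Network that would produce real embeddings is not shown.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means: returns a cluster label per point and the final centroids."""
    # Deterministic initialisation for the sketch: use the first k points.
    centroids = [list(point) for point in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for index, point in enumerate(points):
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            labels[index] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for cluster in range(k):
            members = [p for p, label in zip(points, labels) if label == cluster]
            if members:
                centroids[cluster] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical 2-D "embeddings": two noun-like and two verb-like words.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels, centroids = kmeans(embeddings, k=2)
```

In practice the initial centroids are usually chosen at random or with a seeding scheme such as k-means++, and the result depends on that choice.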


2021 ◽  
Vol 1 (1) ◽  
pp. 124-133
Author(s):  
Ani Ananyan ◽  
Roza Avagyan

As artificial-intelligence translation develops and spreads, it is becoming increasingly important to continuously evaluate and improve its quality and to use it as a tool for the modern translator. In our research, we compared five sentences translated from Armenian into Russian and English by Google Translator, Yandex Translator and two models of the translation system of the Armenian company Avromic to find out how effective these systems are when working with Armenian. We also sought to determine how effectively they could serve as a translation tool and in the learning process, with the translation edited afterwards. As there is currently no comprehensive and successful method of human evaluation for machine translation, we developed our own evaluation method and criteria by studying the world's best-known evaluation methods for automatic translation. We also used the post-editing distance criterion. Using one sentence as an example, the article presents in detail the evaluation process according to the selected and developed criteria. Finally, we present the results of the research and draw conclusions.


2020 ◽  
Vol 10 (4) ◽  
pp. 408
Author(s):  
Noureldin Mohamed Abdelaal ◽  
Abdulkhaliq Alazzawie

This study aims to identify the common types of errors in Google Translate (GT) output when translating informative news texts from Arabic into English, to measure translation quality in terms of errors, to assess the fluency and semantic adequacy of the output, and thereby to explain the extent to which a human translator is needed to correct it. For this purpose, examples were purposively selected from online newspapers. The collected data were analyzed using a mixed-method approach: the errors were identified qualitatively, guided by Hsu’s (2014) classification of machine translation errors, and a quantitative descriptive approach was used to measure translation quality in terms of errors, using the Multidimensional Quality Metrics and Localization Quality Evaluation. To assess semantic adequacy and fluency, a questionnaire adapted from Dorr, Snover, and Madnani (2011) was used. The results of the analysis show that omission, which is a lexical error, and inappropriate lexical choice, which is a semantic error, are the most common errors. Inappropriate lexical choice is sometimes a result of the homophonic nature of some source text words, which can be misinterpreted by the machine translation system. The study concludes that machine translation systems are useful for expediting the translation process, but that accuracy is sacrificed for the sake of ease (less work for the human) and speed of translation. If greater accuracy is required or desired, a human translator must at least proofread and revise the material.

