The challenges of machine translation of academic publications

Author(s):  
D. A. Rew ◽  
N. G. Popova

Clear translation remains a major challenge to better communication and understanding of the international academic literature, despite advances in Machine Translation (MT). Automatic translation systems which captured the detail and the sense of any manuscript in any language for a reader from any other linguistic background would find global applications. In this article, we discuss the current opportunities and constraints to the wider use of machine translation and computer-assisted human translation (CAT). At the present stage of technology development, these instruments offer a number of advantages to specialists working with scientific texts. These include the facility to skim and scan large amounts of information in foreign languages, and to act as digital dictionaries, thesauri and encyclopedias. Word-to-word and phrase-to-phrase translation between many languages and scripts is now well advanced. The availability of modern machine translation has therefore changed the work of specialist scientific translators, placing greater emphasis on more advanced text and sense editing skills. However, machine translation is still challenged by the nuances of language and culture from one society to another, particularly in the freestyle literature of the arts and humanities. Scientific papers are generally much more structured, but the quality of machine translation still largely depends on the quality of the source text. This varies considerably between different scientific disciplines and from one author to another. The most advanced translation systems are making steady progress. It is timely to revisit traditional training programmes in the field of written translation to focus on the development of higher-level research competencies, such as terminology search, and so to make best use of evolving machine translation technologies. More widely, we consider that there is a challenge across the higher education systems in all countries to develop a simple, clear and consistent “international” writing style to assist fast, reliable and low-cost machine translation and hence to advance mutual understanding across the global scientific literature.

2020 ◽  
Author(s):  
Adrián Fuentes-Luque ◽  
Alexandra Santamaría Urbieta

Computer-assisted translation tools are increasingly supplemented by the presence of machine translation (MT) in different areas and working environments, from technical translation to translation in international organizations. MT is also present in the translation of tourism texts, from brochures to food menus, websites and tourist guides. Its need or suitability for use is the subject of growing debate. This article presents a comparative analysis of tourist guides translated by a human translator and three machine translation systems. The aims are to determine a first approach to the level of quality of machine translation in tourist texts and to establish whether some tourist texts can be translated using machine translation alone or whether human participation is necessary, either for the complete translation of the text or only for post-editing tasks.


2021 ◽  
Vol 7 (Extra-C) ◽  
pp. 714-721
Author(s):  
Zulfiya Akhatovna Usmanova ◽  
Ekaterina Nikolayevna Zudilova ◽  
Pavel Alekseevich Arkatov ◽  
Nataliaya Grigorievna Vitkovskaya ◽  
Ekaterina Vladimirovna Kravets

The main specificity of the modern translation market is the translation of large volumes of technical texts and business documents in the shortest time possible. The purpose of the study is to conduct an experiment on the impact of machine translation systems (in terms of using term bases) on the efficiency of future translators. The study provides a literature review on the problem under study and presents the advantages of computer-assisted translation tools in translation practice. Based on the experimental study, the analysis of the influence of computer-assisted translation tools on the quality of written translations of student translators was carried out.


Author(s):  
A.V. Kozina ◽  
Yu.S. Belov

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality assessment is understood as predicting translation quality without a reference translation. Translation quality depends on the specific machine translation system, and the output often requires post-editing. Manual editing is a long and expensive process. As the need to determine translation quality quickly grows, its automation is required. In this paper, we propose a quality assessment method based on ensemble supervised machine learning methods. The WMT 2019 bilingual corpus for the English-Russian language pair was used as data. The text data volume is 17089 sentences; 85% of the data was used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used to train the system, since these characteristics most accurately describe the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford POS Tagger. Before training the system, the text was preprocessed. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in the Python programming language using the scikit-learn library. The parameters of the Random Forest method were optimized using a grid search. The performance of the model was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which measures the correlation with human judgment. Testing was carried out using three machine translation systems: the Google and Bing neural systems and the Moses statistical machine translation system in its phrase-based and syntax-based variants. The Extra Trees method performed best. In addition, for all categories of indicators under consideration, the best results are achieved using the Google machine translation system. The developed method showed good results close to human judgment. The system can be used for further research in the task of assessing the quality of translation.
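
The three evaluation measures named in this abstract (MAE, RMSE and the Pearson coefficient) can be illustrated with a minimal sketch in plain Python. The score lists `predicted` and `human` below are invented toy values, not data from the paper, and the function names are hypothetical; the authors report using scikit-learn, which is not used here.

```python
import math


def mae(pred, gold):
    """Mean absolute error between predicted and human quality scores."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)


def rmse(pred, gold):
    """Root mean square error; penalises large deviations more than MAE."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold))


def pearson(pred, gold):
    """Pearson correlation coefficient between the two score lists."""
    n = len(gold)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    return cov / math.sqrt(var_p * var_g)


# Toy scores (assumed for illustration only).
predicted = [0.2, 0.5, 0.9, 0.4]
human = [0.1, 0.6, 0.8, 0.5]

print(mae(predicted, human), rmse(predicted, human), pearson(predicted, human))
```

Lower MAE and RMSE mean the predicted scores sit closer to the human ones, while a Pearson coefficient near 1 means the two rankings move together even if the absolute values differ.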


2018 ◽  
Vol 34 (4) ◽  
pp. 752-771
Author(s):  
Chen-li Kuo

Statistical approaches have become the mainstream in machine translation (MT), for their potential in producing less rigid and more natural translations than rule-based approaches. However, on closer examination, the uses of function words between statistical machine-translated Chinese and the original Chinese are different, and such differences may be associated with translationese as discussed in translation studies. This article examines the distribution of Chinese function words in a comparable corpus consisting of MTs and the original Chinese texts extracted from Wikipedia. An attribute selection technique is used to investigate which types of function words are significant in discriminating between statistical machine-translated Chinese and the original texts. The results show that statistical MT overuses the most frequent function words, even when alternatives exist. To improve the quality of the end product, developers of MT should pay close attention to modelling Chinese conjunctions and adverbial function words. The results also suggest that machine-translated Chinese shares some characteristics with human-translated texts, including normalization and being influenced by the source language; however, machine-translated texts do not exhibit other characteristics of translationese such as explicitation.


1985 ◽  
Vol 9 (1) ◽  
pp. 16-21
Author(s):  
W. Malcolm Gill

The characteristics of computer-assisted instruction (CAI) which make it so worthy of consideration for use by the teacher of pupils with special needs, whose time for individual attention with her pupils is limited, are briefly discussed, and comment is made about the limitations of this approach. Attention is drawn to the alternative approach of peer tutoring, where the desirable one-to-one characteristic is attainable, and where there is the possibility that the sensitivity to the quality of the computer program, which is one of the limitations of the approach of CAI, may be contrasted with the not-so-sensitive-to-quality-of-program characteristic of peer tutoring. This possibility is examined in the context of a sample of poor reading Year eight pupils who were given a period of peer tutoring by Year eleven and twelve pupils in the same school, where the demonstrably not high quality program consisted of relatively few meetings between tutor and tutee, of short duration, at which it was expected that there would be talk, reading and being read to. Use of a control group enabled the conclusions to be reached that this peer tutoring resulted in gains for the tutees in reading ability and in attitude towards and behaviour in school, this being suggestive of the power of this low staff input, low cost, low technology technique in the area of special education.


2017 ◽  
Author(s):  
AWEJ for Translation & Literary Studies ◽  
Zakaryia Mustafa Almahasees

Machine translation (MT) systems are widely used throughout the world, freely or at low cost. The spread of MT calls for a thorough analysis of the translations such systems produce. The present study evaluates the capacity of two MT systems, Google Translate and Microsoft Bing Translator, in translating Khalil Gibran’s literary masterpiece The Prophet (2000) from Arabic into English. The questions that arise in the study are: can we trust MT in the translation of literary masterpieces across languages, and particularly from Arabic to English? How close is MT output to human translation? To answer them, the study adopted the Bilingual Evaluation Understudy (BLEU) metric of Papineni (2000). MT output analysis showed that MT is not accurate, intelligible and natural in translating literary texts due to the difficulty of literary texts, as they are full of metaphors and cultural specifications. Besides, there are some linguistic errors: lexical, syntactic and misinformation. The study also found that both systems provided similar translations for the same input, due either to the use of a similar MT approach or to learning from previously translated texts. Moreover, both systems in some instances achieve good results at the word level but bad results at the level of collocation units. The study also showed that automatic evaluation is insufficient for providing a full analysis of MT output, because all automatic metrics are misleading due to their dependence on text similarity to a reference human translation. For future research, the study recommended conducting a correlative study that combines manual and automatic evaluation methods to ensure the best analysis of MT output. Machine Translation (MT) is still far from reaching fully automatic translation of a quality obtained by human translators.


2021 ◽  
Vol 284 ◽  
pp. 08001
Author(s):  
Ilya Ulitkin ◽  
Irina Filippova ◽  
Natalia Ivanova ◽  
Alexey Poroykov

We report on various approaches to automatic evaluation of machine translation quality and describe three widely used methods. These methods, i.e. methods based on string matching and n-gram models, make it possible to compare the quality of machine translation to reference translation. We employ modern metrics for automatic evaluation of machine translation quality such as BLEU, F-measure, and TER to compare translations made by Google and PROMT neural machine translation systems with translations obtained 5 years ago, when statistical machine translation and rule-based machine translation algorithms were employed by Google and PROMT, respectively, as the main translation algorithms [6]. The evaluation of the translation quality of candidate texts generated by Google and PROMT against reference translations using an automatic translation evaluation program reveals significant qualitative changes as compared with the results obtained 5 years ago, which indicate a dramatic improvement in the work of the above-mentioned online translation systems. Ways to improve the quality of machine translation are discussed. It is shown that modern systems of automatic evaluation of translation quality allow errors made by machine translation systems to be identified and systematized, which will enable the improvement of the quality of translation by these systems in the future.
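
To make the idea of n-gram matching against a reference concrete, here is a minimal sketch of clipped (modified) n-gram precision, the building block of BLEU. It is not the full BLEU score (there is no brevity penalty and no geometric mean over n-gram orders), and it is not the evaluation program used in the paper. The sentences and the function names `ngrams` and `clipped_precision` are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with each
    n-gram's count clipped to its count in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


# Toy sentences (assumed): the candidate repeats "the" more often than the reference.
candidate = "the the the cat".split()
reference = "the cat sat on the mat".split()

p1 = clipped_precision(candidate, reference, 1)  # unigram precision
p2 = clipped_precision(candidate, reference, 2)  # bigram precision
print(p1, p2)
```

Clipping is what stops a candidate from scoring well simply by repeating a frequent reference word: the third "the" in the candidate earns no credit because the reference contains the word only twice.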


2013 ◽  
Vol 39 (4) ◽  
pp. 999-1023 ◽  
Author(s):  
Gennadi Lembersky ◽  
Noam Ordan ◽  
Shuly Wintner

Translation models used for statistical machine translation are compiled from parallel corpora that are manually translated. The common assumption is that parallel texts are symmetrical: The direction of translation is deemed irrelevant and is consequently ignored. Much research in Translation Studies indicates that the direction of translation matters, however, as translated language (translationese) has many unique properties. It has already been shown that phrase tables constructed from parallel corpora translated in the same direction as the translation task outperform those constructed from corpora translated in the opposite direction. We reconfirm that this is indeed the case, but emphasize the importance of also using texts translated in the “wrong” direction. We take advantage of information pertaining to the direction of translation in constructing phrase tables by adapting the translation model to the special properties of translationese. We explore two adaptation techniques: First, we create a mixture model by interpolating phrase tables trained on texts translated in the “right” and the “wrong” directions. The weights for the interpolation are determined by minimizing perplexity. Second, we define entropy-based measures that estimate the correspondence of target-language phrases to translationese, thereby eliminating the need to annotate the parallel corpus with information pertaining to the direction of translation. We show that incorporating these measures as features in the phrase tables of statistical machine translation systems results in consistent, statistically significant improvement in the quality of the translation.
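
The first adaptation technique, a mixture of two phrase tables whose interpolation weight is chosen by minimizing perplexity, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: phrase tables are reduced to dictionaries of toy probabilities, perplexity is computed over a small invented list of development phrase pairs, and the weight is found by a coarse grid search. All names (`interpolate`, `perplexity`, `best_weight`, `p_right`, `p_wrong`, `dev_pairs`) are hypothetical.

```python
import math


def interpolate(p_right, p_wrong, lam):
    """Linear mixture: lam * p_right + (1 - lam) * p_wrong for every phrase pair."""
    keys = set(p_right) | set(p_wrong)
    return {k: lam * p_right.get(k, 0.0) + (1 - lam) * p_wrong.get(k, 0.0) for k in keys}


def perplexity(table, dev_pairs, floor=1e-6):
    """Perplexity of the table on held-out phrase pairs; unseen or zero-probability
    pairs are floored to avoid log(0)."""
    log_sum = sum(math.log(max(table.get(pair, 0.0), floor)) for pair in dev_pairs)
    return math.exp(-log_sum / len(dev_pairs))


def best_weight(p_right, p_wrong, dev_pairs, steps=20):
    """Grid search over lam in [0, 1] for the mixture with the lowest perplexity."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda lam: perplexity(interpolate(p_right, p_wrong, lam), dev_pairs),
    )


# Toy phrase tables (assumed): one trained on text translated in the task's
# direction ("right"), one on text translated in the opposite direction ("wrong").
p_right = {("maison", "house"): 0.8, ("maison", "home"): 0.2}
p_wrong = {("maison", "house"): 0.5, ("maison", "home"): 0.5}
dev_pairs = [("maison", "house"), ("maison", "house"), ("maison", "home")]

lam = best_weight(p_right, p_wrong, dev_pairs)
print(lam, perplexity(interpolate(p_right, p_wrong, lam), dev_pairs))
```

On this toy data the selected weight falls strictly between 0 and 1, so the mixture fits the development pairs better than either table alone, which mirrors the abstract's point that texts translated in the "wrong" direction still carry useful information.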


2020 ◽  
Vol 44 (1) ◽  
pp. 33-50
Author(s):  
Ivan Dunđer

Machine translation is increasingly becoming a hot research topic in information and communication sciences, computer science and computational linguistics, because it enables communication and the transfer of meaning across different languages. As the Croatian language can be considered low-resourced in terms of available services and technology, the development of new domain-specific machine translation systems is important, especially given the growing interest and needs of industry, academia and everyday users. Machine translation is not perfect, but it is crucial to assure acceptable quality, which is purpose-dependent. In this research, different statistical machine translation systems were built, one of which used domain adaptation with the intention of improving the machine translation output. Afterwards, an extensive evaluation was performed by applying several automatic quality metrics and a human evaluation focused on various aspects. The evaluation was done in order to assess the quality of specific machine-translated text.

