On the Translation of Literature as a Human Activity par Excellence

The quality of state-of-the-art machine translation systems has prompted a number of scholars to probe the readiness of such systems for “literary” translation. However, studies on literary machine translation have not explicitly stated what they consider to be literature, and they mistakenly assume that literary translation is a matter of transferring meaning and/or form from one language into another. By approaching literature as art and literary translation as an artistic work of re-creation, we counter, in this article, the notion that literary machine translation can be seen as an indisputable evolution within translation technology. Ethical concerns may well be utilitarian in studies to date, but by advocating for a deontological approach, we consider that aesthetic value, cultural mediation (which includes the use of paratexts), and authorship of literary translation (should) rank higher in our ethical assessments of the feasibility and actual contributions of literary machine translation.


Extracting Parallel Sentences from Nonparallel Corpora Using Parallel Hierarchical Attention Network

Computational Intelligence and Neuroscience ◽

10.1155/2020/8823906 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Shaolin Zhu ◽

Yong Yang ◽

Chun Xu

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

State Of The Art ◽

Research Problem ◽

Shared Task ◽

Translation Systems

Collecting parallel sentences from nonparallel data is a long-standing natural language processing research problem. In particular, parallel training sentences are very important for the quality of machine translation systems. While many existing methods have shown encouraging results, they cannot learn various alignment weights in parallel sentences. To address this issue, we propose a novel parallel hierarchical attention neural network which encodes monolingual sentences versus bilingual sentences and constructs a classifier to extract parallel sentences. In particular, our attention mechanism structure can learn different alignment weights of words in parallel sentences. Experimental results show that our model can obtain state-of-the-art performance on the English-French, English-German, and English-Chinese datasets of the BUCC 2017 shared task on parallel sentence extraction.


Unsupervised Sub-tree Alignment for Tree-to-Tree Translation

Journal of Artificial Intelligence Research ◽

10.1613/jair.4033 ◽

2013 ◽

Vol 48 ◽

pp. 733-782 ◽

Cited By ~ 3

Author(s):

T. Xiao ◽

J. Zhu

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Experimental Results ◽

Alignment Accuracy ◽

Syntactic Structures ◽

Tree Alignment ◽

Matrix Encoding ◽

Word Alignments ◽

Alignment Model ◽

Translation Systems

This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically-motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to the errors in 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a -0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system.


Development of a method for assessing the quality of machine translation based on ensemble methods in machine learning

Science Intensive Technologies ◽

10.18127/j19998465-202102-06 ◽

2021 ◽

Author(s):

A.V. Kozina ◽

Yu.S. Belov

Keyword(s):

Machine Learning ◽

Random Forest ◽

Quality Assessment ◽

Machine Translation ◽

Translation System ◽

Human Judgment ◽

Translation Quality ◽

Machine Translation System ◽

Translation Systems

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality assessment is understood here as predicting translation quality without access to a reference translation. Translation quality depends on the specific machine translation system, and the output often requires post-editing. Manual editing is a long and expensive process. As the need to determine translation quality quickly grows, its automation is required. In this paper, we propose a quality assessment method based on ensemble supervised machine learning methods. The WMT 2019 bilingual corpus for the English-Russian language pair was used as data. The text data comprise 17,089 sentences; 85% of the data was used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used for training the system, since it is these characteristics that can most accurately characterize the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford POS Tagger. Before training the system, the text was preprocessed. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in the Python programming language using the scikit-learn library. The parameters of the random forest method were optimized using a grid search. The performance of the model was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which measures the correlation with human judgment. Testing was carried out using three machine translation systems: the Google and Bing neural systems, and the Moses phrase-based and syntax-based statistical machine translation systems. Based on the results of the work, the Extra Trees method performed best. In addition, for all categories of indicators under consideration, the best results are achieved using the Google machine translation system. The developed method showed good results close to human judgment. The system can be used for further research in the task of assessing the quality of translation.
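
The three evaluation measures named in this abstract (MAE, RMSE, and the Pearson coefficient) can be illustrated with a minimal standard-library sketch. The score lists `human` and `predicted` are hypothetical values invented for illustration; they are not data from the paper, and the function names are assumptions rather than the authors' code.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical quality scores (assumption: not taken from the paper).
human = [0.2, 0.5, 0.9, 0.4]       # human judgments
predicted = [0.3, 0.4, 0.8, 0.5]   # model predictions

print("MAE:", mae(human, predicted))
print("RMSE:", rmse(human, predicted))
print("Pearson r:", pearson(human, predicted))
```

Lower MAE and RMSE indicate predictions closer to the human scores, while a Pearson coefficient near 1 indicates that the predictions rank and scale translations similarly to human judges.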


Function words in statistical machine-translated Chinese and original Chinese: A study into the translationese of machine translation systems

Digital Scholarship in the Humanities ◽

10.1093/llc/fqy050 ◽

2018 ◽

Vol 34 (4) ◽

pp. 752-771

Author(s):

Chen-li Kuo

Keyword(s):

Machine Translation ◽

Attribute Selection ◽

Close Attention ◽

Function Words ◽

Rule Based ◽

Source Language ◽

Statistical MT ◽

Chinese Texts ◽

Translation Systems

Statistical approaches have become the mainstream in machine translation (MT), for their potential in producing less rigid and more natural translations than rule-based approaches. However, on closer examination, the uses of function words between statistical machine-translated Chinese and the original Chinese are different, and such differences may be associated with translationese as discussed in translation studies. This article examines the distribution of Chinese function words in a comparable corpus consisting of MTs and the original Chinese texts extracted from Wikipedia. An attribute selection technique is used to investigate which types of function words are significant in discriminating between statistical machine-translated Chinese and the original texts. The results show that statistical MT overuses the most frequent function words, even when alternatives exist. To improve the quality of the end product, developers of MT should pay close attention to modelling Chinese conjunctions and adverbial function words. The results also suggest that machine-translated Chinese shares some characteristics with human-translated texts, including normalization and being influenced by the source language; however, machine-translated texts do not exhibit other characteristics of translationese such as explicitation.


Machine translation systems and guidebooks: an approach to the importance of the role of the human translator

10.7764/onomazein.ne7.04 ◽

2020 ◽

Author(s):

Adrián Fuentes-Luque ◽

Alexandra Santamaría Urbieta

Keyword(s):

Machine Translation ◽

Computer Assisted ◽

Working Environments ◽

Translation Tools ◽

Technical Translation ◽

Complete Translation ◽

Translation Systems ◽

Tourist Guides

Computer-assisted translation tools are increasingly supplemented by the presence of machine translation (MT) in different areas and working environments, from technical translation to translation in international organizations. MT is also present in the translation of tourism texts, from brochures to food menus, websites and tourist guides. Its need or suitability for use is the subject of growing debate. This article presents a comparative analysis of tourist guides translated by a human translator and three machine translation systems. The aims are to obtain a first estimate of the quality of machine translation in tourist texts and to establish whether some tourist texts can be translated using machine translation alone or whether human participation is necessary, either for the complete translation of the text or only for post-editing tasks.


Unsupervised Quality Estimation for Neural Machine Translation

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00330 ◽

2020 ◽

Vol 8 ◽

pp. 539-555

Author(s):

Marina Fomicheva ◽

Shuo Sun ◽

Lisa Yankovskaya ◽

Frédéric Blain ◽

Francisco Guzmán ◽

...

Keyword(s):

Machine Translation ◽

Real World ◽

State Of The Art ◽

Black Box ◽

Test Time ◽

Quality Estimation ◽

Neural Machine Translation ◽

Real World Applications ◽

Unsupervised Approach

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
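
As a rough illustration of how information produced during decoding might serve as an unsupervised quality signal, the sketch below computes two simple uncertainty indicators: the length-normalised log-probability of the output tokens and the mean entropy of the per-step output distributions. The abstract does not specify which indicators the authors use, so both functions and all probability values here are assumptions chosen for illustration only.

```python
import math

def mean_log_prob(token_probs):
    """Length-normalised log-probability of the generated tokens.

    token_probs: probability the model assigned to each emitted token.
    Values closer to 0 suggest a more confident model.
    """
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def mean_entropy(step_distributions):
    """Average entropy of the per-step output distributions.

    step_distributions: one probability distribution per decoding step.
    Higher entropy suggests more uncertainty about the next token.
    """
    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0)
    return sum(entropy(d) for d in step_distributions) / len(step_distributions)

# Hypothetical token probabilities (assumption: invented for illustration).
confident = [0.90, 0.95, 0.85, 0.90]
uncertain = [0.40, 0.30, 0.60, 0.20]

print("confident:", mean_log_prob(confident))
print("uncertain:", mean_log_prob(uncertain))
print("entropy:", mean_entropy([[0.5, 0.5], [0.9, 0.1]]))
```

Neither indicator requires labelled training data: both are read directly from the translation model's own output probabilities, which is what makes such a signal usable without supervision.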


Is Neural Machine Translation the New State of the Art?

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0013 ◽

2017 ◽

Vol 108 (1) ◽

pp. 109-120 ◽

Cited By ~ 37

Author(s):

Sheila Castilho ◽

Joss Moorkens ◽

Federico Gaspari ◽

Iacer Calixto ◽

John Tinsley ◽

...

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Evaluation Methods ◽

Automatic Evaluation ◽

New Paradigm ◽

Neural Machine Translation ◽

Human Evaluation ◽

Statistical MT

This paper discusses neural machine translation (NMT), a new paradigm in the MT field, comparing the quality of NMT systems with statistical MT by describing three studies using automatic and human evaluation methods. Automatic evaluation results presented for NMT are very promising; however, human evaluations show mixed results. We report increases in fluency but inconsistent results for adequacy and post-editing effort. NMT undoubtedly represents a step forward for the MT field, but one that the community should be careful not to oversell.


A review of the state-of-the-art in automatic post-editing

Machine Translation ◽

10.1007/s10590-020-09252-y ◽

2020 ◽

Author(s):

Félix do Carmo ◽

Dimitar Shterionov ◽

Joss Moorkens ◽

Joachim Wagner ◽

Murhaf Hossari ◽

...

Keyword(s):

Machine Translation ◽

State Of The Art ◽

The State ◽

Definition Of ◽

Translation Systems

This article presents a review of the evolution of automatic post-editing, a term that describes methods to improve the output of machine translation systems, based on knowledge extracted from datasets that include post-edited content. The article describes the specificity of automatic post-editing in comparison with other tasks in machine translation, and it discusses how it may function as a complement to them. The article gives particular attention to the five-year period that covers the shared tasks presented in WMT conferences (2015–2019). In this period, discussion of automatic post-editing evolved from the definition of its main parameters to an announced demise, associated with the difficulties in improving output obtained by neural methods, which was then followed by renewed interest. The article debates the role and relevance of automatic post-editing, both as an academic endeavour and as a useful application in commercial workflows.


NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning

Prague Bulletin of Mathematical Linguistics ◽

10.2478/pralin-2018-0010 ◽

2018 ◽

Vol 111 (1) ◽

pp. 113-124 ◽

Cited By ~ 3

Author(s):

Álvaro Peris ◽

Francisco Casacuberta

Keyword(s):

Machine Translation ◽

Question Answering ◽

State Of The Art ◽

Translation System ◽

Extended Version ◽

Neural Machine Translation ◽

Video Captioning ◽

High Level ◽

Translation Systems

We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning. NMT-Keras is based on an extended version of the popular Keras library, and it runs on Theano and TensorFlow. State-of-the-art neural machine translation models are deployed and used following the high-level framework provided by Keras. Given its high modularity and flexibility, it has also been extended to tackle different problems, such as image and video captioning, sentence classification and visual question answering.


Automatic evaluation of the quality of machine translation of a scientific text: the results of a five-year-long experiment

E3S Web of Conferences ◽

10.1051/e3sconf/202128408001 ◽

2021 ◽

Vol 284 ◽

pp. 08001

Author(s):

Ilya Ulitkin ◽

Irina Filippova ◽

Natalia Ivanova ◽

Alexey Poroykov

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Dramatic Improvement ◽

Automatic Evaluation ◽

Neural Machine Translation ◽

Translation Quality ◽

Automatic Translation ◽

Translation Systems ◽

Qualitative Changes

We report on various approaches to automatic evaluation of machine translation quality and describe three widely used methods. These methods, i.e. methods based on string matching and n-gram models, make it possible to compare the quality of machine translation to a reference translation. We employ modern metrics for automatic evaluation of machine translation quality such as BLEU, F-measure, and TER to compare translations made by Google and PROMT neural machine translation systems with translations obtained 5 years ago, when statistical machine translation and rule-based machine translation algorithms were employed by Google and PROMT, respectively, as the main translation algorithms [6]. The evaluation of the translation quality of candidate texts generated by Google and PROMT against reference translations using an automatic translation evaluation program reveals significant qualitative changes as compared with the results obtained 5 years ago, which indicate a dramatic improvement in the work of the above-mentioned online translation systems. Ways to improve the quality of machine translation are discussed. It is shown that modern systems of automatic evaluation of translation quality allow errors made by machine translation systems to be identified and systematized, which will enable the improvement of the quality of translation by these systems in the future.
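
To make the n-gram matching idea behind BLEU concrete, here is a simplified single-reference sketch: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It assumes whitespace tokenisation and applies no smoothing, so it is not the evaluation program used in the study, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))
print(simple_bleu("the cat sat on a mat", reference))
```

Because the score is zero whenever any n-gram order has no match, practical implementations usually add smoothing or compute the metric over a whole corpus rather than sentence by sentence.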
