Translation Quality Assessment. Strategies, Parametres and Procedures

2002 ◽  
Vol 45 (3) ◽  
pp. 497-519 ◽  
Author(s):  
Jamal al-Qinai

Abstract: Translation quality assessment has often suffered from impressionistic and frequently contradictory judgments grounded in rather vague aesthetics. This study seeks to construct an empirical model of quality assessment based on objective parameters of text typology, formal correspondence, thematic coherence, referential cohesion, pragmatic equivalence, and lexico-syntactic properties. The viability of this eclectic model is tested by applying it to the textual analysis of an evocative text (an advertisement), with the aim of bringing out points of convergence and divergence.

Author(s):  
A.V. Kozina ◽  
Yu.S. Belov

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality assessment is understood here as predicting translation quality without access to a reference translation. Translation quality depends on the specific machine translation system and often requires post-editing; manual editing is a long and expensive process. As the need to determine translation quality quickly increases, automation is required.

In this paper, we propose a quality assessment method based on ensemble supervised machine learning. The bilingual WMT 2019 corpus for the English-Russian language pair was used as data: 17,089 sentences in total, with 85% of the data used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used to train the system, since these characteristics can most accurately characterize a translation in terms of quality. Feature extraction relied on two tools: the SRILM-based language modeling toolkit and the Stanford POS Tagger. Before training, the text was preprocessed. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in Python using the scikit-learn library, and the parameters of the Random Forest method were optimized with a grid search. Model performance was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which measures correlation with human judgment. Testing was carried out with three machine translation systems: the Google and Bing neural systems, and the phrase-based and syntax-based statistical systems of Moses. Of the three regressors, the Extra Trees method performed best.

In addition, across all categories of indicators considered, the best results were achieved with the Google machine translation system. The developed method showed good results, close to human judgment, and the system can be used for further research on the task of translation quality assessment.
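The pipeline described above (85/15 split, three ensemble regressors, grid search over the Random Forest, evaluation by MAE, RMSE, and Pearson correlation) can be sketched in scikit-learn. This is a minimal illustration only: the synthetic features and targets stand in for the WMT 2019 English-Russian data, and the feature set, grid, and hyperparameters are assumptions, not the authors' actual configuration.

```python
# Hedged sketch of the ensemble quality-estimation setup; synthetic data
# replaces the WMT 2019 corpus and SRILM / Stanford POS Tagger features.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import (BaggingRegressor, ExtraTreesRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.RandomState(0)
X = rng.rand(1000, 17)                                # placeholder linguistic features
y = X[:, :5].sum(axis=1) + rng.normal(0, 0.1, 1000)   # placeholder quality scores

# 85% training / 15% testing, as in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.15, random_state=0)

models = {
    "Bagging": BaggingRegressor(random_state=0),
    "Extra Trees": ExtraTreesRegressor(random_state=0),
    # Random Forest parameters tuned by grid search (grid is illustrative).
    "Random Forest": GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
        cv=3,
    ),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    mae = mean_absolute_error(y_te, pred)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    r, _ = pearsonr(y_te, pred)                        # correlation with "human" scores
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f} Pearson={r:.3f}")
```

In the real system, each row of `X` would hold the language-model and part-of-speech features extracted for one sentence pair, and `y` would hold the human-derived quality labels.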


2019 ◽  
Vol 40 (1) ◽  
pp. 130-132
Author(s):  
Peixuan Lin ◽  
Chaoqun Xie
