Unsupervised Machine Translation Quality Estimation in Black-Box Setting

Author(s): Hui Huang ◽ Hui Di ◽ Jin’an Xu ◽ Kazushige Ouchi ◽ Yufeng Chen


2013 ◽ Vol 27 (3-4) ◽ pp. 281-301
Author(s): Jesús González-Rubio ◽ J. Ramón Navarro-Cerdán ◽ Francisco Casacuberta

2020 ◽ Vol 8 ◽ pp. 539-555
Author(s): Marina Fomicheva ◽ Shuo Sun ◽ Lisa Yankovskaya ◽ Frédéric Blain ◽ Francisco Guzmán ◽ ...

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
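
The glass-box signals described in this abstract come as a by-product of decoding. The following is a minimal illustrative sketch, not the authors' implementation: it assumes per-token probabilities are already available from one deterministic pass and from a few forward passes with dropout left active (Monte Carlo dropout), and turns them into a sentence-level confidence score and an uncertainty score. All names and numbers are hypothetical.

```python
# Sketch of two glass-box uncertainty signals for QE (illustrative only):
# average token log-probability of the MT output, and the variance of the
# sentence score across stochastic (Monte Carlo dropout) decoding passes.
import math
from statistics import mean, pvariance
from typing import List


def sentence_log_prob(token_probs: List[float]) -> float:
    """Length-normalised log-probability of one decoded hypothesis."""
    return mean(math.log(p) for p in token_probs)


def dropout_variance(dropout_run_probs: List[List[float]]) -> float:
    """Variance of the sentence score across several forward passes
    with dropout kept active at inference time (MC dropout)."""
    scores = [sentence_log_prob(run) for run in dropout_run_probs]
    return pvariance(scores)


# Toy usage: one deterministic pass plus three stochastic passes.
det_pass = [0.91, 0.85, 0.40, 0.77]          # per-token probabilities
mc_passes = [
    [0.90, 0.83, 0.35, 0.75],
    [0.88, 0.86, 0.52, 0.74],
    [0.93, 0.80, 0.28, 0.79],
]
confidence = sentence_log_prob(det_pass)      # higher = more confident
uncertainty = dropout_variance(mc_passes)     # higher = less reliable output
print(f"confidence={confidence:.3f}  mc-dropout variance={uncertainty:.4f}")
```

In an evaluation such sentence-level scores would then be correlated with human quality judgments.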


2015
Author(s): José G. C. de Souza ◽ Matteo Negri ◽ Elisa Ricci ◽ Marco Turchi

2020 ◽ Vol 14 (01) ◽ pp. 137-151
Author(s): Prabhakar Gupta ◽ Mayank Sharma

We demonstrate the potential of aligned bilingual word embeddings for developing an unsupervised method to evaluate machine translations without the need for a parallel translation corpus or reference corpus. We describe different aspects of subtitles for digital entertainment content. We share our experimental results for four language pairs, English to French, German, Portuguese, and Spanish, and present findings on the shortcomings of Neural Machine Translation for subtitles. We propose several improvements over the system designed by Gupta et al. [P. Gupta, S. Shekhawat and K. Kumar, Unsupervised quality estimation without reference corpus for subtitle machine translation using word embeddings, IEEE 13th Int. Conf. Semantic Computing, 2019, pp. 32–38.] by incorporating a custom embedding model curated for subtitles, compound-word splitting, and punctuation inclusion. We show a substantial run-time improvement of the order of [Formula: see text] by considering three types of edits, removing the Proximity Intensity Index (PII), and changing the post-edit score calculation relative to their system.
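
As an illustration of the general idea, the sketch below scores a translation without references by matching each source word to its most similar target word in a shared (aligned) bilingual embedding space and averaging the similarities. It is a simplified stand-in for the approach described above, not the authors' code; the toy two-dimensional vectors replace real aligned embeddings.

```python
# Hedged sketch of reference-free QE with aligned bilingual word embeddings.
import numpy as np


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))


def adequacy_score(src_tokens, tgt_tokens, src_emb, tgt_emb) -> float:
    """For every source word, find its best-matching target word in the
    shared embedding space and average the similarities (a crude
    coverage/adequacy measure)."""
    sims = []
    for s in src_tokens:
        if s not in src_emb:
            continue
        best = max(
            (cosine(src_emb[s], tgt_emb[t]) for t in tgt_tokens if t in tgt_emb),
            default=0.0,
        )
        sims.append(best)
    return float(np.mean(sims)) if sims else 0.0


# Toy aligned embeddings (illustrative only).
src_emb = {"hello": np.array([1.0, 0.1]), "world": np.array([0.1, 1.0])}
tgt_emb = {"bonjour": np.array([0.9, 0.2]), "monde": np.array([0.2, 0.9])}
print(adequacy_score(["hello", "world"], ["bonjour", "monde"], src_emb, tgt_emb))
```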


Informatics ◽ 2021 ◽ Vol 8 (3) ◽ pp. 61
Author(s): Hannah Béchara ◽ Constantin Orăsan ◽ Carla Parra Escartín ◽ Marcos Zampieri ◽ William Lowe

As Machine Translation (MT) becomes increasingly ubiquitous, so does its use in professional translation workflows. However, its proliferation in the translation industry has brought about new challenges in the field of Post-Editing (PE). We are now faced with the need for effective tools to assess the quality of MT systems, so as to avoid underpayment and mistrust among professional translators. In this scenario, one promising field of study is MT Quality Estimation (MTQE), as it aims to determine the quality of an automatic translation and, indirectly, its degree of post-editing difficulty. However, its impact on translation workflows and on translators’ cognitive load has yet to be fully explored. We report on the results of an impact study engaging professional translators in PE tasks using MTQE. To assess the translators’ cognitive load, we measure their productivity both in terms of time and effort (keystrokes) in three different scenarios: translating from scratch, post-editing without using MTQE, and post-editing using MTQE. Our results show that good MTQE information can improve post-editing efficiency and decrease the cognitive load on translators. This is especially true for cases with low MT quality.
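
For concreteness, the sketch below shows how the two productivity measures used in such a study (editing time and keystrokes, normalised by segment length) might be computed from an edit log across the three conditions. The data structure and numbers are hypothetical, not the study's.

```python
# Illustrative computation of per-condition productivity metrics.
from dataclasses import dataclass


@dataclass
class Segment:
    condition: str      # "scratch", "pe", or "pe+mtqe"
    words: int          # length of the (post-)edited segment
    seconds: float      # time spent on the segment
    keystrokes: int     # keystrokes logged while editing


def productivity(segments, condition):
    sel = [s for s in segments if s.condition == condition]
    words = sum(s.words for s in sel)
    return (sum(s.seconds for s in sel) / words,        # seconds per word
            sum(s.keystrokes for s in sel) / words)     # keystrokes per word


# Toy log: the numbers are made up; only the computation matters.
log = [
    Segment("scratch", 12, 95.0, 140),
    Segment("pe",      12, 60.0,  55),
    Segment("pe+mtqe", 12, 48.0,  40),
]
for cond in ("scratch", "pe", "pe+mtqe"):
    spw, kpw = productivity(log, cond)
    print(f"{cond:8s}  {spw:5.2f} s/word  {kpw:5.2f} keystrokes/word")
```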

