Evaluating MT systems with BEER

Abstract: We present BEER, an open-source implementation of a machine translation evaluation metric. BEER is trained for high correlation with human rankings using learning-to-rank methods. To evaluate lexical accuracy it uses sub-word units (character n-grams), while to measure word order it uses hierarchical representations based on PETs (permutation trees). In recent WMT metrics tasks, BEER has shown high correlation with human judgments at both the sentence and the corpus level. In this paper we show how BEER can be used for (i) full evaluation of MT output, (ii) isolated evaluation of word order and (iii) tuning MT systems.
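To make the idea of scoring lexical accuracy on character n-grams concrete, here is a minimal sketch. It is not BEER's implementation: BEER combines many features with trained weights, whereas this toy computes a single untrained F1 over character trigrams. The function names `char_ngrams` and `char_ngram_f1` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return a multiset (Counter) of the character n-grams in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def char_ngram_f1(hypothesis, reference, n=3):
    """Toy F1 over character n-grams (illustrative, not BEER's trained model)."""
    hyp = char_ngrams(hypothesis, n)
    ref = char_ngrams(reference, n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(char_ngram_f1("the cat sat on the mat", "the cat sits on the mat"))
```

Because matching happens below the word level, a hypothesis with a wrong inflection ("sat" vs. "sits") still receives partial credit, which is the usual motivation for sub-word units in evaluation.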

Data selection and smoothing in an open-source system for the 2008 NIST machine translation evaluation

10.21437/interspeech.2008-676 ◽

2008 ◽

Author(s):

Holger Schwenk ◽

Yannick Esteve

Keyword(s):

Open Source ◽

Machine Translation ◽

Data Selection ◽

Machine Translation Evaluation

ReVal: A Simple and Effective Machine Translation Evaluation Metric Based on Recurrent Neural Networks

10.18653/v1/d15-1124 ◽

2015 ◽

Cited By ~ 6

Author(s):

Rohit Gupta ◽

Constantin Orasan ◽

Josef van Genabith

Keyword(s):

Neural Networks ◽

Machine Translation ◽

Recurrent Neural Networks ◽

Machine Translation Evaluation ◽

Evaluation Metric

Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems

Prague Bulletin of Mathematical Linguistics ◽

10.2478/v10108-009-0018-2 ◽

2009 ◽

Vol 91 (1) ◽

pp. 79-88 ◽

Cited By ~ 23

Author(s):

Omar Zaidan

Keyword(s):

Open Source ◽

Machine Translation ◽

Error Rate ◽

Software Tool ◽

Minimum Error ◽

Open Source Tool ◽

Series Of Experiments ◽

Evaluation Metric ◽

Minimum Error Rate Training ◽

Translation Systems

We introduce Z-MERT, a software tool for minimum error rate training of machine translation systems (Och, 2003). In addition to being an open source tool that is extremely easy to compile and run, Z-MERT is also agnostic regarding the evaluation metric, fully configurable, and requires no modification to work with any decoder. We describe Z-MERT, review its features, and report the results of a series of experiments that examine the tool's runtime. We establish that Z-MERT is extremely efficient, making it well-suited for time-sensitive pipelines. The experiments also provide insight into the tool's runtime in terms of several variables (size of the development set, size of produced N-best lists, etc.).

Machine Translation Evaluation: Unveiling the Role of Dense Sentence Vector Embedding for Morphologically Rich Language

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420590016 ◽

2019 ◽

Vol 34 (01) ◽

pp. 2059001

Author(s):

Samiksha Tripathi ◽

Vineet Kansal

Keyword(s):

Machine Translation ◽

Optimal Solution ◽

Poor Performance ◽

Target Language ◽

Linguistic Knowledge ◽

Machine Translation Evaluation ◽

Mt Evaluation ◽

Morphologically Rich Languages ◽

Evaluation Metric

Machine Translation (MT) evaluation metrics like BiLingual Evaluation Understudy (BLEU) and Metric for Evaluation of Translation with Explicit Ordering (METEOR) are known to perform poorly for word-order and morphologically rich languages. Applying linguistic knowledge to evaluate MT for a morphologically rich target language such as Hindi has been shown to be more effective and accurate [S. Tripathi and V. Kansal, Using linguistic knowledge for machine translation evaluation with Hindi as a target language, Comput. Sist. 21(4) (2017) 717–724]. Leveraging recent progress in word vector and sentence vector embedding [T. Mikolov and J. Dean, Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. 2 (2013) 3111–3119], the authors trained on a large corpus of pre-processed Hindi text ([Formula: see text] million tokens) to obtain word vectors and sentence vector embeddings for Hindi. Training was performed on a high-end system configuration using Google Cloud Platform resources. The sentence vector embedding is then used to corroborate the findings obtained through linguistic knowledge in the evaluation metric. For a morphologically rich target language, an evaluation metric of MT systems is considered an optimal solution. In this paper, the authors demonstrate that MT evaluation using a sentence-embedding-based approach closely mirrors the linguistic evaluation technique. The code used to generate the vector embeddings for Hindi has been uploaded to the code-sharing platform GitHub.
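As a rough illustration of how a sentence-embedding-based score can compare an MT output with a reference, the sketch below averages word vectors into a sentence vector and takes the cosine similarity. This is an assumption made for illustration, not the authors' method: the abstract does not say how sentence vectors are composed or compared. The names `average_vector`, `cosine` and `embedding_score` are hypothetical, and the tiny two-dimensional vectors stand in for embeddings that would really come from a trained model.

```python
import math


def average_vector(tokens, word_vectors):
    """Average the vectors of known tokens; return None if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def embedding_score(hypothesis, reference, word_vectors):
    """Toy similarity between whitespace-tokenized hypothesis and reference."""
    hyp_vec = average_vector(hypothesis.split(), word_vectors)
    ref_vec = average_vector(reference.split(), word_vectors)
    if hyp_vec is None or ref_vec is None:
        return 0.0
    return cosine(hyp_vec, ref_vec)


if __name__ == "__main__":
    # Hypothetical toy embeddings, not taken from any trained model.
    toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(embedding_score("a b", "c", toy_vectors))
```

Averaging ignores word order entirely, so a score of this kind is best read as a measure of lexical-semantic overlap rather than of fluency.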

Designing a Frame-Semantic Machine Translation Evaluation Metric

10.26615/issn.2683-0078.2019_004 ◽

2019 ◽

Author(s):

Oliver Czulo ◽

Tiago Timponi Torrent ◽

Ely Edison da Silva Matos ◽

Alexandre Diniz da Costa ◽

...

Keyword(s):

Machine Translation ◽

Machine Translation Evaluation ◽

Evaluation Metric

Accurate semantic textual similarity for cleaning noisy parallel corpora using semantic machine translation evaluation metric: The NRC supervised submissions to the Parallel Corpus Filtering task

10.18653/v1/w18-6481 ◽

2018 ◽

Author(s):

Chi-kiu Lo ◽

Michel Simard ◽

Darlene Stewart ◽

Samuel Larkin ◽

Cyril Goutte ◽

...

Keyword(s):

Machine Translation ◽

Machine Translation Evaluation ◽

Parallel Corpora ◽

Parallel Corpus ◽

Evaluation Metric ◽

Semantic Textual Similarity

MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance

Natural Language Understanding and Intelligent Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-50496-4_13 ◽

2016 ◽

pp. 153-161

Author(s):

Qingsong Ma ◽

Fandong Meng ◽

Daqi Zheng ◽

Mingxuan Wang ◽

Yvette Graham ◽

...

Keyword(s):

Machine Translation ◽

Neural Machine Translation ◽

Machine Translation Evaluation ◽

Evaluation Metric ◽

Similarity Distance

Machine Translation Evaluation Metric Based on Dependency Parsing Model

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3312573 ◽

2019 ◽

Vol 18 (4) ◽

pp. 1-15 ◽

Cited By ~ 2

Author(s):

Hui Yu ◽

Weizhi Xu ◽

Shouxun Lin ◽

Qun Liu

Keyword(s):

Machine Translation ◽

Dependency Parsing ◽

Machine Translation Evaluation ◽

Evaluation Metric

Machine Translation Evaluation using Textual Entailment for Arabic

2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS) ◽

10.1109/snams52053.2020.9336580 ◽

2020 ◽

Author(s):

Mohamed El Marouani ◽

Tarik Boudaa ◽

Nourddine Enneya

Keyword(s):

Machine Translation ◽

Machine Translation Evaluation ◽

Textual Entailment

An Open-Source Web-Based Tool for Resource-Agnostic Interactive Translation Prediction

Prague Bulletin of Mathematical Linguistics ◽

10.2478/pralin-2014-0015 ◽

2014 ◽

Vol 102 (1) ◽

pp. 69-80 ◽

Cited By ~ 2

Author(s):

Daniel Torregrosa ◽

Mikel L. Forcada ◽

Juan Antonio Pérez-Ortiz

Keyword(s):

Open Source ◽

Machine Translation ◽

Web Application ◽

Statistical Machine Translation ◽

Black Box ◽

Translation System ◽

Web Tool ◽

Web Based ◽

Strongly Coupled ◽

Machine Translation System

Abstract: We present a web-based open-source tool for interactive translation prediction (ITP) and describe its underlying architecture. ITP systems assist human translators by making context-based computer-generated suggestions as they type. Most ITP systems in the literature are strongly coupled with a statistical machine translation system that is conveniently adapted to provide the suggestions. Our system, however, follows a resource-agnostic approach in which suggestions are obtained from any unmodified black-box bilingual resource. This paper reviews our ITP method and describes the architecture of Forecat, a web tool, partly based on the recent technology of web components, that eases the use of our ITP approach in any web application requiring this kind of translation assistance. We also evaluate the performance of our method when using an unmodified Moses-based statistical machine translation system as the bilingual resource.
