MEANT: a highly accurate semantic frame based evaluation metric for improving machine translation utility

Author(s):
Chi-Kiu Lo
2019, Vol 27 (10), pp. 1497-1506
Author(s):  
Pairui Li
Chuan Chen
Wujie Zheng
Yuetang Deng
Fanghua Ye
...  

2015, Vol 104 (1), pp. 17-26
Author(s):  
Miloš Stanojević
Khalil Sima’an

Abstract: We present BEER, an open source implementation of a machine translation evaluation metric. BEER is a metric trained for high correlation with human rankings using learning-to-rank training methods. For evaluation of lexical accuracy it uses sub-word units (character n-grams), while for measuring word order it uses hierarchical representations based on PETs (permutation trees). In recent WMT metrics tasks, BEER has shown high correlation with human judgments at both the sentence and corpus levels. In this paper we show how BEER can be used for (i) full evaluation of MT output, (ii) isolated evaluation of word order and (iii) tuning MT systems.
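The core idea of combining character n-gram features under a learned linear weighting can be illustrated with a small sketch. The feature set, the weights, and the `toy_beer_like_score` function below are invented for illustration; BEER's actual trained model uses a richer feature set (including PET-based ordering features) and learning-to-rank training, not hand-picked weights.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return a multiset of character n-grams for a sentence."""
    s = text.replace(" ", "_")  # keep word boundaries visible as a character
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def fscore(hyp, ref, n):
    """Character n-gram F1 between hypothesis and reference."""
    h, r = char_ngrams(hyp, n), char_ngrams(ref, n)
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    p, rec = overlap / sum(h.values()), overlap / sum(r.values())
    return 2 * p * rec / (p + rec)

# Hypothetical weights standing in for a model trained with learning-to-rank.
WEIGHTS = {1: 0.15, 2: 0.25, 3: 0.30, 4: 0.30}

def toy_beer_like_score(hyp, ref):
    """Linear combination of character n-gram F-scores (illustrative only)."""
    return sum(w * fscore(hyp, ref, n) for n, w in WEIGHTS.items())

if __name__ == "__main__":
    print(toy_beer_like_score("the cat sat on the mat", "a cat sat on the mat"))
```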


2009, Vol 91 (1), pp. 79-88
Author(s):  
Omar Zaidan

Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems
We introduce Z-MERT, a software tool for minimum error rate training of machine translation systems (Och, 2003). In addition to being an open source tool that is extremely easy to compile and run, Z-MERT is also agnostic regarding the evaluation metric, fully configurable, and requires no modification to work with any decoder. We describe Z-MERT and review its features, and report the results of a series of experiments that examine the tool's runtime. We establish that Z-MERT is extremely efficient, making it well-suited for time-sensitive pipelines. The experiments also provide insight into the tool's runtime in terms of several variables (size of the development set, size of the produced N-best lists, etc.).
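The inner loop of minimum error rate training can be sketched as follows: given fixed N-best lists with feature vectors, choose feature weights that minimize a corpus-level error of the rescored 1-best outputs. This is only a toy sketch under invented data; real MERT (Och, 2003) performs an exact line search along each coordinate rather than the random restarts used here, and Z-MERT would plug in an arbitrary evaluation metric where `sentence_error` stands in.

```python
import random

def sentence_error(hyp, ref):
    """Crude per-sentence error (1 - unigram precision), a stand-in for any metric."""
    hyp_toks, ref_toks = hyp.split(), set(ref.split())
    if not hyp_toks:
        return 1.0
    return 1.0 - sum(t in ref_toks for t in hyp_toks) / len(hyp_toks)

def corpus_error(weights, nbest_lists, refs):
    """Rescore each N-best list with the weights and sum the errors of the 1-best."""
    total = 0.0
    for candidates, ref in zip(nbest_lists, refs):
        best = max(candidates,
                   key=lambda c: sum(w * f for w, f in zip(weights, c["features"])))
        total += sentence_error(best["text"], ref)
    return total

def toy_mert(nbest_lists, refs, dim, trials=200, seed=0):
    """Pick the weight vector (by random restarts) that minimizes corpus error."""
    rng = random.Random(seed)
    best_w, best_err = None, float("inf")
    for _ in range(trials):
        w = [rng.uniform(-1, 1) for _ in range(dim)]
        err = corpus_error(w, nbest_lists, refs)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

if __name__ == "__main__":
    nbest = [[{"text": "the cat sat", "features": [0.2, -1.0]},
              {"text": "cat the sat", "features": [0.4, -0.5]}]]
    refs = ["the cat sat"]
    print(toy_mert(nbest, refs, dim=2, trials=50))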


Author(s):  
Samiksha Tripathi
Vineet Kansal

Machine Translation (MT) evaluation metrics like BiLingual Evaluation Understudy (BLEU) and Metric for Evaluation of Translation with Explicit Ordering (METEOR) are known to perform poorly for free word-order and morphologically rich languages. Applying linguistic knowledge to evaluate MT output for a morphologically rich target language like Hindi has been shown to be more effective and accurate [S. Tripathi and V. Kansal, Using linguistic knowledge for machine translation evaluation with Hindi as a target language, Comput. Sist. 21(4) (2017) 717–724]. Leveraging recent progress in word and sentence vector embeddings [T. Mikolov and J. Dean, Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. 2 (2013) 3111–3119], the authors have trained word and sentence embeddings on a large corpus of pre-processed Hindi text ([Formula: see text] million tokens). Training was performed on a high-end system configuration using Google Cloud Platform resources. The sentence embeddings are then used to corroborate the findings obtained through linguistic knowledge in the evaluation metric. For a morphologically rich target language, such an evaluation metric for MT systems is considered an optimal solution. In this paper, the authors demonstrate that MT evaluation using a sentence embedding-based approach closely mirrors the linguistic evaluation technique. The code used to generate the vector embeddings for Hindi has been uploaded to the code-sharing platform GitHub.
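The general shape of sentence embedding-based MT evaluation can be sketched as follows: score a hypothesis by the cosine similarity between its sentence vector and the reference's sentence vector. This is a minimal sketch, not the authors' released code; the tiny hand-written vectors below are placeholders for embeddings trained on a large Hindi corpus, and averaging word vectors is only one simple way to form a sentence vector.

```python
import numpy as np

# Placeholder word vectors; in the paper's setting these would come from
# embeddings trained on a large pre-processed Hindi corpus.
TOY_VECTORS = {
    "घर": np.array([0.9, 0.1, 0.0]),
    "बड़ा": np.array([0.1, 0.8, 0.1]),
    "है": np.array([0.0, 0.2, 0.9]),
}

def sentence_vector(sentence, vectors, dim=3):
    """Average the word vectors of in-vocabulary tokens."""
    words = [vectors[w] for w in sentence.split() if w in vectors]
    return np.mean(words, axis=0) if words else np.zeros(dim)

def embedding_score(hypothesis, reference, vectors=TOY_VECTORS):
    """Cosine similarity between hypothesis and reference sentence vectors."""
    h, r = sentence_vector(hypothesis, vectors), sentence_vector(reference, vectors)
    denom = np.linalg.norm(h) * np.linalg.norm(r)
    return float(h @ r / denom) if denom else 0.0

print(embedding_score("घर बड़ा है", "घर है"))
```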


Author(s):  
Oliver Czulo
Tiago Timponi Torrent
Ely Edison da Silva Matos
Alexandre Diniz da Costa
...  

2019, Vol 12 (2), pp. 134-158
Author(s):  
Achraf Othman
Mohamed Jemni

In this article, the authors deal with the machine translation of written English text into sign language. They study existing systems and issues in order to propose an implementation of statistical machine translation from written English text to American Sign Language (English/ASL) that takes into account several features of sign language. The work proposes a novel approach to building an artificial corpus using grammatical dependency rules, owing to the lack of resources for sign language. The parallel corpus was the input to the statistical machine translation system, which was used to create a statistical translation memory based on the IBM alignment algorithms. These algorithms were enhanced and optimized by integrating Jaro–Winkler distances in order to shorten the training process. Subsequently, based on the constructed translation memory, a decoder was implemented for translating English text into ASL using a novel proposed transcription system based on gloss annotation. The results were evaluated using the BLEU evaluation metric.
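The Jaro–Winkler component can be illustrated with a small sketch. The similarity functions below follow the standard Jaro and Jaro–Winkler definitions; how exactly the authors fold the distances into IBM alignment training is their own design, so the final snippet only shows one plausible use under assumed data: biasing initial word-to-gloss translation scores toward lexically similar pairs.

```python
def jaro(s1, s2):
    """Standard Jaro similarity between two strings."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    m1, m2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not m2[j] and s2[j] == c:
                m1[i] = m2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions among the matched characters.
    k, transpositions = 0, 0
    for i, flag in enumerate(m1):
        if flag:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                transpositions += 1
            k += 1
    transpositions //= 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

# Hypothetical use: bias initial English-word / ASL-gloss translation scores
# so that lexically similar pairs start with more probability mass before EM.
english, glosses = ["book", "books"], ["BOOK", "READ"]
init = {(e, g): 0.5 + 0.5 * jaro_winkler(e.lower(), g.lower())
        for e in english for g in glosses}
print(init)
```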


2018, Vol 35 (3), pp. 575-599
Author(s):  
Jon Sprouse
Beracah Yankama
Sagar Indurkhya
Sandiway Fong
Robert C. Berwick

Abstract: In their recent paper, Lau, Clark, and Lappin explore the idea that the probability of the occurrence of word strings can form the basis of an adequate theory of grammar (Lau, Jey H., Alexander Clark & Shalom Lappin. 2017. Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cognitive Science 41(5):1201–1241). To make their case, they present the results of correlating the output of several probabilistic models trained solely on naturally occurring sentences with the gradient acceptability judgments that humans report for ungrammatical sentences derived from roundtrip machine translation errors. In this paper, we first explore the logic of the Lau et al. argument, both in the choice of evaluation metric (gradient acceptability) and in the choice of test data set (machine translation errors on random sentences from a corpus). We then present our own series of studies intended to allow for a better comparison between LCL's models and existing grammatical theories. We evaluate two of LCL's probabilistic models (trigrams and recurrent neural network) against three data sets (taken from journal articles, a textbook, and Chomsky's famous colorless-green-ideas sentence), using three evaluation metrics (LCL's gradience metric, a categorical version of the metric, and the experimental-logic metric used in the syntax literature). Our results suggest there are very real, measurable cost-benefit tradeoffs inherent in LCL's models across the three evaluation metrics. The gain in explanation of gradience (between 13% and 31% of gradience) is offset by losses in the other two metrics: a 43%-49% loss in coverage based on a categorical metric of explaining acceptability, and a loss of 12%-35% in explaining experimentally-defined phenomena. This suggests that anyone wishing to pursue LCL's models as competitors with existing syntactic theories must either be satisfied with this tradeoff, or modify the models to capture the phenomena that are not currently captured.
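At its core, the gradience-style comparison amounts to correlating a model-derived sentence score with mean human acceptability ratings, while a categorical variant thresholds both sides and measures agreement. The sketch below illustrates only that shape under invented numbers: the per-word log probabilities are hand-specified stand-ins for a language model, and the length normalization shown is just one of the several normalizations explored in this literature, not LCL's exact measure.

```python
import numpy as np

def mean_logprob(word_logprobs):
    """Length-normalized log probability (one possible normalization)."""
    return sum(word_logprobs) / len(word_logprobs)

# Each item: (per-word log probabilities from some model, mean human rating).
# All numbers here are invented for illustration.
sentences = [
    ([-2.1, -1.5, -3.0, -2.2], 6.1),
    ([-4.0, -5.5, -6.1, -3.9], 2.3),
    ([-3.2, -2.8, -4.0, -3.5], 4.7),
]

scores = np.array([mean_logprob(lp) for lp, _ in sentences])
ratings = np.array([r for _, r in sentences])
gradient_r = np.corrcoef(scores, ratings)[0, 1]  # Pearson correlation

# Rough categorical analogue: threshold both sides and measure agreement.
agree = np.mean((scores > scores.mean()) == (ratings > ratings.mean()))
print(f"gradient correlation: {gradient_r:.2f}, categorical agreement: {agree:.2f}")
```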

