Combining Machine Translation Output with Open Source: The Carnegie Mellon Multi-Engine Machine Translation Scheme

Abstract We present a web-based open-source tool for interactive translation prediction (ITP) and describe its underlying architecture. ITP systems assist human translators by making context-based computer-generated suggestions as they type. Most of the ITP systems in the literature are strongly coupled with a statistical machine translation system that is conveniently adapted to provide the suggestions. Our system, however, follows a resource-agnostic approach, and suggestions are obtained from any unmodified black-box bilingual resource. This paper reviews our ITP method and describes the architecture of Forecat, a web tool, partly based on the recent technology of web components, that eases the use of our ITP approach in any web application requiring this kind of translation assistance. We also evaluate the performance of our method when using an unmodified Moses-based statistical machine translation system as the bilingual resource.
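
The resource-agnostic idea in this abstract can be illustrated with a minimal sketch. It assumes that suggestions come from translating short source sub-segments with a black-box function and then filtering them by the prefix the translator has typed; the abstract does not spell out this procedure. The names `build_suggestions`, `suggest`, `toy_translate`, and the toy dictionary are hypothetical and are not part of Forecat.

```python
from typing import Callable, List


def build_suggestions(source: str,
                      translate: Callable[[str], str],
                      max_len: int = 3) -> List[str]:
    """Translate every source sub-segment of up to max_len words.

    `translate` is treated as a black box: any bilingual resource that
    maps a source string to a target string (or "" if it has no answer).
    """
    words = source.split()
    candidates: List[str] = []
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length > len(words):
                break
            segment = " ".join(words[start:start + length])
            target = translate(segment)
            if target and target not in candidates:
                candidates.append(target)
    return candidates


def suggest(candidates: List[str], typed_prefix: str, limit: int = 4) -> List[str]:
    """Return candidates that start with the typed prefix, longest first."""
    prefix = typed_prefix.lower()
    matches = [c for c in candidates if c.lower().startswith(prefix)]
    matches.sort(key=lambda c: (-len(c.split()), c))
    return matches[:limit]


# Hypothetical stand-in for the bilingual resource (assumption, toy data).
TOY_DICTIONARY = {
    "my": "mi",
    "house": "casa",
    "my house": "mi casa",
    "is": "es",
    "big": "grande",
    "is big": "es grande",
}


def toy_translate(segment: str) -> str:
    return TOY_DICTIONARY.get(segment.lower(), "")


candidates = build_suggestions("My house is big", toy_translate, max_len=2)
print(suggest(candidates, "mi"))  # ['mi casa', 'mi']
print(suggest(candidates, "e"))   # ['es grande', 'es']
```

Because `translate` is only ever called with a string and asked for a string, the toy dictionary could in principle be swapped for a call to any MT system without changing the suggestion logic.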

OpenNMT: Open-Source Toolkit for Neural Machine Translation

10.18653/v1/p17-4012 ◽

2017 ◽

Cited By ~ 139

Author(s):

Guillaume Klein ◽

Yoon Kim ◽

Yuntian Deng ◽

Jean Senellart ◽

Alexander Rush

Keyword(s):

Open Source ◽

Machine Translation ◽

Neural Machine Translation

Multi-engine machine translation with an open-source decoder for statistical machine translation

10.3115/1626355.1626381 ◽

2007 ◽

Cited By ~ 2

Author(s):

Yu Chen ◽

Andreas Eisele ◽

Christian Federmann ◽

Eva Hasler ◽

Michael Jellinghaus ◽

...

Keyword(s):

Open Source ◽

Machine Translation ◽

Statistical Machine Translation

Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages

Machine Translation ◽

10.1007/s10590-021-09260-6 ◽

2021 ◽

Author(s):

Tanmai Khanna ◽

Jonathan N. Washington ◽

Francis M. Tyers ◽

Sevilay Bayatlı ◽

Daniel G. Swanson ◽

...

Keyword(s):

Open Source ◽

Machine Translation ◽

Lexical Selection ◽

Rule Based ◽

Low Resource ◽

Language Technology ◽

Language Data ◽

Recursive Structures ◽

Platform Translation ◽

Free Open Source

Abstract This paper presents an overview of Apertium, a free and open-source rule-based machine translation platform. Translation in Apertium happens through a pipeline of modular tools, and the platform continues to be improved as more language pairs are added. Several advances have been implemented since the last publication, including some new optional modules: a module that allows rules to process recursive structures at the structural transfer stage, a module that deals with contiguous and discontiguous multi-word expressions, and a module that resolves anaphora to aid translation. Also highlighted is the hybridisation of Apertium through statistical modules that augment the pipeline, and statistical methods that augment existing modules. This includes morphological disambiguation, weighted structural transfer, and lexical selection modules that learn from limited data. The paper also discusses how a platform like Apertium can be a critical part of access to language technology for so-called low-resource languages, which might be ignored or deemed unapproachable by popular corpus-based translation technologies. Finally, the paper presents some of the released and unreleased language pairs, concluding with a brief look at some supplementary Apertium tools that prove valuable to users as well as language developers. All Apertium-related code, including language data, is free/open-source and available at https://github.com/apertium.
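
The "pipeline of modular tools" with optional modules can be pictured with a small sketch. It is only an illustration under stated assumptions: the stage names (`lexical_transfer`, `swap_adjective_noun`), the token format, and the toy bilingual dictionary are hypothetical and do not correspond to Apertium's actual programs or data formats.

```python
from functools import reduce
from typing import Callable, List, Tuple

Token = Tuple[str, str]                      # (lemma, part-of-speech tag)
Stage = Callable[[List[Token]], List[Token]]

# Hypothetical toy bilingual dictionary (assumption, not Apertium data).
TOY_BILINGUAL = {"the": "el", "black": "negro", "cat": "gato"}


def lexical_transfer(tokens: List[Token]) -> List[Token]:
    """Replace each source lemma with its target lemma, keeping the tag."""
    return [(TOY_BILINGUAL.get(lemma, lemma), tag) for lemma, tag in tokens]


def swap_adjective_noun(tokens: List[Token]) -> List[Token]:
    """Toy structural rule: an adjective followed by a noun is swapped."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "adj" and out[i + 1][1] == "n":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out


def run_pipeline(stages: List[Stage], tokens: List[Token]) -> List[Token]:
    """Feed the output of each stage into the next one."""
    return reduce(lambda data, stage: stage(data), stages, tokens)


def surface(tokens: List[Token]) -> str:
    return " ".join(lemma for lemma, _ in tokens)


analysed = [("the", "det"), ("black", "adj"), ("cat", "n")]

basic = run_pipeline([lexical_transfer], analysed)
extended = run_pipeline([lexical_transfer, swap_adjective_noun], analysed)

print(surface(basic))     # el negro gato
print(surface(extended))  # el gato negro
```

Adding or removing a module only changes the list passed to `run_pipeline`, which is the property the abstract attributes to a modular pipeline.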

Integration of a Multilingual Preordering Component into a Commercial SMT Platform

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0009 ◽

2017 ◽

Vol 108 (1) ◽

pp. 61-72

Author(s):

Anita Ramm ◽

Riccardo Superbo ◽

Dimitar Shterionov ◽

Tony O’Dowd ◽

Alexander Fraser

Keyword(s):

Open Source ◽

Machine Translation ◽

Long Range ◽

Significant Role ◽

Processing Speed ◽

Statistical Machine Translation ◽

Neural Machine Translation ◽

Open Source Tool

Abstract We present a multilingual preordering component tailored for a commercial Statistical Machine Translation platform. In commercial settings, issues such as processing speed as well as the ability to adapt models to the customers’ needs play a significant role and have a big impact on the choice of approaches that are added to the custom pipeline to deal with specific problems such as long-range reorderings. We developed a fast and customisable preordering component, also available as an open-source tool, which comes along with a generic implementation that is restricted neither to the translation platform nor to the Machine Translation paradigm. We test preordering on three language pairs: English→Japanese/German/Chinese for both Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). Our experiments confirm previously reported improvements in the SMT output when the models are trained on preordered data, but they also show that preordering does not improve NMT.
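
Preordering means rearranging source words into an order closer to the target language before translation. The sketch below shows the general idea with a single hand-written rule that moves verbs to the end of a sentence, roughly in the direction of a verb-final language. It is a toy under stated assumptions: the abstract does not describe the component's actual rules or model, and `preorder_verb_final` and the tag set are hypothetical.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (word, coarse part-of-speech tag)


def preorder_verb_final(tokens: List[Tagged]) -> List[Tagged]:
    """Move all verbs to the end of the sentence, before final punctuation.

    The relative order of the non-verb words and of the verbs is preserved.
    """
    end = len(tokens)
    while end > 0 and tokens[end - 1][1] == "PUNCT":
        end -= 1
    body, tail = tokens[:end], tokens[end:]
    non_verbs = [t for t in body if t[1] != "VERB"]
    verbs = [t for t in body if t[1] == "VERB"]
    return non_verbs + verbs + tail


sentence = [("She", "PRON"), ("reads", "VERB"), ("books", "NOUN"), (".", "PUNCT")]
reordered = preorder_verb_final(sentence)
print(" ".join(word for word, _ in reordered))  # She books reads .
```

In a setup like the one the abstract describes, such reordering would be applied to the source side of both the training data and the input at translation time, so that the translation model sees consistently preordered text.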

Apertium: a free/open-source platform for rule-based machine translation

Machine Translation ◽

10.1007/s10590-011-9090-0 ◽

2011 ◽

Vol 25 (2) ◽

pp. 127-144 ◽

Cited By ~ 53

Author(s):

Mikel L. Forcada ◽

Mireia Ginestí-Rosell ◽

Jacob Nordfalk ◽

Jim O’Regan ◽

Sergio Ortiz-Rojas ◽

...

Keyword(s):

Open Source ◽

Machine Translation ◽

Rule Based ◽

Free Open Source

A novel rule based machine translation scheme from Greek to Greek Sign Language: Production of different types of large corpora and Language Models evaluation

Computer Speech & Language ◽

10.1016/j.csl.2018.04.001 ◽

2018 ◽

Vol 51 ◽

pp. 110-135 ◽

Cited By ~ 3

Author(s):

Dimitrios Kouremenos ◽

Klimis Ntalianis ◽

Stefanos Kollias

Keyword(s):

Sign Language ◽

Machine Translation ◽

Language Production ◽

Language Models ◽

Rule Based ◽

Different Types ◽

Translation Scheme

Joshua 6: A phrase-based and hierarchical statistical machine translation system

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2015-0009 ◽

2015 ◽

Vol 104 (1) ◽

pp. 5-16 ◽

Cited By ~ 1

Author(s):

Matt Post ◽

Yuan Cao ◽

Gaurav Kumar

Keyword(s):

Open Source ◽

Machine Translation ◽

Large Scale ◽

Statistical Machine Translation ◽

End Users ◽

Translation System ◽

Tight Coupling ◽

Single Function ◽

Black Boxes ◽

Machine Translation System

Abstract We describe the version six release of Joshua, an open-source statistical machine translation toolkit. The main difference from release five is the introduction of a simple, unlexicalized, phrase-based stack decoder. This phrase-based decoder shares a hypergraph format with the syntax-based systems, permitting a tight coupling with the existing codebase of feature functions and hypergraph tools. Joshua 6 also includes a number of large-scale discriminative tuners and a simplified sparse feature function interface with reflection-based loading, which allows new features to be used by writing a single function. Finally, Joshua includes a number of simplifications and improvements focused on usability for both researchers and end-users, including the release of language packs — precompiled models that can be run as black boxes.
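
The point about reflection-based loading, where a new feature becomes usable by writing a single function, can be sketched in a few lines. Joshua itself is written in Java, and this Python analogue is not its interface: `FeatureFunctions`, `load_feature`, `score`, and the two toy features are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Hypothesis = List[str]


class FeatureFunctions:
    """Namespace of feature functions; adding a method adds a feature."""

    @staticmethod
    def word_count(hypothesis: Hypothesis) -> float:
        return float(len(hypothesis))

    @staticmethod
    def repeated_words(hypothesis: Hypothesis) -> float:
        return float(len(hypothesis) - len(set(hypothesis)))


def load_feature(name: str) -> Callable[[Hypothesis], float]:
    """Look a feature function up by name at run time (reflection)."""
    feature = None if name.startswith("_") else getattr(FeatureFunctions, name, None)
    if not callable(feature):
        raise ValueError(f"unknown feature function: {name}")
    return feature


def score(hypothesis: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted sum of the features named in the weight table."""
    return sum(w * load_feature(name)(hypothesis) for name, w in weights.items())


weights = {"word_count": -0.5, "repeated_words": -2.0}
print(score("the the cat".split(), weights))  # -3.5
```

The scoring code never lists the available features: a name that appears in the weight table is resolved on demand, so no registration step is needed when a new function is added to the namespace.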

CASMACAT: An Open Source Workbench for Advanced Computer Aided Translation

Prague Bulletin of Mathematical Linguistics ◽

10.2478/pralin-2013-0016 ◽

2013 ◽

Vol 100 (1) ◽

pp. 101-112 ◽

Cited By ~ 19

Author(s):

Vicent Alabau ◽

Ragnar Bonk ◽

Christian Buck ◽

Michael Carl ◽

Francisco Casacuberta ◽

...

Keyword(s):

Open Source ◽

Machine Translation ◽

Word Alignment ◽

Computer Aided ◽

Advanced Computer

Abstract We describe an open source workbench that offers advanced computer aided translation (CAT) functionality: post-editing machine translation (MT), interactive translation prediction (ITP), visualization of word alignment, extensive logging with replay mode, integration with eye trackers and e-pen.
