OpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System

Author(s):  
Sandipan Dandapat ◽  
Mikel L. Forcada ◽  
Declan Groves ◽  
Sergio Penkale ◽  
John Tinsley ◽  
...  


2018 ◽  
Vol 2 (2) ◽  
pp. 32
Author(s):  
Kanaan Mikael Kaka-Khan

In this paper we present a machine translation system developed to translate simple English sentences into Kurdish. The system is based on the Apertium free/open-source engine, which provides the environment and the tools required to develop a machine translation system. The developed system is used to translate simple sentences, compound sentences, phrases, and idioms from English to Kurdish. The resulting translations are then evaluated manually for accuracy and completeness against the output produced by the popular inKurdish English-to-Kurdish machine translation system. The results show that our system is more accurate than the inKurdish system. This paper contributes to the ongoing effort towards full machine-based translation in general and English-to-Kurdish machine translation in particular.


2014 ◽  
Vol 102 (1) ◽  
pp. 69-80 ◽  
Author(s):  
Daniel Torregrosa ◽  
Mikel L. Forcada ◽  
Juan Antonio Pérez-Ortiz

Abstract We present a web-based open-source tool for interactive translation prediction (ITP) and describe its underlying architecture. ITP systems assist human translators by making context-based computer-generated suggestions as they type. Most of the ITP systems in the literature are strongly coupled with a statistical machine translation system that is conveniently adapted to provide the suggestions. Our system, however, follows a resource-agnostic approach, and suggestions are obtained from any unmodified black-box bilingual resource. This paper reviews our ITP method and describes the architecture of Forecat, a web tool, partly based on the recent technology of web components, that eases the use of our ITP approach in any web application requiring this kind of translation assistance. We also evaluate the performance of our method when using an unmodified Moses-based statistical machine translation system as the bilingual resource.
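The resource-agnostic idea above can be sketched in a few lines: suggestions come from an unmodified bilingual resource (a plain phrase dictionary here) and are filtered against whatever the translator has typed so far. The function name, phrase table, and data below are invented for illustration; this is not Forecat's actual API.

```python
# Minimal sketch of resource-agnostic interactive translation prediction:
# a black-box bilingual resource (a dict) supplies candidate target phrases,
# and only those compatible with the translator's typed prefix are offered.

def suggest(source_segments, typed_prefix, phrase_table, max_suggestions=3):
    """Return target-side phrases that complete the word currently
    being typed (context-based suggestion, as in ITP)."""
    # The fragment being completed is the text after the last space.
    fragment = typed_prefix[typed_prefix.rfind(" ") + 1:]
    suggestions = []
    for segment in source_segments:
        for target in phrase_table.get(segment, []):
            if target.startswith(fragment) and target != fragment:
                suggestions.append(target)
    return suggestions[:max_suggestions]

phrase_table = {
    "casa": ["house", "home"],
    "blanca": ["white"],
}
print(suggest(["casa", "blanca"], "a wh", phrase_table))  # → ['white']
```

Because the resource is queried as a black box, swapping the dictionary for a Moses server or any other bilingual engine only changes the lookup, not the suggestion logic.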


2015 ◽  
Vol 104 (1) ◽  
pp. 5-16 ◽  
Author(s):  
Matt Post ◽  
Yuan Cao ◽  
Gaurav Kumar

Abstract We describe the version six release of Joshua, an open-source statistical machine translation toolkit. The main difference from release five is the introduction of a simple, unlexicalized, phrase-based stack decoder. This phrase-based decoder shares a hypergraph format with the syntax-based systems, permitting a tight coupling with the existing codebase of feature functions and hypergraph tools. Joshua 6 also includes a number of large-scale discriminative tuners and a simplified sparse feature function interface with reflection-based loading, which allows new features to be used by writing a single function. Finally, Joshua includes a number of simplifications and improvements focused on usability for both researchers and end-users, including the release of language packs — precompiled models that can be run as black boxes.
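The "write a single function" feature interface can be illustrated with a small sketch. Joshua itself is a Java toolkit; the Python below only mimics the idea of reflection-based loading (functions discovered by name) and sparse feature maps, with invented names and data.

```python
# Hypothetical sketch of a sparse feature-function interface with
# reflection-based loading: the decoder looks feature functions up by
# name and calls each on a hypothesis edge. Not Joshua's real API.

import sys

def word_penalty(edge):
    # One sparse feature: penalize longer outputs.
    return {"WordPenalty": -len(edge["target"].split())}

def rule_arity(edge):
    # Another feature: number of nonterminals in the applied rule.
    return {"RuleArity": edge.get("arity", 0)}

def load_features(names, module=sys.modules[__name__]):
    # Reflection-style loading: resolve the functions by name.
    return [getattr(module, n) for n in names]

def score(edge, features, weights):
    feats = {}
    for f in features:
        feats.update(f(edge))
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

features = load_features(["word_penalty", "rule_arity"])
edge = {"target": "the white house", "arity": 2}
print(score(edge, features, {"WordPenalty": 1.0, "RuleArity": 0.5}))  # → -2.0
```

The appeal of such an interface is that adding a feature requires only one new function; the tuner then learns its weight alongside the others.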


2016 ◽  
Vol 106 (1) ◽  
pp. 159-168 ◽  
Author(s):  
Julian Hitschler ◽  
Laura Jehl ◽  
Sariya Karimova ◽  
Mayumi Ohta ◽  
Benjamin Körner ◽  
...  

Abstract We present Otedama, a fast, open-source tool for rule-based syntactic pre-ordering, a well-established technique in statistical machine translation. Otedama implements both a learner for pre-ordering rules and a component for applying these rules to parsed sentences. Our system is compatible with several external parsers and capable of accommodating many source and all target languages in any machine translation paradigm which uses parallel training data. We demonstrate improvements on a patent translation task over a state-of-the-art English-Japanese hierarchical phrase-based machine translation system. We compare Otedama with an existing syntax-based pre-ordering system, showing comparable translation performance at a runtime speedup factor of 4.5-10.
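The rule-application step described above (reordering a parsed source sentence toward target-language word order) can be sketched compactly. The tree encoding and the example rule are invented for illustration; Otedama's learned rule format differs.

```python
# A toy sketch of rule-based syntactic pre-ordering: rules match a parent
# label plus its children's labels and prescribe a new child order.

def apply_rules(tree, rules):
    """Recursively reorder children of (label, children) nodes."""
    label, children = tree
    if not children:  # leaf node: (word, [])
        return tree
    child_labels = tuple(c[0] for c in children)
    order = rules.get((label, child_labels))
    if order is not None:
        children = [children[i] for i in order]
    return (label, [apply_rules(c, rules) for c in children])

def yield_words(tree):
    label, children = tree
    return [label] if not children else [w for c in children for w in yield_words(c)]

# English VP "eat sushi" pre-ordered into Japanese-style OV order.
tree = ("VP", [("V", [("eat", [])]), ("NP", [("sushi", [])])])
rules = {("VP", ("V", "NP")): [1, 0]}
print(yield_words(apply_rules(tree, rules)))  # → ['sushi', 'eat']
```

Pre-ordering like this is done once, before training and decoding, so the downstream phrase-based system sees source sentences already close to target word order.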


Orð og tunga ◽  
2016 ◽  
Vol 18 ◽  
pp. 131-143
Author(s):  
Ingibjörg Elsa Björnsdóttir

There has been rapid development in language technology and machine translation in recent decades. There are three main types of machine translation: statistical machine translation, rule-based machine translation, and example-based machine translation. In this article the Apertium machine translation system is discussed in particular. While Apertium was originally designed to translate between closely related languages, it can now handle languages that are much more different and variable in structure. Anyone can participate in the development of the Apertium system since it is open-source software. Thus Apertium is one of the best options available for researching and developing a machine translation system for Icelandic. The Apertium system has an easy-to-use interface, and it translates almost instantly from Icelandic into English or Swedish. However, the system still has certain limitations as regards vocabulary and ambiguity.


2011 ◽  
Vol 25 (1) ◽  
pp. 53-82 ◽  
Author(s):  
Aingeru Mayor ◽  
Iñaki Alegria ◽  
Arantza Díaz de Ilarraza ◽  
Gorka Labaka ◽  
Mikel Lersundi ◽  
...  

2017 ◽  
Vol 107 (1) ◽  
pp. 57-66
Author(s):  
Jernej Vičič ◽  
Vladislav Kuboň ◽  
Petr Homola

Abstract The Machine Translation system Česílko has been developed in response to a growing need for translation and localization from one source language into many target languages. The system belongs to the shallow-parse, shallow-transfer RBMT paradigm and is designed primarily for translation between related languages. The paper presents the architecture, the development design and the basic installation instructions of the translation system.
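The shallow-transfer pipeline the abstract names can be sketched as three stages: analyse each surface word into a lemma and tag, transfer the lemma through a bilingual dictionary (for closely related languages this is mostly one-to-one), and generate the target surface form. All function names and dictionary data below are invented toy examples, not Česílko's actual modules.

```python
# Toy sketch of a shallow-transfer RBMT pipeline for related languages:
# analysis -> lexical transfer -> generation, word by word.

def analyse(word, analyser):
    return analyser[word]            # surface form -> (lemma, tag)

def transfer(lemma, bidix):
    return bidix.get(lemma, lemma)   # bilingual dictionary lookup

def generate(lemma, tag, generator):
    return generator[(lemma, tag)]   # (lemma, tag) -> surface form

def translate(sentence, analyser, bidix, generator):
    out = []
    for word in sentence.split():
        lemma, tag = analyse(word, analyser)
        out.append(generate(transfer(lemma, bidix), tag, generator))
    return " ".join(out)

# Invented mini-dictionaries for a Czech-to-Slovak-style word pair.
analyser = {"domy": ("dům", "N.PL")}
bidix = {"dům": "dom"}
generator = {("dom", "N.PL"): "domy"}
print(translate("domy", analyser, bidix, generator))  # → domy
```

Because no full parse is built, each stage is a fast finite lookup, which is what makes the shallow-transfer approach practical for one source language fanned out to many related targets.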


2009 ◽  
Vol 91 (1) ◽  
pp. 17-26 ◽  
Author(s):  
Antal van den Bosch ◽  
Peter Berck

Memory-Based Machine Translation and Language Modeling

We describe a freely available open-source memory-based machine translation system, mbmt. Its translation model is a fast approximate memory-based classifier, trained to map trigrams of source-language words onto trigrams of target-language words. In a second decoding step, the predicted trigrams are rearranged according to their overlap, and candidate output sequences are ranked according to a memory-based language model. We report on the scaling abilities of the memory-based approach, observing fast training and testing times, and linear scaling behavior in speed and memory costs. The system is released as an open-source software package, for which we provide a first reference guide.

