Coping with ambiguity in a large-scale machine translation system

Abstract: We describe the version six release of Joshua, an open-source statistical machine translation toolkit. The main difference from release five is the introduction of a simple, unlexicalized, phrase-based stack decoder. This phrase-based decoder shares a hypergraph format with the syntax-based systems, permitting a tight coupling with the existing codebase of feature functions and hypergraph tools. Joshua 6 also includes a number of large-scale discriminative tuners and a simplified sparse feature function interface with reflection-based loading, which allows new features to be used by writing a single function. Finally, Joshua includes a number of simplifications and improvements focused on usability for both researchers and end-users, including the release of language packs: precompiled models that can be run as black boxes.

Essential requirements for a large-scale operational machine-translation system

Computers and Translation ◽

10.1007/bf00936466 ◽

1986 ◽

Vol 1 (2) ◽

pp. 67-72 ◽

Cited By ~ 2

Author(s):

Ian M. Pigott

Keyword(s):

Machine Translation ◽

Large Scale ◽

Translation System ◽

Machine Translation System

Large Scale Myanmar to English Neural Machine Translation System

2018 IEEE 7th Global Conference on Consumer Electronics (GCCE) ◽

10.1109/gcce.2018.8574614 ◽

2018 ◽

Cited By ~ 1

Author(s):

Yi Mon ShweSin ◽

Khin Mar Soe ◽

Khin Yadanar Htwe

Keyword(s):

Machine Translation ◽

Large Scale ◽

Translation System ◽

Neural Machine Translation ◽

Machine Translation System

Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information

Scientific Programming ◽

10.1155/2020/8879570 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Bin Li ◽

Jianmin Yao

Keyword(s):

Machine Translation ◽

Large Scale ◽

Training Data ◽

Training Dataset ◽

Translation System ◽

Target Domain ◽

Translation Quality ◽

Machine Translation System ◽

Topic Relevance

The performance of a machine translation system (MTS) depends on the quality and size of its training data. Effective methods for extending the training dataset of a domain-specific MTS therefore need to be explored. This paper proposes a method for selecting in-domain bilingual sentence pairs based on topic information. Using the topic relevance of each bilingual sentence pair to the target domain, the method selects subsets of sentence pairs related to the texts to be translated from a large-scale bilingual corpus, and uses them to train a domain-specific translation system with better translation quality on in-domain texts. In the reported experiments, an MTS trained on sentence pairs selected with the proposed method shows greatly enhanced translation performance.
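
The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how topic relevance is computed, so the sketch assumes a plain bag-of-words cosine similarity as a stand-in for a topic model. The function names (`bag_of_words`, `cosine`, `select_in_domain`) and the `top_k` parameter are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for a topic distribution."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_in_domain(pairs, domain_texts, top_k):
    """Return the top_k (source, target) pairs most relevant to the domain.

    Relevance is scored on the source side only (an assumption of this sketch).
    """
    # Build one profile for the target domain from sample in-domain texts.
    domain_profile = Counter()
    for text in domain_texts:
        domain_profile.update(bag_of_words(text))

    # Score every sentence pair against the domain profile.
    scored = [
        (cosine(bag_of_words(src), domain_profile), src, tgt)
        for src, tgt in pairs
    ]
    # Highest relevance first; ties keep their original corpus order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(src, tgt) for _, src, tgt in scored[:top_k]]
```

The selected subset would then be passed to whatever training pipeline the translation system uses. Replacing `bag_of_words` with a real topic model changes only the vectors that are compared, not the ranking and selection logic.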

An English-Japanese machine translation system based on formal semantics of natural language

10.3115/991813.991857 ◽

1982 ◽

Cited By ~ 2

Author(s):

Toyo-aki Nishida ◽

Shuji Doshita

Keyword(s):

Natural Language ◽

Machine Translation ◽

Formal Semantics ◽

Translation System ◽

Machine Translation System

English-Korean Machine Translation System with the Improved Ability to Resolve Linguistic Differences by Pre- and Post-Processing

The Journal of Linguistics Science ◽

10.21296/jls.2020.3.92.151 ◽

2020 ◽

Vol 92 ◽

pp. 151-179

Author(s):

Sung-Dong Kim ◽

Seok Kee Lee

Keyword(s):

Machine Translation ◽

Translation System ◽

Post Processing ◽

Linguistic Differences ◽

Machine Translation System

English-Dogri Translation System using MOSES

Circulation in Computer Science ◽

10.22632/ccs-2016-251-25 ◽

2016 ◽

Vol 1 (1) ◽

pp. 45-49

Author(s):

Avinash Singh ◽

Asmeet Kour ◽

Shubhnandan S. Jamwal

Keyword(s):

Natural Language Processing ◽

Machine Translation ◽

Language Processing ◽

Statistical Machine Translation ◽

Translation System ◽

Parallel Corpus ◽

English System ◽

Machine Translation System ◽

Translation Machine ◽

Language Pair

This paper analyzes English-Dogri parallel corpus translation. Machine translation is the translation of text from one language into another, and it is the biggest application of natural language processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% when translating English to Dogri and 87% when translating Dogri to English.
