A Context-Aware Topic Model for Statistical Machine Translation

Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection

Journal of Artificial Intelligence Research ◽

10.1613/jair.4265 ◽

2014 ◽

Vol 50 ◽

pp. 1-30 ◽

Cited By ~ 3

Author(s):

M. Zhang ◽

X. Xiao ◽

D. Xiong ◽

Q. Liu

Keyword(s):

Machine Translation ◽

Topic Model ◽

Statistical Machine Translation ◽

Model Space ◽

Target Language ◽

Translation Quality ◽

Rule Selection ◽

Translation Rule ◽

Selection Experiments ◽

Target Side

Translation rule selection is the task of selecting appropriate translation rules for an ambiguous source-language segment. As translation ambiguities are pervasive in statistical machine translation, we introduce two topic-based models for translation rule selection which incorporate global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions. With these topic distributions, we propose a topic dissimilarity model to select desirable (less dissimilar) rules by penalizing rules whose topic distributions are highly dissimilar to those of the given documents. In order to encourage the use of non-topic-specific translation rules, we also present a topic sensitivity model to balance translation rule selection between generic rules and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information of both the source and target languages. We integrate the proposed topic dissimilarity and sensitivity models into hierarchical phrase-based machine translation for synchronous translation rule selection. Experiments show that our topic-based translation rule selection models can substantially improve translation quality.
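The two models described in this abstract can be illustrated with a minimal sketch. The abstract does not name the dissimilarity measure or the sensitivity measure, so the code below assumes Hellinger distance for dissimilarity and Shannon entropy for sensitivity. The function names, the three-topic distributions, and the rule labels are all hypothetical and chosen only for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    Assumed here as the dissimilarity measure; the abstract does not name one.
    """
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def entropy(p):
    """Shannon entropy (in nats). A flat distribution has high entropy,
    which this sketch treats as a sign of a generic, topic-insensitive rule."""
    return -sum(a * math.log(a) for a in p if a > 0)


def rule_features(rule_topics, doc_topics):
    """Return the two hypothetical feature values for one rule and one document."""
    return {
        "dissimilarity": hellinger(rule_topics, doc_topics),
        "sensitivity": entropy(rule_topics),
    }


# Hypothetical topic distributions over three topics.
doc_topics = [0.7, 0.2, 0.1]
rules = {
    "topic_specific_match": [0.8, 0.1, 0.1],
    "topic_specific_mismatch": [0.1, 0.1, 0.8],
    "generic": [1 / 3, 1 / 3, 1 / 3],
}

features = {name: rule_features(dist, doc_topics) for name, dist in rules.items()}
for name, feats in features.items():
    print(name, feats)
```

In this sketch the rule whose topic distribution resembles the document's receives the smallest dissimilarity, while the generic rule has the highest entropy. A decoder could use the entropy value to soften the dissimilarity penalty for generic rules, which is one plausible reading of the balance the abstract describes.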

Context-aware discriminative phrase selection for statistical machine translation

10.3115/1626355.1626374 ◽

2007 ◽

Cited By ~ 7

Author(s):

Jesús Giménez ◽

Lluís Màrquez

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Context Aware ◽

Selection For

Context-Aware Phrase Representation for Statistical Machine Translation

Lecture Notes in Computer Science - PRICAI 2018: Trends in Artificial Intelligence ◽

10.1007/978-3-319-97304-3_11 ◽

2018 ◽

pp. 137-149 ◽

Cited By ~ 1

Author(s):

Zhiwei Ruan ◽

Jinsong Su ◽

Deyi Xiong ◽

Rongrong Ji

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Context Aware

Dynamic Models in Moses for Online Adaptation

Prague Bulletin of Mathematical Linguistics ◽

10.2478/pralin-2014-0001 ◽

2014 ◽

Vol 101 (1) ◽

pp. 7-28 ◽

Cited By ~ 2

Author(s):

Nicola Bertoldi

Keyword(s):

Machine Translation ◽

Dynamic Models ◽

Statistical Machine Translation ◽

Scoring Function ◽

External Information ◽

Computer Assisted ◽

Context Aware ◽

Translation Process ◽

Translation Quality ◽

Online Adaptation

A very hot issue for research and industry is how to effectively integrate machine translation (MT) within computer-assisted translation (CAT) software. This paper focuses on this issue and, more generally, on how to dynamically adapt phrase-based statistical machine translation (SMT) by exploiting external knowledge, such as the post-edits from professional translators. We present an enhancement of the Moses SMT toolkit that is dynamically adaptable to external information, which becomes available during the translation process and which can depend on the previously translated text. We have equipped Moses with two new elements: a new phrase table implementation and a new LM-like feature. Both the phrase table and the LM-like feature can be dynamically modified by adding and removing entries and re-scoring them according to a time-decaying scoring function. The goal of these two dynamically adaptable features is twofold: to create additional translation alternatives and to reward those which are composed of entries previously inserted therein. The implemented dynamic system is highly configurable, flexible and applicable to many tasks, such as online MT adaptation, interactive MT, and context-aware MT. When exploited in a real-world CAT scenario where online adaptation is applied to repetitive texts, it has proven very effective in improving translation quality and reducing post-editing effort.
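The idea of entries that are added, removed, and re-scored by a time-decaying function can be sketched as follows. This is not the Moses implementation or its API: the class name, method names, and the exponential half-life decay are assumptions made for illustration, since the abstract does not specify the form of the scoring function.

```python
class DecayingCache:
    """Toy cache of phrase pairs whose score decays with age.

    Hypothetical sketch: age is counted in translated sentences, and the
    score halves every `half_life` sentences (an assumed decay form).
    """

    def __init__(self, half_life=10.0):
        self.half_life = half_life
        self.clock = 0          # number of sentences translated so far
        self.inserted_at = {}   # (source, target) -> clock value at insertion

    def tick(self):
        """Advance time by one translated sentence."""
        self.clock += 1

    def add(self, source, target):
        """Insert a phrase pair, or refresh it if it is already cached."""
        self.inserted_at[(source, target)] = self.clock

    def remove(self, source, target):
        """Drop a phrase pair; a missing pair is ignored."""
        self.inserted_at.pop((source, target), None)

    def score(self, source, target):
        """Return a reward in (0, 1] for cached pairs and 0.0 otherwise."""
        inserted = self.inserted_at.get((source, target))
        if inserted is None:
            return 0.0
        age = self.clock - inserted
        return 0.5 ** (age / self.half_life)


cache = DecayingCache(half_life=2.0)
cache.add("Haus", "house")       # e.g. taken from a translator's post-edit
fresh = cache.score("Haus", "house")
cache.tick()
cache.tick()
aged = cache.score("Haus", "house")
print(fresh, aged)
```

Recently inserted entries receive the largest reward, so suggestions drawn from the translator's latest post-edits are preferred over older ones, and refreshing an entry with `add` resets its age.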

Factored Statistical Machine Translation for German-English

Journal of Applied Information, Communication and Technology ◽

10.33555/ejaict.v5i1.47 ◽

2018 ◽

Vol 5 (1) ◽

pp. 37-45

Author(s):

Darryl Yunus Sulistyan

Keyword(s):

Machine Translation ◽

English Language ◽

Statistical Machine Translation ◽

New Model ◽

Language Pair

Machine translation is the automatic translation of sentences from one language into another. This paper tests the effectiveness of a new model of machine translation, namely factored machine translation. We compare the performance of the unfactored system, our baseline, with that of the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus. The tool we use is MOSES, which is free to download and use. We found, however, that the unfactored model scored over 24 in BLEU and outperformed the factored models, which scored below 24 in BLEU in all cases. In terms of words being translated, however, all of the factored models outperform the unfactored model.

Proceedings of the Workshop on Statistical Machine Translation - StatMT '06

10.3115/1654650 ◽

2006 ◽

Cited By ~ 1

Keyword(s):

Machine Translation ◽

Statistical Machine Translation

Proceedings of the Second Workshop on Statistical Machine Translation - StatMT '07

10.3115/1626355 ◽

2007 ◽

Cited By ~ 1

Keyword(s):

Machine Translation ◽

Statistical Machine Translation

Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model

10.3115/v1/d14-1015 ◽

2014 ◽

Cited By ~ 3

Author(s):

Haiyang Wu ◽

Daxiang Dong ◽

Xiaoguang Hu ◽

Dianhai Yu ◽

Wei He ◽

...

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Context Sensitive ◽

Semantic Embedding

Synchronous Tree Sequence Substitution Grammar for Statistical Machine Translation

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2009.01317 ◽

2009 ◽

Vol 35 (10) ◽

pp. 1317-1326

Author(s):

Hong-Fei JIANG ◽

Sheng LI ◽

Min ZHANG ◽

Tie-Jun ZHAO ◽

Mu-Yun YANG

Keyword(s):

Machine Translation ◽

Statistical Machine Translation ◽

Sequence Substitution

Analysis Accuracy of Similar Word Based Clustering (EWSB) Algorithm on Machine Translator Bahasa Indonesia-Minang

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽

10.22219/kinetik.v3i3.241 ◽

2018 ◽

Vol 3 (3) ◽

Author(s):

Herry Sujaini

Keyword(s):

Machine Translation ◽

Clustering Algorithm ◽

Statistical Machine Translation ◽

Target Language ◽

Word Similarity ◽

Similar Word ◽

Word Clustering ◽

Translation Accuracy ◽

Bahasa Indonesia

Extended Word Similarity Based (EWSB) clustering is a word clustering algorithm based on word similarity values computed from a corpus. One benefit of clustering with this algorithm is improved translation in statistical machine translation. Previous research showed that the EWSB algorithm could improve an Indonesian-English translator, where the algorithm was applied to Indonesian as the target language. This paper discusses the results of applying the EWSB algorithm to an Indonesian-to-Minang statistical machine translator, where the algorithm is applied to Minang as the target language. The results show that the EWSB algorithm is quite effective when Minang is the target language, and indicate that it can improve translation accuracy by 6.36%.
