Chart-based transfer rule application in Machine Translation

This article shows that the structure of bilingual material from standard parsing and alignment tools is not optimal for training syntax-based statistical machine translation (SMT) systems. We present three modifications to the MT training data to improve the accuracy of a state-of-the-art syntax MT system: re-structuring changes the syntactic structure of training parse trees to enable reuse of substructures; re-labeling alters bracket labels to enrich rule application context; and re-aligning unifies word alignment across sentences to remove bad word alignments and refine good ones. Better structures, labels, and word alignments are learned by the EM algorithm. We show that each individual technique leads to improvement as measured by BLEU, and we also show that the greatest improvement is achieved by combining them. We report an overall 1.48 BLEU improvement on the NIST08 evaluation set over a strong baseline in Chinese/English translation.


Integrating Rules and Dictionaries from Shallow-Transfer Machine Translation into Phrase-Based Statistical Machine Translation

Journal of Artificial Intelligence Research ◽ 10.1613/jair.4761 ◽ 2016 ◽ Vol 55 ◽ pp. 17-61 ◽ Cited By ~ 4

Author(s): Víctor M. Sánchez-Cartagena ◽ Juan Antonio Pérez-Ortiz ◽ Felipe Sánchez-Martínez

Keyword(s): Machine Translation ◽ Statistical Machine Translation ◽ Black Box ◽ Source Language ◽ Linguistic Resources ◽ Translation Quality ◽ Hybrid Approaches ◽ Transfer Rule ◽ Transfer Rules ◽ New Strategy

We describe a hybridisation strategy whose objective is to integrate linguistic resources from shallow-transfer rule-based machine translation (RBMT) into phrase-based statistical machine translation (PBSMT). It consists of enriching the phrase table of a PBSMT system with bilingual phrase pairs matching transfer rules and dictionary entries from a shallow-transfer RBMT system. This new strategy takes advantage of how the linguistic resources are used by the RBMT system to segment the source-language sentences to be translated, and overcomes the limitations of existing hybrid approaches that treat the RBMT system as a black box. Experimental results confirm that our approach delivers translations of higher quality than existing ones, and that it is especially useful when the parallel corpus available for training the SMT system is small or when translating out-of-domain texts that are well covered by the RBMT dictionaries. A combination of this approach with a recently proposed unsupervised shallow-transfer rule inference algorithm results in significantly greater translation quality than that of a baseline PBSMT system; in this case, the only hand-crafted resources used are the dictionaries commonly used in RBMT. Moreover, the translation quality achieved by the hybrid system built with automatically inferred rules is similar to that obtained by systems built with hand-crafted rules.
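The phrase-table enrichment described in this abstract can be pictured with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the `PhraseEntry` type, the `from_rbmt` flag, the `default_scores` placeholder, and the choice to keep corpus scores for pairs that already exist are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PhraseEntry:
    target: str
    scores: tuple      # translation scores (illustrative placeholder values)
    from_rbmt: bool    # hypothetical flag marking pairs derived from RBMT resources


def enrich_phrase_table(phrase_table, rbmt_pairs, default_scores=(1.0,)):
    """Return a new phrase table that also contains RBMT-derived phrase pairs.

    phrase_table: dict mapping a source phrase to a list of PhraseEntry
    rbmt_pairs:   iterable of (source, target) pairs generated from
                  transfer rules and dictionary entries
    """
    # Copy the lists so the original table is left unmodified.
    enriched = {src: list(entries) for src, entries in phrase_table.items()}
    for src, tgt in rbmt_pairs:
        entries = enriched.setdefault(src, [])
        if any(entry.target == tgt for entry in entries):
            continue  # assumption: keep the corpus-learned entry as it is
        entries.append(PhraseEntry(tgt, default_scores, from_rbmt=True))
    return enriched


# Usage: one corpus-learned pair plus three RBMT-derived candidates.
table = {"casa": [PhraseEntry("house", (0.8,), False)]}
rbmt = [("casa", "house"), ("casa", "home"), ("gato", "cat")]
enriched = enrich_phrase_table(table, rbmt)
```

How an actual system scores the added pairs is a separate design question; the constant `default_scores` here only stands in for it.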


The Prediction of Problem-Solving Assessed Via Microworlds

European Journal of Psychological Assessment ◽ 10.1027/1015-5759/a000263 ◽ 2016 ◽ Vol 32 (4) ◽ pp. 298-306 ◽ Cited By ~ 7

Author(s): Samuel Greiff ◽ Katarina Krkovic ◽ Jarkko Hautamäki

Keyword(s): Working Memory ◽ Problem Solving ◽ Complex Problem ◽ Two Dimensions ◽ Assessment Instruments ◽ Complex Problem Solving ◽ Visual Spatial ◽ Rule Knowledge ◽ Rule Application ◽ Fluid Reasoning

Abstract. In this study, we explored the network of relations between fluid reasoning, working memory, and the two dimensions of complex problem solving, rule knowledge and rule application. In doing so, we replicated the recent study by Bühner, Kröner, and Ziegler (2008) and the structural relations investigated therein [Bühner, Kröner, & Ziegler (2008). Working memory, visual-spatial intelligence and their relationship to problem-solving. Intelligence, 36, 672–680]. However, in the present study, we used different assessment instruments by employing assessments of figural, numerical, and verbal fluid reasoning, an assessment of numerical working memory, and a complex problem solving assessment using the MicroDYN approach. In a sample of N = 2,029 Finnish sixth-grade students, of whom 328 took the numerical working memory assessment, the findings diverged substantially from the results reported by Bühner et al. Importantly, in the present study, fluid reasoning was the main source of variation for rule knowledge and rule application, and working memory contributed only a little added value. Albeit generally in line with previously conducted research on the relation between complex problem solving and other cognitive abilities, these findings contrast directly with the results of Bühner et al. (2008), who reported that only working memory was a source of variation in complex problem solving, whereas fluid reasoning was not. Explanations for the different patterns of results are sought, and implications for the use of assessment instruments and for research on interindividual differences in complex problem solving are discussed.


Factored Statistical Machine Translation for German-English

Journal of Applied Information, Communication and Technology ◽ 10.33555/ejaict.v5i1.47 ◽ 2018 ◽ Vol 5 (1) ◽ pp. 37-45

Author(s): Darryl Yunus Sulistyan

Keyword(s): Machine Translation ◽ English Language ◽ Statistical Machine Translation ◽ New Model ◽ Language Pair

Machine translation is the automatic translation of sentences from one language into another. This paper tests the effectiveness of a new model of machine translation, factored machine translation. We compare the performance of an unfactored system, used as our baseline, with that of the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus. The tool we use is MOSES, which is free to download and use. We found, however, that the unfactored model scored over 24 BLEU and outperformed the factored model, which scored below 24 BLEU in all cases. In terms of words being translated, however, all of the factored models outperform the unfactored model.
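Since the comparison above rests on BLEU, a minimal sketch of the metric may help in reading the scores. It assumes tokenised input, a single reference per hypothesis, and no smoothing; the function names are illustrative, and this is not the scoring script used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis."""
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if min(matched) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score when hypotheses are shorter overall.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


reference = "the cat sat on the mat".split()
exact = corpus_bleu([reference], [reference])
near = corpus_bleu(["the cat sat on a mat".split()], [reference])
```

Here `exact` is a perfect score and `near` is lower because one substituted word breaks several higher-order n-grams, which shows how a system can lose BLEU while still translating most words correctly.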


On (Not) Translating Lacan: Barbara Cassin's Sophistico-Analytical Performances

Paragraph ◽ 10.3366/para.2020.0323 ◽ 2020 ◽ Vol 43 (1) ◽ pp. 98-113

Author(s): Michael Syrotinski

Keyword(s): Machine Translation ◽ Reading And Writing ◽ The Relationship ◽ The Way

Barbara Cassin's Jacques the Sophist: Lacan, Logos, and Psychoanalysis, recently translated into English, constitutes an important rereading of Lacan, and a sustained commentary not only on his interpretation of Greek philosophers, notably the Sophists, but more broadly the relationship between psychoanalysis and sophistry. In her study, Cassin draws out the sophistic elements of Lacan's own language, or the way that Lacan ‘philosophistizes’, as she puts it. This article focuses on the relation between Cassin's text and her better-known Dictionary of Untranslatables, and aims to show how and why both ‘untranslatability’ and ‘performativity’ become keys to understanding what this book is not only saying, but also doing. It ends with a series of reflections on machine translation, and how the intersubjective dynamic as theorized by Lacan might open up the possibility of what is here termed a ‘translatorly’ mode of reading and writing.


A Review and evaluation of Machine Translation methods for Lumasaaba

Journal of Digital Science ◽ 10.33847/2686-8296.2.1_1 ◽ 2020 ◽ pp. 3-17

Author(s): Peter Nabende

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Machine Translation ◽ Language Processing ◽ Research Area ◽ Data Driven ◽ East African ◽ Data Set ◽ African Languages ◽ Translation Methods

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing the current knowledge gap, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. Then we apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.
