Can a corpus-driven lexical analysis of human and machine translation unveil discourse features that set them apart?

Target ◽  
2021 ◽  
Author(s):  
Ana Frankenberg-Garcia

There is still much to learn about the ways in which human and machine translation differ with regard to the contexts that regulate the production and interpretation of discourse. The present study explores whether a corpus-driven lexical analysis of human and machine translation can unveil discourse features that set the two apart. A balanced corpus of source texts aligned with authentic, professional translations and neural machine translations was compiled for the study. Lexical discrepancies in the two translation corpora were then extracted via a corpus-driven keyword analysis, and examined qualitatively through parallel concordances of source texts aligned with human and machine translation. The study shows that keyword analysis not only reiterates known problems of discourse in machine translation such as lexical inconsistency and pronoun resolution, but can also provide valuable insights regarding contextual aspects of translated discourse deserving further research.
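The corpus-driven keyword analysis described above ranks words that are over-represented in one corpus relative to another. As a minimal sketch (not the author's actual pipeline), the following compares a machine-translation corpus against a human-translation corpus using Dunning's log-likelihood (G2) statistic, a standard keyness measure; the toy corpora are invented for illustration.

```python
import math
from collections import Counter

def keywords(focus_tokens, ref_tokens, top_n=5):
    """Rank words unusually frequent in the focus corpus relative to
    the reference corpus, using Dunning's log-likelihood (G2)."""
    f, r = Counter(focus_tokens), Counter(ref_tokens)
    nf, nr = len(focus_tokens), len(ref_tokens)
    scores = {}
    for w in f:
        a, b = f[w], r.get(w, 0)
        e1 = nf * (a + b) / (nf + nr)   # expected count in focus corpus
        e2 = nr * (a + b) / (nf + nr)   # expected count in reference corpus
        g2 = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0))
        if a / nf > b / nr:             # keep only over-represented words
            scores[w] = g2
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy "human translation" vs "machine translation" corpora (invented):
ht = "the report says the committee agreed the plan".split()
mt = "the report says that committee agree that plan plan".split()
print(keywords(mt, ht))  # words the MT corpus over-uses, most distinctive first
```

In a real study the two token lists would come from the aligned translation corpora, and each keyword would then be inspected in parallel concordances, as the abstract describes.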

2020 ◽  
Vol 34 (05) ◽  
pp. 9498-9506 ◽  
Author(s):  
Hyeongu Yun ◽  
Yongkeun Hwang ◽  
Kyomin Jung

Fully Attentional Networks (FANs) such as the Transformer (Vaswani et al. 2017) have shown superior results in Neural Machine Translation (NMT) tasks and have become a solid baseline for translation tasks. More recent studies have also reported experimental results showing that additional context sentences improve the translation quality of NMT models (Voita et al. 2018; Müller et al. 2018; Zhang et al. 2018). However, those studies have exploited multiple context sentences as a single long concatenated sentence, which may cause the models to suffer from increased computational complexity and long-range dependencies. In this paper, we propose the Hierarchical Context Encoder (HCE), which is able to exploit multiple context sentences separately using a hierarchical FAN structure. Our proposed encoder first abstracts sentence-level information from preceding sentences in a self-attentive way, and then hierarchically encodes context-level information. Through extensive experiments, we observe that our HCE records the best performance measured in BLEU score on English-German, English-Turkish, and English-Korean corpora. In addition, our HCE records the best performance on a crowd-sourced test set designed to evaluate how well an encoder can exploit contextual information. Finally, evaluation on an English-Korean pronoun resolution test suite also shows that our HCE can properly exploit contextual information.
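The two-level idea behind the encoder can be illustrated with a toy NumPy sketch (not the HCE implementation, and with none of the learned projections of a real Transformer): each preceding sentence is first summarised by self-attention into one vector, and a second attention step then weights those sentence vectors against the current sentence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attend(X):
    """Scaled dot-product self-attention over one sentence (tokens x dim),
    mean-pooled into a single sentence-level summary vector."""
    d = X.shape[-1]
    A = softmax(X @ X.T / np.sqrt(d))   # token-to-token attention weights
    return (A @ X).mean(axis=0)

def hierarchical_context(context_sentences, query):
    """Level 1: abstract each preceding sentence into one vector.
    Level 2: attend over those sentence vectors with the current
    (query) sentence representation to get one context vector."""
    S = np.stack([self_attend(s) for s in context_sentences])  # (n_sents, dim)
    w = softmax(S @ query / np.sqrt(len(query)))               # sentence weights
    return w @ S

rng = np.random.default_rng(0)
ctx = [rng.normal(size=(5, 8)), rng.normal(size=(7, 8))]  # two context sentences
q = rng.normal(size=8)                                    # current-sentence summary
print(hierarchical_context(ctx, q).shape)  # → (8,)
```

The hierarchy keeps each sentence's self-attention quadratic only in that sentence's length, rather than in the length of one long concatenation, which is the efficiency argument made in the abstract.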


2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine translation is the automatic translation of sentences from one language into another. This paper tests the effectiveness of a newer model of machine translation, factored machine translation, by comparing the performance of an unfactored baseline system against the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus, with the freely available MOSES toolkit. We found that the unfactored model scored above 24 BLEU and outperformed the factored model, which scored below 24 BLEU in all cases. In terms of the number of words translated, however, all of the factored models outperformed the unfactored model.
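The BLEU metric used for the comparison above combines clipped n-gram precision with a brevity penalty. As a minimal single-reference sketch (real evaluations such as those produced by MOSES tooling aggregate statistics over a whole test set), the example sentences below are invented:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified BLEU for one sentence pair: geometric mean of clipped
    n-gram precisions (uniform weights) times a brevity penalty."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((c & r).values())          # clipped n-gram matches
        total = max(sum(c.values()), 1)
        log_prec += math.log(max(overlap, 1e-9) / total) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec)

ref = "the committee approved the report without amendment".split()
hyp = "the committee approved the report without changes".split()
print(round(bleu(hyp, ref), 3))  # → 0.809
```

Published BLEU scores like the "over 24" in the abstract are this quantity (computed corpus-wide, often with smoothing) multiplied by 100.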


Paragraph ◽  
2020 ◽  
Vol 43 (1) ◽  
pp. 98-113
Author(s):  
Michael Syrotinski

Barbara Cassin's Jacques the Sophist: Lacan, Logos, and Psychoanalysis, recently translated into English, constitutes an important rereading of Lacan, and a sustained commentary not only on his interpretation of Greek philosophers, notably the Sophists, but more broadly the relationship between psychoanalysis and sophistry. In her study, Cassin draws out the sophistic elements of Lacan's own language, or the way that Lacan ‘philosophistizes’, as she puts it. This article focuses on the relation between Cassin's text and her better-known Dictionary of Untranslatables, and aims to show how and why both ‘untranslatability’ and ‘performativity’ become keys to understanding what this book is not only saying, but also doing. It ends with a series of reflections on machine translation, and how the intersubjective dynamic as theorized by Lacan might open up the possibility of what is here termed a ‘translatorly’ mode of reading and writing.


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to filling this knowledge gap, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and usually correspond to the source-language input.

