Word reordering on multiple pivots for the Japanese and Indonesian language pair

Author(s):  
Sari Dewi Budiwati ◽  
Masayoshi Aritsugi
2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine translation automatically translates sentences from one language into another. This paper tests the effectiveness of a newer model of machine translation, factored machine translation. We compare the unfactored system, our baseline, with the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus. The tool we use is Moses, which is free to download and use. We found, however, that the unfactored model scored over 24 BLEU and outperformed the factored models, which scored below 24 BLEU in all cases. In terms of the number of words translated, however, all of the factored models outperformed the unfactored model.


2016 ◽  
Vol 1 (1) ◽  
pp. 45-49
Author(s):  
Avinash Singh ◽  
Asmeet Kour ◽  
Shubhnandan S. Jamwal

The objective of this paper is to analyze translation based on an English-Dogri parallel corpus. Machine translation is the translation from one language into another. Machine translation is the biggest application of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% in translating English to Dogri and 87% in translating Dogri to English.


Author(s):  
Elsa Huertas Barros ◽  
Míriam Buendía Castro

Based on a previous case study on common translation errors made by trainee translators when dealing with phraseological units in legal translation (Huertas Barros and Buendía Castro 2018, Analysing phraseological units in legal translation: Evaluation of translation errors for the English-Spanish language pair. In S. Gozdz Roszkowski & G. Pontrandolfo (eds.),


Author(s):  
Iana E. ANDREEVA

This article examines the linguistic means of representing the category of everyday life in G. Sh. Yakhina's novel “Zuleikha opens her eyes” and in its translation into Chinese. Recently, there has been increasing interest in the anthropology of everyday life, a broad line of research into everyday life. A comparative study of the linguistic units that reveal the essence of everyday human existence makes it possible to identify lacunar units that are difficult to translate in fiction in the context of the Russian-Chinese language pair. The scientific novelty of the research lies in its analysis of the linguistic means of conveying the category of everyday life when translating a Russian literary text into Chinese. The work used the methods of comparative, component, and contextual analysis, as well as linguoculturological commentary. As a result of the study, lexical-semantic, lexical-stylistic, and grammatical lacunar units were identified, which demonstrate linguocultural barriers in the process of translating a text into Chinese. A comparative analysis of the texts was carried out in order to understand the lexical and grammatical transformations performed in the process of translation. As a result, the main ways of compensating for the lacunae of everyday life in Russian-Chinese translation were identified: transcription, calquing, descriptive translation, and lexical-semantic replacement. In addition, it was found that studying the various ways of depicting everyday life in a literary text not only makes it possible to identify lacunar units of everyday life, but also reveals the artistic and philosophical intention of the work.


Author(s):  
Iqra Muneer ◽  
Rao Muhammad Adeel Nawab

Cross-Lingual Text Reuse Detection (CLTRD) has recently attracted the attention of the research community because a large amount of digital text is readily available for reuse in multiple languages through online digital repositories. In addition, efficient machine translation systems are freely and readily available to translate text from one language into another, which makes it quite easy to reuse text across languages, and consequently difficult to detect such reuse. In the literature, the most prominent and widely used approach for CLTRD is Translation plus Monolingual Analysis (T+MA). To detect CLTR for the English-Urdu language pair, T+MA has been used only with lexical approaches, namely N-gram Overlap, Longest Common Subsequence, and Greedy String Tiling. This shows that T+MA has not been thoroughly explored for the English-Urdu language pair. To fill this gap, this study presents an in-depth comparison of 26 approaches that are based on T+MA. These include semantic similarity approaches (semantic tagger-based approaches, WordNet-based approaches), a probabilistic approach (Kullback-Leibler distance), monolingual word embedding-based approaches, a siamese recurrent architecture, and monolingual sentence transformer-based approaches for the English-Urdu language pair. The evaluation was carried out using the CLEU benchmark corpus, for both the binary and the ternary classification tasks. Our extensive experimentation shows that our proposed approach, a combination of the 26 approaches, obtained F1 scores of 0.77 and 0.61 for the binary and ternary classification tasks, respectively, and outperformed the previously reported approaches [41] (F1 = 0.73 for the binary and F1 = 0.55 for the ternary classification task) on the CLEU corpus.
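
Two of the lexical measures named in this abstract, N-gram Overlap and Longest Common Subsequence, can be illustrated with a minimal Python sketch. It assumes the translation step of T+MA has already been done, so both texts are tokenized in the same language. The function names, the containment-style normalization for n-gram overlap, and the division by the shorter text's length for LCS are illustrative assumptions, not the formulation used in the paper.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(source: List[str], suspicious: List[str], n: int = 2) -> float:
    """Share of the suspicious text's n-grams that also occur in the source.

    Assumption: containment-style normalization (divide by the number of
    distinct n-grams in the suspicious text).
    """
    src, sus = ngrams(source, n), ngrams(suspicious, n)
    if not sus:
        return 0.0
    return len(src & sus) / len(sus)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: List[str], b: List[str]) -> float:
    """LCS length normalized by the shorter text (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))


if __name__ == "__main__":
    source = "the cat sat on the mat".split()
    suspicious = "the cat lay on the mat".split()
    print(ngram_overlap(source, suspicious, n=2))
    print(lcs_similarity(source, suspicious))
```

Scores like these would typically be thresholded or passed to a classifier to decide whether a text pair counts as reused; that step is left out here.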


2021 ◽  
pp. 1-10
Author(s):  
Zhiqiang Yu ◽  
Yuxin Huang ◽  
Junjun Guo

It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions. Thai-Lao is a typical low-resource language pair with only a tiny parallel corpus, which leads to suboptimal NMT performance. However, Thai and Lao have considerable similarities in linguistic morphology, and a bilingual lexicon for them is relatively easy to obtain. To use this feature, we first build a bilingual similarity lexicon composed of pairs of similar words. We then propose a novel NMT architecture that leverages the similarity between Thai and Lao. Specifically, besides the prevailing sentence encoder, we introduce an extra similarity lexicon encoder into the conventional encoder-decoder architecture, which represents the semantic information carried by the similarity lexicon. We further provide a simple mechanism in the decoder to balance the information representations delivered from the input sentence and the similarity lexicon. Our approach can fully exploit the linguistic similarity carried by the similarity lexicon to improve translation quality. Experimental results demonstrate that our approach achieves significant improvements over the state-of-the-art Transformer baseline system and previous similar works.


Semiotica ◽  
2018 ◽  
Vol 2018 (221) ◽  
pp. 175-198 ◽  
Author(s):  
Rongbo Fu

This paper is a corpus-based sociosemiotic inquiry into the translation of linguistic modality in government press conferences in the Chinese context, with an eye to what it indicates about the interpreter’s identity. Viewing translation (including interpreting) as a process of social semiosis, the paper draws on theoretical insights from Systemic Functional Linguistics (SFL) and applies them to the analysis of modality in both English and Chinese, the language pair that concerns the present research. Results of the study show that, while modality distribution in the two languages is basically maintained at the same level, interchangeable uses between volitive and obligatory subtypes of modality plus the general increase of modality value in interpreted vis-à-vis source speeches indicate that interpreters are not deprived of mediating latitude, which is believed to contradict their prescriptive stereotypes. Also, exemplary parallel concordance analysis of modality reveals that interpreters adopt various solutions to translating the same modal element. Further, the paper proposes a taxonomy for the analysis of modality shifts in interpreter-mediated encounters, with illustrative cases of each subclass examined and discussed. The findings are expected to shed light on the interpreter’s identity in political institutional settings.


2010 ◽  
Vol 7 (2) ◽  
pp. 57-73
Author(s):  
Primož Jurko

Phraseology is seen as one of the key elements and arguably the most productive part of any language. The paper is focused on collocations and separates them from other phraseological units, such as idioms or compounds. Highlighting the difference between a monolingual and a bilingual (i.e. contrastive) approach to collocation, the article presents two distinct classes of collocations: grammatical and lexical. The latter, treated contrastively, represent the focal point of the paper, since they are an unending source of translation errors to both students of translation and professional translators. The author introduces a methodology of systematic classification of lexical collocations applied on the Slovene-English language pair and based on structural (lexical congruence) and semantic (translational predictability) criteria.

