Machine Translation
Recently Published Documents

TOTAL DOCUMENTS: 6298 (FIVE YEARS: 2237)
H-INDEX: 50 (FIVE YEARS: 9)

Author(s):  
Xiaomian Kang ◽  
Yang Zhao ◽  
Jiajun Zhang ◽  
Chengqing Zong

Document-level neural machine translation (DocNMT) has yielded attractive improvements. In this article, we systematically analyze the discourse phenomena in Chinese-to-English translation and focus on the most prominent one, namely lexical translation consistency. To alleviate lexical inconsistency, we propose an effective approach that is aware of the words that need to be translated consistently and constrains the model to produce more consistent translations. Specifically, we first introduce a global context extractor to extract the document context and the consistency context. The two types of global context are then integrated into an encoder enhancer and a decoder enhancer to improve lexical translation consistency. We create a test set to evaluate lexical consistency automatically. Experiments demonstrate that our approach can significantly alleviate lexical translation inconsistency. In addition, our approach substantially improves translation quality compared to the sentence-level Transformer.
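The abstract does not give implementation details, but the general idea of injecting a pooled global (document or consistency) context vector into per-token encoder states can be sketched as a gated fusion layer. The module name, dimensions, and gating formulation below are illustrative assumptions, not the authors' exact enhancer architecture.

```python
# Minimal sketch: fuse a global context vector into token-level encoder states
# with a learned per-dimension gate. Purely illustrative; not the paper's model.
import torch
import torch.nn as nn

class ContextEnhancer(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)      # project the global context
        self.gate = nn.Linear(2 * d_model, d_model)  # gate from [token; context]

    def forward(self, hidden: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # hidden:  (batch, seq_len, d_model) token states from the encoder
        # context: (batch, d_model) pooled document/consistency context
        ctx = self.proj(context).unsqueeze(1).expand_as(hidden)
        g = torch.sigmoid(self.gate(torch.cat([hidden, ctx], dim=-1)))
        return g * hidden + (1.0 - g) * ctx          # gated mixture of the two

# Usage with dummy tensors
enhancer = ContextEnhancer(d_model=512)
h = torch.randn(2, 10, 512)    # encoder states for 2 sentences of 10 tokens
c = torch.randn(2, 512)        # one global context vector per document
out = enhancer(h, c)           # (2, 10, 512) context-enhanced states
```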


Author(s):  
Rupjyoti Baruah ◽  
Rajesh Kumar Mundotiya ◽  
Anil Kumar Singh

Machine translation (MT) systems have been built using numerous different techniques for bridging language barriers. These techniques are broadly categorized into approaches such as Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). End-to-end NMT systems significantly outperform SMT in translation quality on many language pairs, especially those with adequate parallel corpora. We report comparative experiments on baseline MT systems for Assamese to other Indo-Aryan languages (in both translation directions) using traditional phrase-based SMT as well as several more successful NMT architectures, namely a basic sequence-to-sequence model with attention, the Transformer, and a fine-tuned Transformer. The results are evaluated using BLEU (BiLingual Evaluation Understudy), the most prominent standard automatic metric, as well as other well-known metrics, to explore the performance of the different baseline MT systems, since this is the first such work involving Assamese. The evaluation scores of the SMT and NMT models are compared for bi-directional language pairs involving Assamese and other Indo-Aryan languages (Bangla, Gujarati, Hindi, Marathi, Odia, Sinhalese, and Urdu). The highest BLEU scores obtained are for Assamese to Sinhalese with SMT (35.63) and for Assamese to Bangla with the NMT systems (seq2seq: 50.92, Transformer: 50.01, fine-tuned Transformer: 50.19). We also try to relate the results to language characteristics, distances, family trees, domains, data sizes, and sentence lengths. We find that domain is the most important factor affecting the results for the given data domains and sizes. We compare our results with the only existing MT system for Assamese (Bing Translator) and also with pairs involving Hindi.
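For reference, corpus-level BLEU scores of the kind reported above can be computed with the sacrebleu library; the hypothesis and reference sentences below are toy English examples, not the paper's Assamese data.

```python
# Compute corpus BLEU with sacrebleu (pip install sacrebleu).
# The sentences are placeholders for illustration only.
import sacrebleu

hypotheses = [
    "the cat sat on the mat",
    "there is a book on the table",
]
# One reference stream: one reference string per hypothesis.
references = [[
    "the cat is on the mat",
    "a book is on the table",
]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```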


Author(s):  
Meftah Mohammed Charaf Eddine

In the field of machine translation of texts, ambiguity, both lexical (dictionary-level) and structural, remains one of the difficult problems. Researchers in this field use different approaches, the most important of which is machine learning in its various forms. The goal of the approach we propose in this article is to define a new concept of electronic text that is free from any lexical or structural ambiguity. We use a semantic coding system that attaches to the original electronic text (via the text editor interface) the meanings intended by the author: the author specifies the intended meaning of each word that could be a source of ambiguity. The proposed approach can be used with any type of electronic text (word-processing documents, web pages, email text, etc.). In the experiments we conducted with this approach, we obtained a very high accuracy rate, and we argue that the problem of lexical and structural ambiguity can be completely solved. With this new concept of electronic text, the text file contains not only the text itself but also, in the form of symbols, the exact meaning intended by the writer. These semantic symbols are used during machine translation to obtain a translated text completely free of lexical and structural ambiguity.
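The abstract does not specify the coding format, but the core idea of carrying the author's intended sense alongside each potentially ambiguous word can be sketched as below. The tag format ("word|SENSE_ID") and the sense labels are illustrative assumptions, not the paper's actual semantic coding scheme.

```python
# Minimal sketch: attach an author-selected sense symbol to ambiguous words
# so a downstream MT system never has to guess. Format is hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SenseTaggedToken:
    surface: str                   # the word as written
    sense: Optional[str] = None    # author-selected sense symbol, if ambiguous

def encode(tokens: list) -> str:
    # Serialize to a plain-text form that can travel with the document.
    return " ".join(
        t.surface if t.sense is None else f"{t.surface}|{t.sense}"
        for t in tokens
    )

sentence = [
    SenseTaggedToken("He"),
    SenseTaggedToken("sat"),
    SenseTaggedToken("by"),
    SenseTaggedToken("the"),
    SenseTaggedToken("bank", sense="RIVER_EDGE"),  # not the financial sense
]
print(encode(sentence))  # He sat by the bank|RIVER_EDGE
```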


2022 ◽  
Vol 4 ◽  
Author(s):  
Ziyan Yang ◽  
Leticia Pinto-Alva ◽  
Franck Dernoncourt ◽  
Vicente Ordonez

People are able to describe images using thousands of languages, but all languages share only one visual world. The aim of this work is to use the intermediate visual representations learned by a deep convolutional neural network to transfer information across languages for which paired data is not available in any form. Our work proposes using backpropagation-based decoding coupled with transformer-based multilingual-multimodal language models to obtain translations between any languages seen during training. We demonstrate the capabilities of this approach in particular on German-Japanese and Japanese-German sentence pairs, given training data of images freely associated with text in English, German, and Japanese, but where no single image is annotated in both Japanese and German. Moreover, we demonstrate that our approach is also generally useful for multilingual image captioning when sentences in a second language are available at test time. Our method also compares favorably on the Multi30k dataset against recently proposed methods that likewise aim to leverage images as an intermediate source of translations.
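The mechanism behind backpropagation-based decoding, stripped of the multimodal transformer itself, is to hold the model's weights fixed and optimize an intermediate representation by gradient descent until it explains the source sentence, then read the target-language output from that same representation. The toy sketch below shows only this mechanism with tiny frozen random linear heads; the heads, sizes, and token ids are placeholders, not the authors' system.

```python
# Toy mechanism demo: optimize a shared pivot representation z by backprop
# through a frozen source-language head, then read the target-language head
# on the same z. Frozen random heads here; purely illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, vocab, length = 32, 50, 4
src_head = nn.Linear(d, vocab)   # stands in for the source-language decoder head
tgt_head = nn.Linear(d, vocab)   # stands in for the target-language decoder head
for head in (src_head, tgt_head):
    for p in head.parameters():
        p.requires_grad_(False)

src_ids = torch.tensor([5, 12, 33, 7])          # placeholder source token ids
z = torch.zeros(length, d, requires_grad=True)  # shared pivot representation
opt = torch.optim.Adam([z], lr=0.1)

for _ in range(300):                            # backpropagation-based decoding loop
    loss = nn.functional.cross_entropy(src_head(z), src_ids)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(src_head(z).argmax(-1))  # recovers src_ids once z fits the source
print(tgt_head(z).argmax(-1))  # where the target-language output would be read off
```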


2022 ◽  
Author(s):  
Shufang Xie ◽  
Yingce Xia ◽  
Lijun Wu ◽  
Yiqing Huang ◽  
Yang Fan ◽  
...  

2022 ◽  
Author(s):  
Xueqing Wu ◽  
Yingce Xia ◽  
Jinhua Zhu ◽  
Lijun Wu ◽  
Shufang Xie ◽  
...  

Author(s):  
Aletta G. Dorst ◽  
Susana Valdez ◽  
Heather Bouman

Machine Translation (MT), the process by which a computer engine such as Google Translate or Bing automatically translates a text from one language into another without any human involvement, is increasingly used in professional, institutional and everyday contexts for a wide range of purposes. While a growing number of studies have looked at professional translators and translation students, there is currently a lack of research on non-translator users and uses in multilingual contexts. This paper presents a survey examining how, when and why students at Leiden University's Faculty of Humanities use MT. A questionnaire was used to determine which MT engines students use and for what purposes, and to gauge their awareness of issues concerning privacy, academic integrity and plagiarism. The findings reveal a widespread adoption of Google Translate and indicate that students use MT predominantly to look up single words, as an alternative to a dictionary. Many seemed sceptical about the value of MT for educational purposes, and many assumed that the use of MT is not permitted by lecturers for graded assignments, especially in courses focusing on language skills. The results demonstrate a clear need for more MT literacy. Students may not need practical training in how to use MT, but there is much room for improvement in terms of when and why they use it.

