Translation of Sentence Lampung-Indonesian Languages with Neural Machine Translation Attention Based Approach

Zaenal Abidin

doi:10.35450/jip.v6i02.97

Translation of Sentence Lampung-Indonesian Languages with Neural Machine Translation Attention Based Approach

Inovasi Pembangunan : Jurnal Kelitbangan ◽

10.35450/jip.v6i02.97 ◽

2018 ◽

Vol 6 (02) ◽

pp. 191-206

Author(s):

Zaenal Abidin

Keyword(s):

Machine Translation ◽

Language Translation ◽

Training Process ◽

Neural Machine Translation ◽

New Approach ◽

Source Language ◽

Average Value ◽

Approach Method ◽

Stable Vectors ◽

Network Component

This research applies an attention-based neural machine translation (NMT) approach to the automatic translation of the Lampung language into the Indonesian language. NMT, a new approach in machine translation technology, works by combining an encoder and a decoder. The encoder is a recurrent neural network component that encodes the source language into several fixed-length vectors, and the decoder is a recurrent neural network component that generates the complete translation. The research began by creating 3,000 parallel sentence pairs in Lampung (Api dialect) and Indonesian. It then determined the NMT model parameters for the training process, and finally built and evaluated the NMT model. Testing used 25 single sentences without out-of-vocabulary (OOV) words, 25 single sentences with OOV words, 25 plural sentences without OOV words, and 25 plural sentences with OOV words. On this test set, the attention-based NMT model achieved an average bilingual evaluation understudy (BLEU) score of 51.96%.
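As a rough illustration of how an average BLEU value like the one reported above can be computed, the following is a minimal sketch of unsmoothed sentence-level BLEU with a single reference, averaged over a test set. It is not the paper's evaluation script: the function names, the uniform 4-gram weighting, and the choice to average sentence-level scores are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(hypothesis))
    return bp * math.exp(sum(log_precisions) / max_n)


def average_bleu(pairs):
    """Mean sentence-level BLEU over (reference, hypothesis) pairs, as a percentage."""
    return 100.0 * sum(sentence_bleu(r, h) for r, h in pairs) / len(pairs)
```

Calling `average_bleu` on a list of tokenized (reference, hypothesis) pairs returns a percentage. Production evaluations usually rely on an established tool with smoothing and corpus-level statistics rather than a hand-rolled function like this one.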


A Study of Neural Machine Translation from Chinese to Urdu

Journal of Autonomous Intelligence ◽

10.32629/jai.v2i4.82 ◽

2020 ◽

Vol 2 (4) ◽

pp. 28

Author(s):

. Zeeshan

Keyword(s):

Machine Translation ◽

Chinese Language ◽

Language Translation ◽

Target Language ◽

Foreign Languages ◽

Neural Machine Translation ◽

Source Language ◽

Great Progress ◽

Score Method ◽

Translation Methods

Machine Translation (MT) translates text or speech from a source language to a target language. On its own, however, this process is not sufficient to produce a perfect translation, because whole expressions and their direct counterparts must be identified. Neural Machine Translation (NMT) is one of the most standard machine translation methods and has made great progress in recent years, especially for non-universal languages. However, local language translation software for other foreign languages is limited and needs improving. In this paper, the Chinese language is translated into the Urdu language with the help of Open Neural Machine Translation (OpenNMT) in deep learning. First, a Chinese-to-Urdu sentence dataset was established, supported by seven million sentences. This dataset was then used for training with the OpenNMT method. In the final stage, the output translation was compared with the desired translation using the BLEU score.


Knowledge Graphs Effectiveness in Neural Machine Translation Improvement

Computer Science ◽

10.7494/csci.2020.21.3.3701 ◽

2020 ◽

Vol 21 (3) ◽

Author(s):

Benyamin Ahmadnia ◽

Bonnie J. Dorr ◽

Parisa Kordjamshidi

Keyword(s):

Machine Translation ◽

Semantic Representation ◽

Language Translation ◽

Semantic Relations ◽

Training Data ◽

Target Language ◽

Neural Machine Translation ◽

Source Language ◽

Knowledge Graphs ◽

Unknown Words

Maintaining semantic relations between words during the translation process yields more accurate target-language output from Neural Machine Translation (NMT). Although difficult to achieve from training data alone, it is possible to leverage Knowledge Graphs (KGs) to retain source-language semantic relations in the corresponding target-language translation. The core idea is to use KG entity relations as embedding constraints to improve the mapping from source to target. This paper describes two embedding constraints, both of which employ Entity Linking (EL), which assigns a unique identity to entities, to associate words in training sentences with those in the KG: (1) a monolingual embedding constraint that supports an enhanced semantic representation of the source words through access to relations between entities in a KG; and (2) a bilingual embedding constraint that forces entity relations in the source language to be carried over to the corresponding entities in the target-language translation. The method is evaluated for English-Spanish translation, exploiting Freebase as a source of knowledge. Our experimental results show that exploiting KG information not only decreases the number of unknown words in the translation but also improves translation quality.


Improved neural machine translation for low-resource English–Assamese pair

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219260 ◽

2021 ◽

pp. 1-12

Author(s):

Sahinur Rahman Laskar ◽

Abdullah Faiz Ur Rahman Khilji ◽

Partha Pakray ◽

Sivaji Bandyopadhyay

Keyword(s):

Machine Translation ◽

Data Augmentation ◽

Language Translation ◽

Linguistically Diverse ◽

Neural Machine Translation ◽

Low Resource ◽

Parallel Data ◽

The World ◽

Translation Accuracy ◽

Vocabulary Problems

Language translation is essential to bring the world closer and plays a significant part in building a community among people of different linguistic backgrounds. Machine translation dramatically helps in removing the language barrier and allows easier communication among linguistically diverse communities. Due to the unavailability of resources, major languages of the world are regarded as low-resource languages. This makes automating translation among such languages, to benefit indigenous speakers, a challenging task. This article investigates neural machine translation for the resource-poor English–Assamese language pair by tackling insufficient data and out-of-vocabulary problems. We also propose a data augmentation-based NMT approach, which exploits synthetic parallel data, shows significantly improved translation accuracy for English-to-Assamese and Assamese-to-English translation, and obtains state-of-the-art results.


Controlling Neural Machine Translation Formality with Synthetic Supervision

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6379 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8568-8575

Author(s):

Xing Niu ◽

Marine Carpuat

Keyword(s):

Machine Translation ◽

Target Language ◽

Sentence Pair ◽

English Sentence ◽

Neural Machine Translation ◽

Source Language ◽

Training Scheme ◽

Training Examples ◽

Language Content ◽

Missing Element

This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience. Framing this problem as a neural sequence-to-sequence task ideally requires training triplets consisting of a bilingual sentence pair labeled with target language formality. However, in practice, available training examples are limited to English sentence pairs of different styles, and bilingual parallel sentences of unknown formality. We introduce a novel training scheme for multi-task models that automatically generates synthetic training triplets by inferring the missing element on the fly, thus enabling end-to-end training. Comprehensive automatic and human assessments show that our best model outperforms existing models by producing translations that better match desired formality levels while preserving the source meaning.


Identifying Semantics in Clinical Reports Using Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019552 ◽

2019 ◽

Vol 33 ◽

pp. 9552-9557

Author(s):

Srikanth Mujjiga ◽

Vamsi Krishna ◽

Kalyan Chakravarthi ◽

Vijayananda J

Keyword(s):

Machine Translation ◽

Large Scale ◽

Language Translation ◽

Semantic Search ◽

Neural Machine Translation ◽

Healthcare Facilities ◽

Semantic Level ◽

Concept Space ◽

Unique Approach ◽

Retrieval Problem

Clinical documents are vital resources for radiologists to consult or refer to while studying similar cases. In large healthcare facilities where millions of reports are generated, searching for relevant documents is quite challenging. With abundant interchangeable words in the clinical domain, understanding the semantics of the words in clinical documents is vital to improving search results. This paper details an end-to-end semantic search application that addresses the large-scale information retrieval problem of clinical reports. The paper specifically focuses on the challenge of identifying semantics in clinical reports to facilitate search at the semantic level. The semantic search works by mapping the documents into a concept space and performing the search in that space. The paper proposes a unique approach that frames the concept mapping problem as a language translation problem. The concept mapper is modelled using a neural machine translation (NMT) model based on an encoder-decoder architecture with attention. The regular-expression-based concept mapper takes approximately 3 seconds to extract UMLS concepts from a single document, whereas the trained NMT model does the same in approximately 30 milliseconds. The NMT-based model further enables the incorporation of negation detection to identify whether a concept is negated, facilitating search for negated queries.


Deep Learning-based Roman-Urdu to Urdu Transliteration

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001421520017 ◽

2020 ◽

pp. 2152001

Author(s):

Mehreen Alam ◽

Sibt ul Hussain

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Research Problem ◽

Attention Mechanism ◽

Data Driven ◽

Neural Machine Translation ◽

Parallel Corpus ◽

Source Language ◽

Data Driven Approach ◽

Modern Machine

Attention-based encoder-decoder models have superseded conventional techniques due to their unmatched performance on many neural machine translation problems. Usually, the encoder and decoder are two recurrent neural networks, where the decoder is directed to focus on relevant parts of the source language using an attention mechanism. This data-driven approach leads to generic and scalable solutions with no reliance on manual hand-crafted features. To the best of our knowledge, none of the modern machine translation approaches has been applied to the research problem of Urdu machine transliteration. Ours is the first attempt to apply a deep neural network-based encoder-decoder with an attention mechanism to this problem, using a Roman-Urdu and Urdu parallel corpus. To this end, we present (i) the first ever Roman-Urdu to Urdu parallel corpus of 1.1 million sentences, (ii) three state-of-the-art encoder-decoder models, and (iii) a detailed empirical analysis of these three models on the Roman-Urdu to Urdu parallel corpus. Overall, the attention-based model gives state-of-the-art performance with a benchmark BLEU score of 70. Our qualitative experimental evaluation shows that our models generate coherent transliterations which are grammatically and logically correct.
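To make the idea of a decoder "focusing on relevant parts of the source" concrete, here is a minimal sketch of one attention step. It assumes plain dot-product scoring and represents states as Python lists. The abstract does not say which scoring function the authors use, so `attend` and its inputs are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step using dot-product scores."""
    # Score each encoder state by its similarity to the current decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the encoder states.
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context
```

In a full model the context vector would be fed, together with the decoder state, into the layer that predicts the next output symbol. Learned projections and additive scoring are common variations on this scheme.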


TRANSLATION OF ENGLISH TASKS INTO INDONESIAN THROUGH ONLINE MACHINE TRANSLATION PROGRAM

IJER - INDONESIAN JOURNAL OF EDUCATIONAL REVIEW ◽

10.21009/ijer.04.01.10 ◽

2017 ◽

Vol 4 (1) ◽

pp. 103

Author(s):

Emzir Emzir ◽

Ninuk Lustyantie ◽

Akbar Akbar

Keyword(s):

Machine Translation ◽

Language Translation ◽

Doctoral Program ◽

Deep Understanding ◽

Language Education ◽

Target Language ◽

State University ◽

Optimum Result ◽

Source Language ◽

Different Cultures

The objective of this research is to obtain a deep understanding of how graduate students in the Language Education Doctoral Program of the State University of Jakarta, Indonesia, use online machine translation from source language to target language to achieve equivalence in the subject of Language Translation and Education. The approach is qualitative, using the ethnographic method. In the translation process, students write down words or copy and paste sentences to be translated, and the machine translation program then translates those words or sentences automatically. Repeated editing, revision, and correction must be performed to obtain an optimum result, i.e., translated sentences that are equivalent in text and meaning. Deviations occur due to inaccurate equivalents caused by cultural differences between the source language and target language, as well as by the scope of the scientific field of the translated text. The strategy used is literal translation. Based on the research results, translating English tasks into Indonesian through the online translation program is very useful in helping students complete their tasks during their coursework.


Acquiring Knowledge from Pre-Trained Model to Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6465 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9266-9273

Author(s):

Rongxiang Weng ◽

Heng Yu ◽

Shujian Huang ◽

Shanbo Cheng ◽

Weihua Luo

Keyword(s):

Machine Translation ◽

Large Scale ◽

Fine Tuning ◽

Great Success ◽

Training Process ◽

Neural Machine Translation ◽

Language Knowledge ◽

Knowledge Distillation ◽

Training Objective ◽

Natural Language Process

Pre-training and fine-tuning have achieved great success in the natural language processing field. The standard paradigm for exploiting them has two steps: first, pre-training a model, e.g. BERT, on large-scale unlabeled monolingual data; then, fine-tuning the pre-trained model on labeled data from downstream tasks. However, in neural machine translation (NMT), we address the problem that the training objective of the bilingual task is far different from that of the monolingual pre-trained model. Because of this gap, fine-tuning alone cannot fully utilize prior language knowledge in NMT. In this paper, we propose an Apt framework for acquiring knowledge from pre-trained models for NMT. The proposed approach includes two modules: (1) a dynamic fusion mechanism to fuse task-specific features adapted from general knowledge into the NMT network, and (2) a knowledge distillation paradigm to learn language knowledge continuously during the NMT training process. The proposed approach can integrate suitable knowledge from pre-trained models to improve NMT. Experimental results on WMT English-to-German, German-to-English, and Chinese-to-English machine translation tasks show that our model outperforms strong baselines and the fine-tuning counterparts.
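For readers unfamiliar with knowledge distillation, the sketch below shows the generic token-level form: a student is trained on a mix of the usual cross-entropy loss and a KL term that pulls its output distribution towards a teacher's. This is a textbook-style illustration, not the formulation used in the paper. The function name, the `alpha` mixing weight, and the `temperature` value are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target_index,
                      alpha=0.5, temperature=2.0):
    """alpha * cross-entropy on the gold token + (1 - alpha) * KL(teacher || student)."""
    # Hard-label term: negative log-likelihood of the gold token under the student.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[target_index])
    # Soft-label term: KL divergence between temperature-softened distributions.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher_soft, student_soft) if t > 0)
    return alpha * cross_entropy + (1 - alpha) * kl
```

When the student already matches the teacher, the KL term vanishes and only the weighted cross-entropy remains. Some formulations additionally scale the KL term by the squared temperature, which this sketch omits.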


Language To Language Translation System

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit206363 ◽

2020 ◽

pp. 289-293

Author(s):

Ms Pratheeksha ◽

Pratheeksha Rai ◽

Ms Vijetha

Keyword(s):

Speech Recognition ◽

Machine Translation ◽

Automatic Speech Recognition ◽

Speech Synthesis ◽

Language Translation ◽

Target Language ◽

Translation System ◽

Text To Speech ◽

Source Language ◽

Text To Speech Synthesis

In a language-to-language translation system, phrases spoken in one language are immediately spoken in another language by the device. Language-to-language translation is a three-step software process comprising Automatic Speech Recognition, Machine Translation, and Voice Synthesis. The discussion covers major speech translation projects that use different approaches to speech recognition, translation, and text-to-speech synthesis, highlighting the major pros and cons of each approach. Language translation is a process that takes a conversational phrase in one language as input and produces translated speech phrases in another language as output. The three components of language-to-language translation are connected in sequential order: Automatic Speech Recognition (ASR) converts the spoken phrases of the source language into text in the same language, machine translation then translates the source-language text into target-language text, and finally the speech synthesizer converts the target-language text to speech.
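The sequential connection of the three components can be sketched as simple function composition. The stage functions below (`fake_asr`, `fake_translate`, `fake_tts`) are hypothetical stand-ins invented for illustration so that the sketch runs without any speech or translation library. A real system would plug in actual ASR, MT, and TTS components.

```python
from typing import Callable


def make_pipeline(asr: Callable[[bytes], str],
                  translate: Callable[[str], str],
                  tts: Callable[[str], bytes]) -> Callable[[bytes], bytes]:
    """Chain the stages in order: speech -> source text -> target text -> speech."""
    def run(source_audio: bytes) -> bytes:
        source_text = asr(source_audio)          # step 1: speech recognition
        target_text = translate(source_text)     # step 2: machine translation
        return tts(target_text)                  # step 3: speech synthesis
    return run


# Hypothetical stand-ins for the three components.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already a transcript


def fake_translate(text: str) -> str:
    lexicon = {"hello": "halo", "world": "dunia"}  # toy English-Indonesian lexicon
    return " ".join(lexicon.get(word, word) for word in text.split())


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is synthesized speech


speech_to_speech = make_pipeline(fake_asr, fake_translate, fake_tts)
```

Because each stage only sees the output of the previous one, errors made by the recognizer propagate into translation and synthesis, which is one commonly noted drawback of this cascaded design.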


Explicitation in Neural Machine Translation

Across Languages and Cultures ◽

10.1556/084.2020.00012 ◽

2020 ◽

Vol 21 (2) ◽

pp. 195-216

Author(s):

Ralph Krüger

Keyword(s):

Machine Translation ◽

Contextual Information ◽

Research Report ◽

Linguistic Terms ◽

Neural Machine Translation ◽

New Approach ◽

Contextual Awareness ◽

Translation Systems ◽

Text Context ◽

And Storage

This paper is concerned with the following question: to what extent does neural machine translation (NMT) – a relatively new approach to machine translation (MT), which can draw on richer contextual information than previous MT architectures – perform explicitation shifts in translation and how are these shifts realised in linguistic terms? In order to answer this question, the paper attempts to identify instances of explicitation in the machine-translated version of a research report on carbon dioxide capture and storage. The machine-translated text was created using the publicly available generic NMT system DeepL. The human translation of the research report was analysed in a prior research project for instances of explicitation and implicitation (Krüger 2015). After a brief quantitative discussion of the frequency and distribution of explicitation shifts identified in the DeepL output as compared to the shifts identified in the human translation of the research report, the paper analyses in detail several examples in which DeepL performed explicitation shifts of various kinds. The quantitative and qualitative analyses are intended to yield a tentative picture of the capacity of state-of-the-art neural machine translation systems to perform explicitation shifts in translation. As explicitation is understood in this article as an indicator of translational text–context interaction, the explicitation performance of NMT can – to some extent – be taken to be indicative of the “contextual awareness” of this new MT architecture.
