Research on Mongolian-Chinese machine translation based on the end-to-end neural network

Author(s):  
Ren Qing-Dao-Er-Ji ◽  
Yila Su ◽  
Nier Wu

With the development of natural language processing and neural machine translation, end-to-end (E2E) neural network models have gradually become the focus of research because of their high translation accuracy and strong semantic fidelity. However, problems such as limited vocabulary and low translation fidelity remain. In this paper, a discriminant method and a Conditional Random Field (CRF) model were used to segment and label Mongolian stems and affixes in the preprocessing stage of the Mongolian-Chinese bilingual corpus. To address the low-fidelity problem, a decoding model combining a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) was constructed, with the GRU performing target-language decoding. A global attention model was used to obtain bilingual word alignment information during alignment processing. Finally, translation quality was evaluated with Bilingual Evaluation Understudy (BLEU) and Perplexity (PPL) values. The improved model yields a BLEU value of 25.13 and a PPL value of [Formula: see text]. The experimental results show that the E2E Mongolian-Chinese neural machine translation model improves translation quality and reduces semantic confusion compared with traditional statistical methods and machine translation models based on Recurrent Neural Networks (RNNs).
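
As a concrete illustration of the decoder described above, the following is a minimal PyTorch sketch of one GRU decoding step with global (dot-product) attention over the encoder states. It is not the authors' code; the layer sizes, names, and scoring function are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalAttentionGRUDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim * 2, vocab_size)  # over [state; context]

    def forward(self, prev_token, hidden, encoder_states):
        # prev_token: (batch,); hidden: (1, batch, hid); encoder_states: (batch, src_len, hid)
        emb = self.embed(prev_token).unsqueeze(1)                    # (batch, 1, emb)
        output, hidden = self.gru(emb, hidden)                       # (batch, 1, hid)
        # Global attention: score every source state with a dot product.
        scores = torch.bmm(output, encoder_states.transpose(1, 2))   # (batch, 1, src_len)
        align = F.softmax(scores, dim=-1)                            # word-alignment weights
        context = torch.bmm(align, encoder_states)                   # (batch, 1, hid)
        logits = self.out(torch.cat([output, context], dim=-1).squeeze(1))
        return logits, hidden, align.squeeze(1)

The returned alignment weights are what the abstract refers to as bilingual word alignment information: at each decoding step they indicate which source positions the model attended to.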

Author(s):  
N Revathi

Abstract: Language is a primary mode of communication, and translation is a critical tool for understanding information in a foreign language. Machine translation allows users to absorb unfamiliar linguistic material without the help of human translators. The main goal of this project is to create a practical language translator from English to Hindi. Given its relevance and potential for English-Hindi translation, machine translation is an efficient way to convert content into a new language without employing people. Among the available approaches, Neural Machine Translation (NMT) is one of the most effective. We therefore employ sequence-to-sequence modeling, which combines Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), and encoder-decoder methods. The project also covers Deep Neural Network (DNN) fundamentals and principles of deep learning as applied to machine translation within Natural Language Processing (NLP), where DNNs play a crucial role in machine learning techniques. Keywords: Sequence to Sequence, Encoder-Decoder, Recurrent Neural Network, Long Short-Term Memory, Deep Neural Network.
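
As a minimal illustration of the encoder-decoder method the abstract names, the sketch below wires an LSTM encoder to an LSTM decoder trained with teacher forcing. Vocabulary sizes, dimensions, and names are placeholder assumptions, not the project's actual code.

import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, emb)
        self.tgt_embed = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        self.decoder = nn.LSTM(emb, hid, batch_first=True)
        self.project = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the whole English sentence; keep only the final (h, c) state.
        _, state = self.encoder(self.src_embed(src_ids))
        # Teacher forcing: decode the Hindi tokens conditioned on that state.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), state)
        return self.project(dec_out)  # (batch, tgt_len, tgt_vocab) logits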


2017 ◽  
Vol 108 (1) ◽  
pp. 37-48 ◽  
Author(s):  
Praveen Dakwale ◽  
Christof Monz

Abstract: Neural machine translation is a recently proposed approach that has shown results competitive with traditional MT approaches. Standard neural MT is an end-to-end neural network in which the source sentence is encoded by a recurrent neural network (RNN) called the encoder and the target words are predicted using another RNN known as the decoder. Recently, various models have been proposed that replace the RNN encoder with a convolutional neural network (CNN). In this paper, we propose to augment the standard RNN encoder in NMT with additional convolutional layers in order to capture wider context in the encoder output. Experiments on English-to-German translation demonstrate that our approach achieves significant improvements over a standard RNN-based baseline.
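
A minimal sketch of the idea, under assumptions about the details: a bidirectional RNN encoder whose outputs pass through a 1-D convolution over time, so that each encoder state mixes in a window of its neighbors, combined residually. This is illustrative, not the authors' exact architecture.

import torch
import torch.nn as nn

class ConvAugmentedEncoder(nn.Module):
    def __init__(self, vocab_size, emb=256, hid=256, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
        # 1-D convolution over time widens the receptive field of each state.
        self.conv = nn.Conv1d(2 * hid, 2 * hid, kernel, padding=kernel // 2)

    def forward(self, src_ids):
        states, _ = self.rnn(self.embed(src_ids))        # (batch, len, 2*hid)
        wide = self.conv(states.transpose(1, 2)).transpose(1, 2)
        return states + wide                             # residual combination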


2018 ◽  
Vol 28 (09) ◽  
pp. 1850007
Author(s):  
Francisco Zamora-Martinez ◽  
Maria Jose Castro-Bleda

Neural Network Language Models (NNLMs) are a successful approach to Natural Language Processing tasks such as Machine Translation. In this work we introduce a Statistical Machine Translation (SMT) system which fully integrates NNLMs in the decoding stage, breaking with the traditional approach based on N-best list rescoring. The neural net models (both language models (LMs) and translation models) are fully coupled in the decoding stage, allowing them to more strongly influence translation quality. Computational issues were solved by a novel idea based on memorization and smoothing of the softmax normalization constants to avoid their computation, which introduces a trade-off between LM quality and computational cost. These ideas were studied in a machine translation task with different combinations of neural networks used both as translation models and as target LMs, comparing phrase-based and n-gram-based systems, and showing that the integrated approach seems more promising for n-gram-based systems, even with non-full-quality NNLMs.
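
A minimal sketch of the memorization-and-smoothing idea, with illustrative assumptions about cache keying and the smoothing constant: the softmax normalizer Z(h) = sum over w of exp(s(w|h)) is computed once per observed history and memorized, while a smoothed running estimate stands in for unseen histories, trading LM quality for decoding speed.

import math

class SoftmaxNormalizerCache:
    def __init__(self, alpha=0.9):
        self.cache = {}           # history -> memorized log Z(h)
        self.running_logz = 0.0   # smoothed estimate for unseen histories
        self.alpha = alpha        # smoothing constant (assumed value)

    def log_prob(self, score_fn, word, history, vocab=None):
        if history in self.cache:
            log_z = self.cache[history]
        elif vocab is not None:
            # Pay the full normalization cost once, then memorize it.
            log_z = math.log(sum(math.exp(score_fn(w, history)) for w in vocab))
            self.cache[history] = log_z
            self.running_logz = (self.alpha * self.running_logz
                                 + (1 - self.alpha) * log_z)
        else:
            # Unseen history during decoding: use the smoothed approximation.
            log_z = self.running_logz
        return score_fn(word, history) - log_z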


Author(s):  
Afrizal Zein

A machine translator is an automatic tool that converts a text from one language into a different one. It is software whose translations are produced from a linear regression model whose parameters are estimated from the statistical analysis of bilingual texts. We now introduce the next step in building a better machine translator using the Neural Machine Translation method. Neural Machine Translation translates whole sentences at a time, rather than fragment by fragment. It uses this broader context to help determine the most relevant translation, which it then rearranges and adjusts to read more like natural human speech with correct grammar. The application was built in the C# programming language together with the Google AJAX API library to translate text, retrieving translations by parsing the JSON content. The research yielded translations that are much smoother and easier to read, all of which is possible because of the end-to-end learning system built on Neural Machine Translation, which essentially means the system learns over time to produce better, more natural translations.
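
The application described above is written in C#; as a language-neutral illustration of the same call-and-parse pattern, here is a Python sketch. The endpoint, parameter names, and response shape are hypothetical placeholders (the Google AJAX Language API the article relies on is long deprecated), not a real API.

import json
import urllib.parse
import urllib.request

def translate(text, source="en", target="id",
              endpoint="https://example.com/translate"):  # placeholder URL
    # Build the request query string and fetch the response.
    query = urllib.parse.urlencode(
        {"q": text, "source": source, "target": target})
    with urllib.request.urlopen(f"{endpoint}?{query}") as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    # Assumed response shape: {"data": {"translatedText": "..."}}
    return payload["data"]["translatedText"]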


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Michael Adjeisah ◽  
Guohua Liu ◽  
Douglas Omwenga Nyabuga ◽  
Richard Nuetey Nortey ◽  
Jinling Song

Scaling natural language processing (NLP) to low-resourced languages to improve machine translation (MT) performance remains enigmatic. This research contributes to the domain with a low-resource English-Twi translation study based on filtered synthetic-parallel corpora. It is often difficult to learn what a good-quality corpus looks like in low-resource conditions, mainly where the target corpus is the only sample text of the parallel language. To improve MT performance for such low-resource language pairs, we propose to expand the training data by injecting a synthetic-parallel corpus obtained by translating a monolingual corpus from the target language, based on bootstrapping with different parameter settings. Furthermore, we perform unsupervised measurements on each sentence pair using squared Mahalanobis distances, a filtering technique that predicts sentence parallelism. Additionally, we make extensive use of three different sentence-level similarity metrics after round-trip translation. Experimental results on varying amounts of available parallel corpora demonstrate that injecting a pseudo-parallel corpus and extensive filtering with sentence-level similarity metrics significantly improve the original out-of-the-box MT systems for low-resource language pairs. Compared with existing improvements on the same original framework under the same structure, our approach exhibits substantial gains in BLEU and TER scores.
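
A minimal sketch of the Mahalanobis filtering step, assuming each sentence pair is represented by a fixed-dimensional feature vector (for example, concatenated source and target sentence embeddings); the keep-fraction threshold is an illustrative choice.

import numpy as np

def mahalanobis_filter(clean_feats, synth_feats, keep_fraction=0.7):
    # clean_feats, synth_feats: (n, d) arrays of per-pair feature vectors.
    # Fit the distribution of trusted (clean) pairs.
    mu = clean_feats.mean(axis=0)
    cov = np.cov(clean_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])         # regularize for invertibility
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of each synthetic pair from that distribution.
    diff = synth_feats - mu
    d2 = np.einsum("nd,dk,nk->n", diff, inv_cov, diff)
    # Keep the synthetic pairs closest to the clean distribution.
    cutoff = np.quantile(d2, keep_fraction)
    return np.flatnonzero(d2 <= cutoff)        # indices of pairs to keep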


2020 ◽  
Vol 34 (05) ◽  
pp. 8830-8837
Author(s):  
Xin Sheng ◽  
Linli Xu ◽  
Junliang Guo ◽  
Jingchang Liu ◽  
Ruoyu Zhao ◽  
...  

We propose a novel introspective model for variational neural machine translation (IntroVNMT), inspired by the recent successful application of the introspective variational autoencoder (IntroVAE) to high-quality image synthesis. Unlike the vanilla variational NMT model, IntroVNMT can improve itself introspectively by evaluating the quality of generated target sentences according to the high-level latent variables of the real and generated target sentences. As a consequence of introspective training, the model learns to discriminate between generated and real target-language sentences via the latent variables produced by its encoder, and in this way IntroVNMT is able to generate more realistic target sentences in practice. At the same time, IntroVNMT inherits the advantages of variational autoencoders (VAEs), so its training process is more stable than that of generative adversarial network (GAN) based models. Experimental results on different translation tasks demonstrate that the proposed model achieves significant improvements over the vanilla variational NMT model.
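
A heavily simplified sketch of the introspective signal, under the assumption (borrowed from IntroVAE) that the encoder's posterior KL term acts as an energy that should stay low for real target sentences and be pushed above a margin for generated ones; the margin value and loss shape are illustrative, not the paper's exact objective.

import torch
import torch.nn.functional as F

def kl_energy(mu, logvar):
    # KL(q(z|x) || N(0, I)) per sentence, used as the discrimination energy.
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - 1.0 - logvar, dim=-1)

def encoder_adversarial_loss(enc, real_tgt, fake_tgt, margin=5.0):
    mu_r, lv_r = enc(real_tgt)            # latents of real target sentences
    mu_f, lv_f = enc(fake_tgt.detach())   # latents of generated sentences
    # Keep the energy of real sentences low; push fake energy above the margin.
    return (kl_energy(mu_r, lv_r).mean()
            + F.relu(margin - kl_energy(mu_f, lv_f)).mean())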


2020 ◽  
Vol 34 (05) ◽  
pp. 8568-8575
Author(s):  
Xing Niu ◽  
Marine Carpuat

This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience. Framing this problem as a neural sequence-to-sequence task ideally requires training triplets consisting of a bilingual sentence pair labeled with target language formality. However, in practice, available training examples are limited to English sentence pairs of different styles, and bilingual parallel sentences of unknown formality. We introduce a novel training scheme for multi-task models that automatically generates synthetic training triplets by inferring the missing element on the fly, thus enabling end-to-end training. Comprehensive automatic and human assessments show that our best model outperforms existing models by producing translations that better match desired formality levels while preserving the source meaning.
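
A schematic sketch of the on-the-fly triplet completion: bilingual pairs of unknown style receive a predicted formality label, and monolingual style pairs receive a back-translated source side. The helpers formality_clf and back_translate are assumed components, not the authors' released code.

def complete_triplet(example, formality_clf, back_translate):
    src, tgt, formality = example  # any one element may be None
    if formality is None:
        # Bilingual pair of unknown style: infer the label from the target.
        formality = formality_clf(tgt)          # e.g. "formal" / "informal"
    if src is None:
        # English style pair: synthesize the missing source by back-translation.
        src = back_translate(tgt)
    return src, tgt, formality

def make_training_batch(examples, formality_clf, back_translate):
    # Every example becomes a full (source, target, formality) triplet,
    # enabling end-to-end multi-task training.
    return [complete_triplet(ex, formality_clf, back_translate)
            for ex in examples]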


10.29007/nwj8 ◽  
2019 ◽  
Author(s):  
Sebastien Carré ◽  
Victor Dyseryn ◽  
Adrien Facon ◽  
Sylvain Guilley ◽  
Thomas Perianin

Cache timing attacks are serious security threats that exploit cache memories to steal secret information. We believe that identifying a sequence of operations from a set of cache-timing measurements is not a trivial step when building an attack. We present a recurrent neural network model able to automatically retrieve a sequence of function calls from cache timings. Inspired by natural language processing, our model is able to learn from partially labelled data. We use the model to unfold an end-to-end automated attack on OpenSSL ECDSA on the secp256k1 curve. Contrary to most research, we did not need human processing of the traces to retrieve the relevant information.
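
A minimal sketch of the backbone such a model might use: a recurrent network that maps a trace of cache-timing samples to per-step function-call logits. The paper trains on partially labelled data, which this sketch omits; the bidirectional-LSTM choice and dimensions are assumptions.

import torch
import torch.nn as nn

class TraceLabeler(nn.Module):
    def __init__(self, n_calls, hid=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hid,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hid, n_calls)

    def forward(self, timings):
        # timings: (batch, trace_len) raw cache-timing samples
        states, _ = self.lstm(timings.unsqueeze(-1))
        return self.head(states)  # (batch, trace_len, n_calls) call logits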


2020 ◽  
Vol 184 ◽  
pp. 01061
Author(s):  
Anusha Anugu ◽  
Gajula Ramesh

Machine translation has developed gradually since the 1940s. It has gained more and more attention because of its effective and efficient nature, as it performs translation automatically without human effort. The distinct models of machine translation, along with Neural Machine Translation (NMT), are summarized in this paper. Researchers have previously done a great deal of work on machine translation techniques and their evaluation. We therefore present an analysis of the existing techniques for machine translation, including Neural Machine Translation, their differences, and the translation tools associated with them. Nowadays, combining two machine translation systems takes full advantage of features from both systems, which makes the idea attractive in the domain of natural language processing. The paper therefore also includes a literature survey of Hybrid Machine Translation (HMT).

