High-Performance English–Chinese Machine Translation Based on GPU-Enabled Deep Neural Networks with Domain Corpus

2021 ◽  
Vol 11 (22) ◽  
pp. 10915
Author(s):  
Lanxin Zhao ◽  
Wanrong Gao ◽  
Jianbin Fang

Automated machine translation has applications in international commerce, medicine, travel, education, and text digitization. Because Chinese grammar differs from that of word-based languages (e.g., English) and Chinese text lacks clear word boundaries, translating from such languages into Chinese is challenging. This article implements a GPU-enabled deep learning machine translation system based on a domain-specific corpus. Our system takes English text as input and uses an encoder-decoder model with an attention mechanism, based on Google’s Transformer, to translate it into Chinese. The model was trained with a simple self-designed entropy loss function and the Adam optimizer on English–Chinese bilingual sentences from the News domain of the UM-Corpus. The parallel training process can run on common laptops, desktops, and servers with one or more GPUs. During training, we track not only the loss over epochs but also the quality of the model’s translations, measured by BLEU score. We also provide an easy-to-use web interface for managing corpora, training projects, and trained models. The experimental results show a maximum BLEU score of 29.2, which could be improved further by tuning other hyperparameters. GPU-enabled training runs over 15x faster than training on a multi-core CPU, which shortens turnaround time. As a case study, we compare the performance of our model with that of Baidu’s, which shows that our model can compete with an industry-level translation system. We argue that our deep-learning-based translation system is particularly suitable for teaching purposes and for small and medium-sized enterprises.
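
To make the BLEU figure above concrete, the following is a minimal sketch of a sentence-level BLEU computation. It is not the authors' evaluation code: the function names (`ngrams`, `sentence_bleu`), the single-reference setup, the uniform 4-gram weights, and the absence of smoothing are all assumptions made for illustration. It returns a value in [0, 1]; scores such as 29.2 are conventionally reported on a 0 to 100 scale and are usually computed over a whole test corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU: uniform weights, no smoothing (assumed setup)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical example sentences, tokenized on whitespace.
reference = "the cat sat on the mat".split()
candidate = "the cat sat on the".split()
print(round(100 * sentence_bleu(candidate, reference), 1))
```

For Chinese output the candidate and reference would first need to be segmented into words or characters, since whitespace splitting does not apply; that choice affects the score and is left open here.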

2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Syed Abdul Basit Andrabi ◽  
Abdul Wahid

Machine translation has been an active field of research for decades. Its main aim is to remove the language barrier. Early research in this field started with direct word-to-word replacement of the source language by the target language. Later, with advances in computer and communication technology, there was a paradigm shift to data-driven models such as statistical and neural machine translation. In this paper, we use a neural-network-based deep learning technique for English-to-Urdu translation. A parallel corpus of around 30,923 sentences is used. It contains sentences from an English-Urdu parallel corpus, news, and sentences frequently used in day-to-day life. The corpus contains 542,810 English tokens and 540,924 Urdu tokens, and the proposed system is trained and tested using a 70:30 split. To evaluate the proposed system, several automatic evaluation metrics are used, and the model output is also compared with the output from Google Translator. The proposed model has an average BLEU score of 45.83.
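
The 70:30 split mentioned above can be sketched as follows. The abstract does not say how the split was made, so the shuffling, the fixed seed, and the helper name `split_parallel_corpus` are assumptions for illustration, and the toy sentences are placeholders.

```python
import random


def split_parallel_corpus(pairs, train_fraction=0.7, seed=0):
    """Shuffle (source, target) pairs and split them into train and test lists."""
    shuffled = list(pairs)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the parallel corpus; the strings are placeholders.
toy_pairs = [(f"english sentence {i}", f"urdu sentence {i}") for i in range(10)]
train, test = split_parallel_corpus(toy_pairs)
print(len(train), len(test))
```

Splitting whole sentence pairs keeps each English sentence aligned with its Urdu counterpart in both subsets.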


2021 ◽  
Vol 2030 (1) ◽  
pp. 012098
Author(s):  
Ting Yang ◽  
Shinan Zhao ◽  
He Chen ◽  
Bo Chen

2019 ◽  
Vol 2019 ◽  
pp. 1-7
Author(s):  
ShaoLin Zhu ◽  
Xiao Li ◽  
YaTing Yang ◽  
Lei Wang ◽  
ChengGang Mi

Machine translation needs a large number of parallel sentence pairs to achieve good translation performance. However, the lack of parallel corpora heavily limits machine translation for low-resource language pairs. We propose a novel method that combines continuous word embeddings with deep learning to obtain parallel sentences. Since parallel sentences are invaluable for low-resource language pairs, we introduce a cross-lingual semantic representation to induce bilingual signals. Our experiments show that we can achieve promising results for low-resource languages without external resources. Finally, we construct a state-of-the-art machine translation system for a low-resource language pair.
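
As a rough illustration of how word embeddings can surface candidate parallel sentences, the sketch below averages word vectors from a shared cross-lingual space and keeps the most similar target sentence when the cosine similarity clears a threshold. This is a deliberately simpler stand-in, not the deep learning method the abstract describes; the function names, the threshold of 0.8, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the embeddings of the known tokens; None if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_pairs(src_sents, tgt_sents, embeddings, threshold=0.8):
    """For each source sentence, keep the most similar target if above threshold."""
    tgt_vecs = [(t, sentence_vector(t.split(), embeddings)) for t in tgt_sents]
    pairs = []
    for s in src_sents:
        s_vec = sentence_vector(s.split(), embeddings)
        if s_vec is None:
            continue
        scored = [(cosine(s_vec, v), t) for t, v in tgt_vecs if v is not None]
        if not scored:
            continue
        score, best = max(scored)
        if score >= threshold:
            pairs.append((s, best, score))
    return pairs


# Hypothetical 2-d vectors in a shared cross-lingual space.
toy_embeddings = {
    "dog": [1.0, 0.0], "perro": [0.9, 0.1],
    "cat": [0.0, 1.0], "gato": [0.1, 0.9],
}
mined = mine_pairs(["dog"], ["gato", "perro"], toy_embeddings)
print(mined)
```

In practice the embeddings would have to be aligned across the two languages beforehand, which is the part a cross-lingual semantic representation is meant to supply.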


2021 ◽  
Vol 3 (2) ◽  
pp. 34
Author(s):  
Zeshan Ali Ali

Urdu is Pakistan's national language. However, expertise in Chinese is negligible in Pakistan and other Asian nations, and little research has been undertaken on Chinese-to-Urdu machine translation. To address these problems, we designed a Chinese-Urdu electronic dictionary and studied sentence-level machine translation technology based on deep learning. For the electronic dictionary, we collected and constructed 24,000 entries from Chinese to Urdu. For sentences, we used English as an intermediate language: based on existing Chinese-English and English-Urdu parallel corpora, we constructed a bilingual parallel corpus containing 66,000 sentences from Chinese to Urdu. The corpus was used to train two NMT models (LSTM and Transformer), and the two translation models were compared against the desired translations using the bilingual evaluation understudy (BLEU) score. The LSTM model shows a gain of 0.067 to 0.41 in BLEU score, while the Transformer model shows a gain of 0.077 to 0.52, which is better than the LSTM model. Furthermore, we compared the proposed model with Google and Microsoft translation.
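
One way to build a Chinese-Urdu corpus through English as an intermediate language is to join the two existing corpora on their English side. The sketch below assumes an exact match on normalized English sentences; the abstract does not say how the authors aligned the corpora, so this is only one plausible reading. The names `normalize` and `build_pivot_corpus` and the placeholder strings `zh_1`, `ur_1`, and so on are hypothetical.

```python
def normalize(sentence):
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(sentence.lower().split())


def build_pivot_corpus(zh_en_pairs, en_ur_pairs):
    """Join (zh, en) and (en, ur) pairs on the English side to get (zh, ur) pairs."""
    en_to_ur = {}
    for en, ur in en_ur_pairs:
        # Keep the first Urdu translation seen for each English sentence.
        en_to_ur.setdefault(normalize(en), ur)
    zh_ur_pairs = []
    for zh, en in zh_en_pairs:
        ur = en_to_ur.get(normalize(en))
        if ur is not None:
            zh_ur_pairs.append((zh, ur))
    return zh_ur_pairs


# Placeholder sentences standing in for real Chinese and Urdu text.
zh_en = [("zh_1", "Hello world"), ("zh_2", "Good morning")]
en_ur = [("hello  world", "ur_1"), ("Thank you", "ur_3")]
pivot = build_pivot_corpus(zh_en, en_ur)
print(pivot)
```

Only sentences whose English side appears in both corpora survive the join, so the resulting corpus can be much smaller than either input.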


2016 ◽  
Vol 1 (1) ◽  
pp. 45-49
Author(s):  
Avinash Singh ◽  
Asmeet Kour ◽  
Shubhnandan S. Jamwal

The objective of this paper is to analyze translation using an English-Dogri parallel corpus. Machine translation is the translation from one language into another. It is the biggest application of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% when translating English to Dogri and 87% when translating Dogri to English.

