Linguistic knowledge-based vocabularies for Neural Machine Translation

2020 ◽  
pp. 1-22
Author(s):  
Noe Casas ◽  
Marta R. Costa-jussà ◽  
José A. R. Fonollosa ◽  
Juan A. Alonso ◽  
Ramón Fanlo

Abstract Neural networks applied to machine translation need a finite vocabulary to express textual information as a sequence of discrete tokens. The currently dominant subword vocabularies exploit statistically discovered common parts of words to achieve the flexibility of character-based vocabularies without delegating the whole learning of word formation to the neural network. However, they trade this for the inability to apply word-level token associations, which limits their use in semantically rich areas, prevents some transfer learning approaches such as cross-lingual pretrained embeddings, and reduces their interpretability. In this work, we propose new hybrid linguistically grounded vocabulary definition strategies that keep both the advantages of subword vocabularies and the word-level associations, enabling neural networks to profit from the derived benefits. We test the proposed approaches on both morphologically rich and morphologically poor languages, showing that, for the former, the translation quality of out-of-domain texts improves with respect to a strong subword baseline.
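The statistically discovered subword units mentioned here are typically learned with byte-pair encoding (BPE); a minimal sketch of the merge-learning loop, run on an invented word-frequency dictionary:

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merges from {word: frequency}; words become tuples of symbols."""
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # count every adjacent symbol pair, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent pair becomes a merge
        merges.append(best)
        merged = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges, vocab

merges, vocab = learn_bpe({"lower": 5, "lowest": 2, "newer": 6, "wider": 3}, 4)
```

On this toy corpus the first learned merge joins "e" and "r", the most frequent adjacent pair; repeated merges build up the subword inventory the abstract describes.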

Author(s):  
Yirong Pan ◽  
Xiao Li ◽  
Yating Yang ◽  
Rui Dong

Incorporating source-side linguistic knowledge into the neural machine translation (NMT) model has recently achieved impressive performance on machine translation tasks. One popular method is to generalize the word embedding layer of the encoder to encode each word and its linguistic features. Another is to change the architecture of the encoder to encode syntactic information. However, the former cannot explicitly balance the contributions of a word and its linguistic features, while the latter cannot flexibly utilize various types of linguistic information. Focusing on these issues, this paper proposes a novel NMT approach that models the words in parallel to the linguistic knowledge by using two separate encoders. Compared with the single-encoder NMT model, the proposed approach additionally employs a knowledge-based encoder to specifically encode linguistic features. Moreover, it shares parameters across encoders to enhance the model's representation of the source-side language. Extensive experiments show that the approach achieves significant improvements of up to 2.4 and 1.1 BLEU points on Turkish→English and English→Turkish machine translation tasks, respectively, which indicates that it is capable of better utilizing external linguistic knowledge and effectively improving machine translation quality.
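As a rough, hypothetical illustration of the two-encoder idea (the tokens, feature tags, dimensions, and the simple shared projection below are all invented for the sketch and are not the paper's actual architecture):

```python
import random

random.seed(0)
DIM = 4
# separate embedding tables: one for (sub)words, one for linguistic features
word_emb = {w: [random.uniform(-1, 1) for _ in range(DIM)]
            for w in ["ev", "git", "##ti"]}
feat_emb = {f: [random.uniform(-1, 1) for _ in range(DIM)]
            for f in ["NOUN", "VERB", "PAST"]}
# a single projection shared by both encoders, mimicking parameter sharing
shared_proj = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]

def encode(seq, table):
    """A stand-in 'encoder': embed each token, apply the shared projection."""
    states = []
    for tok in seq:
        v = table[tok]
        states.append([sum(shared_proj[i][j] * v[j] for j in range(DIM))
                       for i in range(DIM)])
    return states

word_states = encode(["ev", "git", "##ti"], word_emb)      # word encoder
feat_states = encode(["NOUN", "VERB", "PAST"], feat_emb)   # knowledge-based encoder
# source representation: concatenate the two encoders' states per position
source = [w + f for w, f in zip(word_states, feat_states)]
```

The point of the sketch is only the division of labor: words and linguistic features are encoded separately but through tied parameters, then combined position by position.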


2020 ◽  
Vol 34 (05) ◽  
pp. 7756-7763 ◽  
Author(s):  
Zuohui Fu ◽  
Yikun Xian ◽  
Shijie Geng ◽  
Yingqiang Ge ◽  
Yuting Wang ◽  
...  

A number of cross-lingual transfer learning approaches based on neural networks have been proposed for the case when large amounts of parallel text are at our disposal. However, in many real-world settings, the size of parallel annotated training data is restricted. Additionally, prior cross-lingual mapping research has mainly focused on the word level. This raises the question of whether such techniques can also be applied to effortlessly obtain cross-lingually aligned sentence representations. To this end, we propose an Adversarial Bi-directional Sentence Embedding Mapping (ABSent) framework, which learns mappings of cross-lingual sentence representations from limited quantities of parallel data. The experiments show that our method outperforms several technically more powerful approaches, especially under challenging low-resource circumstances. The source code is available from https://github.com/zuohuif/ABSent along with relevant datasets.
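ABSent learns a mapping between source- and target-language sentence embeddings from limited parallel data. As a deliberately simplified stand-in (plain least-squares gradient descent on a 2-D linear map, omitting the paper's adversarial and bi-directional objectives; all vectors are toy values), the core idea can be sketched as:

```python
# Toy "parallel" sentence-embedding pairs (source vector, target vector).
# The true mapping here is a 90-degree rotation, which SGD can recover.
pairs = [([1.0, 0.0], [0.0, 1.0]),
         ([0.0, 1.0], [-1.0, 0.0]),
         ([1.0, 1.0], [-1.0, 1.0])]

W = [[0.0, 0.0], [0.0, 0.0]]  # the 2x2 mapping to learn

def apply_map(W, x):
    return [W[0][0] * x[0] + W[0][1] * x[1],
            W[1][0] * x[0] + W[1][1] * x[1]]

def loss(W):
    return sum((a - b) ** 2
               for x, y in pairs
               for a, b in zip(apply_map(W, x), y))

lr = 0.1
for _ in range(200):                 # SGD over the few parallel pairs
    for x, y in pairs:
        pred = apply_map(W, x)
        for i in range(2):
            err = pred[i] - y[i]
            for j in range(2):
                W[i][j] -= lr * 2.0 * err * x[j]
```

Even this supervised toy version shows why limited parallel data can suffice for a linear map; the adversarial component in ABSent is what removes the need for strong supervision.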


Author(s):  
Rashmini Naranpanawa ◽  
Ravinga Perera ◽  
Thilakshi Fonseka ◽  
Uthayasanker Thayasivam

Neural machine translation (NMT) is a remarkable approach which performs much better than statistical machine translation (SMT) models when there is an abundance of parallel corpora. However, vanilla NMT primarily operates at the word level with a fixed vocabulary. Therefore, low-resource morphologically rich languages such as Sinhala are strongly affected by the out-of-vocabulary (OOV) and rare-word problems. Recent advancements in subword techniques have opened up opportunities for low-resource communities by enabling open-vocabulary translation. In this paper, we extend our recently published state-of-the-art EN-SI translation system based on the transformer and explore standard subword techniques on top of it to identify which subword approach has the greater effect on the English–Sinhala language pair. Our models demonstrate that subword segmentation strategies combined with state-of-the-art NMT can perform remarkably well when translating English sentences into a morphologically rich language, even without a large parallel corpus.
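Subword segmentation resolves OOV words by falling back to smaller known units. A simple greedy longest-match segmenter over an invented subword vocabulary (not one of the specific techniques benchmarked in the paper) illustrates the mechanism:

```python
def segment(word, subwords):
    """Greedily split `word` into the longest known subwords, left to right,
    falling back to single characters so no word is ever out of vocabulary."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in subwords or j == i + 1:  # j == i + 1: character fallback
                out.append(piece)
                i = j
                break
    return out

vocab = {"trans", "late", "form", "er", "un", "ing"}
print(segment("transformer", vocab))  # → ['trans', 'form', 'er']
```

A word with no known subwords degrades gracefully to characters (e.g. "zing" → ['z', 'ing']), which is exactly the open-vocabulary property the abstract credits subword techniques with.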


Author(s):  
Shuo Ren ◽  
Zhirui Zhang ◽  
Shujie Liu ◽  
Ming Zhou ◽  
Shuai Ma

Without a real bilingual corpus available, unsupervised neural machine translation (NMT) typically requires pseudo-parallel data generated with the back-translation method for model training. However, due to weak supervision, the pseudo data inevitably contain noise and errors that are accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based statistical machine translation (SMT) models, which are robust to noisy data, as posterior regularizations to guide the training of unsupervised NMT models in the iterative back-translation process. Our method starts from SMT models built with pre-trained language models and word-level translation tables inferred from cross-lingual embeddings. Then the SMT and NMT models are optimized jointly and boost each other incrementally in a unified EM framework. In this way, (1) the negative effect caused by errors in the iterative back-translation process can be alleviated in a timely manner by the SMT filtering noise out of its phrase tables; meanwhile, (2) the NMT model can compensate for the deficiency in fluency inherent to SMT. Experiments conducted on en-fr and en-de translation tasks show that our method outperforms strong baselines and achieves new state-of-the-art unsupervised machine translation performance.
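The role the SMT phrase tables play as a noise filter can be illustrated with a much-simplified, word-level sketch (the pairs, threshold, and helper function below are invented for illustration):

```python
from collections import Counter, defaultdict

def build_filtered_table(pseudo_pairs, min_prob=0.3):
    """Estimate p(target | source) from noisy word-aligned pseudo-parallel
    pairs and drop low-probability entries, mimicking the noise filtering
    the phrase tables perform (simplified here to single words)."""
    counts = defaultdict(Counter)
    for src, tgt in pseudo_pairs:
        counts[src][tgt] += 1
    table = {}
    for src, tgts in counts.items():
        total = sum(tgts.values())
        table[src] = {t: c / total
                      for t, c in tgts.items() if c / total >= min_prob}
    return table

# Back-translation occasionally emits wrong pairs ("maison" -> "mouse");
# frequency-based filtering removes them. The data below are invented.
noisy = ([("maison", "house")] * 3 + [("maison", "mouse")]
         + [("chat", "cat")] * 2)
table = build_filtered_table(noisy)
```

Because an occasional back-translation error stays rare relative to correct translations, probability thresholding keeps "house" (p = 0.75) and discards "mouse" (p = 0.25), preventing the error from being reinforced in the next training round.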


2017 ◽  
Vol 108 (1) ◽  
pp. 13-25 ◽  
Author(s):  
Parnia Bahar ◽  
Tamer Alkhouli ◽  
Jan-Thorsten Peter ◽  
Christopher Jan-Steffen Brix ◽  
Hermann Ney

Abstract Training neural networks is a non-convex, high-dimensional optimization problem. In this paper, we provide a comparative study of the most popular stochastic optimization techniques used to train neural networks. We evaluate the methods in terms of convergence speed, translation quality, and training stability. In addition, we investigate combinations that seek to improve optimization in terms of these aspects. We train state-of-the-art attention-based models and apply them to perform neural machine translation. We demonstrate our results on two tasks: WMT 2016 En→Ro and WMT 2015 De→En.
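Such comparative studies typically cover plain SGD and adaptive methods such as Adam; their update rules can be sketched on a toy one-dimensional objective f(x) = (x − 3)² (the objective, step counts, and learning rates are illustrative):

```python
import math

def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

def sgd(x, lr=0.1, steps=200):
    """Plain stochastic gradient descent: step against the raw gradient."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def adam(x, lr=0.1, steps=1000, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected running moments rescale each step."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

On this quadratic, SGD contracts the error geometrically, while Adam takes near-constant steps early on regardless of gradient magnitude; it is exactly such differences in convergence speed and stability that the paper measures on real NMT training.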


2019 ◽  
Vol 35 (2) ◽  
pp. 147-166 ◽  
Author(s):  
Hong-Hai Phan-Vu ◽  
Viet Trung Tran ◽  
Van Nam Nguyen ◽  
Hoang Vu Dang ◽  
Phan Thuan Do

Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English–French or English–Chinese. However, for English–Vietnamese, the shortage of parallel corpora and expensive hyper-parameter search present practical challenges to neural approaches. This paper highlights our efforts to improve English–Vietnamese translation in two directions: (1) building the largest open Vietnamese–English corpus to date, and (2) extensive experiments with the latest neural models to achieve the highest BLEU scores. Our experiments provide practical examples of effectively employing different neural machine translation models with low-resource language pairs.
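The BLEU score used to compare these systems is worth pinning down. A minimal single-reference, sentence-level implementation (without the smoothing and tokenization details real evaluations such as the WMT setups use) looks like:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: brevity penalty times the geometric mean of
    modified n-gram precisions (no smoothing, single reference)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipped counts
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0.0:
        return 0.0          # any empty n-gram match zeroes unsmoothed BLEU
    bp = (1.0 if len(candidate) > len(reference)
          else math.exp(1.0 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

hyp = "the cat sat on the mat".split()
print(round(bleu(hyp, hyp), 2))  # → 1.0
```

The brevity penalty is what keeps very short candidates from gaming the precision terms, which matters when comparing systems of very different fluency on a low-resource pair.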


1996 ◽  
Vol 118 (2) ◽  
pp. 237-246 ◽  
Author(s):  
S. Yoshimura ◽  
A. S. Jovanovic

This paper describes analyses of case studies on failure of structural components in power plants using hierarchical (multilayer) neural networks. Using selected test data about case studies stored in the structural failure database of a knowledge-based system, the network is trained: either to predict possible failure mechanisms such as creep, overheating (OH), or overstressing (OS)-induced failure (network of Type A), or to classify the root failure cause of each case study as either a primary or a secondary cause (network of Type B). In the present study, the primary root cause is defined as “manufacturing-, material- or design-induced causes,” while the secondary one is defined as “causes not induced by manufacturing, material or design, e.g., failures due to operation or mal-operation.” An ordinary three-layer neural network employing the back-propagation algorithm with the momentum method is utilized in this study. The results clearly show that the neural network is a powerful tool for analyzing case studies of failure in structural components. For example, the trained network of Type A predicts creep-induced failure in unknown case studies with an accuracy of 86 percent, while the network of Type B classifies root failure causes of unknown case studies with an accuracy of 88 percent. It should be noted that, due to a shortage of available case studies, an appropriate selection of case studies and input parameters to be used for network training was necessary in order to attain high accuracy. Collecting more case studies should, however, resolve this problem and improve the accuracy of the analyses. An analysis module for case studies using the neural network has also been developed and successfully implemented in a knowledge-based system.
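The "ordinary three-layer neural network employing the back-propagation algorithm with the momentum method" can be sketched in miniature. Everything below (the toy two-feature data standing in for the failure-case inputs, the layer sizes, rates, and seed) is illustrative, not the paper's actual configuration:

```python
import math
import random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy binary task standing in for primary/secondary failure-cause labels
data = [([0.0, 0.0], 0.0), ([1.0, 1.0], 1.0),
        ([0.2, 0.3], 0.0), ([0.9, 0.8], 1.0)]

H = 3                                                    # hidden units
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
v_w1 = [[0.0, 0.0] for _ in range(H)]                    # momentum buffers
v_b1 = [0.0] * H
v_w2 = [0.0] * H
v_b2 = 0.0
lr, mom = 0.3, 0.8                                       # rate and momentum

def forward(x):
    h = [sigmoid(sum(w1[i][j] * x[j] for j in range(2)) + b1[i])
         for i in range(H)]
    o = sigmoid(sum(w2[i] * h[i] for i in range(H)) + b2)
    return h, o

def total_loss():
    return sum((forward(x)[1] - y) ** 2 for x, y in data)

loss_before = total_loss()
for _ in range(500):
    for x, y in data:
        h, o = forward(x)
        d_o = (o - y) * o * (1 - o)                      # output error signal
        for i in range(H):
            d_h = d_o * w2[i] * h[i] * (1 - h[i])        # hidden error signal
            v_w2[i] = mom * v_w2[i] - lr * d_o * h[i]    # momentum update
            w2[i] += v_w2[i]
            for j in range(2):
                v_w1[i][j] = mom * v_w1[i][j] - lr * d_h * x[j]
                w1[i][j] += v_w1[i][j]
            v_b1[i] = mom * v_b1[i] - lr * d_h
            b1[i] += v_b1[i]
        v_b2 = mom * v_b2 - lr * d_o
        b2 += v_b2
```

The momentum term blends each gradient step with the previous step's direction, which is what the "momentum method" adds to plain back-propagation: faster, more stable descent on noisy, small datasets like a failure-case database.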


2013 ◽  
Vol 416-417 ◽  
pp. 1552-1557
Author(s):  
Xiao Xu Hu

Hypothesis combination is a main method for improving the performance of machine translation (MT) systems. State-of-the-art strategies include sentence-level and word-level methods, each of which has its own advantages and disadvantages. Moreover, current strategies mainly depend on statistical methods, with little guidance from rich linguistic knowledge. This paper proposes a hybrid framework to combine the strengths of the sentence-level and word-level methods. In the word-level stage, the method selects well-translated words according to each word's part-of-speech and the translation ability, for that part-of-speech, of the MT system which generated the word. Experimental results with different MT systems prove the effectiveness of this approach.
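A hypothetical sketch of the word-level selection stage: for each position, choose the word from the system whose estimated translation ability for that part-of-speech is highest. The systems, tags, and confidence scores below are invented for illustration:

```python
def combine(hypotheses, pos_tags, pos_confidence):
    """hypotheses: {system: [words]} with aligned, equal-length outputs;
    pos_tags: one tag per position; pos_confidence: {system: {tag: score}}.
    Pick each word from the system most reliable for that word's POS."""
    combined = []
    for i, tag in enumerate(pos_tags):
        best = max(hypotheses, key=lambda s: pos_confidence[s].get(tag, 0.0))
        combined.append(hypotheses[best][i])
    return combined

hyps = {"sysA": ["he", "eats", "apple"],
        "sysB": ["he", "eat", "apples"]}
tags = ["PRON", "VERB", "NOUN"]
conf = {"sysA": {"PRON": 0.9, "VERB": 0.8, "NOUN": 0.4},
        "sysB": {"PRON": 0.7, "VERB": 0.5, "NOUN": 0.9}}
print(combine(hyps, tags, conf))  # → ['he', 'eats', 'apples']
```

The sketch assumes the hypotheses are already word-aligned; a real system would need alignment first, and a sentence-level stage would then rescore the combined output.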

