Cross-lingual Supervision Improves Unsupervised Neural Machine Translation

Without real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for the model training. However, due to weak supervision, the pseudo data inevitably contain noises and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase based Statistic Machine Translation (SMT) models which are robust to noisy data, as posterior regularizations to guide the training of unsupervised NMT models in the iterative back-translation process. Our method starts from SMT models built with pre-trained language models and word-level translation tables inferred from cross-lingual embeddings. Then SMT and NMT models are optimized jointly and boost each other incrementally in a unified EM framework. In this way, (1) the negative effect caused by errors in the iterative back-translation process can be alleviated timely by SMT filtering noises from its phrase tables; meanwhile, (2) NMT can compensate for the deficiency of fluency inherent in SMT. Experiments conducted on en-fr and en-de translation tasks show that our method outperforms the strong baseline and achieves new state-of-the-art unsupervised machine translation performance.

Download Full-text

Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5341 ◽

2020 ◽

Vol 34 (01) ◽

pp. 115-122 ◽

Cited By ~ 3

Author(s):

Baijun Ji ◽

Zhirui Zhang ◽

Xiangyu Duan ◽

Min Zhang ◽

Boxing Chen ◽

...

Keyword(s):

Machine Translation ◽

Transfer Learning ◽

Large Scale ◽

Feature Space ◽

Target Language ◽

Smooth Transition ◽

Training Methods ◽

Neural Machine Translation ◽

Cross Lingual ◽

Effective Transfer

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages share the same feature space and thus enable a smooth transition for zero-shot translation. To this end, we introduce one monolingual pre-training method and two bilingual pre-training methods to obtain a universal encoder for different languages. Once the universal encoder is constructed, the parent model built on such encoder is trained with large-scale annotated data and then directly applied in zero-shot translation scenario. Experiments on two public datasets show that our approach significantly outperforms strong pivot-based baseline and various multilingual NMT approaches.

Download Full-text

Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

10.18653/v1/2021.emnlp-main.2 ◽

2021 ◽

Author(s):

Guanhua Chen ◽

Shuming Ma ◽

Yun Chen ◽

Li Dong ◽

Dongdong Zhang ◽

...

Keyword(s):

Machine Translation ◽

Neural Machine Translation ◽

Cross Lingual

Download Full-text

Linguistic knowledge-based vocabularies for Neural Machine Translation

Natural Language Engineering ◽

10.1017/s1351324920000364 ◽

2020 ◽

pp. 1-22

Author(s):

Noe Casas ◽

Marta R. Costa-jussà ◽

José A. R. Fonollosa ◽

Juan A. Alonso ◽

Ramón Fanlo

Keyword(s):

Neural Networks ◽

Machine Translation ◽

Word Formation ◽

Linguistic Knowledge ◽

Learning Approaches ◽

Neural Machine Translation ◽

Knowledge Based ◽

Word Level ◽

The Neural Network ◽

Cross Lingual

Abstract Neural Networks applied to Machine Translation need a finite vocabulary to express textual information as a sequence of discrete tokens. The currently dominant subword vocabularies exploit statistically-discovered common parts of words to achieve the flexibility of character-based vocabularies without delegating the whole learning of word formation to the neural network. However, they trade this for the inability to apply word-level token associations, which limits their use in semantically-rich areas and prevents some transfer learning approaches e.g. cross-lingual pretrained embeddings, and reduces their interpretability. In this work, we propose new hybrid linguistically-grounded vocabulary definition strategies that keep both the advantages of subword vocabularies and the word-level associations, enabling neural networks to profit from the derived benefits. We test the proposed approaches in both morphologically rich and poor languages, showing that, for the former, the quality in the translation of out-of-domain texts is improved with respect to a strong subword baseline.

Download Full-text

Cross-lingual text similarity exploiting neural machine translation models

Journal of Information Science ◽

10.1177/0165551520912676 ◽

2020 ◽

pp. 016555152091267 ◽

Cited By ~ 1

Author(s):

Kazuhiro Seki

Keyword(s):

Machine Translation ◽

Learning To Rank ◽

Translation System ◽

Text Similarity ◽

Retrieval Task ◽

Neural Machine Translation ◽

Intermediate States ◽

Machine Translation System ◽

Cross Lingual ◽

Types Of Information

This article studies cross-lingual text similarity using neural machine translation models. A straightforward approach based on machine translation is to use translated text so as to make the problem monolingual. Another possible approach is to use intermediate states of machine translation models as recently proposed in the related work, which could avoid propagation of translation errors. We aim at improving both approaches independently and then combine the two types of information, that is, translations and intermediate states, in a learning-to-rank framework to compute cross-lingual text similarity. To evaluate the effectiveness and generalisability of our approach, we conduct empirical experiments on English–Japanese and English–Hindi translation corpora for a cross-lingual sentence retrieval task. It is demonstrated that our approach using translations and intermediate states outperforms other neural network–based approaches and is even comparable with a strong baseline based on a state-of-the-art machine translation system.

Download Full-text

Machine Translation in Low-Resource Languages by an Adversarial Neural Network

Applied Sciences ◽

10.3390/app112210860 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10860

Author(s):

Mengtao Sun ◽

Hao Wang ◽

Mark Pasquine ◽

Ibrahim A. Hameed

Keyword(s):

Machine Translation ◽

Transfer Learning ◽

Grammatical Structure ◽

Neural Machine Translation ◽

Low Resource ◽

High Resource ◽

Learning Techniques ◽

Good Potential ◽

Target Languages ◽

Cross Lingual

Existing Sequence-to-Sequence (Seq2Seq) Neural Machine Translation (NMT) shows strong capability with High-Resource Languages (HRLs). However, this approach poses serious challenges when processing Low-Resource Languages (LRLs), because the model expression is limited by the training scale of parallel sentence pairs. This study utilizes adversary and transfer learning techniques to mitigate the lack of sentence pairs in LRL corpora. We propose a new Low resource, Adversarial, Cross-lingual (LAC) model for NMT. In terms of the adversary technique, LAC model consists of a generator and discriminator. The generator is a Seq2Seq model that produces the translations from source to target languages, while the discriminator measures the gap between machine and human translations. In addition, we introduce transfer learning on LAC model to help capture the features in rare resources because some languages share the same subject-verb-object grammatical structure. Rather than using the entire pretrained LAC model, we separately utilize the pretrained generator and discriminator. The pretrained discriminator exhibited better performance in all experiments. Experimental results demonstrate that the LAC model achieves higher Bilingual Evaluation Understudy (BLEU) scores and has good potential to augment LRL translations.

Download Full-text

Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6414 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8854-8861 ◽

Cited By ~ 1

Author(s):

Aditya Siddhant ◽

Melvin Johnson ◽

Henry Tsai ◽

Naveen Ari ◽

Jason Riesa ◽

...

Keyword(s):

Machine Translation ◽

Transfer Learning ◽

Single Model ◽

Neural Machine Translation ◽

Low Resource ◽

Sequence Labeling ◽

Transfer Capability ◽

Learning Scenarios ◽

The Cross ◽

Cross Lingual

The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating over 100 languages to and from English within a single model (Aharoni, Johnson, and Firat 2019). Its improved translation performance on low resource languages hints at potential cross-lingual transfer capability for downstream tasks. In this paper, we evaluate the cross-lingual effectiveness of representations from the encoder of a massively multilingual NMT model on 5 downstream classification and sequence labeling tasks covering a diverse set of over 50 languages. We compare against a strong baseline, multilingual BERT (mBERT) (Devlin et al. 2018), in different cross-lingual transfer learning scenarios and show gains in zero-shot transfer in 4 out of these 5 tasks.

Download Full-text