Neural machine translation and the evolution of the localisation sector

Abstract The localisation sector is highly technologized and evolves rapidly. Though significant consideration has been given to third-level training in localisation for Translation Studies students, the nature of the industry is such that this topic demands regular attention. Our objective was to survey employees and executive managers to understand what impact recent technological developments, including but not limited to neural machine translation (NMT), might have on future skills and training requirements for localisation linguists. Our findings are that linguists in localisation take up a variety of roles, including transcreation, data mining, and project and vendor management. NMT is considered an important advancement, and its introduction has emphasised the need for a critical use of technology, while opening new career pathways, such as data curation and annotation. Domain specialisation is recommended for those who translate, and transferable soft skills are more essential than ever. Increased industry and interdisciplinary collaborations in training are also considered valuable.

Download Full-text

Empirical Investigation of Optimization Algorithms in Neural Machine Translation

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0005 ◽

2017 ◽

Vol 108 (1) ◽

pp. 13-25 ◽

Cited By ~ 2

Author(s):

Parnia Bahar ◽

Tamer Alkhouli ◽

Jan-Thorsten Peter ◽

Christopher Jan-Steffen Brix ◽

Hermann Ney

Keyword(s):

Neural Networks ◽

Machine Translation ◽

Optimization Problem ◽

Empirical Investigation ◽

State Of The Art ◽

Optimization Techniques ◽

Neural Machine Translation ◽

Translation Quality ◽

And Training ◽

Dimensional Optimization

AbstractTraining neural networks is a non-convex and a high-dimensional optimization problem. In this paper, we provide a comparative study of the most popular stochastic optimization techniques used to train neural networks. We evaluate the methods in terms of convergence speed, translation quality, and training stability. In addition, we investigate combinations that seek to improve optimization in terms of these aspects. We train state-of-the-art attention-based models and apply them to perform neural machine translation. We demonstrate our results on two tasks: WMT 2016 En→Ro and WMT 2015 De→En.

Download Full-text

Sequence-Level Training for Non-Autoregressive Neural Machine Translation

Computational Linguistics ◽

10.1162/coli_a_00421 ◽

2021 ◽

pp. 1-36

Author(s):

Chenze Shao ◽

Yang Feng ◽

Jinchao Zhang ◽

Fandong Meng ◽

Jie Zhou

Keyword(s):

Machine Translation ◽

Cross Entropy ◽

Gradient Estimation ◽

Neural Machine Translation ◽

Training Strategy ◽

Word Generation ◽

Translation Quality ◽

Word Level ◽

Training Objective ◽

Level Training

Abstract In recent years, Neural Machine Translation (NMT) has achieved notable results in various translation tasks. However, the word-by-word generation manner determined by the autoregressive mechanism leads to high translation latency of the NMT and restricts its low-latency applications. Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves significant decoding speedup through generating target words independently and simultaneously. Nevertheless, NAT still takes the word-level cross-entropy loss as the training objective, which is not optimal because the output of NAT cannot be properly evaluated due to the multimodality problem. In this article, we propose using sequence-level training objectives to train NAT models, which evaluate the NAT outputs as a whole and correlates well with the real translation quality. Firstly, we propose training NAT models to optimize sequence-level evaluation metrics (e.g., BLEU) based on several novel reinforcement algorithms customized for NAT, which outperforms the conventional method by reducing the variance of gradient estimation. Secondly, we introduce a novel training objective for NAT models, which aims to minimize the Bag-of-Ngrams (BoN) difference between the model output and the reference sentence. The BoN training objective is differentiable and can be calculated efficiently without doing any approximations. Finally, we apply a three-stage training strategy to combine these two methods to train the NAT model.We validate our approach on four translation tasks (WMT14 En↔De, WMT16 En↔Ro), which shows that our approach largely outperforms NAT baselines and achieves remarkable performance on all translation tasks. The source code is available at https://github.com/ictnlp/Seq-NAT.

Download Full-text

Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation

10.18653/v1/n18-2079 ◽

2018 ◽

Author(s):

Fahim Dalvi ◽

Nadir Durrani ◽

Hassan Sajjad ◽

Stephan Vogel

Keyword(s):

Machine Translation ◽

Training Methods ◽

Neural Machine Translation ◽

And Training ◽

Simultaneous Translation

Download Full-text

Neural Machine Translation with Adequacy-Oriented Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016618 ◽

2019 ◽

Vol 33 ◽

pp. 6618-6625 ◽

Cited By ~ 1

Author(s):

Xiang Kong ◽

Zhaopeng Tu ◽

Shuming Shi ◽

Eduard Hovy ◽

Tong Zhang

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Likelihood Estimation ◽

Neural Machine Translation ◽

Learning Mechanism ◽

Translation Quality ◽

Word Level ◽

Adversarial Training ◽

Level Training ◽

Qualitative Analyses

Although Neural Machine Translation (NMT) models have advanced state-of-the-art performance in machine translation, they face problems like the inadequate translation. We attribute this to that the standard Maximum Likelihood Estimation (MLE) cannot judge the real translation quality due to its several limitations. In this work, we propose an adequacyoriented learning mechanism for NMT by casting translation as a stochastic policy in Reinforcement Learning (RL), where the reward is estimated by explicitly measuring translation adequacy. Benefiting from the sequence-level training of RL strategy and a more accurate reward designed specifically for translation, our model outperforms multiple strong baselines, including (1) standard and coverage-augmented attention models with MLE-based training, and (2) advanced reinforcement and adversarial training strategies with rewards based on both word-level BLEU and character-level CHRF3. Quantitative and qualitative analyses on different language pairs and NMT architectures demonstrate the effectiveness and universality of the proposed approach.

Download Full-text

Revisiting Negation in Neural Machine Translation

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00395 ◽

2021 ◽

Vol 9 ◽

pp. 740-755

Author(s):

Gongbo Tang ◽

Philipp Rönchen ◽

Rico Sennrich ◽

Joakim Nivre

Keyword(s):

Machine Translation ◽

Information Flow ◽

Statistical Machine Translation ◽

Flow Analysis ◽

Training Data ◽

Error Type ◽

Neural Machine Translation ◽

Information Flow Analysis ◽

Hidden States ◽

And Training

In this paper, we evaluate the translation of negation both automatically and manually, in English–German (EN–DE) and English– Chinese (EN–ZH). We show that the ability of neural machine translation (NMT) models to translate negation has improved with deeper and more advanced networks, although the performance varies between language pairs and translation directions. The accuracy of manual evaluation in EN→DE, DE→EN, EN→ZH, and ZH→EN is 95.7%, 94.8%, 93.4%, and 91.7%, respectively. In addition, we show that under-translation is the most significant error type in NMT, which contrasts with the more diverse error profile previously observed for statistical machine translation. To better understand the root of the under-translation of negation, we study the model’s information flow and training data. While our information flow analysis does not reveal any deficiencies that could be used to detect or fix the under-translation of negation, we find that negation is often rephrased during training, which could make it more difficult for the model to learn a reliable link between source and target negation. We finally conduct intrinsic analysis and extrinsic probing tasks on negation, showing that NMT models can distinguish negation and non-negation tokens very well and encode a lot of information about negation in hidden states but nevertheless leave room for improvement.

Download Full-text