Speed Up the Training of Neural Machine Translation

2019 ◽  
Vol 51 (1) ◽  
pp. 231-249 ◽  
Author(s):  
Xinyue Liu ◽  
Weixuan Wang ◽  
Wenxin Liang ◽  
Yuangang Li

2020 ◽  
Vol 34 (05) ◽  
pp. 7839-7846
Author(s):  
Junliang Guo ◽  
Xu Tan ◽  
Linli Xu ◽  
Tao Qin ◽  
Enhong Chen ◽  
...  

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and that the two share the same model configurations, a natural idea for improving the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves a good improvement (more than 1 BLEU point) over previous NAT baselines in terms of translation accuracy, and greatly speeds up (more than 10 times) the inference process over AT baselines.
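The curriculum described in the abstract can be illustrated with a short sketch. The following Python/PyTorch snippet is an assumption-laden illustration, not the authors' implementation: it anneals a ratio from 0 to 1 over fine-tuning and, at each decoder-input position, replaces the AT-style previous-target token with a mask token with that probability, so training moves gradually from autoregressive to non-autoregressive generation. The names MASK_ID, curriculum_ratio, and make_decoder_input, and the linear pacing schedule, are hypothetical.

import torch

MASK_ID = 3  # assumed id of the [MASK]/placeholder token fed to the NAT decoder


def curriculum_ratio(step: int, total_steps: int) -> float:
    # Fraction of decoder positions trained non-autoregressively.
    # Starts at 0 (pure AT-style fine-tuning) and linearly reaches 1
    # (pure NAT-style training) by the end of the schedule.
    return min(1.0, step / max(1, total_steps))


def make_decoder_input(prev_tokens: torch.Tensor, ratio: float) -> torch.Tensor:
    # Mix AT-style and NAT-style decoder inputs.
    # prev_tokens: (batch, seq_len) shifted target tokens used by the AT model.
    # With probability `ratio`, each position is replaced by MASK_ID, so the
    # model gradually learns to predict targets without previous-token context.
    replace = torch.rand_like(prev_tokens, dtype=torch.float) < ratio
    return torch.where(replace, torch.full_like(prev_tokens, MASK_ID), prev_tokens)


# Schematic use inside a fine-tuning loop (model, loader, optimizer assumed):
# for step, (src, prev_tokens, tgt) in enumerate(loader):
#     ratio = curriculum_ratio(step, total_steps=100_000)
#     dec_in = make_decoder_input(prev_tokens, ratio)
#     loss = model(src, dec_in, tgt)  # cross-entropy over all target positions
#     loss.backward(); optimizer.step(); optimizer.zero_grad()

At ratio 0 the loop reduces to ordinary AT fine-tuning; at ratio 1 every decoder input is a mask token, which matches the fully parallel NAT setting, and intermediate ratios provide the gradual switch the abstract describes.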


2019 ◽  
Vol 28 (4) ◽  
pp. 1-29 ◽  
Author(s):  
Michele Tufano ◽  
Cody Watson ◽  
Gabriele Bavota ◽  
Massimiliano Di Penta ◽  
Martin White ◽  
...  

Procedia CIRP ◽  
2021 ◽  
Vol 96 ◽  
pp. 9-14
Author(s):  
Uwe Dombrowski ◽  
Alexander Reiswich ◽  
Raphael Lamprecht

2020 ◽  
Vol 34 (4) ◽  
pp. 325-346
Author(s):  
John E. Ortega ◽  
Richard Castro Mamani ◽  
Kyunghyun Cho
