Acoustic Model Training, using Kaldi, for Automatic Whispery Speech Recognition

Author(s):  
Piotr Kozierski ◽  
Talar Sadalla ◽  
Szymon Drgas ◽  
Adam Dąbrowski ◽  
Joanna Ziętkiewicz ◽  
...  
Author(s):  
Dinkar Sitaram ◽  
Haripriya Srinivasaraghavan ◽  
Kapish Agarwal ◽  
Amritanshu Agrawal ◽  
Neha Joshi ◽  
...  

2020 ◽  
Vol 10 (10) ◽  
pp. 3542 ◽  
Author(s):  
Hoon Chung ◽  
Sung Joo Lee ◽  
Hyeong Bae Jeon ◽  
Jeon Gue Park

In this paper, we propose a policy gradient-based method for semi-supervised training of speech recognition acoustic models. In practice, self-training and teacher/student learning are among the most widely used semi-supervised training methods because of their scalability and effectiveness. These methods generate pseudo labels for unlabeled samples using a pre-trained model and then select reliable samples using a confidence measure. However, this approach has two drawbacks. The generated pseudo labels can be biased depending on which pre-trained model is used, and the training process can become complicated because the confidence measure is usually computed in a post-processing step that relies on external knowledge. To address these issues, we propose an approach based on the policy gradient method. Policy gradient is a reinforcement learning algorithm that searches for a behavior strategy (policy) with which an agent maximizes its expected reward. The policy gradient-based approach provides a framework for exploring unlabeled data while exploiting labeled data, and it also provides a way to incorporate external knowledge within the same training cycle. The proposed approach was evaluated on an in-house non-native Korean recognition domain. The experimental results show that the method is effective for semi-supervised acoustic model training.
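As a rough illustration of the policy gradient idea mentioned in the abstract, the sketch below runs a REINFORCE-style update on a toy problem. It is not the authors' training procedure: the softmax policy over a handful of labels, the 0/1 reward function, the moving-average baseline, and all names and hyperparameters (`reinforce_step`, `train`, `target`, `lr`) are assumptions made only for this example. The policy samples a label, receives a reward, and shifts probability toward labels that earned more reward than the baseline.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, lr, baseline, rng):
    """One REINFORCE update for a softmax policy over discrete labels.

    The gradient of log pi(action) with respect to the logits is
    onehot(action) - probs, so each logit moves by
    lr * (reward - baseline) * (onehot - probs).
    """
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    reward = reward_fn(action)
    advantage = reward - baseline
    new_logits = [
        z + lr * advantage * ((1.0 if k == action else 0.0) - p)
        for k, (z, p) in enumerate(zip(logits, probs))
    ]
    return new_logits, reward


def train(num_labels=4, target=2, steps=2000, lr=0.5, seed=0):
    """Train the toy policy; `target` is a hypothetical 'correct' label."""
    rng = random.Random(seed)
    logits = [0.0] * num_labels
    baseline = 0.0

    # Hypothetical reward: 1 when the sampled label is the target, else 0.
    def reward_fn(action):
        return 1.0 if action == target else 0.0

    for _ in range(steps):
        logits, reward = reinforce_step(logits, reward_fn, lr, baseline, rng)
        # Moving-average baseline reduces the variance of the update.
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

In a semi-supervised setting, the sampled label would play the role of a pseudo label for an unlabeled sample, and the reward could carry external knowledge about how plausible that label is; how the paper actually defines the reward is not stated in the abstract.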

