Automatic Speech Recognition of Code Switching Speech Using 1-Best Rescoring

Author(s):  
Basem H.A. Ahmed ◽  
Tien-Ping Tan

2021 ◽  
Vol 11 (19) ◽  
pp. 9106
Author(s):  
Zheying Huang ◽  
Pei Wang ◽  
Jian Wang ◽  
Haoran Miao ◽  
Ji Xu ◽  
...  

Recurrent neural network (RNN)-based attention models have been used in code-switching speech recognition (CSSR). However, due to the sequential computation constraint of RNNs, they capture short-range dependencies more strongly than long-range ones, which makes it hard to switch languages promptly in CSSR. To address this problem, we first introduce the CTC-Transformer, which relies entirely on a self-attention mechanism to draw global dependencies and adopts connectionist temporal classification (CTC) as an auxiliary task for better convergence. Secondly, we propose two multi-task learning recipes in which a language identification (LID) auxiliary task is learned alongside the CTC-Transformer automatic speech recognition (ASR) task. Thirdly, we study a decoding strategy that incorporates LID into the ASR task. Experiments on the SEAME corpus demonstrate the effectiveness of the proposed methods, achieving a mixed error rate (MER) of 30.95%. This corresponds to up to a 19.35% relative MER reduction over the baseline RNN-based CTC-Attention system and an 8.86% relative MER reduction over the baseline CTC-Transformer system.
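The multi-task training objective described in the abstract can be sketched as a weighted combination of the attention, CTC, and LID losses. This is a minimal illustration, not the paper's implementation: the function name `joint_loss` and the weight values are assumptions, and the standard hybrid CTC/attention interpolation is used for the ASR part.

```python
def joint_loss(l_attn, l_ctc, l_lid, ctc_weight=0.3, lid_weight=0.1):
    """Sketch of a multi-task objective combining attention, CTC, and LID losses.

    ctc_weight interpolates between the attention and CTC ASR losses
    (hybrid CTC/attention style); lid_weight scales the auxiliary LID loss.
    Both weight values here are illustrative assumptions, not the paper's.
    """
    asr_loss = ctc_weight * l_ctc + (1.0 - ctc_weight) * l_attn
    return asr_loss + lid_weight * l_lid

# Example: per-batch scalar losses from the three heads.
total = joint_loss(l_attn=1.0, l_ctc=2.0, l_lid=0.5)
```

In a real system the three scalars would come from a Transformer decoder cross-entropy head, a CTC head on the encoder output, and a LID classification head, with the weights tuned on a development set.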



2019 ◽  
Vol 110 ◽  
pp. 76-89 ◽  
Author(s):  
Sreeram Ganji ◽  
Kunal Dhawan ◽  
Rohit Sinha


Author(s):  
Peter A. Heeman ◽  
Rebecca Lunsford ◽  
Andy McMillin ◽  
J. Scott Yaruss


Author(s):  
Manoj Kumar ◽  
Daniel Bone ◽  
Kelly McWilliams ◽  
Shanna Williams ◽  
Thomas D. Lyon ◽  
...  

