Exploration of End-to-End Framework for Code-Switching Speech Recognition Task: Challenges and Enhancements

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 68146-68157
Author(s):  
Ganji Sreeram ◽  
Rohit Sinha
Author(s):  
Zhiping Zeng ◽  
Yerbolat Khassanov ◽  
Van Tung Pham ◽  
Haihua Xu ◽  
Eng Siong Chng ◽  
...  

2019 ◽  
Author(s):  
Yerbolat Khassanov ◽  
Haihua Xu ◽  
Van Tung Pham ◽  
Zhiping Zeng ◽  
Eng Siong Chng ◽  
...  

2021 ◽  
Author(s):  
Shuai Zhang ◽  
Jiangyan Yi ◽  
Zhengkun Tian ◽  
Ye Bai ◽  
Jianhua Tao ◽  
...  

Author(s):  
Denis Ivanko ◽  
Dmitry Ryumin

In this paper we design end-to-end neural networks for the low-resource lip-reading task and the audio speech recognition task using 3D CNNs, pre-trained CNN weights of several state-of-the-art models (e.g. VGG19, InceptionV3, MobileNetV2, etc.) and LSTMs. We present two phrase-level speech recognition pipelines: one for lip-reading and one for acoustic speech recognition. We evaluate different combinations of front-end and back-end modules on the RUSAVIC dataset. We compare our results with the traditional 2D CNN approach and demonstrate an increase in recognition accuracy of up to 14%. Moreover, we carefully studied existing state-of-the-art models to be used for augmentation. Based on this analysis, we chose the 5 most promising model architectures and evaluated them on our own data. We tested our systems on real-world data from two different scenarios: recorded in an idling vehicle and during actual driving. Our independently trained systems demonstrated acoustic speech recognition accuracy of up to 90% and lip-reading accuracy of up to 61%. Future work will focus on the fusion of visual and audio speech modalities and on speaker adaptation. We expect that fused multi-modal information will further improve recognition performance compared to a single modality. Another possible direction is research into different NN-based architectures to better tackle the end-to-end lip-reading task.


2020 ◽  
Author(s):  
Zimeng Qiu ◽  
Yiyuan Li ◽  
Xinjian Li ◽  
Florian Metze ◽  
William M. Campbell
