Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition

Mapping Intimacies ◽

10.21437/interspeech.2019-1290 ◽

2019 ◽

Author(s):

Shiliang Zhang ◽

Ming Lei ◽

Zhijie Yan

Keyword(s):

Speech Recognition ◽

Spelling Correction ◽

Correction Model ◽

Mandarin Speech Recognition

Download Full-text

A Spelling Correction Model for End-to-end Speech Recognition

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683745 ◽

2019 ◽

Author(s):

Jinxi Guo ◽

Tara N. Sainath ◽

Ron J. Weiss

Keyword(s):

Speech Recognition ◽

Spelling Correction ◽

Correction Model ◽

Download Full-text

Semantic Data Augmentation for End-to-End Mandarin Speech Recognition

10.21437/interspeech.2021-1162 ◽

2021 ◽

Author(s):

Jianwei Sun ◽

Zhiyuan Tang ◽

Hengxin Yin ◽

Wei Wang ◽

Xi Zhao ◽

...

Keyword(s):

Speech Recognition ◽

Data Augmentation ◽

Semantic Data ◽

Mandarin Speech Recognition

Download Full-text

End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition

10.21437/interspeech.2021-1242 ◽

2021 ◽

Author(s):

Shuai Zhang ◽

Jiangyan Yi ◽

Zhengkun Tian ◽

Ye Bai ◽

Jianhua Tao ◽

...

Keyword(s):

Speech Recognition ◽

Code Switching ◽

Acoustic Feature ◽

Spelling Correction ◽

Download Full-text

End-to-End Mandarin Speech Recognition Using Bidirectional Long Short-Term Memory Network

Advances in Intelligent Systems and Computing - Recent Developments in Mechatronics and Intelligent Robotics ◽

10.1007/978-3-030-00214-5_91 ◽

2018 ◽

pp. 726-735

Author(s):

Yu Yao ◽

Ryad Chellali

Keyword(s):

Speech Recognition ◽

Short Term Memory ◽

Term Memory ◽

Memory Network ◽

Long Short Term Memory ◽

Mandarin Speech Recognition

Download Full-text

A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems

10.21437/interspeech.2021-379 ◽

2021 ◽

Author(s):

Xiaoqiang Wang ◽

Yanqing Liu ◽

Sheng Zhao ◽

Jinyu Li

Keyword(s):

Speech Recognition ◽

Light Weight ◽

Spelling Correction ◽

Correction Model ◽

Recognition Systems

Download Full-text

End-to-End Mandarin Speech Recognition Combining CNN and BLSTM

Symmetry ◽

10.3390/sym11050644 ◽

2019 ◽

Vol 11 (5) ◽

pp. 644 ◽

Author(s):

Dong Wang ◽

Xiaodong Wang ◽

Shaohe Lv

Keyword(s):

Speech Recognition ◽

Contextual Information ◽

Language Models ◽

Speech Corpus ◽

Speech Features ◽

Mandarin Speech Recognition ◽

Conventional Systems ◽

Since conventional Automatic Speech Recognition (ASR) systems often contain many modules and use varieties of expertise, it is hard to build and train such models. Recent research show that end-to-end ASRs can significantly simplify the speech recognition pipelines and achieve competitive performance with conventional systems. However, most end-to-end ASR systems are neither reproducible nor comparable because they use specific language models and in-house training databases which are not freely available. This is especially common for Mandarin speech recognition. In this paper, we propose a CNN+BLSTM+CTC end-to-end Mandarin ASR. This CNN+BLSTM+CTC ASR uses Convolutional Neural Net (CNN) to learn local speech features, uses Bidirectional Long-Short Time Memory (BLSTM) to learn history and future contextual information, and uses Connectionist Temporal Classification (CTC) for decoding. Our model is completely trained on the by-far-largest open-source Mandarin speech corpus AISHELL-1, using neither any in-house databases nor external language models. Experiments show that our CNN+BLSTM+CTC model achieves a WER of 19.2%, outperforming the exiting best work. Because all the data corpora we used are freely available, our model is reproducible and comparable, providing a new baseline for further Mandarin ASR research.

Download Full-text

An Analysis of Decoding for Attention-Based End-to-End Mandarin Speech Recognition

2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) ◽

10.1109/iscslp.2018.8706686 ◽

2018 ◽

Author(s):

Dongwei Jiang ◽

Wei Zou ◽

Shuaijiang Zhao ◽

Guilin Yang ◽

Xiangang Li

Keyword(s):

Speech Recognition ◽

Mandarin Speech Recognition

Download Full-text

Comparable Study Of Modeling Units For End-To-End Mandarin Speech Recognition

2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) ◽

10.1109/iscslp.2018.8706661 ◽

2018 ◽

Author(s):

Wei Zou ◽

Dongwei Jiang ◽

Shuaijiang Zhao ◽

Guilin Yang ◽

Xiangang Li

Keyword(s):

Speech Recognition ◽

Mandarin Speech Recognition

Download Full-text

Measuring Mandarin Speech Recognition Thresholds Using the Method of Adaptive Tracking

Journal of Speech Language and Hearing Research ◽

10.1044/2019_jslhr-h-18-0162 ◽

2019 ◽

Vol 62 (6) ◽

pp. 2009-2017

Author(s):

Yuxia Wang ◽

Zhaoyu Lu ◽

Xiaohu Yang ◽

Chang Liu

Keyword(s):

Speech Recognition ◽

Adaptive Tracking ◽

Mandarin Speech Recognition

Download Full-text

Selective Adaptation of End-to-End Speech Recognition using Hybrid CTC/Attention Architecture for Noise Robustness

2020 28th European Signal Processing Conference (EUSIPCO) ◽

10.23919/eusipco47968.2020.9287836 ◽

2021 ◽

Author(s):

Cong-Thanh Do ◽

Shucong Zhang ◽

Thomas Hain

Keyword(s):

Speech Recognition ◽

Selective Adaptation ◽

Noise Robustness ◽

Download Full-text