mandarin speech recognition Latest Research Papers

In real world applications of speech recognition, recognition errors are inevitable, and manual correction is necessary. This paper presents an approach for the refinement of Mandarin speech recognition result by exploiting user feedback. An interface incorporating character-based candidate lists and feedback-driven updating of the candidate lists is introduced. For dynamic updating of candidate lists, a novel method based on lattice modification and rescoring is proposed. By adding words with similar pronunciations to the candidates next to the corrected character into the lattice and then performing rescoring on the modified lattice, the proposed method can improve the accuracy of the candidate lists even if the correct characters are not in the original lattice, with much lower computational cost than that of the speech re-recognition methods. Experimental results show that the proposed method can reduce 24.03% of user inputs and improve average candidate rank by 25.31%.

Download Full-text

Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition

10.21437/interspeech.2019-1290 ◽

2019 ◽

Cited By ~ 2

Author(s):

Shiliang Zhang ◽

Ming Lei ◽

Zhijie Yan

Keyword(s):

Speech Recognition ◽

Spelling Correction ◽

Correction Model ◽

End To End ◽

Mandarin Speech Recognition

Download Full-text

Measuring Mandarin Speech Recognition Thresholds Using the Method of Adaptive Tracking

Journal of Speech Language and Hearing Research ◽

10.1044/2019_jslhr-h-18-0162 ◽

2019 ◽

Vol 62 (6) ◽

pp. 2009-2017

Author(s):

Yuxia Wang ◽

Zhaoyu Lu ◽

Xiaohu Yang ◽

Chang Liu

Keyword(s):

Speech Recognition ◽

Adaptive Tracking ◽

Mandarin Speech Recognition

Download Full-text

End-to-End Mandarin Speech Recognition Combining CNN and BLSTM

Symmetry ◽

10.3390/sym11050644 ◽

2019 ◽

Vol 11 (5) ◽

pp. 644 ◽

Cited By ~ 9

Author(s):

Dong Wang ◽

Xiaodong Wang ◽

Shaohe Lv

Keyword(s):

Speech Recognition ◽

Contextual Information ◽

Language Models ◽

Neural Net ◽

Speech Corpus ◽

Speech Features ◽

End To End ◽

Mandarin Speech Recognition ◽

Conventional Systems ◽

Short Time

Since conventional Automatic Speech Recognition (ASR) systems often contain many modules and use varieties of expertise, it is hard to build and train such models. Recent research show that end-to-end ASRs can significantly simplify the speech recognition pipelines and achieve competitive performance with conventional systems. However, most end-to-end ASR systems are neither reproducible nor comparable because they use specific language models and in-house training databases which are not freely available. This is especially common for Mandarin speech recognition. In this paper, we propose a CNN+BLSTM+CTC end-to-end Mandarin ASR. This CNN+BLSTM+CTC ASR uses Convolutional Neural Net (CNN) to learn local speech features, uses Bidirectional Long-Short Time Memory (BLSTM) to learn history and future contextual information, and uses Connectionist Temporal Classification (CTC) for decoding. Our model is completely trained on the by-far-largest open-source Mandarin speech corpus AISHELL-1, using neither any in-house databases nor external language models. Experiments show that our CNN+BLSTM+CTC model achieves a WER of 19.2%, outperforming the exiting best work. Because all the data corpora we used are freely available, our model is reproducible and comparable, providing a new baseline for further Mandarin ASR research.

Download Full-text

mandarin speech recognition
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Semantic Data Augmentation for End-to-End Mandarin Speech Recognition

Syllable-Based Acoustic Modeling With Lattice-Free MMI for Mandarin Speech Recognition

Cascade RNN-Transducer: Syllable Based Streaming On-Device Mandarin Speech Recognition with a Syllable-To-Character Converter

Improve Mandarin Speech Recognition Using Simple Recurrent Units and SpecAugment

A review of Mandarin speech recognition test materials for use in Singapore

Robust Audio-Visual Mandarin Speech Recognition Based On Adaptive Decision Fusion And Tone Features

Feedback-Driven Refinement of Mandarin Speech Recognition Result Based on Lattice Modification and Rescoring

Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition

Measuring Mandarin Speech Recognition Thresholds Using the Method of Adaptive Tracking

End-to-End Mandarin Speech Recognition Combining CNN and BLSTM

Export Citation Format

mandarin speech recognitionRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Semantic Data Augmentation for End-to-End Mandarin Speech Recognition

Syllable-Based Acoustic Modeling With Lattice-Free MMI for Mandarin Speech Recognition

Cascade RNN-Transducer: Syllable Based Streaming On-Device Mandarin Speech Recognition with a Syllable-To-Character Converter

Improve Mandarin Speech Recognition Using Simple Recurrent Units and SpecAugment

A review of Mandarin speech recognition test materials for use in Singapore

Robust Audio-Visual Mandarin Speech Recognition Based On Adaptive Decision Fusion And Tone Features

Feedback-Driven Refinement of Mandarin Speech Recognition Result Based on Lattice Modification and Rescoring

Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition

Measuring Mandarin Speech Recognition Thresholds Using the Method of Adaptive Tracking

End-to-End Mandarin Speech Recognition Combining CNN and BLSTM

mandarin speech recognition
Recently Published Documents