Automatic speech recognition errors detection using supervised learning techniques

Automatic speech recognition in patients with aphasia is a challenging task for which studies have been published in only a few languages. Understandably, the systems reported in the literature in this field perform significantly worse than those focused on transcribing non-pathological clean speech. This is mainly due to the difficulty of recognizing less intelligible speech and to the scarcity of annotated aphasic data. This work applies novel semi-supervised learning methods to the AphasiaBank dataset to address these two major issues, reporting improvements for English and providing the first benchmark for Spanish, for which less than one hour of transcribed aphasic speech was used for training. In addition, it describes the influence of reinforcing the training and decoding processes with out-of-domain acoustic and text data, using different strategies and configurations to fine-tune the hyperparameters and the final recognition systems. The results encourage extending this approach to other languages and scenarios where annotated data for training recognition models is scarce.

Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening

Computer Speech & Language ◽

10.1016/j.csl.2017.11.001 ◽

2018 ◽

Vol 49 ◽

pp. 17-36 ◽

Cited By ~ 1

Author(s):

Maryam Sadat Mirzaei ◽

Kourosh Meshgi ◽

Tatsuya Kawahara

Keyword(s):

Second Language ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Second Language Listening ◽

Recognition Errors

Machine Learning Techniques for Speech Recognition using the Magnitudes

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e4945.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 1667-1671

Keyword(s):

Machine Learning ◽

Speech Recognition ◽

Data Processing ◽

Supervised Learning ◽

Current Trend ◽

Large Datasets ◽

Point Of View ◽

Machine Learning Techniques ◽

Supervised And Unsupervised Learning ◽

Learning Techniques

Speech is the most efficient means of communication between people. Speech recognition is an interdisciplinary subfield of computational linguistics that develops methods and technologies enabling computers to recognize spoken language and convert it into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech to text (STT). It combines knowledge and research from linguistics, computer science, and electrical engineering. Being the most natural mode of communication, speech could also be a useful interface for communicating with machines. Machine learning comprises supervised and unsupervised learning, of which supervised learning is used for speech recognition. Supervised learning is the data processing task of inferring a function from labeled training data. Speech recognition is a current trend that has gained attention over the decades, and most automation technologies use speech and speech recognition for various purposes. This paper offers an overview of the major technological perspectives on, and an appreciation of the fundamental progress of, speech recognition, and reviews the methods developed at each stage of speech recognition using supervised learning. The project will use an artificial neural network (ANN) to recognize speech from magnitudes using large datasets.
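
As a rough illustration of what "inferring a function from labeled training data" can look like with magnitude features, the sketch below trains a classifier on the magnitude spectra of toy signals. It is not the method of the paper above: a nearest-centroid classifier stands in for the ANN to keep the example short, the signals are synthetic sine tones rather than speech, and all names (`magnitude_spectrum`, `train_centroids`, `predict`, `tone`) and the labels "low" and "high" are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Return DFT magnitudes for bins 0..N/2 of a real signal (naive O(N^2) DFT)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags


def train_centroids(examples):
    """Supervised step: average the feature vectors of each label.

    examples is a list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def tone(freq_bin, n=32, phase=0.0):
    """Synthetic stand-in for an audio frame: a sine wave at a given DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n + phase) for t in range(n)]


# Toy labeled training data (assumed for illustration, not real speech).
training = [
    (magnitude_spectrum(tone(2)), "low"),
    (magnitude_spectrum(tone(3, phase=0.5)), "low"),
    (magnitude_spectrum(tone(10)), "high"),
    (magnitude_spectrum(tone(11, phase=0.5)), "high"),
]
model = train_centroids(training)

# Classify an unseen frame; magnitudes discard phase, so the shift is harmless.
prediction = predict(model, magnitude_spectrum(tone(3, phase=1.0)))
print(prediction)
```

The learned "function" here is simply the mapping from a magnitude vector to the nearest class average; a neural network would replace `train_centroids` and `predict` while keeping the same labeled-data setup.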

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

10.21437/interspeech.2021-1981 ◽

2021 ◽

Author(s):

Tomohiro Tanaka ◽

Ryo Masumura ◽

Mana Ihori ◽

Akihiko Takashima ◽

Shota Orihashi ◽

...

Keyword(s):

Speech Recognition ◽

Supervised Learning ◽

Automatic Speech Recognition ◽

End To End

A Study on the Automatic Speech Recognition Errors of Korean Plosives

The Journal of Yeongju Language & Literature ◽

10.30774/yjll.2021.10.49.5 ◽

2021 ◽

Vol 49 ◽

pp. 5-33

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Recognition Errors

Detecting listening difficulty for second language learners using Automatic Speech Recognition errors

10.21437/slate.2017-27 ◽

2017 ◽

Author(s):

Maryam Sadat Mirzaei ◽

Kourosh Meshgi ◽

Tatsuya Kawahara

Keyword(s):

Second Language ◽

Speech Recognition ◽

Language Learners ◽

Automatic Speech Recognition ◽

Second Language Learners ◽

Recognition Errors

An Overview of End-to-End Automatic Speech Recognition

Symmetry ◽

10.3390/sym11081018 ◽

2019 ◽

Vol 11 (8) ◽

pp. 1018 ◽

Cited By ~ 8

Author(s):

Dong Wang ◽

Xiaodong Wang ◽

Shaohe Lv

Keyword(s):

Deep Learning ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Deep Neural Network ◽

Mixed Model ◽

Hidden Markov ◽

Continuous Speech Recognition ◽

Learning Techniques ◽

Long Time ◽

End To End

Automatic speech recognition, especially large-vocabulary continuous speech recognition, is an important issue in the field of machine learning. For a long time, the hidden Markov model (HMM)-Gaussian mixture model (GMM) has been the mainstream speech recognition framework. Recently, however, the HMM-deep neural network (DNN) model and the end-to-end model using deep learning have achieved performance beyond HMM-GMM. Both using deep learning techniques,
