Automatic Speech Recognition System
Recently Published Documents


TOTAL DOCUMENTS: 174 (FIVE YEARS: 60)

H-INDEX: 9 (FIVE YEARS: 3)

2021, Vol 1 (1), pp. 453-478
Author(s): Heriyanto Heriyanto, Herlina Jayadianti, Juwairiah Juwairiah

There are two approaches to Qur'an recitation, namely talaqqi and qira'ati. Both approaches use the science of recitation, which contains the rules and procedures for reading the Qur'an properly. Talaqqi requires the teacher and students to sit facing each other, while qira'ati is the recitation of the Qur'an with rhythms and tones. Many studies have developed automatic speech recognition systems for Qur'an recitation to support the learning process. Common feature extraction models use Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC). The MFCC method achieves an accuracy of 50% to 60%, while LPC reaches only 45% to 50%, so the non-linear MFCC method is more accurate than the linear approach. The cepstral coefficient features used range from index 0 to 23, i.e., 24 cepstral coefficients, and the frames taken range from 0 to 10, i.e., eleven frames. Voting over 300 recorded voice samples was tested against 200 voice recordings of both male and female voices, sampled at 44.1 kHz, 16-bit stereo. This study aims to obtain good accuracy by selecting the right cepstral coefficient feature using MFCC feature extraction and by matching accuracy through selection of the cepstral coefficient feature with Dominant Weight Normalization (NBD) at TPA Nurul Huda Plus Purbayan. The results showed that the MFCC method with the 23rd cepstral coefficient selected achieved a higher accuracy of 90.2% than the other configurations. It can be concluded that selecting the right feature, the 23rd cepstral coefficient, affects the accuracy of recognizing the voice of Qur'an recitation.
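As an illustration of the kind of feature extraction this abstract describes, the sketch below extracts 24 cepstral coefficients (indices 0 to 23) per frame and then selects the 23rd coefficient across the first eleven frames. It assumes librosa as the extraction tool; the file name, helper names, and parameter choices are illustrative and not taken from the authors' implementation.

```python
# Minimal sketch of MFCC feature extraction with cepstral-coefficient selection.
# librosa is an assumed dependency; the file name, frame count, and helper names
# are illustrative and not taken from the authors' implementation.
import librosa

def extract_mfcc(path, n_mfcc=24, n_frames=11, sr=44100):
    """Return the first `n_frames` frames of `n_mfcc` cepstral coefficients
    (indices 0..n_mfcc-1) for one recording."""
    y, sr = librosa.load(path, sr=sr, mono=True)             # downmix stereo to mono
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, total_frames)
    return mfcc[:, :n_frames]

features = extract_mfcc("recitation_sample.wav")   # hypothetical recording
coef_23 = features[23, :]   # the 23rd cepstral coefficient across the selected frames
```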


Author(s): Masoud Geravanchizadeh, Elnaz Forouhandeh, Meysam Bashirpour

The performance of speech recognition systems trained with neutral utterances degrades significantly when these systems are tested with emotional speech. Since everybody can speak emotionally in real-world environments, it is necessary to account for the emotional state of speech when evaluating the performance of an automatic speech recognition system. Limited work has been done in the field of emotion-affected speech recognition, and most research so far has focused on the classification of speech emotions. In this paper, the vocal tract length normalization method is employed to enhance the robustness of the emotion-affected speech recognition system. For this purpose, two speech recognition architectures are used, based on hybrids of a hidden Markov model with either a Gaussian mixture model or a deep neural network. To achieve this goal, frequency warping is applied to the filterbank and/or discrete-cosine-transform domains in the feature extraction stage of the automatic speech recognition system. The warping is designed to normalize the emotional feature components so that they approach their corresponding neutral feature components. The performance of the proposed system is evaluated in neutrally trained/emotionally tested conditions for different speech features and emotional states (i.e., Anger, Disgust, Fear, Happy, and Sad), with frequency warping applied to different acoustical features. The constructed emotion-affected speech recognition system is based on the Kaldi automatic speech recognition toolkit, with the Persian Emotional Speech Database and the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D) as input corpora. The experimental simulations reveal that, in general, the warped emotional features yield better performance of the emotion-affected speech recognition system than their unwarped counterparts. The deep neural network-hidden Markov model hybrid also outperforms the system employing the Gaussian mixture model hybrid.
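The following is a minimal sketch of filterbank-domain frequency warping in the spirit of vocal tract length normalization, not the authors' exact warping function: a piecewise-linear warp with factor alpha is applied to the FFT bin frequencies before the mel filterbank is summed. librosa is assumed; the helper names and the knee at 0.85 of the Nyquist frequency are illustrative choices.

```python
# Hedged sketch of filterbank-domain frequency warping (VTLN-style); this is
# not the authors' exact method. librosa is assumed; the piecewise-linear warp
# and the 0.85 knee are illustrative choices.
import numpy as np
import librosa

def warp_frequencies(freqs, alpha, f_max):
    """Piecewise-linear warp: scale by `alpha` up to a knee, then interpolate
    linearly so that f_max still maps to f_max."""
    knee = 0.85 * f_max
    return np.where(
        freqs <= knee,
        alpha * freqs,
        alpha * knee + (f_max - alpha * knee) * (freqs - knee) / (f_max - knee),
    )

def warped_logmel(y, sr, alpha=1.0, n_mels=26, n_fft=512):
    """Log-mel energies computed after warping the FFT bin frequencies."""
    power = np.abs(librosa.stft(y, n_fft=n_fft)) ** 2
    fft_freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
    warped = warp_frequencies(fft_freqs, alpha, f_max=sr / 2)
    mel_fb = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
    # Evaluate each mel filter at the warped bin positions (simple interpolation),
    # which stretches or compresses the filters along the frequency axis.
    mel_fb_warped = np.stack([np.interp(warped, fft_freqs, fb) for fb in mel_fb])
    return np.log(mel_fb_warped @ power + 1e-10)
```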


2021, Vol 3
Author(s): Roozbeh Sadeghian, J. David Schaffer, Stephen A. Zahorian

Automatic Speech Recognition (ASR) is widely used in many applications and tools. Smartphones, video games, and cars are a few examples where people use ASR routinely, often daily. A less common but potentially very important arena for ASR is the health domain. For some people, the impact on life could be enormous. The goal of this work is to develop an easy-to-use, non-invasive, inexpensive speech-based diagnostic test for dementia that can be applied easily in a clinician's office or even at home. While considerable work has been published along these lines, and its volume has increased dramatically in recent years, it is primarily of theoretical value and not yet practical to apply. A large gap exists between current scientific understanding and the creation of a diagnostic test for dementia. The aim of this paper is to bridge this gap between theory and practice by engineering a practical test. Experimental evidence suggests that strong discrimination between subjects with a diagnosis of probable Alzheimer's and matched normal controls can be achieved with a combination of acoustic features from speech, linguistic features extracted from a transcription of the speech, and the results of a mini mental state exam. A fully automatic speech recognition system tuned for the speech-to-text aspect of this application, including automatic punctuation, is also described.
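A hedged sketch of the kind of combined-feature classification the abstract describes follows: acoustic features, linguistic features from a transcript, and a mini mental state exam (MMSE) score are concatenated per subject and fed to a simple classifier. scikit-learn is assumed, and all data, dimensions, and labels are hypothetical placeholders rather than the authors' pipeline.

```python
# Illustrative sketch (not the authors' pipeline): acoustic, linguistic, and
# MMSE features are concatenated and classified. All data are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def combine_features(acoustic_vec, linguistic_vec, mmse_score):
    """Concatenate the three feature sources into one vector per subject."""
    return np.concatenate([acoustic_vec, linguistic_vec, [mmse_score]])

# Hypothetical data: 40 subjects, 12 acoustic + 8 linguistic features + MMSE.
rng = np.random.default_rng(0)
X = np.stack([
    combine_features(rng.normal(size=12), rng.normal(size=8), rng.integers(10, 30))
    for _ in range(40)
])
y = rng.integers(0, 2, size=40)   # 1 = probable Alzheimer's, 0 = control (placeholder labels)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())   # cross-validated accuracy
```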

