scholarly journals Indigenuous Vocabulary Reformulation for Continuousyorùbá Speech Recognition In M-Commerce Using Acoustic Nudging-Based Gaussian Mixture Model

Author(s):  
Kehinde Lydia Ajayi ◽  
Victor Azeta ◽  
Isaac Odun-Ayo ◽  
Ambrose Azeta ◽  
Ajayi Peter Taiwo ◽  
...  

Abstract One of the current research areas is speech recognition by aiding in the recognition of speech signals through computer applications. In this research paper, Acoustic Nudging, (AN) Model is used in re-formulating the persistence automatic speech recognition (ASR) errors that involves user’s acoustic irrational behavior which alters speech recognition accuracy. GMM helped in addressing low-resourced attribute of Yorùbá language to achieve better accuracy and system performance. From the simulated results given, it is observed that proposed Acoustic Nudging-based Gaussian Mixture Model (ANGM) improves accuracy and system performance which is evaluated based on Word Recognition Rate (WRR) and Word Error Rate (WER)given by validation accuracy, testing accuracy, and training accuracy. The evaluation results for the mean WRR accuracy achieved for the ANGM model is 95.277% and the mean Word Error Rate (WER) is 4.723%when compared to existing models. This approach thereby reduce error rate by 1.1%, 0.5%, 0.8%, 0.3%, and 1.4% when compared with other models. Therefore this work was able to discover a foundation for advancing current understanding of under-resourced languages and at the same time, development of accurate and precise model for speech recognition.

2011 ◽  
Vol 25 (2) ◽  
pp. 404-439 ◽  
Author(s):  
Daniel Povey ◽  
Lukáš Burget ◽  
Mohit Agarwal ◽  
Pinar Akyazi ◽  
Feng Kai ◽  
...  

2013 ◽  
Vol 380-384 ◽  
pp. 3530-3533
Author(s):  
Yong Qiang Bao ◽  
Li Zhao ◽  
Cheng Wei Huang

In this paper we studied speech emotion recognition from Mandarin speech signal. Five basic emotion classes and the neutral state are considered. In a listening experiment we verified the speech corpus using a judgment matrix. Acoustic parameters including short-term energy, pitch contour, and formants are extracted from emotional speech signal. Gaussian mixture model is then adopted for training the emotion model. Due to the data challenge in GMM training, we use multiple discriminant analysis for feature optimization and compared with basic Fisher discriminant ratio based method. The experimental results show that using multiple discriminant analysis our GMM classifier gives a promising recognition rate for Mandarin speech emotion recognition.


Sign in / Sign up

Export Citation Format

Share Document