Study Speech Recognition System Based on Manifold Learning

2013 ◽  
Vol 380-384 ◽  
pp. 3762-3765
Author(s):  
Peng Hao Zhang

This paper presents a comprehensive discussion of the relevant speech recognition technologies and of manifold learning. Traditional MFCC speech features slow down learning because they are high-dimensional and large in data volume. To address this problem, we introduce manifold learning and propose two new methods for extracting MFCC-Manifold speech features. Dimensionality is reduced with the ISOMAP algorithm, which is based on classical MDS (multidimensional scaling). By replacing the original Euclidean distances with geodesic distances, the twenty-four-dimensional data produced by traditional MFCC feature extraction is reduced to ten dimensions.
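The geodesic-distance step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy semicircle data, and the neighbourhood size `k=2` are assumptions chosen for illustration. The sketch approximates geodesic distances the way ISOMAP commonly does, as shortest paths over a k-nearest-neighbour graph; the later classical MDS step that would produce the low-dimensional coordinates is omitted.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def geodesic_distances(points, k=2):
    """Approximate geodesic distances as shortest paths on a k-NN graph.

    Hypothetical helper: `k` is an assumed tuning parameter, not a value
    taken from the paper.
    """
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Keep only the k closest points as graph neighbours of point i.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        for j in neighbours:
            d = euclidean(points[i], points[j])
            # Store each edge in both directions so the graph is symmetric.
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Toy data (assumed): 21 points along a unit semicircle, a curved 1-D manifold.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
geo = geodesic_distances(points, k=2)

straight = euclidean(points[0], points[-1])  # cuts across the semicircle
along_curve = geo[0][-1]                     # follows the curve via neighbours
print(straight, along_curve)
```

On this toy curve the straight-line distance between the two endpoints is 2, whereas the graph distance stays close to the arc length π. That gap is what the geodesic substitution is meant to capture: points that look near each other in Euclidean terms may be far apart along the manifold.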

Author(s):  
Gopal Chaudhary ◽  
Smriti Srivastava ◽  
Saurabh Bhardwaj

This paper presents the main research paradigms for feature extraction methods, with the aim of advancing the state of the art in speaker recognition (SR), which is widely used for person identification in security and protection applications. Speaker recognition systems (SRS) have been a widely researched topic for many decades. The basic concept of feature extraction methods is derived from the biological model of the human auditory and vocal tract system. This work provides a classification-oriented review of feature extraction methods for SR over the last 55 years that have proven successful and have become a foundation for further research. Broadly, the review is divided into feature extraction methods with and without noise compensation techniques. Feature extraction methods without noise compensation techniques are divided into the following categories: high or low level of feature extraction; type of transform; speech production or auditory system; type of feature extraction technique; time variability; and speech processing techniques. Feature extraction methods with noise compensation techniques are further classified into noise-screened features, feature normalization methods, and feature compensation methods. This classification-oriented review gives readers a clear basis for choosing among the different techniques and should be helpful for future research in this field.


Speech recognition systems are used to enable fast communication between humans and machines. A number of such systems have been developed by various researchers, including speech recognition, speaker verification, and speaker recognition systems. The basic stages of a speech recognition system are pre-processing, feature extraction, feature selection, and classification. Numerous works have sought to improve all of these stages in order to obtain more accurate results. This paper focuses mainly on the addition of machine learning to speech recognition systems. It first covers the architecture of automatic speech recognition (ASR), which gives an idea of the basic stages of a speech recognition system, and then turns to the use of machine learning in ASR. One section covers the work done by various researchers using support vector machines and artificial neural networks. A review of work using SVM, ELM, ANN, Naive Bayes, and kNN classifiers is also presented. The simulation results show that the best accuracy is achieved with the ELM classifier. The last section covers the results obtained with the proposed approaches, which use SVM, ANN with the Cuckoo search algorithm, and ANN with back-propagation as classifiers. Attention is also given to improving the pre-processing and feature extraction stages.

