Comparison of Different Speech Feature Extraction Techniques with and without Wavelet Transform to Kannada Speech Recognition

2011 ◽  
Vol 26 (4) ◽  
pp. 19-24 ◽  
Author(s):  
M.A. Anusuya ◽  
S.K. Katti
Author(s):  
Hongbing Zhang

Speech recognition has become one of the important technologies for human-computer interaction. It is essentially a process of speech training and pattern recognition, which makes feature extraction particularly important: the quality of the extracted features directly affects recognition accuracy. Dynamic feature parameters can effectively improve recognition accuracy, so dynamic speech feature extraction has high research value. Traditional dynamic feature extraction methods tend to generate redundant information, resulting in low recognition accuracy. Therefore, a speech feature extraction method based on deep learning is proposed. First, the speech signal is preprocessed by pre-emphasis, windowing, filtering, and endpoint detection. Then, the sliding differential cepstral (SDC) feature is extracted, which contains the voice information of the preceding and following frames. Finally, this feature is used as input to a deep autoencoder neural network, which extracts dynamic features that represent the deep essence of the speech information. The simulation results show that the dynamic features extracted by deep learning give better recognition performance than the original features and work well in speech recognition.
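The pre-emphasis and SDC steps named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pre-emphasis coefficient, the SDC parameters `d`, `p`, and `k`, the clamping of frame indices at the edges, and the function names are all assumptions, and the SDC follows the commonly used shifted-delta definition rather than anything stated in the abstract.

```python
def pre_emphasis(signal, alpha=0.97):
    """High-pass the signal: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a common default and is an assumption here.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def sdc(cepstra, d=1, p=3, k=3):
    """Shifted (sliding) delta cepstra for a list of per-frame cepstral vectors.

    For frame t, block i is c[t + i*p + d] - c[t + i*p - d]; the k blocks are
    concatenated, so each output vector spans frames before and after t.
    Frame indices are clamped to the valid range at the edges (an assumption).
    """
    last = len(cepstra) - 1

    def clamp(index):
        return max(0, min(last, index))

    features = []
    for t in range(len(cepstra)):
        vector = []
        for i in range(k):
            ahead = cepstra[clamp(t + i * p + d)]
            behind = cepstra[clamp(t + i * p - d)]
            vector.extend(a - b for a, b in zip(ahead, behind))
        features.append(vector)
    return features
```

Each output vector has `k` times the dimension of the input cepstra. In the pipeline described above, these vectors would be the input to the autoencoder stage, which is not sketched here.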


2013 ◽  
Vol 416-417 ◽  
pp. 1176-1180
Author(s):  
Jian Guo Xing ◽  
Min Xu ◽  
Ji Xiang Zhu

Whether a speech recognition system performs well is closely related to its characteristic parameters. To emulate the human auditory system, a new method of speech feature extraction based on Hopf filter banks is presented. We model the extraction process on that of MFCC, using Hopf filter banks instead of the triangular filter banks. We then adjust the center frequency and bandwidth of each filter according to the characteristics of the basilar membrane in the cochlea. Passing the test speech through the Hopf filter banks yields multi-dimensional feature vectors. Applying the discrete cosine transform to these vectors then gives the Hopf cepstral coefficients of the speech. Compared with the traditional MFCC feature, speech recognition systems using Hopf characteristic parameters achieve a better recognition rate and greater robustness in low signal-to-noise ratio (SNR) environments.
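The final step described above, turning filter-bank outputs into cepstral coefficients with a discrete cosine transform, can be sketched as follows. The abstract does not specify the Hopf filters themselves, so this sketch assumes the per-band energies are already available; the function name, the log compression, and the energy floor are assumptions borrowed from the usual MFCC recipe.

```python
import math


def cepstral_coefficients(band_energies, num_coeffs):
    """Unnormalized DCT-II of log band energies, as in an MFCC-style pipeline.

    band_energies: one non-negative energy per filter-bank channel.
    """
    n = len(band_energies)
    # A small floor avoids log(0) for silent bands (an assumption).
    log_energies = [math.log(max(e, 1e-12)) for e in band_energies]
    return [
        sum(
            log_energies[m] * math.cos(math.pi * k * (m + 0.5) / n)
            for m in range(n)
        )
        for k in range(num_coeffs)
    ]
```

With a flat spectrum, only the zeroth coefficient is non-zero, which is why the higher coefficients are read as describing spectral shape rather than overall level.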


2017 ◽  
Vol 9 (2) ◽  
pp. 23
Author(s):  
Pardeep Sangwan ◽  
Dinesh Sheoran ◽  
Saurabh Bhardwaj

Speech recognition by machine may be defined as the automatic conversion of a human speech signal into text without any human intervention. This article discusses two feature extraction techniques for speech recognition, one using the DWT (Discrete Wavelet Transform) and one using WPD (Wavelet Packet Decomposition). Two speech recognizers, the first based on the DWT and the second on WPD, are compared using four classifiers. The proposed method is implemented on a database of ten digits spoken by two hundred speakers, giving 2000 speech samples. The results present the accuracy rate of each speech recognizer.
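One common way to turn a DWT into a feature vector is to take the energy of each sub-band. The sketch below does this with the Haar wavelet. The abstract does not state which wavelet family, decomposition depth, or feature definition the authors used, so the Haar choice, `levels=3`, and the function names are assumptions for illustration only.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    scale = 1.0 / math.sqrt(2.0)
    starts = range(0, len(signal) - 1, 2)  # an odd trailing sample is dropped
    approx = [(signal[i] + signal[i + 1]) * scale for i in starts]
    detail = [(signal[i] - signal[i + 1]) * scale for i in starts]
    return approx, detail


def dwt_band_energies(signal, levels=3):
    """Energy of the detail band at each level, then the final approximation band."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        # Only the approximation band is split again at the next level.
        current, detail = haar_dwt(current)
        energies.append(sum(c * c for c in detail))
    energies.append(sum(c * c for c in current))
    return energies
```

The DWT splits only the approximation band at each level, as in the loop above. A wavelet packet decomposition would also split each detail band, giving a finer and more uniform division of the high frequencies.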


2012 ◽  
Vol 532-533 ◽  
pp. 1162-1166
Author(s):  
Xiang Hua Ren ◽  
Yun Xia Jiang

Feature extraction plays an important role in speech recognition. In this paper, we propose a speech feature extraction scheme that focuses on increasing the robustness of a speech recognizer to additive noise and convolutive channel distortion. Because the two distortions are additive in the spectral and log-spectral domains, respectively, we remove the additive components by computing the time derivatives of speech frames first in the spectral domain and then in the log-spectral domain. Unlike conventional methods, this method needs neither spectrum estimation nor prior knowledge of the noise. Experimental results confirm that the proposed method can improve speech recognition performance in environments containing both noise and channel distortion.
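The reasoning in this abstract can be checked on a toy model. The sketch below assumes each observed power spectrum is `channel * speech + noise` with a stationary noise term and a fixed channel gain per frequency bin, and it approximates the time derivative by a first difference between adjacent frames. The model, the absolute value taken before the logarithm, the floor, and the function name are assumptions, not details from the paper.

```python
import math


def double_difference(frames, floor=1e-12):
    """Difference in the spectral domain, then in the log-spectral domain.

    frames: list of per-frame power spectra (lists of equal length).
    Returns len(frames) - 2 feature vectors.
    """
    # First difference in the spectral domain cancels a stationary additive term.
    spectral_diff = [
        [cur - prev for cur, prev in zip(frames[t], frames[t - 1])]
        for t in range(1, len(frames))
    ]
    # Log magnitude turns a multiplicative channel gain into an additive offset.
    log_diff = [
        [math.log(max(abs(v), floor)) for v in row] for row in spectral_diff
    ]
    # First difference in the log-spectral domain cancels that constant offset.
    return [
        [cur - prev for cur, prev in zip(log_diff[t], log_diff[t - 1])]
        for t in range(1, len(log_diff))
    ]
```

Under these assumptions, clean frames and their distorted versions give the same features up to rounding, which is the sense in which no noise estimate is needed:

```python
clean = [[1.0, 2.0], [3.0, 5.0], [4.0, 9.0], [8.0, 10.0]]
channel, noise = [0.5, 2.0], [0.3, 0.7]
distorted = [
    [h * s + n for h, s, n in zip(channel, frame, noise)] for frame in clean
]
features_clean = double_difference(clean)
features_distorted = double_difference(distorted)
```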

