A DCT based nonlinear predictive coding for feature extraction in speech recognition systems

Author(s):  
Mahmood Yousefi Azar ◽  
Farbod Razzazi
2019 ◽  
Vol 8 (4) ◽  
pp. 7160-7162

Today's fast-moving world depends on interaction between humans and machines, and making that interaction work is not an easy task. Speech recognition is a major part of it: the machine must understand speech correctly in order to perform the requested tasks. Automatic speech recognition (ASR) has therefore been developed, and it has substantially advanced human-machine interaction systems (HMIS). This research focuses on speech recognition for the Telugu language, as used in Telugu HMI systems. The paper uses line spectral frequencies (LSF) for feature extraction and a deep neural network (DNN) for feature classification, which together produced effective results. Many other recognition systems also use these techniques, but for the Telugu language they are the most suitable.


2015 ◽  
Vol 40 (1) ◽  
pp. 25-31 ◽  
Author(s):  
Sayf A. Majeed ◽  
Hafizah Husain ◽  
Salina A. Samad

In this paper, a new feature-extraction method is proposed to achieve robustness of speech recognition systems. This method combines the benefits of phase autocorrelation (PAC) with the Bark wavelet transform. PAC uses the angle to measure correlation instead of the traditional autocorrelation measure, whereas the Bark wavelet transform is a special type of wavelet transform that is designed specifically for speech signals. The features extracted with this combined method are called phase autocorrelation Bark wavelet transform (PACWT) features. The speech recognition performance of the PACWT features is evaluated and compared with the conventional mel frequency cepstrum coefficients (MFCC) feature-extraction method using the TI-Digits database under different noise types and noise levels. The database has been divided into male and female data. The results show that the word recognition rate using the PACWT features for noisy male data (white noise at 0 dB SNR) is 60%, whereas it is 41.35% for the MFCC features under identical conditions.
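The idea of measuring correlation by an angle can be illustrated with a short sketch. It is not the authors' implementation: it assumes the common formulation in which a frame is compared with circularly shifted copies of itself, so that both vectors have the same norm and the angle is the arccosine of the normalized dot product. The function name `phase_autocorrelation` and the `eps` guard are hypothetical, and the Bark wavelet stage is not shown.

```python
import numpy as np

def phase_autocorrelation(frame, eps=1e-12):
    """Angle (in radians) between a frame and each circular shift of itself.

    Assumption: circular shifts are used, so the shifted frame keeps the
    same energy and the normalized dot product is a true cosine.
    """
    x = np.asarray(frame, dtype=float)
    energy = np.dot(x, x)
    # Conventional (circular) autocorrelation: dot product with each shift.
    r = np.array([np.dot(x, np.roll(x, k)) for k in range(len(x))])
    # Normalize to a cosine and clip to guard against rounding errors.
    cos_theta = np.clip(r / (energy + eps), -1.0, 1.0)
    # PAC replaces the dot product by the angle between the two vectors.
    return np.arccos(cos_theta)
```

For a frame holding whole periods of a sinusoid, the angle is close to 0 at lag 0 and close to π at a lag of half a period, where the shifted frame is in anti-phase.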


2020 ◽  
Vol 9 (1) ◽  
pp. 2431-2435

Automatic speech recognition (ASR) is the use of software- and hardware-based techniques to identify and process the human voice. In this research, Tamil words are analyzed and segmented into syllables, followed by feature extraction and recognition. Syllables are segmented using short-term energy (STE), and segmentation is done in order to minimize the corpus size. The syllable segmentation algorithm works by computing the STE function of the continuous speech signal. The proposed approach for speech recognition uses a combination of Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC). MFCC features are used to extract a feature vector containing all information about the linguistic message. LPC affords a robust, dependable and accurate technique for estimating the parameters that characterize the vocal tract system. LPC features can reduce the bit rate of speech (i.e., reduce the size of the transmitted signal). The combined feature extraction technique will minimize the size of the transmitted signal. The proposed feature extraction (FE) algorithm is then evaluated on the speech corpus using the random forest approach. Random forest is an effective algorithm that can build a reliable model with a short training time, because each classifier works on only a subset of the features.
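As a rough illustration of energy-based segmentation, the sketch below computes the short-term energy of a signal frame by frame and marks runs of frames whose energy exceeds a threshold. The abstract does not give the frame length, hop size or threshold rule, so `frame_len`, `hop` and `threshold_ratio` are assumed values, and both function names are hypothetical.

```python
import numpy as np

def short_term_energy(signal, frame_len=400, hop=160):
    """Sum of squared samples in each frame (frame_len and hop are assumed)."""
    x = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.array([
        np.sum(x[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def segment_by_energy(energy, threshold_ratio=0.1):
    """Return (start_frame, end_frame) pairs for runs of high-energy frames.

    Assumption: a frame is active when its energy exceeds a fixed fraction
    of the maximum frame energy; end_frame is exclusive.
    """
    active = energy > threshold_ratio * energy.max()
    # Pad with False so that every run has both a start and an end.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]
```

Multiplying a frame index by `hop` gives the approximate sample position of a boundary. A fixed fraction of the maximum energy is the simplest possible threshold; real syllable segmentation would likely need smoothing and a more careful rule.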


Author(s):  
Conrad Bernath ◽  
Aitor Alvarez ◽  
Haritz Arzelus ◽  
Carlos David Martínez

Author(s):  
Sheng Li ◽  
Dabre Raj ◽  
Xugang Lu ◽  
Peng Shen ◽  
Tatsuya Kawahara ◽  
...  
