speech feature extraction
Recently Published Documents


Total documents: 77 (five years: 10)

H-index: 8 (five years: 0)

Speech classification is one of the challenging problems in speech processing. In this paper, we perform speech classification for the Kannada language. We gathered a speech database from children aged 4-6 years. The collected dataset is pre-processed, and speech features are extracted using the Mel Frequency Cepstral Coefficients (MFCC) technique. After feature extraction, Kannada alphabets are classified using six different Machine Learning (ML) classifiers, and the classifier accuracies are compared with each other. Among the Deep Learning classifiers, the Recursive Neural Network (RNN) gave the highest accuracy, around 93.6% (for 300 epochs); among the Machine Learning classifiers, Random Forest (RF) gave the highest accuracy, around 88.9%.
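The MFCC step named in this abstract can be illustrated for a single frame. The following is a minimal sketch in pure Python, not the authors' implementation: the abstract gives no parameters, so the frame length (256 samples), sample rate (8000 Hz), 20 mel filters, 13 coefficients, the Hamming window, and all function names are assumptions chosen for illustration.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (one of several variants in use).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a Hamming-windowed frame."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, converted to fractional FFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [e * n_fft / sample_rate for e in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Hypothetical input: one frame of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
```

In a classification pipeline such as the one described, the per-frame coefficient vectors would typically be collected over an utterance and passed to the classifier; how the authors aggregate frames is not stated in the abstract.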


Author(s): Hongbing Zhang

Speech recognition has become one of the important technologies for human-computer interaction. It is essentially a process of speech training and pattern recognition, which makes feature extraction particularly important: the quality of feature extraction directly affects recognition accuracy. Dynamic feature parameters can effectively improve recognition accuracy, so dynamic speech feature extraction has high research value. Traditional dynamic feature extraction methods tend to generate redundant information, resulting in low recognition accuracy. Therefore, a speech feature extraction method based on deep learning is proposed. First, the speech signal is preprocessed by pre-emphasis, windowing, filtering, and endpoint detection. Then the sliding differential cepstral (SDC) feature, which contains information from the preceding and following frames, is extracted. Finally, this feature is used as input to a deep self-encoding (autoencoder) neural network, which extracts dynamic features that represent the deep essence of the speech information. Simulation results show that the dynamic features extracted by deep learning give better recognition performance than the original features and work well in speech recognition.
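Two of the steps in this abstract, pre-emphasis and SDC extraction, can be sketched briefly. This is an illustration under stated assumptions rather than the paper's method: the pre-emphasis coefficient 0.97, the SDC parameters d (delta spread), p (shift between blocks), and k (number of blocks), the clamping of out-of-range frames, and the function names are all assumptions. The feature is often called shifted delta cepstra elsewhere in the literature.

```python
def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].
    The first sample is passed through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]

def sdc(cepstra, d=1, p=3, k=3):
    """Stack k delta blocks per frame.
    Block i for frame t is c[t + i*p + d] - c[t + i*p - d], so each output
    vector carries information from frames before and after t.
    Frame indices outside the sequence are clamped to the nearest valid
    frame (an assumption; other edge policies are possible)."""
    n_frames = len(cepstra)

    def frame(t):
        return cepstra[min(max(t, 0), n_frames - 1)]

    out = []
    for t in range(n_frames):
        vec = []
        for i in range(k):
            ahead = frame(t + i * p + d)
            behind = frame(t + i * p - d)
            vec.extend(a - b for a, b in zip(ahead, behind))
        out.append(vec)
    return out

# Hypothetical input: 10 frames of 2-dimensional cepstra that grow linearly.
cepstra = [[float(t), 2.0 * t] for t in range(10)]
feats = sdc(cepstra, d=1, p=3, k=3)
emphasized = pre_emphasize([1.0, 1.0, 1.0], alpha=0.5)
```

Each output vector has k times the dimension of the input cepstra. In the approach the abstract describes, vectors like these would then be the input to the deep autoencoder; the network itself is not sketched here because the abstract gives no architecture details.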


2019, Vol 13 (6), pp. 863-872
Author(s): Sa'ed Abed, Bassam Jamil Mohd, Mohammad H. Al Shayeji
