Composite Feature Extraction for Speech Emotion Recognition

Author(s):  
Yangzhi Fu ◽  
Xiaochen Yuan
2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Chenchen Huang ◽  
Wei Gong ◽  
Wenlong Fu ◽  
Dongyu Feng

Feature extraction is a crucial step in speech emotion recognition. To address this problem, this paper proposed a new feature-extraction method that uses deep belief networks (DBNs) to extract emotional features from the speech signal automatically. A five-layer DBN was trained to extract speech emotion features, and multiple consecutive frames were combined to form a high-dimensional feature vector. The features learned by the DBN were then fed into a nonlinear SVM classifier, yielding a multi-classifier speech emotion recognition system. The recognition rate of the system reached 86.5%, which was 7% higher than the original method.
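A minimal sketch (not the authors' code) of the idea described above: greedily pre-train a stack of RBMs as a deep belief network, use the top-layer activations as learned emotion features, and classify them with a nonlinear (RBF-kernel) SVM. Layer sizes, hyperparameters, and the hypothetical feature loader are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.neural_network import BernoulliRBM
from sklearn.svm import SVC

def build_dbn_svm(n_hidden=(256, 128, 64)):
    """Stack RBMs (trained greedily, layer by layer, by Pipeline.fit)
    in front of an RBF-kernel SVM."""
    steps = [("scale", MinMaxScaler())]          # RBMs expect inputs in [0, 1]
    for i, n in enumerate(n_hidden):
        steps.append((f"rbm{i}", BernoulliRBM(n_components=n,
                                              learning_rate=0.05,
                                              n_iter=20)))
    steps.append(("svm", SVC(kernel="rbf", C=1.0)))
    return Pipeline(steps)

# X: frame-level acoustic features stacked over several consecutive frames,
# shape (n_samples, n_dims); y: emotion labels.
# X, y = load_emotion_features(...)   # hypothetical loader
# model = build_dbn_svm()
# model.fit(X, y)
```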


Author(s):  
Shreya Kumar ◽  
Swarnalaxmi Thiruvenkadam

Feature extraction is an integral part of speech emotion recognition. Some emotions become indistinguishable from others because their features are highly similar, which results in low prediction accuracy. This paper analyses the impact of the spectral contrast feature on increasing the accuracy for such emotions. The RAVDESS dataset was chosen for this study. The SAVEE, CREMA-D, and JL corpus datasets were also used to test performance across different English accents, and the EmoDB dataset was used to study performance in German. The use of the spectral contrast feature increased prediction accuracy appreciably, as it performs well in distinguishing emotions with significant differences in arousal level; this is discussed in detail.
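A brief sketch of how the spectral contrast feature discussed above can be extracted and pooled into a fixed-length vector per utterance, assuming the common librosa implementation rather than the authors' exact pipeline; the sample rate, band count, and pooling choice are assumptions.

```python
import numpy as np
import librosa

def spectral_contrast_vector(path, sr=22050, n_bands=6):
    """Return the mean and std of spectral contrast over time for one file."""
    y, sr = librosa.load(path, sr=sr)
    # shape: (n_bands + 1, n_frames) -- contrast per sub-band per frame
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr, n_bands=n_bands)
    return np.concatenate([contrast.mean(axis=1), contrast.std(axis=1)])

# Example: build a feature matrix for a list of RAVDESS-style wav files.
# features = np.stack([spectral_contrast_vector(f) for f in wav_paths])
```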


2020 ◽  
Vol 8 (5) ◽  
pp. 2266-2276 ◽  

Speech has long been the primary means of human communication, conveying meaning to the listener through voice and expression. For machines to interact with people through speech, however, machine learning and a range of recognition methods are required. With voice increasingly used as a biometric, speech recognition has become an important area of development. In this article, we explain a variety of speech and emotion recognition techniques and compare several methods based on existing, mostly speech-based, algorithms. We list and distinguish speech technologies with respect to specifications, databases, classification, feature extraction, enhancement, segmentation, and the overall speech emotion recognition process.


Author(s):  
Huiyun Zhang ◽  
Heming Huang ◽  
Henry Han

Speech emotion recognition remains a challenging task in natural language processing, placing strict demands on the effectiveness of both feature extraction and the acoustic model. With that in mind, a Heterogeneous Parallel Convolution Bi-LSTM model is proposed to address these challenges. It consists of two heterogeneous branches: the left branch contains two dense layers and a Bi-LSTM layer, while the right branch contains a dense layer, a convolution layer, and a Bi-LSTM layer. The model exploits spatiotemporal information more effectively and achieves 84.65%, 79.67%, and 56.50% unweighted average recall on the benchmark databases EMODB, CASIA, and SAVEE, respectively. Compared with previous research results, the proposed model stably achieves better performance.
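A hedged Keras sketch of the two-branch ("heterogeneous parallel") structure described above: one branch is dense, dense, Bi-LSTM; the other is dense, 1-D convolution, Bi-LSTM; and their outputs are fused before the softmax. Layer widths, kernel size, and input shape are assumptions and do not reproduce the paper's exact configuration.

```python
from tensorflow.keras import layers, models

def build_parallel_conv_bilstm(n_frames=300, n_feats=40, n_classes=7):
    inp = layers.Input(shape=(n_frames, n_feats))

    # Left branch: two frame-wise dense layers, then a Bi-LSTM.
    left = layers.Dense(128, activation="relu")(inp)
    left = layers.Dense(128, activation="relu")(left)
    left = layers.Bidirectional(layers.LSTM(64))(left)

    # Right branch: dense layer, 1-D convolution over time, then a Bi-LSTM.
    right = layers.Dense(128, activation="relu")(inp)
    right = layers.Conv1D(128, kernel_size=5, padding="same",
                          activation="relu")(right)
    right = layers.Bidirectional(layers.LSTM(64))(right)

    # Fuse the two heterogeneous branches and classify.
    merged = layers.concatenate([left, right])
    out = layers.Dense(n_classes, activation="softmax")(merged)
    return models.Model(inp, out)

# model = build_parallel_conv_bilstm()
# model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
```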


Automatic speech emotion recognition is essential for effective human-computer interaction. This paper is motivated by the use of spectrograms as inputs to a hybrid deep convolutional LSTM for speech emotion recognition. In this study, we trained the proposed model with four convolutional layers for high-level feature extraction from the input spectrograms, an LSTM layer for capturing long-term dependencies, and finally two dense layers. Experimental results on the SAVEE database show promising performance: the proposed model achieves an accuracy of 94.26%.
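A rough Keras sketch of the hybrid convolutional LSTM described above: four convolutional blocks over the input spectrogram, an LSTM over the resulting time axis, and two dense layers on top. The spectrogram size, filter counts, and pooling scheme are illustrative assumptions, not the authors' settings.

```python
from tensorflow.keras import layers, models

def build_conv_lstm(n_time=128, n_mels=128, n_classes=7):
    inp = layers.Input(shape=(n_time, n_mels, 1))       # (time, freq, channel)

    x = inp
    for filters in (16, 32, 64, 64):                     # four conv blocks
        x = layers.Conv2D(filters, (3, 3), padding="same",
                          activation="relu")(x)
        x = layers.MaxPooling2D((2, 2))(x)

    # Collapse frequency and channel axes so the LSTM runs over time.
    x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
    x = layers.LSTM(128)(x)

    # Two dense layers: one hidden, one softmax output.
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inp, out)

# model = build_conv_lstm()
# model.compile(optimizer="adam", loss="categorical_crossentropy",
#               metrics=["accuracy"])
```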

