A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition

In recent years, researchers have focused on improving the accuracy of speech emotion recognition. High accuracies have generally been obtained for two-class emotion recognition, but multi-class emotion recognition remains a challenging task. The main aim of this work is to propose a two-stage feature reduction using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to improve the accuracy of the speech emotion recognition (ER) system. Short-term speech features were extracted from emotional speech signals. Experiments were carried out using four different supervised classifiers with two different emotional speech databases. The experimental results indicate that the proposed method achieves accuracies of 87.48% for the speaker-dependent (SD) and gender-dependent (GD) ER experiments, 85.15% for the speaker-independent (SI) ER experiment, and 87.09% for the gender-independent (GI) experiment.
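The two-stage reduction described in this abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes NumPy is available, uses synthetic data in place of real short-term speech features, and the function names (`pca_fit`, `lda_fit`, `two_stage_reduce`) and the chosen dimensions are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def lda_fit(Z, y, n_components):
    """Fisher LDA: directions that maximize between-class over within-class scatter."""
    overall_mean = Z.mean(axis=0)
    d = Z.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # Generalized eigenproblem Sb w = lambda Sw w, solved via pinv(Sw) @ Sb.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs.real[:, order]

def two_stage_reduce(X, y, n_pca, n_lda):
    """Stage 1: PCA (unsupervised). Stage 2: LDA (supervised) on the PCA scores."""
    mean, P = pca_fit(X, n_pca)
    Z = (X - mean) @ P
    W = lda_fit(Z, y, n_lda)
    return Z @ W

# Synthetic stand-in for extracted features: 4 emotion classes, 20 features each.
rng = np.random.default_rng(0)
n_per_class, n_features, n_classes = 40, 20, 4
centers = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

reduced = two_stage_reduce(X, y, n_pca=10, n_lda=n_classes - 1)
print(reduced.shape)  # (160, 3)
```

LDA yields at most one fewer discriminant direction than there are classes, which is why the second stage here keeps `n_classes - 1` dimensions. The reduced features would then be passed to a classifier.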

A Multi-Scale Fusion Framework for Bimodal Speech Emotion Recognition

10.21437/interspeech.2020-3156 ◽ 2020

Author(s): Ming Chen ◽ Xudong Zhao

Keyword(s): Emotion Recognition ◽ Speech Emotion Recognition ◽ Multi Scale ◽ Fusion Framework

An optimal two stage feature selection for speech emotion recognition using acoustic features

International Journal of Speech Technology ◽ 10.1007/s10772-016-9358-0 ◽ 2016 ◽ Vol 19 (4) ◽ pp. 657-667 ◽ Cited By ~ 9

Author(s): Swarna Kuchibhotla ◽ Hima Deepthi Vankayalapati ◽ Koteswara Rao Anne

Keyword(s): Feature Selection ◽ Emotion Recognition ◽ Speech Emotion Recognition ◽ Acoustic Features ◽ Two Stage ◽ Selection For

Multi-classification speech emotion recognition based on two-stage bottleneck features selection and MCJD algorithm

Signal Image and Video Processing ◽ 10.1007/s11760-021-02076-0 ◽ 2022

Author(s): Linhui Sun ◽ Yiqing Huang ◽ Qiu Li ◽ Pingan Li

Keyword(s): Emotion Recognition ◽ Speech Emotion Recognition ◽ Features Selection ◽ Two Stage ◽ Multi Classification

Speech Emotion Recognition Based on Sparse Representation

Archives of Acoustics ◽ 10.2478/aoa-2013-0055 ◽ 2013 ◽ Vol 38 (4) ◽ pp. 465-470 ◽ Cited By ~ 11

Author(s): Jingjie Yan ◽ Xiaolan Wang ◽ Weiyi Gu ◽ LiLi Ma

Keyword(s): Dimensionality Reduction ◽ Emotion Recognition ◽ Least Squares ◽ Partial Least Squares ◽ Partial Least Squares Regression ◽ Speech Emotion Recognition ◽ Least Squares Regression ◽ Computer Science Pedagogy ◽ Reduction Methods ◽ Analysis Computer

Abstract: Speech emotion recognition is regarded as a meaningful yet difficult problem across a number of domains, including sentiment analysis, computer science, and pedagogy. In this study, we investigate in depth speech emotion recognition based on the sparse partial least squares regression (SPLSR) approach. We use SPLSR to perform feature selection and dimensionality reduction on the full set of acquired speech emotion features. With SPLSR, the components of redundant and uninformative speech emotion features are shrunk to zero, while useful and informative features are retained and passed to the subsequent classification step. Tests on the Berlin database show that the recognition rate of the SPLSR method reaches up to 79.23% and is superior to that of the other dimensionality reduction methods compared.
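The way a sparsity penalty drives uninformative feature weights to exactly zero can be sketched as below. This is a simplified, single-component illustration under stated assumptions, not the SPLSR procedure from the paper: it soft-thresholds the ordinary PLS weight vector, and the function names, the `eta` parameter, and the toy data are all hypothetical.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sparse_pls_direction(X, y, eta=0.3):
    """First sparse PLS weight vector (one component only, simplified).

    eta in [0, 1) sets the threshold as a fraction of the largest |weight|.
    """
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Ordinary PLS direction: covariance of each centered feature with the response.
    w = [
        sum((X[i][j] - x_means[j]) * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    threshold = eta * max(abs(v) for v in w)
    w_sparse = [soft_threshold(v, threshold) for v in w]
    norm = math.sqrt(sum(v * v for v in w_sparse))
    return [v / norm for v in w_sparse] if norm > 0 else w_sparse

# Toy data (assumed): features 0 and 1 track the response, 2 and 3 do not.
X = [
    [1.0, 2.0, 0.3, 0.1],
    [2.0, 4.1, 0.1, 0.2],
    [3.0, 6.2, 0.2, 0.1],
    [4.0, 7.9, 0.3, 0.2],
    [5.0, 10.0, 0.1, 0.1],
]
y = [1.0, 2.0, 3.0, 4.0, 5.0]

w = sparse_pls_direction(X, y, eta=0.3)
selected = [j for j, v in enumerate(w) if v != 0.0]
print(selected)  # [0, 1]
```

Features whose weight is zeroed drop out of the projection, so the same step performs feature selection and dimensionality reduction at once.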
