An Improved Speech Emotion Recognition Algorithm Based on Deep Belief Network

Purpose: Speech emotion recognition (SER) has become a major research topic in fields such as human–computer interaction and speech processing. It generally applies machine learning models to predict a speaker's emotional state from speech. Advanced SER applications have been successful in affective computing and human–computer interaction, which makes SER a main component of the next generation of computer systems. This is because a natural human–machine interface could provide automatic services, which require a better understanding of the user's emotional state.

Design/methodology/approach: This paper implements a new SER model that incorporates both gender and emotion recognition. Certain features are extracted and passed to a classifier to identify the emotion. A deep belief network (DBN) model is used for this classification.

Findings: The performance analysis shows that, in the best case, the developed method attains a higher accuracy than the other methods: it is 1.02% superior to the whale optimization algorithm (WOA), 0.32% superior to firefly (FF), 23.45% superior to particle swarm optimization (PSO) and 23.41% superior to the genetic algorithm (GA). In the worst case, the accuracy of the mean update of particle swarm and whale optimization (MUPW) is 15.63%, 15.98%, 16.06% and 16.03% superior to WOA, FF, PSO and GA, respectively. In the mean case, MUPW is 16.67%, 10.38%, 22.30% and 22.47% better than WOA, FF, PSO and GA, respectively.

Originality/value: This paper presents a new SER model that supports both gender and emotion recognition. A DBN is used for classification, and this is the first work to use the MUPW algorithm to find the optimal weights of the DBN model.
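The abstract does not give the MUPW update rule. Read literally, the name suggests averaging a particle swarm position update with a whale optimization position update, and the sketch below follows that reading purely as an assumption. The function names, the coefficients (0.7, 1.5), the simplified WOA step (no random-agent exploration branch) and the toy `sphere` objective, which stands in for a DBN training loss, are all hypothetical and are not taken from the paper.

```python
import math
import random


def sphere(x):
    """Toy objective standing in for a DBN training loss (assumption)."""
    return sum(v * v for v in x)


def mupw_minimize(fitness, dim, n_agents=20, n_iters=100,
                  bounds=(-1.0, 1.0), seed=0):
    """Hypothetical mean-of-PSO-and-WOA update; not the paper's exact rule."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_agents), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness seen so far, one entry per iteration

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # WOA coefficient, decays from 2 to 0
        for i in range(n_agents):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            spiral = rng.random() >= 0.5  # WOA picks spiral or encircling
            ell = rng.uniform(-1.0, 1.0)
            new = []
            for d in range(dim):
                # PSO candidate: inertia + cognitive + social terms.
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                x_pso = pos[i][d] + vel[i][d]
                # WOA candidate: spiral around, or encircle, the best agent.
                if spiral:
                    dist = abs(gbest[d] - pos[i][d])
                    x_woa = (dist * math.exp(ell)
                             * math.cos(2 * math.pi * ell) + gbest[d])
                else:
                    x_woa = gbest[d] - A * abs(C * gbest[d] - pos[i][d])
                # Assumed "mean update": average the two candidates, then clip.
                x_mean = 0.5 * (x_pso + x_woa)
                new.append(min(hi, max(lo, x_mean)))
            pos[i] = new
            f = fitness(new)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = new[:], f
                if f < gbest_fit:
                    gbest, gbest_fit = new[:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


best, best_fit, history = mupw_minimize(sphere, dim=5)
print(best_fit)
```

In an actual DBN setting, `fitness` would presumably evaluate classification error for a candidate weight vector; that wiring is left out here because the abstract does not describe it.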

Feature Learning via Deep Belief Network for Chinese Speech Emotion Recognition

Communications in Computer and Information Science - Pattern Recognition ◽

10.1007/978-981-10-3005-5_53 ◽

2016 ◽

pp. 645-651 ◽

Cited By ~ 2

Author(s):

Shiqing Zhang ◽

Xiaoming Zhao ◽

Yuelong Chuang ◽

Wenping Guo ◽

Ying Chen

Keyword(s):

Emotion Recognition ◽

Feature Learning ◽

Deep Belief Network ◽

Speech Emotion Recognition ◽

Belief Network

Speech Expression Multimodal Emotion Recognition Based on Deep Belief Network

Journal of Grid Computing ◽

10.1007/s10723-021-09564-0 ◽

2021 ◽

Vol 19 (2) ◽

Author(s):

Dong Liu ◽

Longxi Chen ◽

Zhiyong Wang ◽

Guangqiang Diao

Keyword(s):

Emotion Recognition ◽

Deep Belief Network ◽

Belief Network ◽

Multimodal Emotion Recognition

Hybrid Particle Swarm Optimization-Gravitational Search Algorithm based Deep Belief Network: Speech Emotion Recognition

Journal of Computational Mechanics, Power System and Control ◽

10.46253/jcmps.v4i3.a4 ◽

2021 ◽

Vol 4 (3) ◽

pp. 23-31

Author(s):

J Rajeshwar

Keyword(s):

Particle Swarm Optimization ◽

Emotion Recognition ◽

Search Algorithm ◽

Gravitational Search Algorithm ◽

Speech Emotion Recognition ◽

Swarm Optimization ◽

Hybrid Particle ◽

Belief Network ◽

Hybrid Particle Swarm Optimization ◽

Gravitational Search

Deep Learning based Speech Emotion Recognition System

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/121003 ◽

2021 ◽

Vol 23 (12) ◽

pp. 212-223

Author(s):

P Jothi Thilaga ◽

S Kavipriya ◽

K Vijayalakshmi ◽

...

Keyword(s):

Neural Network ◽

Decision Making ◽

Emotion Recognition ◽

Recognition System ◽

Recognition Algorithm ◽

Speech Emotion Recognition ◽

Natural Interaction ◽

Everyday Activities ◽

Verbal Content ◽

Voice Interaction

Emotions are fundamental to humans, affecting perception and everyday activities such as communication, learning and decision-making. Speech emotion recognition (SER) systems aim to enable natural interaction with machines through direct voice interaction, rather than through traditional input devices, so that the machine can understand verbal content and human listeners can respond easily. The SER system described here consists of two phases: feature extraction and feature classification. SER is deployed on bots so that they can communicate with humans in a non-lexical manner. The speech emotion recognition algorithm is based on a convolutional neural network (CNN) model, which uses several modules for emotion recognition and classifiers to distinguish emotions such as happiness, calm, anger, neutral state, sadness and fear. Classification performance depends on the extracted features. Finally, the emotion of a speech signal is determined.
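The two-phase structure described above (feature extraction, then classification) can be illustrated with a deliberately small sketch. It does not implement the CNN from the abstract: a nearest-centroid classifier is swapped in to keep the example self-contained, the two features (mean frame energy and zero-crossing rate) are illustrative choices, and the signals are synthetic. All function names are hypothetical.

```python
import math


def extract_features(signal, frame_len=4):
    """Phase 1: map a raw signal to (mean frame energy, zero-crossing rate)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energy = sum(sum(s * s for s in f) / frame_len for f in frames) / len(frames)
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(signal) - 1)
    return (energy, zcr)


def train_centroids(labelled):
    """Average the feature vectors of each emotion label."""
    sums = {}
    for signal, label in labelled:
        feats = extract_features(signal)
        total, count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = (tuple(t + f for t, f in zip(total, feats)), count + 1)
    return {label: tuple(t / n for t in total)
            for label, (total, n) in sums.items()}


def classify(signal, centroids):
    """Phase 2: pick the label whose centroid is closest in feature space."""
    feats = extract_features(signal)
    return min(centroids, key=lambda lab: math.dist(feats, centroids[lab]))


# Synthetic stand-ins for recordings: loud and rapidly alternating vs. quiet
# and slowly alternating. Real systems would use recorded, labelled speech.
training = [
    ([0.9, -0.9] * 8, "anger"),
    ([0.1, 0.1, -0.1, -0.1] * 4, "calm"),
]
centroids = train_centroids(training)
print(classify([0.8, -0.8] * 8, centroids))
print(classify([0.2, 0.2, -0.2, -0.2] * 4, centroids))
```

Keeping extraction and classification as separate functions mirrors the two phases named in the abstract, so a learned classifier such as a CNN could replace `classify` without touching the feature step.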