Emotion Analysis from Human Voice Using Various Prosodic Features and Text Analysis

Emotion Analysis is an active field of research that aims to recognize a person's emotions from their voice alone. It is better known as the Speech Emotion Recognition (SER) problem. The problem has been studied for more than a decade, with results coming from either Voice Analysis or Text Analysis. Individually, both methods have shown good accuracy, but using them together has produced better results than either method on its own. When people of different age groups are talking, it is important to understand the emotions behind what they say, as this in turn helps us respond better. To that end, the paper implements a model that performs Emotion Analysis based on both Tone and Text Analysis. The prosodic features of the tone are analyzed, and the speech is then converted to text. Sentiment Analysis is then performed on the extracted text to further improve the accuracy of the Emotion Recognition.
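
The abstract does not say how the tone-based and text-based predictions are combined. As a minimal sketch, assuming a simple weighted late fusion (an assumption for illustration, not the paper's stated method), the two per-emotion probability tables could be merged as below. The function names, the `tone_weight` parameter, and the example probabilities are all hypothetical.

```python
def fuse_predictions(tone_probs, text_probs, tone_weight=0.5):
    """Weighted average of two per-emotion probability tables (late fusion).

    tone_probs / text_probs map emotion labels to probabilities; a label
    missing from one table is treated as probability 0 there.
    """
    if not 0.0 <= tone_weight <= 1.0:
        raise ValueError("tone_weight must be between 0 and 1")
    labels = set(tone_probs) | set(text_probs)
    fused = {
        label: tone_weight * tone_probs.get(label, 0.0)
        + (1.0 - tone_weight) * text_probs.get(label, 0.0)
        for label in labels
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    # Renormalize so the fused scores sum to 1.
    return {label: score / total for label, score in fused.items()}


def predict_emotion(tone_probs, text_probs, tone_weight=0.5):
    """Return the emotion label with the highest fused score."""
    fused = fuse_predictions(tone_probs, text_probs, tone_weight)
    return max(fused, key=fused.get)


# Hypothetical classifier outputs for a single utterance.
tone = {"angry": 0.4, "neutral": 0.5, "happy": 0.1}  # from prosodic features
text = {"angry": 0.7, "neutral": 0.2, "happy": 0.1}  # from the transcript

print(predict_emotion(tone, text))                    # equal weighting
print(predict_emotion(tone, text, tone_weight=0.9))   # trust tone far more
```

With equal weights the text model's strong "angry" score outvotes the tone model's mild preference for "neutral"; raising `tone_weight` shifts the decision back toward the tone model.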

Speech emotion recognition using hybrid spectral-prosodic features of speech signal/glottal waveform, metaheuristic-based dimensionality reduction, and Gaussian elliptical basis function network classifier

Applied Acoustics ◽

10.1016/j.apacoust.2020.107360 ◽

2020 ◽

Vol 166 ◽

pp. 107360 ◽

Cited By ~ 4

Author(s):

Fatemeh Daneshfar ◽

Seyed Jahanshah Kabudian ◽

Abbas Neekabadi

Keyword(s):

Dimensionality Reduction ◽

Emotion Recognition ◽

Basis Function ◽

Speech Signal ◽

Speech Emotion Recognition ◽

Prosodic Features ◽

Glottal Waveform ◽

Function Network

Speech Emotion Recognition Using Both Spectral and Prosodic Features

2009 International Conference on Information Engineering and Computer Science ◽

10.1109/iciecs.2009.5362730 ◽

2009 ◽

Cited By ~ 35

Author(s):

Yu Zhou ◽

Yanqing Sun ◽

Jianping Zhang ◽

Yonghong Yan

Keyword(s):

Emotion Recognition ◽

Speech Emotion Recognition ◽

Prosodic Features ◽

Spectral And Prosodic Features

Speech Emotion Recognition and Sentiment Analysis based Therapist Bot

10.1109/icirca51532.2021.9544671 ◽

2021 ◽

Author(s):

Yashwardhan Bhangdia ◽

Rashi Bhansali ◽

Ninad Chaudhari ◽

Dimple Chandnani ◽

M L Dhore

Keyword(s):

Emotion Recognition ◽

Sentiment Analysis ◽

Speech Emotion Recognition

Study of Speech Emotion Recognition Based on Prosodic Parameters and Facial Expression Features

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.241-244.1677 ◽

2012 ◽

Vol 241-244 ◽

pp. 1677-1681

Author(s):

Yu Tai Wang ◽

Jie Han ◽

Xiao Qing Jiang ◽

Jing Zou ◽

Hui Zhao

Keyword(s):

Facial Expression ◽

Emotion Recognition ◽

Recognition Rate ◽

Single Mode ◽

Gaussian Mixture ◽

Speech Emotion Recognition ◽

Emotional States ◽

Prosodic Features ◽

Single Model ◽

Model Recognition

The paper introduces the present status of speech emotion recognition. Emotional databases of Chinese speech and facial expressions were established, using noise stimuli and movies to evoke the subjects' emotions. For different emotional states, we analyzed single-mode speech emotion recognition based on the prosodic features and the geometric features of facial expression. We then discussed bimodal emotion recognition using a Gaussian Mixture Model. The experimental results show that the bimodal emotion recognition rate, which combines speech with facial expression, is about 6% higher than the single-model recognition rate using prosodic features alone.
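
To make the Gaussian Mixture Model step concrete, here is a minimal sketch of GMM-based classification: one diagonal-covariance mixture per emotion, with a sample assigned to the emotion whose mixture gives it the highest log-likelihood. Concatenating prosodic and facial features into one vector is an assumption for illustration; the abstract does not state how the two modalities are combined. The model parameters, feature names, and function names are hypothetical, and a real system would fit the parameters from training data (for example with expectation-maximization).

```python
import math


def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            # Log of a univariate Gaussian density, summed over dimensions.
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def classify(x, models):
    """Pick the emotion whose GMM assigns x the highest log-likelihood."""
    return max(models, key=lambda label: diag_gmm_loglik(x, *models[label]))


# Hypothetical models over the concatenated vector
# [pitch z-score, energy z-score, mouth-opening ratio].
# Each entry is (component weights, component means, component variances).
MODELS = {
    "angry": (
        [0.6, 0.4],
        [[1.0, 1.2, 0.7], [1.5, 0.8, 0.9]],
        [[0.2, 0.2, 0.05], [0.3, 0.3, 0.05]],
    ),
    "sad": (
        [1.0],
        [[-1.0, -0.8, 0.2]],
        [[0.3, 0.3, 0.05]],
    ),
}

print(classify([1.1, 1.0, 0.75], MODELS))
print(classify([-0.9, -1.0, 0.25], MODELS))
```

Dropping the third feature from every vector and model would give the corresponding single-mode (prosody-only) classifier, which is the baseline the abstract compares against.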

A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e93.d.2813 ◽

2010 ◽

Vol E93-D (10) ◽

pp. 2813-2821 ◽

Cited By ~ 1

Author(s):

Yu ZHOU ◽

Junfeng LI ◽

Yanqing SUN ◽

Jianping ZHANG ◽

Yonghong YAN ◽

...

Keyword(s):

Emotion Recognition ◽

Recognition System ◽

Speech Emotion Recognition ◽

Prosodic Features ◽

Spectral And Prosodic Features

Speech Emotion Recognition Based on Hyper-Prosodic Features

2018 International Computers, Signals and Systems Conference (ICOMSSC) ◽

10.1109/icomssc45026.2018.8941666 ◽

2018 ◽

Author(s):

Jin Bicheng ◽

Liu Gang

Keyword(s):

Emotion Recognition ◽

Speech Emotion Recognition ◽

Prosodic Features

Speech Emotion Recognition Based on Hyper-Prosodic Features

2017 International Conference on Computer Technology, Electronics and Communication (ICCTEC) ◽

10.1109/icctec.2017.00027 ◽

2017 ◽

Author(s):

Bicheng Jin ◽

Gang Liu

Keyword(s):

Emotion Recognition ◽

Speech Emotion Recognition ◽

Prosodic Features

Analysis of Linguistic and Prosodic Features of Bilingual Arabic–English Speakers for Speech Emotion Recognition

IEEE Access ◽

10.1109/access.2020.2987864 ◽

2020 ◽

Vol 8 ◽

pp. 72957-72970 ◽

Cited By ~ 1

Author(s):

Lamiaa Abdel-Hamid ◽

Nabil H. Shaker ◽

Ingy Emara

Keyword(s):

Emotion Recognition ◽

Speech Emotion Recognition ◽

English Speakers ◽

Prosodic Features

A Generalizable Speech Emotion Recognition Model Reveals Depression and Remission

10.1101/2021.09.01.458536 ◽

2021 ◽

Author(s):

Lasse Hansen ◽

Yan-Ping Zhang ◽

Detlef Wolf ◽

Konstantinos Sechidis ◽

Nicolai Ladegaard ◽

...

Keyword(s):

Emotion Recognition ◽

Background Noise ◽

Small Sample ◽

Speech Emotion Recognition ◽

First Episode ◽

Control Group ◽

Group Model ◽

Healthy Controls ◽

Voice Analysis ◽

Recognition Model

Objective: Affective disorders have long been associated with atypical voice patterns; however, current work on automated voice analysis often suffers from small sample sizes and untested generalizability. This study investigated a generalizable approach to aid clinical evaluation of depression and remission from voice. Methods: A Mixture-of-Experts machine learning model was trained to infer happy/sad emotional state using three publicly available emotional speech corpora. We examined the model's ability to classify the presence of depression in Danish-speaking healthy controls (N = 42), patients with first-episode major depressive disorder (MDD) (N = 40), and the same patients in remission (N = 25), based on recorded clinical interviews. The model was evaluated on raw data, data cleaned for background noise, and speaker-diarized data. Results: The model showed reliable separation between healthy controls and depressed patients at the first visit, obtaining an AUC of 0.71. Further, we observed a reliable treatment effect in the depression group, with speech from patients in remission being indistinguishable from that of the control group. Model predictions were stable throughout the interview, suggesting that as little as 20-30 seconds of speech is enough to accurately screen a patient. Background noise (but not speaker diarization) heavily impacted predictions, suggesting that a controlled environment and consistent preprocessing pipelines are crucial for correct characterizations. Conclusion: A generalizable speech emotion recognition model can effectively reveal changes in speaker depressive states before and after treatment in patients with MDD. Data collection settings and data cleaning are crucial when considering automated voice analysis for clinical purposes.
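
As an aid to reading the reported AUC, the sketch below computes the area under the ROC curve from its pairwise definition: the fraction of (patient, control) pairs in which the patient receives the higher score, with ties counted as half. An AUC of 0.71 therefore means a randomly chosen depressed patient is scored above a randomly chosen control about 71% of the time. The function name and the labels and scores are made up for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: 1 for the positive class (e.g. depressed), 0 for the negative
    class (e.g. healthy control). scores: model outputs, higher = more
    likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up example: two patients (label 1) and two controls (label 0).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 3 of the 4 patient/control pairs are ordered correctly
```

The double loop is quadratic in the number of samples, which is fine at the cohort sizes quoted above; a rank-based formulation gives the same value more efficiently on large datasets.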

Attention and Feature Selection for Automatic Speech Emotion Recognition Using Utterance and Syllable-Level Prosodic Features

Circuits Systems and Signal Processing ◽

10.1007/s00034-020-01429-3 ◽

2020 ◽

Vol 39 (11) ◽

pp. 5681-5709

Author(s):

Starlet Ben Alex ◽

Leena Mary ◽

Ben P. Babu

Keyword(s):

Feature Selection ◽

Emotion Recognition ◽

Speech Emotion Recognition ◽

Prosodic Features ◽

Selection For
