A Generalizable Speech Emotion Recognition Model Reveals Depression and Remission

Author(s):  
Lasse Hansen ◽  
Yan-Ping Zhang ◽  
Detlef Wolf ◽  
Konstantinos Sechidis ◽  
Nicolai Ladegaard ◽  
...  
2020 ◽  
Vol 140 ◽  
pp. 358-365

Objective: Affective disorders have long been associated with atypical voice patterns; however, current work on automated voice analysis often suffers from small sample sizes and untested generalizability. This study investigated a generalizable approach to aid clinical evaluation of depression and remission from voice.

Methods: A Mixture-of-Experts machine learning model was trained to infer happy/sad emotional state using three publicly available emotional speech corpora. We examined the model's predictive ability to classify the presence of depression in Danish-speaking healthy controls (N = 42), patients with first-episode major depressive disorder (MDD) (N = 40), and the same patients in remission (N = 25), based on recorded clinical interviews. The model was evaluated on raw data, on data cleaned for background noise, and on speaker-diarized data.

Results: The model showed reliable separation between healthy controls and depressed patients at the first visit, obtaining an AUC of 0.71. Further, we observed a reliable treatment effect in the depression group, with speech from patients in remission being indistinguishable from that of the control group. Model predictions were stable throughout the interview, suggesting that as little as 20-30 seconds of speech is enough to accurately screen a patient. Background noise (but not speaker diarization) heavily impacted predictions, suggesting that a controlled environment and consistent preprocessing pipelines are crucial for correct characterizations.

Conclusion: A generalizable speech emotion recognition model can effectively reveal changes in speaker depressive states before and after treatment in patients with MDD. Data collection settings and data cleaning are crucial when considering automated voice analysis for clinical purposes.
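To make the reported evaluation concrete, the sketch below shows (under stated assumptions, not the authors' actual pipeline) how per-segment happy/sad emotion scores could be pooled into one score per speaker and how the group-separation AUC is computed. The rank-based `auc` function is the standard Mann-Whitney formulation of the metric the abstract reports (AUC of 0.71); the speaker data here are made up for illustration.

```python
# Illustrative sketch only: pooling segment-level emotion scores and
# computing a rank-based AUC between two groups. All data are invented.

def speaker_score(segment_scores):
    """Average per-segment 'sad' probabilities into one score per speaker."""
    return sum(segment_scores) / len(segment_scores)

def auc(neg_scores, pos_scores):
    """Rank-based AUC: probability that a randomly chosen positive
    (patient) outscores a randomly chosen negative (control); ties
    count half. Equivalent to the normalized Mann-Whitney U statistic."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical per-segment sad-probabilities for two controls and two patients
controls = [speaker_score(s) for s in ([0.2, 0.3, 0.25], [0.4, 0.35, 0.3])]
patients = [speaker_score(s) for s in ([0.6, 0.7, 0.65], [0.5, 0.45, 0.55])]
print(auc(controls, patients))  # prints 1.0: full separation in this toy data
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the paper's 0.71 indicates moderate but reliable discrimination between controls and depressed patients.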
