Emotion Recognition Involving Physiological and Speech Signals: A Comprehensive Review

Author(s):  
Mouhannad Ali ◽  
Ahmad Haj Mosa ◽  
Fadi Al Machot ◽  
Kyandoghere Kyamakya


2021 ◽  
Author(s):  
Talieh Seyed Tabtabae

Automatic Emotion Recognition (AER) is an emerging research area in the Human-Computer Interaction (HCI) field. As computers become ever more widespread, the study of interaction between humans (users) and computers is attracting growing attention. To make the interface between humans and computers more natural and friendly, it would be beneficial to give computers the ability to recognize situations the way a human does. Equipped with an emotion recognition system, computers can recognize their users' emotional state and respond appropriately. In today's HCI systems, machines can identify the speaker and the content of the speech using speech recognition and speaker identification techniques. If machines are also equipped with emotion recognition techniques, they can determine "how it is said," react more appropriately, and make the interaction more natural. One of the most important human communication channels is the auditory channel, which carries speech and vocal intonation; people can perceive each other's emotional state from the way they talk. Therefore, in this work, speech signals are analyzed in order to build an automatic system that recognizes the human emotional state. Six discrete emotional states are considered and categorized in this research: anger, happiness, fear, surprise, sadness, and disgust. A set of novel spectral features is proposed in this contribution, and two approaches are applied and compared. In the first approach, all the acoustic features are extracted from consecutive frames along the speech signals, and statistical values of these features constitute the feature vectors. A Support Vector Machine (SVM), a relatively new approach in the field of machine learning, is used to classify the emotional states. In the second approach, spectral features are extracted from non-overlapping, logarithmically spaced frequency sub-bands. In order to exploit all the extracted information, sequence-discriminant SVMs are adopted. The empirical results show that the employed techniques are very promising.
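As a rough illustration of the first approach, the sketch below pools frame-level spectral features into per-utterance statistics and classifies them with an SVM. The choice of librosa and scikit-learn, the MFCC feature set, and all parameter values are assumptions made for illustration, not the author's exact pipeline.

import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def utterance_features(path, sr=16000, n_mfcc=13):
    """Extract spectral features from consecutive frames and pool them."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, n_frames)
    # Statistical values over frames form the fixed-length feature vector.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# paths / labels are placeholders for an emotional-speech corpus covering the
# six states listed above (anger, happiness, fear, surprise, sadness, disgust).
# X = np.array([utterance_features(p) for p in paths])
# clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
# clf.fit(X, labels)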


2011 ◽  
Vol 121-126 ◽  
pp. 815-819 ◽  
Author(s):  
Yu Qiang Qin ◽  
Xue Ying Zhang

Ensemble empirical mode decomposition (EEMD) is a newly developed method aimed at eliminating the mode mixing present in the original empirical mode decomposition (EMD). To evaluate the performance of this new method, this paper investigates the effect of two parameters pertinent to EEMD: the emotional envelope and the number of emotional ensemble trials. The proposed technique is applied to four kinds of emotional speech signals (angry, happy, sad, and neutral), and the number of ensemble trials is computed for each emotion. An emotional envelope is obtained by transforming the IMFs of the emotional speech signals, yielding a new emotion recognition method based on the different emotional envelopes and ensemble trial counts.
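For concreteness, a minimal sketch of the decomposition step is given below, assuming the PyEMD package (distributed on PyPI as EMD-signal) and a Hilbert-transform envelope for each IMF; the trial counts shown are illustrative, not the values computed in the paper.

import numpy as np
from PyEMD import EEMD
from scipy.signal import hilbert

def eemd_envelopes(signal, trials=100, noise_width=0.05):
    """Decompose one utterance with EEMD and return the IMF envelopes."""
    eemd = EEMD(trials=trials, noise_width=noise_width)
    imfs = eemd.eemd(signal)              # shape: (n_imfs, n_samples)
    # Amplitude envelope of each IMF via the analytic signal.
    return np.abs(hilbert(imfs, axis=1))

# Example: vary the ensemble size for one emotional utterance
# (angry, happy, sad, or neutral) and compare the resulting envelopes.
# env_50  = eemd_envelopes(signal, trials=50)
# env_200 = eemd_envelopes(signal, trials=200)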


2007 ◽  
Vol 16 (06) ◽  
pp. 1001-1014 ◽  
Author(s):  
PANAGIOTIS ZERVAS ◽  
IOSIF MPORAS ◽  
NIKOS FAKOTAKIS ◽  
GEORGE KOKKINAKIS

This paper presents and discusses the problem of emotion recognition from speech signals using features that carry intonational information. In particular, parameters extracted from Fujisaki's model of intonation are presented and evaluated. Machine learning models were built using the C4.5 decision tree inducer, an instance-based learner, and Bayesian learning. The datasets used for training the machine learning models were extracted from two emotional databases of acted speech. Experimental results showed the effectiveness of the Fujisaki-model attributes: they enhanced the recognition process for most of the emotion categories and learning approaches, helping to separate the emotion categories.
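The sketch below mirrors this learner comparison with scikit-learn stand-ins: CART for the C4.5 tree inducer, k-NN for the instance-based learner, and Gaussian naive Bayes for Bayesian learning. The feature matrix X is assumed to hold precomputed Fujisaki-model parameters (phrase and accent command attributes) per utterance; both X and the cross-validation setup are assumptions, not the paper's released materials.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

learners = {
    "decision tree (C4.5-like)": DecisionTreeClassifier(),
    "instance-based (k-NN)":     KNeighborsClassifier(n_neighbors=5),
    "Bayesian (Gaussian NB)":    GaussianNB(),
}

# X: Fujisaki intonation attributes per utterance; y: emotion labels.
for name, clf in learners.items():
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")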


2008 ◽  
Vol 18 (4) ◽  
pp. 494-500 ◽  
Author(s):  
Byung-Wook Jung ◽  
Seung-Pyo Cheun ◽  
Youn-Tae Kim ◽  
Sung-Shin Kim

Author(s):  
Semiye Demircan ◽  
Humar Kahramanli

Emotion Recognition (ER) from speech signals has recently been an attractive research subject. Feature extraction and feature selection are the most important processing steps in ER from speech signals. The aim of the present study is to select the most relevant spectral feature subset. The proposed method is based on feature selection with an optimization algorithm applied to the features obtained from speech signals. First, MFCCs (Mel-Frequency Cepstrum Coefficients) were extracted from the EmoDB corpus, and seven statistics (maximum, minimum, mean, standard deviation, skewness, kurtosis, and median) were computed from the MFCCs. Feature selection was then performed in two stages: in the first stage, ABM (Agent-Based Modelling), which has rarely been applied in this area, was applied to the actual features; in the second stage, the Opt-aiNET optimization algorithm was applied to choose the agent group giving the best classification success. The final step of the study is classification: an ANN (Artificial Neural Network) with 10-fold cross-validation was used for classification and evaluation. A restricted experiment with three emotions was performed in the application. As a result, the classification accuracy rose after applying the proposed method, and the method showed promising performance with spectral features.
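A minimal sketch of the extraction and evaluation stages is given below: the seven MFCC statistics named above, pooled per utterance, then an ANN scored with 10-fold cross-validation. The ABM/Opt-aiNET selection stage is not reproduced here, and librosa, scipy, and scikit-learn are assumed library choices rather than the study's tooling.

import numpy as np
import librosa
from scipy.stats import skew, kurtosis
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def mfcc_statistics(path, n_mfcc=13):
    """Seven statistics over MFCC frames, concatenated per utterance."""
    y, sr = librosa.load(path, sr=16000)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, n_frames)
    return np.concatenate([m.max(axis=1), m.min(axis=1), m.mean(axis=1),
                           m.std(axis=1), skew(m, axis=1), kurtosis(m, axis=1),
                           np.median(m, axis=1)])

# paths / labels: EmoDB utterances restricted to three emotions, as in the study.
# X = np.array([mfcc_statistics(p) for p in paths])
# clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000)
# print(cross_val_score(clf, X, labels, cv=10).mean())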

