Human-Machine Interaction Personalization: A Review on Gender and Emotion Recognition Through Speech Analysis

Author(s):  
Monica La Mura ◽  
Patrizia Lamberti

Emotion recognition is a rapidly growing research field. Emotions can be effectively expressed through speech and can provide insight into a speaker's intentions. Although humans can easily interpret emotions from speech, physical gestures, and eye movements, training a machine to do the same with comparable precision is a challenging task. Speech emotion recognition (SER) systems can improve human-machine interaction when used alongside automatic speech recognition, as emotions can change the semantics of a sentence. Many researchers have contributed impressive work to this research area, leading to the development of numerous classification techniques, feature selection and extraction methods, and emotional speech databases. This paper reviews recent accomplishments in the area of speech emotion recognition. It also presents a detailed review of various types of emotional speech databases, different classification techniques that can be used individually or in combination, and a brief description of various speech features for emotion recognition.
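As a concrete illustration of the kind of speech features such reviews catalog, here is a minimal extraction sketch in Python, assuming the librosa package is installed; the file path and the exact feature set are illustrative choices, not prescriptions from the reviewed literature.

```python
# Minimal sketch of common acoustic feature extraction for SER.
# "speech.wav" is a placeholder path, not a file from any cited corpus.
import numpy as np
import librosa

def extract_features(path: str) -> np.ndarray:
    """Return a fixed-length, utterance-level speech feature vector."""
    y, sr = librosa.load(path, sr=16000)                 # resample to 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # spectral envelope
    energy = librosa.feature.rms(y=y)                    # frame energy
    zcr = librosa.feature.zero_crossing_rate(y)          # noisiness proxy
    # Summarize frame-level features with mean and std over time.
    feats = [mfcc, energy, zcr]
    return np.concatenate([np.r_[f.mean(axis=1), f.std(axis=1)] for f in feats])

vector = extract_features("speech.wav")  # e.g. input to an SVM or neural classifier
print(vector.shape)
```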


Author(s):  
Hai-Duong Nguyen ◽  
Soonja Yeom ◽  
Guee-Sang Lee ◽  
Hyung-Jeong Yang ◽  
In-Seop Na ◽  
...  

Emotion recognition plays an indispensable role in human–machine interaction systems. The process involves finding interesting facial regions in images and classifying them into one of seven classes: angry, disgust, fear, happy, neutral, sad, and surprise. Although many breakthroughs have been made in image classification, and especially in facial expression recognition, this research area remains challenging under in-the-wild sampling conditions. In this paper, we used multi-level features in a convolutional neural network for facial expression recognition. Based on our observations, we introduced various network connections to improve the classification task. By combining the proposed network connections, our method achieved competitive results compared to state-of-the-art methods on the FER2013 dataset.
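For readers unfamiliar with multi-level features, the following is a hedged PyTorch sketch of the general idea: pooled feature maps from several network depths are concatenated before classification. The layer sizes and connections are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch of multi-level feature fusion for 7-class facial expression
# recognition. Layer widths are illustrative, not the paper's design.
import torch
import torch.nn as nn

class MultiLevelCNN(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block3 = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pool at each level
        # Fuse pooled features from all three depths before classifying.
        self.classifier = nn.Linear(32 + 64 + 128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f1 = self.block1(x)
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        fused = torch.cat([self.pool(f).flatten(1) for f in (f1, f2, f3)], dim=1)
        return self.classifier(fused)

logits = MultiLevelCNN()(torch.randn(1, 1, 48, 48))  # FER2013 images are 48x48 grayscale
```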


2020 ◽ Vol 7
Author(s):  
Matteo Spezialetti ◽  
Giuseppe Placidi ◽  
Silvia Rossi

A fascinating challenge in the field of human–robot interaction is the possibility to endow robots with emotional intelligence in order to make the interaction more intuitive, genuine, and natural. To achieve this, a critical point is the capability of the robot to infer and interpret human emotions. Emotion recognition has been widely explored in the broader fields of human–machine interaction and affective computing. Here, we report recent advances in emotion recognition, with particular regard to the human–robot interaction context. Our aim is to review the state of the art of currently adopted emotional models, interaction modalities, and classification strategies and offer our point of view on future developments and critical issues. We focus on facial expressions, body poses and kinematics, voice, brain activity, and peripheral physiological responses, also providing a list of available datasets containing data from these modalities.


Author(s):  
Nagaraja N Poojary ◽  
Dr. Shivakumar G S ◽  
Akshath Kumar B.H

Language is humans' most important means of communication, and speech is its basic medium. Emotion plays a crucial role in social interaction, and recognizing emotion in speech is both important and challenging because it concerns human-machine interaction. Emotional expression varies from person to person: even the same person expresses different emotions with different energy, pitch, and tone variation depending on the subject. Speech emotion recognition is therefore an ongoing goal for machine perception. The aim of our project is to develop a smart speech emotion recognition system based on a convolutional neural network, which uses different modules for emotion recognition and a classifier to differentiate emotions such as happy, sad, angry, and surprise. The machine converts the human speech signal into a waveform, processes it, and finally displays the detected emotion. The data are speech samples, and their characteristics are extracted using the librosa package. We use the RAVDESS dataset as our experimental dataset. This study shows that, on our dataset, all classifiers achieve an accuracy of 68%.
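A minimal sketch of the pipeline described above, assuming librosa for feature extraction and PyTorch for the CNN; the emotion subset, the filename, and the network shape are illustrative stand-ins, not the project's actual code (RAVDESS filenames encode the emotion label in their third field).

```python
# Sketch: MFCC frames from librosa fed to a small 1-D CNN classifier.
import librosa
import torch
import torch.nn as nn

EMOTIONS = ["happy", "sad", "angry", "surprise"]  # illustrative subset

def mfcc_frames(path: str, n_mfcc: int = 40, frames: int = 128) -> torch.Tensor:
    y, sr = librosa.load(path, sr=22050)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    m = librosa.util.fix_length(m, size=frames, axis=1)  # pad/trim time axis
    return torch.from_numpy(m).float().unsqueeze(0)      # (1, n_mfcc, frames)

model = nn.Sequential(                                   # (batch, 40, 128) -> logits
    nn.Conv1d(40, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, len(EMOTIONS)),
)

logits = model(mfcc_frames("03-01-03-01-01-01-01.wav"))  # example RAVDESS filename
print(EMOTIONS[logits.argmax(dim=1).item()])
```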


Author(s):  
Rama Chaudhary ◽  
Ram Avtar Jaswal

In modern times, human-machine interaction technology has advanced considerably in recognizing human emotional states from physiological signals. Emotional states can be recognized from facial expressions, but this does not always give accurate results. For example, a sad facial expression may also reflect frustration, irritation, or anger, so the specific underlying emotion cannot always be determined from the face alone. Therefore, emotion recognition using the electroencephalogram (EEG) and electrocardiogram (ECG) has attracted much attention, since these are based on brain and heart signals, respectively. After analyzing all these factors, we decided to recognize emotional states from EEG using the DEAP dataset, so that better accuracy can be achieved.
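As an illustration of a common DEAP-style workflow (not the authors' specific method), the sketch below computes per-channel EEG band-power features with SciPy and fits a simple scikit-learn classifier. The 32 channels and 128 Hz rate match DEAP's preprocessed release, but the data here is a random placeholder.

```python
# Sketch: band-power features from EEG epochs, then a simple classifier.
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_powers(eeg: np.ndarray, fs: int = 128) -> np.ndarray:
    """eeg: (channels, samples) -> (channels * n_bands,) feature vector."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2, axis=1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, mask].mean(axis=1))  # mean power per channel/band
    return np.concatenate(feats)

rng = np.random.default_rng(0)
# 40 placeholder trials of 60 s, 32 channels, standing in for DEAP epochs.
X = np.stack([band_powers(rng.standard_normal((32, 128 * 60))) for _ in range(40)])
y = rng.integers(0, 2, size=40)  # e.g. high vs. low valence labels
clf = SVC().fit(X, y)            # any classifier of choice fits here
```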


Author(s):  
Sanghamitra Mohanty ◽  
Basanta Kumar Swain

Communication is intelligible when the conveyed message is interpreted correctly. Unfortunately, such correct interpretation comes naturally in human-human communication but remains laborious for human-machine communication, because non-verbal content such as emotion is inherently blended into vocal communication, which makes human-machine interaction difficult. In this research paper we performed experiments to recognize the emotions anger, sadness, astonishment, fear, happiness, and neutral using the fuzzy K-means algorithm on elicited Oriya speech collected from 35 Oriya-speaking people aged 22-58 years, belonging to different provinces of Orissa. We achieved an accuracy of 65.16% in recognizing the six emotions mentioned above, using mean pitch, the first two formants, jitter, shimmer, and energy as feature vectors. Emotion recognition has many vivid applications in domains such as call centers, spoken tutoring systems, spoken dialogue research, and human-robot interfaces.
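For orientation, here is a minimal fuzzy c-means sketch (the soft-assignment clustering commonly also called fuzzy K-means) on placeholder 6-dimensional feature vectors standing in for mean pitch, the first two formants, jitter, shimmer, and energy; it is a generic illustration, not the paper's implementation.

```python
# Sketch of fuzzy c-means: soft cluster memberships updated iteratively.
import numpy as np

def fuzzy_c_means(X, c=6, m=2.0, n_iter=100, seed=0):
    """X: (n_samples, n_features). Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)           # memberships sum to 1 per sample
    for _ in range(n_iter):
        w = u ** m                               # fuzzified weights
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-9
        u = 1.0 / (d ** (2 / (m - 1)))           # standard FCM membership update
        u /= u.sum(axis=1, keepdims=True)
    return centers, u

X = np.random.default_rng(1).random((200, 6))    # 200 utterances x 6 features
centers, u = fuzzy_c_means(X)                    # one cluster per target emotion
labels = u.argmax(axis=1)                        # hard assignment when needed
```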

