Multimodal Emotion Recognition With Transformer-Based Self Supervised Feature Fusion

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 176274-176285
Author(s):  
Shamane Siriwardhana ◽  
Tharindu Kaluarachchi ◽  
Mark Billinghurst ◽  
Suranga Nanayakkara
2018 ◽  
Vol 174 ◽  
pp. 33-42 ◽  
Author(s):  
Dung Nguyen ◽  
Kien Nguyen ◽  
Sridha Sridharan ◽  
David Dean ◽  
Clinton Fookes

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yifeng Zhao ◽  
Deyun Chen

Due to the complexity of human emotions, the features of different emotions share some similarities. Existing emotion recognition methods suffer from difficult feature extraction and low accuracy, so an expression-EEG multimodal emotion recognition method based on a bidirectional LSTM and an attention mechanism is proposed. First, facial expression features are extracted with a bilinear convolutional network (BCN), EEG signals are transformed into three groups of frequency-band image sequences, and the BCN fuses the image features to obtain multimodal expression-EEG emotion features. Then, an LSTM with an attention mechanism extracts the important data during temporal modeling, which effectively avoids the randomness or blindness of sampling methods. Finally, a feature fusion network with a three-layer bidirectional LSTM structure fuses the expression and EEG features, which helps improve the accuracy of emotion recognition. The proposed method is evaluated on the MAHNOB-HCI and DEAP datasets using a MATLAB simulation platform. Experimental results show that the attention mechanism enhances the visual effect of the image and that, compared with other methods, the proposed approach extracts emotion features from expressions and EEG signals more effectively and achieves higher emotion recognition accuracy.
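To make the fusion step above concrete, the following is a minimal sketch (not the authors' MATLAB implementation) of how expression and EEG feature sequences could be fused with a three-layer bidirectional LSTM and a soft attention pool; all dimensions, layer sizes, and the AttentionPool/BiEEGFusion names are illustrative assumptions.

```python
# Minimal sketch of attention-weighted BiLSTM fusion for expression and EEG
# feature sequences. Dimensions and class names are illustrative assumptions,
# not the paper's released implementation.
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Soft attention over time steps: returns a weighted sum of the sequence."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                       # x: (batch, time, dim)
        weights = torch.softmax(self.score(x), dim=1)
        return (weights * x).sum(dim=1)         # (batch, dim)


class BiEEGFusion(nn.Module):
    """Fuse facial-expression and EEG feature sequences with a 3-layer BiLSTM."""

    def __init__(self, face_dim=256, eeg_dim=128, hidden=128, n_classes=4):
        super().__init__()
        self.fusion = nn.LSTM(face_dim + eeg_dim, hidden, num_layers=3,
                              bidirectional=True, batch_first=True)
        self.attend = AttentionPool(2 * hidden)
        self.classify = nn.Linear(2 * hidden, n_classes)

    def forward(self, face_seq, eeg_seq):       # both (batch, time, dim)
        fused, _ = self.fusion(torch.cat([face_seq, eeg_seq], dim=-1))
        return self.classify(self.attend(fused))


if __name__ == "__main__":
    model = BiEEGFusion()
    face = torch.randn(8, 20, 256)              # e.g. 20 frames of BCN features
    eeg = torch.randn(8, 20, 128)               # aligned EEG band-image features
    print(model(face, eeg).shape)               # torch.Size([8, 4])
```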


2021 ◽  
Author(s):  
Kevin Tang

In this thesis, we propose Protected Multimodal Emotion Recognition (PMM-ER), an emotion recognition approach that includes security features against the growing rate of cyber-attacks on various databases, including emotion databases. An analysis of frequently used encryption algorithms led to the modified encryption algorithm proposed in this work. The system recognizes seven emotional states, i.e. happiness, sadness, surprise, fear, disgust, and anger, as well as a neutral state, based on 2D video frames, 3D vertices, and audio waveform information. Several well-known features are employed, including the HSV colour feature, iterative closest point (ICP), and Mel-frequency cepstral coefficients (MFCCs). We also propose a novel approach to feature fusion covering both decision- and feature-level fusion, and well-known classification and feature extraction algorithms such as principal component analysis (PCA), linear discriminant analysis (LDA), and canonical correlation analysis (CCA) are compared in this study.
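As a rough illustration of the two fusion levels mentioned above, the sketch below combines synthetic stand-ins for visual and audio (MFCC-like) features through CCA plus PCA at the feature level, and averages per-modality classifier posteriors at the decision level; the array shapes, classifiers, and class count are assumptions, not details from the thesis.

```python
# Hedged sketch of feature-level (CCA + PCA) and decision-level (posterior
# averaging) fusion on synthetic stand-ins for visual and audio features.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 7, n)                  # 7 emotion classes (assumption)
visual = rng.normal(size=(n, 64))               # e.g. HSV / geometry features
audio = rng.normal(size=(n, 39))                # e.g. MFCC statistics

# Feature-level fusion: project both views into a shared CCA space, then PCA.
cca = CCA(n_components=8).fit(visual, audio)
v_c, a_c = cca.transform(visual, audio)
fused = PCA(n_components=8).fit_transform(np.hstack([v_c, a_c]))
clf_fused = LogisticRegression(max_iter=1000).fit(fused, labels)

# Decision-level fusion: train one classifier per modality, average posteriors.
clf_v = LogisticRegression(max_iter=1000).fit(visual, labels)
clf_a = LogisticRegression(max_iter=1000).fit(audio, labels)
posterior = (clf_v.predict_proba(visual) + clf_a.predict_proba(audio)) / 2
print(posterior.argmax(axis=1)[:10])            # fused decision-level labels
```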




2021 ◽  
Vol 25 (4) ◽  
pp. 1031-1045
Author(s):  
Helang Lai ◽  
Keke Wu ◽  
Lingli Li

Emotion recognition in conversations is crucial, as there is an urgent need to improve the overall experience of human-computer interaction. A promising direction in this field is to develop a model that can effectively extract adequate context for a test utterance. We introduce a novel model, termed hierarchical memory networks (HMN), to address the issue of recognizing utterance-level emotions. HMN divides the context into different aspects and employs different step lengths to represent the weights of these aspects. To model self-dependencies, HMN uses independent local memory networks for these aspects. Further, to capture interpersonal dependencies, HMN employs global memory networks that integrate the local outputs into global storages. These storages generate contextual summaries and help to find the emotionally dependent utterance most relevant to the test utterance. With an attention-based multi-hop scheme, the storages are then merged with the test utterance through an addition operation over the iterations. Experiments on the IEMOCAP dataset show that our model outperforms the compared methods in accuracy.
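The attention-based multi-hop merge described above can be sketched roughly as follows: at each hop, the test-utterance query attends over stored context vectors and is updated by adding the attended summary. The MemoryHops name, dimensions, and hop count are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of an attention-based multi-hop memory read with an
# addition-based query update, assuming hypothetical dimensions and hop count.
import torch
import torch.nn as nn


class MemoryHops(nn.Module):
    def __init__(self, dim=100, hops=3):
        super().__init__()
        self.hops = hops
        self.proj = nn.Linear(dim, dim)

    def forward(self, query, memory):            # query: (B, D), memory: (B, T, D)
        for _ in range(self.hops):
            scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)   # (B, T)
            attn = torch.softmax(scores, dim=-1)
            summary = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)     # (B, D)
            query = query + self.proj(summary)    # addition-based merge per hop
        return query


if __name__ == "__main__":
    reader = MemoryHops()
    utterance = torch.randn(4, 100)               # test-utterance representation
    context = torch.randn(4, 12, 100)             # stored contextual summaries
    print(reader(utterance, context).shape)       # torch.Size([4, 100])
```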

