Multimodal Emotion Recognition With Transformer-Based Self Supervised Feature Fusion

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 176274-176285
Author(s):  
Shamane Siriwardhana ◽  
Tharindu Kaluarachchi ◽  
Mark Billinghurst ◽  
Suranga Nanayakkara
2018 ◽  
Vol 174 ◽  
pp. 33-42 ◽  
Author(s):  
Dung Nguyen ◽  
Kien Nguyen ◽  
Sridha Sridharan ◽  
David Dean ◽  
Clinton Fookes

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yifeng Zhao ◽  
Deyun Chen

Due to the complexity of human emotions, the features of different emotions share some similarities. Existing emotion recognition methods suffer from difficult feature extraction and low accuracy, so an expression-EEG multimodal emotion recognition method based on a bidirectional LSTM and an attention mechanism is proposed. First, facial expression features are extracted with a bilinear convolutional network (BCN), EEG signals are transformed into three groups of frequency-band image sequences, and the BCN fuses the image features to obtain multimodal expression-EEG emotion features. Then, an LSTM with an attention mechanism extracts the important data during temporal modeling, which effectively avoids the randomness or blindness of sampling methods. Finally, a feature fusion network with a three-layer bidirectional LSTM structure fuses the expression and EEG features, which helps improve the accuracy of emotion recognition. The proposed method is evaluated on the MAHNOB-HCI and DEAP datasets using a MATLAB simulation platform. Experimental results show that the attention mechanism enhances the visual effect of the image and that, compared with other methods, the proposed approach extracts emotion features from expressions and EEG signals more effectively and achieves higher emotion recognition accuracy.
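To make the fusion step above concrete, the following is a minimal sketch (not the authors' MATLAB implementation) of how expression and EEG feature sequences could be fused with a three-layer bidirectional LSTM and a soft attention pool; all dimensions, layer sizes, and the AttentionPool/BiEEGFusion names are illustrative assumptions.

```python
# Minimal sketch of attention-weighted BiLSTM fusion for expression and EEG
# feature sequences. Dimensions and class names are illustrative assumptions,
# not the paper's released implementation.
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Soft attention over time steps: returns a weighted sum of the sequence."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                       # x: (batch, time, dim)
        weights = torch.softmax(self.score(x), dim=1)
        return (weights * x).sum(dim=1)         # (batch, dim)


class BiEEGFusion(nn.Module):
    """Fuse facial-expression and EEG feature sequences with a 3-layer BiLSTM."""

    def __init__(self, face_dim=256, eeg_dim=128, hidden=128, n_classes=4):
        super().__init__()
        self.fusion = nn.LSTM(face_dim + eeg_dim, hidden, num_layers=3,
                              bidirectional=True, batch_first=True)
        self.attend = AttentionPool(2 * hidden)
        self.classify = nn.Linear(2 * hidden, n_classes)

    def forward(self, face_seq, eeg_seq):       # both (batch, time, dim)
        fused, _ = self.fusion(torch.cat([face_seq, eeg_seq], dim=-1))
        return self.classify(self.attend(fused))


if __name__ == "__main__":
    model = BiEEGFusion()
    face = torch.randn(8, 20, 256)              # e.g. 20 frames of BCN features
    eeg = torch.randn(8, 20, 128)               # aligned EEG band-image features
    print(model(face, eeg).shape)               # torch.Size([8, 4])
```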


2021 ◽  
Author(s):  
Kevin Tang

In this thesis, we propose Protected Multimodal Emotion Recognition (PMM-ER), an emotion recognition approach that includes security features against the growing rate of cyber-attacks on various databases, including emotion databases. An analysis of frequently used encryption algorithms led to the modified encryption algorithm proposed in this work. The system recognizes seven emotional states, i.e. happiness, sadness, surprise, fear, disgust, and anger, as well as a neutral state, based on 2D video frames, 3D vertices, and audio waveform information. Several well-known features are employed, including the HSV colour feature, iterative closest point (ICP), and Mel-frequency cepstral coefficients (MFCCs). We also propose a novel approach to feature fusion covering both decision- and feature-level fusion, and well-known classification and feature extraction algorithms such as principal component analysis (PCA), linear discriminant analysis (LDA), and canonical correlation analysis (CCA) are compared in this study.
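As a rough illustration of the two fusion levels mentioned above, the sketch below combines synthetic stand-ins for visual and audio (MFCC-like) features through CCA plus PCA at the feature level, and averages per-modality classifier posteriors at the decision level; the array shapes, classifiers, and class count are assumptions, not details from the thesis.

```python
# Hedged sketch of feature-level (CCA + PCA) and decision-level (posterior
# averaging) fusion on synthetic stand-ins for visual and audio features.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 7, n)                  # 7 emotion classes (assumption)
visual = rng.normal(size=(n, 64))               # e.g. HSV / geometry features
audio = rng.normal(size=(n, 39))                # e.g. MFCC statistics

# Feature-level fusion: project both views into a shared CCA space, then PCA.
cca = CCA(n_components=8).fit(visual, audio)
v_c, a_c = cca.transform(visual, audio)
fused = PCA(n_components=8).fit_transform(np.hstack([v_c, a_c]))
clf_fused = LogisticRegression(max_iter=1000).fit(fused, labels)

# Decision-level fusion: train one classifier per modality, average posteriors.
clf_v = LogisticRegression(max_iter=1000).fit(visual, labels)
clf_a = LogisticRegression(max_iter=1000).fit(audio, labels)
posterior = (clf_v.predict_proba(visual) + clf_a.predict_proba(audio)) / 2
print(posterior.argmax(axis=1)[:10])            # fused decision-level labels
```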




2021 ◽  
Vol 25 (4) ◽  
pp. 1031-1045
Author(s):  
Helang Lai ◽  
Keke Wu ◽  
Lingli Li

Emotion recognition in conversations is crucial, as there is an urgent need to improve the overall experience of human-computer interaction. A promising direction in this field is to develop a model that can effectively extract adequate context for a test utterance. We introduce a novel model, termed hierarchical memory networks (HMN), to address the issue of recognizing utterance-level emotions. HMN divides the context into different aspects and employs different step lengths to represent the weights of these aspects. To model self-dependencies, HMN uses independent local memory networks for these aspects. Further, to capture interpersonal dependencies, HMN employs global memory networks that integrate the local outputs into global storages. These storages generate contextual summaries and help to find the emotionally dependent utterance most relevant to the test utterance. With an attention-based multi-hop scheme, the storages are then merged with the test utterance through an addition operation over the iterations. Experiments on the IEMOCAP dataset show that our model outperforms the compared methods in accuracy.
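The attention-based multi-hop merge described above can be sketched roughly as follows: at each hop, the test-utterance query attends over stored context vectors and is updated by adding the attended summary. The MemoryHops name, dimensions, and hop count are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of an attention-based multi-hop memory read with an
# addition-based query update, assuming hypothetical dimensions and hop count.
import torch
import torch.nn as nn


class MemoryHops(nn.Module):
    def __init__(self, dim=100, hops=3):
        super().__init__()
        self.hops = hops
        self.proj = nn.Linear(dim, dim)

    def forward(self, query, memory):            # query: (B, D), memory: (B, T, D)
        for _ in range(self.hops):
            scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)   # (B, T)
            attn = torch.softmax(scores, dim=-1)
            summary = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)     # (B, D)
            query = query + self.proj(summary)    # addition-based merge per hop
        return query


if __name__ == "__main__":
    reader = MemoryHops()
    utterance = torch.randn(4, 100)               # test-utterance representation
    context = torch.randn(4, 12, 100)             # stored contextual summaries
    print(reader(utterance, context).shape)       # torch.Size([4, 100])
```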

