Multimodal information fusion application to human emotion recognition from face and speech

2009 ◽  
Vol 49 (2) ◽  
pp. 277-297 ◽  
Author(s):  
Muharram Mansoorizadeh ◽  
Nasrollah Moghaddam Charkari
2021 ◽  
Author(s):  
Zhibing Xie

Understanding human emotional states is indispensable for everyday interaction, and human computer interaction (HCI) becomes more natural and friendly when systems fully exploit users' affective states. In emotion recognition, multimodal information fusion is widely used to discover the relationships among multiple information sources and to jointly exploit several channels, such as speech, facial expression, gesture, and physiological signals. This thesis proposes a new emotion recognition framework that performs information fusion based on the estimation of information entropy. Novel information theoretic learning techniques are applied to both feature level fusion and score level fusion. The most critical issues in feature level fusion are feature transformation and dimensionality reduction. Existing methods depend on second-order statistics, which are optimal only for Gaussian-like distributions. By incorporating information theoretic tools, a new feature level fusion method based on kernel entropy component analysis (KECA) is proposed. For score level fusion, most previous methods rely on predefined, largely heuristic rules. In this thesis, a connection between information fusion and the maximum correntropy criterion is established for effective score level fusion. The feature level and score level fusion methods are then combined into a two-stage fusion platform. The proposed methods are applied to audiovisual emotion recognition, and their effectiveness is evaluated by experiments on two publicly available audiovisual emotion databases. The experimental results demonstrate that the proposed algorithms achieve improved performance in comparison with the existing methods.
The work of this thesis offers a promising direction for designing more advanced emotion recognition systems based on multimodal information fusion, and it is significant for the development of intelligent human computer interaction systems.
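The score level fusion described above relies on the maximum correntropy criterion rather than a plain average of per-modality scores. The abstract does not give the thesis's exact algorithm, but the standard fixed-point (half-quadratic) iteration for maximizing correntropy with a Gaussian kernel can be sketched as follows; the function name and parameters are illustrative, not taken from the thesis:

```python
import numpy as np

def mcc_fuse(scores, sigma=0.5, n_iter=50):
    """Fuse per-modality scores under the maximum correntropy criterion.

    Finds the value s maximizing sum_m exp(-(s - s_m)^2 / (2 sigma^2))
    by the standard fixed-point (half-quadratic) iteration: a robust,
    outlier-insensitive alternative to the plain mean of the scores.
    """
    scores = np.asarray(scores, dtype=float)
    s = scores.mean()  # initialise at the arithmetic mean
    for _ in range(n_iter):
        # Gaussian-kernel weights: scores far from s are down-weighted
        w = np.exp(-(s - scores) ** 2 / (2 * sigma ** 2))
        # Correntropy-weighted mean (the fixed-point update)
        s = np.sum(w * scores) / np.sum(w)
    return s
```

With a small kernel width, an outlying modality score (e.g. one unreliable channel) is effectively ignored, which is the robustness property that motivates correntropy-based fusion over fixed sum or product rules.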


2020 ◽  
Vol 53 ◽  
pp. 209-221 ◽  
Author(s):  
Yingying Jiang ◽  
Wei Li ◽  
M. Shamim Hossain ◽  
Min Chen ◽  
Abdulhameed Alelaiwi ◽  
...  



Author(s):  
Zhibing Xie ◽  
Ling Guan

This paper aims to provide a general theoretical analysis of multimodal information fusion and to implement novel information theoretic tools in multimedia applications. The most essential issues for information fusion are feature transformation and the reduction of feature dimensionality. Most previous solutions are based largely on second-order statistics, which are optimal only for Gaussian-like distributions; in this paper we describe kernel entropy component analysis (KECA), which uses information entropy as its descriptor and achieves improved performance through entropy estimation. The authors present a new solution that integrates information fusion theory with information theoretic tools, and apply it to audiovisual emotion recognition. Information fusion is implemented for the audio and video channels at both the feature level and the decision level. Experimental results demonstrate that the proposed algorithm achieves improved performance in comparison with the existing methods, especially when the dimension of the feature space is substantially reduced.
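The paper fuses the audio and video channels at two points in the pipeline: the feature level (before classification) and the decision level (after per-modality classifiers). A minimal sketch of the two fusion points, with hypothetical function names and a simple sum rule standing in for the paper's decision-level combiner, might look like:

```python
import numpy as np

def feature_level_fuse(audio_feat, video_feat):
    """Feature-level fusion: z-normalise each modality's feature
    vector, then concatenate before a single classifier sees them."""
    a = (audio_feat - audio_feat.mean()) / (audio_feat.std() + 1e-12)
    v = (video_feat - video_feat.mean()) / (video_feat.std() + 1e-12)
    return np.concatenate([a, v])

def decision_level_fuse(audio_probs, video_probs, w=0.5):
    """Decision-level fusion: combine per-modality class posteriors,
    here with a weighted sum rule, and return (posteriors, label)."""
    p = w * np.asarray(audio_probs) + (1 - w) * np.asarray(video_probs)
    return p / p.sum(), int(np.argmax(p))
```

Feature-level fusion lets the classifier model cross-modal correlations but inflates dimensionality (hence the paper's interest in KECA-based reduction); decision-level fusion keeps the modalities independent until the final vote.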


2013 ◽  
Vol 07 (01) ◽  
pp. 25-42 ◽  
Author(s):  
Zhibing Xie ◽  
Ling Guan

This paper focuses on the application of novel information theoretic tools in the area of information fusion. Feature transformation and fusion are critical to the performance of information fusion; however, the majority of existing works depend on second-order statistics, which are optimal only for Gaussian-like distributions. In this paper, the integration of information fusion techniques with kernel entropy component analysis (KECA) provides a new information theoretic tool: the fusion of features is realized using information entropy as a descriptor and is optimized by entropy estimation. A novel KECA-based multimodal information fusion strategy for audiovisual emotion recognition is presented. The effectiveness of the proposed solution is evaluated through experiments on two audiovisual emotion databases. Experimental results show that the proposed solution outperforms the existing methods, especially when the dimension of the feature space is substantially reduced. The proposed method offers a general theoretical analysis that provides an approach to applying information theory in multimedia research.
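KECA, the core tool in the abstracts above, differs from kernel PCA in that it selects the kernel eigendirections contributing most to the Rényi quadratic entropy estimate rather than the top-variance directions. A minimal sketch, assuming a Gaussian (RBF) kernel and an illustrative function signature not taken from the paper:

```python
import numpy as np

def keca(X, n_components=2, sigma=1.0):
    """Kernel Entropy Component Analysis (sketch).

    Projects the data onto the kernel eigendirections that contribute
    most to the Renyi quadratic entropy estimate
    V = (1/N^2) sum_i lambda_i * (1^T e_i)^2,
    rather than the top-variance directions used by kernel PCA.
    """
    # Gaussian (RBF) kernel matrix over the training samples
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    K = np.exp(-d2 / (2 * sigma ** 2))

    # Eigendecomposition of the (symmetric) kernel matrix
    vals, vecs = np.linalg.eigh(K)            # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]    # reorder to descending

    # Entropy contribution of each axis: lambda_i * (1^T e_i)^2
    ones = np.ones(K.shape[0])
    contrib = vals * (vecs.T @ ones) ** 2

    # Keep the axes with the largest entropy contribution
    idx = np.argsort(contrib)[::-1][:n_components]
    # Projection of the training data: E_k * sqrt(D_k)
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))
```

Because the retained axes are chosen by entropy contribution, KECA can keep directions that kernel PCA would discard, which is consistent with the papers' observation that the method holds up when the feature dimensionality is substantially reduced.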

