Deep Stacked Autoencoder-Based Automatic Emotion Recognition Using an Efficient Hybrid Local Texture Descriptor

2022 ◽  
Vol 15 (1) ◽  
pp. 1-26
Author(s):  
Shanthi Pitchaiyan ◽  
Nickolas Savarimuthu

Extracting an effective facial feature representation is the critical task for an automatic expression recognition system. The Local Binary Pattern (LBP) is a popular texture feature for facial expression recognition; however, only a few approaches exploit the relationships among the local neighborhood pixels themselves. This paper presents a Hybrid Local Texture Descriptor (HLTD), derived from the logical fusion of Local Neighborhood XNOR Patterns (LNXP) and LBP, to investigate the potential of positional pixel relationships in automatic emotion recognition. LNXP encodes texture information based on the two nearest vertical and/or horizontal neighbors of the current pixel, whereas LBP encodes the relationship between the center pixel and its neighbors. After logical feature fusion, a Deep Stacked Autoencoder (DSA) is trained on the CK+, MMI, and KDEF-dyn datasets, and the results show that the proposed HLTD-based approach outperforms many state-of-the-art methods with an average recognition rate of 97.5% on CK+, 94.1% on MMI, and 88.5% on KDEF.
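As a rough illustration of the kind of encoding and fusion the abstract describes, the NumPy sketch below computes a standard LBP map, a simplified XNOR-based vertical-neighbor code standing in for LNXP (the paper's actual LNXP definition is more elaborate), and merges the two with a bitwise AND. The fusion rule and all function names here are assumptions, not the authors' implementation.

```python
import numpy as np

def lbp_image(img):
    """Standard LBP: threshold the 8 neighbours against the centre pixel."""
    h, w = img.shape
    c = img[1:h-1, 1:w-1]
    code = np.zeros((h - 2, w - 2), dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        neigh = img[1+dr:h-1+dr, 1+dc:w-1+dc]
        code |= (neigh >= c).astype(np.uint8) << bit
    return code

def lnxp_like(img):
    """Toy stand-in for LNXP: XNOR of each pixel's comparisons with its two
    nearest vertical neighbours (1 where both comparisons agree)."""
    c = img[1:-1, 1:-1]
    b_up = (c >= img[:-2, 1:-1]).astype(np.uint8)
    b_down = (c >= img[2:, 1:-1]).astype(np.uint8)
    return np.uint8(1) - (b_up ^ b_down)  # XNOR of the two sign bits

def hltd_like(img):
    """Assumed logical fusion: keep LBP codes only where the XNOR map agrees."""
    mask = np.where(lnxp_like(img) == 1, 0xFF, 0).astype(np.uint8)
    return lbp_image(img) & mask

feat_map = hltd_like(np.random.randint(0, 256, (64, 64), dtype=np.uint8))
```

Block histograms of such a fused map would then form the input vector to the stacked autoencoder.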

2020 ◽  
Vol 8 (6) ◽  
pp. 3823-3832

This work proposes an optimal mapping from the feature space to a reduced subspace using a kernel locality-preserving Fisher discriminant analysis approach that retains the non-zero eigenvalues. The approach is designed by cascading analytical (geometric) and non-inherited face texture features. The Gabor magnitude feature vector (GMFV) and the Gabor phase feature vector (GPFV) are extracted independently. Feature fusion is carried out by cascading the geometrical distance feature vector (GDFV) with the Gabor magnitude and phase vectors. The fused feature space is projected into a low-dimensional subspace by the kernel locality-preserving Fisher discriminant analysis method, and the projected space is normalized by a suitable normalization technique to prevent dissimilarity between scores. The final scores of the projected domains are fused using the maximum fusion rule. Expressions are classified using Euclidean distance matching and a support vector machine with a radial basis function (RBF) kernel. Experimental results show that the proposed approach is efficient for dimensionality reduction, competent recognition, and classification. The performance of the proposed approach is evaluated in comparison with related subspace approaches. The best average recognition rates achieved are 97.61% for the JAFFE database and 81.48% for the YALE database.
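The Gabor magnitude and phase vectors can be sketched with a small filter bank. The following is a minimal illustration assuming quadrature (cosine/sine) Gabor kernels; all parameter values are chosen for demonstration rather than taken from the paper.

```python
import cv2
import numpy as np

def gabor_mag_phase(img, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4),
                    ksize=31, sigma=4.0, lambd=10.0, gamma=0.5):
    """Return Gabor magnitude (GMFV-like) and phase (GPFV-like) vectors by
    filtering with quadrature cosine/sine kernels at several orientations."""
    img = img.astype(np.float32)
    mags, phases = [], []
    for theta in thetas:
        k_re = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, psi=0)
        k_im = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, psi=np.pi/2)
        re = cv2.filter2D(img, cv2.CV_32F, k_re)
        im = cv2.filter2D(img, cv2.CV_32F, k_im)
        mags.append(np.hypot(re, im).ravel())      # magnitude response
        phases.append(np.arctan2(im, re).ravel())  # phase response
    return np.concatenate(mags), np.concatenate(phases)

mag_vec, phase_vec = gabor_mag_phase(np.random.rand(64, 64))
# Cascaded fusion: the geometric distance features (not shown) would be
# appended to mag_vec and phase_vec before the subspace projection.
```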


Author(s):  
Xinge Zhu ◽  
Liang Li ◽  
Weigang Zhang ◽  
Tianrong Rao ◽  
Min Xu ◽  
...  

Visual emotion recognition aims to associate images with appropriate emotions. Different visual stimuli, from low level to high level, can affect human emotion, such as color, texture, parts, and objects. However, most existing methods treat the different levels of features as independent entities without an effective method for feature fusion. In this paper, we propose a unified CNN-RNN model that predicts emotion from the fused features of different levels by exploiting the dependencies among them. Our architecture leverages a convolutional neural network (CNN) with multiple layers to extract different levels of features within a multi-task learning framework, in which two related loss functions are introduced to learn the feature representation. Considering the dependencies among low-level and high-level features, a new bidirectional recurrent neural network (RNN) is proposed to integrate the learned features from the different layers of the CNN model. Extensive experiments on both Internet image and art photo datasets demonstrate that our method outperforms the state-of-the-art methods with at least a 7% performance improvement.
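A minimal PyTorch sketch of this kind of architecture, assuming global average pooling turns each CNN stage's feature map into one step of a level-ordered sequence for a bidirectional GRU; the paper's exact layer sizes, loss functions, and RNN variant are not reproduced here.

```python
import torch
import torch.nn as nn

class CnnBiRnnEmotion(nn.Module):
    """Sketch: pool feature maps from several CNN stages into equal-length
    vectors, treat them as a low-to-high-level sequence, and fuse them with
    a bidirectional GRU standing in for the paper's bidirectional RNN."""
    def __init__(self, n_classes=8, d=128):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                          nn.ReLU(), nn.MaxPool2d(2))
            for c_in, c_out in [(3, 32), (32, 64), (64, 128)]
        ])
        # One linear projection per stage so every level yields a d-dim vector.
        self.proj = nn.ModuleList([nn.Linear(c, d) for c in (32, 64, 128)])
        self.rnn = nn.GRU(d, d, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * d, n_classes)

    def forward(self, x):
        seq = []
        for stage, proj in zip(self.stages, self.proj):
            x = stage(x)
            seq.append(proj(x.mean(dim=(2, 3))))  # global average pool -> (B, d)
        out, _ = self.rnn(torch.stack(seq, dim=1))  # (B, levels, 2d)
        return self.head(out[:, -1])                # classify from the last step

logits = CnnBiRnnEmotion()(torch.randn(2, 3, 64, 64))
```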


2014 ◽  
Vol 536-537 ◽  
pp. 115-120
Author(s):  
Ting Gong ◽  
Yu Biao Liu

The Gabor wavelet is an important technique widely used in image recognition areas such as human facial expression analysis. It effectively extracts the significant texture features of facial expressions, but it does not account for the relative changes of important point features across facial locations. To recognize facial expression information, we fuse geometric features based on angle changes at key parts of the face with the Gabor texture features, and then design a radial basis function (RBF) neural network as the classifier. Experimental results on a facial expression database indicate that the recognition rate achieved by the feature fusion is clearly superior to that of the traditional method.
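A toy sketch of the two ingredients, assuming angle features are computed from 2D landmark triplets and the RBF network is built from k-means centers with a linear read-out; the class and function names are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import RidgeClassifier

def angle(p, q, r):
    """Angle (degrees) at vertex q formed by landmark points p and r,
    e.g. the angle at a mouth corner between the lip midpoints."""
    u, v = p - q, r - q
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

class RbfNet:
    """Minimal RBF network: k-means centres, Gaussian hidden layer,
    linear read-out. Fused inputs would concatenate Gabor and angle features."""
    def __init__(self, n_centers=20, gamma=0.05):
        self.n_centers, self.gamma = n_centers, gamma

    def _hidden(self, X):
        d2 = ((X[:, None, :] - self.centers_[None]) ** 2).sum(-1)
        return np.exp(-self.gamma * d2)  # Gaussian activations

    def fit(self, X, y):
        self.centers_ = KMeans(n_clusters=self.n_centers, n_init=10).fit(X).cluster_centers_
        self.out_ = RidgeClassifier().fit(self._hidden(X), y)
        return self

    def predict(self, X):
        return self.out_.predict(self._hidden(X))

X = np.random.rand(60, 10)            # placeholder fused feature vectors
y = np.repeat(np.arange(3), 20)       # placeholder expression labels
preds = RbfNet().fit(X, y).predict(X)
```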


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Hao Meng ◽  
Fei Yuan ◽  
Yue Wu ◽  
Tianhao Yan

To address the shortcomings of traditional facial expression recognition (FER), which uses only a single feature and achieves a limited recognition rate, a FER method based on the fusion of transformed multilevel features and an improved weighted voting SVM (FTMS) is proposed. The algorithm combines transformed traditional shallow features with convolutional neural network (CNN) deep semantic features and uses an improved weighted voting method to make a comprehensive decision over the results of four trained SVM classifiers to obtain the final recognition result. The shallow features include local Gabor features, LBP features, and the joint geometric features designed in this study, which are composed of distance and deformation characteristics. The deep CNN feature is the multilayer feature fusion of the CNN proposed in this study. This study also proposes replacing Softmax with a better-performing SVM classifier on top of the CNN, since Softmax distinguishes poorly between facial expressions. Experiments on the FERPlus database show that the recognition rate of this method is 17.2% higher than that of the traditional CNN, which demonstrates the effectiveness of fusing multilayer convolutional features with an SVM. FTMS-based facial expression recognition experiments are carried out on the JAFFE and CK+ datasets. Experimental results show that, compared with a single feature, the proposed algorithm achieves a higher recognition rate and better robustness and makes full use of the advantages and characteristics of the different features.
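The weighted voting step can be illustrated as follows. This is a generic sketch in which each SVM votes for its predicted class with a fixed weight (for example, its validation accuracy); the paper's "improved" weighting scheme may differ, and integer class labels are assumed.

```python
import numpy as np
from sklearn.svm import SVC

def weighted_vote(classifiers, weights, X):
    """Each classifier votes for its predicted class with its own weight;
    the class with the highest accumulated weight wins."""
    n_classes = int(max(clf.classes_.max() for clf in classifiers)) + 1
    scores = np.zeros((len(X), n_classes))
    for clf, w in zip(classifiers, weights):
        pred = clf.predict(X)
        scores[np.arange(len(X)), pred] += w
    return scores.argmax(axis=1)

# Placeholder stand-ins for the four per-feature SVMs (Gabor, LBP,
# geometric, CNN); weights here are hypothetical validation accuracies.
X = np.random.rand(40, 5)
y = np.tile(np.arange(4), 10)
clfs = [SVC().fit(X, y) for _ in range(4)]
y_hat = weighted_vote(clfs, [0.82, 0.79, 0.75, 0.88], X)
```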


2021 ◽  
Vol 12 ◽  
Author(s):  
Zhenjie Song

Facial expression emotion recognition is an intuitive reflection of a person's mental state; it carries rich emotional information and is one of the most important forms of interpersonal communication. It can be used in various fields, including psychology. As a celebrity of ancient China, Zeng Guofan's wisdom involves facial emotion recognition techniques. His book Bing Jian summarizes eight methods on how to identify people, especially how to choose the right one: "look at the eyes and nose for evil and righteousness, the lips for truth and falsehood; the temperament for success and fame, the spirit for wealth and fortune; the fingers and claws for ideas, the hamstrings for setback; if you want to know his consecution, you can focus on what he has said." It is said that a person's personality, mind, goodness, and badness can be shown by his face. However, due to the complexity and variability of human facial expression features, traditional facial expression emotion recognition technology suffers from insufficient feature extraction and susceptibility to external environmental influences. Therefore, this article proposes a novel feature-fusion dual-channel expression recognition algorithm based on machine learning theory and philosophical thinking. Specifically, features extracted by a convolutional neural network (CNN) alone tend to miss subtle changes in facial expressions. The first path of the proposed algorithm therefore takes the Gabor features of the ROI area as input: to make full use of the detailed features of the active facial expression area, it first segments that area from the original face image and then applies the Gabor transform to extract its emotion features, focusing on a detailed description of the local area. The second path proposes an efficient channel attention network based on depthwise separable convolution to improve the linear bottleneck structure, reduce network complexity, and prevent overfitting, by designing an efficient attention module that combines the depth of the feature map with spatial information. It focuses on extracting important features, improves emotion recognition accuracy, and outperforms the competition on the FER2013 dataset.
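A minimal PyTorch sketch of an ECA-style channel attention module combined with a depthwise separable convolution block, assuming the standard formulations of both; the paper's improved linear-bottleneck design likely differs in detail.

```python
import torch
import torch.nn as nn

class EcaAttention(nn.Module):
    """ECA-style channel attention: global average pool, a 1-D convolution
    across channels, and a sigmoid gate that reweights the feature map."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        w = x.mean(dim=(2, 3))                    # (B, C) channel descriptor
        w = self.conv(w.unsqueeze(1)).squeeze(1)  # 1-D conv over channels
        return x * torch.sigmoid(w)[:, :, None, None]

class DwSeparableBlock(nn.Module):
    """Depthwise + pointwise convolution followed by channel attention,
    a sketch of the kind of improved bottleneck block the abstract describes."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)
        self.attn = EcaAttention()

    def forward(self, x):
        return self.attn(torch.relu(self.pointwise(self.depthwise(x))))

y = DwSeparableBlock(32, 64)(torch.randn(2, 32, 48, 48))  # -> (2, 64, 48, 48)
```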


Facial expression-based emotion recognition is one of the popular research domains in the computer vision field. Many machine vision-based feature extraction methods are available to increase the accuracy of Facial Expression Recognition (FER). In feature extraction, neighboring pixel values are manipulated in different ways to encode the texture information of muscle movements. However, defining a feature descriptor that is robust to external factors is still a challenging task. This paper introduces the Merged Local Neighborhood Difference Pattern (MLNDP) to encode and merge two levels of representation. At the first level, each pixel is encoded with respect to the center pixel, and at the second level, encoding is carried out based on the relationship with the closest neighboring pixel. Finally, the two levels of encoding are logically merged to retain only the texture that is positively encoded at both levels. Further, the feature dimension is reduced using a chi-square statistical test, and the final classification is carried out using a multiclass SVM on two datasets, CK+ and MMI. The proposed descriptor is compared against other local descriptors such as LDP, LTP, LDN, and LGP. Experimental results show that the proposed feature descriptor outperforms the other descriptors with 97.86% on the CK+ dataset and 95.29% on the MMI dataset. The classifier comparison confirms that the combination of MLNDP with a multiclass SVM performs better than the other combinations of local descriptor and classifier.
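A toy version of such a two-level merged encoding, assuming a single neighbor direction for readability (the actual MLNDP aggregates over the full neighborhood); the chi-square selection step is indicated with scikit-learn's SelectKBest.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

def mlndp_like(img):
    """Two-level sketch: level 1 encodes the 'up' neighbour against the
    window centre, level 2 encodes it against its own closest (right)
    neighbour; a logical AND keeps only positions positively encoded
    at both levels."""
    c = img[1:-1, 1:-1]
    up = img[:-2, 1:-1]
    up_right = img[:-2, 2:]
    l1 = (up >= c).astype(np.uint8)         # level 1: versus centre pixel
    l2 = (up >= up_right).astype(np.uint8)  # level 2: versus closest neighbour
    return l1 & l2                          # merged pattern map

merged = mlndp_like(np.random.randint(0, 256, (32, 32)))

# Block histograms of the merged map would form the feature vector; a
# chi-square test then keeps the most class-dependent bins before the SVM:
# X_sel = SelectKBest(chi2, k=64).fit_transform(X_hist, labels)
```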


Author(s):  
FRANK Y. SHIH ◽  
CHAO-FA CHUANG ◽  
PATRICK S. P. WANG

Facial expression provides an important behavioral measure for studies of emotion, cognitive processes, and social interaction. Facial expression recognition has recently become a promising research area. Its applications include human-computer interfaces, human emotion analysis, and medical care. In this paper, we investigate various feature representation and expression classification schemes to recognize seven facial expressions, namely happy, neutral, angry, disgust, sad, fear, and surprise, in the JAFFE database. Experimental results show that the method combining 2D-LDA (Linear Discriminant Analysis) and SVM (Support Vector Machine) outperforms the others. Its recognition rate is 95.71% using a leave-one-out strategy and 94.13% using a cross-validation strategy, and it takes only 0.0357 seconds to process one image of size 256 × 256.
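For reference, a classic LDA-to-SVM pipeline with leave-one-out evaluation can be set up in a few lines of scikit-learn. Note that the paper's 2D-LDA operates on image matrices directly, whereas standard LDA, used here as a stand-in, works on flattened vectors; the data below are random placeholders.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Placeholder data: 70 flattened face images, 7 expression classes.
X = np.random.rand(70, 256)
y = np.repeat(np.arange(7), 10)

# LDA reduces to at most n_classes - 1 = 6 dimensions, then a linear SVM
# classifies in the discriminant subspace.
pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=6),
                     SVC(kernel="linear"))
acc = cross_val_score(pipe, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.4f}")
```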


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5452
Author(s):  
Xin Chang ◽  
Władysław Skarbek

Emotion recognition is an important research field for human–computer interaction. Audio–video emotion recognition is now commonly tackled with deep neural network modeling tools. Published papers, as a rule, report only cases where multi-modality is superior to audio-only or video-only modalities; however, cases where a single modality is superior can also be found. In our research, we hypothesize that for fuzzy categories of emotional events, the within-modal and inter-modal noisy information represented indirectly in the parameters of the modeling neural network impedes better performance in the existing late-fusion and end-to-end multi-modal network training strategies. To exploit the advantages of both solutions and overcome their deficiencies, we define a multi-modal residual perceptron network that performs end-to-end learning from the multi-modal network branches and generalizes to a better multi-modal feature representation. With the proposed multi-modal residual perceptron network and a novel time augmentation for streaming digital movies, the state-of-the-art average recognition rate was improved to 91.4% on the Ryerson Audio–Visual Database of Emotional Speech and Song dataset and to 83.15% on the Crowd-Sourced Emotional Multi Modal Actors dataset. Moreover, the multi-modal residual perceptron network concept shows its potential for multi-modal applications dealing with signal sources beyond the optical and acoustical types.
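One plausible (assumed) reading of the residual fusion idea: project each modality branch into a shared space and let a fusion MLP learn a residual correction on top of the averaged branch embeddings. All dimensions and names below are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ResidualFusion(nn.Module):
    """Sketch of a multi-modal residual perceptron: per-modality projections
    to a shared space, plus an MLP that learns a residual correction over
    the average of the two branch embeddings."""
    def __init__(self, d_audio=128, d_video=256, d=128, n_classes=8):
        super().__init__()
        self.pa = nn.Linear(d_audio, d)   # audio-branch projection
        self.pv = nn.Linear(d_video, d)   # video-branch projection
        self.fuse = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.head = nn.Linear(d, n_classes)

    def forward(self, a, v):
        a, v = self.pa(a), self.pv(v)
        residual = self.fuse(torch.cat([a, v], dim=-1))
        return self.head((a + v) / 2 + residual)  # residual over the average

logits = ResidualFusion()(torch.randn(4, 128), torch.randn(4, 256))
```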


Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1899
Author(s):  
Jucheng Yang ◽  
Xiaojing Wang ◽  
Shujie Han ◽  
Jie Wang ◽  
Dong Sun Park ◽  
...  

In the field of Facial Expression Recognition (FER), traditional local texture coding methods have low computational complexity while providing robustness to occlusion, illumination, and other factors. However, there is still a need to improve the accuracy of these methods while maintaining their real-time nature and low computational complexity. In this paper, we propose a feature-based FER system with a novel local texture coding operator, named central symmetric local gradient coding (CS-LGC), to enhance the performance of real-time systems. It uses four directional gradients on 5 × 5 grids, with each gradient computed in a center-symmetric way, and averages the gradients to reduce sensitivity to noise. These characteristics make the CS-LGC features symmetric, providing better generalization than existing local gradient coding (LGC) variants. The proposed system further transforms the extracted features into an eigenspace using principal component analysis (PCA) for better representation and less computation, and estimates the target classes by training an extreme learning machine. The recognition rate is 95.24% on the JAFFE database and 98.33% on the CK+ database. The results show that the system has advantages over existing local texture coding methods.
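A sketch of a CS-LGC-style operator under stated assumptions: for each of the four directions on a 5 × 5 window, the gradients of the two center-symmetric pixel pairs (at radii 1 and 2) are averaged and thresholded into one sign bit. The exact pair layout in the paper may differ.

```python
import numpy as np

def cs_lgc_like(img):
    """CS-LGC-style code on 5x5 windows: one sign bit per direction from the
    averaged gradients of the two centre-symmetric pixel pairs."""
    img = img.astype(np.int32)  # avoid unsigned underflow in differences
    h, w = img.shape
    dirs = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, diagonals
    code = np.zeros((h - 4, w - 4), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(dirs):
        g = np.zeros((h - 4, w - 4))
        for r in (1, 2):  # the two symmetric pairs within the 5x5 window
            p = img[2 + r*dr : h - 2 + r*dr, 2 + r*dc : w - 2 + r*dc]
            q = img[2 - r*dr : h - 2 - r*dr, 2 - r*dc : w - 2 - r*dc]
            g += p - q    # centre-symmetric gradient
        code |= (g / 2 >= 0).astype(np.uint8) << bit  # sign of the average
    return code

code_map = cs_lgc_like(np.random.randint(0, 256, (48, 48)))
# Histograms of code_map would then go through PCA and on to the
# extreme learning machine classifier.
```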

