Three-Stream Convolutional Neural Network with Squeeze-and-Excitation Block for Near-Infrared Facial Expression Recognition

Electronics ◽  
2019 ◽  
Vol 8 (4) ◽  
pp. 385 ◽  
Author(s):  
Ying Chen ◽  
Zhihao Zhang ◽  
Lei Zhong ◽  
Tong Chen ◽  
Juxiang Chen ◽  
...  

Near-infrared (NIR) facial expression recognition is resistant to illumination change. In this paper, we propose a three-stream three-dimensional convolutional neural network with a squeeze-and-excitation (SE) block for NIR facial expression recognition. Each stream is fed with a different local region, namely the eyes, nose, or mouth. Through the SE block, the network automatically allocates weights to the different local features to further improve recognition accuracy. Experimental results on the Oulu-CASIA NIR facial expression database show that the proposed method achieves a higher recognition rate than several state-of-the-art algorithms.
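As an illustration of the SE mechanism mentioned above, here is a minimal PyTorch sketch of a standard squeeze-and-excitation block extended to 3D feature maps (following Hu et al.'s common formulation; the authors' exact variant and the reduction ratio of 16 below are assumptions, since the abstract gives no details):

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Squeeze-and-excitation for 3D (spatiotemporal) feature maps."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)          # global average pool -> (B, C, 1, 1, 1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                               # per-channel weights in (0, 1)
        )

    def forward(self, x):                               # x: (B, C, T, H, W)
        b, c = x.shape[:2]
        w = self.excite(self.squeeze(x).view(b, c))     # channel descriptor -> weights
        return x * w.view(b, c, 1, 1, 1)                # reweight each channel
```

In a three-stream setting, the per-stream features (eyes, nose, mouth) could be concatenated along the channel axis before such a block, letting the network weight each region's contribution.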

2013 ◽  
Vol 380-384 ◽  
pp. 4057-4060
Author(s):  
Lang Guo ◽  
Jian Wang

Analyzing the defects of two-dimensional facial expression recognition algorithms, this paper proposes a new three-dimensional facial expression recognition algorithm. The algorithm is tested on the JAFFE facial expression database. The results show that the proposed algorithm dynamically determines the size of each local neighborhood according to the manifold structure, effectively solves the facial expression recognition problem, and achieves a good recognition rate.
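The abstract does not specify how the neighborhood size is adapted to the manifold structure. One simple heuristic in that spirit, purely illustrative and not the paper's actual rule, is to grow each point's neighborhood until the sorted neighbor distances show a large jump, suggesting the edge of a locally linear patch:

```python
import numpy as np

def adaptive_neighborhood_sizes(X, k_min=5, k_max=20):
    """For each sample, pick a neighborhood size between k_min and k_max by
    cutting at the largest jump in sorted neighbor distances (an illustrative
    heuristic for respecting local manifold structure)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    sizes = []
    for i in range(len(X)):
        nn_d = np.sort(d[i])[1:k_max + 1]           # skip self-distance at index 0
        gaps = np.diff(nn_d[k_min - 1:])            # distance jumps beyond k_min
        sizes.append(k_min + int(np.argmax(gaps)))  # cut before the largest jump
    return np.array(sizes)
```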


2021 ◽  
Vol 2083 (3) ◽  
pp. 032030
Author(s):  
Cui Dong ◽  
Rongfu Wang ◽  
Yuanqin Hang

Abstract With the development of artificial intelligence, facial expression recognition based on deep learning has become a current research hotspot. This article analyzes and improves the VGG16 network. First, the three fully connected layers of the original network are replaced with two convolutional layers and one fully connected layer, which reduces the complexity of the network. Then, the max pooling in the network is changed to locally adaptive pooling, which helps the network select feature information more conducive to facial expression recognition. On the facial expression datasets RAF-DB and SFEW, the recognition rate increases by 4.7% and 7%, respectively.
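A rough PyTorch sketch of the described head modification follows. The exact kernel sizes and channel widths are not given in the abstract, so those below are guesses, and the paper's locally adaptive pooling is approximated here with ordinary adaptive average pooling:

```python
import torch.nn as nn
from torchvision.models import vgg16

class ModifiedVGG16(nn.Module):
    """VGG16 backbone with the three FC layers replaced by two conv layers
    and one FC layer, per the description above (channel widths assumed)."""
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = vgg16(weights=None).features        # original conv blocks
        self.head = nn.Sequential(
            nn.Conv2d(512, 512, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(512, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),                         # stand-in for the paper's adaptive pooling
            nn.Flatten(),
            nn.Linear(256, num_classes),                     # single remaining FC layer
        )

    def forward(self, x):                                    # x: (B, 3, 224, 224)
        return self.head(self.features(x))
```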


2021 ◽  
Vol 25 (1) ◽  
pp. 139-154
Author(s):  
Yongxiang Cai ◽  
Jingwen Gao ◽  
Gen Zhang ◽  
Yuangang Liu

The goal of research in Facial Expression Recognition (FER) is to build a robust model with strong recognition ability. In this paper, we propose a new scheme for FER systems based on a convolutional neural network. Part of the regular convolution operation is replaced by depthwise separable convolution to reduce the number of parameters and the computational workload, and a self-adaptive joint loss function is adopted to improve the classification performance. In addition, we balance our training set through data augmentation and preprocess the input images through illumination processing, face detection, and other methods, effectively maximizing the expression recognition rate. Experiments to validate our methods are conducted on the TensorFlow platform with the Fer2013 dataset. We analyze the experimental results before and after training-set balancing and network model modification, and we compare our results with those of other researchers. The results show that our method effectively increases the expression recognition rate under the same experimental conditions. A further experiment on our own expression dataset, which is relevant to driving safety, yields similar results.
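For reference, the depthwise separable convolution mentioned above can be written in a few lines of PyTorch. The parameter count drops from k²·C_in·C_out for a regular convolution to k²·C_in + C_in·C_out; this sketch is generic, not the authors' exact layer configuration:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (one filter per input channel) followed by a 1x1
    pointwise conv that mixes channels."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```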


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Olalekan Agbolade ◽  
Azree Nazri ◽  
Razali Yaakob ◽  
Abdul Azim Ghani ◽  
Yoke Kqueen Cheah

Abstract
Background: Expression in H. sapiens plays a remarkable role in social communication. Identification of these expressions by human beings is relatively easy and accurate; however, achieving the same result in 3D by machine remains a challenge in computer vision. This is due to the current difficulties of 3D facial data acquisition, such as the lack of homology and the complex mathematical analysis required for facial point digitization. This study proposes facial expression recognition in humans through the application of Multi-points Warping to 3D facial landmarks, building a template mesh as a reference object. This template mesh is then applied to each target mesh in the Stirling/ESRC and Bosphorus datasets. The semi-landmarks are allowed to slide along tangents to the curves and surfaces until the bending energy between the template and a target form is minimal, and the localization error is assessed using Procrustes ANOVA. Principal Component Analysis (PCA) is used for feature selection, and classification is performed using Linear Discriminant Analysis (LDA).
Results: The localization error is validated on the two datasets with superior performance over state-of-the-art methods, and variation in expression is visualized using principal components (PCs). The deformations reveal the various expression regions of the faces. The results indicate that the Sad expression has the lowest recognition accuracy on both datasets. The classifier achieved recognition accuracies of 99.58% and 99.32% on Stirling/ESRC and Bosphorus, respectively.
Conclusion: The results demonstrate that the method is robust and in agreement with state-of-the-art results.
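The PCA-then-LDA stage maps directly onto a standard scikit-learn pipeline. A minimal sketch follows, where the feature matrix X of flattened landmark coordinates and the label vector y are hypothetical placeholders:

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

# X: (n_samples, n_landmarks * 3) flattened 3D landmark coordinates (hypothetical)
# y: expression labels
clf = make_pipeline(
    PCA(n_components=0.95),          # keep components explaining 95% of variance
    LinearDiscriminantAnalysis(),
)
# clf.fit(X_train, y_train); accuracy = clf.score(X_test, y_test)
```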


2020 ◽  
Vol 13 (4) ◽  
pp. 527-543
Author(s):  
Wenjuan Shen ◽  
Xiaoling Li

Purpose: In recent years, facial expression recognition has been widely used in human–machine interaction, clinical medicine and safe driving. However, conventional recurrent neural networks are limited to learning the time-series characteristics of expressions from one-way propagation of information.
Design/methodology/approach: To overcome this limitation, this paper proposes a novel model based on bidirectional gated recurrent unit networks (Bi-GRUs) with two-way propagation, and identity-mapping residuals are adopted to effectively prevent the vanishing-gradient problem caused by the depth of the introduced network. Since the Inception-V3 network used for spatial feature extraction has too many parameters and is prone to overfitting during training, this paper adds two reduction modules to reduce the parameter count, yielding an Inception-W network with better generalization.
Findings: The proposed model is pretrained to determine the best settings and selections. The pretrained model is then evaluated on the CK+ and Oulu-CASIA facial expression data sets, and its recognition performance and efficiency are compared with existing methods. The highest recognition rate is 99.6%, which shows that the method has good recognition accuracy within a certain range.
Originality/value: Applying the proposed model to facial expression tasks delivers high recognition accuracy and robust results with low time consumption, which will help build more sophisticated real-world applications.
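A skeletal PyTorch version of the temporal component described above, a bidirectional GRU over per-frame spatial features, might look like this. Dimensions are placeholders, and the identity-mapping residual connections and Inception-W extractor are omitted:

```python
import torch.nn as nn

class BiGRUClassifier(nn.Module):
    """Bidirectional GRU over a sequence of per-frame feature vectors."""
    def __init__(self, feat_dim: int = 2048, hidden: int = 256, num_classes: int = 6):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)   # forward + backward states

    def forward(self, x):            # x: (batch, frames, feat_dim)
        out, _ = self.gru(x)         # out: (batch, frames, 2 * hidden)
        return self.fc(out[:, -1])   # classify from the last time step
```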


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5184
Author(s):  
Min Kyu Lee ◽  
Dae Ha Kim ◽  
Byung Cheol Song

Facial expression recognition (FER) technology has made considerable progress with the rapid development of deep learning. However, conventional FER techniques are mainly designed and trained on videos acquired artificially in controlled environments, so they may not operate robustly on videos captured in the wild, where illumination and head pose vary. To solve this problem and improve the ultimate performance of FER, this paper proposes a new architecture that extends a state-of-the-art FER scheme with a multi-modal neural network that can effectively fuse image and landmark information. To this end, we propose three methods. First, to maximize the performance of the recurrent neural network (RNN) in the previous scheme, we propose a frame substitution module that replaces the latent features of less important frames with those of important frames based on inter-frame correlation. Second, we propose a method for extracting facial landmark features based on the correlation between frames. Third, we propose a new multi-modal fusion method that fuses video and facial landmark information at the feature level, applying attention derived from the characteristics of each modality to that modality's features. Experimental results show that the proposed method performs remarkably well, with 51.4% accuracy on the in-the-wild AFEW dataset, 98.5% accuracy on the CK+ dataset and 81.9% accuracy on the MMI dataset, outperforming state-of-the-art networks.
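One generic way to realize attention-weighted feature-level fusion of the kind described above is a small gating network over the concatenated modality features. The sketch below is an illustrative stand-in, not the authors' exact fusion module:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Learn soft weights for two modality feature vectors and blend them."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))

    def forward(self, video_feat, landmark_feat):       # each: (batch, dim)
        w = self.gate(torch.cat([video_feat, landmark_feat], dim=-1))
        return w[:, :1] * video_feat + w[:, 1:] * landmark_feat
```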


2020 ◽  
Vol 1 (6) ◽  
Author(s):  
Pablo Barros ◽  
Nikhil Churamani ◽  
Alessandra Sciutti

Abstract Current state-of-the-art models for automatic facial expression recognition (FER) are based on very deep neural networks that are effective but rather expensive to train. Given the dynamic conditions of FER, this characteristic hinders such models from being used for general affect recognition. In this paper, we address this problem by formalizing the FaceChannel, a lightweight neural network with far fewer parameters than common deep neural networks. We introduce an inhibitory layer that helps to shape the learning of facial features in the last layer of the network, improving performance while reducing the number of trainable parameters. To evaluate our model, we perform a series of experiments on different benchmark datasets and demonstrate that the FaceChannel achieves performance comparable, if not superior, to the current state of the art in FER. Our experiments include a cross-dataset analysis to estimate how our model behaves under different affective recognition conditions. We conclude with an analysis of how the FaceChannel learns and adapts the learned facial features to the different datasets.
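The abstract does not spell out the inhibitory layer. A textbook shunting-inhibition formulation, in which an excitatory response is divisively modulated by a parallel inhibitory one, might look like the following (purely illustrative; not necessarily the FaceChannel variant):

```python
import torch
import torch.nn as nn

class ShuntingInhibition(nn.Module):
    """Excitatory conv output divided by a parallel inhibitory signal."""
    def __init__(self, channels: int):
        super().__init__()
        self.excite = nn.Conv2d(channels, channels, 3, padding=1)
        self.inhibit = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        # divisive modulation; the denominator stays >= 1 for stability
        return torch.relu(self.excite(x)) / (1.0 + torch.sigmoid(self.inhibit(x)))
```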


Information ◽  
2019 ◽  
Vol 10 (12) ◽  
pp. 375 ◽  
Author(s):  
Yingying Wang ◽  
Yibin Li ◽  
Yong Song ◽  
Xuewen Rong

As an important part of emotion research, facial expression recognition is a necessary component of the human–machine interface. In general, a facial expression recognition system comprises face detection, feature extraction, and feature classification. Although traditional machine learning methods have achieved great success, most of them involve complex computation and lack the ability to extract comprehensive and abstract features. Deep learning-based methods can achieve a higher recognition rate for facial expressions, but they require a large number of training samples, extensive parameter tuning, and high-end hardware. To address these problems, this paper proposes a method that combines features extracted by a convolutional neural network (CNN) with a C4.5 classifier to recognize facial expressions, which both addresses the incompleteness of handcrafted features and avoids the high hardware requirements of deep learning models. To counter the overfitting and weak generalization of a single classifier, a random forest is also applied, and improvements to the C4.5 classifier and the traditional random forest are made in the course of the experiments. Extensive experiments demonstrate the effectiveness and feasibility of the proposed method.
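The pipeline of deep features feeding a tree-based classifier can be sketched as below. Note that scikit-learn does not ship a C4.5 implementation, so entropy-criterion trees are used here as a stand-in, and the ResNet-18 backbone is an arbitrary choice for illustration, not the paper's CNN:

```python
import torch
from torchvision.models import resnet18
from sklearn.ensemble import RandomForestClassifier

backbone = resnet18(weights=None)
backbone.fc = torch.nn.Identity()            # expose 512-d penultimate features

@torch.no_grad()
def extract_features(images):                # images: (B, 3, 224, 224) tensor
    backbone.eval()
    return backbone(images).numpy()

# feats and labels would come from a face dataset (placeholders)
clf = RandomForestClassifier(n_estimators=100, criterion="entropy")
# clf.fit(extract_features(train_images), y_train)
```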


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Qing Lin ◽  
Ruili He ◽  
Peihe Jiang

State-of-the-art facial expression recognition methods outperform human beings, thanks largely to the success of convolutional neural networks (CNNs). However, most existing works focus mainly on analyzing adult faces and ignore an important problem: how can we recognize facial expressions from a baby's face image, and how difficult is it? In this paper, we first introduce a new face image database, named BabyExp, which contains 12,000 images of babies younger than two years old, each labeled with one of three facial expressions (i.e., happy, sad, and normal). To the best of our knowledge, the proposed dataset is the first baby face dataset for analyzing baby face images; it is complementary to the existing adult face datasets and can shed some light on baby face analysis. We also propose a feature-guided CNN method with a new loss function, called distance loss, to optimize the interclass distance. To facilitate further research, we provide a benchmark for expression recognition on the BabyExp dataset. Experimental results show that the proposed network achieves a recognition accuracy of 87.90% on BabyExp.
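The paper's distance loss is not defined in the abstract. One plausible reading, pulling features toward their class center while pushing class centers at least a margin apart, can be sketched as follows (an assumption, not the paper's exact formulation):

```python
import torch

def distance_loss(feats, labels, centers, margin: float = 10.0):
    """Illustrative interclass distance loss: intra-class compactness plus a
    margin-based inter-center repulsion term (hypothetical formulation)."""
    intra = ((feats - centers[labels]) ** 2).sum(dim=1).mean()
    d = torch.cdist(centers, centers)                   # pairwise center distances
    mask = ~torch.eye(len(centers), dtype=torch.bool, device=centers.device)
    inter = torch.clamp(margin - d[mask], min=0).mean() # penalize close center pairs
    return intra + inter
```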


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1936 ◽  
Author(s):  
Dami Jeong ◽  
Byung-Gyu Kim ◽  
Suh-Yeon Dong

Understanding a person's feelings is a very important process in affective computing. People express their emotions in various ways; among them, facial expression is the most effective way to convey human emotional status. We propose efficient deep joint spatiotemporal features for facial expression recognition based on deep appearance and geometric neural networks. We apply three-dimensional (3D) convolution to extract spatial and temporal features at the same time. For the geometric network, 23 dominant facial landmarks are selected to express the movement of facial muscles through an analysis of the energy distribution of all facial landmarks. We combine these features with a purpose-designed joint fusion classifier so that they complement each other. Experimentally, we verify recognition accuracies of 99.21%, 87.88%, and 91.83% on the CK+, MMI, and FERA datasets, respectively. Through comparative analysis, we show that the proposed scheme improves recognition accuracy by at least 4%.
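A minimal 3D-convolution stem of the kind used for joint spatial-temporal feature extraction is sketched below; the layer sizes are placeholders, not the paper's architecture:

```python
import torch.nn as nn

class Spatiotemporal3DStem(nn.Module):
    """3D convolutions over (time, height, width) extract appearance and
    motion features in one pass."""
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),        # pool spatially, keep all frames
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):                               # x: (B, 3, T, H, W)
        return self.net(x)
```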

