Facial expression recognition on real world face images using intelligent techniques: A survey

Optik ◽  
2016 ◽  
Vol 127 (15) ◽  
pp. 6195-6203 ◽  
Author(s):  
Sajid Ali Khan ◽  
Ayyaz Hussain ◽  
Muhammad Usman

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2003 ◽  
Author(s):  
Xiaoliang Zhu ◽  
Shihao Ye ◽  
Liang Zhao ◽  
Zhicheng Dai

As a sub-challenge of EmotiW (the Emotion Recognition in the Wild challenge), the AFEW (Acted Facial Expressions in the Wild) dataset is a popular benchmark for emotion recognition under real-world constraints, including uneven illumination, head deflection, and facial posture. In this paper, we propose a convenient cascade network for facial expression recognition comprising spatial feature extraction, hybrid attention, and temporal feature extraction. First, faces are detected in each frame of a video sequence, and the corresponding face ROI (region of interest) is extracted to obtain the face images. The face images in each frame are then aligned based on the positions of the facial feature points in the images. Second, the aligned face images are fed into a residual neural network to extract the spatial features of the facial expressions, and these spatial features are passed to the hybrid attention module to obtain fused expression features. Finally, the fused features are fed into a gated recurrent unit to extract the temporal features of the facial expressions, and the temporal features are passed to a fully connected layer to classify and recognize the expressions. Experiments on the CK+ (Extended Cohn-Kanade), Oulu-CASIA (Institute of Automation, Chinese Academy of Sciences), and AFEW datasets yielded recognition accuracies of 98.46%, 87.31%, and 53.44%, respectively. This demonstrates that the proposed method not only achieves performance competitive with state-of-the-art methods but also delivers more than a 2% improvement on the AFEW dataset, indicating significantly better facial expression recognition in natural environments.
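A minimal PyTorch sketch of the cascade this abstract describes may help make the pipeline concrete. The abstract does not specify the backbone depth, the internals of the hybrid attention module, or any layer sizes, so the ResNet-18 backbone, the channel-plus-spatial attention, and all dimensions below are illustrative assumptions, not the authors' implementation.

```python
# Sketch of: per-frame spatial features -> hybrid attention -> GRU -> classifier.
import torch
import torch.nn as nn
from torchvision import models

class HybridAttention(nn.Module):
    """Assumed form: channel attention followed by spatial attention."""
    def __init__(self, channels):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 8), nn.ReLU(),
            nn.Linear(channels // 8, channels), nn.Sigmoid())
        self.spatial_conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                        # x: (B, C, H, W)
        w = self.channel_fc(x.mean(dim=(2, 3)))  # channel weights (B, C)
        x = x * w[:, :, None, None]
        s = torch.sigmoid(self.spatial_conv(x))  # spatial map (B, 1, H, W)
        return x * s

class ExpressionCascade(nn.Module):
    def __init__(self, num_classes=7, hidden=256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # spatial features
        self.attn = HybridAttention(512)
        self.gru = nn.GRU(512, hidden, batch_first=True)           # temporal features
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, clips):                    # clips: (B, T, 3, 224, 224), aligned faces
        b, t = clips.shape[:2]
        feats = self.attn(self.cnn(clips.flatten(0, 1)))           # (B*T, 512, 7, 7)
        feats = feats.mean(dim=(2, 3)).view(b, t, -1)              # pool to (B, T, 512)
        out, _ = self.gru(feats)
        return self.fc(out[:, -1])               # classify from the last time step

logits = ExpressionCascade()(torch.randn(2, 16, 3, 224, 224))      # -> (2, 7)
```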


Optik ◽  
2018 ◽  
Vol 158 ◽  
pp. 1016-1025 ◽  
Author(s):  
Asim Munir ◽  
Ayyaz Hussain ◽  
Sajid Ali Khan ◽  
Muhammad Nadeem ◽  
Sadia Arshid

Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2639 ◽  
Author(s):  
Quan T. Ngo ◽  
Seokhoon Yoon

Facial expression recognition (FER) is a challenging problem in pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and segmentation tasks has shown promise for building automatic deep CNN-based FER models. In real-world scenarios, however, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, as well as a lack of training data and an intrinsic class imbalance in existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques but also proposes a novel loss function, called weighted-cluster loss, used during the fine-tuning phase. Specifically, the weighted-cluster loss simultaneously improves intra-class compactness and inter-class separability by learning a center for each emotion class, and it accounts for dataset imbalance by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained for face identification on the VGGFace2 database from the Visual Geometry Group at Oxford University, is fine-tuned with the proposed loss function to recognize eight basic facial emotions from AffectNet, a database of facial expression, valence, and arousal computing in the wild. Experiments on the real-world AffectNet dataset demonstrate that our method outperforms baseline CNN models that use either weighted-softmax loss or center loss.
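The abstract states two properties of the weighted-cluster loss (learned per-class centers and proportion-based class weights) but not its exact formula. The sketch below is therefore a hedged reconstruction in PyTorch: a center-loss-style clustering term scaled by inverse-frequency class weights and added to softmax cross-entropy. The distance metric, the weighting scheme, and the lam trade-off are assumptions, and the class counts in the usage example are placeholders.

```python
import torch
import torch.nn as nn

class WeightedClusterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, class_counts, lam=0.5):
        super().__init__()
        # One learnable center per emotion class.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        # Assumed weighting: inverse of class frequency, so rare emotions
        # contribute more to the clustering term.
        freq = torch.as_tensor(class_counts, dtype=torch.float)
        self.register_buffer("weights", freq.sum() / (len(freq) * freq))
        self.ce = nn.CrossEntropyLoss()
        self.lam = lam

    def forward(self, features, logits, labels):
        # Pull each feature toward its class center, scaled by class weight.
        diff = features - self.centers[labels]                    # (B, D)
        cluster = (self.weights[labels] * diff.pow(2).sum(dim=1)).mean()
        return self.ce(logits, labels) + self.lam * cluster

# Usage with eight emotion classes and illustrative (not real) counts:
crit = WeightedClusterLoss(num_classes=8, feat_dim=512,
                           class_counts=[9000, 7000, 3000, 1500,
                                         1200, 800, 2500, 600])
feats, logits = torch.randn(4, 512), torch.randn(4, 8)
labels = torch.randint(0, 8, (4,))
loss = crit(feats, logits, labels)
```

Note that the centers are learnable parameters of the loss module itself, so in training they must be handed to the optimizer alongside the network weights.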


Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1863 ◽  
Author(s):  
Samadiani ◽  
Huang ◽  
Cai ◽  
Luo ◽  
Chi ◽  
...  

Facial Expression Recognition (FER) can be widely applied in research areas such as mental disease diagnosis and human social/physiological interaction detection. With emerging advanced hardware and sensor technologies, FER systems are being developed to support real-world application scenes rather than laboratory environments. Although laboratory-controlled FER systems achieve very high accuracy, around 97%, transferring the technology from the laboratory to real-world applications faces a great barrier: accuracy drops to approximately 50%. In this survey, we comprehensively discuss three significant challenges of unconstrained real-world environments, namely illumination variation, head pose, and subject dependence, which may not be resolved by analysing images/videos alone in an FER system. We focus on sensors that can provide extra information and help FER systems detect emotion in both static images and video sequences, and we introduce three categories of sensors that may improve the accuracy and reliability of an expression recognition system by tackling the challenges above beyond pure image/video processing. The first category is detailed-face sensors, which detect small dynamic changes of a facial component; eye-trackers, for example, may help separate background noise from facial features. The second is non-visual sensors, such as audio, depth, and EEG sensors, which provide extra information beyond the visual dimension and improve recognition reliability, for example under illumination variation and position shifts. The last is target-focused sensors, such as infrared thermal sensors, which can help FER systems filter out useless visual content and resist illumination variation. We also discuss methods of fusing the different inputs obtained from multimodal sensors in an emotion system, comparatively review the most prominent multimodal emotional expression recognition approaches, and point out their advantages and limitations. We briefly introduce the benchmark datasets related to FER systems for each category of sensors and extend our survey to open challenges and issues. Meanwhile, we design a framework for an expression recognition system that uses multimodal sensor data (provided by the three categories of sensors) to provide complete information about emotions and assist pure face image/video analysis. We theoretically analyse the feasibility and achievability of this new expression recognition system, especially for use in the wild, and point out future directions for designing an efficient emotional expression recognition system.
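As one concrete instance of the fusion methods such a survey covers, the snippet below sketches decision-level (late) fusion across the three sensor categories discussed above. The modalities, class count, probabilities, and weights are all placeholders; a real system would learn the weights rather than fix them.

```python
import numpy as np

def late_fusion(prob_face, prob_audio, prob_thermal,
                weights=(0.5, 0.3, 0.2)):
    """Weighted average of per-modality class probability distributions."""
    stacked = np.stack([prob_face, prob_audio, prob_thermal])
    fused = np.average(stacked, axis=0, weights=weights)
    return fused.argmax(), fused

# Three modalities each emit a distribution over 7 basic emotions:
p_face = np.array([0.10, 0.60, 0.05, 0.05, 0.10, 0.05, 0.05])    # video stream
p_audio = np.array([0.20, 0.30, 0.10, 0.10, 0.10, 0.10, 0.10])   # audio sensor
p_thermal = np.array([0.15, 0.50, 0.05, 0.10, 0.10, 0.05, 0.05]) # infrared sensor
label, fused = late_fusion(p_face, p_audio, p_thermal)
```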


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Bin Jiang ◽  
Qiuwen Zhang ◽  
Zuhe Li ◽  
Qinggang Wu ◽  
Huanlong Zhang

Abstract: Methods using salient facial patches (SFPs) play a significant role in research on facial expression recognition. However, most SFP methods use only frontal face images or videos for recognition, and they do not consider head position variations. We contend that SFP can be an effective approach for recognizing facial expressions under different head rotations. Accordingly, we propose an algorithm, called profile salient facial patches (PSFP), to achieve this objective. First, to detect facial landmarks and estimate head poses from profile face images, a tree-structured part model is used for pose-free landmark localization. Second, to obtain the salient facial patches from profile face images, patches are selected using the detected facial landmarks while avoiding overlap between patches and keeping them within the actual face range. To analyze PSFP recognition performance, three classical local feature extraction approaches, specifically the histogram of oriented gradients (HOG), local binary patterns (LBP), and Gabor filters, were applied to extract profile facial expression features. Experimental results on the Radboud Faces Database show that PSFP with HOG features can achieve higher accuracies under most head rotations.


2020 ◽  
Author(s):  
Bin Jiang ◽  
Qiuwen Zhang ◽  
Zuhe Li ◽  
Qinggang Wu ◽  
Huanlong Zhang

Abstract: Methods using salient facial patches (SFP) play a significant role in research on facial expression recognition. However, most SFP methods use only frontal face images or videos for recognition and do not consider variations of head position. In our view, SFP can also be a good choice for recognizing facial expressions under different head rotations, and thus we propose an algorithm for this purpose, called Profile Salient Facial Patches (PSFP). First, to detect facial landmarks in profile face images, the tree-structured part model is used for pose-free landmark localization; this approach excels at detecting facial landmarks and estimating head poses. Second, to obtain the salient facial patches from profile face images, patches are selected using the detected facial landmarks, while avoiding overlap with each other and staying within the range of the actual face. To analyze the recognition performance of PSFP, three classical approaches for local feature extraction, namely histogram of oriented gradients (HOG), local binary patterns (LBP), and Gabor filters, were applied to extract profile facial expression features. Experimental results on the Radboud Faces Database show that PSFP with HOG features can achieve higher accuracies under most head rotations.
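To make the patch-based pipeline of the two PSFP abstracts above concrete, the following sketch crops patches around detected landmarks, extracts a HOG descriptor per patch with scikit-image, and concatenates the descriptors for a classifier. Landmark detection via the tree-structured part model is out of scope here, so the landmark coordinates, the patch size, and the linear SVM classifier are all assumptions rather than the authors' exact setup.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def psfp_features(gray_face, landmarks, patch=32):
    """Concatenate HOG descriptors from patches centered on landmarks."""
    h, w = gray_face.shape
    descs = []
    for (x, y) in landmarks:
        # Clamp coordinates so patches stay within the actual face range.
        x0 = int(np.clip(x - patch // 2, 0, w - patch))
        y0 = int(np.clip(y - patch // 2, 0, h - patch))
        crop = gray_face[y0:y0 + patch, x0:x0 + patch]
        descs.append(hog(crop, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)))
    return np.concatenate(descs)

# Illustrative training run on stand-in data (real inputs would be profile
# face crops plus landmarks from the part model):
rng = np.random.default_rng(0)
faces = rng.random((20, 128, 128))                  # stand-in grayscale faces
marks = [[(40, 50), (80, 50), (60, 90)]] * 20       # stand-in landmark sets
X = np.stack([psfp_features(f, m) for f, m in zip(faces, marks)])
y = rng.integers(0, 8, 20)                          # 8 Radboud expression labels
clf = LinearSVC().fit(X, y)
```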

