Face and Eye Detection

Face detection is the most fundamental step for the research on image-based automated face analysis such as face tracking, face recognition, face authentication, facial expression recognition and facial gesture recognition. When a novel face image is given we must know where the face is located, and how large the scale is to limit our concern to the face patch in the image and normalize the scale and orientation of the face patch. Usually, the face detection results are not stable; the scale of the detected face rectangle can be larger or smaller than that of the real face in the image. Therefore, many researchers use eye detectors to obtain stable normalized face images. Because the eyes have salient patterns in the human face image, they can be located stably and used for face image normalization. The eye detection becomes more important when we want to apply model-based face image analysis approaches.

Download Full-text

Human Emotion Recognition Based on Face and Facial Expression Detection Using Deep Belief Network Under Complicated Backgrounds

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420560108 ◽

2020 ◽

Vol 34 (14) ◽

pp. 2056010

Author(s):

Lei Huang ◽

Fei Xie ◽

Jing Zhao ◽

Shibin Shen ◽

Weiran Guang ◽

...

Keyword(s):

Facial Expression ◽

Emotion Recognition ◽

Face Detection ◽

Skin Color ◽

Facial Expression Recognition ◽

Expression Recognition ◽

Human Face ◽

Human Emotion ◽

The Face ◽

Color Enhancement

The human emotion recognition based on facial expression has a significant meaning in the application of intelligent man–machine interaction. However, the human face images vary largely in real environments due to the complex backgrounds and luminance. To solve this problem, this paper proposes a robust face detection method based on skin color enhancement model and a facial expression recognition algorithm with block principal component analysis (PCA). First, the luminance range of human face image is broadened and the contrast ratio of skin color is strengthened by the homomorphic filter. Second, the skin color enhancement model is established using YCbCr color space components to locate the face area. Third, the feature based on differential horizontal integral projection is extracted from the face. Finally, the block PCA with deep neural network is used to accomplish the facial expression recognition. The experimental results indicate that in the case of weaker illumination and more complicated backgrounds, both the face detection and facial expression recognition can be achieved effectively by the proposed algorithm, meanwhile the mean recognition rate obtained by the facial expression recognition method is improved by 2.7% comparing with the traditional Local Binary Patterns (LBPs) method.

Download Full-text

A Three-Dimensional Anisotropic Diffusion Equation-Based Video Recognition Model for Classroom Concentration Evaluation in English Language Teaching

Advances in Mathematical Physics ◽

10.1155/2021/2209526 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yanghong Liu ◽

Jintao Liu

Keyword(s):

Diffusion Equation ◽

Face Detection ◽

Anisotropic Diffusion ◽

Three Dimensional ◽

Face Image ◽

Expression Recognition ◽

Face Images ◽

The Face ◽

Video Recognition ◽

Anisotropic Diffusion Equation

In this paper, a three-dimensional anisotropic diffusion equation is used to conduct an in-depth study and analysis of students’ concentration in video recognition in English teaching classrooms. A multifeature fusion face live detection method based on diffusion model extracts Diffusion Kernel (DK) features and depth features from diffusion-processed face images, respectively. DK features provide a nonlinear description of the correlation between successive face images and express face image sequences in the temporal dimension; depth features are extracted by a pretrained depth neural network model that can express the complex nonlinear mapping relationships of images and reflect the more abstract implicit information inside face images. To improve the effectiveness of the face image features, the extracted DK features and depth features are fused using a multicore learning method to obtain the best combination and the corresponding weights. The two features complement each other, and the fused features are more discriminative, which provides a strong basis for the live determination of face images. Experiments show that the method has excellent performance and can effectively discriminate the live nature of faces in images and resist forged face attacks. Based on the above face detection and expression recognition algorithms, the classroom concentration analysis system based on expression recognition is designed to achieve real-time acquisition and processing of classroom images, complete student classroom attendance records using face detection and face recognition methods, and analyze students’ concentration from the face integrity and facial expression of students facing the blackboard by combining face detection and expression recognition to visualize and display students’ classroom data for teachers, students, and parents with more data support and help.

Download Full-text

Hybrid Attention Cascade Network for Facial Expression Recognition

Sensors ◽

10.3390/s21062003 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2003 ◽

Cited By ~ 1

Author(s):

Xiaoliang Zhu ◽

Shihao Ye ◽

Liang Zhao ◽

Zhicheng Dai

Keyword(s):

Facial Expression ◽

Facial Expressions ◽

Facial Expression Recognition ◽

Expression Recognition ◽

Spatial Features ◽

Face Images ◽

Temporal Features ◽

The Face ◽

In The Wild ◽

Fusion Features

As a sub-challenge of EmotiW (the Emotion Recognition in the Wild challenge), how to improve performance on the AFEW (Acted Facial Expressions in the wild) dataset is a popular benchmark for emotion recognition tasks with various constraints, including uneven illumination, head deflection, and facial posture. In this paper, we propose a convenient facial expression recognition cascade network comprising spatial feature extraction, hybrid attention, and temporal feature extraction. First, in a video sequence, faces in each frame are detected, and the corresponding face ROI (range of interest) is extracted to obtain the face images. Then, the face images in each frame are aligned based on the position information of the facial feature points in the images. Second, the aligned face images are input to the residual neural network to extract the spatial features of facial expressions corresponding to the face images. The spatial features are input to the hybrid attention module to obtain the fusion features of facial expressions. Finally, the fusion features are input in the gate control loop unit to extract the temporal features of facial expressions. The temporal features are input to the fully connected layer to classify and recognize facial expressions. Experiments using the CK+ (the extended Cohn Kanade), Oulu-CASIA (Institute of Automation, Chinese Academy of Sciences) and AFEW datasets obtained recognition accuracy rates of 98.46%, 87.31%, and 53.44%, respectively. This demonstrated that the proposed method achieves not only competitive performance comparable to state-of-the-art methods but also greater than 2% performance improvement on the AFEW dataset, proving the significant outperformance of facial expression recognition in the natural environment.

Download Full-text

Facial Expression Recognition Based on Auxiliary Models

Algorithms ◽

10.3390/a12110227 ◽

2019 ◽

Vol 12 (11) ◽

pp. 227 ◽

Cited By ~ 1

Author(s):

Yingying Wang ◽

Yibin Li ◽

Yong Song ◽

Xuewen Rong

Keyword(s):

Facial Expression ◽

Facial Expressions ◽

Facial Expression Recognition ◽

State Of The Art ◽

Face Image ◽

Great Success ◽

Expression Recognition ◽

Input Information ◽

The Face ◽

Feature Information

In recent years, with the development of artificial intelligence and human–computer interaction, more attention has been paid to the recognition and analysis of facial expressions. Despite much great success, there are a lot of unsatisfying problems, because facial expressions are subtle and complex. Hence, facial expression recognition is still a challenging problem. In most papers, the entire face image is often chosen as the input information. In our daily life, people can perceive other’s current emotions only by several facial components (such as eye, mouth and nose), and other areas of the face (such as hair, skin tone, ears, etc.) play a smaller role in determining one’s emotion. If the entire face image is used as the only input information, the system will produce some unnecessary information and miss some important information in the process of feature extraction. To solve the above problem, this paper proposes a method that combines multiple sub-regions and the entire face image by weighting, which can capture more important feature information that is conducive to improving the recognition accuracy. Our proposed method was evaluated based on four well-known publicly available facial expression databases: JAFFE, CK+, FER2013 and SFEW. The new method showed better performance than most state-of-the-art methods.

Download Full-text

Multi-task Cascaded and Densely Connected Convolutional Networks Applied to Human Face Detection and Facial Expression Recognition System

2019 International Automatic Control Conference (CACS) ◽

10.1109/cacs47674.2019.9024357 ◽

2019 ◽

Cited By ~ 1

Author(s):

Kuan-Yu Chou ◽

Yi-Wen Cheng ◽

Wei-Ren Chen ◽

Yon-Ping Chen

Keyword(s):

Facial Expression ◽

Face Detection ◽

Facial Expression Recognition ◽

Recognition System ◽

Expression Recognition ◽

Human Face ◽

Convolutional Networks ◽

Human Face Detection

Download Full-text

Deep-Emotion: Facial Expression Recognition Using Attentional Convolutional Network

Sensors ◽

10.3390/s21093046 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3046

Author(s):

Shervin Minaee ◽

Mehdi Minaei ◽

Amirali Abdolrashidi

Keyword(s):

Deep Learning ◽

Facial Expression ◽

Facial Expression Recognition ◽

Expression Recognition ◽

Visualization Technique ◽

Convolutional Network ◽

The Past ◽

Multiple Datasets ◽

The Face ◽

Traditional Approaches

Facial expression recognition has been an active area of research over the past few decades, and it is still challenging due to the high intra-class variation. Traditional approaches for this problem rely on hand-crafted features such as SIFT, HOG, and LBP, followed by a classifier trained on a database of images or videos. Most of these works perform reasonably well on datasets of images captured in a controlled condition but fail to perform as well on more challenging datasets with more image variation and partial faces. In recent years, several works proposed an end-to-end framework for facial expression recognition using deep learning models. Despite the better performance of these works, there are still much room for improvement. In this work, we propose a deep learning approach based on attentional convolutional network that is able to focus on important parts of the face and achieves significant improvement over previous models on multiple datasets, including FER-2013, CK+, FERG, and JAFFE. We also use a visualization technique that is able to find important facial regions to detect different emotions based on the classifier’s output. Through experimental results, we show that different emotions are sensitive to different parts of the face.

Download Full-text

Facial Expression Recognition Based on Multi-Features Cooperative Deep Convolutional Network

Applied Sciences ◽

10.3390/app11041428 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1428

Author(s):

Haopeng Wu ◽

Zhiying Lu ◽

Jianfeng Zhang ◽

Xin Li ◽

Mingyue Zhao ◽

...

Keyword(s):

Facial Expression ◽

Facial Expressions ◽

Facial Expression Recognition ◽

Video Data ◽

Expression Recognition ◽

Convolutional Network ◽

Facial Movements ◽

The Face ◽

Deep Convolutional Network ◽

Selection Of

This paper addresses the problem of Facial Expression Recognition (FER), focusing on unobvious facial movements. Traditional methods often cause overfitting problems or incomplete information due to insufficient data and manual selection of features. Instead, our proposed network, which is called the Multi-features Cooperative Deep Convolutional Network (MC-DCN), maintains focus on the overall feature of the face and the trend of key parts. The processing of video data is the first stage. The method of ensemble of regression trees (ERT) is used to obtain the overall contour of the face. Then, the attention model is used to pick up the parts of face that are more susceptible to expressions. Under the combined effect of these two methods, the image which can be called a local feature map is obtained. After that, the video data are sent to MC-DCN, containing parallel sub-networks. While the overall spatiotemporal characteristics of facial expressions are obtained through the sequence of images, the selection of keys parts can better learn the changes in facial expressions brought about by subtle facial movements. By combining local features and global features, the proposed method can acquire more information, leading to better performance. The experimental results show that MC-DCN can achieve recognition rates of 95%, 78.6% and 78.3% on the three datasets SAVEE, MMI, and edited GEMEP, respectively.

Download Full-text

Enhanced use of gaze cue in a face-following task after brief trial experience in individuals with autism spectrum disorder

Scientific Reports ◽

10.1038/s41598-021-90230-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Takao Fukui ◽

Mrinmoy Chakrabarty ◽

Misako Sano ◽

Ari Tanaka ◽

Mayuko Suzuki ◽

...

Keyword(s):

Autism Spectrum ◽

Face Image ◽

Fixation Time ◽

Gaze Shift ◽

The Gaze ◽

Face Images ◽

Trial Experience ◽

Short Period ◽

The Face ◽

Gaze Cues

AbstractEye movements toward sequentially presented face images with or without gaze cues were recorded to investigate whether those with ASD, in comparison to their typically developing (TD) peers, could prospectively perform the task according to gaze cues. Line-drawn face images were sequentially presented for one second each on a laptop PC display, and the face images shifted from side-to-side and up-and-down. In the gaze cue condition, the gaze of the face image was directed to the position where the next face would be presented. Although the participants with ASD looked less at the eye area of the face image than their TD peers, they could perform comparable smooth gaze shift to the gaze cue of the face image in the gaze cue condition. This appropriate gaze shift in the ASD group was more evident in the second half of trials in than in the first half, as revealed by the mean proportion of fixation time in the eye area to valid gaze data in the early phase (during face image presentation) and the time to first fixation on the eye area. These results suggest that individuals with ASD may benefit from the short-period trial experiment by enhancing the usage of gaze cue.

Download Full-text

RoboCoDraw: Robotic Avatar Drawing with GAN-Based Style Transfer and Time-Efficient Path Optimization

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i06.6609 ◽

2020 ◽

Vol 34 (06) ◽

pp. 10402-10409

Author(s):

Tianying Wang ◽

Wei Qi Toh ◽

Hao Zhang ◽

Xiuchao Sui ◽

Shaohua Li ◽

...

Keyword(s):

Face Image ◽

Path Optimization ◽

Robotic Arm ◽

Human Face ◽

Generative Adversarial Network ◽

Style Transfer ◽

Adversarial Network ◽

Face Images ◽

Drawing System ◽

Collaborative Robot

Robotic drawing has become increasingly popular as an entertainment and interactive tool. In this paper we present RoboCoDraw, a real-time collaborative robot-based drawing system that draws stylized human face sketches interactively in front of human users, by using the Generative Adversarial Network (GAN)-based style transfer and a Random-Key Genetic Algorithm (RKGA)-based path optimization. The proposed RoboCoDraw system takes a real human face image as input, converts it to a stylized avatar, then draws it with a robotic arm. A core component in this system is the AvatarGAN proposed by us, which generates a cartoon avatar face image from a real human face. AvatarGAN is trained with unpaired face and avatar images only and can generate avatar images of much better likeness with human face images in comparison with the vanilla CycleGAN. After the avatar image is generated, it is fed to a line extraction algorithm and converted to sketches. An RKGA-based path optimization algorithm is applied to find a time-efficient robotic drawing path to be executed by the robotic arm. We demonstrate the capability of RoboCoDraw on various face images using a lightweight, safe collaborative robot UR5.

Download Full-text

Human Face Detection in Color Images

10.32920/ryerson.14665272 ◽

2021 ◽

Author(s):

Jun Gao

Keyword(s):

Face Detection ◽

Skin Color ◽

Video Retrieval ◽

Color Images ◽

Human Face ◽

Detection Scheme ◽

Retrieval Systems ◽

Human Face Detection ◽

The Face ◽

Video Retrieval Systems

Detection of human face has many realistic and important applications such as human and computer interface, face recognition, face image database management, security access control systems and content-based indexing video retrieval systems. In this report a face detection scheme will be presented. The scheme is designed to operate on color images. In the first stage of algorithm, the skin color regions are detected based on the chrominance information. A color segmentation stage is then employed to make skin color regions to be divided into smaller regions which have homogenous color. Then, we use the iterative luminance segmentation to further separate the detected skin region from other skin-colored objects such as hair, clothes, and wood, based on the high variance of the luminance component in the neighborhood of edges of objects. Post-processing is applied to determine whether skin color regions fit the face constrains on density of skin, size, shape and symmetry and contain the facial features such as eyes and mouths. Experimental results show that the algorithm is robust and is capable of detecting multiple faces in the presence of a complex background which contains the color similar to the skin tone.

Download Full-text