Extracting Dynamic Facial Expressions from Naturalistic Videos
Facial expressions play a critical role in our daily interactions. Studying how humans recognize dynamic facial expressions is an important area of research in social perception, but progress is hampered by the difficulty of creating well-controlled stimuli. Research on the perception of static faces has advanced considerably thanks to techniques for generating synthetic face stimuli. Synthetic dynamic expressions, however, are harder to produce: methods that yield realistic dynamics typically rely on infrared markers applied to the face, which makes it expensive to create datasets covering large numbers of different expressions, and the markers themselves may interfere with facial dynamics. In this paper, we contribute a new method for generating large numbers of realistic, well-controlled facial expression videos. We use a deep convolutional neural network with attention and an asymmetric loss to extract the dynamics of facial action units from videos, and demonstrate that this approach outperforms a baseline convolutional neural network without attention on the same stimuli. We then develop a pipeline that uses the extracted action unit dynamics to render realistic synthetic videos. This pipeline makes it possible to generate large-scale, naturalistic, and controllable facial expression datasets to facilitate future research in social cognitive science.
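The abstract does not spell out the asymmetric loss, but the motivation is a standard one: action unit labels are heavily imbalanced, since most AUs are inactive in most frames, and a symmetric binary cross-entropy lets easy negatives dominate training. Below is a minimal sketch of one well-known asymmetric formulation (Ben-Baruch et al.'s asymmetric loss for multi-label classification), written in PyTorch; the class name, hyperparameter defaults, and the toy AU data are illustrative assumptions, not the authors' actual code.

```python
import torch
import torch.nn as nn


class AsymmetricLoss(nn.Module):
    """Asymmetric multi-label loss: focuses harder on negatives than positives.

    With gamma_neg > gamma_pos, well-classified negatives (the common case for
    sparse AU labels) are strongly down-weighted; `clip` additionally shifts
    negative probabilities so very easy negatives contribute ~zero loss.
    """

    def __init__(self, gamma_pos=0.0, gamma_neg=4.0, clip=0.05, eps=1e-8):
        super().__init__()
        self.gamma_pos = gamma_pos  # focusing exponent for positive labels
        self.gamma_neg = gamma_neg  # stronger focusing exponent for negatives
        self.clip = clip            # probability margin for negatives
        self.eps = eps              # numerical floor inside log()

    def forward(self, logits, targets):
        # logits, targets: (batch, num_AUs); targets are binary {0, 1}
        p = torch.sigmoid(logits)
        # Probability shifting: treat negatives with p < clip as fully solved
        p_neg = (p - self.clip).clamp(min=0.0)

        loss_pos = targets * (1 - p) ** self.gamma_pos \
            * torch.log(p.clamp(min=self.eps))
        loss_neg = (1 - targets) * p_neg ** self.gamma_neg \
            * torch.log((1 - p_neg).clamp(min=self.eps))
        return -(loss_pos + loss_neg).mean()


# Toy check: 8 frames, 12 action units, sparse positives as with typical AU data
logits = torch.randn(8, 12)
targets = (torch.rand(8, 12) > 0.8).float()
loss = AsymmetricLoss()(logits, targets)
print(loss.item())
```

In a training loop, `logits` would come from the attention-based convolutional network described above; the asymmetric weighting keeps the gradient signal from being swamped by the many frames in which a given action unit is absent.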