Dual Semi-Supervised Learning for Facial Action Unit Recognition

Author(s):  
Guozhu Peng ◽  
Shangfei Wang

Current works on facial action unit (AU) recognition typically require fully AU-labeled training samples. To reduce the reliance on time-consuming manual AU annotations, we propose a novel semi-supervised AU recognition method leveraging two kinds of readily available auxiliary information. The first kind is the dependencies between AUs and expressions as well as the dependencies among AUs, which are caused by facial anatomy and are therefore embedded in all facial images, independent of their AU annotation status. The other kind is facial image synthesis given AUs, the dual task of AU recognition from facial images, which therefore has intrinsic probabilistic connections with AU recognition, regardless of AU annotations. Specifically, we propose a dual semi-supervised generative adversarial network for AU recognition from partially AU-labeled and fully expression-labeled facial images. The proposed network consists of an AU classifier C, an image generator G, and a discriminator D. In addition to minimizing the supervised losses of the AU classifier and the face generator on labeled training data, we exploit the probabilistic duality between the tasks using adversarial learning, forcing the distribution of face-AU-expression tuples produced by the AU classifier and the face generator to converge to the ground-truth distribution of the labeled data, for all training data. This joint distribution also captures the inherent AU dependencies. Furthermore, we reconstruct the facial image by using the output of the AU classifier as the input of the face generator, and recover AU labels by feeding the output of the face generator to the AU classifier. We minimize these reconstruction losses for all training data, thus exploiting the informative feedback provided by the dual tasks. Within-database and cross-database experiments on three benchmark databases demonstrate the superiority of our method in both AU recognition and face synthesis compared to state-of-the-art works.
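
To make the interaction of the three components concrete, the following is a minimal PyTorch-style sketch of how the supervised, adversarial, and dual-reconstruction terms could be combined; the module signatures, argument names, and loss weights are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the combined objective; C, G, D are assumed to be
# torch.nn.Module instances with the signatures used below.
import torch
import torch.nn.functional as F

def classifier_generator_loss(C, G, D, face, expr, au=None, lambda_rec=1.0):
    """Loss for one batch. face: images, expr: expression labels (always
    available), au: AU labels or None for unlabeled samples."""
    au_pred = C(face)                             # AU recognition
    au_input = au if au is not None else au_pred  # drive G with labels when available
    face_gen = G(au_input, expr)                  # dual task: face synthesis

    loss = 0.0
    if au is not None:                            # supervised terms, labeled data only
        loss = loss + F.binary_cross_entropy(au_pred, au) + F.l1_loss(face_gen, face)

    # Adversarial term: push generated (face, AU, expression) tuples toward
    # the ground-truth joint distribution estimated from labeled data.
    loss = loss - torch.log(D(face_gen, au_pred, expr) + 1e-8).mean()

    # Dual reconstruction terms, applied to all training data.
    loss = loss + lambda_rec * F.l1_loss(G(au_pred, expr), face)                        # face -> AU -> face
    loss = loss + lambda_rec * F.binary_cross_entropy(C(face_gen), au_pred.detach())    # AU -> face -> AU
    return loss
```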

2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Huan Yang ◽  
Pengjiang Qian ◽  
Chao Fan

Multimodal registration is a challenging task due to the significant variations exhibited by images of different modalities. CT and MRI are two of the most commonly used medical images in clinical diagnosis, since multicontrast MRI, together with CT, can provide complementary auxiliary information. Deformable image registration between MRI and CT is essential for analyzing the relationships among images of different modalities. Here, we propose an indirect multimodal image registration method, i.e., an sCT-guided multimodal image registration and problematic image completion method. In addition, we design a deep learning-based generative network, the Conditional Auto-Encoder Generative Adversarial Network (CAE-GAN), which combines the ideas of the VAE and the GAN under a conditional process to tackle the problem of synthetic CT (sCT) synthesis. Our main contributions can be summarized in three aspects: (1) We designed a new generative network, CAE-GAN, which incorporates the advantages of two popular image synthesis methods, the VAE and the GAN, and produces high-quality synthetic images with limited training data. (2) We utilized the sCT generated from multicontrast MRI as an intermediary to transform multimodal MRI-CT registration into monomodal sCT-CT registration, which greatly reduces the registration difficulty. (3) Using the normal CT as guidance and reference, we repaired the abnormal MRI while registering it to the normal CT.
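
For intuition, a conditional VAE-GAN objective in the spirit of CAE-GAN could look like the sketch below; the module names, conditioning scheme, and loss weights are assumptions rather than the paper's code. Once the sCT is produced this way, the MRI-CT problem reduces to a monomodal sCT-CT registration, as described in contribution (2).

```python
# Illustrative conditional VAE-GAN loss: enc/dec form the conditional
# auto-encoder, disc is the adversarial critic; mri conditions the synthesis.
import torch
import torch.nn.functional as F

def cae_gan_loss(enc, dec, disc, mri, ct, lambda_kl=0.01, lambda_rec=10.0):
    mu, logvar = enc(mri, ct)                                  # encode CT conditioned on MRI
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization trick
    sct = dec(z, mri)                                          # synthetic CT from latent + MRI

    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # VAE prior term
    rec = F.l1_loss(sct, ct)                                       # reconstruction term
    adv = -torch.log(disc(sct, mri) + 1e-8).mean()                 # GAN realism term
    return lambda_kl * kl + lambda_rec * rec + adv
```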


2021 ◽  
Vol 11 (23) ◽  
pp. 11171
Author(s):  
Shushi Namba ◽  
Wataru Sato ◽  
Sakiko Yoshikawa

Automatic facial action detection is important, but no previous studies have evaluated the accuracy of pre-trained facial action detection models as the angle of the face changes from frontal to profile. Using static facial images obtained at various angles (0°, 15°, 30°, and 45°), we investigated the performance of three automated facial action detection systems (FaceReader, OpenFace, and Py-Feat). The overall performance was best for OpenFace, followed by FaceReader and Py-Feat. The performance of FaceReader decreased significantly at 45° compared with the other angles, while the performance of Py-Feat did not differ among the four angles. The performance of OpenFace decreased as the target face turned sideways. Prediction accuracy and robustness to angle changes varied with the target facial components and the detection system.
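
The per-angle comparison described above can be tabulated in a few lines once predictions are collected; the sketch below assumes a hypothetical CSV with one row per detector, angle, and AU, which is not part of the study's materials.

```python
# Sketch of a per-angle, per-AU accuracy table from tabulated predictions.
import pandas as pd

df = pd.read_csv("au_predictions.csv")  # assumed columns: detector, angle, au, predicted, ground_truth

accuracy = (
    df.assign(correct=df["predicted"] == df["ground_truth"])
      .groupby(["detector", "angle", "au"])["correct"]
      .mean()              # detection accuracy per detector, face angle, and AU
      .unstack("angle")    # one column per angle (0, 15, 30, 45 degrees)
)
print(accuracy)
```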


2019 ◽  
Vol 30 (6) ◽  
pp. 893-906 ◽  
Author(s):  
Zachary Witkower ◽  
Jessica L. Tracy

Research on face perception tends to focus on facial morphology and the activation of facial muscles while ignoring any impact of head position. We raise questions about this approach by demonstrating that head movements can dramatically shift the appearance of the face to shape social judgments without engaging facial musculature. In five studies (total N = 1,517), we found that when eye gaze was directed forward, tilting one’s head downward (compared with a neutral angle) increased perceptions of dominance, and this effect was due to the illusory appearance of lowered and V-shaped eyebrows caused by a downward head tilt. Tilting one’s head downward therefore functions as an action-unit imposter, creating the artificial appearance of a facial action unit that has a strong effect on social perception. Social judgments about faces are therefore driven not only by facial shape and musculature but also by movements in the face’s physical foundation: the head.


2018 ◽  
Author(s):  
Gongbo Liang ◽  
Sajjad Fouladvand ◽  
Jie Zhang ◽  
Michael A. Brooks ◽  
Nathan Jacobs ◽  
...  

Computed tomography (CT) is a widely used diagnostic image modality routinely used for assessing anatomical tissue characteristics. However, non-standardized imaging protocols are commonplace, which poses a fundamental challenge to large-scale cross-center CT image analysis and harms the reproducibility of radiomic features, such as intensity. One approach to address the problem is to standardize CT images using generative adversarial network (GAN) models. A GAN learns the data distribution of training images and generates synthesized images under the same distribution. However, existing GAN models are not directly applicable to this task, mainly due to the lack of constraints on the mode of data to generate. Furthermore, they treat every image equally, but in real applications some images are more difficult to standardize than others. All of this may lead to a lack-of-detail problem in CT image synthesis. We present a new GAN model called GANai to mitigate the differences in radiomic features across CT images captured using non-standard imaging protocols. Given source images, GANai composes new images by specifying a high-level goal that the image features of the synthesized images should be similar to those of the standard images. GANai introduces an alternating improvement training strategy to alternately and steadily improve model performance. The new training strategy enables a series of technical improvements, including phase-specific loss functions, phase-specific training data, and the adoption of ensemble learning, leading to better model performance. The experimental results show that GANai is significantly better than the existing state-of-the-art image synthesis algorithms on CT image standardization. It also significantly improves the efficiency and stability of GAN model training.
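
To illustrate the alternating, phase-wise idea, the sketch below switches between a generator phase and a discriminator phase with phase-specific losses; the concrete loss forms, step counts, and module interfaces are assumptions for illustration, not the published GANai implementation.

```python
# Conceptual alternating training loop: G is the standardizing generator,
# D the discriminator, loader yields (source, standard) CT image pairs.
import torch
import torch.nn.functional as F

def train_alternating(G, D, loader, opt_g, opt_d, n_cycles=10, steps_per_phase=100):
    for cycle in range(n_cycles):
        # Generator phase: D is held fixed; G is pushed so synthesized images
        # resemble standard-protocol images in appearance and realism.
        for (source, standard), _ in zip(loader, range(steps_per_phase)):
            fake = G(source)
            g_loss = F.l1_loss(fake, standard) - torch.log(D(fake) + 1e-8).mean()
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()

        # Discriminator phase: G is held fixed; D is refit on the current
        # generator's outputs (phase-specific training data) before the next cycle.
        for (source, standard), _ in zip(loader, range(steps_per_phase)):
            with torch.no_grad():
                fake = G(source)
            d_loss = -(torch.log(D(standard) + 1e-8).mean()
                       + torch.log(1 - D(fake) + 1e-8).mean())
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```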


2020 ◽  
Vol 10 (7) ◽  
pp. 2628 ◽  
Author(s):  
Hyeon Kang ◽  
Jang-Sik Park ◽  
Kook Cho ◽  
Do-Young Kang

Conventional data augmentation (DA) techniques, which have been used to improve the performance of predictive models when balanced training data sets are lacking, require effort to define the proper repeating operations (e.g., rotation and mirroring) according to the target class distribution. Although DA using a generative adversarial network (GAN) has the potential to overcome the disadvantages of conventional DA, there are few cases where this technique has been applied to medical images and, in particular, few cases where quantitative evaluation was used to determine whether the generated images had enough realism and diversity to be used for DA. In this study, we synthesized 18F-Florbetaben (FBB) images using a conditional GAN. The generated images were evaluated using various measures, and we report the conditions and quantitative similarity values under which the generated images can be expected to augment the data successfully for DA. The method includes (1) a conditional WGAN-GP to learn the axial image distribution extracted from pre-processed 3D FBB images, (2) a pre-trained DenseNet121 and model-agnostic metrics for visual and quantitative measurement of the generated image distribution, and (3) a machine learning model for observing the improvement in generalization performance brought by the generated dataset. The Visual Turing test showed similarity in the descriptions of typical patterns of amyloid deposition for each of the generated images. However, differences in similarity and classification performance per axial level were observed that did not agree with the visual evaluation. Experimental results demonstrated that quantitative measurements were able to detect the similarity between the two distributions and to observe mode collapse better than the Visual Turing test and t-SNE.
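
The WGAN-GP component in step (1) relies on the standard gradient penalty, which constrains the critic's gradient norm on interpolated samples; a minimal sketch follows, with the variable names, conditioning argument, and penalty weight as assumptions rather than the study's exact code.

```python
# WGAN-GP gradient penalty for a conditional critic on 4D image batches.
import torch

def gradient_penalty(critic, real, fake, cond, gp_weight=10.0):
    """Penalize the critic so its gradient norm stays near 1 on interpolates."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = critic(interp, cond)                      # conditional critic score
    grads = torch.autograd.grad(outputs=score.sum(), inputs=interp,
                                create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return gp_weight * ((grad_norm - 1) ** 2).mean()
```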


2021 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Rifki Kosasih

Attendance records are typically used to determine whether an employee is present. Attendance can be taken in several ways, one of which is by filling in a provided attendance list (manual attendance). However, this method is less effective because absent employees may ask employees who are present to sign in on their behalf. Therefore, other approaches are needed to prevent this. In this study, attendance was taken using facial recognition. Face recognition is one of the fields used to recognize a person. A person's face usually has special characteristics that are easily recognized by others; these special characteristics are also called features. In this study, these features are extracted using the Principal Component Analysis (PCA) method. PCA produces features by reducing the dimensionality of facial images using their eigenvectors (eigenfaces). The facial image set used in this study consisted of 40 people, each with 10 facial images showing various expressions. The image data are divided into two parts, namely training data and test data. We examine how the amount of training data and the number of eigenvectors used affect the accuracy. The results show that the highest accuracy, 96.67%, occurs when 7 images per person are used for training and 3 per person for testing.
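
A compact eigenface-style pipeline of this kind can be sketched with scikit-learn as below; the random placeholder data, the choice of 50 eigenvectors, and the nearest-neighbor classifier are assumptions for illustration, while the 7/3 split per person follows the study's best-performing setting.

```python
# Eigenface sketch: PCA for feature extraction, nearest neighbor for identification.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# X: flattened face images (n_samples, n_pixels); y: person identities.
# Random data stands in here for the 40-person, 10-image-per-person database.
rng = np.random.default_rng(0)
X = rng.random((400, 64 * 64))
y = np.repeat(np.arange(40), 10)

# 7 images per person for training, 3 per person for testing.
train_mask = np.tile(np.arange(10) < 7, 40)
pca = PCA(n_components=50).fit(X[train_mask])          # eigenvectors of faces = eigenfaces
clf = KNeighborsClassifier(n_neighbors=1).fit(pca.transform(X[train_mask]), y[train_mask])
pred = clf.predict(pca.transform(X[~train_mask]))
print("accuracy:", accuracy_score(y[~train_mask], pred))
```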


Author(s):  
R. L. Palmer ◽  
P. Helmholz ◽  
G. Baynam

Abstract. Facial appearance has long been understood to offer insight into a person’s health. To an experienced clinician, atypical facial features may signify the presence of an underlying rare or genetic disease. Clinicians use their knowledge of how disease affects facial appearance, along with the patient’s physiological and behavioural traits and their medical history, to determine a diagnosis. Specialist expertise and experience are needed to make a dysmorphological facial analysis. Key to this is accurately assessing how a face differs significantly in shape and/or growth from expected norms. Modern photogrammetric systems can acquire detailed 3D images of the face, which can be used to conduct a facial analysis in software with greater precision than can be obtained in person. Measurements from 3D facial images are already used as an alternative to direct measurement using instruments such as tape measures, rulers, or callipers. However, the ability to take accurate measurements – whether virtual or not – presupposes the assessor’s ability to accurately place the endpoints of the measuring tool at the positions of standardised anatomical facial landmarks. In this paper, we formally introduce Cliniface – a free and open source application that uses a recently published, highly precise method of detecting facial landmarks from 3D facial images by non-rigidly transforming an anthropometric mask (AM) to the target face. Inter-landmark measurements are then used to automatically identify facial traits that may be of clinical significance. Herein, we show how non-experts with minimal guidance can use Cliniface to extract facial anthropometrics from a 3D facial image at a level of accuracy comparable to that of an expert. We further show that Cliniface itself is able to extract the same measurements at a similar level of accuracy – completely automatically.
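
Once landmarks have been detected on the 3D surface, inter-landmark measurements reduce to distances between 3D points; the sketch below illustrates this step only, with example landmark names and coordinates that are not taken from Cliniface or the paper.

```python
# Illustrative inter-landmark measurements from 3D facial landmark coordinates.
import numpy as np

landmarks = {                      # 3D coordinates in millimetres (example values only)
    "exocanthion_left":  np.array([-45.0,  32.0, 10.0]),
    "exocanthion_right": np.array([ 45.0,  32.0, 10.0]),
    "subnasale":         np.array([  0.0,  -5.0, 28.0]),
    "gnathion":          np.array([  0.0, -70.0, 12.0]),
}

def distance(a, b):
    """Straight-line inter-landmark distance in millimetres."""
    return float(np.linalg.norm(landmarks[a] - landmarks[b]))

outer_canthal_width = distance("exocanthion_left", "exocanthion_right")
lower_face_height = distance("subnasale", "gnathion")
# Such measurements can then be compared against reference norms to flag
# traits of potential clinical significance.
print(outer_canthal_width, lower_face_height)
```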


Perception ◽  
1995 ◽  
Vol 24 (5) ◽  
pp. 563-575 ◽  
Author(s):  
Masami K Yamaguchi ◽  
Tastu Hirukawa ◽  
So Kanazawa

Japanese male and female undergraduate students judged the gender of a variety of facial images. These images were combinations of the following facial parts: eyebrows, eyes, nose, mouth, and the face outline (cheek and chin). These parts were extracted from averaged facial images of Japanese males and females aged 18 and 19 years by means of the Facial Image Processing System. The results suggested that, in identifying gender, subjects performed identification on the basis of the eyebrows and the face outline, and both males and females were more likely to identify the faces as those of their own gender. The results are discussed in relation to previous studies, with particular attention paid to the matter of race differences.


2020 ◽  
Vol 10 (6) ◽  
pp. 1995 ◽  
Author(s):  
Jeong gi Kwak ◽  
Hanseok Ko

The processing of facial images is an important task, because it is required for a large number of real-world applications. As deep-learning models evolve, they require a huge number of images for training. In reality, however, the number of images available is limited. Generative adversarial networks (GANs) have thus been utilized for database augmentation, but they suffer from unstable training, low visual quality, and a lack of diversity. In this paper, we propose an auto-encoder-based GAN with an enhanced network structure and training scheme for database (DB) augmentation and image synthesis. Our generator and decoder are divided into two separate modules that each take input vectors for low-level and high-level features; these input vectors affect all layers within the generator and decoder. The effectiveness of the proposed method is demonstrated by comparing it with baseline methods. In addition, based on the auto-encoder structure of the discriminator in our model, we introduce a new scheme that can combine two existing images without the need for extra networks. We add a novel double-constraint loss to make the encoded latent vectors equal to the input vectors.
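
The double-constraint idea can be summarized as follows: the encoder half of the auto-encoder discriminator should recover both input vectors that conditioned the generator. The sketch below illustrates one way such a loss could be written; the module names and the mean-squared form are assumptions, not the paper's code.

```python
# Double-constraint sketch: encoded latents of a generated image should match
# the low-level and high-level input vectors that produced it.
import torch.nn.functional as F

def double_constraint_loss(G, D_enc, z_low, z_high):
    fake = G(z_low, z_high)               # generator driven by low/high-level vectors
    z_low_hat, z_high_hat = D_enc(fake)   # encoder half of the auto-encoder discriminator
    return F.mse_loss(z_low_hat, z_low) + F.mse_loss(z_high_hat, z_high)
```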


2020 ◽  
Vol 402 ◽  
pp. 359-365 ◽  
Author(s):  
Jijun He ◽  
Jinjin Zheng ◽  
Yuan Shen ◽  
Yutang Guo ◽  
Hongjun Zhou
