Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization

Author(s):  
Martin Köstinger ◽  
Paul Wohlhart ◽  
Peter M. Roth ◽  
Horst Bischof

2020 ◽  
Vol 34 (07) ◽  
pp. 12621-12628 ◽  
Author(s):  
Jing Yang ◽  
Adrian Bulat ◽  
Georgios Tzimiropoulos

It is known that facial landmarks provide pose, expression and shape information. In addition, when matching, for example, a profile or expressive face to a frontal one, knowledge of these landmarks is useful for establishing correspondence, which can help improve recognition. However, in prior work on face recognition, facial landmarks are used only for face cropping, in order to remove scale, rotation and translation variations. This paper proposes a simple approach to face recognition that gradually integrates features from different layers of a facial landmark localization network into different layers of the recognition network. To this end, we propose an appropriate feature integration layer that makes the features compatible before integration. We show that such a simple approach systematically improves recognition on the most difficult face recognition datasets, setting a new state of the art on the IJB-B, IJB-C and MegaFace datasets.
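The abstract above does not specify the internals of the feature integration layer, but the idea it describes can be sketched numerically: project the landmark network's features to the recognition network's channel count (a 1x1-convolution-like linear map) so they are compatible, then add them in. All shapes, names and the random "weights" below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical intermediate feature maps (batch, channels, height, width);
# the real networks and channel counts are not given in the abstract.
rec_feat = rng.standard_normal((1, 64, 14, 14))   # recognition-network layer
lmk_feat = rng.standard_normal((1, 32, 14, 14))   # landmark-network layer

def integrate(rec, lmk, weight):
    """Make landmark features compatible (project 32 -> 64 channels with a
    linear map, equivalent to a 1x1 convolution) and add them to the
    recognition features."""
    projected = np.einsum('oc,bchw->bohw', weight, lmk)
    return rec + projected

# Random projection weights standing in for learned parameters.
w = rng.standard_normal((64, 32)) * 0.1
fused = integrate(rec_feat, lmk_feat, w)
print(fused.shape)  # (1, 64, 14, 14)
```

The fused map has the recognition network's shape, so it can replace the original features at that layer; repeating this at several depths gives the "gradual" integration the abstract mentions.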


2020 ◽  
Vol 2020 (8) ◽  
pp. 185-1-185-6
Author(s):  
Ruiyi Mao ◽  
Qian Lin ◽  
Jan P. Allebach

Facial landmark localization plays a critical role in many face analysis tasks. In this paper, we present a novel local-global aggregate network (LGA-Net) for robust facial landmark localization of faces in the wild. The network consists of two convolutional neural network levels that aggregate local and global information for better prediction accuracy and robustness. Experimental results show that our method overcomes typical problems of cascaded networks and outperforms state-of-the-art methods on the 300-W [1] benchmark.
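The abstract describes two network levels whose local and global information are aggregated, without giving the aggregation rule. One plausible reading, sketched below with synthetic numbers, is that a global level produces coarse landmark coordinates over the whole face while a local level produces per-patch refinements, and the two are blended. The mixing weight and all data here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_landmarks = 68  # the 300-W benchmark uses a 68-point annotation scheme

# Stand-ins for the two levels' outputs: the global level sees the whole
# face; the local level sees patches around each coarse landmark.
global_pred = rng.uniform(0, 256, (n_landmarks, 2))   # coarse (x, y) in pixels
local_offsets = rng.normal(0, 2, (n_landmarks, 2))    # per-patch refinements

def aggregate(global_xy, local_dxy, alpha=0.5):
    """Blend the global estimate with the locally refined one.
    alpha is a hypothetical mixing weight, not a value from the paper."""
    refined = global_xy + local_dxy
    return alpha * global_xy + (1 - alpha) * refined

final = aggregate(global_pred, local_offsets)
print(final.shape)  # (68, 2)
```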


2020 ◽  
Author(s):  
Diego L Guarin ◽  
Babak Taati ◽  
Tessa Hadlock ◽  
Yana Yunusova

Abstract
Background: Automatic facial landmark localization in videos is an important first step in many computer vision applications, including the objective assessment of orofacial function. Convolutional neural networks (CNNs) for facial landmark localization are typically trained on faces of healthy young adults, so model performance is inferior when applied to faces of older adults or of people with diseases that affect facial movement, a phenomenon known as algorithmic bias. Fine-tuning pre-trained CNN models with representative data is a well-known technique for reducing algorithmic bias and improving performance on clinical populations. However, the question of how much data is needed to properly fine-tune the model remains open.
Methods: In this paper, we fine-tuned a popular CNN model for automatic facial landmark localization using different numbers of manually annotated photographs of patients with facial palsy, and evaluated the effect of the number of photographs used for fine-tuning on model performance by computing the normalized root mean squared error between the facial landmark positions predicted by the model and those provided by manual annotators. Furthermore, we studied the effect of annotator bias by fine-tuning and evaluating the model with data provided by multiple annotators.
Results: Our results showed that fine-tuning the model with as few as 8 photographs from a single patient significantly improved performance on other individuals from the same clinical population, and that the best performance was achieved by fine-tuning with 320 photographs from 40 patients. Using more photographs for fine-tuning did not improve performance further. Regarding annotator bias, we found that fine-tuning a CNN model with data from a single annotator resulted in models biased against other annotators; our results also showed that this effect can be diminished by averaging data from multiple annotators.
Conclusions: It is possible to remove the algorithmic bias of a CNN model for automatic facial landmark localization using data from only 40 participants (320 photographs in total). These results pave the way for future clinical applications of CNN models for the automatic assessment of orofacial function in different clinical populations, including patients with Parkinson's disease and stroke.
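The evaluation metric and the annotator-averaging step described above can be sketched as follows. Normalizing the per-landmark error by the inter-ocular distance (outer eye corners, indices 36/45 in the 68-point scheme) is a common convention; the paper's exact normalizer, and all synthetic data below, are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in data: one manual annotation and a noisy model prediction.
gt = rng.uniform(0.0, 256.0, (68, 2))        # annotated landmark positions
pred = gt + rng.normal(0.0, 3.0, (68, 2))    # predicted positions with noise

def nrmse(pred, gt):
    """Normalized root mean squared error between predicted and annotated
    landmarks: mean Euclidean per-landmark error divided by the
    inter-ocular distance (an assumed normalizer)."""
    per_landmark = np.sqrt(((pred - gt) ** 2).sum(axis=1))
    interocular = np.linalg.norm(gt[36] - gt[45])
    return per_landmark.mean() / interocular

score = nrmse(pred, gt)

# Diminishing annotator bias by averaging landmark sets from several
# annotators into one consensus annotation:
annotators = np.stack([gt + rng.normal(0.0, 2.0, gt.shape) for _ in range(3)])
consensus = annotators.mean(axis=0)
print(score, consensus.shape)
```

A lower NRMSE after fine-tuning indicates predictions closer to the manual annotations; evaluating against the consensus rather than a single annotator reduces the influence of any one annotator's systematic bias.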

