Disentangled Variational Representation for Heterogeneous Face Recognition

Author(s):  
Xiang Wu ◽  
Huaibo Huang ◽  
Vishal M. Patel ◽  
Ran He ◽  
Zhenan Sun

Visible (VIS) to near-infrared (NIR) face matching is a challenging problem due to the significant domain discrepancy between the two modalities and a lack of sufficient data for training cross-modal matching algorithms. Existing approaches attempt to tackle this problem by synthesizing visible faces from NIR faces, extracting domain-invariant features from these modalities, or projecting heterogeneous data onto a common latent space for cross-modal matching. In this paper, we take a different approach in which we make use of the Disentangled Variational Representation (DVR) for cross-modal matching. First, we model a face representation as intrinsic identity information together with its within-person variations. By exploring the disentangled latent variable space, a variational lower bound is employed to optimize the approximate posterior for NIR and VIS representations. Second, to obtain a more compact and discriminative disentangled latent space, we impose a minimization of the identity information for the same subject and a relaxed correlation alignment constraint between the NIR and VIS modality variations. An alternating optimization scheme is proposed for the disentangled variational representation part and the heterogeneous face recognition network part. The mutual promotion between these two parts effectively reduces the NIR-VIS domain discrepancy and alleviates over-fitting. Extensive experiments on three challenging NIR-VIS heterogeneous face recognition databases demonstrate that the proposed method achieves significant improvements over state-of-the-art methods.
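
A minimal sketch of how the two constraints described above could be combined into one training objective, assuming PyTorch. All names (kl_standard_normal, dvr_loss, the lambda weights) and the CORAL-style covariance matching are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def kl_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior (the
    # regularizer inside the variational lower bound).
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()

def dvr_loss(id_nir, id_vis,
             var_mu_nir, var_logvar_nir, var_mu_vis, var_logvar_vis,
             lambda_id=1.0, lambda_corr=0.1):
    # Variational term: keep the within-person variation codes of both
    # modalities close to the prior.
    kl = (kl_standard_normal(var_mu_nir, var_logvar_nir)
          + kl_standard_normal(var_mu_vis, var_logvar_vis))
    # Identity-information minimization: identity codes of the same
    # subject in NIR and VIS should coincide.
    id_term = F.mse_loss(id_nir, id_vis)
    # Relaxed correlation alignment: match second-order statistics of the
    # NIR and VIS variation codes (rows are samples, so transpose for cov).
    corr_term = (torch.cov(var_mu_nir.T) - torch.cov(var_mu_vis.T)).pow(2).mean()
    return kl + lambda_id * id_term + lambda_corr * corr_term
```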

Author(s):  
Tarek Iraki ◽  
Norbert Link

Variations of dedicated process conditions (such as workpiece and tool properties) yield different process state evolutions, which are reflected by different time series of the observable quantities (process curves). A novel method is presented that, first, extracts the statistical influence of these conditions on the process curves and represents it via generative models, and, second, represents their influence on the ensemble of curves by transformations of the representation space. A latent variable space is derived from sampled process data, which represents the curves with only a few features. Generative models are formed based on conditional probability functions estimated in this space. Furthermore, the influence of conditions on the ensemble of process curves is represented by estimated transformations of the feature space, which map the process curve densities under different conditions onto each other. The latent space is formed via multi-task learning of an auto-encoder and condition detectors; the latter classify the latent space representations of the process curves into the considered conditions. The Bayes framework and the multi-task learning models are used to obtain the process curve probability densities from the latent space densities. The methods are shown to reveal and represent the influence of combinations of workpiece and tool properties on resistance spot welding process curves.
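
The latent-space construction described above can be sketched as an auto-encoder trained jointly with a condition detector on its bottleneck. The layer sizes, names, and loss weighting below are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CurveAutoencoder(nn.Module):
    def __init__(self, curve_len=200, latent_dim=8, n_conditions=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(curve_len, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, curve_len))
        # Condition detector: classifies the latent representation of a
        # process curve into the considered conditions (multi-task head).
        self.detector = nn.Linear(latent_dim, n_conditions)

    def forward(self, curve):
        z = self.encoder(curve)
        return self.decoder(z), self.detector(z)

def multitask_loss(model, curve, condition, alpha=0.5):
    # Reconstruction keeps the latent space faithful to the curves;
    # classification structures it by process condition.
    recon, logits = model(curve)
    return F.mse_loss(recon, curve) + alpha * F.cross_entropy(logits, condition)
```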


Author(s):  
Junchi Yu ◽  
Jie Cao ◽  
Yi Li ◽  
Xiaofei Jia ◽  
Ran He

To narrow the inherent sensing gap in heterogeneous face recognition (HFR), recent methods have resorted to generative models and explored the "recognition via generation" framework. Even so, it remains a very challenging task to synthesize photo-realistic visible (VIS) faces from near-infrared (NIR) images, especially when paired training data are unavailable. We present an approach that averts the data misalignment problem and faithfully preserves pose, expression, and identity information during cross-spectral face hallucination. At the pixel level, we introduce an unsupervised attention mechanism for warping that is jointly learned with the generator to derive pixel-wise correspondence from unaligned data. At the image level, an auxiliary generator is employed to facilitate the learning of the mapping from the NIR to the VIS domain. At the domain level, we apply a mutual information constraint to explicitly measure the correlation between domains and thus benefit synthesis. Extensive experiments on three heterogeneous face datasets demonstrate that our approach not only outperforms current state-of-the-art HFR methods but also produces visually appealing results at high resolution.
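
The pixel-level idea above can be illustrated with a toy module in which an attention map, learned jointly with the generator, blends a warped version of the translated image with the raw translation so that unaligned pairs can still provide pixel-wise supervision. Every module and layer below is a hypothetical placeholder, not the paper's network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveWarpGenerator(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.translate = nn.Conv2d(channels, channels, 3, padding=1)  # NIR -> VIS stub
        self.flow_head = nn.Conv2d(channels, 2, 3, padding=1)   # per-pixel offsets
        self.attn_head = nn.Conv2d(channels, 1, 3, padding=1)   # blending mask

    def forward(self, nir):
        vis = self.translate(nir)
        # Identity sampling grid plus predicted offsets: warping the output
        # yields pixel-wise correspondence for unaligned training pairs.
        b = nir.size(0)
        theta = torch.eye(2, 3, device=nir.device).unsqueeze(0).repeat(b, 1, 1)
        base = F.affine_grid(theta, nir.size(), align_corners=False)
        flow = self.flow_head(nir).permute(0, 2, 3, 1)  # (B, H, W, 2)
        warped = F.grid_sample(vis, base + flow, align_corners=False)
        # Attention decides, per pixel, how much of the warped image to keep.
        attn = torch.sigmoid(self.attn_head(nir))
        return attn * warped + (1 - attn) * vis
```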


2021 ◽  
Vol 11 (3) ◽  
pp. 987
Author(s):  
Pengcheng Zhao ◽  
Fuping Zhang ◽  
Jianming Wei ◽  
Yingbo Zhou ◽  
Xiao Wei

Heterogeneous face recognition (HFR) has aroused significant interest in recent years, but faces challenges such as misalignment and limited HFR data. Misalignment occurs among images of different modalities mainly because of misaligned semantics. Although recent methods have attempted to settle the low-shot problem, they suffer from misalignment between paired near-infrared (NIR) and visible (VIS) images, which degrades the performance of most image-to-image translation networks. In this work, we propose a self-aligned dual generation (SADG) architecture for generating semantics-aligned pairwise NIR-VIS images with the same identity, without the additional guidance of external information learning. Specifically, we propose a self-aligned generator to align the data distributions between the two modalities. Then, we present a multiscale patch discriminator to obtain high-quality images. Furthermore, we propose the mean landmark distance (MLD) to measure the alignment between NIR and VIS images of the same identity. Extensive experiments and an ablation study of SADG on three public datasets show significant alignment performance and recognition results; in particular, the Rank-1 accuracy achieved was close to 99.9% on the CASIA NIR-VIS 2.0, Oulu-CASIA NIR-VIS, and BUAA VIS-NIR datasets.
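
A minimal sketch of the MLD metric described above, assuming landmarks have already been detected on a NIR-VIS pair of the same identity; the 5-point layout and the coordinates in the example are invented for illustration.

```python
import numpy as np

def mean_landmark_distance(landmarks_nir, landmarks_vis):
    """Mean Euclidean distance between corresponding facial landmarks.

    landmarks_*: (K, 2) arrays of K corresponding (x, y) coordinates.
    """
    diffs = (np.asarray(landmarks_nir, dtype=float)
             - np.asarray(landmarks_vis, dtype=float))
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

# Hypothetical 5-point landmarks (eyes, nose, mouth corners) on a 128x128 pair.
nir_pts = [[42, 52], [86, 52], [64, 74], [48, 98], [80, 98]]
vis_pts = [[43, 53], [85, 51], [64, 76], [47, 97], [81, 99]]
print(mean_landmark_distance(nir_pts, vis_pts))  # small value => well aligned
```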


2021 ◽  
Vol 13 (2) ◽  
pp. 51
Author(s):  
Lili Sun ◽  
Xueyan Liu ◽  
Min Zhao ◽  
Bo Yang

Variational graph autoencoders, which encode the structural and attribute information of a graph into low-dimensional representations, have become a powerful method for studying graph-structured data. However, most existing methods based on variational (graph) autoencoders assume that the prior of the latent variables is a standard normal distribution, which encourages all nodes to gather around 0 and therefore prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge thus becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we use a noninformative prior as the prior distribution of the latent variables, which enables the posterior distribution parameters to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlations within and between blocks are described by a block-block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.
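
One way to read the noninformative prior is that the KL term of the usual VGAE objective, which pulls every node toward 0 under a standard normal prior, is replaced by the term induced by a flat prior, i.e. the negative posterior entropy up to a constant. The sketch below illustrates that reading; it is not the authors' exact NPIVGAE formulation.

```python
import torch
import torch.nn.functional as F

def gaussian_entropy(logvar):
    # Entropy of a diagonal Gaussian posterior, up to an additive constant.
    return 0.5 * torch.sum(logvar, dim=1).mean()

def npivgae_style_loss(adj_logits, adj_labels, logvar):
    # Reconstruction of the adjacency matrix, as in a standard VGAE.
    recon = F.binary_cross_entropy_with_logits(adj_logits, adj_labels)
    # Flat (noninformative) prior: KL(q || flat) = -H(q) + const, so the
    # regularizer no longer drags every latent code toward 0.
    return recon - gaussian_entropy(logvar)
```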


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yunjun Nam ◽  
Takayuki Sato ◽  
Go Uchida ◽  
Ekaterina Malakhova ◽  
Shimon Ullman ◽  
...  

Humans recognize individual faces regardless of variation in the facial view. The view-tuned face neurons in the inferior temporal (IT) cortex are regarded as the neural substrate for view-invariant face recognition. This study approximated the visual features encoded by these neurons as combinations of local orientations and colors derived from natural image fragments. The resulting features reproduced the preference of these neurons for particular facial views. We also found that faces of one identity were separable from faces of other identities in a space in which each axis represented one of these features. These results suggest that view-invariant face representation is established by combining view-sensitive visual features. This feature-based face representation further suggests that, with respect to view-invariant face representation, the seemingly complex and deeply layered ventral visual pathway can be approximated by a shallow network comprising layers of low-level processing for local orientations and colors (V1/V2-level) and layers that detect particular sets of low-level elements derived from natural image fragments (IT-level).
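
The shallow two-stage approximation described above can be caricatured as an orientation filter bank followed by fragment template matching. The Gabor parameters and the similarity measure below are illustrative assumptions, not the study's fitted features.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, size=9, sigma=2.0, freq=0.3):
    # Oriented local filter standing in for V1/V2-level processing.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def orientation_maps(image, n_orient=4):
    # Local orientation energy at a few preferred angles.
    return [np.abs(convolve2d(image, gabor_kernel(t), mode='same'))
            for t in np.linspace(0, np.pi, n_orient, endpoint=False)]

def fragment_response(maps, fragment_maps):
    # IT-level stand-in: cosine similarity between the observed orientation
    # pattern and a stored natural-image fragment across channels.
    num = sum(float(np.sum(m * f)) for m, f in zip(maps, fragment_maps))
    den = np.sqrt(sum(np.sum(m**2) for m in maps)
                  * sum(np.sum(f**2) for f in fragment_maps)) + 1e-8
    return num / den
```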


2016 ◽  
Vol 7 (3) ◽  
pp. 1-23 ◽  
Author(s):  
Zhifeng Li ◽  
Dihong Gong ◽  
Qiang Li ◽  
Dacheng Tao ◽  
Xuelong Li
