Semantic Relationships Guided Representation Learning for Facial Action Unit Recognition

Author(s):  
Guanbin Li ◽  
Xin Zhu ◽  
Yirui Zeng ◽  
Qing Wang ◽  
Liang Lin

Facial action unit (AU) recognition is a crucial task for facial expression analysis and has attracted extensive attention in the fields of artificial intelligence and computer vision. Existing works have either focused on designing or learning complex regional feature representations, or delved into various types of AU relationship modeling. Despite varying degrees of progress, it remains difficult for existing methods to handle complex situations. In this paper, we investigate how to integrate semantic relationship propagation between AUs into a deep neural network framework to enhance the feature representation of facial regions, and propose an AU semantic relationship embedded representation learning (SRERL) framework. Specifically, by analyzing the symbiosis and mutual exclusion of AUs in various facial expressions, we organize the facial AUs in the form of a structured knowledge graph and integrate a Gated Graph Neural Network (GGNN) into a multi-scale CNN framework to propagate node information through the graph and generate enhanced AU representations. As the learned features capture both appearance characteristics and AU relationship reasoning, the proposed model is more robust and can cope with more challenging cases, e.g., illumination change and partial occlusion. Extensive experiments on two public benchmarks demonstrate that our method outperforms previous work and achieves state-of-the-art performance.
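As a rough illustration of the propagation step described above, the following is a minimal sketch (not the authors' released code) of GGNN-style message passing over an AU relationship graph, in which regional AU features from a CNN backbone are refined by a gated recurrent update; the AU count, feature size, step count, and adjacency matrix are illustrative assumptions.

```python
# Hedged sketch: GGNN-style propagation over an AU relationship graph,
# refining per-AU regional features extracted by a CNN backbone.
import torch
import torch.nn as nn

class AUGraphPropagation(nn.Module):
    def __init__(self, num_aus=12, feat_dim=150, steps=3):
        super().__init__()
        self.steps = steps
        self.edge_fc = nn.Linear(feat_dim, feat_dim)   # transforms incoming messages
        self.gru = nn.GRUCell(feat_dim, feat_dim)      # gated node update, as in GGNN

    def forward(self, node_feats, adj):
        # node_feats: (batch, num_aus, feat_dim) regional AU features from the CNN
        # adj: (num_aus, num_aus) relation matrix built from AU co-occurrence statistics
        b, n, d = node_feats.shape
        h = node_feats
        for _ in range(self.steps):
            msg = torch.einsum('ij,bjd->bid', adj, self.edge_fc(h))  # aggregate neighbours
            h = self.gru(msg.reshape(b * n, d), h.reshape(b * n, d)).reshape(b, n, d)
        return h  # relationship-enhanced AU representations

# Example: 12 AUs, 150-d regional features, co-occurrence-derived adjacency (toy values).
model = AUGraphPropagation()
feats = torch.randn(4, 12, 150)
adj = torch.rand(12, 12)
enhanced = model(feats, adj)   # (4, 12, 150)
```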

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 2030 ◽  
Author(s):  
Byeongkeun Kang ◽  
Yeejin Lee

Driving is a task that places heavy demands on visual information; thus, the human visual system plays a critical role in making proper decisions for safe driving. Understanding a driver’s visual attention and related behavioral information is a challenging but essential task in advanced driver-assistance systems (ADAS) and efficient autonomous vehicles (AV). Specifically, robust prediction of a driver’s attention from images could be a crucial key to assisting intelligent vehicle systems in which a self-driving car is required to move safely while interacting with the surrounding environment. Thus, in this paper, we investigate a human driver’s visual behavior from a computer vision perspective to estimate the driver’s attention locations in images. First, we show that high-resolution feature representations improve visual attention prediction accuracy and localization performance when fused with low-resolution features. To demonstrate this, we employ a deep convolutional neural network framework that learns and extracts feature representations at multiple resolutions. In particular, the network maintains a feature representation at the original image resolution. Second, attention prediction tends to be biased toward the centers of images when neural networks are trained on typical visual attention datasets. To avoid overfitting to this center-biased solution, the network is trained using diverse regions of images. Finally, the experimental results verify that our proposed framework improves the prediction accuracy of a driver’s attention locations.
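The multi-resolution fusion idea can be sketched as below; this is a simplified stand-in, not the authors' network, and the channel counts, branch depths, and input size are assumptions.

```python
# Hedged sketch: fusing a high-resolution feature stream with an upsampled
# low-resolution stream to predict an attention map at the input resolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiResAttention(nn.Module):
    def __init__(self):
        super().__init__()
        # high-resolution branch keeps the input spatial size
        self.high = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        # low-resolution branch downsamples to capture context
        self.low = nn.Sequential(nn.Conv2d(3, 32, 3, stride=4, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16 + 32, 1, 1)   # fused features -> attention logits

    def forward(self, x):
        hi = self.high(x)
        lo = F.interpolate(self.low(x), size=hi.shape[-2:], mode='bilinear',
                           align_corners=False)
        return torch.sigmoid(self.head(torch.cat([hi, lo], dim=1)))  # attention map

img = torch.randn(1, 3, 224, 224)
attn = MultiResAttention()(img)   # (1, 1, 224, 224) predicted attention map
```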


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zheng Kou ◽  
Junjie Li ◽  
Xinyue Fan ◽  
Saeed Kosari ◽  
Xiaoli Qiang

Swine influenza viruses (SIVs) can unforeseeably cross species barriers and directly infect humans, posing major challenges for public health and triggering pandemic risk at irregular intervals. Computational tools are needed to predict the infection phenotype and early pandemic risk of SIVs. For this purpose, we propose a feature representation algorithm to predict cross-species infection of SIVs. We built a high-quality dataset of 1902 viruses. A feature representation learning scheme was applied to learn feature representations from 64 well-trained random forest models with multiple feature descriptors of mutant amino acids in the viral proteins, including compositional information, position-specific information, and physicochemical properties. Class and probabilistic information were integrated into the feature representations, and redundant features were removed by feature space optimization. High performance was achieved using 20 informative features and 22 probabilistic features. The proposed method will facilitate characterization of the transmission phenotype of SIVs.
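The general stacking scheme described above, in which per-descriptor random forests emit class probabilities that are concatenated into a compact learned representation, might look like the following sketch; the descriptor names, sizes, and labels here are toy placeholders rather than the paper's actual data.

```python
# Hedged sketch: random forests trained on different feature descriptors emit
# class probabilities, which are stacked into a learned representation for a
# final infection-phenotype classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)                      # toy labels: 0 swine-only, 1 cross-species
descriptors = {                                  # toy stand-ins for the descriptor groups
    'composition': rng.random((300, 40)),
    'position_specific': rng.random((300, 60)),
    'physicochemical': rng.random((300, 25)),
}

# Out-of-fold probabilities from each per-descriptor forest become meta-features.
meta_features = []
for name, X in descriptors.items():
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    proba = cross_val_predict(rf, X, y, cv=5, method='predict_proba')
    meta_features.append(proba[:, 1:])           # keep the positive-class probability

Z = np.hstack(meta_features)                     # learned probabilistic representation
final = LogisticRegression().fit(Z, y)           # downstream classifier
print(final.score(Z, y))
```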


2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Jiajie Peng ◽  
Jingyi Li ◽  
Xuequn Shang

Abstract Background Drug-target interaction prediction is of great significance for narrowing down the scope of candidate medications, and thus is a vital step in drug discovery. Because of the particularity of biochemical experiments, the development of new drugs is not only costly but also time-consuming. Therefore, computational prediction of drug-target interactions has become an essential approach in the drug discovery process, aiming to greatly reduce experimental cost and time. Results We propose a learning-based method, named DTI-CNN, which combines feature representation learning and a deep neural network to predict drug-target interactions. We first extract the relevant features of drugs and proteins from heterogeneous networks using the Jaccard similarity coefficient and a random walk with restart model. Then, we adopt a denoising autoencoder model to reduce the dimensionality and identify the essential features. Third, based on the features obtained in the previous step, we construct a convolutional neural network model to predict the interactions between drugs and proteins. The evaluation results show that the average AUROC and AUPR scores of DTI-CNN were 0.9416 and 0.9499, respectively, outperforming three existing state-of-the-art methods. Conclusions All the experimental results show that DTI-CNN performs better than the three existing methods and that the proposed method is appropriately designed.
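As a hedged illustration of the network-feature step, the sketch below implements a generic random walk with restart (RWR) over a similarity network, the kind of diffusion-based representation the abstract describes feeding into the denoising autoencoder; the restart probability and the toy similarity matrix are assumptions, not the released DTI-CNN code.

```python
# Hedged sketch: random walk with restart (RWR) over a similarity network to
# obtain diffusion-based node features for drugs (or proteins).
import numpy as np

def rwr(adj, restart_prob=0.5, tol=1e-8, max_iter=1000):
    """Return the RWR profile matrix; row i is the stationary distribution of a
    walker that restarts at node i with probability `restart_prob`."""
    n = adj.shape[0]
    # column-normalise the adjacency matrix into a transition matrix
    col_sums = adj.sum(axis=0, keepdims=True)
    W = adj / np.where(col_sums == 0, 1, col_sums)
    P = np.eye(n)                      # start each walker at its own node
    E = np.eye(n)                      # restart distributions
    for _ in range(max_iter):
        P_next = (1 - restart_prob) * W @ P + restart_prob * E
        if np.abs(P_next - P).max() < tol:
            break
        P = P_next
    return P.T                         # row i: diffusion features of node i

# Toy drug-drug similarity network (e.g. built from Jaccard similarity of fingerprints).
sim = np.array([[0.0, 0.8, 0.1], [0.8, 0.0, 0.3], [0.1, 0.3, 0.0]])
drug_features = rwr(sim)               # each row: a drug's network-context feature vector
```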


CNS Spectrums ◽  
2019 ◽  
Vol 24 (1) ◽  
pp. 204-205
Author(s):  
Mina Boazak ◽  
Robert Cotes

Abstract

Introduction: Facial expressivity in schizophrenia has been a topic of clinical interest for the past century. Besides having difficulty decoding the facial expressions of others, individuals with schizophrenia often have difficulty encoding facial expressions. Traditionally, evaluations of facial expressions have been conducted by trained human observers using the Facial Action Coding System. The process was slow and subject to intra- and inter-observer variability. In the past decade, the traditional facial action coding system developed by Ekman has been adapted for use in affective computing. Here we assess the applications of this adaptation for schizophrenia, the findings of current groups, and the future role of this technology.

Materials and Methods: We reviewed the applications of computer vision technology in schizophrenia using PubMed and Google Scholar with the search criteria "computer vision" AND "schizophrenia", from January 2010 to June 2018.

Results: Five articles were selected for inclusion, representing 1 case series and 4 case-control analyses. Authors assessed variations in facial action unit presence, intensity, various measures of length of activation, action unit clustering, congruence, and appropriateness. Findings point to variations in each of these areas, except action unit appropriateness, between control and schizophrenia patients. Computer vision techniques were also demonstrated to classify schizophrenia patients from controls with high accuracy, reaching an AUC just under 0.9 in one study, and to predict psychometric scores, reaching Pearson correlation values of under 0.7.

Discussion: Our review of the literature demonstrates agreement between the findings of traditional and contemporary assessment techniques of facial expressivity in schizophrenia. Our findings also demonstrate that current computer vision techniques can differentiate schizophrenia from control populations and predict psychometric scores. Nevertheless, the predictive accuracy of these technologies leaves room for growth. In our analysis, we found two modifiable areas that may contribute to improving algorithm accuracy: assessment protocol and feature inclusion. Based on our review, we recommend assessment of facial expressivity during a period of silence in addition to an assessment during a clinically structured interview utilizing emotionally evocative questions. Furthermore, where underfitting is a problem, we recommend progressive inclusion of features including action unit activation, intensity, rate of onset and offset, clustering (including richness, distribution, and typicality), and congruence. Inclusion of each of these features may improve algorithm predictive accuracy.

Conclusion: We review current applications of computer vision in the assessment of facial expressions in schizophrenia. We present the results of current innovative work in the field and discuss areas for continued development.


Author(s):  
Yingjie Chen ◽  
Han Wu ◽  
Tao Wang ◽  
Yizhou Wang ◽  
Yun Liang

2018 ◽  
Vol 10 (12) ◽  
pp. 1890 ◽  
Author(s):  
Mohamad Al Rahhal ◽  
Yakoub Bazi ◽  
Taghreed Abdullah ◽  
Mohamed Mekhalfi ◽  
Haikel AlHichri ◽  
...  

In this paper, we propose a multi-branch neural network, called MB-Net, for solving the problem of knowledge adaptation from multiple remote sensing scene datasets acquired with different sensors over diverse locations and manually labeled by different experts. Our aim is to learn invariant feature representations from multiple source domains with labeled images and one target domain with unlabeled images. To this end, we define for MB-Net an objective function that mitigates the multiple domain shifts at both the feature representation and decision levels, while retaining the ability to discriminate between different land-cover classes. The complete architecture is trainable end-to-end via the backpropagation algorithm. In the experiments, we demonstrate the effectiveness of the proposed method on a new multiple-domain dataset created from four heterogeneous scene datasets well known to the remote sensing community, namely the University of California Merced (UC-Merced) dataset, the Aerial Image Dataset (AID), the PatternNet dataset, and the Northwestern Polytechnical University (NWPU) dataset. In particular, the method boosts the average accuracy over all transfer scenarios to 89.05%, compared to 78.53% for a standard architecture based only on cross-entropy loss.
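A simplified stand-in for the kind of objective described above (not MB-Net itself) combines per-source cross-entropy supervision with a feature-alignment penalty toward the unlabeled target domain; the mean-feature discrepancy used here is a deliberately crude substitute for the paper's alignment terms, and the network sizes and class count are assumptions.

```python
# Hedged sketch: a shared feature extractor trained with cross-entropy on labelled
# source domains plus a penalty pulling each source domain's feature statistics
# toward the unlabelled target domain.
import torch
import torch.nn as nn
import torch.nn.functional as F

feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU())
classifier = nn.Linear(256, 10)        # 10 land-cover classes (assumed)

def multi_source_da_loss(source_batches, target_images, align_weight=0.1):
    # source_batches: list of (images, labels) pairs, one per labelled source domain
    # target_images: unlabelled target-domain images
    target_mu = feature_extractor(target_images).mean(dim=0)
    cls_loss, align_loss = 0.0, 0.0
    for images, labels in source_batches:
        feats = feature_extractor(images)
        cls_loss += F.cross_entropy(classifier(feats), labels)      # decision-level supervision
        align_loss += (feats.mean(dim=0) - target_mu).pow(2).sum()  # feature-level alignment
    return cls_loss + align_weight * align_loss

# Toy usage: two labelled source domains and one unlabelled target domain.
src = [(torch.randn(8, 3, 64, 64), torch.randint(0, 10, (8,))) for _ in range(2)]
tgt = torch.randn(8, 3, 64, 64)
loss = multi_source_da_loss(src, tgt)
loss.backward()
```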


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3703 ◽  
Author(s):  
Yang Tao ◽  
Chunyan Li ◽  
Zhifang Liang ◽  
Haocheng Yang ◽  
Juan Xu

An electronic nose (E-nose) is an instrument that combines gas sensors with a corresponding pattern recognition algorithm to detect the type and concentration of gases. However, sensor drift occurs in realistic E-nose application scenarios, which alters the data distribution in feature space and degrades prediction accuracy. Therefore, drift compensation algorithms are receiving increasing attention in the E-nose field. In this paper, a novel method, namely Wasserstein Distance Learned Feature Representations (WDLFR), is put forward for drift compensation, based on domain-invariant feature representation learning. It regards a neural network as a domain discriminator that measures the empirical Wasserstein distance between the source domain (data without drift) and the target domain (drifted data). WDLFR minimizes the Wasserstein distance by optimizing the feature extractor in an adversarial manner. The Wasserstein distance for domain adaptation has favorable gradient properties and a generalization bound. Finally, experiments are conducted on a real E-nose dataset from the University of California, San Diego (UCSD). The experimental results demonstrate that the proposed method outperforms all compared drift compensation methods, and WDLFR significantly reduces sensor drift.
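The adversarial scheme can be sketched roughly as follows; this is not the authors' WDLFR implementation, and the layer sizes, the clipping-based Lipschitz constraint, and the omission of the supervised gas-classification loss are simplifying assumptions.

```python
# Hedged sketch: a critic estimates the empirical Wasserstein distance between
# source (drift-free) and target (drifted) features; the feature extractor is
# updated adversarially to shrink that distance.
import torch
import torch.nn as nn

extractor = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
critic = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))
opt_f = torch.optim.Adam(extractor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)

source = torch.randn(64, 128)     # drift-free sensor responses (toy data)
target = torch.randn(64, 128)     # drifted sensor responses (toy data)

for step in range(5):
    # 1) critic step: maximise the estimated Wasserstein distance
    w_dist = critic(extractor(source).detach()).mean() - critic(extractor(target).detach()).mean()
    opt_c.zero_grad(); (-w_dist).backward(); opt_c.step()
    for p in critic.parameters():                 # crude 1-Lipschitz constraint (weight clipping)
        p.data.clamp_(-0.01, 0.01)
    # 2) extractor step: minimise the distance so source/target features align
    w_dist = critic(extractor(source)).mean() - critic(extractor(target)).mean()
    opt_f.zero_grad(); w_dist.backward(); opt_f.step()
```

In practice a classification loss on labeled source data would be trained jointly, so that the aligned features remain discriminative for gas recognition.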

