A Multi-Linear Statistical Method for Discriminant Analysis of 2D Frontal Face Images

Author(s):  
Carlos Eduardo Thomaz ◽  
Vagner do Amaral ◽  
Gilson Antonio Giraldi ◽  
Edson Caoru Kitani ◽  
João Ricardo Sato ◽  
...  

This chapter describes a multi-linear discriminant method for constructing and quantifying statistically significant changes in human identity photographs. The approach is based on a general multivariate two-stage linear framework that addresses the small sample size problem in high-dimensional spaces. Starting with a 2D data set of frontal face images, the authors determine the most characteristic direction of change by organizing the data according to the patterns of interest. Experiments on publicly available face image sets show that the multi-linear approach produces visually plausible results for gender, facial expression and aging facial changes in a simple and efficient way. The authors believe that such an approach could be widely applied for modeling and reconstruction in face recognition, and possibly for identifying subjects after a lapse of time.

2012 ◽  
Vol 2012 ◽  
pp. 1-18
Author(s):  
Jiajuan Liang

High-dimensional data with a small sample size, such as microarray data and image data, are commonly encountered in practical problems for which many variables must be measured but it is too costly or time-consuming to repeat the measurements many times. Analysis of this kind of data poses a great challenge for statisticians. In this paper, we develop a new graphical method for testing spherical symmetry that is especially suitable for high-dimensional data with a small sample size. The new graphical method, together with its local acceptance regions, provides a quick visual check on the assumption of spherical symmetry. Its performance is demonstrated by a Monte Carlo study and illustrated with a real data set.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Jing Zhang ◽  
Guang Lu ◽  
Jiaquan Li ◽  
Chuanwen Li

Mining useful knowledge from high-dimensional data is a hot research topic. Efficient and effective sample classification and feature selection are challenging tasks due to the high dimensionality and small sample size of microarray data. Feature selection is necessary when constructing the model in order to reduce time and space consumption. Therefore, a feature selection model based on prior knowledge and rough sets is proposed. Pathway knowledge is used to select feature subsets, and a rough set based on intersection neighborhood is then used to select the important features in each subset, since it selects features without redundancy and handles numerical features directly. In order to improve the diversity among base classifiers and the efficiency of classification, it is necessary to select a subset of the base classifiers. Classifiers are grouped into several clusters by k-means clustering using the proposed combination distance of Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to generate the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
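The classifier-pruning step described in the abstract can be sketched as follows. This is a minimal illustration with toy predictions; treating the feature vector of each classifier as the pair (mean pairwise Kappa-based diversity, accuracy) is an assumption for illustration, not the paper's exact combination distance.

```python
import random

def cohen_kappa(a, b):
    """Cohen's kappa agreement between two prediction sequences."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n                     # observed agreement
    labels = set(a) | set(b)
    pe = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)  # chance agreement
    return 1.0 if pe == 1 else (po - pe) / (1 - pe)

def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means on low-dimensional tuples; returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    d2 = lambda p, q: sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: d2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(m[i] for m in members) / len(members)
                                   for i in range(len(members[0])))
    return assign

# Toy setup: predictions of 4 base classifiers on 8 samples, plus true labels.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
preds = [
    [0, 1, 0, 1, 0, 1, 0, 0],   # accurate
    [0, 1, 0, 1, 0, 1, 1, 0],   # similar to the first
    [1, 0, 0, 1, 1, 1, 0, 1],   # more diverse
    [1, 0, 1, 0, 1, 1, 0, 1],   # similar to the third
]
acc = [sum(p == y for p, y in zip(pr, labels)) / len(labels) for pr in preds]
# Mean pairwise (1 - kappa) as a diversity score for each classifier.
div = [sum(1 - cohen_kappa(preds[i], preds[j])
           for j in range(len(preds)) if j != i) / (len(preds) - 1)
       for i in range(len(preds))]
clusters = kmeans(list(zip(div, acc)), k=2)
# Keep the most accurate classifier from each cluster as the pruned ensemble.
selected = [max((i for i in range(len(preds)) if clusters[i] == c),
                key=lambda i: acc[i])
            for c in sorted(set(clusters))]
```

Selecting one representative per cluster keeps the ensemble small while preserving the diversity that the Kappa-based distance encodes.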


2021 ◽  
Author(s):  
Xin Chen ◽  
Qingrun Zhang ◽  
Thierry Chekouo

Abstract Background: DNA methylations in critical regions are highly involved in cancer pathogenesis and drug response. However, identifying causal methylations out of a large number of potential polymorphic DNA methylation sites is challenging. Such high-dimensional data bring two obstacles: first, many established statistical models do not scale to so many features; second, multiple testing and overfitting become serious concerns. To this end, a method to quickly filter candidate sites and narrow down targets for downstream analyses is urgently needed. Methods: BACkPAy is a pre-screening Bayesian approach to detect biologically meaningful clusters of potential differential methylation levels with small sample sizes. BACkPAy prioritizes potentially important biomarkers by the Bayesian false discovery rate (FDR) approach. It filters out non-informative (i.e. non-differential) sites with flat methylation patterns across experimental conditions. In this work, we applied BACkPAy to a genome-wide methylation dataset with 3 tissue types, each containing 3 gastric cancer samples. We also applied LIMMA (Linear Models for Microarray and RNA-Seq Data) to compare its results with those achieved by BACkPAy. Then, Cox proportional hazards regression models were used to visualize prognostically significant markers with The Cancer Genome Atlas (TCGA) data for survival analysis. Results: Using BACkPAy, we identified 8 biologically meaningful clusters/groups of differential probes from the DNA methylation dataset. Using TCGA data, we also identified five prognostic genes (i.e. predictive of the progression of gastric cancer) that contain differential methylation probes, whereas no significant results were identified using the Benjamini-Hochberg FDR in LIMMA. Conclusions: We showed the importance of using BACkPAy for the analysis of DNA methylation data with extremely small sample sizes in gastric cancer.
We revealed that RDH13, CLDN11, TMTC1, UCHL1 and FOXP2 can serve as predictive biomarkers for gastric cancer treatment, and the promoter methylation levels of these five genes in serum could have prognostic and diagnostic functions in gastric cancer patients.
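For context, the Benjamini-Hochberg step-up procedure that the LIMMA comparison relies on can be sketched in a few lines; the p-values below are illustrative, not values from the study.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up FDR procedure.

    Returns a boolean list: True where the null hypothesis is rejected
    at FDR level alpha.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])   # indices by ascending p
    k_max = 0                                          # largest k with p_(k) <= (k/m) * alpha
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):          # reject all hypotheses up to k_max
        if rank <= k_max:
            reject[i] = True
    return reject

# Illustrative p-values: the three smallest clear the step-up thresholds
# (0.01, 0.02, 0.03 against thresholds 0.01, 0.02, 0.03), the rest do not.
flags = benjamini_hochberg([0.01, 0.02, 0.03, 0.50, 0.60], alpha=0.05)
```

With very small sample sizes, few probes clear these thresholds, which is the abstract's motivation for a Bayesian pre-screening step instead.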


Author(s):  
Tinen L. Iles ◽  
Timothy G. Laske ◽  
Paul A. Iaizzo ◽  
Elishai Ezra Tsur

Abstract Brain-inspired (neuromorphic) systems realize biological neural principles with Spiking Neural Networks (SNN) to provide high-performing, energy-efficient frameworks for robotics, artificial intelligence, and adaptive control. The Neural Engineering Framework (NEF) provides a theoretical approach for representing high-dimensional mathematical constructs with spiking neurons, enabling the implementation of functional large-scale neural networks. Here, we explore the utilization of neuromorphic adaptive control for circadian-modulated cardiac pacing by examining the neuromorphic representation of high-dimensional cardiac data. For this study, we utilized a model built from a data set acquired from an American black bear during hibernation. Black bears in Minnesota hibernate for 4-6 months without eating or drinking, losing little muscle mass and remaining relatively normothermic throughout the winter [10]. In the current study, we obtained EEG and ECG data from one black bear throughout the winter months in Grand Rapids, MN, and represented them with the NEF. Our results demonstrated opposing requirements for neuromorphic representation: while high synaptic time constants provided desirable low-pass filtering for the ECG data, representation of the EEG data required fast synapses and a large number of neurons. Although this is only an analysis of a small sample of the available data, these guidelines provided a robust pilot dataset for observing SNN patterns during prolonged hibernation, pairing them with the cardiac responses, and thus supporting research questions related to autonomic tone during hibernation. This preliminary research will help further develop our neuromorphic adaptive controller to better adapt cardiac pacing to circadian rhythms. This unique dataset may pave the way toward deciphering the underlying neural mechanisms of hibernation, providing translational insights for humans.
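The trade-off noted in the abstract reflects a basic NEF fact: a synapse with time constant tau acts as a first-order low-pass filter, so long time constants smooth slow signals well but blur fast ones. A minimal discrete-time sketch (the step input and the tau values are illustrative, not the study's parameters):

```python
import math

def synapse_filter(signal, dt, tau):
    """First-order exponential synapse: dy/dt = (x - y) / tau,
    discretized with an exact zero-order-hold update."""
    a = math.exp(-dt / tau)
    y, out = 0.0, []
    for x in signal:
        y = a * y + (1 - a) * x   # decay toward the current input
        out.append(y)
    return out

dt, step = 0.001, [1.0] * 100                 # 100 ms unit step at 1 kHz
fast = synapse_filter(step, dt, tau=0.005)    # short tau: tracks quickly (EEG-like need)
slow = synapse_filter(step, dt, tau=0.100)    # long tau: heavy smoothing (ECG-like filtering)
```

After 100 ms the short-tau synapse has essentially reached the step value, while the long-tau synapse is still rising, which is why fast EEG dynamics demand fast synapses (and more neurons to compensate for the noisier representation).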


Author(s):  
Xiaoyu Lu ◽  
Szu-Wei Tu ◽  
Wennan Chang ◽  
Changlin Wan ◽  
Jiashi Wang ◽  
...  

Abstract Deconvolution of mouse transcriptomic data is challenged by the fact that mouse models carry various genetic and physiological perturbations, making it questionable to assume fixed cell types and cell type marker genes across data set scenarios. We developed a Semi-Supervised Mouse data Deconvolution (SSMD) method to study the mouse tissue microenvironment. SSMD features (i) a novel nonparametric method to discover data set-specific cell type signature genes; (ii) a community detection approach for fixing cell types and their marker genes; and (iii) a constrained matrix decomposition method to solve for cell type relative proportions that is robust to diverse experimental platforms. In summary, SSMD addresses several key challenges in the deconvolution of mouse tissue data, including: (i) varied cell types and marker genes caused by highly divergent genotypic and phenotypic conditions of mouse experiments; (ii) diverse experimental platforms of mouse transcriptomics data; (iii) small sample sizes and limited training data sources; and (iv) the capability to estimate the proportions of 35 cell types in blood, inflammatory, central nervous or hematopoietic systems. In silico and experimental validation of SSMD demonstrated its high sensitivity and accuracy in identifying (sub) cell types and predicting cell proportions compared with state-of-the-art methods. A user-friendly R package and a web server for SSMD are released via https://github.com/xiaoyulu95/SSMD.
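The constrained-decomposition idea, solving for non-negative, sum-to-one cell proportions given a signature matrix, can be sketched with projected gradient descent. This is not SSMD's actual algorithm: the signature matrix and true proportions are toy values, and the clip-and-renormalize step is a simplification of an exact simplex projection.

```python
def estimate_proportions(S, b, lr=0.01, iters=2000):
    """Least-squares fit of b ~ S @ p subject to p >= 0 and sum(p) == 1,
    via gradient descent plus a clip-and-renormalize projection."""
    genes, types = len(S), len(S[0])
    p = [1.0 / types] * types                     # start from uniform proportions
    for _ in range(iters):
        # residual r = S p - b and gradient g = 2 S^T r
        r = [sum(S[i][j] * p[j] for j in range(types)) - b[i] for i in range(genes)]
        g = [2 * sum(S[i][j] * r[i] for i in range(genes)) for j in range(types)]
        p = [max(0.0, pj - lr * gj) for pj, gj in zip(p, g)]   # non-negativity
        total = sum(p) or 1.0
        p = [pj / total for pj in p]                           # sum-to-one
    return p

# Toy signature matrix (3 marker genes x 2 cell types) and a synthetic bulk sample
# mixed with known proportions, so recovery can be checked.
S = [[5.0, 1.0], [1.0, 4.0], [2.0, 2.0]]
true_p = [0.7, 0.3]
b = [sum(S[i][j] * true_p[j] for j in range(2)) for i in range(3)]
p_hat = estimate_proportions(S, b)
```

On noise-free data the estimate recovers the mixing proportions closely; real deconvolution additionally contends with platform effects and imperfect marker genes, which is what SSMD's data set-specific signatures address.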


Author(s):  
David Zhang ◽  
Fengxi Song ◽  
Yong Xu ◽  
Zhizhen Liang

This chapter is a brief introduction to the biometric discriminant analysis technologies covered in Section I of the book. Section 2.1 describes two kinds of linear discriminant analysis (LDA) approaches: classification-oriented LDA and feature extraction-oriented LDA. Section 2.2 discusses LDA for solving the small sample size (SSS) problem in pattern recognition. Section 2.3 shows the organization of Section I.
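A minimal two-class Fisher LDA sketch in two dimensions illustrates the ideas above; the ridge term added to the within-class scatter is the simplest remedy when the SSS problem makes the scatter matrix singular. The data points are toy values for illustration.

```python
def fisher_lda_2d(X1, X2, eps=1e-6):
    """Two-class Fisher LDA in 2-D: w = (Sw + eps*I)^-1 (m1 - m2).

    The eps ridge keeps the within-class scatter invertible, a classic
    fix when the sample size is small relative to the dimension (SSS).
    """
    def mean(X):
        return [sum(x[0] for x in X) / len(X), sum(x[1] for x in X) / len(X)]
    m1, m2 = mean(X1), mean(X2)
    # within-class scatter Sw = sum over both classes of (x - m)(x - m)^T, plus ridge
    Sw = [[eps, 0.0], [0.0, eps]]
    for X, m in ((X1, m1), (X2, m2)):
        for x in X:
            d = [x[0] - m[0], x[1] - m[1]]
            Sw[0][0] += d[0] * d[0]; Sw[0][1] += d[0] * d[1]
            Sw[1][0] += d[1] * d[0]; Sw[1][1] += d[1] * d[1]
    # invert the 2x2 scatter analytically and form w = Sw^-1 (m1 - m2)
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    dm = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [(Sw[1][1] * dm[0] - Sw[0][1] * dm[1]) / det,
         (-Sw[1][0] * dm[0] + Sw[0][0] * dm[1]) / det]
    # decision threshold: midpoint of the projected class means
    proj = lambda x: w[0] * x[0] + w[1] * x[1]
    thresh = (proj(m1) + proj(m2)) / 2
    return w, proj, thresh

X1 = [(1, 2), (2, 3), (3, 3)]       # toy class 1
X2 = [(6, 5), (7, 8), (8, 7)]       # toy class 2
w, proj, thresh = fisher_lda_2d(X1, X2)
```

Projecting a new sample with `proj` and comparing against `thresh` gives a classification-oriented use of the discriminant; taking `w` itself as a feature axis is the feature extraction-oriented use.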

