Kernel principal components based cascade forest towards disease identification with human microbiota

2021 ◽ Vol 21 (1)
Author(s): Jiayu Zhou ◽ Yanqing Ye ◽ Jiang Jiang

Abstract

Background: Clinical evidence increasingly shows that many phenotypic traits of human disease, e.g., inflammation, obesity, HIV, and diabetes, are related to the gut microbiome. Supervised classification makes it feasible to determine human disease states from the compositional information of the intestinal microbiota. However, because the abundance matrix of microbiome data is very sparse, an interpretable deep model such as the deep forest is crucial for further representing and mining the data. Moreover, the original deep forest model can still overfit when dealing with such “large p, small n” biological data. Feature reduction is therefore considered as a way to improve the ensemble forest model, especially for disease identification from the human microbiota.

Methods: In this work, we propose the kernel principal components based cascade forest method, called KPCCF, to classify the disease states of patients using taxonomic profiles of the microbiome at the family level. Specifically, kernel principal components analysis is first used to reduce the original dimension of the human microbiota datasets. The processed data are then fed into the cascade forest to preliminarily discriminate the disease state of the samples.

Results: The proposed KPCCF algorithm can represent small-scale, high-dimensional human microbiota datasets with sparse feature matrices. In systematic comparison experiments on 4 datasets, our method consistently outperforms state-of-the-art methods.

Conclusion: Although application domains share some common characteristics, no one-size-fits-all solution exists. Traditional deep models are limited in biological applications, where small sample sizes are paired with high dimensionality. KPCCF distinguishes itself from the standard deep forest model through its excellent performance on microbiota data. Additionally, compared with other dimensionality reduction methods, kernel principal components analysis is more suitable for microbiota datasets.
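The dimensionality reduction step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the `gamma` value, the toy abundance matrix, and all function names below are assumptions made for illustration, and the cascade forest stage that would consume the reduced features is not shown. The sketch builds a kernel matrix, centers it, and extracts the leading components by power iteration with deflation.

```python
import math

def rbf_kernel_matrix(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def center_kernel(K):
    """Double-center K so the implicit feature vectors have zero mean."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # equals column means, K is symmetric
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def top_eigenpairs(A, k, iters=500):
    """Leading k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy, deflation modifies the matrix
    pairs = []
    for c in range(k):
        v = [math.sin(i + 1.0 + c) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = matvec(A, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, matvec(A, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # remove the component just found
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs

def kernel_pca(X, n_components, gamma):
    """Project the training samples onto the leading kernel principal components."""
    K = center_kernel(rbf_kernel_matrix(X, gamma))
    pairs = top_eigenpairs(K, n_components)
    # For a unit eigenvector v with eigenvalue lam, sample i projects to sqrt(lam) * v[i].
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(X))]

# Hypothetical sparse relative-abundance profiles: rows are samples, columns are taxa.
X = [
    [0.6, 0.0, 0.0, 0.4],
    [0.5, 0.1, 0.0, 0.4],
    [0.7, 0.0, 0.0, 0.3],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.6, 0.4, 0.0],
    [0.1, 0.4, 0.5, 0.0],
]
Z = kernel_pca(X, n_components=2, gamma=1.0)  # reduced features, one row per sample
```

In a pipeline of the kind the abstract describes, the rows of `Z` would replace the original abundance vectors as classifier input. For real datasets a tested library routine would be preferable to this hand-rolled eigen-solver.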

2020
Author(s): Jiayu Zhou ◽ Xuwen Wang ◽ Yanqing Ye ◽ Jiang Jiang

Abstract Clinical evidence increasingly shows that many phenotypic traits of human disease are related to the gut microbiome. Supervised classification makes it feasible to determine human disease states from the compositional information of the intestinal microbiota. However, because the abundance matrix of microbiome data is very sparse, an interpretable deep model such as the deep forest is crucial for further representing and mining the data. Moreover, the original deep forest model can still overfit when dealing with such “large p, small n” biological data. Feature reduction is therefore considered as a way to improve the ensemble forest model, especially for disease identification from the human microbiota. In this work, we propose the kernel principal components based cascade forest method, called KPCCF, to classify the disease states of patients using taxonomic profiles of the microbiome at the family level. Specifically, kernel principal components analysis is first used to reduce the original dimension of the human microbiota datasets. The processed data are then fed into the cascade forest to preliminarily discriminate the disease state of the samples. The proposed KPCCF algorithm can thus represent small-scale, high-dimensional human microbiota datasets with sparse feature matrices. In systematic comparison experiments on 4 datasets, our method consistently outperforms state-of-the-art methods. Additionally, compared with other dimensionality reduction methods, kernel principal components analysis is more suitable for microbiota datasets.


2013 ◽ Vol 756-759 ◽ pp. 3590-3595
Author(s): Liang Zhang ◽ Ji Wen Dong

To address the problems of occlusion and illumination in face recognition, a new face recognition method based on Kernel Principal Components Analysis (KPCA) and the Collaborative Representation Classifier (CRC) is developed. KPCA obtains effective discriminative information and reduces the feature dimension by extracting the nonlinear structural features of faces, which are the decisive factor. The CRC, which jointly considers the relationships among all samples, is then used for classification. Experimental results demonstrate that the algorithm achieves good recognition rates and also improves efficiency. The KCRC algorithm can effectively handle illumination and occlusion in face recognition.
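As a rough illustration of the classification stage, the sketch below implements collaborative representation in its commonly used regularized least-squares form: the query is coded over all training samples at once, and the class whose samples reconstruct it best (residual divided by coefficient norm) is chosen. The abstract does not give the exact formulation, so the ridge parameter `lam`, the scoring rule, the function names, and the toy two-dimensional features are assumptions. In the described method the inputs would be KPCA features, not raw vectors.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def crc_predict(train, labels, query, lam=0.01):
    """Code the query over ALL training samples, then pick the best-fitting class."""
    n = len(train)
    # Ridge system (G + lam * I) alpha = D y, with G the Gram matrix of the samples.
    G = [[sum(a * b for a, b in zip(train[i], train[j])) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(a * b for a, b in zip(train[i], query)) for i in range(n)]
    alpha = solve(G, rhs)

    best_label, best_score = None, math.inf
    for label in sorted(set(labels)):
        idx = [i for i in range(n) if labels[i] == label]
        # Reconstruct the query using only this class's samples and coefficients.
        recon = [sum(alpha[i] * train[i][d] for i in idx) for d in range(len(query))]
        resid = math.sqrt(sum((q - r) ** 2 for q, r in zip(query, recon)))
        coef_norm = math.sqrt(sum(alpha[i] ** 2 for i in idx))
        score = resid / (coef_norm + 1e-12)  # small residual, large coefficients win
        if score < best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical low-dimensional features for two identities.
train = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["a", "a", "b", "b"]
pred_a = crc_predict(train, labels, [0.95, 0.05])
pred_b = crc_predict(train, labels, [0.05, 0.95])
```

Because the coding step uses every training sample and not just those of one class, a single linear solve serves all classes, which is the usual efficiency argument for this family of classifiers.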


2010 ◽ Vol 113-116 ◽ pp. 938-942
Author(s): Mu Hua Cui

This article designs an index system for evaluating ecological cities that suits the features of the city of Ha’erbin. The design is based on the city's actual conditions and follows the principle of combining qualitative and quantitative analysis. The principal components analysis method is then used to evaluate the effect of the ecological restoration of Ha’erbin. The evaluation results show that some progress has been made in constructing Ha’erbin as an ecological city, and that its environmental, economic, and social subsystems have improved greatly since 2002.


2013 ◽ Vol 763 ◽ pp. 242-245
Author(s): Xu Sheng Gan ◽ Hao Lin Cui ◽ Ya Rong Wu ◽ Yue Bo Meng

To better describe the dynamic characteristics of aircraft through aerodynamic modeling, a Wavelet Neural Network (WNN) aerodynamic modeling method based on Kernel Principal Components Analysis (KPCA) is proposed. First, KPCA is applied to the training samples to extract their basic features; the WNN aerodynamic model is then established from the extracted features. The simulation results show that the modeling ability of the proposed method is better than that of 3 other methods, and that the model parameters can be determined easily. This makes the method effective and feasible for establishing aerodynamic models of aircraft.


New Medit ◽ 2020 ◽ Vol 19 (4)
Author(s): Riadh Béchir ◽ Nadia Ounalli ◽ Mhemed Jaouad ◽ Sghaier Mongi

Sustainable development is regarded today as a goal that all countries must reach. Cooperation for development is therefore more necessary than ever to face global challenges such as poverty, human health, and food crises. This work studies the regional disparity that may exist between provinces in the south of Tunisia. To this end, principal components analysis (PCA; ACP in French) was applied to a set of regional development indicators.

