Comparison of supervised models in hepatocellular carcinoma tumor classification based on expression data using principal component analysis (PCA)

2020 ◽  
Author(s):  
Anggrainy Togi Marito Siregar ◽  
Titin Siswantining ◽  
Alhadi Bustamam ◽  
Devvi Sarwinda
2005 ◽  
Vol 2005 (2) ◽  
pp. 155-159 ◽  
Author(s):  
Zhenqiu Liu ◽  
Dechang Chen ◽  
Halima Bensmail

One important feature of gene expression data is that the number of genes M far exceeds the number of samples N. Standard statistical methods do not work well when N < M. Development of new methodologies or modification of existing methodologies is needed for the analysis of microarray data. In this paper, we propose a novel analysis procedure for classifying gene expression data. This procedure involves dimension reduction using kernel principal component analysis (KPCA) and classification with logistic regression (discrimination). KPCA is a generalization and nonlinear version of principal component analysis. The proposed algorithm was applied to five different gene expression datasets involving human tumor samples. Comparison with other popular classification methods such as support vector machines and neural networks shows that our algorithm is very promising in classifying gene expression data.
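
The two-stage procedure in this abstract (KPCA for dimension reduction, then logistic regression) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration and not taken from the paper: the synthetic data, the RBF kernel, the choice gamma = 1/M, the use of two components, and the plain gradient-descent fit of the logistic model.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpca_fit(X, n_components, gamma):
    K = rbf_kernel(X, X, gamma)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest ones
    alphas = vecs[:, idx] / np.sqrt(vals[idx])   # scale so projections are Kc @ alphas
    return {"X": X, "K": K, "alphas": alphas, "gamma": gamma}

def kpca_transform(model, X_new):
    K_new = rbf_kernel(X_new, model["X"], model["gamma"])
    K = model["K"]
    # Centre the new kernel rows using the training-set statistics.
    Kc = K_new - K_new.mean(1, keepdims=True) - K.mean(0)[None, :] + K.mean()
    return Kc @ model["alphas"]

def fit_logistic(Z, y, lr=0.5, n_iter=2000, l2=1e-3):
    # Gradient descent on the L2-penalised logistic loss (intercept included).
    Zb = np.hstack([np.ones((Z.shape[0], 1)), Z])
    w = np.zeros(Zb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Zb @ w))
        w -= lr * (Zb.T @ (p - y) / len(y) + l2 * w)
    return w

def predict(w, Z):
    Zb = np.hstack([np.ones((Z.shape[0], 1)), Z])
    return (Zb @ w > 0).astype(int)

# Synthetic stand-in for expression data: N samples, M genes, N << M.
rng = np.random.default_rng(0)
M, N_train, N_test = 500, 40, 20

def make_data(n):
    y = np.arange(n) % 2              # alternating class labels 0/1
    X = rng.normal(size=(n, M))
    X[y == 1, :50] += 3.0             # class 1 over-expresses the first 50 genes
    return X, y

X_train, y_train = make_data(N_train)
X_test, y_test = make_data(N_test)

model = kpca_fit(X_train, n_components=2, gamma=1.0 / M)
Z_train = kpca_transform(model, X_train)
Z_test = kpca_transform(model, X_test)

w = fit_logistic(Z_train, y_train)
train_acc = (predict(w, Z_train) == y_train).mean()
test_acc = (predict(w, Z_test) == y_test).mean()
print(f"train accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
```

The point of the sketch is that the classifier is fitted on two kernel components rather than on all M genes, which sidesteps the N < M problem the abstract describes. On real data the kernel, its bandwidth, and the number of components would need to be chosen by cross-validation.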


2015 ◽  
Author(s):  
Florian Wagner

Genome-wide expression profiling is a cost-efficient and widely used method to characterize heterogeneous populations of cells, tissues, biopsies, or other biological specimens. The exploratory analysis of such datasets typically relies on generic unsupervised methods, e.g. principal component analysis or hierarchical clustering. However, generic methods fail to exploit the significant amount of knowledge that exists about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that incorporates prior knowledge about gene functions in the form of gene ontology (GO) annotations. GO-PCA aims to discover and represent biological heterogeneity along all major axes of variation in a given dataset, while suppressing heterogeneity due to technical biases. To this end, GO-PCA combines principal component analysis (PCA) with nonparametric GO enrichment analysis, and uses the results to generate expression signatures based on small sets of functionally related genes. I first applied GO-PCA to expression data from diverse lineages of the human hematopoietic system, and obtained a small set of signatures that captured known cell characteristics for most lineages. I then applied the method to expression profiles of glioblastoma (GBM) tumor biopsies, and obtained signatures that were strongly associated with multiple previously described GBM subtypes. Surprisingly, GO-PCA discovered a cell cycle-related signature that exhibited significant differences between the Proneural and the prognostically favorable GBM CpG Island Methylator (G-CIMP) subtypes, suggesting that the G-CIMP subtype is characterized in part by lower mitotic activity. Previous expression-based classifications have failed to separate these subtypes, demonstrating that GO-PCA can detect heterogeneity that is missed by other methods.
My results show that GO-PCA is a powerful and versatile expression-based method that facilitates exploration of large-scale expression data, without requiring additional types of experimental data. The low-dimensional representation generated by GO-PCA lends itself to interpretation, hypothesis generation, and further analysis.
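
The general idea of pairing PCA with gene-set enrichment can be sketched as follows. This is not the GO-PCA algorithm itself: a plain hypergeometric test on the top-loading genes stands in for the nonparametric enrichment analysis mentioned in the abstract, and the expression matrix, the two gene sets, and the cutoff of 20 genes are all hypothetical.

```python
import numpy as np
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total

# Synthetic genes-by-samples expression matrix (assumption, not real data).
rng = np.random.default_rng(0)
n_genes, n_samples = 200, 30
E = rng.normal(size=(n_genes, n_samples))

# Hypothetical annotations: set_A genes co-vary across samples, set_B genes do not.
gene_sets = {"set_A": set(range(20)), "set_B": set(range(100, 120))}
factor = rng.normal(size=n_samples)
E[:20] += 3.0 * factor                      # shared signal for set_A genes

# PCA via SVD of the gene-centred matrix; U[:, 0] holds gene loadings on PC1.
Ec = E - E.mean(axis=1, keepdims=True)
U, S, Vt = np.linalg.svd(Ec, full_matrices=False)
loadings = U[:, 0]

# Take the genes with the largest absolute loading and test each set for overlap.
n_top = 20
top = set(np.argsort(-np.abs(loadings))[:n_top].tolist())
pvals = {name: hypergeom_sf(len(top & genes), n_genes, len(genes), n_top)
         for name, genes in gene_sets.items()}

for name, p in pvals.items():
    print(f"{name}: overlap {len(top & gene_sets[name])}, p = {p:.3g}")
```

A gene set that is enriched among the top-loading genes of a component suggests a functional label for that axis of variation, which is the kind of interpretable signature the abstract describes.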


Biometrics ◽  
2005 ◽  
Vol 61 (2) ◽  
pp. 632-634
Author(s):  
Luc Wouters ◽  
Hinrich W. Göhlmann ◽  
Luc Bijnens ◽  
Stefan U. Kass ◽  
Geert Molenberghs ◽  
...  

2017 ◽  
Vol 41 (8) ◽  
pp. 844-865 ◽  
Author(s):  
Mengque Liu ◽  
Xinyan Fan ◽  
Kuangnan Fang ◽  
Qingzhao Zhang ◽  
Shuangge Ma
