Partial least squares and logistic regression random-effects estimates for gene selection in supervised classification of gene expression data

2013, Vol 46 (4), pp. 697-709
Author(s): Arief Gusnanto, Alexander Ploner, Farag Shuweihdi, Yudi Pawitan

2007, Vol 11 (2), pp. 219-222
Author(s): Mohd Saberi Mohamad, Sigeru Omatu, Safaai Deris, Siti Zaiton Mohd Hashim

Mathematics, 2019, Vol 7 (5), pp. 457
Author(s): Md Sarker, Michael Pokojovy, Sangjin Kim

In high-dimensional gene expression data analysis, accurate and reliable cancer classification and the selection of important genes are crucial. Various methods have been proposed in the literature to identify these genes and predict future outcomes (tumor vs. non-tumor), but only a few of them take into account correlation patterns and grouping effects among the genes. In this article, we propose a rank-based modification of the popular penalized logistic regression procedure that combines ℓ1 and ℓ2 penalties and can handle possible correlation among genes in different groups. While the ℓ1 penalty maintains sparsity, the ℓ2 penalty induces smoothness based on information from the Laplacian matrix, which represents the correlation pattern among genes. We combined logistic regression with the BH-FDR (Benjamini and Hochberg false discovery rate) screening procedure and a newly developed rank-based selection method to arrive at an optimal model retaining the important genes. Through simulation studies and a real-world application to high-dimensional colon cancer gene expression data, we demonstrated that the proposed rank-based method outperforms currently popular methods such as lasso, adaptive lasso and elastic net in both gene selection and classification.
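
Two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. `bh_screen` applies the standard Benjamini-Hochberg step-up rule to a vector of per-gene p-values. `penalized_loss` evaluates a logistic loss with an ℓ1 term and a Laplacian quadratic term of the form β'Lβ. That quadratic form, the function names, and the tuning parameters `lam1` and `lam2` are assumptions made for illustration, since the abstract does not state the exact objective. The rank-based selection step is not sketched because the abstract gives no details of it.

```python
import numpy as np


def bh_screen(p_values, alpha=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # ranks p-values from smallest to largest
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    if not passed.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(passed)[0])           # largest rank that meets its threshold
    return np.sort(order[: k + 1])              # reject everything up to that rank


def penalized_loss(beta, X, y, L, lam1, lam2):
    """Mean logistic loss + lam1 * ||beta||_1 + lam2 * beta' L beta (assumed form)."""
    z = X @ beta
    # log(1 + exp(z)) evaluated stably via logaddexp
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    l1_term = lam1 * np.sum(np.abs(beta))       # encourages sparsity
    smooth_term = lam2 * (beta @ L @ beta)      # penalizes differences between linked genes
    return nll + l1_term + smooth_term
```

With a Laplacian built from a gene graph, β'Lβ sums the squared differences between the coefficients of connected genes, which is how this kind of penalty encourages correlated genes to receive similar coefficients.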

