Feature Grouping and Selection on High-Dimensional Microarray Data

In this paper, we propose a novel method for sparse logistic regression with non-convex Lp regularization (p < 1). Based on a smooth approximation, we develop several fast algorithms for learning a classifier that is applicable to high-dimensional datasets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an Lp and elastic net (Le) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those in the literature. Our computational results show that Lp logistic regression (p < 1) outperforms L1 logistic regression and SCAD SVM. Software is available upon request from the first author.
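As a rough illustration of the idea of smoothing a non-convex Lp penalty, the sketch below fits a logistic regression by plain gradient descent with backtracking. The abstract does not state which smooth approximation or which solvers the authors use, so the surrogate (w^2 + eps)^(p/2) for |w|^p, the function names, the toy data, and all parameter values are assumptions made only for this example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def objective(w, X, y, lam, p, eps):
    # Mean logistic loss (labels in {0, 1}) plus the smoothed Lp penalty.
    loss = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # log(1 + exp(z)) - y*z, written to avoid overflow.
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    # Assumed surrogate: |w_j|^p is replaced by (w_j^2 + eps)^(p/2).
    penalty = sum((wj * wj + eps) ** (p / 2.0) for wj in w)
    return loss / len(X) + lam * penalty

def gradient(w, X, y, lam, p, eps):
    n = len(X)
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        r = sigmoid(z) - yi
        for j, xj in enumerate(xi):
            g[j] += r * xj / n
    # Derivative of (w^2 + eps)^(p/2) is p * w * (w^2 + eps)^(p/2 - 1).
    for j, wj in enumerate(w):
        g[j] += lam * p * wj * (wj * wj + eps) ** (p / 2.0 - 1.0)
    return g

def fit(X, y, lam=0.05, p=0.5, eps=1e-3, step=0.1, iters=500):
    # Gradient descent; the step is halved until the objective does not increase.
    w = [0.0] * len(X[0])
    f = objective(w, X, y, lam, p, eps)
    for _ in range(iters):
        g = gradient(w, X, y, lam, p, eps)
        t = step
        while t > 1e-10:
            cand = [wj - t * gj for wj, gj in zip(w, g)]
            f_cand = objective(cand, X, y, lam, p, eps)
            if f_cand <= f:
                w, f = cand, f_cand
                break
            t *= 0.5
        else:
            break  # no non-increasing step found; stop
    return w

# Toy data (hypothetical): only the first feature carries the label.
X = [
    [2.0, 0.5, -0.3], [1.5, -0.4, 0.2], [1.0, 0.3, 0.4], [0.5, -0.2, -0.5],
    [-0.5, 0.4, -0.2], [-1.0, -0.5, 0.3], [-1.5, 0.2, 0.5], [-2.0, -0.3, -0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w = fit(X, y)
print([round(v, 3) for v in w])
```

On this toy problem the coefficient of the informative feature grows while the other two stay close to zero. With a smoothed penalty they are shrunk toward zero rather than set exactly to zero, so a small threshold would be needed to read off a selected feature set.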


Improved Lasso (ILASSO) for Gene Selection and Classification in High Dimensional DNA Microarray Data

International Journal of Online and Biomedical Engineering (iJOE), 2021, Vol 17 (08), pp. 91. DOI: 10.3991/ijoe.v17i08.24601

Author(s): Isah Aliyu Kargi, Norazlina Bint Ismail, Ismail Bin Mohamad

Keyword(s): Microarray Data, Gene Selection, Poor Performance, Real Data, Ordinary Least Squares, Elastic Net, Likelihood Method, High Dimensional, Initial Weight, Highly Correlated

Classification and gene selection in high-dimensional microarray data have become a challenging problem in molecular biology and genetics. The penalized adaptive likelihood method has recently been employed for cancer classification to address gene selection consistency and estimation of gene coefficients in high-dimensional data simultaneously. Many studies have proposed using ordinary least squares (OLS), maximum likelihood estimation (MLE), or the elastic net as the initial weight in the adaptive elastic net, but MLE and OLS are not suitable for high-dimensional microarray data. Likewise, using the elastic net as the initial weight in the adaptive elastic net yields poor performance, because the ridge penalty in the elastic net pulls the coefficients of highly correlated genes closer to each other. As a result, the estimator fails to differentiate the coefficients of highly correlated genes that have different signs but are grouped together. To tackle this issue, the present study proposes the Improved LASSO (ILASSO) estimator, which adds the ridge penalty to the original LASSO and applies an adaptive weight to both penalties simultaneously. Results on real data indicate that ILASSO performs better than the other methods in terms of the number of genes selected, classification precision, sensitivity, and specificity.
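To make the idea of an adaptive weight on both penalties concrete, here is a minimal sketch. The abstract does not give the ILASSO objective, so the weight rule (the usual adaptive-lasso form, 1 / |initial coefficient|^gamma), the use of the same weights on the L1 and ridge terms, and the names `adaptive_weights` and `weighted_enet_penalty` are assumptions for illustration only.

```python
def adaptive_weights(beta_init, gamma=1.0, floor=1e-8):
    # Assumed adaptive-lasso style rule: a larger initial coefficient
    # gets a smaller penalty weight. `floor` avoids division by zero.
    return [1.0 / (abs(b) + floor) ** gamma for b in beta_init]

def weighted_enet_penalty(beta, weights, lam1, lam2):
    # Hypothetical penalty: weighted L1 term plus weighted ridge term,
    # with the same per-coefficient weights applied to both.
    l1 = sum(w * abs(b) for w, b in zip(weights, beta))
    l2 = sum(w * b * b for w, b in zip(weights, beta))
    return lam1 * l1 + lam2 * l2

# Hypothetical initial estimates for two genes.
weights = adaptive_weights([2.0, 0.5], gamma=1.0, floor=0.0)
penalty = weighted_enet_penalty([1.0, -1.0], weights, lam1=1.0, lam2=0.5)
print(weights, penalty)
```

Under these assumptions, a gene with a small initial coefficient is penalized more heavily in both terms, so two highly correlated genes are no longer shrunk toward each other by a uniform ridge term.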


Feature Grouping and Selection on High-Dimensional Microarray Data

Scatter search for high-dimensional feature selection using feature grouping

An efficient approach for feature construction of high-dimensional microarray data by random projections

Feature Selection Algorithms for Mining High Dimensional DNA Microarray Data

Naive Bayes combined with partial least squares for classification of high dimensional microarray data

Ensemble of Deep Learning Approach for the Feature Selection from High-Dimensional Microarray Data

Support Vector Machine Classification for High Dimensional Microarray Data Analysis, With Applications in Cancer Research

High-dimensional Microarray Data Analysis

A new method for identifying bivariate differential expression in high dimensional microarray data using quadratic discriminant analysis

Sparse Logistic Regression with Lp Penalty for Biomarker Identification

Improved Lasso (ILASSO) for Gene Selection and Classification in High Dimensional DNA Microarray Data
