Gene Selection in Cancer Classification Using Sparse Logistic Regression with L1/2 Regularization

Shengbing Wu; Hongkun Jiang; Haiwei Shen; Ziyi Yang

doi:10.3390/app8091569

Applied Sciences ◽

10.3390/app8091569 ◽

2018 ◽

Vol 8 (9) ◽

pp. 1569 ◽

Cited By ~ 3

Author(s):

Shengbing Wu ◽

Hongkun Jiang ◽

Haiwei Shen ◽

Ziyi Yang

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Classification Performance ◽

Cancer Classification ◽

Sparse Logistic Regression ◽

The Subject ◽

Selection For ◽

Microarray Datasets ◽

Sparse Methods

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers will help in the classification of different types of cancer and improve prediction accuracy. Recently, logistic regression with L1 regularization has been successfully applied in high-dimensional cancer classification to estimate gene coefficients and perform gene selection simultaneously. However, L1 regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate L1/2-regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods (L1 and LEN) in terms of classification performance.
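
As a rough illustration of the objective this abstract refers to, the sketch below evaluates a logistic loss with a penalty of the form lam * sum(|beta_j|^q), where q = 0.5 gives the L1/2 penalty and q = 1 gives L1. The function name, the mean scaling of the likelihood, and the omission of an intercept are assumptions made for illustration; the abstract does not specify them, and this is not the authors' solver.

```python
import numpy as np

def penalized_logistic_loss(beta, X, y, lam, q=0.5):
    """Mean negative log-likelihood plus lam * sum(|beta_j| ** q).

    Hypothetical helper: q=0.5 corresponds to the L1/2 penalty,
    q=1.0 to the L1 (lasso) penalty. No intercept term is used.
    """
    z = X @ beta
    # log(1 + exp(z)) evaluated stably with logaddexp
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    penalty = lam * np.sum(np.abs(beta) ** q)
    return nll + penalty
```

For a coefficient with magnitude above 1, the L1/2 term grows more slowly than the L1 term, which is one informal way to see why large coefficients are shrunk less.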

Hybrid Correlation based Gene Selection for Accurate Cancer Classification of Gene Expression Data

International Journal of Computer Applications ◽

10.5120/6170-8591 ◽

2012 ◽

Vol 43 (14) ◽

pp. 13-18 ◽

Cited By ~ 3

Author(s):

Vibhav Prakash Singh ◽

Singh Gaurav Arvind ◽

Arindam G Mahapatra

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Gene Selection ◽

Cancer Classification ◽

Expression Data ◽

Selection For

Gene selection in cancer classification using sparse logistic regression with Bayesian regularization

Bioinformatics ◽

10.1093/bioinformatics/btl386 ◽

2006 ◽

Vol 22 (19) ◽

pp. 2348-2355 ◽

Cited By ~ 139

Author(s):

G. C. Cawley ◽

N. L. C. Talbot

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Cancer Classification ◽

Bayesian Regularization ◽

Sparse Logistic Regression

Gene selection via BPSO and Backward generation for cancer classification

RAIRO - Operations Research ◽

10.1051/ro/2018059 ◽

2019 ◽

Vol 53 (1) ◽

pp. 269-288 ◽

Cited By ~ 2

Author(s):

Ahmed Bir-Jmel ◽

Sidi Mohamed Douiri ◽

Souad Elbernoussi

Keyword(s):

Gene Selection ◽

Cancer Classification ◽

Selection Problem ◽

Small Subset ◽

Large Set ◽

High Dimensions ◽

Hybrid Approaches ◽

Filter Methods ◽

Microarray Datasets

Gene expression data (DNA microarray) enable researchers to simultaneously measure the expression levels of several thousand genes. These expression levels are very important in the classification of different types of tumors. In this work, we are interested in gene selection, which is an essential step in data pre-processing for cancer classification. Selection makes it possible to retain a small subset of genes from a large set and to eliminate redundant, irrelevant, or noisy genes. The combinatorial nature of the selection problem requires the development of specific techniques such as filters and wrappers, or hybrids combining several optimization processes. In this context, we propose two hybrid approaches (RBPSO-1NN and FBPSO-SVM) for the gene selection problem, based on the combination of filter methods (the Fisher criterion and the ReliefF algorithm), the BPSO metaheuristic, and the Backward algorithm, using classifiers (SVM and 1NN) to evaluate the relevance of the candidate subsets. To verify the performance of our methods, we tested them on eight well-known high-dimensional microarray datasets with 2308 to 11225 genes. The experiments carried out on the different datasets show that our methods are very competitive with existing works.
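
The filter stage mentioned in this abstract can be sketched as a two-class Fisher score ranking. The formula used here, squared difference of class means over the sum of class variances, is one common form of the Fisher criterion and is an assumption; the abstract does not give the exact definition used, and the function names are hypothetical. The BPSO and Backward stages are not shown.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-gene Fisher score for binary labels y in {0, 1}.

    X has shape (n_samples, n_genes). Assumed form:
    (mean0 - mean1)^2 / (var0 + var1), computed column by column.
    """
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0)
    return num / (den + 1e-12)  # small constant avoids division by zero

def top_k_genes(X, y, k):
    """Indices of the k genes with the highest Fisher score."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]
```

In a hybrid scheme of the kind described, a ranking like this would typically shrink the candidate set before a wrapper search evaluates subsets with a classifier.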

Sparse logistic regression with a L1/2 penalty for gene selection in cancer classification

BMC Bioinformatics ◽

10.1186/1471-2105-14-198 ◽

2013 ◽

Vol 14 (1) ◽

Cited By ~ 77

Author(s):

Yong Liang ◽

Cheng Liu ◽

Xin-Ze Luan ◽

Kwong-Sak Leung ◽

Tak-Ming Chan ◽

...

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Cancer Classification ◽

Sparse Logistic Regression

Gene Selection for Cancer Classification using a New Hybrid of Binary Black Hole Algorithm

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu49456.2020.9302351 ◽

2020 ◽

Author(s):

Elnaz Pashaei ◽

Elham Pashaei

Keyword(s):

Black Hole ◽

Gene Selection ◽

Cancer Classification ◽

Binary Black Hole ◽

Selection For ◽

Black Hole Algorithm

GENE SELECTION USING LOGISTIC REGRESSIONS BASED ON AIC, BIC AND MDL CRITERIA

New Mathematics and Natural Computation ◽

10.1142/s179300570500007x ◽

2005 ◽

Vol 01 (01) ◽

pp. 129-145 ◽

Cited By ~ 15

Author(s):

XIAOBO ZHOU ◽

XIAODONG WANG ◽

EDWARD R. DOUGHERTY

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Gene Selection ◽

Information Criterion ◽

Cancer Classification ◽

Data Sets ◽

Classification Methods ◽

Gene Expressions ◽

Experimental Conditions

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however, most of these treat gene selection and classification separately, and not under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets, including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods achieve high classification accuracy on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression-based gene selection with other classification methods and mixing logistic-regression-based classification with other gene-selection methods are considered.
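
For reference, the standard textbook definitions of AIC and BIC can be written as below, given a fitted model's maximized log-likelihood. This only shows the two criteria themselves; how the paper builds a posterior over gene subsets from them is not described in the abstract and is not reproduced here. The function names are hypothetical.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood.

    k is the number of fitted parameters (for example, selected genes
    plus an intercept); lower values indicate a preferred model.
    """
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood.

    n is the number of samples; the per-parameter penalty exceeds
    AIC's once n is larger than about 7 (ln n > 2).
    """
    return k * math.log(n) - 2 * log_lik
```

With the small sample sizes typical of microarray studies, both criteria penalize each additional gene noticeably relative to the attainable gain in log-likelihood.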

Cancer classification and biomarker selection via a penalized logsum network-based logistic regression model

Technology and Health Care ◽

10.3233/thc-218026 ◽

2021 ◽

Vol 29 ◽

pp. 287-295

Author(s):

Zhiming Zhou ◽

Haihui Huang ◽

Yong Liang

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Gene Selection ◽

Simulated Data ◽

Biological Data ◽

Cancer Classification ◽

High Dimensional ◽

Data Set ◽

Biomarker Selection

BACKGROUND: In genome research, it is particularly important to identify molecular biomarkers or signaling pathways related to phenotypes. The logistic regression model is a powerful discrimination method that offers a clear statistical interpretation and yields classification probabilities for the class labels. However, it cannot perform biomarker selection on its own. OBJECTIVE: The aim of this paper is to give the model efficient gene selection capability. METHODS: In this paper, we propose a new penalized logsum network-based regularization logistic regression model for gene selection and cancer classification. RESULTS: Experimental results on simulated data sets show that our method is effective in the analysis of high-dimensional data. For a large data set, the proposed method achieved AUCs of 89.66% (training) and 90.02% (testing), which are, on average, 5.17% (training) and 4.49% (testing) better than mainstream methods. CONCLUSIONS: The proposed method can be considered a promising tool for gene selection and cancer classification of high-dimensional biological data.

Gene selection for classification of cancers using probabilistic model building genetic algorithm

Biosystems ◽

10.1016/j.biosystems.2005.07.003 ◽

2005 ◽

Vol 82 (3) ◽

pp. 208-225 ◽

Cited By ~ 23

Author(s):

Topon Kumar Paul ◽

Hitoshi Iba

Keyword(s):

Genetic Algorithm ◽

Probabilistic Model ◽

Model Building ◽

Gene Selection ◽

Selection For

Classification of breast lesions in ultrasonography using sparse logistic regression and morphology-based texture features

Medical Physics ◽

10.1002/mp.13082 ◽

2018 ◽

Vol 45 (9) ◽

pp. 4112-4124 ◽

Cited By ~ 8

Author(s):

Hoda Nemat ◽

Hamid Fehri ◽

Nasrin Ahmadinejad ◽

Alejandro F. Frangi ◽

Ali Gooya

Keyword(s):

Logistic Regression ◽

Texture Features ◽

Breast Lesions ◽

Sparse Logistic Regression

Optimal Gene Selection for Cancer Classification with Partial Correlation and k-Nearest Neighbor Classifier

PRICAI 2004: Trends in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-28633-2_75 ◽

2004 ◽

pp. 713-722 ◽

Cited By ~ 1

Author(s):

Si-Ho Yoo ◽

Sung-Bae Cho

Keyword(s):

Partial Correlation ◽

Nearest Neighbor ◽

Gene Selection ◽

Cancer Classification ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifier ◽

Selection For ◽

Neighbor Classifier
