A Comparative Study of Gene Selection Methods for Microarray Cancer Classification

In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene's contribution to the joint discriminability when this gene is added to a set of genes. Based on this statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between the statistical redundancy test and the gene selection methods. Real data examples show that the proposed methods can select a compact gene subset that not only yields high-quality cancer classifiers but also shows biological relevance.

A comparative study of feature selection methods for probabilistic neural networks in cancer classification

Proceedings. 15th IEEE International Conference on Tools with Artificial Intelligence; DOI: 10.1109/tai.2003.1250224; 2004; Cited By ~ 14

Author(s): Chenn-Jung Huang, Wei-Chen Liao

Keyword(s): Neural Networks, Feature Selection, Comparative Study, Cancer Classification, Selection Methods, Probabilistic Neural Networks

An Improved Elastic Net for Cancer Classification and Gene Selection

ACTA AUTOMATICA SINICA; DOI: 10.3724/sp.j.1004.2010.00976; 2010; Vol 36 (7); pp. 976-981; Cited By ~ 8

Author(s): Jun-Tao LI, Ying-Min JIA

Keyword(s): Gene Selection, Elastic Net, Cancer Classification

Lung adenocarcinoma and lung squamous cell carcinoma cancer classification, biomarker identification, and gene expression analysis using overlapping feature selection methods

Scientific Reports; DOI: 10.1038/s41598-021-92725-8; 2021; Vol 11 (1)

Author(s): Joe W. Chen, Joseph Dhahbi

Keyword(s): Gene Expression, Feature Selection, Expression Analysis, Gene Expression Analysis, Biomarker Discovery, Lung Squamous Cell Carcinoma, Cancer Classification, Selection Methods, Biomarker Identification, Overlapping Method

Abstract: Lung cancer is one of the deadliest cancers in the world. Two of the most common subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), have drastically different biological signatures, yet they are often treated similarly and classified together as non-small cell lung cancer (NSCLC). LUAD and LUSC biomarkers are scarce, and their distinct biological mechanisms have yet to be elucidated. To detect biologically relevant markers, many studies have attempted to improve traditional machine learning algorithms or develop novel algorithms for biomarker discovery. However, few have used overlapping machine learning or feature selection methods for cancer classification, biomarker identification, or gene expression analysis. This study proposes to use overlapping traditional feature selection or feature reduction techniques for cancer classification and biomarker discovery. The genes selected by the overlapping method were then verified using random forest. The classification statistics of the overlapping method were compared to those of the traditional feature selection methods. The identified biomarkers were validated in an external dataset using AUC and ROC analysis. Gene expression analysis was then performed to further investigate biological differences between LUAD and LUSC. Overall, our method achieved classification results comparable to, if not better than, the traditional algorithms. It also identified multiple known biomarkers, and five potentially novel biomarkers with high discriminating values between LUAD and LUSC. Many of the biomarkers also exhibit significant prognostic potential, particularly in LUAD. Our study also unraveled distinct biological pathways between LUAD and LUSC.
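The overlapping idea in the abstract above (keep only the genes that several selection methods agree on) can be illustrated with a small sketch. The abstract does not say which feature selection techniques were overlapped, so the two scoring functions here (`welch_t_score`, `median_gap_score`), the cutoff `k`, and the synthetic data are all assumptions chosen for illustration; the random forest verification step is not reproduced.

```python
import random
import statistics


def split_by_class(values, labels):
    """Split one gene's expression values into class 0 and class 1."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return a, b


def welch_t_score(values, labels):
    """Hypothetical filter 1: absolute Welch t-statistic between classes."""
    a, b = split_by_class(values, labels)
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def median_gap_score(values, labels):
    """Hypothetical filter 2: absolute difference of the class medians."""
    a, b = split_by_class(values, labels)
    return abs(statistics.median(a) - statistics.median(b))


def top_k(scores, k):
    """Indices of the k highest-scoring genes."""
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return set(ranked[:k])


def overlapping_selection(X, labels, scorers, k):
    """Keep only the genes that every scorer ranks in its own top k."""
    n_genes = len(X[0])
    picks_per_method = []
    for scorer in scorers:
        scores = [scorer([row[g] for row in X], labels) for g in range(n_genes)]
        picks_per_method.append(top_k(scores, k))
    return sorted(set.intersection(*picks_per_method))


# Synthetic data (assumption): 40 samples, 30 genes, genes 0-2 shifted in class 1.
random.seed(0)
labels = [i % 2 for i in range(40)]
X = [[random.gauss(4.0 * y if g < 3 else 0.0, 1.0) for g in range(30)] for y in labels]

overlap = overlapping_selection(X, labels, [welch_t_score, median_gap_score], k=5)
print(overlap)
```

Intersecting the per-method top-k sets is the simplest way to express "overlap"; a real pipeline might instead require agreement among a majority of methods, which is a design choice the abstract leaves open.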

Gene Selection for Cancer Classification using a New Hybrid of Binary Black Hole Algorithm

2020 28th Signal Processing and Communications Applications Conference (SIU); DOI: 10.1109/siu49456.2020.9302351; 2020

Author(s): Elnaz Pashaei, Elham Pashaei

Keyword(s): Black Hole, Gene Selection, Cancer Classification, Binary Black Hole, Selection For, Black Hole Algorithm

Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Journal of Water Process Engineering; DOI: 10.1016/j.jwpe.2021.102033; 2021; Vol 41; pp. 102033

Author(s): Faramarz Bagherzadeh, Mohamad-Javad Mehrani, Milad Basirifard, Javad Roostaei

Keyword(s): Machine Learning, Feature Selection, Wastewater Treatment, Comparative Study, Total Nitrogen, Wastewater Treatment Plant, Learning Algorithms, Treatment Plant, Machine Learning Algorithms, Selection Methods

A Hybrid Barnacles Mating Optimizer Algorithm With Support Vector Machines for Gene Selection of Microarray Cancer Classification

IEEE Access; DOI: 10.1109/access.2021.3075942; 2021; Vol 9; pp. 64895-64905

Author(s): Essam H. Houssein, Diaa Salama Abdelminaam, Hager N. Hassan, Mustafa M. Al-Sayed, Emad Nabil

Keyword(s): Support Vector Machines, Gene Selection, Cancer Classification, Support Vector, Vector Machines, Selection Of

GENE SELECTION USING LOGISTIC REGRESSIONS BASED ON AIC, BIC AND MDL CRITERIA

New Mathematics and Natural Computation; DOI: 10.1142/s179300570500007x; 2005; Vol 01 (01); pp. 129-145; Cited By ~ 15

Author(s): XIAOBO ZHOU, XIAODONG WANG, EDWARD R. DOUGHERTY

Keyword(s): Logistic Regression, Regression Model, Logistic Regression Model, Gene Selection, Information Criterion, Cancer Classification, Data Sets, Classification Methods, Gene Expressions, Experimental Conditions

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however, most of these treat gene selection and classification separately, and not under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods show high classification accuracies on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression-based gene selection with other classification methods and mixing logistic-regression-based classification with other gene-selection methods are considered.
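As a rough illustration of how an information criterion can rank candidate genes under a logistic regression model, the sketch below scores single-gene models with the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. It is not the Bayesian posterior construction described in the abstract, and MDL is omitted. The gradient-ascent fitter `fit_logistic`, its step size and iteration count, and the synthetic data are assumptions made for this example; the fit only approximates the maximum-likelihood estimate.

```python
import math
import random


def log_likelihood(w, X, y):
    """Bernoulli log-likelihood of a logistic model; w[0] is the intercept."""
    total = 0.0
    for row, target in zip(X, y):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
        # log(1 + exp(z)) computed in a numerically stable way
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += target * z - softplus
    return total


def fit_logistic(X, y, steps=1000, lr=0.5):
    """Approximate maximum-likelihood fit by plain gradient ascent (assumed settings)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, target in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            err = target - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def aic(loglik, k):
    """Akaike information criterion for k fitted parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n samples."""
    return k * math.log(n) - 2 * loglik


# Synthetic data (assumption): 60 samples, 4 genes, only gene 0 is informative.
random.seed(1)
y = [i % 2 for i in range(60)]
X = [[random.gauss(2.0 * t if g == 0 else 0.0, 1.0) for g in range(4)] for t in y]

# Score each single-gene model (intercept + one weight, so k = 2).
bic_per_gene = []
for g in range(4):
    Xg = [[row[g]] for row in X]
    w = fit_logistic(Xg, y)
    bic_per_gene.append(bic(log_likelihood(w, Xg, y), k=2, n=len(y)))

best_gene = min(range(4), key=lambda g: bic_per_gene[g])
print(best_gene, [round(b, 1) for b in bic_per_gene])
```

Lower criterion values indicate a better trade-off between fit and model size. Because every single-gene model here has the same k, the ranking is driven by the log-likelihood alone; the penalty terms start to matter when gene subsets of different sizes are compared.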
