SVM-BT-RFE: An improved gene selection framework using Bayesian T-test embedded in support vector machine (recursive feature elimination) algorithm

Gene-expression microarray datasets often consist of a limited number of samples with a large number of gene-expression measurements, usually on the order of thousands. Dimensionality reduction is therefore critical prior to any classification task. In this work, the iterative feature perturbation method (IFP), an embedded gene selector, is introduced and applied to four microarray cancer datasets: colon cancer, leukemia, Moffitt colon cancer, and lung cancer. We compare the results obtained by IFP to those of support vector machine-recursive feature elimination (SVM-RFE) and of the t-test used as a feature filter, with a linear support vector machine as the base classifier. We also analyzed the intersection of the gene sets selected by the three methods across the four datasets. Additional experiments included an initial pre-selection of the top 200 genes based on their p-values, after which IFP and SVM-RFE were applied to the reduced feature sets. These results showed up to 3.32% average performance improvement for IFP across the four datasets. A statistical analysis (using the Friedman/Holm test) of both scenarios showed that the highest accuracies came from the t-test as a filter in the experiments without gene pre-selection. IFP and SVM-RFE had greater classification accuracy after gene pre-selection. The analysis showed that the t-test is a good gene selector for microarray data, and that IFP and SVM-RFE improved when applied to a dataset already reduced by the t-test. The IFP approach resulted in comparable or superior average class accuracy when compared to SVM-RFE on three of the four datasets. The same or similar accuracies can be obtained with different sets of genes.
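To illustrate the t-test filter mentioned in the abstract above, here is a minimal sketch that ranks genes by the absolute value of a two-sample t-statistic and keeps the top k. The abstract does not specify which t-test variant was used, so Welch's unequal-variance form is an assumption here, and ranking by |t| stands in for ranking by p-value. The function names (`welch_t`, `top_k_genes`) and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic (assumes both samples have nonzero variance)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def top_k_genes(X, y, k):
    """Return the indices of the k genes with the largest |t| between class 0 and class 1.

    X is a list of samples (rows), each a list of expression values (one per gene);
    y holds the binary class label of each sample.
    """
    scores = []
    for gene in range(len(X[0])):
        class0 = [row[gene] for row, label in zip(X, y) if label == 0]
        class1 = [row[gene] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(class0, class1)), gene))
    # Largest |t| first; ties broken by gene index for a deterministic order.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [gene for _, gene in scores[:k]]


# Toy data (made up for illustration): 6 samples x 4 genes.
X = [
    [5.1, 1.0, 3.2, 0.9],
    [4.9, 1.2, 2.9, 1.1],
    [5.0, 0.9, 3.1, 1.0],
    [1.0, 1.1, 3.0, 2.1],
    [1.2, 1.0, 3.3, 1.9],
    [0.9, 1.2, 2.8, 2.0],
]
y = [0, 0, 0, 1, 1, 1]
selected = top_k_genes(X, y, k=2)
print(selected)
```

In the pre-selection scenario described above, a filter of this kind would be run first with k = 200, and an embedded selector such as SVM-RFE would then operate only on the retained genes.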

Recursive Feature Selection with Significant Variables of Support Vectors

Computational and Mathematical Methods in Medicine ◽

10.1155/2012/712542 ◽

2012 ◽

Vol 2012 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Chen-An Tsai ◽

Chien-Hsun Huang ◽

Ching-Wei Chang ◽

Chun-Houh Chen

Keyword(s):

Support Vector Machine ◽

Gene Selection ◽

Classification Performance ◽

Recursive Feature Elimination ◽

Support Vector ◽

Good Prediction ◽

Simulation Experiments ◽

Ranking Criteria ◽

Good Prediction Accuracy ◽

Microarray Datasets

The development of DNA microarrays enables researchers to screen thousands of genes simultaneously and helps identify genes with high and low expression levels in normal and diseased tissues. Selecting relevant genes for cancer classification is an important issue. Most gene selection methods use univariate ranking criteria and choose an arbitrary threshold for selecting genes. However, the parameter setting may not be compatible with the selected classification algorithm. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in a support vector machine. We compared its performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and is capable of attaining good classification performance when the variations of informative and noninformative genes are different. In the analysis of the two microarray datasets, the proposed method performs better than SVMRFE and RSVM, identifying fewer genes while retaining good prediction accuracy.

Gene Selection Using Gaussian Kernel Support Vector Machine Based Recursive Feature Elimination with Adaptive Kernel Width Strategy

Rough Sets and Knowledge Technology - Lecture Notes in Computer Science ◽

10.1007/11795131_116 ◽

2006 ◽

pp. 799-806 ◽

Cited By ~ 3

Author(s):

Yong Mao ◽

Xiaobo Zhou ◽

Zheng Yin ◽

Daoying Pi ◽

Youxian Sun ◽

...

Keyword(s):

Support Vector Machine ◽

Gene Selection ◽

Gaussian Kernel ◽

Recursive Feature Elimination ◽

Support Vector ◽

Kernel Width ◽

Kernel Support Vector Machine ◽

Adaptive Kernel

Identification of risk genes associated with myocardial infarction based on the recursive feature elimination algorithm and support vector machine classifier

Molecular Medicine Reports ◽

10.3892/mmr.2017.8044 ◽

2017 ◽

Cited By ~ 1

Author(s):

Xiaoqiang Yang

Keyword(s):

Myocardial Infarction ◽

Support Vector Machine ◽

Support Vector Machine Classifier ◽

Recursive Feature Elimination ◽

Support Vector ◽

Risk Genes ◽

Elimination Algorithm

Serum N-Glycosylation in Parkinson’s Disease: A Novel Approach for Potential Alterations

Molecules ◽

10.3390/molecules24122220 ◽

2019 ◽

Vol 24 (12) ◽

pp. 2220 ◽

Cited By ~ 4

Author(s):

Csaba Váradi ◽

Károly Nehéz ◽

Olivér Hornyák ◽

Béla Viskolcz ◽

Jonathan Bones

Keyword(s):

Parkinson’s Disease ◽

Support Vector Machine ◽

Recursive Feature Elimination ◽

Support Vector ◽

Label Free ◽

Dynamic Coating ◽

Elimination Algorithm ◽

Novel Approach ◽

Label Free Quantitation

In this study, we present the application of a novel capillary electrophoresis (CE) method in combination with label-free quantitation and support vector machine-based feature selection (support vector machine-estimated recursive feature elimination, or SVM-RFE) to identify potential glycan alterations in Parkinson’s disease. Specific focus was placed on the use of neutral coated capillaries, produced by a dynamic capillary coating strategy, to ensure stable and repeatable separations without the need for additives in the separation electrolyte that are not mass spectrometry (MS) friendly. The developed online dynamic coating strategy was applied to identify serum N-glycosylation by CE-MS/MS in combination with exoglycosidase sequencing. The annotated structures were quantified in 15 controls and 15 Parkinson’s disease patients by label-free quantitation. Lower sialylation and increased fucosylation were found in Parkinson’s disease patients on tri-antennary glycans with 2 and 3 terminal sialic acids. The set of potential glycan alterations was narrowed by a recursive feature elimination algorithm, resulting in the efficient classification of male patients.

Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection

Journal of Biomedicine and Biotechnology ◽

10.1155/jbb.2005.160 ◽

2005 ◽

Vol 2005 (2) ◽

pp. 160-171 ◽

Cited By ~ 44

Author(s):

Yong Mao ◽

Xiaobo Zhou ◽

Daoying Pi ◽

Youxian Sun ◽

Stephen T. C. Wong

Keyword(s):

Support Vector Machine ◽

Gene Selection ◽

Binary Classification ◽

Classification Tree ◽

Cancer Classification ◽

Recursive Feature Elimination ◽

Support Vector ◽

Fuzzy Support Vector Machine ◽

F Test ◽

Leukemia Data

We investigate the problem of multiclass cancer classification with gene selection from gene expression data. Two multiclass classifiers with gene selection are proposed: a fuzzy support vector machine (FSVM) with gene selection, and a binary classification tree based on SVM with gene selection. Using the F test and SVM-based recursive feature elimination as gene selection methods, we test three combinations in our experiments: the SVM-based binary classification tree with the F test, the SVM-based binary classification tree with SVM-based recursive feature elimination, and FSVM with SVM-based recursive feature elimination. To accelerate computation, the strongest genes are preselected. The proposed techniques are applied to breast cancer data, small round blue-cell tumors, and acute leukemia data. Compared to existing multiclass cancer classifiers and to the two SVM-based binary classification tree variants described in this paper, FSVM with SVM-based recursive feature elimination can find the most important genes affecting certain types of cancer with high recognition accuracy.
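As an illustration of F-test gene selection for more than two classes, the sketch below computes a one-way ANOVA F-statistic per gene and ranks genes by it. The abstract does not give the exact form of the F test used, so the standard one-way ANOVA statistic is an assumption; the function names (`f_statistic`, `rank_genes_by_f`) and the toy data are hypothetical.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of per-class value lists.

    Assumes at least two classes and nonzero within-class variation.
    """
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-class sum of squares: spread of class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-class sum of squares: spread of values around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes_by_f(X, y):
    """Return gene indices ordered from the largest to the smallest F-statistic."""
    classes = sorted(set(y))
    scores = []
    for gene in range(len(X[0])):
        groups = [[row[gene] for row, label in zip(X, y) if label == c] for c in classes]
        scores.append(f_statistic(groups))
    return sorted(range(len(scores)), key=lambda gene: -scores[gene])


# Toy data (made up for illustration): 6 samples x 3 genes, 3 classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 2.1],
    [3.0, 5.2, 1.9],
    [3.1, 4.9, 2.2],
    [5.0, 5.0, 2.0],
    [5.2, 5.1, 1.8],
]
y = [0, 0, 1, 1, 2, 2]
ranking = rank_genes_by_f(X, y)
print(ranking)
```

Keeping only the first few entries of such a ranking corresponds to the preselection of the strongest genes mentioned above as a way to reduce computation before a more expensive selector is run.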

Feature clustering based support vector machine recursive feature elimination for gene selection

Applied Intelligence ◽

10.1007/s10489-017-0992-2 ◽

2017 ◽

Vol 48 (3) ◽

pp. 594-607 ◽

Cited By ~ 26

Author(s):

Xiaojuan Huang ◽

Li Zhang ◽

Bangjun Wang ◽

Fanzhang Li ◽

Zhao Zhang

Keyword(s):

Support Vector Machine ◽

Gene Selection ◽

Recursive Feature Elimination ◽

Support Vector ◽

Feature Clustering

Fast Gaussian kernel support vector machine recursive feature elimination algorithm

Applied Intelligence ◽

10.1007/s10489-021-02298-2 ◽

2021 ◽

Author(s):

Li Zhang ◽

Xiaohan Zheng ◽

Qingqing Pang ◽

Weida Zhou

Keyword(s):

Support Vector Machine ◽

Gaussian Kernel ◽

Recursive Feature Elimination ◽

Support Vector ◽

Elimination Algorithm ◽

Kernel Support Vector Machine

Feature gene selection for Chinese hamster classification based on support vector machine

Journal of Computer Applications ◽

10.3724/sp.j.1087.2011.00584 ◽

2011 ◽

Vol 31 (2) ◽

pp. 584-586

Author(s):

Jun-li YANG ◽

Tian-fu LIU

Keyword(s):

Support Vector Machine ◽

Gene Selection ◽

Chinese Hamster ◽

Support Vector ◽

Selection For

Realizing an Integrated Multistage Support Vector Machine Model for Augmented Recognition of Unipolar Depression

Electronics ◽

10.3390/electronics9040647 ◽

2020 ◽

Vol 9 (4) ◽

pp. 647

Author(s):

Kathiravan Srinivasan ◽

Nivedhitha Mahendran ◽

Durai Raj Vincent ◽

Chuan-Yu Chang ◽

Shabbir Syed-Abdul

Keyword(s):

Support Vector Machine ◽

Support Vector Machine Model ◽

Sampling Technique ◽

Unipolar Depression ◽

Clinical Depression ◽

Majority Voting ◽

Recursive Feature Elimination ◽

Support Vector ◽

Daily Routine ◽

Machine Model

Unipolar depression (UD), also referred to as clinical depression, appears to be a widespread mental disorder around the world. It affects a person’s health and daily routine, as well as their frame of mind, behavior, and several bodily functions such as sleep and appetite, and it can lead a person to harm themselves or others. UD is often difficult to detect because it is a comorbid condition. For that reason, this research proposes a more convenient approach for physicians to detect clinical depression at an early stage using an integrated multistage support vector machine model. First, the dataset is preprocessed using the multiple imputation by chained equations (MICE) technique. Then, support vector machine-based recursive feature elimination (SVM RFE) is used to select the appropriate features. Subsequently, the integrated multistage support vector machine classifier is built using the bagging random sampling technique. Finally, the experimental outcomes indicate that the proposed model surpasses methods such as logistic regression, multilayer perceptron, random forest, and bagging SVM (majority voting) in terms of overall performance.
