A NOTE ON CLASSIFICATION OF GENE EXPRESSION DATA USING SUPPORT VECTOR MACHINES

2003 ◽ Vol 11 (01) ◽ pp. 43-56
Author(s): KRZYSZTOF FUJAREWICZ ◽ MAREK KIMMEL ◽ JOANNA RZESZOWSKA-WOLNY ◽ ANDRZEJ SWIERNIAK

Microarrays provide a new technique for measuring gene expression that has attracted a lot of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be utilized in many biomedical areas, for example in cancer classification. Whereas several methods of classification, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expression is used during classification is still an open problem. In this paper we propose a heuristic method of choosing a suboptimal set of genes by using support vector machines (SVM). The obtained set of genes optimizes the leave-one-out cross-validation error. The method is tested on microarray gene expression data from samples of two cancer types: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The results show that the quality of classification is much better than for sets obtained using other methods of feature selection. In addition, we demonstrate that maximum separation in a training data set may lead to deterioration of performance in an independent validation data set, a phenomenon akin to overfitting.
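
The abstract does not spell out the search procedure, so the following is only a minimal sketch of the general idea (pick a gene subset that minimizes the leave-one-out error of an SVM), not the authors' algorithm. It assumes greedy forward selection, a bias-free linear SVM trained with Pegasos-style subgradient steps that cycle through the samples in a fixed order, and features that are already centered. The eight-sample, three-gene matrix is made-up toy data, and every function name is hypothetical.

```python
# Toy expression matrix (made-up numbers): 8 samples x 3 genes.
# Labels: +1 and -1 stand in for the two cancer types.
X = [
    [ 1.0,  0.8,  2.0],
    [ 1.2,  1.1,  1.5],
    [-0.9,  0.9,  1.8],   # outlier in gene 0
    [ 0.8,  1.0,  2.2],
    [-1.1, -0.9, -2.1],
    [-1.0,  1.0, -1.7],   # outlier in gene 1
    [-0.7, -1.2, -1.9],
    [-1.3, -0.8, -2.0],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(X, y, lam=0.01, epochs=50):
    """Bias-free linear SVM: subgradient descent on the regularized hinge loss."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - 1.0 / t) * wj for wj in w]     # shrink: eta * lam == 1/t
            if margin < 1.0:                           # hinge loss is active
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def loo_error(X, y, genes):
    """Leave-one-out error of the SVM restricted to the given gene indices."""
    Xs = [[row[g] for g in genes] for row in X]
    mistakes = 0
    for i in range(len(Xs)):
        w = train_linear_svm(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:])
        mistakes += predict(w, Xs[i]) != y[i]
    return mistakes / len(Xs)


def greedy_select(X, y, max_genes=2):
    """Add one gene at a time while the leave-one-out error strictly improves."""
    selected, best_err = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_genes:
        errs = {g: loo_error(X, y, selected + [g]) for g in remaining}
        g_best = min(errs, key=errs.get)
        if errs[g_best] >= best_err:
            break                                      # no improvement: stop
        selected.append(g_best)
        remaining.remove(g_best)
        best_err = errs[g_best]
    return selected, best_err


selected, err = greedy_select(X, y)
print(selected, err)
```

On this toy matrix only gene index 2 separates the two classes without an outlier, so the search is expected to keep that single gene and stop. A greedy search like this yields a suboptimal subset in the sense used in the abstract: it never examines all gene combinations.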

PLoS ONE ◽ 2010 ◽ Vol 5 (6) ◽ pp. e11267
Author(s): Ana Lisa V. Gomes ◽ Lawrence J. K. Wee ◽ Asif M. Khan ◽ Laura H. V. G. Gil ◽ Ernesto T. A. Marques ◽ ...
