Gene Selection and Classification in Microarray Datasets using a Hybrid Approach of PCC-BPSO/GA with Multi Classifiers

2018 ◽  
Vol 14 (6) ◽  
pp. 868-880 ◽  
Author(s):  
Shilan S. Hameed ◽  
Fahmi F. Muhammad ◽  
Rohayanti Hassan ◽  
Faisal Saeed
2018 ◽  
Vol 8 (9) ◽  
pp. 1569 ◽  
Author(s):  
Shengbing Wu ◽  
Hongkun Jiang ◽  
Haiwei Shen ◽  
Ziyi Yang

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. Successfully identifying gene biomarkers helps classify different types of cancer and improves prediction accuracy. Recently, logistic regression with $L_1$ regularization has been successfully applied to high-dimensional cancer classification, where it estimates gene coefficients and performs gene selection simultaneously. However, $L_1$ regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate $L_{1/2}$ regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods ($L_1$ and $L_{EN}$) in terms of classification performance.
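As a rough illustration of the kind of objective described above, the sketch below evaluates a logistic loss with a penalty of the form lam * sum(|beta_j|^q), where q = 0.5 gives an L1/2-style penalty and q = 1 gives the usual L1 penalty. The function name, the absence of an intercept, and the averaging of the loss over samples are assumptions made for this example; the abstract does not specify the exact formulation or the solver used.

```python
import math


def penalized_log_loss(beta, X, y, lam, q=0.5):
    """Mean logistic negative log-likelihood plus lam * sum(|beta_j| ** q).

    Hypothetical helper for illustration only: q=0.5 corresponds to an
    L1/2-style penalty, q=1 to the L1 (lasso) penalty. No intercept term
    is modelled here (an assumption of this sketch).
    """
    n = len(X)
    nll = 0.0
    for xi, yi in zip(X, y):
        # Linear predictor for one sample.
        z = sum(b * v for b, v in zip(beta, xi))
        # log(1 + exp(z)) computed in a numerically stable way.
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        # Negative log-likelihood of a Bernoulli label yi in {0, 1}.
        nll += softplus - yi * z
    penalty = sum(abs(b) ** q for b in beta)
    return nll / n + lam * penalty
```

For a coefficient of magnitude 4, the q = 0.5 penalty contributes 2 where the L1 penalty contributes 4, which shows how the L1/2-style term penalizes large coefficients less heavily than L1 does.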


2005 ◽  
Vol 2005 (2) ◽  
pp. 132-138 ◽  
Author(s):  
Dechang Chen ◽  
Zhenqiu Liu ◽  
Xiaobin Ma ◽  
Dong Hua

Gene selection is an important issue in analyzing multiclass microarray data. Among the many proposed selection methods, the traditional ANOVA F test statistic has been employed to identify informative genes for both class prediction (classification) and discovery problems. However, the F test statistic assumes equal variances across classes, an assumption that may not be realistic for gene expression data. This paper explores alternative test statistics that can handle heterogeneity of the variances. We study five such test statistics, including the Brown-Forsythe and Welch test statistics. Their performance is evaluated and compared with that of the F statistic over different classification methods applied to publicly available microarray datasets.
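To make the contrast concrete, the following sketch computes the classical one-way ANOVA F statistic and Welch's heteroscedastic F statistic for a single gene, given its expression values grouped by class. It uses the standard textbook formulas; the function names and the toy data in the usage note are assumptions for this example and are not taken from the paper. Each group needs at least two values and a nonzero sample variance.

```python
from statistics import mean, variance


def anova_f(groups):
    """Classical one-way ANOVA F statistic (assumes equal variances)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def welch_f(groups):
    """Welch's F statistic, which weights each class by n_i / s_i^2."""
    k = len(groups)
    weights = [len(g) / variance(g) for g in groups]
    total_w = sum(weights)
    means = [mean(g) for g in groups]
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    # Correction term that accounts for estimating each class variance.
    lam = sum(
        (1 - w / total_w) ** 2 / (len(g) - 1)
        for w, g in zip(weights, groups)
    )
    return numerator / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
```

With the toy groups `[[1, 2, 3], [2, 3, 4], [3, 4, 5]]`, which have equal variances, `anova_f` returns 3 and `welch_f` returns 18/7. Ranking genes by either statistic and keeping the top-scoring ones is one simple way such a statistic could be used for selection.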


2018 ◽  
Vol 5 (1) ◽  
pp. 1-12
Author(s):  
Sunanda Das ◽  
Asit Kumar Das

Microarray datasets are widely used in bioinformatics research. Analyzing this kind of high-throughput data, which measures the expression levels of thousands of genes, can help find the cause of a disease and guide its subsequent treatment. There are many techniques in gene analysis for extracting biologically relevant information from inconsistent and ambiguous data. In this paper, the database concepts of functional dependency and attribute closure are used to find the most important set of genes for cancer detection. First, the method computes a similarity factor between each pair of genes. Based on the similarity factors, a set of gene dependencies is formed, from which the closure set is obtained. Subsequently, interestingness measures based on conditional probability are used to determine the most informative genes for disease classification. The proposed method is applied to several publicly available cancer gene expression datasets. The results show the effectiveness and robustness of the algorithm.
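The closure step borrowed from database theory can be sketched as follows. This is the standard attribute-closure algorithm applied to a set of dependencies of the form "left-hand genes determine right-hand genes". How the paper derives those dependencies from the similarity factors is not stated in the abstract, so the dependency list and the gene names used in the usage note are hypothetical.

```python
def attribute_closure(attrs, dependencies):
    """Return the closure of `attrs` under functional dependencies.

    `dependencies` is an iterable of (lhs, rhs) pairs, each an iterable
    of attribute (here: gene) names, read as "lhs determines rhs".
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # Fire a dependency once its whole left-hand side is covered
            # and it still has something new to contribute.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure
```

For example, with the hypothetical dependencies `[({"g1"}, {"g2"}), ({"g2"}, {"g3"}), ({"g4"}, {"g5"})]`, the closure of `{"g1"}` is `{"g1", "g2", "g3"}`: g1 determines g2, which in turn determines g3, while g4 and g5 are never reached.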


2019 ◽  
Vol 10 (2) ◽  
pp. 21-33 ◽  
Author(s):  
Nassima Dif ◽  
Zakaria Elberrichi

Feature selection is the process of identifying well-performing combinations of significant features among many possibilities. This preprocessing step improves classification accuracy and facilitates the learning task. For this optimization problem, the authors use a metaheuristic approach. Their main objective is to propose an enhanced version of the firefly algorithm (FA) as a wrapper approach, adding a recursive behavior to improve the search for the optimal solution. They use an SVM classifier to evaluate the proposed method. For their experiments, they use benchmark microarray datasets. The results show that the enhanced recursive FA (RFA) outperforms the standard version while reducing dimensionality for all the datasets. For example, on the leukemia microarray dataset, it achieves a perfect performance score of 100% with only 18 informative genes selected from the 7,129 in the original dataset. The RFA was competitive with other state-of-the-art approaches and achieved the best results for the CNS, Ovarian cancer, MLL, prostate, Leukemia_4c, and lymphoma datasets.
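In a wrapper approach, each candidate gene subset is scored by training and evaluating a classifier on it. The sketch below shows one common way to turn that into a single fitness value that rewards both accuracy and a small subset. The weighted form, the `alpha` parameter, and the `evaluate_accuracy` callable are assumptions for this example; the abstract does not give the fitness function the authors used, and the firefly search itself is not shown.

```python
def wrapper_fitness(mask, evaluate_accuracy, alpha=0.9):
    """Score a binary feature mask for wrapper-based selection.

    `mask` holds one truthy/falsy flag per gene. `evaluate_accuracy` is a
    caller-supplied function (hypothetical here) that takes the list of
    selected gene indices and returns a classifier accuracy in [0, 1],
    for instance from cross-validating an SVM on those columns.
    """
    selected = [i for i, keep in enumerate(mask) if keep]
    if not selected:
        # An empty subset cannot be classified; give it the worst score.
        return 0.0
    accuracy = evaluate_accuracy(selected)
    # Fraction of the original genes that were discarded.
    reduction = 1.0 - len(selected) / len(mask)
    return alpha * accuracy + (1.0 - alpha) * reduction
```

A metaheuristic such as the firefly algorithm would then search over masks to maximize this value. With `alpha` close to 1 the search favors accuracy and uses subset size mainly as a tie-breaker.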

