Cancer Data Classification by Quantum Inspired Immune Clone Optimization-based Optimal Feature Selection using Gene Expression Data: Deep Learning Approach (Preprint)
Gene selection is a fundamental process in bioinformatics because cancer classification accuracy depends on the selected genes, which give biological relevance to the classification problem. Accurate classification of diverse tumor types is in great demand in cancer diagnosis. However, existing cancer classification methodologies are mostly clinically based, so their diagnostic capability is limited. Gene expression data are now used to address significant problems in cancer diagnosis, and researchers have introduced many ways to diagnose cancer appropriately and effectively with such data. This paper develops a cancer data classification model using gene expression data. Initially, five benchmark gene expression datasets, i.e., “Colon cancer, diffuse B-cell Lymphoma, Leukaemia, Wisconsin Diagnostic Breast Cancer and Wisconsin Breast Cancer Data” are collected for the experiments. The proposed classification model involves three main phases: “(a) Feature extraction, (b) Optimal Feature Selection, and (c) Classification”. After data pre-processing, features are extracted from the collected gene expression data using first-order and second-order statistical measures. To reduce the length of the feature vectors, optimal feature selection is performed using a new meta-heuristic algorithm termed the Quantum Inspired Immune Clone Optimization Algorithm (QICO). Once the relevant features are selected, classification is performed by a deep learning model, the Recurrent Neural Network (RNN). Moreover, the number of hidden neurons of the RNN is optimized by the same QICO. Optimal feature selection and classification are performed to select the most suitable features and thereby maximize classification accuracy.
Finally, the experimental analysis reveals that, in the proposed model, QICO-based feature selection outperforms other heuristic-based feature selection methods and the optimized RNN outperforms other machine learning algorithms.
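The first two phases of the pipeline can be illustrated with a minimal sketch. The abstract does not say which statistical measures are used, so the four first-order statistics below (mean, variance, skewness, excess kurtosis) are an assumption, as is the binary-mask encoding of a candidate feature subset. The function names `first_order_features` and `select_features` are hypothetical, and the sketch does not implement QICO itself or the second-order measures.

```python
import math


def first_order_features(profile):
    """Return [mean, variance, skewness, excess kurtosis] of one expression profile.

    Assumption: these four statistics stand in for the paper's unspecified
    first-order measures. Variance is the population variance (divide by n).
    """
    values = [float(v) for v in profile]
    n = len(values)
    if n == 0:
        raise ValueError("profile must contain at least one value")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0.0:
        # Constant profile: standardized moments are undefined, so report zeros.
        return [mean, 0.0, 0.0, 0.0]
    std = math.sqrt(variance)
    skewness = sum(((v - mean) / std) ** 3 for v in values) / n
    kurtosis = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return [mean, variance, skewness, kurtosis]


def select_features(feature_vector, mask):
    """Keep the features whose mask entry is truthy.

    Assumption: a candidate solution of the feature selection step is encoded
    as a 0/1 mask with one entry per feature.
    """
    if len(feature_vector) != len(mask):
        raise ValueError("mask and feature vector must have the same length")
    return [f for f, keep in zip(feature_vector, mask) if keep]
```

For example, `select_features(first_order_features([1, 2, 3, 4, 5]), [1, 0, 0, 1])` keeps only the mean and the excess kurtosis. In a wrapper-style search, an optimizer would propose many such masks and score each one by the accuracy of the downstream classifier, which matches the stated goal of maximizing classification accuracy.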