A novel gene selection algorithm for cancer identification based on random forest and particle swarm optimization

Author(s):  
Elnaz Pashaei ◽  
Mustafa Ozen ◽  
Nizamettin Aydin
2021 ◽  
pp. 1-15
Author(s):  
Zhaozhao Xu ◽  
Derong Shen ◽  
Yue Kou ◽  
Tiezheng Nie

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy is often lower than expected. Feature selection is a common way to address this problem: it selects effective features and thereby reduces the dimensionality of the data. However, traditional feature selection algorithms set their thresholds blindly, and their search procedures are liable to become trapped in a local optimum. To address this, the paper proposes a hybrid feature selection algorithm combining ReliefF and particle swarm optimization. The algorithm has three parts. First, ReliefF is used to calculate the feature weights, and the features are ranked by weight. Then the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, the particle swarm optimization algorithm searches the ranked feature groups, and features are selected according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features. More importantly, it also uses the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has practical application value.
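
The ranking and grouping steps described above can be illustrated with a minimal sketch. It assumes that "density equalization" means every group receives (as nearly as possible) the same number of ranked features; the abstract does not define the term, so this is only one possible reading. The function name `rank_and_group` and the example weights are hypothetical, and the weights stand in for values that ReliefF would produce.

```python
from typing import Dict, List


def rank_and_group(weights: Dict[str, float], n_groups: int) -> List[List[str]]:
    """Rank features by descending weight, then split them into n_groups groups.

    Assumption: groups are equal-sized (sizes differ by at most one). This is
    an illustrative reading of "density equalization", not the paper's rule.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    # Sort by weight, highest first; ties are broken by name for determinism.
    ranked = sorted(weights, key=lambda name: (-weights[name], name))
    base, extra = divmod(len(ranked), n_groups)
    groups: List[List[str]] = []
    start = 0
    for g in range(n_groups):
        # The first `extra` groups take one additional feature.
        size = base + (1 if g < extra else 0)
        groups.append(ranked[start:start + size])
        start += size
    return groups


# Hypothetical ReliefF-style weights for six features.
example_weights = {"g1": 0.42, "g2": 0.10, "g3": 0.35,
                   "g4": -0.05, "g5": 0.20, "g6": 0.01}
groups = rank_and_group(example_weights, 3)
```

A search procedure such as PSO could then operate over these groups rather than over individual features, which is the role the abstract assigns to its third step.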


2012 ◽  
Vol 2012 ◽  
pp. 1-7 ◽  
Author(s):  
Mohammad Javad Abdi ◽  
Seyed Mohammad Hosseini ◽  
Mansoor Rezghi

We develop a detection model based on support vector machines (SVMs) and particle swarm optimization (PSO) for gene selection and tumor classification problems. The proposed model consists of two stages. First, the well-known minimum redundancy-maximum relevance (mRMR) method is applied to preselect genes that have the highest relevance to the target class and are maximally dissimilar to each other. Then, PSO is used to form a novel weighted SVM (WSVM) to classify samples. In this WSVM, PSO not only discards redundant genes but also takes into account the degree of importance of each gene and assigns different weights to different genes. We also use PSO to find appropriate kernel parameters, since the choice of gene weights influences the optimal kernel parameters and vice versa. Experimental results show that the proposed mRMR-PSO-WSVM model achieves the highest classification accuracy on two popular leukemia and colon gene expression datasets obtained from DNA microarrays. Therefore, we can conclude that our proposed method is very promising compared to the previously reported results.
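
One way to picture per-gene weights inside an SVM is a feature-weighted RBF kernel, sketched below. The kernel form, the function name `weighted_rbf_kernel`, and the parameter `gamma` are assumptions for illustration; the abstract does not state how the weights enter the WSVM. In this sketch a weight of zero removes a gene from the distance entirely, which mirrors the idea of discarding redundant genes, and `gamma` is the kind of kernel parameter that would be tuned together with the weights.

```python
import math
from typing import Sequence


def weighted_rbf_kernel(x: Sequence[float], y: Sequence[float],
                        weights: Sequence[float], gamma: float) -> float:
    """Return exp(-gamma * sum_i w_i * (x_i - y_i)^2).

    Assumption: this weighted-distance form is illustrative only and is not
    taken from the paper.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    # Each gene's squared difference is scaled by its non-negative weight.
    weighted_sq_dist = sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))
    return math.exp(-gamma * weighted_sq_dist)


# With the second gene's weight set to zero, only the first gene matters.
k_ignored = weighted_rbf_kernel([1.0, 2.0], [1.0, 5.0], [1.0, 0.0], gamma=0.5)
# With both genes weighted equally, the difference in the second gene counts.
k_full = weighted_rbf_kernel([1.0, 2.0], [1.0, 5.0], [1.0, 1.0], gamma=0.5)
```

Because the kernel value depends on the product of `gamma` and the weights, changing one shifts the best value of the other, which is consistent with the abstract's remark that gene weights and kernel parameters influence each other.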


Author(s):  
Amit Kumar ◽  
T. V. Vijay Kumar

A data warehouse, which is a central repository of the detailed historical data of an enterprise, is designed primarily for supporting high-volume analytical processing in order to support strategic decision-making. Queries for such decision-making are exploratory, long and intricate in nature and involve the summarization and aggregation of data. Furthermore, the rapidly growing volume of data warehouses substantially increases query response times. These response times need to be reduced in order to reduce delays in decision-making. Materializing an appropriate subset of views has been found to be an effective alternative for achieving acceptable response times for analytical queries. This problem, being NP-complete, can be addressed using swarm intelligence techniques. One such technique, the similarity interaction operator-based particle swarm optimization (SIPSO), has been used to address this problem. Accordingly, a SIPSO-based view selection algorithm (SIPSOVSA), which selects the Top-[Formula: see text] views from a multidimensional lattice, has been proposed in this paper. Experimental comparison with the most fundamental view selection algorithm shows that SIPSOVSA is able to select relatively better quality Top-[Formula: see text] views for materialization. As a result, the views selected using SIPSOVSA improve the performance of analytical queries, which leads to greater efficiency in decision-making.


Author(s):  
Prativa Agarwalla ◽  
Sumitra Mukhopadhyay

Pathway information for cancer detection helps to find co-regulated gene groups whose collective expression is strongly associated with cancer development. In this paper, a collaborative multi-swarm binary particle swarm optimization (MS-BPSO) based gene selection technique is proposed for identifying pathway marker genes. We compare the proposed method with various statistical and pathway-based gene selection techniques on several popular cancer datasets, and we present a detailed comparative study against different meta-heuristic algorithms: binary coded particle swarm optimization (BPSO), binary coded differential evolution (BDE), binary coded artificial bee colony (BABC) and genetic algorithm (GA). Experimental results show that the proposed MS-BPSO based method performs significantly better, and that the improved multi-swarm concept generates a good subset of pathway markers, which provides more effective insight into the gene-disease association with high accuracy and reliability.
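
For readers unfamiliar with binary PSO, the sketch below shows one update step of a standard single-swarm BPSO particle with a sigmoid transfer function, where each bit marks whether a gene is selected. It is not the collaborative multi-swarm MS-BPSO variant proposed in the paper, whose details the abstract does not give. The function name `bpso_step` and the default values for `inertia`, `c1`, `c2` and `v_max` are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def bpso_step(position: Sequence[int], velocity: Sequence[float],
              personal_best: Sequence[int], global_best: Sequence[int],
              rng: random.Random, inertia: float = 0.7,
              c1: float = 1.5, c2: float = 1.5,
              v_max: float = 4.0) -> Tuple[List[int], List[float]]:
    """Perform one standard binary PSO update for a single particle.

    Assumption: plain single-swarm BPSO with illustrative coefficients,
    not the multi-swarm scheme described in the abstract.
    """
    new_position: List[int] = []
    new_velocity: List[float] = []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Velocity is pulled toward the personal and global best bits.
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp velocity so the selection probability never saturates fully.
        v = max(-v_max, min(v_max, v))
        # The sigmoid maps velocity to the probability that the bit is 1.
        prob_one = 1.0 / (1.0 + math.exp(-v))
        new_position.append(1 if rng.random() < prob_one else 0)
        new_velocity.append(v)
    return new_position, new_velocity


rng = random.Random(0)  # seeded so that repeated runs give the same result
new_pos, new_vel = bpso_step([0, 1, 0, 1], [0.0, 0.0, 0.0, 0.0],
                             [1, 1, 0, 0], [1, 0, 0, 1], rng)
```

In a gene selection setting, the resulting bit vector would be scored by a fitness function (for example, classifier accuracy on the selected genes), and the personal and global bests would be updated from those scores before the next step.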

