Hybrid Rough Set with Black Hole Optimization Based Feature Selection Algorithm for Protein Structure Prediction

This paper proposes a new approach that hybridizes the Rough Set Quick Reduct and Relative Reduct approaches with the Black Hole optimization algorithm. The algorithm is inspired by black holes. A black hole is a region of spacetime where the gravitational field is so strong that nothing that enters it, not even light, can escape. Every black hole has a mass and a charge. In this algorithm, each candidate solution of the problem is treated as a black hole; the gravitational force drives the global search and the electrical force drives the local search. The proposed algorithm is compared with leading algorithms: Rough Set Quick Reduct, Rough Set Relative Reduct, Rough Set particle swarm optimization (PSO) based Quick Reduct, Rough Set PSO based Relative Reduct, Rough Set Harmony Search based Quick Reduct, and Rough Set Harmony Search based Relative Reduct.
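For orientation, the Quick Reduct baseline named above can be sketched as follows. This is a minimal sketch of the textbook greedy Quick Reduct driven by the rough-set dependency degree, not the hybrid Black Hole variant the abstract proposes. The function names, the dictionary-based data layout and the toy table are assumptions made for illustration.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: |POS_attrs(decision)| / |U|."""
    if not rows:
        return 0.0
    # Group objects into equivalence classes by their values on `attrs`.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    # A class lies in the positive region if all its objects share one decision.
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(rows)


def quick_reduct(rows, conditional, decision):
    """Greedily add attributes until the full-set dependency is reached."""
    target = dependency(rows, conditional, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        best = max(
            (a for a in conditional if a not in reduct),
            key=lambda a: dependency(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


# Hypothetical toy decision table: the decision "d" depends only on "b".
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
print(quick_reduct(rows, ["a", "b", "c"], "d"))  # ['b']
```

The greedy loop can stall in a locally good subset, which is the usual motivation for pairing the dependency measure with a metaheuristic search such as PSO, Harmony Search or the Black Hole algorithm.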

2021
pp. 1-15
Author(s):  
Zhaozhao Xu ◽  
Derong Shen ◽  
Yue Kou ◽  
Tiezheng Nie

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy on such data is often lower than expected. Feature selection is a common way to address this problem: it selects effective features and thereby reduces the dimensionality of the data. However, traditional feature selection algorithms set thresholds blindly, and their search procedures are liable to fall into local optima. To address this, the paper proposes a hybrid feature selection algorithm combining ReliefF and Particle Swarm Optimization. The algorithm has three parts. First, ReliefF is used to calculate the feature weights, and the features are ranked by weight. Then the ranked features are grouped according to density equalization, so that the density of features in each group is the same. Finally, the Particle Swarm Optimization algorithm is used to search the ranked feature groups, and feature selection is performed according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features. More importantly, the proposed algorithm selects the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has practical application value.
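The first step, weighting and ranking features, can be illustrated with a simplified Relief. This is a sketch under stated assumptions: it uses a single nearest hit and a single nearest miss per sample, whereas full ReliefF averages over k neighbours and weights misses by class prior. The density-equalization grouping and the PSO fitness function are not specified in the abstract and are not reproduced here. The function names and the toy data are hypothetical.

```python
def relief_weights(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample."""
    n, m = len(X), len(X[0])
    # Feature ranges, used to scale per-feature differences into [0, 1].
    spans = []
    for j in range(m):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(m))

    w = [0.0] * m
    for i in range(n):
        hits = [X[k] for k in range(n) if k != i and y[k] == y[i]]
        misses = [X[k] for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        hit = min(hits, key=lambda r: dist(X[i], r))
        miss = min(misses, key=lambda r: dist(X[i], r))
        for j in range(m):
            # Reward separation between classes, penalise spread within a class.
            w[j] += (diff(j, X[i], miss) - diff(j, X[i], hit)) / n
    return w


def rank_features(X, y):
    """Feature indices sorted from highest to lowest Relief weight."""
    w = relief_weights(X, y)
    return sorted(range(len(w)), key=lambda j: w[j], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.0], [0.9, 2.0]]
y = [0, 0, 1, 1]
print(rank_features(X, y))  # [0, 1]
```

A ranking like this only orders features; deciding how many to keep is the thresholding problem the abstract mentions, which the paper handles by searching over groups of ranked features instead of cutting at a fixed weight.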


2013
Vol 2 (4)
pp. 33-46
Author(s):  
P. K. Nizar Banu ◽  
H. Hannah Inbarani

As microarray databases grow in dimension and complexity, identifying the most informative genes becomes a challenging task. The difficulty is often related to the huge number of genes combined with very few samples. Research in medical data mining addresses this problem by applying techniques from data mining and machine learning to microarray datasets. In this paper, Unsupervised Tolerance Rough Set based Quick Reduct (U-TRS-QR), a feature selection algorithm that extends the existing equivalence-based rough sets to unsupervised learning, is proposed. Genes selected by the proposed method lead to considerably improved class predictions in extensive experiments on two gene expression datasets: Brain Tumor and Colon Cancer. The results indicate consistent improvement across 12 classifiers.
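The key change from classical rough sets is that a tolerance (similarity) relation replaces strict equality, which suits real-valued gene expression data. The sketch below illustrates tolerance classes only, not the U-TRS-QR algorithm itself. It assumes a commonly used per-attribute similarity, 1 - |a(x) - a(y)| / (a_max - a_min), and a threshold `tau` that must be met on every selected attribute. The abstract does not state which similarity measure or threshold the paper uses, and all names here are hypothetical.

```python
def tolerance_classes(X, attrs, tau):
    """For each object, the set of objects similar to it on all `attrs`."""
    # Attribute ranges, used to normalise differences into [0, 1].
    spans = {}
    for a in attrs:
        col = [row[a] for row in X]
        spans[a] = (max(col) - min(col)) or 1.0

    def similar(x, y):
        # Assumed rule: similarity must reach tau on every chosen attribute.
        return all(1.0 - abs(x[a] - y[a]) / spans[a] >= tau for a in attrs)

    return [{j for j, y in enumerate(X) if similar(x, y)} for x in X]


# Hypothetical expression values for three samples and two genes.
X = [[1.0, 10.0], [1.2, 50.0], [3.0, 12.0]]
print(tolerance_classes(X, [0], 0.8))     # [{0, 1}, {0, 1}, {2}]
print(tolerance_classes(X, [0, 1], 0.8))  # [{0}, {1}, {2}]
```

Unlike equivalence classes, tolerance classes may overlap, and they shrink as more attributes are required to agree. A reduct search then looks for a small attribute subset whose tolerance classes carry as much information as those of the full set.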


Complexity
2014
Vol 20 (5)
pp. 50-62
Author(s):  
Mohammad Taghi Rezvan ◽  
Ali Zeinal Hamadani ◽  
Seyed Reza Hejazi

2016
Vol 66 (6)
pp. 612
Author(s):  
M.R. Gauthama Raman ◽  
K. Kannan ◽  
S.K. Pal ◽  
V. S. Shankar Sriram

Immense growth in network-based services has resulted in an upsurge of internet users, security threats and cyber-attacks. Intrusion detection systems (IDSs) have become an essential component of any network architecture, securing IT infrastructure against the malicious activities of intruders. An efficient IDS should be able to detect, identify and track malicious attempts made by intruders. With many IDSs available in the literature, the most common challenge, caused by voluminous network traffic patterns, is the curse of dimensionality. This emphasizes the importance of feature selection algorithms, which can identify the relevant features and ignore the rest without any information loss. In this paper, a novel rough set κ-Helly property technique (RSKHT) feature selection algorithm is proposed to identify the key features for network IDSs. Experiments carried out using the benchmark KDD Cup 1999 dataset were found to be promising when compared with existing feature selection algorithms with respect to reduct size, classifier performance and time complexity. RSKHT was found to be computationally attractive and flexible for massive datasets.

