A self-adaptive level-based learning artificial bee colony algorithm for feature selection on high-dimensional classification

Abstract: Feature selection is an important data preprocessing method in data mining and machine learning, yet it faces the “curse of dimensionality” when dealing with high-dimensional data. This paper proposes a self-adaptive level-based learning artificial bee colony (SLLABC) algorithm for high-dimensional feature selection. SLLABC introduces three new mechanisms: (1) A level-based learning mechanism accelerates the convergence of the basic artificial bee colony algorithm: the population is divided into several levels, individuals on each level learn from individuals on higher levels, and individuals on the highest level learn from each other. (2) A self-adaptive method balances exploration and exploitation by using the diversity of the population to determine the number of levels: the lower the diversity, the fewer the levels. (3) A new update mechanism reduces the number of selected features: an offspring is discarded and its parent retained if the offspring's error rate is higher than its parent's, or if the error rates are equal but the offspring selects more features; otherwise, the offspring replaces its parent. The paper further analyzes the contribution of these mechanisms to population diversity and classification performance. Finally, comparisons with 8 state-of-the-art algorithms on 12 high-dimensional datasets confirm the competitive performance of SLLABC in both classification accuracy and feature subset size.
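The level structure and the update rule described in this abstract can be sketched in a few lines of Python. This is only an illustration of one reading of the abstract, not the authors' implementation: the names `Candidate`, `split_into_levels`, `exemplar_pool`, and `select_survivor` are hypothetical, and ranking candidates by error rate (with ties broken by subset size) and using equal-sized levels are assumptions the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one feature subset and its error rate."""
    mask: Tuple[int, ...]   # 1 = feature selected, 0 = not selected
    error_rate: float       # classification error obtained with this subset

    @property
    def n_features(self) -> int:
        return sum(self.mask)


def split_into_levels(population: Sequence[Candidate],
                      n_levels: int) -> List[List[Candidate]]:
    """Rank candidates best-first and cut the ranking into n_levels groups.

    Assumption: ranking by (error rate, subset size) and near-equal level
    sizes. Level 0 is the highest (best) level.
    """
    ranked = sorted(population, key=lambda c: (c.error_rate, c.n_features))
    n = len(ranked)
    return [ranked[i * n // n_levels:(i + 1) * n // n_levels]
            for i in range(n_levels)]


def exemplar_pool(levels: List[List[Candidate]], level_index: int,
                  individual: Candidate) -> List[Candidate]:
    """Return the candidates an individual is allowed to learn from."""
    if level_index == 0:
        # Individuals on the highest level learn from each other.
        return [c for c in levels[0] if c is not individual]
    # Everyone else learns from individuals on higher levels.
    return [c for level in levels[:level_index] for c in level]


def select_survivor(parent: Candidate, offspring: Candidate) -> Candidate:
    """Keep the parent if the offspring has a higher error rate, or the
    same error rate with more selected features; otherwise keep the offspring."""
    worse_error = offspring.error_rate > parent.error_rate
    tie_but_larger = (offspring.error_rate == parent.error_rate
                      and offspring.n_features > parent.n_features)
    return parent if (worse_error or tie_but_larger) else offspring
```

Under this reading, an offspring that ties its parent on both error rate and subset size still replaces the parent, because the abstract only names the two discard conditions.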

Angle Modulated Artificial Bee Colony Algorithms for Feature Selection

Applied Computational Intelligence and Soft Computing ◽ 10.1155/2016/9569161 ◽ 2016 ◽ Vol 2016 ◽ pp. 1-6 ◽ Cited By ~ 7

Author(s): Gürcan Yavuz ◽ Doğan Aydin

Keyword(s): Feature Selection ◽ Artificial Bee Colony ◽ Continuous Optimization ◽ Subset Selection ◽ Machine Intelligence ◽ Feature Subset Selection ◽ High Dimensional ◽ Feature Subset ◽ Bee Colony ◽ Angle Modulation

Optimal feature subset selection is an important and difficult task for pattern classification, data mining, and machine intelligence applications. The objective of feature subset selection is to eliminate irrelevant and noisy features in order to select optimal feature subsets and increase accuracy. A large number of features in a dataset increases computational complexity and thus degrades performance. To overcome this problem, this paper uses the angle modulation technique to reduce the feature subset selection problem to a four-dimensional continuous optimization problem instead of representing it as a high-dimensional bit vector. To demonstrate the effectiveness of the angle-modulated problem representation and to determine the efficiency of the proposed method, six variants of the Artificial Bee Colony (ABC) algorithm employ angle modulation for feature selection. Experimental results on six high-dimensional datasets show that the angle-modulated ABC algorithms improve classification accuracy while selecting fewer features.
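To illustrate how four continuous parameters can stand in for a long bit vector, the sketch below uses the generating function commonly given in the angle modulation literature, g(x) = sin(2π(x − a) · b · cos(2π(x − a) · c)) + d, and selects feature i when g is positive at the i-th sample point. The abstract does not state the exact function or sampling scheme used in this paper, so the formula, the sample points x = 0, 1, ..., n − 1, and the name `angle_modulated_mask` are all assumptions.

```python
import math
from typing import List


def angle_modulated_mask(a: float, b: float, c: float, d: float,
                         n_features: int) -> List[int]:
    """Decode four real parameters into a 0/1 feature mask.

    Assumption: feature i is selected when the generating function is
    positive at the sample point x = i.
    """
    mask = []
    for x in range(n_features):
        shifted = 2.0 * math.pi * (x - a)
        g = math.sin(shifted * b * math.cos(shifted * c)) + d
        mask.append(1 if g > 0 else 0)
    return mask


# The optimizer searches over (a, b, c, d) only; the mask length can be
# arbitrarily large without changing the dimension of the search space.
mask = angle_modulated_mask(a=0.0, b=0.5, c=0.8, d=0.0, n_features=20)
print(sum(mask), "of", len(mask), "features selected")
```

The vertical shift `d` biases the mask toward all ones (large positive `d`) or all zeros (large negative `d`), which is why values outside [-1, 1] produce trivial subsets in this sketch.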

A Centre of Gravity-Based Preprocessing Approach for Feature Selection Using Artificial Bee Colony Algorithm on High-Dimensional Datasets

Lecture Notes in Electrical Engineering - Advances in Communication Systems and Networks ◽ 10.1007/978-981-15-3992-3_23 ◽ 2020 ◽ pp. 283-294

Author(s): M. G. Bindu ◽ M. K. Sabu

Keyword(s): Feature Selection ◽ Artificial Bee Colony Algorithm ◽ Artificial Bee Colony ◽ High Dimensional ◽ Centre Of Gravity ◽ Bee Colony ◽ High Dimensional Datasets

Feature Selection for High-Dimensional Datasets through a Novel Artificial Bee Colony Framework

Algorithms ◽ 10.3390/a14110324 ◽ 2021 ◽ Vol 14 (11) ◽ pp. 324

Author(s): Yuanzi Zhang ◽ Jing Wang ◽ Xiaolin Li ◽ Shiguo Huang ◽ Xiuli Wang

Keyword(s): Feature Selection ◽ Artificial Bee Colony ◽ Solution Space ◽ High Dimensional ◽ Classification Error ◽ Feature Subset ◽ ABC Algorithm ◽ Bee Colony ◽ Whale Optimization ◽ High Dimensional Datasets

High-dimensional datasets generally contain many redundant and irrelevant features, which degrade classification performance and lengthen execution time. To tackle this problem, feature selection techniques are used to screen out redundant and irrelevant features. The artificial bee colony (ABC) algorithm is a popular meta-heuristic algorithm with high exploration and low exploitation capacities. To balance the two capacities of the ABC algorithm, this paper proposes a novel ABC framework. Specifically, the solutions are first updated in the employed bee phase to retain the original exploration ability, so that the algorithm can explore the solution space extensively. Then, in the onlooker bee phase, the solutions are modified by the updating mechanism of an algorithm with strong exploitation ability. Finally, the scout bee phase is removed from the framework, which both reduces the exploration ability and speeds up the algorithm. To verify this idea, the operators of the grey wolf optimization (GWO) algorithm and the whale optimization algorithm (WOA) are introduced into the framework to enhance the exploitation capability of onlooker bees, yielding two algorithms named BABCGWO and BABCWOA, respectively. On 12 high-dimensional datasets, these two algorithms outperform four state-of-the-art feature selection algorithms in terms of classification error rate, feature subset size, and execution speed.
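The three-part structure of this framework (standard employed bee phase, a pluggable exploitation operator in the onlooker phase, no scout phase) can be sketched as follows. This is a simplified continuous-space illustration under several assumptions, not BABCGWO or BABCWOA: the `toward_best` operator is a hypothetical stand-in for the GWO and WOA operators, every solution is visited once in the onlooker phase instead of being chosen by fitness-proportional selection, and the binary encoding needed for feature selection is omitted.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]
Objective = Callable[[Vector], float]
Exploit = Callable[[Vector, Vector, random.Random], Vector]


def employed_phase(pop: List[Vector], fit: List[float],
                   objective: Objective, rng: random.Random) -> None:
    """Standard ABC neighbour search: perturb one dimension relative to a
    randomly chosen other solution and keep the trial if it is better."""
    for i, x in enumerate(pop):
        k = rng.choice([idx for idx in range(len(pop)) if idx != i])
        j = rng.randrange(len(x))
        trial = x[:]
        trial[j] = x[j] + rng.uniform(-1.0, 1.0) * (x[j] - pop[k][j])
        f = objective(trial)
        if f < fit[i]:
            pop[i], fit[i] = trial, f


def toward_best(x: Vector, best: Vector, rng: random.Random) -> Vector:
    """Hypothetical placeholder for an exploitation operator (not GWO/WOA):
    move each coordinate a random fraction of the way toward the best."""
    return [xi + rng.random() * (bi - xi) for xi, bi in zip(x, best)]


def onlooker_phase(pop: List[Vector], fit: List[float], objective: Objective,
                   exploit: Exploit, rng: random.Random) -> None:
    """Apply the borrowed exploitation operator with greedy selection."""
    best = pop[min(range(len(pop)), key=fit.__getitem__)][:]
    for i, x in enumerate(pop):
        trial = exploit(x, best, rng)
        f = objective(trial)
        if f < fit[i]:
            pop[i], fit[i] = trial, f


def run(objective: Objective, dim: int, pop_size: int = 10,
        iterations: int = 50, seed: int = 0,
        exploit: Exploit = toward_best) -> Tuple[float, float]:
    """Return (initial best, final best) objective values (minimisation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    initial_best = min(fit)
    for _ in range(iterations):
        employed_phase(pop, fit, objective, rng)
        onlooker_phase(pop, fit, objective, exploit, rng)
        # No scout bee phase: stagnant solutions are never re-initialised.
    return initial_best, min(fit)
```

Because the exploitation operator is passed in as a function, a different operator can be swapped in without touching the employed bee phase, which mirrors how the abstract describes plugging two different algorithms into the same framework.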

A Modified Artificial Bee Colony Algorithm-Based Feature Selection for the Classification of High-Dimensional Data

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2016.5255 ◽ 2016 ◽ Vol 13 (7) ◽ pp. 4088-4095 ◽ Cited By ~ 3

Author(s): Yang Zhang

Keyword(s): Feature Selection ◽ Artificial Bee Colony Algorithm ◽ Artificial Bee Colony ◽ High Dimensional Data ◽ High Dimensional ◽ Bee Colony ◽ Selection For

Self-Adaptive and Adaptive Parameter Control in Improved Artificial Bee Colony Algorithm

Informatica ◽ 10.15388/informatica.2017.136 ◽ 2017 ◽ Vol 28 (3) ◽ pp. 415-438 ◽ Cited By ~ 1

Author(s): Bekir Afşar ◽ Doğan Aydin ◽ Aybars Uğur ◽ Serdar Korukoğlu

Keyword(s): Artificial Bee Colony Algorithm ◽ Artificial Bee Colony ◽ Parameter Control ◽ Adaptive Parameter ◽ Bee Colony ◽ Adaptive Parameter Control ◽ Self Adaptive

ABC-Gly: identifying protein lysine glycation sites with artificial bee colony algorithm

Current Proteomics ◽ 10.2174/1570164617666191227120136 ◽ 2019 ◽ Vol 17

Author(s): Yanqiu Yao ◽ Xiaosa Zhao ◽ Qiao Ning ◽ Junping Zhou

Keyword(s): Support Vector Machine ◽ Amino Acid ◽ Artificial Bee Colony Algorithm ◽ Artificial Bee Colony ◽ Training Dataset ◽ Support Vector ◽ Supplementary File ◽ Feature Subset ◽ Lipid Molecule ◽ Bee Colony

Background: Glycation is a nonenzymatic post-translational modification process that attaches a sugar molecule to a protein or lipid molecule. It may impair the function and change the characteristics of proteins, which may lead to some metabolic diseases. To understand the underlying molecular mechanisms of glycation, computational prediction methods have been developed because of their convenience and high speed. However, developing a more effective computational tool remains a challenging task in computational biology. Methods: In this study, we present an accurate identification tool named ABC-Gly for predicting lysine glycation sites. First, we used three informative features to encode the peptides: position-specific amino acid propensity, secondary structure, and the composition of k-spaced amino acid pairs. Next, to fully exploit discriminative features and thereby improve the prediction and generalization ability of the model, we developed a two-step feature selection method that combines the Fisher score with an improved binary artificial bee colony algorithm based on a support vector machine. Finally, based on the optimal feature subset, we constructed the model using a support vector machine on the training dataset. Results: Under 10-fold cross-validation on the training dataset, the proposed predictor ABC-Gly achieved a sensitivity of 76.43%, a specificity of 91.10%, a balanced accuracy of 83.76%, an area under the receiver operating characteristic curve (AUC) of 0.9313, and a Matthews correlation coefficient (MCC) of 0.6861; on the independent dataset, it achieved a balanced accuracy of 59.05%. Compared with state-of-the-art predictors on the training dataset, the proposed predictor achieved significant improvements of 0.156 in AUC and 0.336 in MCC.
Conclusion: The detailed analysis indicates that our predictor may serve as a powerful complementary tool to existing methods for predicting protein lysine glycation. The source code and datasets of ABC-Gly are provided in Supplementary File 1.
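For readers unfamiliar with the metrics reported above, the following sketch computes sensitivity, specificity, balanced accuracy, and MCC from confusion-matrix counts using their standard definitions. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: fraction of actual positives that are detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: fraction of actual negatives that are rejected."""
    return tn / (tn + fp)


def balanced_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 80, 90, 10, 20
print(f"balanced accuracy = {balanced_accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC               = {mcc(tp, tn, fp, fn):.4f}")
```

By this definition, the reported sensitivity (76.43%) and specificity (91.10%) average to 83.765%, which agrees with the reported balanced accuracy of 83.76%.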
