Effective Feature Subset Identification Using Adaptive Bee Colony Algorithm

Author(s):  
Suja K ◽  
Rengarajan A

Irrelevant and redundant features severely affect the accuracy of learning machines. Feature subset selection is the process of identifying and removing as many irrelevant and redundant features as possible. The proposed optimal feature selection method is divided into two main steps: (i) preprocessing and (ii) optimal feature selection using clustering and tree generation. First, the input microarray dataset is preprocessed. Then the possibilistic fuzzy c-means clustering algorithm, combined with an optimal minimum spanning tree algorithm, is applied to the high-dimensional microarray dataset to select the important features. The proposed method selects the features optimally with the help of an adaptive artificial bee colony algorithm.

2013 ◽  
Vol 380-384 ◽  
pp. 1593-1599
Author(s):  
Hao Yan Guo ◽  
Da Zheng Wang

The traditional motivation behind feature selection algorithms is to find the best subset of features for a task using one particular learning algorithm. However, it has often been found that no single classifier is entirely satisfactory for a particular task. How to further improve the performance of these single systems, starting from the previously found optimal feature subset, is therefore an important issue. We investigate the notion of optimal feature selection and present a practical feature selection approach based on the optimal feature subset of a single CAD system, referred to in this paper as the multilevel optimal feature selection method (MOFS). Through MOFS, we select different optimal feature subsets in order to eliminate redundant or irrelevant features and obtain optimal features.


Optimal feature subset selection over very high-dimensional data is a vital issue. Even once the optimal features are selected, classification based on those features remains a complicated task. To handle these problems, a novel Accelerated Simulated Annealing and Mutation Operator (ASAMO) feature selection algorithm is suggested in this work. To solve the classification problem, the Fuzzy Minimal Consistent Class Subset Coverage (FMCCSC) problem is introduced. In FMCCSC, a consistent subset is combined with the K-Nearest Neighbour (KNN) classifier, giving the FMCCSC-KNN classifier. Two data sets from the UCI Machine Learning Repository, Dorothea and Madelon, are used in experiments on optimal feature selection and classification. The experimental results substantiate the efficiency of the proposed ASAMO with the FMCCSC-KNN classifier compared to Particle Swarm Optimization (PSO) and Accelerated PSO feature selection algorithms.


Big data mining plays a critical role in real-world environments because of the large volume of data of different varieties and types. Handling these data and extracting useful knowledge from them is a difficult task that needs careful attention. Our previous research work addressed it by introducing Enhanced Particle Swarm Optimization with Genetic Algorithm – Modified Artificial Neural Network (EPSOGA-MANN), which can select the optimal features from a large volume of data. However, the performance of that method may degrade when the dataset contains missing values, and the method is complex to run because of the computational overhead of the ANN algorithm. The proposed research method resolves this by introducing the Missing Value concerned Optimal Feature Selection Method (MV-OFSM). In this method, an improved KNN imputation algorithm is introduced to handle the missing values. A dynamic clustering method then clusters the dataset based on a closeness measure. Next, an Anarchic Society Optimization (ASO) based approach performs feature selection on the given dataset. Finally, a hybrid ANN-GA technique performs the classification. The method is evaluated in the MATLAB simulation environment, and the results show that the proposed method performs better than the existing technique.
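The missing-value step can be illustrated with plain k-nearest-neighbour imputation. This is only a minimal sketch: the abstract does not describe how its "improved" KNN imputation differs, so the function name `knn_impute`, the distance (Euclidean over the features both rows have observed) and the value of `k` are all assumptions made for illustration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is Euclidean over the features both rows have observed,
    normalised by the number of shared features (an illustrative choice).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have feature j observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical toy data: the second row is missing its middle feature.
data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [0.9, 2.2, 2.9],
    [8.0, 9.0, 10.0],
]
imputed = knn_impute(data, k=2)
print(imputed[1])
```

The input rows are left unchanged and a filled copy is returned, so the imputed values can be compared against the original gaps.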


2021 ◽  
Vol 12 ◽  
Author(s):  
Dongxu Zhao ◽  
Zhixia Teng ◽  
Yanjuan Li ◽  
Dong Chen

Recently, several anti-inflammatory peptides (AIPs) have been found in the process of the inflammatory response, and these peptides have been used to treat some inflammatory and autoimmune diseases. Therefore, identifying AIPs accurately from given amino acid sequences is critical for the discovery of novel and efficient anti-inflammatory peptide-based therapeutics and for accelerating their application in therapy. In this paper, a random forest-based model called iAIPs is proposed for identifying AIPs. First, the original samples are encoded with three feature extraction methods: g-gap dipeptide composition (GDC), dipeptide deviation from the expected mean (DDE), and amino acid composition (AAC). Second, the optimal feature subset is generated by a two-step feature selection method, in which the features are ranked by analysis of variance (ANOVA) and the optimal subset is then chosen by an incremental feature selection strategy. Finally, the optimal feature subset is fed into the random forest classifier to construct the identification model. Experimental results show that iAIPs achieved an AUC value of 0.822 on an independent test dataset, which indicates that the proposed model performs better than the existing methods. Furthermore, the extraction of features from peptide sequences provides a basis for evolutionary analysis. The study of peptide identification helps in understanding the diversity of species and analyzing their evolutionary history.
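The two-step selection described above (ANOVA ranking followed by incremental feature selection) can be sketched as follows. This is an illustration under stated assumptions, not the iAIPs implementation: a nearest-centroid classifier scored by leave-one-out accuracy stands in for the random forest, the toy data are made up, and every function name is hypothetical.

```python
from statistics import mean

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in model).

    Assumes every class has at least two samples.
    """
    correct = 0
    for i in range(len(X)):
        point = [X[i][f] for f in features]
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            members = [X[m] for m in range(len(X)) if m != i and y[m] == label]
            centroid = [mean(row[f] for row in members) for f in features]
            dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)

def incremental_feature_selection(X, y):
    """Rank features by F-score, then keep the best-scoring ranked prefix."""
    n_features = len(X[0])
    scores = [anova_f([row[f] for row in X], y) for f in range(n_features)]
    ranking = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    best_subset, best_acc = [], -1.0
    for size in range(1, n_features + 1):
        subset = ranking[:size]
        acc = loo_accuracy(X, y, subset)
        if acc > best_acc:  # strict comparison keeps the smaller subset on ties
            best_subset, best_acc = subset, acc
    return ranking, best_subset, best_acc

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [
    [1.0, 5.0, 2.0], [1.2, 3.0, 2.5], [0.8, 4.0, 1.5], [1.1, 6.0, 3.0],
    [3.0, 5.5, 2.8], [3.2, 3.5, 3.5], [2.9, 4.5, 2.2], [3.1, 6.5, 3.9],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ranking, best_subset, best_acc = incremental_feature_selection(X, y)
print(ranking, best_subset, best_acc)
```

Evaluating only prefixes of the ranking keeps the search linear in the number of features, which is the main appeal of the incremental strategy over an exhaustive subset search.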


2020 ◽  
Author(s):  
Mumine Kaya Keles ◽  
Umit Kilic ◽  
Abdullah Emre Keles

Datasets contain relevant and irrelevant features, and evaluating them is fundamental for classification or clustering. Relevant features make classification more accurate and stable. Optimization methods are therefore used for feature selection, a feature reduction process that finds the most relevant feature subset without lowering the accuracy obtained with the original feature set. Various nature-inspired optimization algorithms have been proposed as feature selectors. The density of data in construction projects, and the inability to extract these data, cause various losses in field studies. In this respect, the behavior of leaders is important in the selection and efficient use of these data. The objective of this study is to implement the Artificial Bee Colony (ABC) algorithm as a feature selection method to predict the leadership perception of construction employees. With Random Forest, Sequential Minimal Optimization, and K-Nearest Neighbor (KNN) as classifiers, the highest accuracy (84.1584%) and the highest F-Measure (0.805) were obtained using the KNN and Random Forest classifiers with the proposed ABC algorithm as feature selector. The results show that a nature-inspired optimization algorithm such as ABC is a satisfactory feature selector for predicting construction employees' leadership perception.
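A wrapper-style ABC feature selector of the kind described above might look like the sketch below. It is not the study's implementation: the binary encoding (one boolean per feature), the single-bit-flip neighbour move, the objective (leave-one-out 1-NN accuracy minus a small per-feature penalty), and all parameter values and names are assumptions chosen to keep the example short.

```python
import random

def knn1_loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, row in enumerate(X):
        best_dist, best_label = None, None
        for m, other in enumerate(X):
            if m == i:
                continue
            d = sum((row[j] - other[j]) ** 2 for j in cols)
            if best_dist is None or d < best_dist:
                best_dist, best_label = d, y[m]
        correct += best_label == y[i]
    return correct / len(X)

def abc_feature_selection(X, y, n_sources=6, n_cycles=20, limit=5, penalty=0.01, seed=0):
    """Binary ABC sketch: each food source is a boolean feature mask."""
    rng = random.Random(seed)
    n = len(X[0])
    best = {"mask": None, "score": float("-inf")}

    def fitness(mask):
        # Accuracy minus a small per-feature penalty (illustrative objective).
        score = knn1_loo_accuracy(X, y, mask) - penalty * sum(mask)
        if score > best["score"]:
            best["mask"], best["score"] = list(mask), score
        return score

    def random_mask():
        return [rng.random() < 0.5 for _ in range(n)]

    sources = [random_mask() for _ in range(n_sources)]
    scores = [fitness(s) for s in sources]
    trials = [0] * n_sources

    def try_neighbour(i):
        # Neighbour move: flip one randomly chosen bit of source i.
        candidate = list(sources[i])
        j = rng.randrange(n)
        candidate[j] = not candidate[j]
        score = fitness(candidate)
        if score > scores[i]:
            sources[i], scores[i], trials[i] = candidate, score, 0
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        for i in range(n_sources):  # employed bees: one move per source
            try_neighbour(i)
        # Onlooker bees: revisit sources with probability proportional to fitness.
        weights = [max(s, 0.0) + 1e-9 for s in scores]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbour(i)
        for i in range(n_sources):  # scout bees: abandon stagnant sources
            if trials[i] > limit:
                sources[i] = random_mask()
                scores[i], trials[i] = fitness(sources[i]), 0
    return best["mask"], best["score"]

# Hypothetical toy data: only feature 0 carries class information.
X = [
    [0.0, 0.1, 0.9, 0.5], [0.5, 0.8, 0.2, 0.4], [0.2, 0.3, 0.7, 0.9], [0.4, 0.6, 0.1, 0.1],
    [10.0, 0.2, 0.8, 0.55], [10.3, 0.7, 0.3, 0.3], [9.8, 0.4, 0.6, 0.8], [10.1, 0.5, 0.0, 0.2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
best_mask, best_score = abc_feature_selection(X, y)
print(best_mask, best_score)
```

The per-feature penalty is what pushes the colony toward smaller subsets; with `penalty=0` any mask that reaches the top accuracy would be considered equally good.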


2014 ◽  
Vol 507 ◽  
pp. 806-809
Author(s):  
Shu Fang Li ◽  
Qin Jia ◽  
Hong Liang

To provide a real-time automatic classification method with high accuracy for red tide algae, this paper proposes using ReliefF-SBS for feature selection. Specifically, the features of the original red tide algae image dataset are analyzed. On this basis, feature selection removes irrelevant and redundant features from the original feature set to obtain the optimal feature subset and reduce their impact on classification accuracy. The classification results of two classifiers, SVM and KNN, are then compared before and after feature selection.
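The filter stage of such a pipeline can be illustrated with basic Relief weights. This sketch is a simplification, not the paper's ReliefF-SBS: it uses the original two-class Relief with a single nearest hit and miss rather than ReliefF, visits every sample once instead of sampling, and omits the sequential backward selection (SBS) stage. The function name and toy data are hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief feature weights using every sample once (two-class, k=1)."""
    n_samples, n_features = len(X), len(X[0])
    spans = []
    for f in range(n_features):
        column = [row[f] for row in X]
        spans.append((max(column) - min(column)) or 1.0)  # avoid division by zero

    def diff(f, a, b):
        # Range-normalised difference of feature f between two samples.
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(n_features))

    weights = [0.0] * n_features
    for i, row in enumerate(X):
        hits = [X[m] for m in range(n_samples) if m != i and y[m] == y[i]]
        misses = [X[m] for m in range(n_samples) if y[m] != y[i]]
        near_hit = min(hits, key=lambda other: distance(row, other))
        near_miss = min(misses, key=lambda other: distance(row, other))
        for f in range(n_features):
            # Reward features that differ on the nearest miss, punish those
            # that differ on the nearest hit.
            weights[f] += (diff(f, row, near_miss) - diff(f, row, near_hit)) / n_samples
    return weights

# Hypothetical toy data: feature 0 tracks the class, feature 1 is noise.
X = [[0.0, 0.2], [0.1, 0.9], [0.2, 0.5], [1.0, 0.3], [0.9, 0.8], [0.8, 0.6]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

Features with low or negative weights are the natural first candidates for removal in a subsequent backward selection pass.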

