MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR FILTER BASED FEATURE SELECTION IN CLASSIFICATION

2013 ◽  
Vol 22 (04) ◽  
pp. 1350024 ◽  
Author(s):  
BING XUE ◽  
LIAM CERVANTE ◽  
LIN SHANG ◽  
WILL N. BROWNE ◽  
MENGJIE ZHANG

Feature selection is a multi-objective problem with two main conflicting objectives: minimising the number of features and maximising the classification performance. However, most existing feature selection algorithms are single objective and do not appropriately reflect the actual need. The small number of existing multi-objective feature selection algorithms are wrapper based, and accordingly are computationally expensive and less general than filter algorithms. Evolutionary computation techniques are particularly suitable for multi-objective optimisation because they use a population of candidate solutions and are able to find multiple non-dominated solutions in a single run. However, the two well-known evolutionary multi-objective algorithms, the non-dominated sorting genetic algorithm II (NSGA-II) and the strength Pareto evolutionary algorithm 2 (SPEA2), have not been applied to filter-based feature selection. In this work, based on NSGA-II and SPEA2, we develop two multi-objective, filter-based feature selection frameworks. Four multi-objective feature selection methods are then developed by applying mutual information and entropy as two different filter evaluation criteria in each of the two proposed frameworks. The proposed multi-objective algorithms are examined and compared with a single objective method and three traditional methods (two filters and one wrapper) on eight benchmark datasets. A decision tree is employed to test the classification performance. Experimental results show that the proposed multi-objective algorithms can automatically evolve a set of non-dominated solutions that use a smaller number of features and achieve better classification performance than using all features. NSGA-II and SPEA2 outperform the single objective algorithm, the two traditional filter algorithms and even the traditional wrapper algorithm in terms of both the number of features and the classification performance in most cases. NSGA-II achieves performance similar to SPEA2 on the datasets that consist of a small number of features, and slightly better results when the number of features is large. This work represents the first study of NSGA-II and SPEA2 for filter-based feature selection in classification problems, with both providing field-leading classification performance.
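A minimal sketch of the filter-based multi-objective formulation described above, assuming discrete features: each candidate is a binary feature mask, the two objectives are the subset size and a negated mutual-information relevance score, and the non-dominated candidates form the evolved trade-off front. The mean per-feature MI criterion and the `pareto_front` helper below are illustrative stand-ins, not the paper's exact evaluation criteria or NSGA-II/SPEA2 machinery.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) for two discrete 1-D arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))
            if p_xy > 0:
                mi += p_xy * np.log2(p_xy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def objectives(mask, X, y):
    """(subset size, negated relevance) -- both objectives are minimised."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return (0, 0.0)
    relevance = np.mean([mutual_information(X[:, j], y) for j in idx])
    return (idx.size, -relevance)

def pareto_front(points):
    """Indices of non-dominated points (all objectives minimised)."""
    pts = np.asarray(points)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 10))   # toy discrete dataset
y = (X[:, 0] + X[:, 3]) % 2              # class depends on features 0 and 3
population = rng.integers(0, 2, size=(30, 10))
scores = [objectives(m, X, y) for m in population]
front = pareto_front(scores)
print("non-dominated subsets:", [np.flatnonzero(population[i]) for i in front])
```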

2021 ◽  
pp. 1-19
Author(s):  
Yu Xue ◽  
Haokai Zhu ◽  
Ferrante Neri

In classification tasks, feature selection (FS) can reduce the data dimensionality and may also improve classification accuracy, and these two goals are commonly treated as the two objectives in FS problems. Many meta-heuristic algorithms have been applied to FS problems, and they perform satisfactorily when the problem is relatively simple. However, once the dimensionality of the dataset grows, their performance drops dramatically. This paper proposes a self-adaptive multi-objective genetic algorithm (SaMOGA) for FS, which is designed to maintain high performance even as the dimensionality of the dataset grows. The main concept of SaMOGA lies in the dynamic selection among five different crossover operators at different stages of the evolution process through a self-adaptive mechanism. A search stagnation detection mechanism is also proposed to prevent premature convergence. In the experiments, we compare SaMOGA with five multi-objective FS algorithms on sixteen datasets. According to the experimental results, SaMOGA yields a set of well-converged and well-distributed solutions on most datasets, indicating that SaMOGA can guarantee classification performance while removing many features; its advantage over its counterparts becomes more obvious as the dimensionality of the datasets grows.
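As a rough illustration of the self-adaptive mechanism, the sketch below keeps a selection probability for each crossover operator and reinforces operators whose offspring improve on their parents. SaMOGA selects among five operators; only three common ones are shown here, and the reward-based update rule is an assumption for illustration, not the paper's exact scheme.

```python
import random

def one_point(p1, p2):
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def two_point(p1, p2):
    a, b = sorted(random.sample(range(len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:]

def uniform(p1, p2):
    return [random.choice(pair) for pair in zip(p1, p2)]

OPERATORS = [one_point, two_point, uniform]

class AdaptiveSelector:
    """Maintains per-operator selection probabilities, updated by success."""

    def __init__(self, n_ops, reward=0.1, floor=0.05):
        self.probs = [1.0 / n_ops] * n_ops
        self.reward, self.floor = reward, floor

    def pick(self):
        return random.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, op_idx, child_improved):
        # Reinforce operators whose offspring improved on a parent; keep a
        # probability floor so no operator is ever eliminated entirely.
        if child_improved:
            self.probs[op_idx] += self.reward
        total = sum(self.probs)
        self.probs = [max(p / total, self.floor) for p in self.probs]
```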


2021 ◽  
Author(s):  
T Butler-Yeoman ◽  
Bing Xue ◽  
Mengjie Zhang

© 2015 IEEE. Feature selection is an important pre-processing step, which can reduce the dimensionality of a dataset and increase the accuracy and efficiency of a learning/classification algorithm. However, existing feature selection algorithms, mainly wrappers and filters, have their own advantages and disadvantages. This paper proposes two filter-wrapper hybrid feature selection algorithms based on particle swarm optimisation (PSO). The first algorithm, named FastPSO, combines filter and wrapper evaluations in the search process of PSO for feature selection, with most of the evaluations performed as filters and a small number as wrappers. The second algorithm, named RapidPSO, further reduces the number of wrapper evaluations. Theoretical analysis of FastPSO and RapidPSO is conducted to investigate their complexity. FastPSO and RapidPSO are compared with a pure wrapper algorithm, named WrapperPSO, and a pure filter algorithm, named FilterPSO, on nine benchmark datasets of varying difficulty. The experimental results show that both FastPSO and RapidPSO can successfully reduce the number of features and simultaneously increase the classification performance over using all features. The two proposed algorithms maintain the high classification performance achieved by WrapperPSO and significantly reduce the computational time, although the number of features is larger. At the same time, they increase the classification accuracy of FilterPSO and reduce the number of features, but at a higher computational cost. FastPSO outperforms RapidPSO in terms of the classification accuracy and the number of features, but requires more computational time, which shows the trade-off between efficiency and effectiveness.
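A minimal sketch of the hybrid evaluation idea behind FastPSO: every particle gets a cheap filter score, and only the most promising few are re-evaluated with the expensive wrapper. The mean-MI filter criterion, the KNN wrapper, and the "top-k go to the wrapper" policy are illustrative assumptions; the paper's algorithms embed the filter/wrapper mix inside the PSO search itself.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def filter_score(mask, X, y):
    """Cheap relevance estimate: mean MI of selected features with the class."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 0.0
    return mutual_info_classif(X[:, idx], y, random_state=0).mean()

def wrapper_score(mask, X, y):
    """Expensive estimate: cross-validated accuracy of a KNN classifier."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, idx], y, cv=3).mean()

def evaluate_swarm(masks, X, y, n_wrapper=2):
    """Filter-score every particle, then wrapper-score only the top n_wrapper.

    Note: filter and wrapper scores live on different scales; a full
    implementation would reconcile them rather than mixing them directly.
    """
    scores = np.array([filter_score(m, X, y) for m in masks])
    for i in np.argsort(scores)[-n_wrapper:]:
        scores[i] = wrapper_score(masks[i], X, y)  # refine promising particles
    return scores
```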


2021 ◽  
Author(s):  
E Hancer ◽  
Bing Xue ◽  
Mengjie Zhang ◽  
D Karaboga ◽  
B Akay

© 2017 Elsevier Inc. Feature selection has two major conflicting aims, i.e., to maximize the classification performance and to minimize the number of selected features to overcome the curse of dimensionality. To balance their trade-off, feature selection can be handled as a multi-objective problem. In this paper, a feature selection approach is proposed based on a new multi-objective artificial bee colony (ABC) algorithm integrated with a non-dominated sorting procedure and genetic operators. Two different implementations of the proposed approach are developed: ABC with binary representation and ABC with continuous representation. Their performance is examined on 12 benchmark datasets and the results are compared with those of linear forward selection, greedy stepwise backward selection, two single objective ABC algorithms and three well-known multi-objective evolutionary computation algorithms. The results show that the proposed approach with the binary representation outperforms the other methods in terms of both dimensionality reduction and classification accuracy.
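A minimal sketch of how an artificial bee colony step can be adapted to the binary representation: an employed bee perturbs its binary food source by flipping bits, and the candidate replaces the original only if it Pareto-dominates it on the two objectives. The flip neighbourhood and the dominance-based greedy selection are plausible assumptions for illustration, not the paper's exact operators.

```python
import random

def neighbour(mask, n_flips=1):
    """Generate a candidate food source by flipping n_flips random bits."""
    child = list(mask)
    for j in random.sample(range(len(mask)), n_flips):
        child[j] = 1 - child[j]
    return child

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def employed_bee_step(mask, evaluate):
    """One greedy step: keep the candidate only if it dominates the source.

    `evaluate` maps a binary mask to its objective vector, e.g.
    (subset size, error rate).
    """
    cand = neighbour(mask)
    return cand if dominates(evaluate(cand), evaluate(mask)) else mask
```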


2021 ◽  
Author(s):  
Bing Xue ◽  
Mengjie Zhang ◽  
William Browne

© 2015 Imperial College Press. Feature selection is an important data preprocessing step in machine learning and data mining tasks such as classification. Research on feature selection has been conducted extensively for more than 50 years, and different types of approaches have been proposed, including wrapper and filter approaches, and single objective and multi-objective approaches. However, the advantages and disadvantages of these approaches have not been thoroughly investigated. This paper provides a comprehensive study comparing different types of feature selection approaches, specifically: the classification performance and computational time of wrappers and filters, the generality of wrapper approaches, and single objective versus multi-objective approaches. Particle swarm optimization (PSO)-based approaches, which include different types of methods, are used as typical examples to conduct this research. A total of 10 different feature selection methods and over 7000 experiments are involved. The results show that filters are usually faster than wrappers, but wrappers using a simple classification algorithm can be faster than filters. Wrappers often achieve better classification performance than filters. Feature subsets obtained from wrappers can be general to other classification algorithms. Meanwhile, multi-objective approaches are generally better choices than single objective algorithms. The findings are useful not only for researchers developing new approaches to address new challenges in feature selection, but also for real-world decision makers choosing a specific feature selection method according to their own requirements.


2021 ◽  
Author(s):  
E Hancer ◽  
Bing Xue ◽  
Mengjie Zhang ◽  
D Karaboga ◽  
B Akay

© 2015 IEEE. Feature selection often involves two conflicting objectives: minimizing the feature subset size and maximizing the classification accuracy. In this paper, a multi-objective artificial bee colony (MOABC) framework is developed for feature selection in classification, and a new fuzzy mutual information based criterion is proposed to evaluate the relevance of feature subsets. Three new multi-objective feature selection approaches are proposed by integrating MOABC with three filter fitness evaluation criteria: mutual information, the original fuzzy mutual information and the proposed fuzzy mutual information. The proposed multi-objective feature selection approaches are examined by comparing them with three single-objective ABC-based feature selection approaches on six commonly used datasets. The results show that the proposed approaches are able to achieve better performance than using the original full feature set in terms of the classification accuracy and the number of features. Using the same evaluation criterion, the proposed multi-objective algorithms generally perform better than the single-objective methods, especially in terms of reducing the number of features. Furthermore, the proposed fuzzy mutual information criterion outperforms mutual information and the original fuzzy mutual information in both the single-objective and multi-objective settings. This work is the first study on multi-objective ABC for filter feature selection in classification, and shows that multi-objective ABC can be effectively used to address feature selection problems.
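A rough sketch of a fuzzy mutual information style criterion: continuous features are fuzzified with triangular membership functions, fuzzy-set probabilities are estimated as mean membership degrees, and MI is computed over the fuzzy partition and the class labels. This is one common formulation, stated as an assumption; the paper proposes its own refined fuzzy MI criterion, which is not reproduced here.

```python
import numpy as np

def triangular_memberships(x, n_sets=3):
    """Membership degrees of each sample in n_sets triangular fuzzy sets."""
    centers = np.linspace(x.min(), x.max(), n_sets)
    width = (x.max() - x.min()) / (n_sets - 1)
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers[None, :]) / width)

def fuzzy_mutual_information(x, y_onehot):
    """I(X;Y) with fuzzy-set probabilities taken as mean membership degrees.

    `x` is a continuous 1-D feature; `y_onehot` is a one-hot class matrix.
    """
    mu_x = triangular_memberships(x)           # shape (n_samples, n_sets)
    p_x = mu_x.mean(axis=0)                    # fuzzy marginal of X
    p_y = y_onehot.mean(axis=0)                # class marginal
    # joint fuzzy probability of (fuzzy set a, class c)
    p_xy = (mu_x[:, :, None] * y_onehot[:, None, :]).mean(axis=0)
    mask = p_xy > 0
    ratio = p_xy[mask] / np.outer(p_x, p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log2(ratio)))
```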


Symmetry ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 858 ◽  
Author(s):  
Jun Liang ◽  
Liang Hou ◽  
Zhenhua Luan ◽  
Weiping Huang

Feature interaction is a newly proposed type of feature relevance relationship, and the unintentional removal of interactive features can result in poor classification performance. Traditional feature selection algorithms, however, mainly focus on detecting relevant and redundant features, while interactive features are usually ignored. To deal with this problem, feature relevance, feature redundancy and feature interaction are redefined based on information theory. A new feature selection algorithm named CMIFSI (Conditional Mutual Information based Feature Selection considering Interaction) is then proposed, which makes use of conditional mutual information to estimate both feature redundancy and feature interaction. To verify the effectiveness of our algorithm, empirical experiments are conducted to compare it with several other representative feature selection algorithms. The results on both synthetic and benchmark datasets indicate that our algorithm achieves better results than the other methods in most cases, and highlight the necessity of dealing with feature interaction.
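A minimal sketch of the conditional mutual information quantity that CMIFSI builds on, using the standard decomposition I(X;Y|Z) = Σ_z p(z) I(X;Y|Z=z) with plug-in count estimates. The XOR example at the end shows why interaction matters: each feature alone carries almost no information about the class, but conditioning on the other reveals it.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) for discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))
            if p_xy > 0:
                mi += p_xy * np.log2(p_xy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = sum over z of p(z) * I(X;Y | Z=z)."""
    cmi = 0.0
    for zv in np.unique(z):
        sel = z == zv
        cmi += np.mean(sel) * mutual_information(x[sel], y[sel])
    return cmi

# XOR-style interaction: x1 and x2 are individually uninformative about y,
# but each becomes fully informative once the other is known.
rng = np.random.default_rng(1)
x1 = rng.integers(0, 2, 5000)
x2 = rng.integers(0, 2, 5000)
y = x1 ^ x2
print(mutual_information(x1, y))                  # close to 0 bits
print(conditional_mutual_information(x1, y, x2))  # close to 1 bit
```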


Author(s):  
BING XUE ◽  
LIAM CERVANTE ◽  
LIN SHANG ◽  
WILL N. BROWNE ◽  
MENGJIE ZHANG

Feature selection is a multi-objective problem, where the two main objectives are to maximize the classification accuracy and minimize the number of features. However, most existing algorithms are single objective, wrapper approaches. In this work, we investigate the use of binary particle swarm optimization (BPSO) and probabilistic rough set (PRS) theory for multi-objective feature selection. We use PRS to propose a new measure for the number of features, based on which a new filter-based single objective algorithm (PSOPRSE) is developed. A new filter-based multi-objective algorithm (MORSE) is then proposed, which aims to maximize a measure of the classification performance and minimize the new measure of the number of features. MORSE is examined and compared with PSOPRSE, two existing PSO-based single objective algorithms, two traditional methods, and the only existing BPSO and PRS-based multi-objective algorithm (MORSN). Experiments have been conducted on six commonly used discrete datasets with a relatively small number of features and six continuous datasets with a large number of features. The classification performance of the selected feature subsets is evaluated by three classification algorithms (decision trees, Naïve Bayes, and k-nearest neighbors). The results show that the proposed algorithms can automatically select a smaller number of features and achieve similar or better classification performance than using all features. PSOPRSE achieves better performance than the other two PSO-based single objective algorithms and the two traditional methods. MORSN and MORSE outperform all five of these single objective algorithms in terms of both the classification performance and the number of features, and MORSE achieves better classification performance than MORSN. The filter algorithms are general across the three different classification algorithms.
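For reference, a sketch of the standard sigmoid-based binary PSO update that BPSO approaches such as PSOPRSE and MORSE build on: velocities are updated from the personal and global bests, and each bit is then resampled with probability sigmoid(velocity). The PRS-based measures themselves are not reproduced here, and the inertia/acceleration constants below are common defaults, not the paper's settings.

```python
import numpy as np

def bpso_step(pos, vel, pbest, gbest, w=0.72, c1=1.49, c2=1.49, rng=None):
    """One binary PSO update (Kennedy & Eberhart's sigmoid formulation).

    pos, pbest, gbest are 0/1 arrays over the features; vel is real-valued.
    """
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    prob = 1.0 / (1.0 + np.exp(-vel))            # sigmoid transfer function
    pos = (rng.random(pos.shape) < prob).astype(int)
    return pos, vel
```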


2021 ◽  
Author(s):  
Bing Xue

Classification problems often have a large number of features, but not all of them are useful for classification. Irrelevant and redundant features may even reduce the classification accuracy. Feature selection is the process of selecting a subset of relevant features, which can decrease the dimensionality, shorten the running time, and/or improve the classification accuracy. There are two types of feature selection approaches, i.e. wrapper and filter approaches. Their main difference is that wrappers use a classification algorithm to evaluate the goodness of the features during the feature selection process, while filters are independent of any classification algorithm. Feature selection is a difficult task because of feature interactions and the large search space. Existing feature selection methods suffer from different problems, such as stagnation in local optima and high computational cost. Evolutionary computation (EC) techniques are well-known global search algorithms. Particle swarm optimisation (PSO) is an EC technique that is computationally less expensive and can converge faster than other methods. PSO has been successfully applied to many areas, but its potential for feature selection has not been fully investigated.

The overall goal of this thesis is to investigate and improve the capability of PSO for feature selection, so as to select a smaller number of features and achieve similar or better classification performance than using all features. This thesis investigates the use of PSO for both wrapper and filter feature selection, and for both single objective and multi-objective feature selection, and also investigates the differences between wrappers and filters.

This thesis proposes a new PSO-based wrapper, single objective feature selection approach by developing new initialisation and updating mechanisms. The results show that, by considering the number of features in the initialisation and updating procedures, the new algorithm can improve the classification performance, reduce the number of features and decrease the computational time.

This thesis develops the first PSO-based wrapper multi-objective feature selection approach, which aims to maximise the classification accuracy and simultaneously minimise the number of features. The results show that the proposed multi-objective algorithm can obtain more and better feature subsets than single objective algorithms, and outperforms other well-known EC-based multi-objective feature selection algorithms.

This thesis develops a filter, single objective feature selection approach based on PSO and information theory. Two measures are proposed to evaluate the relevance of the selected features, based on each pair of features and on a group of features, respectively. The results show that PSO and information based algorithms can successfully address feature selection tasks. The group based method achieves higher classification accuracies, but the pair based method is faster and selects smaller feature subsets.

This thesis proposes the first PSO-based multi-objective filter feature selection approach using information based measures. This is also the first work to use two other well-known multi-objective EC algorithms in filter feature selection; they are also used to compare against the performance of the PSO-based approach. The results show that the PSO-based multi-objective filter approach can successfully address feature selection problems, outperforming single objective filter algorithms and achieving better classification performance than the other multi-objective algorithms.

This thesis investigates the difference between wrapper and filter approaches in terms of classification performance and computational time, and also examines the generality of wrappers. The results show that wrappers generally achieve better or similar classification performance to filters, but do not always need more computational time than filters. The results also show that wrappers built with simple classification algorithms can be general to other classification algorithms.
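As a rough illustration of an initialisation mechanism that considers the number of features, the sketch below seeds most particles with small feature subsets (mimicking forward selection) and the rest with large ones (mimicking backward selection), biasing the swarm toward small subsets from the outset. The 2/3 split and the subset-size ranges are illustrative assumptions, not the thesis's exact settings.

```python
import numpy as np

def initialise_swarm(n_particles, n_features, small_frac=2 / 3, rng=None):
    """Seed most particles with small subsets and the rest with large ones."""
    rng = rng or np.random.default_rng()
    swarm = np.zeros((n_particles, n_features), dtype=int)
    n_small = int(n_particles * small_frac)
    for i in range(n_particles):
        if i < n_small:
            # "forward" particles: up to roughly 10% of the features
            k = int(rng.integers(1, max(1, n_features // 10) + 1))
        else:
            # "backward" particles: more than half of the features
            k = int(rng.integers(n_features // 2, n_features) + 1)
        swarm[i, rng.choice(n_features, size=k, replace=False)] = 1
    return swarm
```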

