Particle Swarm Optimisation for Feature Selection in Classification

2021
Author(s): Bing Xue

Classification problems often have a large number of features, but not all of them are useful for classification. Irrelevant and redundant features may even reduce the classification accuracy. Feature selection is the process of selecting a subset of relevant features, which can decrease the dimensionality, shorten the running time, and/or improve the classification accuracy. There are two types of feature selection approaches, i.e., wrapper and filter approaches. Their main difference is that wrappers use a classification algorithm to evaluate the goodness of the features during the feature selection process, while filters are independent of any classification algorithm. Feature selection is a difficult task because of feature interactions and the large search space. Existing feature selection methods suffer from different problems, such as stagnation in local optima and high computational cost. Evolutionary computation (EC) techniques are well-known global search algorithms. Particle swarm optimisation (PSO) is an EC technique that is computationally less expensive and can converge faster than other methods. PSO has been successfully applied to many areas, but its potential for feature selection has not been fully investigated.

The overall goal of this thesis is to investigate and improve the capability of PSO for feature selection, to select a smaller number of features and achieve classification performance similar to or better than using all features. The thesis investigates the use of PSO for both wrapper and filter approaches, and for both single objective and multi-objective feature selection, and also investigates the differences between wrappers and filters.

This thesis proposes a new PSO based wrapper, single objective feature selection approach by developing new initialisation and updating mechanisms. The results show that by considering the number of features in the initialisation and updating procedures, the new algorithm can improve the classification performance, reduce the number of features and decrease the computational time.

This thesis develops the first PSO based wrapper multi-objective feature selection approach, which aims to maximise the classification accuracy and simultaneously minimise the number of features. The results show that the proposed multi-objective algorithm can obtain more and better feature subsets than single objective algorithms, and outperforms other well-known EC based multi-objective feature selection algorithms.

This thesis develops a filter, single objective feature selection approach based on PSO and information theory. Two measures are proposed to evaluate the relevance of the selected features, based on each pair of features and on a group of features, respectively. The results show that the PSO and information based algorithms can successfully address feature selection tasks. The group based method achieves higher classification accuracies, but the pair based method is faster and selects smaller feature subsets.

This thesis proposes the first PSO based multi-objective filter feature selection approach using information based measures. This work is also the first to use two other well-known multi-objective EC algorithms in filter feature selection, which are also used to compare the performance of the PSO based approach. The results show that the PSO based multi-objective filter approach can successfully address feature selection problems, outperform single objective filter algorithms, and achieve better classification performance than the other multi-objective algorithms.

Finally, this thesis investigates the difference between wrapper and filter approaches in terms of classification performance and computational time, and also examines the generality of wrappers. The results show that wrappers generally achieve classification performance better than or similar to that of filters, but do not always require more computational time than filters. The results also show that wrappers built with simple classification algorithms can be general to other classification algorithms.
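The wrapper approaches described above build on binary PSO, where each particle is a bit mask over the features and fitness comes from a classifier trained on the masked data. Below is a minimal sketch of that idea; the dataset, the KNN classifier, and the accuracy/size weight `alpha` are illustrative assumptions, not the initialisation and updating mechanisms the thesis actually proposes.

```python
# Minimal binary-PSO wrapper for feature selection (illustrative sketch only).
# The fitness trades off cross-validated accuracy against subset size; the
# weight `alpha`, classifier, and dataset are assumptions for this demo.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_particles, n_features, n_iters = 20, X.shape[1], 30
alpha = 0.9  # weight on accuracy vs. subset size

def fitness(mask):
    sel = mask.astype(bool)
    if not sel.any():
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(), X[:, sel], y, cv=3).mean()
    return alpha * acc + (1 - alpha) * (1 - sel.sum() / n_features)

pos = (rng.random((n_particles, n_features)) < 0.5).astype(int)  # bit masks
vel = rng.normal(0, 1, (n_particles, n_features))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    vel = np.clip(vel, -6, 6)
    # sigmoid transfer: probability of each bit being set to 1
    pos = (rng.random(vel.shape) < 1 / (1 + np.exp(-vel))).astype(int)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"selected {gbest.sum()}/{n_features} features, fitness={pbest_fit.max():.3f}")
```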


2021
Author(s): Bing Xue, Mengjie Zhang, William Browne

© 2015 Imperial College Press. Feature selection is an important data preprocessing step in machine learning and data mining tasks such as classification. Research on feature selection has been conducted for more than 50 years and different types of approaches have been proposed, including wrapper and filter approaches, and single objective and multi-objective approaches. However, the advantages and disadvantages of these approaches have not been thoroughly investigated. This paper provides a comprehensive comparative study of different types of feature selection approaches, specifically comparing the classification performance and computational time of wrappers and filters, the generality of wrapper approaches, and single objective versus multi-objective approaches. Particle swarm optimization (PSO)-based approaches, which include different types of methods, are used as typical examples to conduct this research. A total of 10 different feature selection methods and over 7000 experiments are involved. The results show that filters are usually faster than wrappers, but wrappers using a simple classification algorithm can be faster than filters. Wrappers often achieve better classification performance than filters. Feature subsets obtained from wrappers can be general to other classification algorithms. Meanwhile, multi-objective approaches are generally better choices than single objective algorithms. The findings are useful not only for researchers developing new approaches to new challenges in feature selection, but also for real-world decision makers choosing a specific feature selection method according to their own requirements.
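The wrapper/filter timing contrast at the heart of this study can be sketched directly: a filter scores features once, independently of any classifier, while a wrapper pays for a full train/test cycle per candidate subset. The dataset and classifier below are placeholders, not those used in the paper.

```python
# Contrast of the two evaluation styles: filter (mutual information, one pass,
# no classifier) vs. wrapper (cross-validated accuracy per candidate subset).
import time
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

t0 = time.perf_counter()
filter_scores = mutual_info_classif(X, y, random_state=0)  # scores every feature
t_filter = time.perf_counter() - t0

subset = filter_scores.argsort()[-10:]  # a single candidate subset for the wrapper
t0 = time.perf_counter()
wrapper_score = cross_val_score(KNeighborsClassifier(), X[:, subset], y, cv=5).mean()
t_wrapper = time.perf_counter() - t0

print(f"filter: scored all {X.shape[1]} features in {t_filter:.3f}s")
print(f"wrapper: one subset evaluation took {t_wrapper:.3f}s, acc={wrapper_score:.3f}")
```

A PSO wrapper repeats the second step for every particle at every iteration, which is why filters are usually, but not always, faster.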


Author(s): Bing Xue, Liam Cervante, Lin Shang, Will N. Browne, Mengjie Zhang

Feature selection is a multi-objective problem, where the two main objectives are to maximize the classification accuracy and minimize the number of features. However, most existing algorithms are single objective, wrapper approaches. In this work, we investigate the use of binary particle swarm optimization (BPSO) and probabilistic rough set (PRS) theory for multi-objective feature selection. We use PRS to propose a new measure for the number of features, based on which a new filter based single objective algorithm (PSOPRSE) is developed. Then a new filter based multi-objective algorithm (MORSE) is proposed, which aims to maximize a measure of the classification performance and minimize the new measure of the number of features. MORSE is examined and compared with PSOPRSE, two existing PSO-based single objective algorithms, two traditional methods, and the only existing BPSO and PRS-based multi-objective algorithm (MORSN). Experiments have been conducted on six commonly used discrete datasets with a relatively small number of features and six continuous datasets with a large number of features. The classification performance of the selected feature subsets is evaluated by three classification algorithms (decision trees, naïve Bayes, and k-nearest neighbors). The results show that the proposed algorithms can automatically select a smaller number of features and achieve classification performance similar to or better than using all features. PSOPRSE achieves better performance than the other two PSO-based single objective algorithms and the two traditional methods. MORSN and MORSE outperform all five of these single objective algorithms in terms of both the classification performance and the number of features. These filter algorithms are general to the three different classification algorithms, and MORSE achieves better classification performance than MORSN.
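Multi-objective methods such as MORSE return a set of non-dominated trade-offs rather than a single subset. The sketch below shows the generic Pareto-filtering step over hypothetical (classification error, subset size) pairs; it illustrates the selection principle only, not MORSE's PRS-based measures.

```python
# Generic non-dominated (Pareto) filtering over candidate feature subsets:
# minimise both classification error and subset size. Candidate values are
# made up for illustration.
from typing import List, Tuple

def dominates(a: Tuple[float, int], b: Tuple[float, int]) -> bool:
    """a dominates b if it is no worse on both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(cands: List[Tuple[float, int]]) -> List[Tuple[float, int]]:
    return [c for c in cands if not any(dominates(o, c) for o in cands)]

# (classification error, number of features) for hypothetical subsets
candidates = [(0.08, 12), (0.10, 5), (0.08, 9), (0.15, 3), (0.09, 9), (0.20, 2)]
print(pareto_front(candidates))  # -> [(0.10, 5), (0.08, 9), (0.15, 3), (0.20, 2)]
```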



2021, pp. 1-19
Author(s): Yu Xue, Haokai Zhu, Ferrante Neri

In classification tasks, feature selection (FS) can reduce the data dimensionality and may also improve classification accuracy, and these two goals are commonly treated as the two objectives in FS problems. Many meta-heuristic algorithms have been applied to FS problems, and they perform satisfactorily when the problem is relatively simple. However, once the dimensionality of the dataset grows, their performance drops dramatically. This paper proposes a self-adaptive multi-objective genetic algorithm (SaMOGA) for FS, which is designed to maintain high performance even as the dimensionality of the dataset grows. The main concept of SaMOGA lies in the dynamic selection among five different crossover operators at different stages of the evolutionary process by applying a self-adaptive mechanism. Meanwhile, a search stagnation detection mechanism is also proposed to prevent premature convergence. In the experiments, we compare SaMOGA with five multi-objective FS algorithms on sixteen datasets. According to the experimental results, SaMOGA yields a set of well-converged and well-distributed solutions on most datasets, indicating that SaMOGA can maintain classification performance while removing many features, and its advantage over its counterparts becomes more obvious as the dimensionality of the dataset grows.
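The adaptive ingredient, choosing among several crossover operators as the search progresses, can be sketched with a generic success-rate scheme. The selection rule below (probability proportional to a smoothed success rate) is a common adaptive-operator-selection pattern and an assumption here, not SaMOGA's exact mechanism; the operator names are placeholders.

```python
# Generic success-rate-based adaptive operator selection: operators that
# recently produced improving children are picked more often.
import random

class AdaptiveOperatorSelector:
    def __init__(self, operators, smoothing=1.0):
        self.operators = operators
        self.used = {op: 0 for op in operators}
        self.wins = {op: 0 for op in operators}
        self.smoothing = smoothing  # keeps rarely-used operators selectable

    def pick(self):
        weights = [(self.wins[op] + self.smoothing) / (self.used[op] + self.smoothing)
                   for op in self.operators]
        return random.choices(self.operators, weights=weights, k=1)[0]

    def report(self, op, child_improved: bool):
        self.used[op] += 1
        self.wins[op] += child_improved

sel = AdaptiveOperatorSelector(["one_point", "two_point", "uniform", "sbx", "arith"])
for _ in range(100):
    op = sel.pick()
    # stand-in for: apply crossover `op`, evaluate the child, test improvement
    sel.report(op, child_improved=random.random() < 0.3)
print({op: sel.used[op] for op in sel.operators})
```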


2021
Author(s): A B Pawar, M A Jawale, Ravi Kumar Tirandasu, Saiprasad Potharaju

High dimensionality is a serious issue in the preprocessing stage of data mining. Having a large number of features in a dataset leads to several complications when classifying an unknown instance. The initial data space may contain redundant and irrelevant features, which increase memory consumption and confuse the learning model built from them. It is always advisable to select the best features and build the classification model on them for better accuracy. In this research, we propose a novel feature selection approach based on Symmetrical Uncertainty and Correlation Coefficient (SU-CCE) for reducing the high dimensional feature space and increasing the classification accuracy. The experiment is performed on a colon cancer microarray dataset with 2000 features, from which the proposed method derives the 38 best features. To measure the strength of the proposed method, the top 38 features extracted by four traditional filter-based methods are evaluated with various classifiers and compared. After careful investigation of the results, the proposed approach is competitive with most of the traditional methods.
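Symmetrical uncertainty, one of the measures combined in SU-CCE, normalises mutual information by the feature and class entropies: SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), giving a score in [0, 1]. A minimal sketch for discrete-valued features follows; it assumes nothing about the paper's actual implementation.

```python
# Symmetrical uncertainty for discrete variables, using
# I(X;Y) = H(X) + H(Y) - H(X,Y).
import numpy as np

def entropy(x):
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def joint_entropy(x, y):
    _, counts = np.unique(np.stack([x, y], axis=1), axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def symmetrical_uncertainty(x, y):
    hx, hy = entropy(x), entropy(y)
    mi = hx + hy - joint_entropy(x, y)
    return 0.0 if hx + hy == 0 else 2 * mi / (hx + hy)

y  = np.array([0, 0, 1, 1, 0, 0, 1, 1])
f1 = np.array([1, 1, 0, 0, 1, 1, 0, 0])   # relabelling of y -> SU = 1
f2 = np.array([0, 1, 0, 1, 0, 1, 0, 1])   # independent of y -> SU = 0
print(symmetrical_uncertainty(f1, y), symmetrical_uncertainty(f2, y))
```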


Author(s): Heba F. Eid, Mostafa A. Salama, Aboul Ella Hassanien

Feature selection is a preprocessing step for machine learning that increases classification accuracy and reduces complexity. Feature selection methods fall into two main categories: filter and wrapper. Filter methods evaluate features without involving any learning algorithm, while wrapper methods depend on a learning algorithm for feature evaluation. A variety of hybrid filter and wrapper methods have been proposed in the literature. However, hybrid filter and wrapper approaches suffer from the problem of determining the cut-off point of the ranked features, which can decrease classification accuracy by eliminating important features. In this paper the authors propose a hybrid bi-layer behavioral-based feature selection approach, which combines filter and wrapper feature selection methods and solves the cut-off point problem for the ranked features. It consists of two layers: at the first layer, information gain is used to rank the features and select a new set of features depending on a global maximum of classification accuracy; at the second layer, a new subset of features is selected from within the data set reduced by the first layer, by searching for a group of local maxima of classification accuracy. To evaluate the proposed approach, it is applied to the NSL-KDD dataset, where the number of features is reduced from 41 to 34 at the first layer and from 34 to 20 at the second layer, improving the classification accuracy to 99.2%.
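The first-layer cut-off rule can be sketched generically: rank features, sweep how many are kept, and cut at the global accuracy maximum rather than a fixed threshold. The code below uses mutual information as a stand-in for information gain, with a placeholder dataset and classifier; NSL-KDD and the paper's exact procedure are not reproduced.

```python
# Choosing the cut-off point of an information-gain-style ranking by the
# global maximum of cross-validated accuracy over the top-k features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
ranking = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]

accs = []
for k in range(1, X.shape[1] + 1):
    top_k = ranking[:k]
    accs.append(cross_val_score(DecisionTreeClassifier(random_state=0),
                                X[:, top_k], y, cv=5).mean())

best_k = int(np.argmax(accs)) + 1  # cut-off at the global accuracy maximum
print(f"keep top {best_k} features, cv accuracy {max(accs):.3f}")
```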


2013, Vol 22 (04), pp. 1350024
Author(s): Bing Xue, Liam Cervante, Lin Shang, Will N. Browne, Mengjie Zhang

Feature selection is a multi-objective problem with the two main conflicting objectives of minimising the number of features and maximising the classification performance. However, most existing feature selection algorithms are single objective and do not appropriately reflect the actual need. There is a small number of multi-objective feature selection algorithms, which are wrapper based and accordingly are computationally expensive and less general than filter algorithms. Evolutionary computation techniques are particularly suitable for multi-objective optimisation because they use a population of candidate solutions and are able to find multiple non-dominated solutions in a single run. However, the two well-known evolutionary multi-objective algorithms, the non-dominated sorting genetic algorithm II (NSGAII) and the strength Pareto evolutionary algorithm 2 (SPEA2), have not been applied to filter based feature selection. In this work, based on NSGAII and SPEA2, we develop two multi-objective, filter based feature selection frameworks. Four multi-objective feature selection methods are then developed by applying mutual information and entropy as two different filter evaluation criteria in each of the two proposed frameworks. The proposed multi-objective algorithms are examined and compared with a single objective method and three traditional methods (two filters and one wrapper) on eight benchmark datasets. A decision tree is employed to test the classification performance. Experimental results show that the proposed multi-objective algorithms can automatically evolve a set of non-dominated solutions that include a smaller number of features and achieve better classification performance than using all features. NSGAII and SPEA2 outperform the single objective algorithm, the two traditional filter algorithms, and even the traditional wrapper algorithm in terms of both the number of features and the classification performance in most cases. NSGAII achieves similar performance to SPEA2 on the datasets with a small number of features and slightly better results when the number of features is large. This work represents the first study of NSGAII and SPEA2 for filter feature selection in classification problems, with both providing field-leading classification performance.
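A filter criterion of the kind these frameworks optimise can be sketched as relevance minus redundancy, both measured with mutual information. The exact combination below is an illustrative assumption rather than the paper's formulation, and the synthetic discrete data is made up for the demo.

```python
# Mutual-information filter fitness for a feature subset: mean relevance
# I(f; y) of the selected features minus their mean pairwise redundancy
# I(fi; fj). Assumes discrete-valued features.
import numpy as np
from itertools import combinations
from sklearn.metrics import mutual_info_score

def filter_fitness(X, y, subset):
    relevance = np.mean([mutual_info_score(X[:, f], y) for f in subset])
    pairs = list(combinations(subset, 2))
    redundancy = (np.mean([mutual_info_score(X[:, a], X[:, b]) for a, b in pairs])
                  if pairs else 0.0)
    return relevance - redundancy

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 6))   # six discrete features
y = X[:, 0] ^ (X[:, 1] > 1)             # class depends only on features 0 and 1
print(filter_fitness(X, y, [0, 1]), filter_fitness(X, y, [3, 4, 5]))
```

Because no classifier is trained, a multi-objective EC algorithm can evaluate such a criterion cheaply for every candidate subset in the population, which is what makes filter frameworks attractive here.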

