BINARY PSO AND ROUGH SET THEORY FOR FEATURE SELECTION: A MULTI-OBJECTIVE FILTER BASED APPROACH

Author(s):  
BING XUE ◽  
LIAM CERVANTE ◽  
LIN SHANG ◽  
WILL N. BROWNE ◽  
MENGJIE ZHANG

Feature selection is a multi-objective problem, where the two main objectives are to maximize the classification accuracy and minimize the number of features. However, most existing algorithms are single objective, wrapper approaches. In this work, we investigate the use of binary particle swarm optimization (BPSO) and probabilistic rough set (PRS) theory for multi-objective feature selection. We use PRS to propose a new measure for the number of features, based on which a new filter-based single objective algorithm (PSOPRSE) is developed. Then a new filter-based multi-objective algorithm (MORSE) is proposed, which aims to maximize a measure of the classification performance and minimize the new measure for the number of features. MORSE is examined and compared with PSOPRSE, two existing PSO-based single objective algorithms, two traditional methods, and the only existing BPSO and PRS-based multi-objective algorithm (MORSN). Experiments have been conducted on six commonly used discrete datasets with a relatively small number of features and six continuous datasets with a large number of features. The classification performance of the selected feature subsets is evaluated by three classification algorithms (decision trees, Naïve Bayes, and k-nearest neighbors). The results show that the proposed algorithms can automatically select a smaller number of features and achieve similar or better classification performance than using all features. PSOPRSE achieves better performance than the other two PSO-based single objective algorithms and the two traditional methods. MORSN and MORSE outperform all five single objective algorithms in terms of both the classification performance and the number of features. MORSE achieves better classification performance than MORSN. These filter algorithms generalize across the three different classification algorithms.
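The BPSO search that PSOPRSE and MORSE build on can be illustrated with the standard binary PSO update, where velocities evolve in continuous space and a sigmoid transfer function gives the probability of each feature bit being set. This is a minimal generic sketch, not the authors' implementation; the parameter values (`w`, `c1`, `c2`, the `vmax` clamp) are illustrative defaults.

```python
import math
import random

def bpso_step(position, velocity, pbest, gbest, w=0.7, c1=1.5, c2=1.5, vmax=6.0):
    """One binary PSO update. Velocities move in continuous space; a
    sigmoid transfer function maps each velocity to the probability that
    the corresponding feature bit is set to 1 (feature selected)."""
    new_x, new_v = [], []
    for x, v, pb, gb in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * random.random() * (pb - x) + c2 * random.random() * (gb - x)
        v = max(-vmax, min(vmax, v))        # clamp velocity to [-vmax, vmax]
        prob = 1.0 / (1.0 + math.exp(-v))   # sigmoid transfer function
        new_v.append(v)
        new_x.append(1 if random.random() < prob else 0)
    return new_x, new_v
```

In a full run this step would iterate over a swarm, with `pbest`/`gbest` updated from whichever fitness measure (here, the PRS-based one) evaluates each bit string.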

2021 ◽  
Author(s):  
Bing Xue

<p>Classification problems often have a large number of features, but not all of them are useful for classification. Irrelevant and redundant features may even reduce the classification accuracy. Feature selection is a process of selecting a subset of relevant features, which can decrease the dimensionality, shorten the running time, and/or improve the classification accuracy. There are two types of feature selection approaches, i.e. wrapper and filter approaches. Their main difference is that wrappers use a classification algorithm to evaluate the goodness of the features during the feature selection process while filters are independent of any classification algorithm. Feature selection is a difficult task because of feature interactions and the large search space. Existing feature selection methods suffer from different problems, such as stagnation in local optima and high computational cost. Evolutionary computation (EC) techniques are well-known global search algorithms. Particle swarm optimisation (PSO) is an EC technique that is computationally less expensive and can converge faster than other methods. PSO has been successfully applied to many areas, but its potential for feature selection has not been fully investigated.  The overall goal of this thesis is to investigate and improve the capability of PSO for feature selection to select a smaller number of features and achieve similar or better classification performance than using all features.  This thesis investigates the use of PSO for both wrapper and filter, and for both single objective and multi-objective feature selection, and also investigates the differences between wrappers and filters.  This thesis proposes a new PSO based wrapper, single objective feature selection approach by developing new initialisation and updating mechanisms. 
The results show that by considering the number of features in the initialisation and updating procedures, the new algorithm can improve the classification performance, reduce the number of features and decrease computational time.  This thesis develops the first PSO based wrapper multi-objective feature selection approach, which aims to maximise the classification accuracy and simultaneously minimise the number of features. The results show that the proposed multi-objective algorithm can obtain more and better feature subsets than single objective algorithms, and outperform other well-known EC based multi-objective feature selection algorithms.  This thesis develops a filter, single objective feature selection approach based on PSO and information theory. Two measures are proposed to evaluate the relevance of the selected features based on each pair of features and a group of features, respectively. The results show that PSO and information based algorithms can successfully address feature selection tasks. The group based method achieves higher classification accuracies, but the pair based method is faster and selects smaller feature subsets.  This thesis proposes the first PSO based multi-objective filter feature selection approach using information based measures. This work is also the first to use two other well-known multi-objective EC algorithms for filter feature selection, which are also used to compare the performance of the PSO based approach. The results show that the PSO based multi-objective filter approach can successfully address feature selection problems, outperform single objective filter algorithms and achieve better classification performance than other multi-objective algorithms.   This thesis investigates the difference between wrapper and filter approaches in terms of the classification performance and computational time, and also examines the generality of wrappers. 
The results show that wrappers generally achieve classification performance better than or similar to filters, but do not always need longer computational time than filters. The results also show that wrappers built with simple classification algorithms can generalise to other classification algorithms.</p>
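The wrapper/filter distinction the thesis keeps returning to can be made concrete with two toy fitness functions: a wrapper scores a feature subset by running a classifier on it, a filter scores it from the data alone. The function names, the weighted-sum form, and the `alpha` trade-off are illustrative assumptions, not the thesis's actual formulations.

```python
def wrapper_fitness(subset, X, y, classifier_accuracy, alpha=0.9):
    """Wrapper: evaluate a bit-string feature subset by training/testing a
    classifier. `classifier_accuracy` is any callable (X_subset, y) -> accuracy,
    so the fitness is tied to that specific classification algorithm."""
    if not any(subset):
        return 0.0
    cols = [i for i, bit in enumerate(subset) if bit]
    X_sub = [[row[i] for i in cols] for row in X]
    acc = classifier_accuracy(X_sub, y)
    # weighted sum: accuracy plus a reward for dropping features
    return alpha * acc + (1 - alpha) * (1 - len(cols) / len(subset))

def filter_fitness(subset, relevance):
    """Filter: evaluate using a precomputed, classifier-independent relevance
    score per feature (e.g. mutual information with the class label)."""
    cols = [i for i, bit in enumerate(subset) if bit]
    return sum(relevance[i] for i in cols) / len(cols) if cols else 0.0
```

The wrapper must retrain a classifier per evaluation (hence its cost and classifier-specificity), while the filter only reads precomputed statistics, which is why filters are generally cheaper and more general.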


2013 ◽  
Vol 22 (04) ◽  
pp. 1350024 ◽  
Author(s):  
BING XUE ◽  
LIAM CERVANTE ◽  
LIN SHANG ◽  
WILL N. BROWNE ◽  
MENGJIE ZHANG

Feature selection is a multi-objective problem with the two main conflicting objectives of minimising the number of features and maximising the classification performance. However, most existing feature selection algorithms are single objective and do not appropriately reflect the actual need. There are a small number of multi-objective feature selection algorithms, which are wrapper based and accordingly are computationally expensive and less general than filter algorithms. Evolutionary computation techniques are particularly suitable for multi-objective optimisation because they use a population of candidate solutions and are able to find multiple non-dominated solutions in a single run. However, the two well-known evolutionary multi-objective algorithms, non-dominated sorting genetic algorithm II (NSGAII) and strength Pareto evolutionary algorithm 2 (SPEA2), have not been applied to filter based feature selection. In this work, based on NSGAII and SPEA2, we develop two multi-objective, filter based feature selection frameworks. Four multi-objective feature selection methods are then developed by applying mutual information and entropy as two different filter evaluation criteria in each of the two proposed frameworks. The proposed multi-objective algorithms are examined and compared with a single objective method and three traditional methods (two filters and one wrapper) on eight benchmark datasets. A decision tree is employed to test the classification performance. Experimental results show that the proposed multi-objective algorithms can automatically evolve a set of non-dominated solutions that include a smaller number of features and achieve better classification performance than using all features. NSGAII and SPEA2 outperform the single objective algorithm, the two traditional filter algorithms and even the traditional wrapper algorithm in terms of both the number of features and the classification performance in most cases. 
NSGAII achieves similar performance to SPEA2 for the datasets that consist of a small number of features and slightly better results when the number of features is large. This work represents the first study on NSGAII and SPEA2 for filter feature selection in classification problems, with both providing field-leading classification performance.
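The sorting step at the heart of NSGAII can be sketched as fast non-dominated sorting: solutions are ranked into successive Pareto fronts by dominance counting. This is a generic minimisation version for illustration, not the authors' code.

```python
def dominates(a, b):
    """a dominates b when a is no worse in every objective and strictly
    better in at least one (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Return the Pareto fronts of `points` (tuples of objective values)
    as lists of indices, best front first."""
    n = len(points)
    dominated_by = [[] for _ in range(n)]  # i -> solutions that i dominates
    dom_count = [0] * n                    # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated_by[i].append(j)
            elif dominates(points[j], points[i]):
                dom_count[i] += 1
        if dom_count[i] == 0:
            fronts[0].append(i)            # non-dominated: first front
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated_by[i]:
                dom_count[j] -= 1          # peel off the current front
                if dom_count[j] == 0:
                    nxt.append(j)
        k += 1
        fronts.append(nxt)
    return fronts[:-1]                     # drop the trailing empty front
```

For feature selection the two objectives would be the number of features and a filter criterion such as the mutual information measure (negated, so both are minimised).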


2021 ◽  
Author(s):  
Bing Xue ◽  
Mengjie Zhang ◽  
William Browne

© 2015 Imperial College Press. Feature selection is an important data preprocessing step in machine learning and data mining, such as classification tasks. Research on feature selection has been extensively conducted for more than 50 years and different types of approaches have been proposed, which include wrapper approaches or filter approaches, and single objective approaches or multi-objective approaches. However, the advantages and disadvantages of such approaches have not been thoroughly investigated. This paper provides a comprehensive study on comparing different types of feature selection approaches, specifically including comparisons on the classification performance and computational time of wrappers and filters, generality of wrapper approaches, and comparisons on single objective and multi-objective approaches. Particle swarm optimization (PSO)-based approaches, which include different types of methods, are used as typical examples to conduct this research. A total of 10 different feature selection methods and over 7000 experiments are involved. The results show that filters are usually faster than wrappers, but wrappers using a simple classification algorithm can be faster than filters. Wrappers often achieve better classification performance than filters. Feature subsets obtained from wrappers can be general to other classification algorithms. Meanwhile, multi-objective approaches are generally better choices than single objective algorithms. The findings are not only useful for researchers to develop new approaches to addressing new challenges in feature selection, but also useful for real-world decision makers to choose a specific feature selection method according to their own requirements.


2019 ◽  
Vol 1 (2) ◽  
pp. 23-35
Author(s):  
Dwi Normawati ◽  
Dewi Pramudi Ismi

Coronary heart disease, a frequent cause of death, occurs when atherosclerosis blocks blood flow to the heart muscle in the coronary arteries. The reference method for diagnosing coronary heart disease is coronary angiography, but it is invasive, high risk, and expensive. The purpose of this study is to analyze the effect of applying k-Fold Cross Validation (CV) to rule-based feature selection for diagnosing coronary heart disease, using the Cleveland heart disease dataset. The study performed feature selection with a medical expert-based method (MFS) and a computer-based method, the Variable Precision Rough Set (VPRS), a development of Rough Set theory. Classification performance was evaluated with 10-Fold, 5-Fold, and 3-Fold CV. The number of attributes selected differs in each fold, for both the VPRS and MFS methods; the reported accuracies are the averages over the 10-Fold, 5-Fold, and 3-Fold runs. The highest accuracy was 76.34% for the VPRS method with k = 5, while the MFS accuracy was 71.281% with k = 3. The k-fold implementation was less effective for this case, because the data division remained structured, following the order of records in each fold, while the amount of testing data was too small and too homogeneous. This affects the accuracy because the testing rules are not thoroughly represented.
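The ordered-fold pitfall the study observes can be reproduced with a minimal k-fold splitter: without shuffling, contiguous folds on a class-sorted dataset give unrepresentative test sets. This is an illustrative sketch, not the study's evaluation code.

```python
import random

def k_fold_indices(n, k, shuffle=False, seed=0):
    """Split indices 0..n-1 into k folds. With shuffle=False the folds are
    contiguous blocks in record order -- on a dataset sorted by class this
    leaves some test folds dominated by a single class, the effect the
    study identifies as hurting accuracy."""
    idx = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(idx)   # break the record ordering
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread the remainder
        folds.append(idx[start:start + size])
        start += size
    return folds
```

With `shuffle=False` on, say, records sorted healthy-first, the first fold contains only healthy cases; shuffling (or stratifying by class) makes each fold representative.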


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 704
Author(s):  
Jiucheng Xu ◽  
Kanglin Qu ◽  
Meng Yuan ◽  
Jie Yang

Feature selection is one of the core topics in rough set theory and its applications. Since the reduction ability and classification performance of many feature selection algorithms based on rough set theory and its extensions are not ideal, this paper proposes a feature selection algorithm that combines the information-theory view and the algebraic view in the neighborhood decision system. First, the neighborhood relationship in the neighborhood rough set model is used to retain the classification information of continuous data, and several uncertainty measures of neighborhood information entropy are studied. Second, to fully reflect the decision ability and classification performance of the neighborhood system, neighborhood credibility and neighborhood coverage are defined and introduced into the neighborhood joint entropy. Third, a feature selection algorithm based on neighborhood joint entropy is designed, which overcomes the limitation that most feature selection algorithms consider only the information-theoretic definition or the algebraic definition. Finally, experiments and statistical analyses on nine data sets show that the algorithm can effectively select the optimal feature subset, and that the selection result can maintain or improve the classification performance of the data set.
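The neighborhood relation that lets rough set measures handle continuous data without discretisation can be sketched as follows: each sample's neighborhood is the set of samples within distance delta, and an entropy-style uncertainty measure is built from the neighborhood sizes. This is a generic illustration of the neighborhood-entropy idea, not the paper's joint-entropy measure with credibility and coverage.

```python
import math

def neighborhood(X, i, delta):
    """Indices of samples within Euclidean distance delta of sample i --
    the neighborhood relation that keeps continuous values intact."""
    xi = X[i]
    out = []
    for j, xj in enumerate(X):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))
        if dist <= delta:
            out.append(j)
    return out

def neighborhood_entropy(X, delta):
    """Average negative log of the relative neighborhood size: small, tight
    neighborhoods mean high discernibility and hence high entropy."""
    n = len(X)
    return -sum(math.log(len(neighborhood(X, i, delta)) / n) for i in range(n)) / n
```

The paper's algorithm would evaluate candidate feature subsets by such measures (extended with decision-class information) and greedily keep the features that most reduce uncertainty.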


Symmetry ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1470
Author(s):  
Guobao Zhao ◽  
Haiying Wang ◽  
Deli Jia ◽  
Quanbin Wang

Considering the crucial influence of feature selection on data classification accuracy, a grey wolf optimizer based on quantum computing and uncertain symmetry rough sets (QCGWORS) is proposed. QCGWORS applies the three theories in parallel to feature selection, each contributing its own advantage to the optimization: quantum computing balances global and local search when exploring feature sets; the grey wolf optimizer can effectively explore all possible feature subsets; and uncertain symmetry rough set theory can accurately evaluate the correlation of potential feature subsets. The QCGWORS algorithm minimizes the number of features while maximizing classification performance. In the experimental stage, k-nearest neighbors (KNN) and random forest (RF) classifiers guided the machine learning process of the proposed algorithm, and 13 datasets were used in the comparison experiments. Experimental results show that, compared with other feature selection methods, QCGWORS improved the classification accuracy on 12 datasets, with the best accuracy increased by 20.91%. In attribute reduction, every dataset benefited from a reduction to a minimal number of features.
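The grey wolf optimizer underlying QCGWORS can be sketched with the standard continuous GWO position update, where each wolf moves toward the mean of three positions estimated from the alpha, beta, and delta leaders. This is the generic GWO step only, not QCGWORS itself, which additionally layers on quantum encoding and the rough-set subset evaluation.

```python
import random

def gwo_step(wolves, alpha, beta, delta, a):
    """One grey wolf optimizer update. Each wolf's new position is the mean
    of three estimates derived from the alpha, beta and delta leaders; the
    coefficient `a` decreases linearly from 2 to 0 over a run, shifting the
    pack from exploration to exploitation."""
    new = []
    for w in wolves:
        pos = []
        for d in range(len(w)):
            est = []
            for leader in (alpha, beta, delta):
                r1, r2 = random.random(), random.random()
                A = 2 * a * r1 - a          # exploration/exploitation coefficient
                C = 2 * r2                  # random leader emphasis
                D = abs(C * leader[d] - w[d])
                est.append(leader[d] - A * D)
            pos.append(sum(est) / 3)        # average of the three estimates
        new.append(pos)
    return new
```

As `a` reaches 0 the update collapses onto the mean of the three leaders, which is how the pack converges; a binary feature-selection variant would additionally map each dimension through a transfer function to a 0/1 feature bit.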

