Feature Selection with Conditional Mutual Information Considering Feature Interaction

Building classification models from real-world datasets became a difficult task, especially in datasets with high dimensional features. Unfortunately, these datasets may include irrelevant or redundant features which have a negative effect on the classification performance. Selecting the significant features and eliminating undesirable features can improve the classification models. Fuzzy mutual information is widely used feature selection to find the best feature subset before classification process. However, it requires more computation and storage space. To overcome these limitations, this paper proposes an improved fuzzy mutual information feature selection based on representative samples. Based on benchmark datasets, the experiments show that the proposed method achieved better results in the terms of classification accuracy, selected feature subset size, storage, and stability.

Download Full-text

A NOVEL FEATURE SELECTION ALGORITHM WITH SUPERVISED MUTUAL INFORMATION FOR CLASSIFICATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213013500279 ◽

2013 ◽

Vol 22 (04) ◽

pp. 1350027

Author(s):

JAGANATHAN PALANICHAMY ◽

KUPPUCHAMY RAMASAMY

Keyword(s):

Machine Learning ◽

Data Mining ◽

Feature Selection ◽

Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Class A ◽

Selection Algorithms ◽

The Relationship ◽

Class Variable

Feature selection is essential in data mining and pattern recognition, especially for database classification. During past years, several feature selection algorithms have been proposed to measure the relevance of various features to each class. A suitable feature selection algorithm normally maximizes the relevancy and minimizes the redundancy of the selected features. The mutual information measure can successfully estimate the dependency of features on the entire sampling space, but it cannot exactly represent the redundancies among features. In this paper, a novel feature selection algorithm is proposed based on maximum relevance and minimum redundancy criterion. The mutual information is used to measure the relevancy of each feature with class variable and calculate the redundancy by utilizing the relationship between candidate features, selected features and class variables. The effectiveness is tested with ten benchmarked datasets available in UCI Machine Learning Repository. The experimental results show better performance when compared with some existing algorithms.

Download Full-text

WJMI: A New Feature Selection Algorithm Based on Weighted Joint Mutual Information

Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics ◽

10.2991/icmii-15.2015.108 ◽

2015 ◽

Cited By ~ 1

Author(s):

Xiuli Qi ◽

Chengxiang Yin ◽

Kai Cheng ◽

Xianglin Liao ◽

Xingdang Kang

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

New Feature

Download Full-text

A redundancy-removing feature selection algorithm for nominal data

PeerJ Computer Science ◽

10.7717/peerj-cs.24 ◽

2015 ◽

Vol 1 ◽

pp. e24 ◽

Cited By ~ 1

Author(s):

Zhihua Li ◽

Wenqu Gu

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Feature Selection Method ◽

Selection Method ◽

Selection Algorithm ◽

Nominal Data ◽

New Information ◽

New Feature ◽

High Dimensional Datasets ◽

Experimental Comparisons

No order correlation or similarity metric exists in nominal data, and there will always be more redundancy in a nominal dataset, which means that an efficient mutual information-based nominal-data feature selection method is relatively difficult to find. In this paper, a nominal-data feature selection method based on mutual information without data transformation, called the redundancy-removing more relevance less redundancy algorithm, is proposed. By forming several new information-related definitions and the corresponding computational methods, the proposed method can compute the information-related amount of nominal data directly. Furthermore, by creating a new evaluation function that considers both the relevance and the redundancy globally, the new feature selection method can evaluate the importance of each nominal-data feature. Although the presented feature selection method takes commonly used MIFS-like forms, it is capable of handling high-dimensional datasets without expensive computations. We perform extensive experimental comparisons of the proposed algorithm and other methods using three benchmarking nominal datasets with two different classifiers. The experimental results demonstrate the average advantage of the presented algorithm over the well-known NMIFS algorithm in terms of the feature selection and classification accuracy, which indicates that the proposed method has a promising performance.

Download Full-text

An improved feature selection algorithm with conditional mutual information for classification problems

2013 International Conference on Human Computer Interactions (ICHCI) ◽

10.1109/ichci-ieee.2013.6887802 ◽

2013 ◽

Cited By ~ 1

Author(s):

Jaganathan Palanichamy ◽

Kuppuchamy Ramasamy

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Conditional Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Classification Problems

Download Full-text

An Improved Feature Selection Algorithm Based on Parzen Window and Conditional Mutual Information

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.2614 ◽

2013 ◽

Vol 347-350 ◽

pp. 2614-2619

Author(s):

Deng Chao He ◽

Wen Ning Hao ◽

Gang Chen ◽

Da Wei Jin

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Probability Density Functions ◽

Density Functions ◽

Continuous Variables ◽

Conditional Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Parzen Window ◽

Selection For

In this paper, an improved feature selection algorithm by conditional mutual information with Parzen window was proposed, which adopted conditional mutual information as an evaluation criterion of feature selection in order to overcome the deficiency of feature redundant and used Parzen window to estimate the probability density functions and calculate the conditional mutual information of continuous variables, in such a way as to achieve feature selection for continuous data.

Download Full-text

MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR FILTER BASED FEATURE SELECTION IN CLASSIFICATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213013500243 ◽

2013 ◽

Vol 22 (04) ◽

pp. 1350024 ◽

Cited By ~ 24

Author(s):

BING XUE ◽

LIAM CERVANTE ◽

LIN SHANG ◽

WILL N. BROWNE ◽

MENGJIE ZHANG

Keyword(s):

Feature Selection ◽

Evaluation Criteria ◽

Classification Performance ◽

Classification Problems ◽

Multi Objective ◽

Conflicting Objectives ◽

Benchmark Datasets ◽

Selection Algorithms ◽

Filter Algorithms ◽

Single Objective

Feature selection is a multi-objective problem with the two main conflicting objectives of minimising the number of features and maximising the classification performance. However, most existing feature selection algorithms are single objective and do not appropriately reflect the actual need. There are a small number of multi-objective feature selection algorithms, which are wrapper based and accordingly are computationally expensive and less general than filter algorithms. Evolutionary computation techniques are particularly suitable for multi-objective optimisation because they use a population of candidate solutions and are able to find multiple non-dominated solutions in a single run. However, the two well-known evolutionary multi-objective algorithms, non-dominated sorting based multi-objective genetic algorithm II (NSGAII) and strength Pareto evolutionary algorithm 2 (SPEA2) have not been applied to filter based feature selection. In this work, based on NSGAII and SPEA2, we develop two multi-objective, filter based feature selection frameworks. Four multi-objective feature selection methods are then developed by applying mutual information and entropy as two different filter evaluation criteria in each of the two proposed frameworks. The proposed multi-objective algorithms are examined and compared with a single objective method and three traditional methods (two filters and one wrapper) on eight benchmark datasets. A decision tree is employed to test the classification performance. Experimental results show that the proposed multi-objective algorithms can automatically evolve a set of non-dominated solutions that include a smaller number of features and achieve better classification performance than using all features. NSGAII and SPEA2 outperform the single objective algorithm, the two traditional filter algorithms and even the traditional wrapper algorithm in terms of both the number of features and the classification performance in most cases. NSGAII achieves similar performance to SPEA2 for the datasets that consist of a small number of features and slightly better results when the number of features is large. This work represents the first study on NSGAII and SPEA2 for filter feature selection in classification problems with both providing field leading classification performance.

Download Full-text

A redundancy-removing feature selection algorithm for nominal data

10.7287/peerj.preprints.1184v1 ◽

2015 ◽

Cited By ~ 1

Author(s):

Zhihua Li

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Feature Selection Method ◽

Selection Method ◽

Selection Algorithm ◽

Nominal Data ◽

New Information ◽

New Feature ◽

High Dimensional Datasets ◽

Experimental Comparisons

No order correlation or similarity metric exists in nominal data, and there will always be more redundancy in a nominal dataset, which means that an efficient mutual information-based nominal-data feature selection method is relatively difficult to find. In this paper, a nominal-data feature selection method based on mutual information without data transformation, called the redundancy-removing more relevance less redundancy algorithm, is proposed. By forming several new information-related definitions and the corresponding computational methods, the proposed method can compute the information-related amount of nominal data directly. Furthermore, by creating a new evaluation function that considers both the relevance and the redundancy globally, the new feature selection method can evaluate the importance of each nominal-data feature. Although the presented feature selection method takes commonly used MIFS-like forms, it is capable of handling high-dimensional datasets without expensive computations. We perform extensive experimental comparisons of the proposed algorithm and other methods using three benchmarking nominal datasets with two different classifiers. The experimental results demonstrate the average advantage of the presented algorithm over the well-known NMIFS algorithm in terms of the feature selection and classification accuracy, which indicates that the proposed method has a promising performance.

Download Full-text

Particle swarm optimisation for feature selection: A hybrid filter-wrapper approach

10.26686/wgtn.14273612.v1 ◽

2021 ◽

Author(s):

T Butler-Yeoman ◽

Bing Xue ◽

Mengjie Zhang

Keyword(s):

Feature Selection ◽

Classification Accuracy ◽

Computational Cost ◽

Particle Swarm ◽

Classification Performance ◽

Particle Swarm Optimisation ◽

Computational Time ◽

Advantages And Disadvantages ◽

Benchmark Datasets ◽

Selection Algorithms

© 2015 IEEE. Feature selection is an important pre-processing step, which can reduce the dimensionality of a dataset and increase the accuracy and efficiency of a learning/classification algorithm. However, existing feature selection algorithms mainly wrappers and filters have their own advantages and disadvantages. This paper proposes two filter-wrapper hybrid feature selection algorithms based on particle swarm optimisation (PSO), where the first algorithm named FastPSO combined filter and wrapper into the search process of PSO for feature selection with most of the evaluations as filters and a small number of evaluations as wrappers. The second algorithm named RapidPSO further reduced the number of wrapper evaluations. Theoretical analysis on FastPSO and RapidPSO is conducted to investigate their complexity. FastPSO and RapidPSO are compared with a pure wrapper algorithm named WrapperPSO and a pure filter algorithm named FilterPSO on nine benchmark datasets of varying difficulty. The experimental results show that both FastPSO and RapidPSO can successfully reduce the number of features and simultaneously increase the classification performance over using all features. The two proposed algorithms maintain the high classification performance achieved by WrapperPSO and significantly reduce the computational time, although the number of features is larger. At the same time, they increase the classification accuracy of FilterPSO and reduce the number of features, but increased the computational cost. FastPSO outperformed RapidPSO in terms of the classification accuracy and the number of features, but increased the computational time, which shows the trade-off between the efficiency and effectiveness. © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Download Full-text

A Feature Selection Algorithm Integrating Maximum Classification Information and Minimum Interaction Feature Dependency Information

Computational Intelligence and Neuroscience ◽

10.1155/2021/3569632 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Li Zhang

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Information Gain ◽

Small Sample ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Class Labels ◽

Minimum Interaction ◽

Classification Information ◽

Selection Algorithms

Feature selection is the key step in the analysis of high-dimensional small sample data. The core of feature selection is to analyse and quantify the correlation between features and class labels and the redundancy between features. However, most of the existing feature selection algorithms only consider the classification contribution of individual features and ignore the influence of interfeature redundancy and correlation. Therefore, this paper proposes a feature selection algorithm for nonlinear dynamic conditional relevance (NDCRFS) through the study and analysis of the existing feature selection algorithm ideas and method. Firstly, redundancy and relevance between features and between features and class labels are discriminated by mutual information, conditional mutual information, and interactive mutual information. Secondly, the selected features and candidate features are dynamically weighted utilizing information gain factors. Finally, to evaluate the performance of this feature selection algorithm, NDCRFS was validated against 6 other feature selection algorithms on three classifiers, using 12 different data sets, for variability and classification metrics between the different algorithms. The experimental results show that the NDCRFS method can improve the quality of the feature subsets and obtain better classification results.

Download Full-text