An Efficient, Parallelized Algorithm for Optimal Conditional Entropy-Based Feature Selection

Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 492
Author(s):  
Gustavo Estrela ◽  
Marco Dimas Gubitoso ◽  
Carlos Eduardo Ferreira ◽  
Junior Barrera ◽  
Marcelo S. Reis

In Machine Learning, feature selection is an important step in classifier design. It consists of finding a subset of features that is optimal for a given cost function. One way to solve feature selection is to organize all possible feature subsets into a Boolean lattice and to exploit the fact that the costs of chains in that lattice describe U-shaped curves. Minimization of such a cost function is known as the U-curve problem. Recently, a study proposed U-Curve Search (UCS), an optimal algorithm for that problem, which was successfully used for feature selection. However, despite the algorithm's optimality, the time required by UCS in computational assays was exponential in the number of features. Here, we report that this scalability issue arises from the fact that the U-curve problem is NP-hard. We then introduce the Parallel U-Curve Search (PUCS), a new algorithm for the U-curve problem. In PUCS, we present a novel way to partition the search space into smaller Boolean lattices, thus rendering the algorithm highly parallelizable. We also provide computational assays with both synthetic data and Machine Learning datasets, in which the performance of PUCS was assessed against UCS and other gold-standard algorithms in feature selection.
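The Boolean-lattice formulation can be made concrete with a small sketch. The code below is not UCS or PUCS: it simply enumerates all 2^n feature subsets (the full lattice) and scores each with a penalized empirical conditional entropy H(Y | X_S). The dataset, cost function, and penalty weight are assumptions for demonstration only.

```python
# Minimal illustration of searching the Boolean lattice of feature subsets
# with a conditional-entropy-based cost. Exhaustive, so only viable for
# small n; UCS/PUCS exist precisely to avoid this brute force.
from collections import Counter
from itertools import chain, combinations
from math import log2

def conditional_entropy(rows, labels, subset):
    """Empirical H(Y | X_S) for the feature subset `subset`."""
    joint = Counter()        # counts of (x_S, y)
    marginal = Counter()     # counts of x_S
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        joint[(key, y)] += 1
        marginal[key] += 1
    n = len(rows)
    return -sum((c / n) * log2(c / marginal[key])
                for (key, _), c in joint.items())

def best_subset(rows, labels, n_features, penalty=0.05):
    """Exhaustive minimization over the Boolean lattice (2^n subsets)."""
    all_subsets = chain.from_iterable(
        combinations(range(n_features), k) for k in range(n_features + 1))
    # Cost = conditional entropy + a size penalty; along chains of the
    # lattice this kind of cost tends to trace the U-shaped curves the
    # paper's algorithms exploit.
    return min(all_subsets,
               key=lambda s: conditional_entropy(rows, labels, s)
                             + penalty * len(s))

# Tiny synthetic example: the label depends on features 0 and 2 only.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
labels = [r[0] ^ r[2] for r in rows]
print(best_subset(rows, labels, 3))   # expected: (0, 2)
```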

2021 ◽  
Author(s):  
Bing Xue ◽  
Mengjie Zhang ◽  
William Browne ◽  
Xin Yao

Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task, due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on evolutionary computation for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.

Index Terms: Evolutionary computation, feature selection, classification, data mining, machine learning.
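As a concrete instance of the kind of evolutionary-computation wrapper the survey covers, the sketch below runs a minimal genetic algorithm over feature bitmasks, with cross-validated k-NN accuracy as fitness. All choices here (population size, mutation rate, classifier, dataset) are illustrative assumptions, not settings from the survey.

```python
# Minimal GA feature-selection sketch: bitmask chromosomes, tournament
# selection, one-point crossover, bit-flip mutation, elitism.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)
pop_size, n_gen, mut_rate = 20, 25, 0.05
n_feats = X.shape[1]

def fitness(mask):
    """Wrapper fitness: 3-fold CV accuracy of k-NN on the selected features."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

pop = rng.random((pop_size, n_feats)) < 0.5        # random initial bitmasks
for _ in range(n_gen):
    fit = np.array([fitness(ind) for ind in pop])
    new_pop = [pop[fit.argmax()].copy()]           # elitism: keep the best
    while len(new_pop) < pop_size:
        # Tournament selection of two parents (tournament size 2).
        a, b = (pop[max(rng.integers(0, pop_size, 2), key=lambda i: fit[i])]
                for _ in range(2))
        cut = rng.integers(1, n_feats)             # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        child ^= rng.random(n_feats) < mut_rate    # bit-flip mutation
        new_pop.append(child)
    pop = np.array(new_pop)

best = pop[np.argmax([fitness(ind) for ind in pop])]
print(f"selected {int(best.sum())}/{n_feats} features, "
      f"CV accuracy = {fitness(best):.3f}")
```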


2021 ◽  
Vol 5 (3) ◽  
pp. 36
Author(s):  
Gabriella Kicska ◽  
Attila Kiss

Nowadays, the high dimensionality of data causes a variety of problems in machine learning. It is necessary to reduce the number of features by selecting only the most relevant ones; approaches to this task are collectively called Feature Selection. In this paper, we propose a Feature Selection method that uses Swarm Intelligence techniques. Swarm Intelligence algorithms perform optimization by searching for optimal points in the search space. We show the usability of these techniques for solving Feature Selection and compare the performance of five major swarm algorithms: Particle Swarm Optimization, Artificial Bee Colony, Invasive Weed Optimization, Bat Algorithm, and Grey Wolf Optimizer. The accuracy of a decision tree classifier was used to evaluate the algorithms. It turned out that the dimensionality of the data can be roughly halved without loss of accuracy; moreover, accuracy increased when redundant features were discarded. Based on our experiments, GWO performed best: it achieved the highest ranking on different datasets, and its average number of iterations to find the best solution was 30.8. ABC obtained the lowest ranking on high-dimensional datasets.
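A minimal sketch of this wrapper setup follows, using binary Particle Swarm Optimization with decision-tree accuracy as the fitness function, in the spirit of the comparison above. The sigmoid transfer function, swarm size, and coefficients are assumptions for illustration, not the authors' configuration.

```python
# Binary PSO feature selection: particles are 0/1 vectors over features,
# velocities are mapped to bit probabilities by a sigmoid transfer function.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_particles, n_features, n_iters = 20, X.shape[1], 30
w, c1, c2 = 0.7, 1.5, 1.5            # inertia and acceleration coefficients

def fitness(bits):
    """3-fold CV accuracy of a decision tree on the selected features."""
    mask = bits > 0.5
    if not mask.any():
        return 0.0
    tree = DecisionTreeClassifier(random_state=0)
    return cross_val_score(tree, X[:, mask], y, cv=3).mean()

pos = (rng.random((n_particles, n_features)) < 0.5).astype(float)
vel = rng.normal(0.0, 1.0, (n_particles, n_features))
pbest = pos.copy()
pbest_fit = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    # The sigmoid turns each velocity into the probability of setting a bit.
    pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"selected {int(gbest.sum())}/{n_features} features, "
      f"CV accuracy = {pbest_fit.max():.3f}")
```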


2021 ◽  
Vol 15 (8) ◽  
pp. 912-926
Author(s):  
Ge Zhang ◽  
Pan Yu ◽  
Jianlin Wang ◽  
Chaokun Yan

Background: There have been rapid developments in various bioinformatics technologies, which have led to the accumulation of large amounts of biomedical data. However, these datasets usually involve thousands of features and include much irrelevant or redundant information, which leads to confusion during diagnosis. Feature selection is a solution that consists of finding the optimal subset, which is known to be an NP-hard problem because of the large search space. Objective: To address this issue, this paper proposes a hybrid feature selection method, called IGICRO, which combines an improved chemical reaction optimization algorithm (ICRO) with an information gain (IG) approach. Methods: IG is adopted to screen out important features. A neighborhood search mechanism is combined with ICRO to increase the diversity of the population and improve the capacity of local search. Results: Experimental results on eight publicly available datasets demonstrate that our proposed approach outperforms the original CRO and other state-of-the-art approaches.
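The IG filter stage can be sketched as follows; the ICRO search itself is the paper's contribution and is not reproduced here. The information gain of a feature with respect to the class equals the mutual information I(X; Y) = H(Y) - H(Y | X); sklearn's mutual_info_classif serves as a stand-in estimator, and the top-20% cutoff is an assumption.

```python
# Filter stage of a hybrid pipeline: rank features by (estimated)
# information gain and keep the top fraction to shrink the search space
# before a metaheuristic search refines the subset.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)
gain = mutual_info_classif(X, y, random_state=0)   # one score per feature

k = max(1, int(0.2 * X.shape[1]))                  # keep the top 20%
top = np.argsort(gain)[::-1][:k]
X_filtered = X[:, top]
print(f"kept features {sorted(top.tolist())} -> shape {X_filtered.shape}")
# X_filtered would then seed the metaheuristic search stage, which refines
# the subset with neighborhood-search moves for better local exploration.
```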


2021 ◽  
Vol 15 (4) ◽  
pp. 1-46
Author(s):  
Kui Yu ◽  
Lin Liu ◽  
Jiuyong Li

In this article, we aim to develop a unified view of causal and non-causal feature selection methods. The unified view fills a gap in research on the relation between the two types of methods. Based on the Bayesian network framework and information theory, we first show that causal and non-causal feature selection methods share the same objective: to find the Markov blanket of a class attribute, the theoretically optimal feature set for classification. We then examine the assumptions made by causal and non-causal feature selection methods when searching for the optimal feature set, and unify the assumptions by mapping them to restrictions on the structure of the Bayesian network model of the studied problem. We further analyze in detail how these structural assumptions lead to the different levels of approximation employed by the methods in their search, which in turn yield approximations, with respect to the optimal feature set, in the feature sets the methods find. With the unified view, we can interpret the output of non-causal methods from a causal perspective and derive the error bounds of both types of methods. Finally, we present a practical understanding of the relation between causal and non-causal methods using extensive experiments with synthetic data and various types of real-world data.
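The shared objective, that conditioning on the Markov blanket of the class leaves no residual information in any other variable, can be illustrated on a toy Bayesian network. The network structure (grandparent → parent → Y → child ← spouse) and the plug-in conditional mutual information estimator below are assumptions for demonstration, not the article's experimental setup.

```python
# Toy check: MB(Y) = {parent, child, spouse}. The grandparent carries
# information about Y marginally, but none once we condition on the blanket.
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
n = 200_000

def flip(bits, p):
    """Copy `bits`, flipping each entry with probability p."""
    return bits ^ (rng.random(n) < p)

grand  = rng.integers(0, 2, n)        # grandparent of Y (outside the blanket)
parent = flip(grand, 0.1)             # parent of Y
y      = flip(parent, 0.1)            # class attribute
spouse = rng.integers(0, 2, n)        # co-parent of `child`
child  = flip(y ^ spouse, 0.1)        # child of Y
mb = parent * 4 + child * 2 + spouse  # encode the blanket as one variable

def cond_mi(a, b, cond):
    """I(A; B | C), estimated by averaging MI within each stratum of C."""
    return sum((cond == c).mean() * mutual_info_score(a[cond == c],
                                                      b[cond == c])
               for c in np.unique(cond))

print(f"I(Y; grand)      = {mutual_info_score(y, grand):.4f}")  # clearly > 0
print(f"I(Y; grand | MB) = {cond_mi(y, grand, mb):.4f}")        # approx. 0
```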


Mathematics ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1226
Author(s):  
Saeed Najafi-Zangeneh ◽  
Naser Shams-Gharneh ◽  
Ali Arjomandi-Nezhad ◽  
Sarfaraz Hashemkhani Zolfani

Companies constantly seek ways to retain their professional employees in order to reduce extra recruiting and training costs. Predicting whether a particular employee is likely to leave helps the company make preventive decisions. Unlike physical systems, human resource problems cannot be described by a scientific-analytical formula; therefore, machine learning approaches are the best tools for this aim. This paper presents a three-stage (pre-processing, processing, post-processing) framework for attrition prediction. An IBM HR dataset is chosen as the case study. Since there are numerous features in the dataset, the “max-out” feature selection method is proposed for dimension reduction in the pre-processing stage, and is implemented for the IBM HR dataset. The coefficient of each feature in the logistic regression model indicates the importance of that feature for attrition prediction. The results show an improvement in the F1-score performance measure due to the “max-out” feature selection method. Finally, the validity of the parameters is checked by training the model on multiple bootstrap datasets; the average and standard deviation of the parameters are then analyzed to assess the confidence and stability of the model’s parameters. A small standard deviation of the parameters indicates that the model is stable and is more likely to generalize well.
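The bootstrap stability check from the post-processing stage can be sketched as follows. The dataset below is a stand-in (the paper uses an IBM HR dataset), the resample count is an assumption, and the “max-out” selection itself is not reproduced.

```python
# Refit a logistic regression on bootstrap resamples and inspect the mean
# and standard deviation of each coefficient; small std relative to |mean|
# suggests a stable, trustworthy parameter.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)   # scaling makes coefficients comparable

n_boot = 200
coefs = np.empty((n_boot, X.shape[1]))
for b in range(n_boot):
    idx = rng.integers(0, len(y), len(y))          # bootstrap resample
    model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    coefs[b] = model.coef_[0]

mean, std = coefs.mean(axis=0), coefs.std(axis=0)
for i in np.argsort(-np.abs(mean))[:5]:            # five largest coefficients
    print(f"feature {i:2d}: coef = {mean[i]:+.3f} ± {std[i]:.3f}")
```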


2021 ◽  
Vol 11 (4) ◽  
pp. 1742
Author(s):  
Ignacio Rodríguez-Rodríguez ◽  
José-Víctor Rodríguez ◽  
Wai Lok Woo ◽  
Bo Wei ◽  
Domingo-Javier Pardo-Quiles

Type 1 diabetes mellitus (DM1) is a metabolic disease derived from falls in pancreatic insulin production, resulting in chronic hyperglycemia. DM1 subjects usually have to undertake a number of assessments of blood glucose levels every day, employing capillary glucometers to monitor blood glucose dynamics. In recent years, advances in technology have allowed for the creation of revolutionary biosensors and continuous glucose monitoring (CGM) techniques, enabling the monitoring of a subject’s blood glucose level in real time. On the other hand, few attempts have been made to apply machine learning techniques to predicting glycaemia levels, and dealing with a database containing such a large number of variables is problematic. In this sense, to the best of the authors’ knowledge, the issue of proper feature selection (FS), the stage before applying predictive algorithms, has not been subject to in-depth discussion and comparison in past research on forecasting glycaemia. Therefore, in order to assess how a proper FS stage could improve the accuracy of glycaemia forecasting, this work developed six FS techniques alongside four predictive algorithms and applied them to a full dataset of biomedical features related to glycaemia, harvested through a wide-ranging passive monitoring process involving 25 patients with DM1 in practical real-life scenarios. From the obtained results, we affirm that Random Forest (RF), as both predictive algorithm and FS strategy, offers the best average performance (Root Median Square Error, RMSE = 18.54 mg/dL) throughout the 12 considered predictive horizons (up to 60 min in steps of 5 min), while Support Vector Machines (SVM) achieved the best accuracy as a forecasting algorithm when considering, in turn, the average over the six FS techniques applied (RMSE = 20.58 mg/dL).
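A sketch of the shape of the winning pipeline (random forest for both FS and forecasting) over the paper's horizon grid follows. Real CGM data is not available here, so a synthetic AR(1) glucose series with lagged features stands in; the importance threshold, lag window, and model settings are assumptions, not the authors' configuration.

```python
# RF importances select the lag features; a second RF then forecasts glucose
# at each horizon (5 to 60 min in 5-min steps), reporting test RMSE in mg/dL.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic CGM-like series: AR(1) around 120 mg/dL, one reading per 5 min.
g = [120.0]
for _ in range(4000):
    g.append(120 + 0.95 * (g[-1] - 120) + rng.normal(0, 5))
g = np.array(g)

lags = 12                                 # the last hour of readings
X = np.column_stack([g[i:len(g) - lags + i + 1] for i in range(lags)])

# FS stage: keep lags whose one-step-ahead RF importance clears a threshold.
fs = RandomForestRegressor(n_estimators=100, random_state=0)
fs.fit(X[:-1], g[lags:])
selected = fs.feature_importances_ > 0.05

for horizon in range(5, 65, 5):           # predictive horizons in minutes
    shift = horizon // 5                  # samples per horizon (5-min grid)
    Xh = X[:len(g) - lags - shift + 1, selected]
    yh = g[lags - 1 + shift:]
    cut = int(0.8 * len(yh))              # temporal train/test split
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(Xh[:cut], yh[:cut])
    rmse = mean_squared_error(yh[cut:], rf.predict(Xh[cut:])) ** 0.5
    print(f"horizon {horizon:2d} min: RMSE = {rmse:5.2f} mg/dL")
```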

