An improved Random Forest based on Feature Selection and Feature weighting for case retrieval in CBR system Application to medical data

doi:10.4018/ijsi.293265

A hybrid feature selection algorithm combining ReliefF and Particle swarm optimization for high-dimensional medical data

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202948 ◽

2021 ◽

pp. 1-15

Author(s):

Zhaozhao Xu ◽

Derong Shen ◽

Yue Kou ◽

Tiezheng Nie

Keyword(s):

Feature Selection ◽

Particle Swarm Optimization ◽

Random Forest ◽

Classification Accuracy ◽

Particle Swarm ◽

Medical Data ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Swarm Optimization

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy on such data often falls short of expectations. Feature selection is a common way to address this problem: it reduces the dimensionality of high-dimensional data by selecting effective features. However, traditional feature selection algorithms set thresholds blindly, and their search algorithms are liable to fall into a local optimum. To address this, the paper proposes a hybrid feature selection algorithm combining ReliefF and Particle Swarm Optimization. The algorithm has three main parts. First, ReliefF is used to calculate feature weights, and the features are ranked by weight. Then the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, the Particle Swarm Optimization algorithm searches the ranked feature groups, and feature selection is performed according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features. More importantly, the algorithm selects the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has practical application value.
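The first two steps of the abstract (weighting and ranking features, then grouping the ranking) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simplified two-class Relief (one nearest hit and one nearest miss per sample) in place of full ReliefF, and it splits the ranking into equal-size groups as a stand-in for the paper's density equalization, whose definition the abstract does not give. The function names `relief_weights` and `rank_and_group` and the synthetic data are hypothetical, and the Particle Swarm Optimization search and its fitness function are omitted.

```python
import numpy as np


def relief_weights(X, y):
    """Simplified two-class Relief weights (assumption: not full ReliefF).

    Assumes every class has at least two samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # Reward features that differ on the miss, penalize those that differ on the hit.
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n


def rank_and_group(weights, n_groups):
    """Rank features by descending weight and split the ranking into groups.

    Equal-size splitting is an assumption standing in for density equalization.
    """
    order = np.argsort(weights)[::-1]
    return [g.tolist() for g in np.array_split(order, n_groups)]


# Synthetic data (hypothetical): only feature 0 carries class information.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 6))
X[:, 0] += 4.0 * y

weights = relief_weights(X, y)
groups = rank_and_group(weights, n_groups=3)
print("weights:", np.round(weights, 3))
print("ranked feature groups:", groups)
```

In the scheme the abstract describes, a search such as Particle Swarm Optimization would then operate on these ranked groups rather than on a fixed weight threshold.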


An Optimal Categorization of Feature Selection Methods for Knowledge Discovery

Data Mining ◽

10.4018/978-1-4666-2455-9.ch005 ◽

2013 ◽

pp. 92-106

Author(s):

Harleen Kaur ◽

Ritu Chauhan ◽

M. Alam

Keyword(s):

Data Mining ◽

Feature Selection ◽

Discriminant Analysis ◽

Medical Data ◽

Stepwise Discriminant Analysis ◽

Selection Methods ◽

Medical Databases ◽

Active Research ◽

Potential Improvement ◽

Large Effort

The continuous availability of massive experimental medical data has given impetus to a large effort in developing mathematical, statistical, and computationally intelligent techniques to infer models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input for data mining systems in the medical domain. In this chapter, the authors focus on several feature selection methods and their effectiveness in preprocessing input medical data. They evaluate several feature selection algorithms, such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC), with the machine learning algorithms naive Bayes and linear discriminant analysis. The experimental analysis of feature selection techniques in medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing processing time.


A Hybrid Model Based on EMD-Feature Selection and Random Forest Method for Medical Data Forecasting

International Journal of Academic Research in Accounting Finance and Management Sciences ◽

10.6007/ijarafms/v9-i4/6841 ◽

2020 ◽

Vol 9 (4) ◽

Author(s):

Duen-Huang Huang ◽

Chih-Hung Tsai ◽

Hao-En Chueh ◽

Liang-Ying Wei

Keyword(s):

Feature Selection ◽

Random Forest ◽

Hybrid Model ◽

Medical Data ◽

Model Based ◽

Random Forest Method


An Optimal Categorization of Feature Selection Methods for Knowledge Discovery

Visual Analytics and Interactive Technologies ◽

10.4018/978-1-60960-102-7.ch006 ◽

2011 ◽

pp. 94-108 ◽

Cited By ~ 4

Author(s):

Harleen Kaur ◽

Ritu Chauhan ◽

M. Alam

Keyword(s):

Data Mining ◽

Feature Selection ◽

Discriminant Analysis ◽

Medical Data ◽

Stepwise Discriminant Analysis ◽

Selection Methods ◽

Medical Databases ◽

Active Research ◽

Potential Improvement ◽

Large Effort

The continuous availability of massive experimental medical data has given impetus to a large effort in developing mathematical, statistical, and computationally intelligent techniques to infer models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input for data mining systems in the medical domain. In this chapter, the authors focus on several feature selection methods and their effectiveness in preprocessing input medical data. They evaluate several feature selection algorithms, such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC), with the machine learning algorithms naive Bayes and linear discriminant analysis. The experimental analysis of feature selection techniques in medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing processing time.


An Integrated Solution for Snoring Sound Classification Using Bhattacharyya Distance Based GMM Supervectors with SVM, Feature Selection with Random Forest and Spectrogram with CNN

10.21437/interspeech.2017-1794 ◽

2017 ◽

Cited By ~ 3

Author(s):

Tin Lay Nwe ◽

Huy Dat Tran ◽

Wen Zheng Terence Ng ◽

Bin Ma

Keyword(s):

Feature Selection ◽

Random Forest ◽

Bhattacharyya Distance ◽

Sound Classification


MetalExplorer, a Bioinformatics Tool for the Improved Prediction of Eight Types of Metal-Binding Sites Using a Random Forest Algorithm with Two-Step Feature Selection

Current Bioinformatics ◽

10.2174/2468422806666160618091522 ◽

2017 ◽

Vol 12 (6) ◽

Cited By ~ 6

Author(s):

Jiangning Song ◽

Chen Li ◽

Cheng Zheng ◽

Jerico Revote ◽

Ziding Zhang ◽

...

Keyword(s):

Feature Selection ◽

Random Forest ◽

Metal Binding ◽

Binding Sites ◽

Random Forest Algorithm ◽

Bioinformatics Tool ◽

Metal Binding Sites


Application of GA Feature Selection on Naive Bayes, Random Forest and SVM for Credit Card Fraud Detection

2020 International Conference on Decision Aid Sciences and Application (DASA) ◽

10.1109/dasa51403.2020.9317228 ◽

2020 ◽

Author(s):

Yakub K. Saheed ◽

Moshood A. Hambali ◽

Micheal O. Arowolo ◽

Yinusa A. Olasupo

Keyword(s):

Feature Selection ◽

Random Forest ◽

Credit Card ◽

Naive Bayes ◽

Fraud Detection ◽

Naïve Bayes ◽

Credit Card Fraud


R-HEFS: Rough set based Heterogeneous Ensemble Feature Selection Method for Medical data Classification

Artificial Intelligence in Medicine ◽

10.1016/j.artmed.2021.102049 ◽

2021 ◽

pp. 102049

Author(s):

Rubul Kumar Bania ◽

Anindya Halder

Keyword(s):

Feature Selection ◽

Rough Set ◽

Feature Selection Method ◽

Data Classification ◽

Selection Method ◽

Medical Data ◽

Medical Data Classification ◽

Heterogeneous Ensemble


Train delays prediction based on feature selection and random forest

2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) ◽

10.1109/itsc45102.2020.9294653 ◽

2020 ◽

Author(s):

Yuanyuan Ji ◽

Wei Zheng ◽

Hairong Dong ◽

Pengfei Gao

Keyword(s):

Feature Selection ◽

Random Forest


Research of Medical High-Dimensional Imbalanced Data Classification Ensemble Feature Selection Algorithm with Random Forest

2017 International Conference on Smart Grid and Electrical Automation (ICSGEA) ◽

10.1109/icsgea.2017.158 ◽

2017 ◽

Cited By ~ 2

Author(s):

Min Zhu ◽

Bo Su ◽

Gangmin Ning

Keyword(s):

Feature Selection ◽

Random Forest ◽

Imbalanced Data ◽

Data Classification ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Imbalanced Data Classification
