An efficient binary chimp optimization algorithm for feature selection in biomedical data classification

Feature Selection using FFS and PCA in Biomedical Data Classification with AdaBoost-SVM

International Journal of Intelligent Systems and Applications in Engineering ◽

10.18201/ijisae.2018637928 ◽

2018 ◽

Vol 1 (6) ◽

pp. 33-39

Author(s):

Rahime Ceylan

Keyword(s):

Feature Selection ◽

Data Classification ◽

Biomedical Data

Feature Selection for High-Dimensional and Imbalanced Biomedical Data Based on Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm

Genes ◽

10.3390/genes11070717 ◽

2020 ◽

Vol 11 (7) ◽

pp. 717

Author(s):

Garba Abdulrauf Sharifai ◽

Zurinahni Zainol

Keyword(s):

Feature Selection ◽

Optimization Algorithm ◽

Imbalanced Data ◽

High Dimensional ◽

Data Sets ◽

Biomedical Data ◽

Data Set ◽

Grasshopper Optimization Algorithm ◽

Imbalanced Class ◽

Grasshopper Optimization

Training a machine learning algorithm on an imbalanced data set is an inherently challenging task. It becomes more demanding when samples are limited but the number of features is massive (high dimensionality). High dimensional and imbalanced data sets pose severe challenges in many real-world applications, such as biomedical data sets. Numerous researchers have investigated either imbalanced class or high dimensional data sets and proposed various methods. Nonetheless, few approaches reported in the literature address the intersection of the high dimensional and imbalanced class problems, owing to their complicated interactions. Lately, feature selection has become a well-known technique for overcoming this problem by selecting discriminative features that represent both the minority and majority classes. This paper proposes a new method called Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm (rCBR-BGOA), which employs an ensemble of multi-filters coupled with the Correlation-Based Redundancy method to select optimal feature subsets. A binary Grasshopper Optimization Algorithm (BGOA) is used to formulate the feature selection process as an optimization problem and to select the best (near-optimal) combination of features from the majority and minority classes. The obtained results, supported by statistical analysis, indicate that rCBR-BGOA can improve classification performance on high dimensional and imbalanced data sets in terms of the G-mean and Area Under the Curve (AUC) performance metrics.
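The two metrics named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `g_mean` and `auc` and the toy labels are assumptions made for illustration. G-mean is computed as the square root of sensitivity times specificity, and AUC as the fraction of positive/negative pairs in which the positive sample receives the higher score (ties counted as one half).

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on the positive class) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced labels (illustrative only): 2 minority and 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_all_majority = [0] * 10                      # always predicts the majority class
y_mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]       # one minority hit, one false alarm
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1]

print(g_mean(y_true, y_all_majority))  # 0.0 despite 80% plain accuracy
print(g_mean(y_true, y_mixed))
print(auc(y_true, scores))
```

The all-majority predictor scores 80% plain accuracy on these labels yet has a G-mean of zero, which is why G-mean is a common choice for imbalanced problems.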

Feature selection algorithm for high dimensional biomedical data classification based on redundant removal

10.14236/ewic/hci2018.232 ◽

2018 ◽

Author(s):

Bingtao Zhang ◽

Peng Cao ◽

Yi Zhang ◽

Chaochao Zhang ◽

Zhe Li ◽

...

Keyword(s):

Feature Selection ◽

Data Classification ◽

High Dimensional ◽

Biomedical Data ◽

Selection Algorithm ◽

Feature Selection Algorithm

A multivariate feature selection framework for high dimensional biomedical data classification

2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) ◽

10.1109/cibcb.2017.8058528 ◽

2017 ◽

Cited By ~ 1

Author(s):

Abeer Alzubaidi ◽

Georgina Cosma

Keyword(s):

Feature Selection ◽

Data Classification ◽

High Dimensional ◽

Biomedical Data ◽

Selection Framework

A novel feature selection approach for biomedical data classification

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2009.07.008 ◽

2010 ◽

Vol 43 (1) ◽

pp. 15-23 ◽

Cited By ~ 116

Author(s):

Yonghong Peng ◽

Zhiqing Wu ◽

Jianmin Jiang

Keyword(s):

Feature Selection ◽

Data Classification ◽

Biomedical Data ◽

Selection Approach ◽

Feature Selection Approach

An Evolutionary Hybrid Feature Selection Approach for Biomedical Data Classification

2020 10th International Conference on Computer and Knowledge Engineering (ICCKE) ◽

10.1109/iccke50421.2020.9303648 ◽

2020 ◽

Author(s):

Fariba Moeini ◽

Seyed Jalaleddin Mousavirad

Keyword(s):

Feature Selection ◽

Data Classification ◽

Biomedical Data ◽

Selection Approach ◽

Feature Selection Approach

Population-Based Feature Selection for Biomedical Data Classification

Data Mining and Analysis in the Engineering Field - Advances in Data Mining and Database Management ◽

10.4018/978-1-4666-6086-1.ch016 ◽

2014 ◽

pp. 296-326 ◽

Cited By ~ 2

Author(s):

Seyed Jalaleddin Mousavirad ◽

Hossein Ebrahimpour-Komleh

Keyword(s):

Feature Selection ◽

Learning Algorithm ◽

Selection Process ◽

Data Classification ◽

Population Based ◽

Statistical Characteristics ◽

Biomedical Data ◽

Filter Methods ◽

Embedded Methods

Classification of biomedical data plays a significant role in the prediction and diagnosis of disease. The existence of redundant and irrelevant features is one of the major problems in biomedical data classification. Excluding these features can improve the performance of the classification algorithm. Feature selection is the problem of selecting a subset of features without reducing the accuracy obtained with the original set of features. Feature selection algorithms are divided into three categories: wrapper, filter, and embedded methods. Wrapper methods use the learning algorithm to select features, while filter methods use statistical characteristics of the data. In embedded methods, the feature selection process is combined with the learning process. Population-based metaheuristics can be applied for wrapper feature selection. In these algorithms, a population of candidate solutions is created and then improved with respect to the objective function by applying a set of operators. This chapter presents the application of population-based feature selection to deal with issues of high dimensionality in biomedical data classification. The results show that population-based feature selection achieves acceptable performance in biomedical data classification.
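The population-based idea described in this abstract (create a population of candidate solutions, then improve the objective with operators) can be sketched in Python as follows. This is a generic illustration, not the chapter's algorithm: `population_search`, `toy_fitness`, and the `INFORMATIVE` set are hypothetical names, the only operator is a single bit flip, and the fitness function is a stand-in for a real wrapper objective such as classifier accuracy.

```python
import random

def population_search(fitness, n_features, pop_size=10, generations=30, seed=0):
    """Generic population-based search over binary feature masks.

    Each generation keeps the better half of the population (elitism) and
    refills it with mutated copies of the survivors (one random bit flipped).
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: max(1, pop_size // 2)]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent = survivors[len(children) % len(survivors)]
            child = parent[:]                  # copy, then flip one random bit
            flip = rng.randrange(n_features)
            child[flip] = 1 - child[flip]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

# Hypothetical objective: features 0 and 3 are assumed to be informative,
# and every selected feature costs a small penalty.
INFORMATIVE = {0, 3}

def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)

best = population_search(toy_fitness, n_features=6)
print(best, toy_fitness(best))
```

In a wrapper setting, `toy_fitness` would be replaced by a function that trains and evaluates the learning algorithm on the features selected by the mask, which is the expensive part of such methods.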

Feature selection based on an improved cat swarm optimization algorithm for big data classification

The Journal of Supercomputing ◽

10.1007/s11227-016-1631-0 ◽

2016 ◽

Vol 72 (8) ◽

pp. 3210-3221 ◽

Cited By ~ 51

Author(s):

Kuan-Cheng Lin ◽

Kai-Yuan Zhang ◽

Yi-Hung Huang ◽

Jason C. Hung ◽

Neil Yen

Keyword(s):

Feature Selection ◽

Big Data ◽

Optimization Algorithm ◽

Data Classification ◽

Swarm Optimization ◽

Cat Swarm Optimization ◽

Big Data Classification

Automated Maintenance Data Classification Using Recurrent Neural Network: Enhancement by Spotted Hyena-Based Whale Optimization

Mathematics ◽

10.3390/math8112008 ◽

2020 ◽

Vol 8 (11) ◽

pp. 2008

Author(s):

Mustufa Haider Abidi ◽

Usama Umer ◽

Muneer Khan Mohammed ◽

Mohamed K. Aboudaif ◽

Hisham Alkhalefah

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Feature Selection ◽

Data Acquisition ◽

Recurrent Neural Network ◽

Optimization Algorithm ◽

Data Classification ◽

Whale Optimization Algorithm ◽

Spotted Hyena ◽

Whale Optimization

Data classification has been considered extensively in different fields, such as machine learning, artificial intelligence, pattern recognition, and data mining, and the expansion of classification has yielded immense achievements. The automatic classification of maintenance data has been investigated over the past few decades owing to its usefulness in construction and facility management. To utilize automated data classification in the maintenance field, a data classification model is implemented in this study based on the analysis of different mechanical maintenance data. The developed model involves four main steps: (a) data acquisition, (b) feature extraction, (c) feature selection, and (d) classification. During data acquisition, four types of datasets are collected from the benchmark Google datasets. The attributes of each dataset are further processed for classification. Principal component analysis and first-order and second-order statistical features are computed during the feature extraction process. Feature selection is then performed to reduce the dimensionality of the features for error-free classification. The hybridization of two algorithms, the Whale Optimization Algorithm (WOA) and Spotted Hyena Optimization (SHO), produces a new algorithm, the Spotted Hyena-based Whale Optimization Algorithm (SH-WOA), which is adopted for feature selection. The selected features are passed to a deep learning algorithm called a Recurrent Neural Network (RNN). To enhance the efficiency of conventional RNNs, the number of hidden neurons in the RNN is optimized using the developed SH-WOA. Finally, the efficacy of the proposed model is verified using the entire dataset. Experimental results show that the developed model can effectively handle uncertain data classification while reducing execution time and improving efficiency.
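The abstract does not say which first-order statistical features are computed. As a hedged illustration only, the Python sketch below assumes the common choice of mean, variance, skewness, and kurtosis of an attribute's values; the function name `first_order_features` is hypothetical, and population (not sample) moments are used.

```python
import math

def first_order_features(values):
    """Mean, variance, skewness and kurtosis of a sequence (population moments)."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant attribute has no spread; report zero shape statistics.
        skewness, kurtosis = 0.0, 0.0
    else:
        skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
        kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return {"mean": mean, "variance": variance,
            "skewness": skewness, "kurtosis": kurtosis}

features = first_order_features([1, 2, 3, 4, 5])
print(features)
```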

Using Penguins Search Optimization Algorithm for Best Features Selection for Biomedical Data Classification

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2017100103 ◽

2017 ◽

Vol 7 (4) ◽

pp. 51-62 ◽

Cited By ~ 1

Author(s):

Noria Bidi ◽

Zakaria Elberrichi

Keyword(s):

Feature Selection ◽

Optimization Algorithm ◽

Feature Selection Method ◽

Support Vector ◽

Feature Subset ◽

Biomedical Data ◽

Features Selection ◽

Cancer Data ◽

Search Optimization ◽

Fitness Value

Feature selection is essential for improving classification effectiveness. This paper presents a new adaptive algorithm called FS-PeSOA (feature selection penguins search optimization algorithm), a meta-heuristic feature selection method based on the “Penguins Search Optimization Algorithm” (PeSOA). It is combined with different classifiers to find the best feature subset, that is, the one that achieves the highest classification accuracy. To explore the feature subset candidates, the bio-inspired approach PeSOA generates a trial feature subset during the process and estimates its fitness value using three classifiers for each case: Naive Bayes (NB), Nearest Neighbors (KNN), and Support Vector Machines (SVMs). The proposed approach was evaluated on six well-known benchmark datasets (Wisconsin Breast Cancer, Pima Diabetes, Mammographic Mass, Dermatology, Colon Tumor, and Prostate Cancer). Experimental results indicate that FS-PeSOA achieves the highest classification accuracy across the different datasets.
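Estimating the fitness of a trial feature subset with a classifier can be sketched as follows. This is not the authors' implementation: it assumes a 1-nearest-neighbour classifier scored by leave-one-out accuracy as a simple stand-in for the KNN case, and the function name `loo_1nn_accuracy` and the toy data are hypothetical.

```python
def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier,
    using only the features whose bit is set in `mask`."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            dist = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[k]
        if best_label == y[i]:
            correct += 1
    return correct / len(X)

# Toy data (illustrative only): feature 0 separates the classes,
# feature 1 is large-scale noise unrelated to the label.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
     [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

print(loo_1nn_accuracy(X, y, [1, 0]))  # informative feature only
print(loo_1nn_accuracy(X, y, [0, 1]))  # noise feature only
print(loo_1nn_accuracy(X, y, [1, 1]))  # both features
```

On this toy data the noisy feature dominates the distance when both features are selected, so the smaller subset scores higher. A search procedure such as PeSOA would call a fitness function of this kind once per trial subset.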
