Nature-Inspired Meta-heuristic Optimization Algorithms for Breast Cancer Diagnostic Model: A Comparative Study

Feature selection obtains a subset of features by removing irrelevant features that have little or no predictive value. Meta-heuristic algorithms are appropriate for feature selection because the feature subset representation is direct and the evaluation is easily accomplished. This paper performed a comparative study on the impact of meta-heuristic optimization algorithms on breast cancer diagnosis using Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO). The two feature selection algorithms were used to obtain the relevant attributes from the Wisconsin breast cancer (original) dataset. The selected attributes were passed to seven learning algorithms: Support Vector Machine (SVM), Decision Tree (C4.5), Naïve Bayes (NB), K-Nearest Neighbor (KNN), Neural Network (NN), Logistic Regression (LR), and Random Forest (RF). The diagnostic model was evaluated based on accuracy, precision, recall, and F1-measure. Experimental results showed that the highest accuracy of 97.1388% was obtained with both PSO and ACO using the RF classifier; the highest precision of 0.9720 was recorded with ACO using RF; the highest recall of 0.9750 was achieved with PSO using RF; the highest F1-measure of 0.9700 was obtained with PSO using SVM; the highest kappa statistic of 0.9370 was obtained with both PSO and ACO using RF; and the lowest model-building time of 0 s was recorded with PSO using KNN and NB, and with ACO using KNN. The paper concluded that, for the breast cancer diagnostic model built with PSO and ACO and different learning algorithms, RF outperformed the other algorithms in accuracy. ACO produced better precision than PSO using RF, PSO gave better recall than ACO using RF, and PSO recorded the best F1-measure using SVM. The shortest model-building time was obtained with PSO for KNN and NB, and with ACO for KNN.

Keywords: Breast cancer, Data mining, Diagnosis, Feature selection, Meta-heuristic.
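
The evaluation measures named in the abstract above (accuracy, precision, recall, F1-measure, and the kappa statistic) can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common diagnostic measures from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts, not taken from the paper.
m = binary_metrics(tp=45, fp=5, fn=5, tn=45)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Because kappa subtracts the agreement expected by chance, it is typically lower than accuracy for the same predictions, which is consistent with the abstract reporting 0.9370 kappa alongside 97.1388% accuracy.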


Hybrid Feature Selection Based Weighted Least Squares Twin Support Vector Machine Approach for Diagnosing Breast Cancer, Hepatitis, and Diabetes

Advances in Artificial Neural Systems, Vol 2015, pp. 1-10, 2015. DOI: 10.1155/2015/265637. Cited by ~19.

Author(s): Divya Tomar, Sonali Agarwal

Keyword(s): Breast Cancer, Support Vector Machine, Feature Selection, Least Squares, Weighted Least Squares, Geometric Mean, Twin Support Vector Machine, Support Vector, Diagnostic Model, Feature Subset

Large amounts of data must be analyzed in many fields, such as healthcare, business, industry, and agriculture. The need for feature selection (FS) techniques is therefore evident in many fields of science, especially computer science. Furthermore, an effective FS technique that is well suited to a particular learning algorithm is of great help to researchers. Hence, this paper proposes a hybrid feature selection (HFS) based efficient disease diagnostic model for Breast Cancer, Hepatitis, and Diabetes. HFS is an efficient method that combines the positive aspects of both Filter and Wrapper FS approaches. The proposed model adopts the weighted least squares twin support vector machine (WLSTSVM) as the classification approach, sequential forward selection (SFS) as the search strategy, and correlation feature selection (CFS) to evaluate the importance of each feature. This model not only selects a relevant feature subset but also deals efficiently with the data imbalance problem. The effectiveness of the HFS based WLSTSVM approach is examined on three well-known disease datasets taken from the UCI repository, using predictive accuracy, sensitivity, specificity, and geometric mean. The experiment confirms that the proposed HFS based WLSTSVM disease diagnostic model can produce positive outcomes.
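
To make the combination of a search strategy with a correlation-based evaluator concrete, here is a minimal sketch of sequential forward selection driven by the standard CFS merit, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. This is an illustration under stated assumptions, not the paper's method: it uses absolute Pearson correlation as the correlation measure, omits the WLSTSVM classifier entirely, and the names `pearson`, `cfs_merit`, `sfs_cfs`, and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, target, subset):
    """CFS merit of a feature subset: rewards relevance, penalizes redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def sfs_cfs(columns, target):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [j]), j) for j in remaining]
        # Prefer higher merit; break ties in favor of the lower column index.
        merit, j = max(scored, key=lambda t: (t[0], -t[1]))
        if merit <= best:
            break  # no candidate improves the subset, so stop
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best


# Toy data (hypothetical): two informative columns and one uninformative column.
target = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 1, 5, 6, 5],
    [1, 2, 2, 5, 5, 6],
    [3, 1, 2, 3, 1, 2],
]
selected, merit = sfs_cfs(columns, target)
print(selected, round(merit, 4))
```

On this toy data the search keeps the two informative columns and stops before adding the third, whose correlation with the class label is zero. In a hybrid scheme such as the one the abstract describes, a classifier would additionally be used to assess candidate subsets.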


BCD-WERT: a novel approach for breast cancer detection using whale optimization based efficient features and extremely randomized tree algorithm

PeerJ Computer Science, Vol 7, pp. e390, 2021. DOI: 10.7717/peerj-cs.390.

Author(s): Shafaq Abbas, Zunera Jalil, Abdul Rehman Javed, Iqra Batool, Mohammad Zubair Khan, ...

Keyword(s): Breast Cancer, Machine Learning, Support Vector Machine, Feature Selection, Learning Algorithms, Machine Learning Algorithms, Experimental Results, Support Vector, Novel Approach, Whale Optimization

Breast cancer is one of the leading causes of death in the current age. It often results in subpar living conditions for patients, who have to go through expensive and painful treatments to fight this cancer. One in eight women worldwide is affected by this disease, and almost half a million women die from it every year. Machine learning algorithms have proven to outperform all existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes, and k-Nearest Neighbor. BCD-WERT outperformed all of them with the highest accuracy of 99.30%, followed by SVM with 98.60% accuracy. Experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.


META-HEURISTIC OPTIMIZATION ALGORITHMS BASED FEATURE SELECTION FOR CLINICAL BREAST CANCER DIAGNOSIS

Journal of the Egyptian Mathematical Society, Vol 26 (2), pp. 321-336, 2018. DOI: 10.21608/jomes.2018.2673.1023.

Author(s): Ashraf Darwish, Gehad Sayed, Aboul Hassanien

Keyword(s): Breast Cancer, Feature Selection, Cancer Diagnosis, Optimization Algorithms, Breast Cancer Diagnosis, Heuristic Optimization, Clinical Breast, Selection For


SVM and KNN Based SGO Feature Selection Algorithm for Breast Cancer Diagnosis

International Journal of Recent Technology and Engineering - 2, Vol 8 (2S7), pp. 2237-2240, 2020. DOI: 10.35940/ijrte.d4428.038620.

Keyword(s): Breast Cancer, Machine Learning, Feature Selection, Learning Algorithms, Subset Selection, Machine Learning Algorithms, Feature Subset Selection, Feature Subset, Selection Algorithm, Feature Selection Algorithm

In diagnosis and prediction systems, algorithms working on datasets with a high number of dimensions tend to take more time than those working on datasets with fewer dimensions. Feature subset selection algorithms enhance the efficiency of Machine Learning algorithms in prediction problems by selecting a subset of the total features and thus pruning redundancy and noise. In this article, such a feature subset selection method is proposed and implemented to diagnose breast cancer using Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) algorithms. The feature selection algorithm is based on Social Group Optimization (SGO), an evolutionary algorithm. The proposed model achieves higher accuracy in diagnosing breast cancer than other feature selection-based Machine Learning algorithms.


Feature Selection with Fast Correlation-Based Filter for Breast Cancer Prediction and Classification Using Machine Learning Algorithms

2018 International Symposium on Advanced Electrical and Communication Technologies (ISAECT), 2018. DOI: 10.1109/isaect.2018.8618688.

Author(s): Youness Khourdifi, Mohamed Bahaj

Keyword(s): Breast Cancer, Machine Learning, Feature Selection, Learning Algorithms, Machine Learning Algorithms, Cancer Prediction


Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Journal of Water Process Engineering, Vol 41, pp. 102033, 2021. DOI: 10.1016/j.jwpe.2021.102033.

Author(s): Faramarz Bagherzadeh, Mohamad-Javad Mehrani, Milad Basirifard, Javad Roostaei

Keyword(s): Machine Learning, Feature Selection, Wastewater Treatment, Comparative Study, Total Nitrogen, Wastewater Treatment Plant, Learning Algorithms, Treatment Plant, Machine Learning Algorithms, Selection Methods


BREAST CANCER DETECTION USING RSFS-BASED FEATURE SELECTION ALGORITHMS IN THERMAL IMAGES

Biomedical Engineering Applications Basis and Communications, pp. 2150020, 2021. DOI: 10.4015/s1016237221500204.

Author(s): Nazila Darabi, Abdalhossein Rezai, Seyedeh Shahrbanoo Falahieh Hamidpour

Keyword(s): Breast Cancer, Feature Selection, Cancer Detection, Vital Role, Support Vector, Computer Aided Detection, K Nearest Neighbors, Cad System, Common Cancer, Selection Algorithms

Breast cancer is a common cancer in women. Accurate and early detection of breast cancer can play a vital role in treatment. This paper presents and evaluates a thermogram-based Computer-Aided Detection (CAD) system for the detection of breast cancer. In this CAD system, feature selection is performed with the Random Subset Feature Selection (RSFS) algorithm, and with hybrids of RSFS with the minimum Redundancy Maximum Relevance (mRMR) algorithm and with the Genetic Algorithm (GA). The Support Vector Machine (SVM) and k-Nearest Neighbors (kNN) algorithms are used as classifiers. The proposed CAD system is verified using MATLAB 2017 and a dataset composed of breast images from 78 patients. The implementation results show that, with RSFS for feature selection, the kNN and SVM classifiers achieve accuracies of 85.36% and 75% and sensitivities of 94.11% and 79.31%, respectively. With the hybrid GA and RSFS algorithm, kNN and SVM achieve accuracies of 83.87% and 69.56% and sensitivities of 96% and 81.81%, respectively. With the hybrid mRMR and RSFS algorithm, kNN and SVM achieve accuracies of 77.41% and 73.07% and sensitivities of 98% and 72.72%, respectively.


A Comparative Study on Machine Learning Algorithms for Predicting Breast Cancer Prognosis in Improving Clinical Trials

2020 International Conference on Computational Science and Computational Intelligence (CSCI), 2020. DOI: 10.1109/csci51800.2020.00152.

Author(s): Neetu Sangari, Yanzhen Qu

Keyword(s): Breast Cancer, Machine Learning, Clinical Trials, Comparative Study, Learning Algorithms, Machine Learning Algorithms, Breast Cancer Prognosis, Cancer Prognosis


Simultaneous Channel and Feature Selection of Fused EEG Features Based on Sparse Group Lasso

BioMed Research International, Vol 2015, pp. 1-13, 2015. DOI: 10.1155/2015/703768. Cited by ~11.

Author(s): Jin-Jia Wang, Fang Xue, Hui Li

Keyword(s): Feature Selection, Feature Selection Method, Group Lasso, High Dimensional, Test Accuracy, Gradient Descent Method, Feature Subset, Eeg Signals, Sparse Group Lasso, Selection Of

Feature extraction and classification of EEG signals are core parts of brain computer interfaces (BCIs). Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are obtained first; they include the power spectrum, time-domain statistics, AR model, and wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses a logistic regression model with a Sparse Group Lasso penalty function. The model is fitted on the training data, and the parameters are estimated by modified blockwise coordinate descent and coordinate gradient descent methods. The best parameters and feature subset are selected using 10-fold cross-validation. Finally, the test data are classified using the trained model. Compared with existing channel and feature selection methods, results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can simultaneously achieve channel and feature selection with a lower error rate. The test accuracy on the data from the international BCI Competition IV reached 84.72%.
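
For orientation, the general form of a logistic regression objective with a Sparse Group Lasso penalty can be written as below. This is the commonly used textbook form, not necessarily the exact formulation in the paper. The symbols are assumptions introduced here: labels y_i in {0, 1}, feature vectors x_i, G groups (for example, one group per EEG channel), p_g features in group g, and regularization weights lambda_1 and lambda_2.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}
  \Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
        - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; \lambda_1 \sum_{g=1}^{G} \sqrt{p_g}\,\bigl\lVert \beta^{(g)} \bigr\rVert_2
  \;+\; \lambda_2 \,\lVert \beta \rVert_1
```

The group term can set an entire block of coefficients to zero, which corresponds to discarding a channel, while the l1 term can zero individual coefficients inside a retained group. This is what allows channel selection and feature selection to happen simultaneously.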


A novel feature selection algorithm based on damping oscillation theory

PLoS ONE, Vol 16 (8), pp. e0255307, 2021. DOI: 10.1371/journal.pone.0255307.

Author(s): Fujun Wang, Xing Wang

Keyword(s): Feature Selection, Optimization Algorithm, Euclidean Distance, Oscillation Theory, Feature Subset Selection, Support Vector, Data Sets, Feature Subset, Selection Algorithm, Filter Model

Feature selection is an important task in big data analysis and information retrieval. It reduces the number of features by removing noisy and extraneous data. In this paper, a feature subset selection algorithm based on damping oscillation theory and a support vector machine classifier is proposed. This algorithm is called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed, which is used to measure the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm whose position update formula has been modified to achieve better results. Third, the filter model and the wrapper model are dynamically adjusted by the damping oscillation theory in order to find an optimal feature subset. MKMDIGWO therefore achieves both the efficiency of the filter model and the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets have demonstrated that the MKMDIGWO algorithm achieves higher classification accuracy than four other state-of-the-art algorithms. The maximum ACC value of the MKMDIGWO algorithm is at least 0.5% higher than other algorithms on 10 data sets.
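
The two quantities that the filter model is built on can be computed as in the sketch below. It shows only the Kendall rank correlation coefficient (the tau-a variant, which ignores ties in the normalization) and the Euclidean distance. How MKMDIGWO combines them into a relevance and redundancy score is not stated in the abstract and is not reproduced here; the function names and sample vectors are hypothetical.

```python
from math import sqrt


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical feature column and label vector.
feature = [1, 3, 2, 4]
label = [1, 2, 3, 4]
tau = kendall_tau(feature, label)
dist = euclidean_distance([0, 0], [3, 4])
print(round(tau, 4), dist)
```

A rank-based coefficient such as Kendall's depends only on the ordering of values, so it is insensitive to monotonic rescaling of a feature, whereas Euclidean distance depends on the actual magnitudes.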
