Nature-Inspired Meta-heuristic Optimization Algorithms for Breast Cancer Diagnostic Model: A Comparative Study

2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Tinuke O Oladele ◽  
Babatunde J Olorunsola ◽  
Taye O Aro ◽  
Hakeem B Akande ◽  
Oyenike A Olukiran

Feature selection obtains a subset of features by removing irrelevant features that have little or no predictive value. Meta-heuristic algorithms are well suited to feature selection because a feature subset can be represented directly and evaluated easily. This paper presents a comparative study of the impact of two meta-heuristic optimization algorithms, Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO), on breast cancer diagnosis. Both algorithms were used to select the relevant attributes from the Wisconsin breast cancer (original) dataset. The selected attributes were passed to seven learning algorithms: Support Vector Machine (SVM), Decision Tree (C4.5), Naïve Bayes (NB), K-Nearest Neighbor (KNN), Neural Network (NN), Logistic Regression (LR), and Random Forest (RF). The diagnostic model was evaluated on accuracy, precision, recall, and F1-measure. The experiments showed that the highest accuracy, 97.1388%, was obtained with both PSO and ACO using the RF classifier; the highest precision, 0.9720, with ACO using RF; the highest recall, 0.9750, with PSO using RF; the highest F1-measure, 0.9700, with PSO using SVM; and the highest kappa statistic, 0.9370, with both PSO and ACO using RF. The lowest model-building time, 0 s, was recorded with PSO using KNN and NB, and with ACO using KNN. The paper concluded that, across PSO and ACO with the different learning algorithms, RF achieved the best accuracy. ACO produced better precision than PSO with RF, PSO gave better recall than ACO with RF, and PSO recorded the best F1-measure with SVM. The best model-building time was obtained with PSO for KNN and NB, and with ACO for KNN.

Keywords: Breast cancer, Data mining, Diagnosis, Feature selection, Meta-heuristic.
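
The evaluation measures named in this abstract (accuracy, precision, recall, F1-measure, and the kappa statistic) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage note are hypothetical and are not taken from the paper.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. All counts used with this function are illustrative.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: agreement beyond what chance alone would produce.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_expected = p_yes + p_no
    kappa = ((accuracy - p_expected) / (1 - p_expected)
             if p_expected != 1 else 0.0)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }
```

For example, `binary_metrics(tp=40, fp=10, fn=5, tn=45)` uses made-up counts for 100 samples; the abstract does not say whether its reported values are per-class or averaged across classes, so this sketch covers only the single-class case.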

2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Divya Tomar ◽  
Sonali Agarwal

Many fields, such as healthcare, business, industry, and agriculture, require the analysis of large amounts of data. The need for feature selection (FS) techniques is therefore evident in many fields of science, especially computer science. An effective FS technique that is well suited to a particular learning algorithm is of great help to researchers. This paper proposes a hybrid feature selection (HFS) based efficient disease diagnostic model for Breast Cancer, Hepatitis, and Diabetes. HFS combines the positive aspects of the Filter and Wrapper FS approaches. The proposed model adopts the weighted least squares twin support vector machine (WLSTSVM) as the classifier, sequential forward selection (SFS) as the search strategy, and correlation feature selection (CFS) to evaluate the importance of each feature. The model not only selects a relevant feature subset but also deals efficiently with the data imbalance problem. The effectiveness of the HFS based WLSTSVM approach is examined on three well-known disease datasets from the UCI repository in terms of predictive accuracy, sensitivity, specificity, and geometric mean. The experiments confirm that the proposed HFS based WLSTSVM disease diagnostic model can produce positive outcomes.
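
The search strategy and the filter criterion named in this abstract can be illustrated with a short sketch: sequential forward selection that greedily adds the feature giving the largest gain in the CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. This is an illustration under assumptions, not the authors' implementation: absolute Pearson correlation is assumed as the correlation measure, the WLSTSVM classifier is not included, and the names `pearson`, `cfs_merit`, and `sfs_cfs` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns: Sequence[Sequence[float]],
              target: Sequence[float],
              subset: Sequence[int]) -> float:
    """CFS merit of a feature subset (assumes |Pearson| as the correlation)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def sfs_cfs(columns: Sequence[Sequence[float]],
            target: Sequence[float],
            tol: float = 1e-9) -> Tuple[List[int], float]:
    """Greedy forward search: add the best feature until merit stops improving."""
    selected: List[int] = []
    best = 0.0
    remaining = list(range(len(columns)))
    while remaining:
        score, j = max((cfs_merit(columns, target, selected + [j]), j)
                       for j in remaining)
        if score <= best + tol:  # no meaningful improvement: stop
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

Here `columns` holds one sequence per feature (column-major), which keeps the correlation calls simple. In a hybrid scheme such as the one the abstract describes, the subset returned by a search like this would then be passed to the classifier.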


2021 ◽  
Vol 7 ◽  
pp. e390
Author(s):  
Shafaq Abbas ◽  
Zunera Jalil ◽  
Abdul Rehman Javed ◽  
Iqra Batool ◽  
Mohammad Zubair Khan ◽  
...  

Breast cancer is one of the leading causes of death today. It often results in subpar living conditions for patients, who have to go through expensive and painful treatments to fight it. One in eight women worldwide is affected by this disease, and almost half a million women die from it every year. Machine learning algorithms have proven to outperform all existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes, and k-Nearest Neighbor. BCD-WERT outperformed all of them with the highest accuracy of 99.30%, followed by SVM at 98.60%. The experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.


2020 ◽  
Vol 8 (2S7) ◽  
pp. 2237-2240

In diagnosis and prediction systems, algorithms working on datasets with a high number of dimensions tend to take more time than those working on datasets with fewer dimensions. Feature subset selection algorithms enhance the efficiency of Machine Learning algorithms in prediction problems by selecting a subset of the total features, thus pruning redundancy and noise. In this article, such a feature subset selection method is proposed and implemented to diagnose breast cancer using the Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) algorithms. The feature selection algorithm is based on Social Group Optimization (SGO), an evolutionary algorithm. The proposed model achieves higher accuracy in diagnosing breast cancer than other feature selection-based Machine Learning algorithms.
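
A population-based optimizer used for feature subset selection typically scores each candidate subset, encoded as a binary mask, by the accuracy of a classifier trained on the selected columns only. The sketch below shows one such fitness function under stated assumptions: a 1-nearest-neighbor classifier with leave-one-out evaluation and squared Euclidean distance. It does not implement SGO itself, the abstract does not describe the fitness function actually used, and the name `loo_1nn_accuracy` is hypothetical.

```python
import math
from typing import Sequence


def loo_1nn_accuracy(X: Sequence[Sequence[float]],
                     y: Sequence[int],
                     mask: Sequence[int]) -> float:
    """Leave-one-out accuracy of 1-NN using only the features where mask is 1.

    X is a list of samples (rows), y the class labels, and mask a binary
    vector with one entry per feature. An empty mask scores 0.0.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0

    correct = 0
    for i in range(len(X)):
        best_dist, best_label = math.inf, None
        for k in range(len(X)):
            if k == i:  # hold out the sample being classified
                continue
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += int(best_label == y[i])
    return correct / len(X)
```

An optimizer would call this function once per candidate mask and keep the masks with the highest score; a penalty on the number of selected features is often added to the fitness, but that choice is not specified here.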


Author(s):  
Nazila Darabi ◽  
Abdalhossein Rezai ◽  
Seyedeh Shahrbanoo Falahieh Hamidpour

Breast cancer is a common cancer among women. Accurate and early detection of breast cancer can play a vital role in treatment. This paper presents and evaluates a thermogram-based Computer-Aided Detection (CAD) system for the detection of breast cancer. In this CAD system, feature selection is performed with the Random Subset Feature Selection (RSFS) algorithm alone, and with RSFS hybridized with either the minimum Redundancy Maximum Relevance (mRMR) algorithm or a Genetic Algorithm (GA). The Support Vector Machine (SVM) and k-Nearest Neighbors (kNN) algorithms are used as classifiers. The proposed CAD system is verified using MATLAB 2017 and a dataset composed of breast images from 78 patients. With RSFS for feature selection, the kNN and SVM classifiers achieve accuracies of 85.36% and 75%, and sensitivities of 94.11% and 79.31%, respectively. With the hybrid GA and RSFS algorithm, kNN and SVM achieve accuracies of 83.87% and 69.56%, and sensitivities of 96% and 81.81%, respectively. With the hybrid mRMR and RSFS algorithm, kNN and SVM achieve accuracies of 77.41% and 73.07%, and sensitivities of 98% and 72.72%, respectively.


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Jin-Jia Wang ◽  
Fang Xue ◽  
Hui Li

Feature extraction and classification of EEG signals are core parts of brain-computer interfaces (BCIs). Because the EEG feature vector is high-dimensional, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are obtained first; they include the power spectrum, time-domain statistics, AR model, and wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses a logistic regression model with a Sparse Group Lasso penalty function. The model is fitted on the training data, and the parameters are estimated by a modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected using 10-fold cross-validation. Finally, the test data are classified using the trained model. Compared with existing channel and feature selection methods, the results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can achieve channel and feature selection simultaneously with a lower error rate. The test accuracy on data from the international BCI Competition IV reached 84.72%.
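
For orientation, a logistic regression model with a Sparse Group Lasso penalty is commonly written in the following standard form. This is the textbook formulation, given here as an assumption; the paper's exact weighting and notation may differ. The symbols are not taken from the abstract: labels y_i ∈ {0, 1}, feature vectors x_i, G groups (plausibly one per EEG channel, given that the method selects channels and features together), group size p_g, overall strength λ ≥ 0, and mixing parameter α ∈ [0, 1].

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}
    \Big[\, y_i\, x_i^{\top}\beta \;-\; \log\!\big(1 + e^{x_i^{\top}\beta}\big) \Big]
  \;+\; (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\big\lVert \beta^{(g)} \big\rVert_2
  \;+\; \alpha\,\lambda\,\lVert \beta \rVert_1
```

The group term can zero out an entire group of coefficients at once, while the ℓ1 term zeroes individual coefficients inside the groups that remain. Setting α = 1 recovers the ordinary lasso and α = 0 the group lasso.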


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255307
Author(s):  
Fujun Wang ◽  
Xing Wang

Feature selection is an important task in big data analysis and information retrieval processing. It reduces the number of features by removing noisy and extraneous data. In this paper, a feature subset selection algorithm based on damping oscillation theory and a support vector machine classifier is proposed. The algorithm is called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed, which is used to measure the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm whose position update formula has been modified to achieve better results. Third, the filter model and the wrapper model are dynamically adjusted by the damping oscillation theory in order to find an optimal feature subset. MKMDIGWO therefore achieves both the efficiency of the filter model and the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets have demonstrated that MKMDIGWO achieves higher classification accuracy than four other state-of-the-art algorithms. The maximum ACC value of the MKMDIGWO algorithm is at least 0.5% higher than other algorithms on 10 data sets.
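
The two quantities the filter model is built on, the Kendall coefficient and the Euclidean distance, can be computed as in the sketch below. It assumes the tau-a variant of the Kendall coefficient (no tie correction) and does not attempt to reproduce how MKMDIGWO combines the two measures, which the abstract does not specify. The function names `kendall_tau_a` and `euclidean_distance` are hypothetical.

```python
import math
from typing import Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall rank correlation (tau-a): (concordant - discordant) / all pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("inputs must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

In a filter of this kind, a rank correlation between a feature and the class label would typically serve as a relevance score, and a distance between two feature vectors as a measure of how little they overlap; treating them that way here is an interpretation of the abstract rather than a statement of the paper's formulas.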

