A Composite Hybrid Feature Selection Learning-Based Optimization of Genetic Algorithm For Breast Cancer Detection

Author(s):  
Ahmed Abdullah Farid ◽  
Gamal Selim ◽  
Hatem Khater

Breast cancer is a significant health issue worldwide and the most widely diagnosed cancer in women; early-stage diagnosis and treatment increase patient safety. This paper proposes a composite hybrid feature selection model based on an optimized genetic algorithm (CHFS-BOGA) to predict breast cancer. The hybrid approach combines the advantages of three filter feature selection approaches with an optimized Genetic Algorithm (OGA) to select the best features, improving the performance and scalability of the classification process. The OGA improves initial population generation and the genetic operators by using the results of the filter approaches as prior information, and it uses the C4.5 decision tree classifier as the fitness function instead of probability and random selection. The data come from the Wisconsin breast cancer dataset in the UCI Machine Learning Repository, with a total of 569 rows and 32 columns. The dataset was evaluated using the Explorer interface of the open-source Weka data mining software. The results show that the proposed hybrid feature selection approach significantly outperforms the single filter approaches and principal component analysis (PCA) for optimum feature selection. These characteristics are good indicators for the return prediction. Before applying CHFS-BOGA, the highest accuracy achieved with the support vector machine (SVM) classifier was 97.3%. After applying it (CHFS-BOGA-SVM), the highest accuracy was 98.25% on a 70% training split with the remainder used for testing, and 100% on the full training set. Moreover, the area under the receiver operating characteristic (ROC) curve was 1.0. The results show that the proposed CHFS-BOGA-SVM system can accurately classify a breast tumor as malignant or benign.
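The idea of seeding a genetic algorithm with filter scores and using a classifier as the fitness function can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the filter score is a simple standardized difference of class means, a nearest-centroid classifier stands in for the C4.5 decision tree, fitness is training accuracy rather than a held-out estimate, and the data are synthetic. All function names and parameters are hypothetical.

```python
import random


def make_toy_data(n=200, seed=1):
    """Synthetic data: features 0-1 are informative, features 2-5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def filter_scores(X, y):
    """Assumed filter: |difference of class means| / overall std, per feature."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        pos = [v for v, label in zip(col, y) if label == 1]
        neg = [v for v, label in zip(col, y) if label == 0]
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)) / std)
    return scores


def centroid_accuracy(X, y, mask):
    """Fitness stand-in: training accuracy of a nearest-centroid classifier."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    cents = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X, y) if l == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for r, l in zip(X, y):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in cents}
        correct += min(dist, key=dist.get) == l
    return correct / len(X)


def ga_select(X, y, generations=20, pop_size=20, seed=0):
    """GA over feature masks whose initial population and mutation use filter priors."""
    rng = random.Random(seed)
    scores = filter_scores(X, y)
    top = max(scores) or 1.0
    # Inclusion probability grows with the filter score, clipped to [0.1, 0.9].
    prior = [min(0.9, max(0.1, s / top)) for s in scores]
    pop = [[rng.random() < p for p in prior] for _ in range(pop_size)]

    def fitness(mask):
        return centroid_accuracy(X, y, mask)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: with probability 0.05 a gene is resampled from the prior.
            child = [(rng.random() < prior[j]) if rng.random() < 0.05 else g
                     for j, g in enumerate(child)]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)


X, y = make_toy_data()
mask, acc = ga_select(X, y)
print("selected features:", [j for j, keep in enumerate(mask) if keep], "accuracy:", acc)
```

In a faithful reproduction, the fitness function would be replaced by the accuracy of a decision tree evaluated on held-out data, and the prior would combine the three filter rankings the abstract mentions.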

Author(s):  
Indu Singh ◽  
Shashank Garg ◽  
Shivam Arora ◽  
Nikhil Arora ◽  
Kripali Agrawal

Background: Breast cancer is the development of a malignant tumor in the breast of human beings (especially females). If not detected in its initial stages, it can progress to an inoperable condition. It is the cause of the majority of cancer-related deaths throughout the world. Objectives: The main aim of our study is to diagnose breast cancer at an early stage so that the required treatment can be provided for survival. The tumor is accurately classified as malignant or benign at an early stage using a novel approach that applies a Genetic Algorithm to both feature selection and kernel selection for an SVM classifier. Methods: The proposed GA-SVM (Genetic Algorithm – Support Vector Machine) algorithm optimally selects the most appropriate features for training the SVM classifier. Genetic Programming is used to select the features and the kernel for the SVM classifier. The Genetic Algorithm operates by exploring the optimal layout of features for breast cancer, thus overcoming the problems faced in an exponentially large feature space. Results: The proposed approach achieves a mean accuracy of 98.82% on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset available on UCI, with a training-to-testing ratio of 50:50. Conclusion: The results show that our proposed model outperforms previously designed models for breast cancer diagnosis. The outcome suggests that the GA-SVM model may be used as an effective tool to assist doctors in treating patients. Alternatively, it may be used as a second opinion in their eventual diagnosis.
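One way to let a single genetic algorithm choose both the features and the SVM kernel is to append a categorical kernel gene to the binary feature mask. The sketch below shows only that encoding and the genetic operators; it is an assumption about how such a chromosome could be laid out, not the paper's method. The candidate kernel list, the operator rates, and every function name are hypothetical, and a toy scoring function stands in for the cross-validated SVM accuracy that a real fitness function would compute.

```python
import random

KERNELS = ("linear", "poly", "rbf", "sigmoid")  # assumed candidate set


def random_chromosome(n_features, rng):
    """Feature bits followed by one integer gene that indexes KERNELS."""
    return [rng.randint(0, 1) for _ in range(n_features)] + [rng.randrange(len(KERNELS))]


def decode(chrom):
    """Split a chromosome into (feature mask, kernel name)."""
    return [bool(g) for g in chrom[:-1]], KERNELS[chrom[-1]]


def crossover(a, b, rng):
    """Uniform crossover: every gene, including the kernel gene, comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(chrom, rng, rate=0.1):
    """Flip feature bits and occasionally resample the kernel gene."""
    bits = [1 - g if rng.random() < rate else g for g in chrom[:-1]]
    kernel = rng.randrange(len(KERNELS)) if rng.random() < rate else chrom[-1]
    return bits + [kernel]


def evolve(fitness, n_features, pop_size=12, generations=15, seed=0):
    """Evolve chromosomes; fitness(mask, kernel) returns a score to maximize."""
    rng = random.Random(seed)

    def score(chrom):
        return fitness(*decode(chrom))

    pop = [random_chromosome(n_features, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    return decode(max(pop, key=score))


def toy_fitness(mask, kernel):
    """Stand-in for SVM accuracy: rewards features 0 and 1 and the rbf kernel."""
    return 4 * mask[0] + 4 * mask[1] + 2 * (kernel == "rbf") - sum(mask[2:])


best_mask, best_kernel = evolve(toy_fitness, n_features=5)
print(best_mask, best_kernel, toy_fitness(best_mask, best_kernel))
```

To apply this to real data, `toy_fitness` would be replaced by a function that trains an SVM with the decoded kernel on the selected columns and returns its validation accuracy.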


Twitter sentiment analysis is a vital concept in determining public opinion about products, services, events, or personalities. Analyzing medical tweets on a specific topic can provide immense benefits to the medical industry. However, medical tweets require an efficient feature selection approach to produce significantly accurate results. The Penguin search optimization algorithm (PeSOA) has the ability to resolve NP-hard problems. This paper aims to develop an automated opinion mining framework by modeling the feature selection problem as an NP-hard optimization problem and solving it with a PeSOA-based feature selection approach. Initially, medical tweets based on cancer and drug keywords are extracted and pre-processed to filter the relevant informative tweets. Then the features are extracted based on Natural Language Processing (NLP) concepts, and the optimal features are selected using PeSOA; the results are fed as input to three baseline classifiers to achieve optimal and accurate sentiment classification. The experimental results obtained through MATLAB simulations on cancer and drug tweets using k-Nearest Neighbor (KNN), Naïve Bayes (NB), and Support Vector Machine (SVM) classifiers indicate that the proposed PeSOA feature selection based tweet opinion mining improves classification performance significantly. They show that PeSOA feature selection with the SVM classifier provides better sentiment classification than the other classifiers.

