Feature Selection with Fast Correlation-Based Filter for Breast Cancer Prediction and Classification Using Machine Learning Algorithms

Author(s):  
Youness Khourdifi ◽  
Mohamed Bahaj
Author(s):  
Akshya Yadav ◽  
Imlikumla Jamir ◽  
Raj Rajeshwari Jain ◽  
Mayank Sohani

Cancer is one of the leading causes of death in humans. Breast cancer, a subtype of cancer, causes death in one out of every eight women worldwide. Early and accurate diagnosis, which allows faster treatment, is the main way to counter this. Achieving such accuracy in a short time proves difficult with existing techniques. In this paper, different machine learning algorithms that physicians can use as tools for early and effective detection and prediction of cancerous cells are studied and introduced. The algorithms introduced here are Artificial Neural Network (ANN), Decision Tree (DT), Random Forest (RF), Naive Bayes Classifier (NBC), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). These algorithms are trained on a dataset that contains parameters describing the tumor of a person with breast cancer and are then used to classify and predict whether the cell is cancerous.
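
As a rough illustration of the train-then-classify workflow described in this abstract, the following is a minimal sketch of one of the listed algorithms, KNN, written with the Python standard library only. The two features (mean radius, mean texture), the sample values, and the function name `knn_predict` are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_X, train_y)
    ]
    # Keep the labels of the k closest samples and take a majority vote.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (mean radius, mean texture) per tumor.
# The numbers are illustrative only, not real measurements.
train_X = [
    (12.0, 14.0), (11.5, 15.0), (13.0, 13.5),
    (20.0, 25.0), (19.5, 24.0), (21.0, 26.5),
]
train_y = ["benign", "benign", "benign",
           "malignant", "malignant", "malignant"]

print(knn_predict(train_X, train_y, (12.5, 14.5)))  # benign
print(knn_predict(train_X, train_y, (20.5, 25.5)))  # malignant
```

In practice the features would be scaled before computing distances, since KNN is sensitive to differences in feature ranges.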


2021 ◽  
Vol 7 ◽  
pp. e390
Author(s):  
Shafaq Abbas ◽  
Zunera Jalil ◽  
Abdul Rehman Javed ◽  
Iqra Batool ◽  
Mohammad Zubair Khan ◽  
...  

Breast cancer is one of the leading causes of death in the current age. It often results in subpar living conditions for patients, who have to go through expensive and painful treatments to fight the disease. One in eight women all over the world is affected by it, and almost half a million women die from it annually. Machine learning algorithms have proven to outperform all existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes and k-Nearest Neighbor. BCD-WERT outperformed all of them with the highest accuracy of 99.30%, followed by SVM at 98.60%. Experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.


Author(s):  
Yolanda D Austria ◽  
Marie Luvett Goh ◽  
Lorenzo Sta. Maria Jr. ◽  
Jay-Ar Lalata ◽  
Joselito Eduard Goh ◽  
...  

2021 ◽  
Vol 191 ◽  
pp. 487-492
Author(s):  
Mohammed Amine Naji ◽  
Sanaa El Filali ◽  
Kawtar Aarika ◽  
EL Habib Benlahmar ◽  
Rachida Ait Abdelouhahid ◽  
...  

2020 ◽  
Vol 8 (2S7) ◽  
pp. 2237-2240

In diagnosis and prediction systems, algorithms working on datasets with a high number of dimensions tend to take more time than those working on datasets with fewer dimensions. Feature subset selection algorithms enhance the efficiency of Machine Learning algorithms in prediction problems by selecting a subset of the total features, thus pruning redundancy and noise. In this article, such a feature subset selection method is proposed and implemented to diagnose breast cancer using Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) algorithms. The feature selection algorithm is based on Social Group Optimization (SGO), an evolutionary algorithm. The proposed model achieves higher accuracy in diagnosing breast cancer than other feature selection-based Machine Learning algorithms.
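
The idea of wrapper-style feature subset selection described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's method: an exhaustive search over subsets stands in for Social Group Optimization, a 1-nearest-neighbor classifier scored by leave-one-out accuracy stands in for the SVM and KNN models, and the toy data and the names `loo_accuracy` and `select_features` are hypothetical.

```python
import math
from itertools import combinations


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of 1-NN using only the feature indices in subset."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        best_dist, best_label = math.inf, None
        for j, (xj, yj) in enumerate(zip(X, y)):
            if i == j:
                continue  # never compare a sample with itself
            d = math.dist([xi[f] for f in subset], [xj[f] for f in subset])
            if d < best_dist:
                best_dist, best_label = d, yj
        correct += best_label == yi
    return correct / len(X)


def select_features(X, y):
    """Exhaustively score every non-empty feature subset (a stand-in for SGO)."""
    n_features = len(X[0])
    best_subset, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = loo_accuracy(X, y, subset)
            # Strict comparison: on ties the smaller, earlier subset is kept.
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [
    (1.0, 5.0, 0.3), (1.2, 1.0, 0.9), (0.9, 9.0, 0.1),
    (3.0, 5.2, 0.8), (3.1, 1.1, 0.2), (2.9, 8.8, 0.5),
]
y = [0, 0, 0, 1, 1, 1]

print(select_features(X, y))  # ((0,), 1.0)
```

Exhaustive search is only feasible for a handful of features, since the number of subsets grows as 2^n; this is the reason population-based optimizers such as SGO are used to explore the subset space on real datasets.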

