Optimized Breast Cancer Classification using Feature Selection and Outliers Detection

Breast cancer is the second most commonly diagnosed cancer in women worldwide. Its incidence is rising, especially in developing countries, where the majority of cases are discovered late. Breast cancer develops when cells in the breast grow abnormally and form malignant tumors. The absence of accurate prognostic models to help physicians recognize symptoms early makes it difficult to develop a treatment plan that would help patients live longer. Machine learning techniques have recently been used to improve the accuracy and speed of breast cancer diagnosis, and the more accurate the model, the more useful it is as a diagnostic aid. Nevertheless, the primary difficulty for machine-learning systems developed to detect breast cancer is attaining the highest classification accuracy and selecting the most predictive features. As a result, breast cancer prognosis remains a challenge. This research seeks to address a shortcoming of existing techniques, which struggle to classify continuous-valued data accurately and to select the optimal features for breast cancer prediction. To that end, this study examines the impact of outlier removal and feature reduction on the Wisconsin Diagnostic Breast Cancer dataset, evaluated with seven different machine learning algorithms. The results show that the Logistic Regression, Random Forest, and AdaBoost classifiers achieved the highest accuracy, 99.12%, after outliers were removed from the dataset. When feature selection was additionally applied to this filtered dataset, the Random Forest and Gradient Boosting classifiers achieved accuracies of 100% and 99.12%, respectively. In terms of accuracy, the two proposed strategies outperformed both the unfiltered data and other state-of-the-art approaches.
The proposed architecture could be a useful tool for radiologists to reduce the number of false negatives and false positives, and thereby improve the efficiency of breast cancer diagnosis.
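The abstract does not say which outlier-detection rule was used. As an illustration only, the sketch below assumes the common interquartile-range (Tukey fence) rule and drops any row in which at least one feature falls outside its column's fences. The function names, the multiplier `k=1.5`, and the toy rows are all hypothetical and are not taken from the paper or from the Wisconsin dataset.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) Tukey fences for one feature column."""
    # quantiles() sorts the data itself; n=4 yields Q1, median, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, k=1.5):
    """Keep only rows whose every feature lies inside its column's fences."""
    columns = list(zip(*rows))  # transpose rows into feature columns
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row
        for row in rows
        if all(low <= x <= high for x, (low, high) in zip(row, bounds))
    ]


# Toy data (made up): the last row has an extreme value in its first feature.
rows = [
    (1.0, 10.0),
    (2.0, 11.0),
    (3.0, 12.0),
    (4.0, 13.0),
    (5.0, 14.0),
    (100.0, 12.0),
]
filtered = drop_outlier_rows(rows)
print(len(rows), "rows before,", len(filtered), "rows after filtering")
```

In a pipeline like the one described, a filter of this kind would run before feature selection and classifier training. Whether the fences should be computed on the full dataset or on the training split only is a design choice the abstract does not settle.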

Breast Cancer Diagnosis Using Machine Learning Algorithms - A Survey

International Journal of Distributed and Parallel Systems ◽

10.5121/ijdps.2013.4309 ◽

2013 ◽

Vol 4 (3) ◽

pp. 105-112 ◽

Cited By ~ 15

Author(s):

Gayathri B.M ◽

Sumathi C.P ◽

Santhanam T

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Diagnosis ◽

Learning Algorithms ◽

Breast Cancer Diagnosis ◽

Machine Learning Algorithms

Classifications of Breast Cancer Diagnosis using Machine Learning

International Journal of Computers ◽

10.46300/9108.2020.14.13 ◽

2020 ◽

Vol 14 ◽

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Random Forest ◽

Breast Cancer Diagnosis ◽

Performance Comparison ◽

Support Vector ◽

Breast Cancer Dataset ◽

K Nearest Neighbors ◽

Cancer Dataset ◽

Machine Learning Classification

Breast cancer (BC) is among the most common cancers and a leading cause of death in women throughout the world. Classification and data analysis tools are now widely used in medicine for diagnosis, prognosis, and decision making, to help lower the risk of people dying or suffering from disease. Advanced machine learning methods offer hope to patients because they help doctors detect potentially fatal diseases such as breast cancer early and with accurate outcomes. However, the results depend heavily on the techniques used for feature selection and classification, which determine how strong the resulting machine learning model is. In this paper, a performance comparison is conducted of four classifiers, Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest, on the Wisconsin Breast Cancer dataset to identify the most effective predictors. The main goal is to apply the best machine learning classification methods to predict breast cancer as benign or malignant, evaluated by accuracy, F-measure, precision, and recall. Experimental results show that Random Forest achieves the highest accuracy, 99.26%, on this dataset and feature set, while SVM and KNN reach 97.78% and 97.04%, respectively. MLP has the lowest accuracy, 94.07%. All experiments were conducted using RStudio as the data mining platform.
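The four evaluation measures named in this abstract can all be derived from the counts of a binary confusion matrix. The sketch below is a minimal illustration that treats malignant ("M") as the positive class. The paper's experiments were run in RStudio, so this Python function, its name, and the toy label lists are assumptions made for illustration rather than the authors' code or data.

```python
def binary_metrics(y_true, y_pred, positive="M"):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels (made up): "M" = malignant, "B" = benign.
y_true = ["M", "M", "M", "B", "B", "B", "B", "M"]
y_pred = ["M", "M", "B", "B", "B", "M", "B", "M"]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a diagnostic setting, recall for the malignant class is usually the figure to watch alongside accuracy, because a false negative corresponds to a missed cancer.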

Predictive Analysis of Machine Learning Algorithms for Breast Cancer Diagnosis

2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) ◽

10.1109/icrito48877.2020.9197945 ◽

2020 ◽

Author(s):

Mudit Arora ◽

Subhranil Som ◽

Ajay Rana

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Diagnosis ◽

Learning Algorithms ◽

Breast Cancer Diagnosis ◽

Machine Learning Algorithms ◽

Predictive Analysis

Analysis of Breast Cancer Diagnosis and Prognosis Using Machine Learning Algorithms

Advances in Automation, Signal Processing, Instrumentation, and Control - Lecture Notes in Electrical Engineering ◽

10.1007/978-981-15-8221-9_298 ◽

2021 ◽

pp. 3197-3211

Author(s):

Sankardoss Varadhan ◽

Nikhil Jeswani ◽

Vaibhav Sajnani ◽

Prince Jaiswal

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Diagnosis ◽

Learning Algorithms ◽

Breast Cancer Diagnosis ◽

Machine Learning Algorithms ◽

Diagnosis And Prognosis ◽

Cancer Diagnosis And Prognosis

A Comparative Analysis of Feature Selection Methods and Associated Machine Learning Algorithms on Wisconsin Breast Cancer Dataset (WBCD)

Advances in Intelligent Systems and Computing - Proceedings of International Conference on ICT for Sustainable Development ◽

10.1007/978-981-10-0129-1_23 ◽

2016 ◽

pp. 215-224 ◽

Cited By ~ 3

Author(s):

Nileshkumar Modi ◽

Kaushar Ghanchi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Feature Selection ◽

Comparative Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Breast Cancer Dataset ◽

Selection Methods ◽

Cancer Dataset

A Breast Cancer Diagnosis Method based on VIM Feature Selection and Hierarchical Clustering Random Forest Algorithm

IEEE Access ◽

10.1109/access.2021.3139595 ◽

2021 ◽

pp. 1-1

Author(s):

Zexian Huang ◽

Daqi Chen

Keyword(s):

Breast Cancer ◽

Feature Selection ◽

Random Forest ◽

Hierarchical Clustering ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Random Forest Algorithm ◽

Diagnosis Method

Design and analysis of quantum powered support vector machines for malignant breast cancer diagnosis

Journal of Intelligent Systems ◽

10.1515/jisys-2020-0089 ◽

2021 ◽

Vol 30 (1) ◽

pp. 998-1013

Author(s):

Shubham Vashisth ◽

Ishika Dhall ◽

Garima Aggarwal

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Malignant Breast ◽

Quantum Technology ◽

Classical Computer

The rapid pace of development in machine learning over the last few decades mirrors the advances made in quantum computing. It is natural to ask whether conventional machine learning algorithms could be optimized using present-day noisy intermediate-scale quantum technology. Training a machine learning model on a classical computer is subject to certain computational limitations. Using quantum computation, it is possible to surpass these limitations and carry out such calculations in an optimized manner. This study illustrates the working of the quantum support vector machine (SVM) classification model, which guarantees an exponential speed-up over its typical alternatives. This research uses the quantum SVM model to solve the classification task of malignant breast cancer diagnosis. It also presents a comparative analysis of different forms of SVM with respect to their time complexity and their performance on standard evaluation metrics, namely accuracy, precision, recall, and F1-score, to demonstrate the superiority of quantum SVM over its conventional variants.
