Optimized Breast Cancer Classification using Feature Selection and Outliers Detection

Author(s):  
A. B Yusuf ◽  
R. M Dima ◽  
S. K Aina

Breast cancer is the second most commonly diagnosed cancer in women throughout the world. It is on the rise, especially in developing countries, where the majority of cases are discovered late. Breast cancer develops when cancerous tumors form on the surface of the breast cells. The absence of accurate prognostic models to help physicians recognize symptoms early makes it difficult to develop a treatment plan that would help patients live longer. Machine learning techniques have recently been used to improve the accuracy and speed of breast cancer diagnosis. The higher the accuracy, the more efficient the model and the better the resulting diagnostic solution. Nevertheless, the primary difficulty for systems that detect breast cancer with machine learning models is attaining the highest classification accuracy and picking the most predictive features for increasing that accuracy. As a result, breast cancer prognosis remains a challenge. This research seeks to address a flaw in an existing technique that is unable to enhance the classification of continuous-valued data, particularly with respect to accuracy and the selection of optimal features for breast cancer prediction. To address these issues, this study examines the impact of outliers and feature reduction on the Wisconsin Diagnostic Breast Cancer Dataset, tested using seven different machine learning algorithms. The results show that, on removal of outliers from the dataset, the Logistic Regression, Random Forest, and AdaBoost classifiers achieved the highest accuracy of 99.12%. When feature selection was also applied to this filtered dataset, the Random Forest and Gradient Boosting classifiers achieved the highest accuracies of 100% and 99.12%, respectively. When compared to other state-of-the-art approaches, the two suggested strategies outperformed the unfiltered data in terms of accuracy.
The suggested architecture might be a useful tool for radiologists to reduce the number of false negatives and positives. As a result, the efficiency of breast cancer diagnosis analysis will be increased.
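The abstract does not say which outlier-detection rule was used. As an illustration only, the sketch below applies the common interquartile-range (IQR) rule to rows of continuous-valued features. The function names `iqr_bounds` and `drop_outlier_rows`, the multiplier `k=1.5`, and the choice of the IQR rule itself are assumptions, not the authors' method.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits for one feature using the IQR rule.

    Assumption: k=1.5 is the conventional multiplier; the source
    abstract does not state which rule or threshold was used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, k=1.5):
    """Keep only rows whose every feature lies inside its IQR limits.

    `rows` is a list of equal-length tuples of numeric feature values.
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row
        for row in rows
        if all(low <= value <= high for value, (low, high) in zip(row, bounds))
    ]


if __name__ == "__main__":
    # Toy data (hypothetical): the last row has an extreme first feature.
    sample = [(1.0, 10.0), (2.0, 11.0), (3.0, 12.0), (4.0, 13.0), (100.0, 14.0)]
    print(drop_outlier_rows(sample))
```

In a pipeline like the one described, the filtered rows would then be passed to feature selection and to the classifiers. Dropping a whole row when any single feature is extreme is one design choice among several; capping values or using a multivariate detector are alternatives.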

2020 ◽  
Vol 14 ◽  

Breast Cancer (BC) is among the most common and leading causes of death in women throughout the world. Classification and data analysis tools are now widely used in the medical field for diagnosis, prognosis, and decision making, helping to lower the risk of people dying or suffering from disease. Advanced machine learning methods offer hope to patients because they help doctors detect potentially fatal diseases such as breast cancer early and support accurate outcomes. However, the results depend heavily on the techniques used for feature selection and classification, which determine how strong the resulting machine learning model is. In this paper, a performance comparison is conducted using four classifiers, Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest, on the Wisconsin Breast Cancer dataset to spot the most effective predictors. The main goal is to apply the best machine learning classification methods to predict breast cancer as benign or malignant, evaluated in terms of accuracy, F-measure, precision, and recall. Experimental results show that Random Forest achieves the highest accuracy of 99.26% on this dataset and feature set, while SVM and KNN show 97.78% and 97.04% accuracy, respectively. MLP shows the lowest accuracy of 94.07%. All the experiments are conducted using RStudio as the data mining platform.
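The four evaluation terms named above can all be derived from the counts of true and false positives and negatives. The sketch below shows one way to compute them in Python (the study itself used RStudio). The function name `binary_metrics`, the label strings "M" and "B", and the choice of malignant as the positive class are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred, positive="M"):
    """Compute accuracy, precision, recall and F-measure for two classes.

    Assumption: `positive` marks the malignant class; every other label
    is treated as negative (benign).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical labels, not taken from the paper.
    truth = ["M", "M", "M", "B", "B", "B", "B", "B"]
    predicted = ["M", "M", "B", "M", "B", "B", "B", "B"]
    print(binary_metrics(truth, predicted))
```

Reporting recall alongside accuracy matters in this setting because a missed malignant case (a false negative) lowers recall even when overall accuracy stays high.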


2021 ◽  
Vol 30 (1) ◽  
pp. 998-1013
Author(s):  
Shubham Vashisth ◽  
Ishika Dhall ◽  
Garima Aggarwal

The rapid pace of development over the last few decades in the domain of machine learning mirrors the advances made in the field of quantum computing. It is natural to ask whether the conventional machine learning algorithms could be optimized using the present-day noisy intermediate-scale quantum technology. There are certain computational limitations while training a machine learning model on a classical computer. Using quantum computation, it is possible to surpass these limitations and carry out such calculations in an optimized manner. This study illustrates the working of the quantum support vector machine classification model which guarantees an exponential speed-up over its typical alternatives. This research uses the quantum SVM model to solve the classification task of a malignant breast cancer diagnosis. This study also demonstrates a comparative analysis of distinct forms of SVM algorithms concerning their time complexity and performances on standard evaluation metrics, namely accuracy, precision, recall, and F1-score, to exemplify the supremacy of quantum SVM over its conventional variants.

