Breast Cancer: Classification of Tumors Using Machine Learning Algorithms

Author(s):  
David Hettich ◽  
Megan Olson ◽  
Andie Jackson ◽  
Naima Kaabouch
2020 ◽  
Vol 4 (2) ◽  
pp. 535-544
Author(s):  
Djihane HOUFANI ◽  
Sihem SLATNIA ◽  
Okba KAZAR ◽  
Noureddine ZERHOUNI ◽  
...  

Background: Breast cancer is the second deadliest disease affecting women worldwide, after lung cancer. Traditional approaches to breast cancer diagnosis are time-consuming and prone to human error in classification. To address these problems, many approaches based on machine learning techniques have been proposed. These approaches have proven effective for data classification in many fields, especially healthcare. Methods: In this cross-sectional study, we conducted a practical comparison of the machine learning algorithms most used in the literature. We applied kernel and linear support vector machines, random forest, decision tree, multilayer perceptron, logistic regression, and k-nearest neighbors to breast cancer tumor classification. The dataset used is the Wisconsin Diagnostic Breast Cancer dataset. Results: After comparing the performance of the machine learning algorithms, we found that multilayer perceptron and logistic regression gave the best results, with an accuracy of 98% for breast cancer classification. Conclusion: Machine learning approaches are extensively used in medical prediction and decision support systems. This study showed that the multilayer perceptron and logistic regression algorithms perform well (good accuracy, specificity, and sensitivity) compared to the other evaluated algorithms.
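The abstract reports accuracy, sensitivity, and specificity without defining them. As a minimal sketch, assuming a binary labeling in which 1 stands for malignant and 0 for benign (an assumption, the abstract does not state the encoding), the three measures can be computed from confusion-matrix counts as follows. The function names and the toy label lists are illustrative only and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # malignant predicted as malignant
    fp: int  # benign predicted as malignant
    tn: int  # benign predicted as benign
    fn: int  # malignant predicted as benign


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c):
    """Share of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c):
    """Share of truly positive cases that were detected (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of truly negative cases that were correctly ruled out."""
    return c.tn / (c.tn + c.fp)


# Toy labels, invented for illustration (not results from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = count_outcomes(y_true, y_pred)
print(counts)
print(accuracy(counts), sensitivity(counts), specificity(counts))
```

The sketch does not guard against a class being absent from the labels; in that case one of the denominators is zero and the corresponding measure is undefined.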


2021 ◽  
Vol 4 (4) ◽  
pp. 309-315
Author(s):  
Kumawuese Jennifer Kurugh ◽  
Muhammad Aminu Ahmad ◽  
Awwal Ahmad Babajo

Datasets are a major requirement in the development of breast cancer classification/detection models using machine learning algorithms. These models can provide an effective, accurate, and less expensive diagnosis method and reduce loss of life. However, using the same machine learning algorithms on different datasets yields different results. This research developed several machine learning models for breast cancer classification/detection using Random Forest, Support Vector Machine, K-Nearest Neighbors, Gaussian Naïve Bayes, Perceptron, and Logistic Regression. Three widely used test datasets were employed: Wisconsin Breast Cancer (WBC) Original, Wisconsin Diagnostic Breast Cancer (WDBC), and Wisconsin Prognostic Breast Cancer (WPBC). The results show that the choice of dataset affects the performance of machine learning classifiers. In addition, the classifiers perform differently from one another on any given breast cancer dataset.


2021 ◽  
Vol 12 (2) ◽  
pp. 2422-2439

Cancer classification is one of the main objectives of analyzing big biological datasets. Machine learning algorithms (MLAs) have been extensively used to accomplish this task. Several popular MLAs are available in the literature to classify new samples into normal or cancer populations. Nevertheless, most of them yield lower accuracies in the presence of outliers, which leads to incorrect classification of samples. Hence, in this study, we present a robust approach for the efficient and precise classification of samples using noisy gene expression datasets (GEDs). We examine the performance of the proposed procedure in comparison with five popular traditional MLAs (SVM, LDA, KNN, Naïve Bayes, Random Forest) using both simulated and real gene expression data. We also considered several rates of outliers (10%, 20%, and 50%). The results obtained from simulated data confirm that, in the presence of outliers, the traditional MLAs produce better results when applied through our proposed procedure to the proposed modified datasets. Further transcriptome analysis found significant involvement of these extra features in cancer diseases. The results indicated that the proposed procedure improves the performance of the traditional MLAs. Hence, we propose applying the proposed procedure instead of the traditional procedure for cancer classification.


2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Habib Dhahri ◽  
Ines Rahmany ◽  
Awais Mahmood ◽  
Eslam Al Maghayreh ◽  
Wail Elkilani

Breast cancer is the most commonly diagnosed cancer among women around the world. The development of computer-aided diagnosis tools is essential to help pathologists accurately interpret and discriminate between malignant and benign tumors. This paper proposes an automated proliferative breast lesion diagnosis based on machine-learning algorithms. We used Tabu search to select the most significant features. Features are evaluated by the dependency degree of each attribute in the rough set. Classification on the reduced feature set was performed using five machine-learning algorithms. The proposed models were applied to the BIDMC-MGH and Wisconsin Diagnostic Breast Cancer datasets. The performance of the models was evaluated according to five criteria. The top-performing models were AdaBoost and logistic regression. Comparisons with other works demonstrate the effectiveness of the proposed method for superior diagnosis of breast cancer relative to the reviewed classification techniques.
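The abstract scores features by their rough-set dependency degree but does not spell the measure out. The sketch below uses the standard rough-set definition, gamma(C, D) = |POS_C(D)| / |U|: the fraction of objects whose values on the condition attributes C determine the decision D unambiguously. It covers only this scoring step, not the Tabu search, and it is not the authors' implementation. The attribute names and the toy table are invented for illustration.

```python
from collections import defaultdict


def dependency_degree(rows, condition_attrs, decision_attr):
    """Rough-set dependency degree gamma(C, D) = |POS_C(D)| / |U|.

    rows: list of dicts, one per object in the universe U.
    condition_attrs: attribute names forming the candidate subset C.
    decision_attr: name of the decision (class) attribute D.
    """
    if not rows:
        raise ValueError("the universe must contain at least one object")

    # Objects with identical values on C are indiscernible and share a class.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        classes[key].append(row[decision_attr])

    # The positive region holds the classes whose members all agree on D.
    positive = sum(
        len(decisions)
        for decisions in classes.values()
        if len(set(decisions)) == 1
    )
    return positive / len(rows)


# Hypothetical toy table (not data from the paper).
rows = [
    {"size": "large", "shape": "irregular", "label": "malignant"},
    {"size": "large", "shape": "round", "label": "benign"},
    {"size": "small", "shape": "round", "label": "benign"},
    {"size": "small", "shape": "irregular", "label": "malignant"},
    {"size": "large", "shape": "irregular", "label": "malignant"},
    {"size": "small", "shape": "round", "label": "malignant"},
]

for subset in (["size"], ["shape"], ["size", "shape"]):
    print(subset, dependency_degree(rows, subset, "label"))
```

A search procedure such as Tabu search could use this value as its objective, preferring small attribute subsets whose dependency degree approaches that of the full attribute set.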


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 126-127
Author(s):  
Lucas S Lopes ◽  
Christine F Baes ◽  
Dan Tulpan ◽  
Luis Artur Loyola Chardulo ◽  
Otavio Machado Neto ◽  
...  

The aim of this project is to compare some of the state-of-the-art machine learning algorithms on the classification of steers finished in feedlots based on performance, carcass, and meat quality traits. The precise classification of animals allows for fast, real-time decision making in the animal food industry, such as culling or retention of herd animals. Beef production presents high variability in its numerous carcass and beef quality traits. Machine learning algorithms and software provide an opportunity to evaluate the interactions between traits to better classify animals. Four different treatment levels of wet distiller’s grain were applied to 97 Angus-Nellore animals and used as features for the classification problem. The C4.5 decision tree, Naïve Bayes (NB), Random Forest (RF), and Multilayer Perceptron (MLP) Artificial Neural Network algorithms were used to predict and classify the animals based on recorded trait measurements, which include initial and final weights, shear force, and meat color. The top-performing classifier was the C4.5 decision tree algorithm with a classification accuracy of 96.90%, while the RF, MLP, and NB classifiers had accuracies of 55.67%, 39.17%, and 29.89%, respectively. We observed that the final decision tree model constructed with C4.5 selected only the dry matter intake (DMI) feature as a differentiator. When DMI was removed, no other feature or combination of features was sufficiently strong to provide good prediction accuracies for any of the classifiers. In a follow-up study with a significantly larger sample size, we plan to investigate why DMI is a more relevant parameter than the other measurements.
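C4.5 chooses split attributes by gain ratio, which may help explain how a tree can end up relying on a single feature such as DMI. The sketch below computes gain ratio for categorical attributes only. C4.5's handling of continuous attributes through thresholds, and its pruning, are left out. The feature names and values are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    total = len(items)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(items).values()
    )


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the labels."""
    total = len(labels)

    # Partition the labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)

    # Information gain: label entropy minus weighted entropy after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy example: four animals, two treatment classes.
labels = ["A", "A", "B", "B"]
dmi_level = ["low", "low", "high", "high"]    # separates the classes
color_grade = ["x", "y", "x", "y"]            # carries no class information

print(gain_ratio(dmi_level, labels))
print(gain_ratio(color_grade, labels))
```

When one attribute separates the classes almost perfectly and the rest score near zero, as in this toy case, the tree needs only that attribute, which matches the pattern the abstract describes for DMI.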

