A Novel Data Mining on Breast Cancer Survivability Using MLP Ensemble Learners

2019 ◽  
Vol 63 (3) ◽  
pp. 435-447
Author(s):  
Mohsen Salehi ◽  
Jafar Razmara ◽  
Shahriar Lotfi

Abstract Breast cancer survivability has long been an important and challenging research problem. Various methods, mostly based on machine learning, have been used to predict survivability among cancer patients. The most comprehensive available database of cancer incidence is SEER in the United States, which has been used frequently for research. In this paper, a new data mining study is performed on the SEER database to investigate the ability of machine learning techniques to predict the survivability of breast cancer patients. To this end, the breast cancer incidence data were preprocessed to remove unusable records. Subsequently, two machine learning techniques based on the Multi-Layer Perceptron (MLP) learner, MLP stacked generalization and a mixture of MLP-experts, were developed to make predictions over the database. The models were evaluated using K-fold cross-validation. The evaluation revealed accuracies of 84.32% and 83.86% for the mixture of MLP-experts and MLP stacked generalization, respectively. This indicates that the predictors can be used effectively for survivability prediction, supporting time- and cost-effective treatment for breast cancer patients.
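The K-fold cross-validation step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `k_fold_indices` and `cross_validate` are hypothetical, and the majority-class learner is only a stand-in for the MLP ensembles, which the abstract does not specify in detail. The sketch assumes at least `k` samples.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Return the mean accuracy over k held-out folds.

    `fit(train_features, train_labels)` returns a model and
    `predict(model, x)` returns a label; both are supplied by the caller.
    """
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Placeholder learner (assumption): always predicts the most frequent
# training label. A real study would plug an MLP ensemble in here.
def fit_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, x):
    return model
```

Every record is held out exactly once, so the reported accuracy is an average over predictions made on data the model did not see during training.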

2019 ◽  
Vol 15 (29) ◽  
pp. 1-23 ◽  
Author(s):  
Rashmi Agrawal

This paper is a product of the research project “Predictive Analysis Of Breast Cancer Using Machine Learning Techniques”, carried out at Manav Rachna International Institute of Research and Studies, Faridabad, in 2018. Introduction: The present article is part of the effort to predict breast cancer, which is a serious concern for women’s health. Problem: Breast cancer is the most common type of cancer and has always been a threat to women’s lives. Early diagnosis requires an effective prediction method that allows physicians to distinguish benign from malignant tumors. Researchers and scientists have been working to find innovative methods to predict cancer. Objective: The objective of this paper is the predictive analysis of breast cancer using several machine learning techniques: Naïve Bayes, Linear Discriminant Analysis, K-Nearest Neighbors and Support Vector Machine. Methodology: Predictive data mining has become an instrument for scientists and researchers in the medical field. Predicting breast cancer at an early stage helps in better cure and treatment. KDD (Knowledge Discovery in Databases) is one of the most popular data mining methods used by medical researchers to identify patterns and relationships between variables, and it also helps in predicting the outcome of the disease from historical data. Results: To select the best model for cancer prediction, the accuracy of all models is estimated and the most accurate model is selected. Conclusion: This work seeks to identify the technique with the highest accuracy for breast cancer prediction. Originality: This research has been performed using R and a dataset taken from the UCI machine learning repository. Limitations: The lack of exact information provided by the data.
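The model-selection step described under Results (estimate the accuracy of each candidate, keep the best) can be sketched as below. The original work used R; this Python sketch is only illustrative. The helper names `knn_predict`, `accuracy` and `select_best` are hypothetical, only a K-Nearest Neighbors classifier and a trivial baseline are compared, and the six training points are made-up toy values, not data from the UCI repository.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(predict, test_X, test_y):
    """Fraction of test points whose predicted label matches the true one."""
    return sum(predict(x) == y for x, y in zip(test_X, test_y)) / len(test_y)


def select_best(candidates, test_X, test_y):
    """Score every candidate predictor and return (best_name, all_scores)."""
    scores = {name: accuracy(p, test_X, test_y) for name, p in candidates.items()}
    return max(scores, key=scores.get), scores


# Toy, made-up measurements (assumption): two well-separated groups.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["benign"] * 3 + ["malignant"] * 3
test_X = [(1.1, 1.0), (5.1, 5.0)]
test_y = ["benign", "malignant"]

candidates = {
    "knn": lambda x: knn_predict(train_X, train_y, x, k=3),
    "always_benign": lambda x: "benign",  # trivial baseline for comparison
}
best_name, scores = select_best(candidates, test_X, test_y)
```

The other classifiers named in the abstract would enter the comparison the same way, as additional entries in `candidates`.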


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pratyusha Rakshit ◽  
Onintze Zaballa ◽  
Aritz Pérez ◽  
Elisa Gómez-Inhiesto ◽  
Maria T. Acaiturri-Ayesta ◽  
...  

Abstract This paper presents a novel machine learning approach for early prediction of the healthcare cost of breast cancer patients. The learning phase of the prediction method consists of two steps: (1) first, the patients are clustered according to their sequences of actions, so that each group contains patients undergoing similar clinical activities and incurring similar healthcare costs, and (2) a Markov chain is then learned for each group to describe the action sequences of the patients in the cluster. The prediction phase also follows a two-step procedure: (1) first, the healthcare cost of a new patient’s treatment is estimated from the average healthcare cost of the patient’s k-nearest neighbors in each group, and (2) finally, an aggregate of the costs estimated by each group is used as the final predicted cost. Experiments reveal a mean absolute percentage error as small as 6%, even when only half of a patient’s clinical records are available, substantiating the early prediction capability of the proposed method. Comparative analysis substantiates the superiority of the proposed algorithm over state-of-the-art techniques.
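Two ingredients named in the abstract, learning a Markov chain over action sequences and scoring predictions by mean absolute percentage error, can be sketched as follows. This is a sketch under stated assumptions rather than the authors' method: the function names `fit_markov_chain` and `mape` are hypothetical, the chain is assumed to be first-order and estimated by simple transition counts, and the clustering and k-nearest-neighbor steps are omitted because the abstract does not specify their distance measures.

```python
from collections import defaultdict


def fit_markov_chain(sequences):
    """Estimate first-order transition probabilities from action sequences.

    Returns {state: {next_state: probability}} using relative frequencies
    of observed consecutive pairs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for state, following in counts.items()
    }


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, calling `fit_markov_chain` on the action sequences of one patient group yields that group's transition table, and `mape` compares true costs against predicted costs for a set of patients.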

