Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models

2019, Vol 21 (1)
Author(s): Chang Ming, Valeria Viassolo, Nicole Probst-Hensch, Pierre O. Chappuis, Ivo D. Dinov, ...

Abstract. Background. Comprehensive breast cancer risk prediction models enable identifying and targeting women at high risk while reducing interventions in those at low risk. Breast cancer risk prediction models used in clinical practice have low discriminatory accuracy (0.53–0.64). Machine learning (ML) offers an alternative approach to standard prediction modeling that may address current limitations and improve the accuracy of those tools. The purpose of this study was to compare the discriminatory accuracy of ML-based estimates against a pair of established methods: the Breast Cancer Risk Assessment Tool (BCRAT) and the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) models. Methods. We quantified the performance of eight different ML methods and compared it with the performance of BCRAT and BOADICEA using eight simulated datasets and two retrospective samples: a random population-based sample of U.S. breast cancer patients and their cancer-free female relatives (N = 1143), and a clinical sample of Swiss breast cancer patients and cancer-free women seeking genetic evaluation and/or testing (N = 2481). Results. Predictive accuracy (area under the ROC curve, AU-ROC) reached 88.28% using ML-adaptive boosting and 88.89% using ML-random forest versus 62.40% with BCRAT for the U.S. population-based sample. Predictive accuracy reached 90.17% using ML-adaptive boosting and 89.32% using ML-Markov chain Monte Carlo generalized linear mixed model versus 59.31% with BOADICEA for the Swiss clinic-based sample. Conclusions. ML algorithms classified women with and without breast cancer strikingly more accurately than the state-of-the-art model-based approaches. High-accuracy prediction techniques are important in personalized medicine because they facilitate stratification of prevention strategies and individualized clinical management.
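The discriminatory accuracy reported above is the area under the ROC curve, which can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for a case (breast cancer), 0 for a control.
    scores: predicted risk, where higher means higher estimated risk.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case correctly ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: two controls followed by two cases.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to ranking cases and controls at random, and 1.0 to perfect separation, which is the scale on which the 0.53–0.64 and 88–90% figures above are compared.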

2017, Vol 1 (1), pp. 53-59
Author(s): Lance T. Pflieger, Clinton C. Mason, Julio C. Facelli

Introduction. Family health history (FHx) is an important factor in breast and ovarian cancer risk assessment. As such, multiple risk prediction models rely strongly on FHx data when identifying a patient’s risk. These models were developed using verified information and, when translated into a clinical setting, assume that a patient’s FHx is accurate and complete. However, FHx information collected in a typical clinical setting is known to be imprecise, and it is not well understood how this uncertainty may affect predictions in clinical settings. Methods. Using Monte Carlo simulations and existing measurements of uncertainty of self-reported FHx, we show how uncertainty in FHx information can alter risk classification when used in typical clinical settings. Results. We found that the rate of correct tier-level classification of pedigrees ranged from 52% to 64% across the various models under a set of contrived uncertain conditions, and that significant misclassifications are not negligible. Conclusions. Our work implies that (i) uncertainty quantification needs to be considered when transferring tools from a controlled research environment to a more uncertain environment (i.e., a health clinic) and (ii) better FHx collection methods are needed to reduce uncertainty in breast cancer risk prediction in clinical settings.
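The Monte Carlo approach described above can be illustrated with a small sketch. It is not the authors' simulation: the tier rule in `tier`, the per-relative reporting model in `report`, and the sensitivity and specificity values are all hypothetical stand-ins, and no published risk model is implemented here. The sketch only shows the general idea of perturbing a true family history with reporting error and counting how often the risk tier is preserved.

```python
import random
from typing import List


def tier(affected_count: int) -> str:
    """Hypothetical tier rule (not taken from any published model)."""
    if affected_count >= 2:
        return "high"
    if affected_count == 1:
        return "moderate"
    return "average"


def report(true_status: bool, sensitivity: float, specificity: float,
           rng: random.Random) -> bool:
    """Simulate one relative's self-reported cancer status."""
    if true_status:
        return rng.random() < sensitivity    # affected relative is reported
    return rng.random() >= specificity       # unaffected relative is misreported


def agreement_rate(true_history: List[bool], sensitivity: float,
                   specificity: float, n_trials: int = 10_000,
                   seed: int = 0) -> float:
    """Fraction of simulated reports that keep the true risk tier."""
    rng = random.Random(seed)
    true_tier = tier(sum(true_history))
    matches = 0
    for _ in range(n_trials):
        reported = [report(s, sensitivity, specificity, rng)
                    for s in true_history]
        if tier(sum(reported)) == true_tier:
            matches += 1
    return matches / n_trials


# Hypothetical pedigree: one affected and three unaffected relatives,
# with assumed reporting sensitivity 0.85 and specificity 0.95.
print(agreement_rate([True, False, False, False], 0.85, 0.95))
```

With perfect reporting (sensitivity and specificity both 1.0) the agreement rate is 1.0; lowering either value shows how reporting error alone can move a pedigree into a different tier.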

