Machine learning prediction models in orthopedic surgery: A systematic review in transparent reporting

Author(s):  
Olivier Q. Groot ◽  
Paul T. Ogink ◽  
Amanda Lans ◽  
Peter K. Twining ◽  
Neal D. Kapoor ◽  
...  
Author(s):  
Anil Babu Payedimarri ◽  
Diego Concina ◽  
Luigi Portinale ◽  
Massimo Canonico ◽  
Deborah Seys ◽  
...  

Artificial Intelligence (AI) and Machine Learning (ML) have expanded their use across many fields of medicine. During the SARS-CoV-2 outbreak, AI and ML were also applied to the evaluation and/or implementation of public health interventions aimed at flattening the epidemiological curve. This systematic review aims to evaluate the effectiveness of AI and ML when applied to public health interventions to contain the spread of SARS-CoV-2. Our findings showed that quarantine appeared to be the best strategy for containing COVID-19. Nationwide lockdown also showed a positive impact, whereas social distancing should be considered effective only in combination with other interventions, including the closure of schools and commercial activities and the limitation of public transportation. Our findings also showed that all interventions should be initiated early in the pandemic and continued for a sustained period. Despite the study limitations, we concluded that AI and ML could help policy makers define strategies for containing the COVID-19 pandemic.


Author(s):  
Nghia H Nguyen ◽  
Dominic Picetti ◽  
Parambir S Dulai ◽  
Vipul Jairath ◽  
William J Sandborn ◽  
...  

Abstract Background and Aims There is increasing interest in machine learning-based prediction models in inflammatory bowel diseases (IBD). We synthesized and critically appraised studies comparing machine learning vs. traditional statistical models that used routinely available clinical data for risk prediction in IBD. Methods Through a systematic review up to January 1, 2021, we identified cohort studies that derived and/or validated machine learning models, based on routinely collected clinical data in patients with IBD, to predict the risk of harboring or developing adverse clinical outcomes, and that reported their predictive performance against a traditional statistical model for the same outcome. We appraised the risk of bias in these studies using the Prediction model Risk of Bias ASsessment (PROBAST) tool. Results We included 13 studies on machine learning-based prediction models in IBD, encompassing the themes of predicting treatment response to biologics and thiopurines, predicting longitudinal disease activity and complications, and predicting outcomes in patients with acute severe ulcerative colitis. The most common machine learning models were tree-based algorithms, which are classification approaches achieved through supervised learning. Machine learning models outperformed traditional statistical models in risk prediction. However, most models were at high risk of bias, and only one was externally validated. Conclusions Machine learning-based prediction models based on routinely collected data generally perform better than traditional statistical models for risk prediction in IBD, though they frequently have a high risk of bias. Future studies examining these approaches are warranted, with a special focus on external validation and clinical applicability.
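
The comparison described above, a tree-based supervised classifier against a traditional regression model on the same cohort, can be illustrated with a minimal sketch. It is not taken from the review: it assumes scikit-learn and uses synthetic data as a stand-in for routinely collected clinical features.

```python
# Minimal sketch (not from the review): comparing a tree-based model with
# logistic regression on a synthetic "routinely collected" dataset.
# Assumes scikit-learn; feature names and data are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for routine clinical data (e.g., labs, disease activity scores).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUROC = {auc:.3f}")
```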


BMJ ◽  
2020 ◽  
pp. m958 ◽  
Author(s):  
Elham Mahmoudi ◽  
Neil Kamdar ◽  
Noa Kim ◽  
Gabriella Gonzales ◽  
Karandeep Singh ◽  
...  

Abstract Objective To provide a focused evaluation of predictive modeling of electronic medical record (EMR) data to predict 30 day hospital readmission. Design Systematic review. Data sources Ovid Medline, Ovid Embase, CINAHL, Web of Science, and Scopus from January 2015 to January 2019. Eligibility criteria for selecting studies All studies of predictive models for 28 day or 30 day hospital readmission that used EMR data. Outcome measures Characteristics of included studies, methods of prediction, predictive features, and performance of predictive models. Results Of 4442 citations reviewed, 41 studies met the inclusion criteria. Seventeen models predicted risk of readmission for all patients and 24 developed predictions for patient-specific populations, with 13 of those being developed for patients with heart conditions. Except for two studies from the UK and Israel, all were from the US. The total sample size for each model ranged between 349 and 1 195 640. Twenty-five models used a split sample validation technique. Seventeen of 41 studies reported C statistics of 0.75 or greater. Fifteen models used calibration techniques to further refine the model. Using EMR data enabled final predictive models to use a wide variety of clinical measures such as laboratory results and vital signs; however, use of socioeconomic features or functional status was rare. Using natural language processing, three models were able to extract relevant psychosocial features, which substantially improved their predictions. Twenty-six studies used logistic or Cox regression models, and the rest used machine learning methods. No statistically significant difference (difference 0.03, 95% confidence interval −0.0 to 0.07) was found between average C statistics of models developed using regression methods (0.71, 0.68 to 0.73) and machine learning (0.74, 0.71 to 0.77). Conclusions On average, prediction models using EMR data have better predictive performance than those using administrative data. However, this improvement remains modest. Most of the studies examined lacked inclusion of socioeconomic features, failed to calibrate the models, neglected to conduct rigorous diagnostic testing, and did not discuss clinical impact.
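
Two of the recurring methods above, split sample validation with a C statistic and a calibration check, can be sketched briefly. This is not from the review: it assumes scikit-learn, and the EMR-style features are synthetic and illustrative.

```python
# Minimal sketch (not from the review): split-sample validation of a 30 day
# readmission model, reporting the C statistic (AUROC) and a calibration curve.
# Assumes scikit-learn; the synthetic EMR-style features are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

X, y = make_classification(n_samples=5000, n_features=30, n_informative=10,
                           weights=[0.85, 0.15], random_state=1)

# Split-sample validation: derive the model on one part of the cohort, test on the rest.
X_dev, X_val, y_dev, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
pred = model.predict_proba(X_val)[:, 1]

# Discrimination: for a binary outcome, the C statistic equals the ROC AUC.
print(f"C statistic: {roc_auc_score(y_val, pred):.3f}")

# Calibration: compare observed event rates with mean predicted risk per risk decile.
frac_observed, mean_predicted = calibration_curve(y_val, pred, n_bins=10,
                                                  strategy="quantile")
for obs, exp in zip(frac_observed, mean_predicted):
    print(f"predicted {exp:.2f} vs observed {obs:.2f}")
```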


2019 ◽  
Vol 110 ◽  
pp. 12-22 ◽  
Author(s):  
Evangelia Christodoulou ◽  
Jie Ma ◽  
Gary S. Collins ◽  
Ewout W. Steyerberg ◽  
Jan Y. Verbakel ◽  
...  

2019 ◽  
Author(s):  
Herdiantri Sufriyana ◽  
Atina Husnayain ◽  
Ya-Lin Chen ◽  
Chao-Yang Kuo ◽  
Onkar Singh ◽  
...  

BACKGROUND Predictions in pregnancy care are complex because of interactions among multiple factors. Hence, pregnancy outcomes are not easily predicted by a single predictor using only one algorithm or modeling method. OBJECTIVE This study aims to review and compare the predictive performances of logistic regression (LR) and other machine learning algorithms for developing or validating a multivariable prognostic prediction model for pregnancy care to inform clinicians’ decision making. METHODS Research articles from MEDLINE, Scopus, Web of Science, and Google Scholar were reviewed following several guidelines for a prognostic prediction study, including a risk of bias (ROB) assessment. We report the results based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Studies were primarily framed using PICOTS (population, index, comparator, outcomes, timing, and setting): Population: men or women in procreative management, pregnant women, and fetuses or newborns; Index: multivariable prognostic prediction models using non-LR algorithms for risk classification to inform clinicians’ decision making; Comparator: models applying LR; Outcomes: pregnancy-related outcomes of procreation or pregnancy outcomes for pregnant women and fetuses or newborns; Timing: pre-, inter-, and peripregnancy periods (predictors), the pregnancy, delivery, and either puerperal or neonatal period (outcomes), and either short- or long-term prognoses (time interval); and Setting: primary care or hospital. The results were synthesized by reporting study characteristics and ROBs and by random-effects modeling of the difference in the logit area under the receiver operating characteristic curve (AUROC) of each non-LR model compared with the LR model for the same pregnancy outcomes. We also reported between-study heterogeneity using τ² and I². RESULTS Of the 2093 records, we included 142 studies in the systematic review and 62 studies in a meta-analysis. Most prediction models used LR (92/142, 64.8%); artificial neural networks (20/142, 14.1%) were the most common non-LR algorithm. Only 16.9% (24/142) of studies had a low ROB. Two non-LR algorithms from low-ROB studies significantly outperformed LR. The first was random forest, for preterm delivery (logit AUROC 2.51, 95% CI 1.49-3.53; I²=86%; τ²=0.77) and pre-eclampsia (logit AUROC 1.2, 95% CI 0.72-1.67; I²=75%; τ²=0.09). The second was gradient boosting, for cesarean section (logit AUROC 2.26, 95% CI 1.39-3.13; I²=75%; τ²=0.43) and gestational diabetes (logit AUROC 1.03, 95% CI 0.69-1.37; I²=83%; τ²=0.07). CONCLUSIONS The prediction models with the best performances across studies were not necessarily those that used LR; random forest and gradient boosting also performed well. We recommend a reanalysis of existing LR models for several pregnancy outcomes, comparing them with these algorithms while applying standard guidelines. CLINICALTRIAL PROSPERO (International Prospective Register of Systematic Reviews) CRD42019136106; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=136106
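
The synthesis step above, random-effects pooling of per-study differences in logit AUROC with τ² and I² as heterogeneity measures, can be sketched as follows. This is not the review's own code; it assumes the standard DerSimonian-Laird estimator and uses made-up per-study AUROCs and variances purely for illustration.

```python
# Minimal sketch (not from the review): DerSimonian-Laird random-effects pooling
# of per-study differences in logit(AUROC) between a non-LR model and LR,
# with tau^2 and I^2 as heterogeneity measures. The numbers below are made up.
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

# Hypothetical per-study AUROCs and variances of the logit difference.
auroc_nonlr = np.array([0.85, 0.90, 0.78, 0.88])
auroc_lr    = np.array([0.75, 0.80, 0.74, 0.79])
var_diff    = np.array([0.05, 0.08, 0.04, 0.06])  # assumed known per study

effect = logit(auroc_nonlr) - logit(auroc_lr)      # logit AUROC difference
w_fixed = 1.0 / var_diff                           # inverse-variance weights

# DerSimonian-Laird estimate of the between-study variance tau^2.
k = len(effect)
fixed_mean = np.sum(w_fixed * effect) / np.sum(w_fixed)
Q = np.sum(w_fixed * (effect - fixed_mean) ** 2)
C = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - (k - 1)) / C)
I2 = max(0.0, (Q - (k - 1)) / Q) * 100 if Q > 0 else 0.0

# Random-effects pooled estimate and 95% CI.
w_re = 1.0 / (var_diff + tau2)
pooled = np.sum(w_re * effect) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"pooled logit AUROC difference = {pooled:.2f} "
      f"(95% CI {pooled - 1.96*se:.2f} to {pooled + 1.96*se:.2f}); "
      f"tau^2 = {tau2:.2f}, I^2 = {I2:.0f}%")
```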


Author(s):  
Valentina Bellini ◽  
Marina Valente ◽  
Giorgia Bertorelli ◽  
Barbara Pifferi ◽  
Michelangelo Craca ◽  
...  

Abstract Background Risk stratification plays a central role in anesthetic evaluation. The use of Big Data and machine learning (ML) offers considerable advantages for the collection and evaluation of large amounts of complex health-care data. We conducted a systematic review to understand the role of ML in the development of predictive post-surgical outcome models and risk stratification. Methods Following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines, we considered studies published from 1 January 2015 to 30 March 2021. A systematic search in Scopus, CINAHL, the Cochrane Library, PubMed, and MeSH databases was performed; the search strings included different combinations of the keywords "risk prediction," "surgery," "machine learning," "intensive care unit (ICU)," "anesthesia," and "perioperative." We identified 36 eligible studies. The quality of reporting of the prediction models was evaluated using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) checklist. Results The most frequently considered outcomes were mortality risk, systemic complications (pulmonary, cardiovascular, acute kidney injury (AKI), etc.), ICU admission, anesthesiologic risk, and prolonged length of hospital stay. Not all studies completely followed the TRIPOD checklist, but the quality was overall acceptable, with 75% of studies showing an adherence rate to TRIPOD of more than 60%. The most frequently used algorithms were gradient boosting (n = 13), random forest (n = 10), logistic regression (LR; n = 7), artificial neural networks (ANNs; n = 6), and support vector machines (SVM; n = 6). The models with the best performance were random forest and gradient boosting, with AUC > 0.90. Conclusions The application of ML in medicine appears to have great potential. From our analysis, depending on the input features considered and on the specific prediction task, ML algorithms seem to predict outcomes more accurately than validated prognostic scores and traditional statistics. Thus, our review encourages the healthcare domain and artificial intelligence (AI) developers to adopt an interdisciplinary and systemic approach to evaluate the overall impact of AI on perioperative risk assessment and on other health care settings as well.
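
Gradient boosting, the most frequently used algorithm in the studies reviewed above, can be illustrated with a short sketch for a binary post-surgical outcome. This is not from any of the included studies: it assumes scikit-learn, and the outcome, features, and data are synthetic and hypothetical.

```python
# Minimal sketch (not from the review): gradient boosting for a binary
# post-surgical outcome (e.g., ICU admission), evaluated by AUROC on a
# held-out set. Assumes scikit-learn; data and features are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=3000, n_features=25, n_informative=12,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

clf = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                 max_depth=3, random_state=42)
clf.fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"Gradient boosting AUROC: {auc:.3f}")
```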


2021 ◽  
Vol 7 ◽  
pp. 205520762110473
Author(s):  
Kushan De Silva ◽  
Joanne Enticott ◽  
Christopher Barton ◽  
Andrew Forbes ◽  
Sajal Saha ◽  
...  

Objective Machine learning involves the use of algorithms that learn patterns from data without explicit instructions. Of late, machine learning models have been widely applied to the prediction of type 2 diabetes. However, no evidence synthesis of the performance of these type 2 diabetes prediction models is available. We aim to identify machine learning prediction models for type 2 diabetes in clinical and community care settings and determine their predictive performance. Methods A systematic review of English-language machine learning predictive modeling studies in 12 databases will be conducted. Studies predicting type 2 diabetes in predefined clinical or community settings are eligible. Standard CHARMS and TRIPOD guidelines will guide data extraction. Methodological quality will be assessed using a predefined risk of bias assessment tool. The extent of validation will be categorized by Reilly–Evans levels. Primary outcomes include model performance metrics of discrimination ability, calibration, and classification accuracy. Secondary outcomes include candidate predictors, algorithms used, level of validation, and intended use of models. A random-effects meta-analysis of c-indices will be performed to evaluate discrimination abilities. The c-indices will be pooled per prediction model, per model type, and per algorithm. Publication bias will be assessed through funnel plots and regression tests. Sensitivity analysis will be conducted to estimate the effects of study quality and missing data on the primary outcomes. The sources of heterogeneity will be assessed through meta-regression. Subgroup analyses will be performed for primary outcomes. Ethics and dissemination No ethics approval is required, as no primary or personal data are collected. Findings will be disseminated through scientific sessions and peer-reviewed journals. PROSPERO registration number CRD42019130886
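
The planned regression test for funnel plot asymmetry can be sketched with a small example. This is not the protocol's own analysis code: it assumes Egger's regression test as the chosen method and statsmodels as the implementation, and the per-study effects and standard errors are made up for illustration.

```python
# Minimal sketch (not from the protocol): Egger's regression test for funnel
# plot asymmetry, applied to hypothetical logit-transformed c-indices and their
# standard errors. Assumes statsmodels; the inputs are invented for illustration.
import numpy as np
import statsmodels.api as sm

# Hypothetical per-study effects (e.g., logit c-index) and standard errors.
effect = np.array([1.10, 0.95, 1.30, 0.80, 1.05, 1.20])
se     = np.array([0.20, 0.15, 0.30, 0.10, 0.25, 0.18])

# Egger's test: regress the standardized effect on precision (1 / SE);
# an intercept far from zero suggests small-study effects / publication bias.
y = effect / se
x = sm.add_constant(1.0 / se)
fit = sm.OLS(y, x).fit()
intercept, p_value = fit.params[0], fit.pvalues[0]
print(f"Egger intercept = {intercept:.2f}, p = {p_value:.3f}")
```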

