Multivariable Prediction Models for Health Care Spending Using Machine Learning: A Protocol of a Systematic Review

Author(s):  
Andrew W. Huang ◽  
Martin Haslberger ◽  
Neto Coulibaly ◽  
Omar Galárraga ◽  
Arman Oganisian ◽  
...  

Abstract Background With rising cost pressures on health care systems, machine-learning (ML) based algorithms are increasingly used to predict health care costs. Despite their potential advantages, the successful implementation of these methods could be undermined by biases introduced in the design, conduct, or analysis of studies seeking to develop and/or validate ML models. The utility of such models may also be negatively affected by poor reporting of these studies. In this systematic review, we aim to evaluate the reporting quality, methodological characteristics, and risk of bias of ML-based prediction models for individual-level health care spending. Methods We will systematically search PubMed and Embase to identify studies developing, updating, or validating ML-based models to predict an individual’s health care spending for any medical condition, over any time period, and in any setting. We will exclude prediction models of aggregate-level health care spending, models used to infer causality, models using radiomics or speech parameters, models of non-clinically validated predictors (e.g. genomics), and cost-effectiveness analyses without predicting individual-level health care spending. We will extract data based on the CHARMS checklist, previously published research, and relevant recommendations. We will assess the adherence of ML-based studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) and examine the inclusion of transparency and reproducibility indicators (e.g. statements on data sharing). To assess the risk of bias, we will apply the Prediction model Risk Of Bias Assessment Tool (PROBAST). Findings will be stratified by study design, ML methods used, population characteristics, and medical field. Discussion Our systematic review will appraise the quality, reporting, and risk of bias of ML-based models for individualized health care cost prediction. 
This review will provide an overview of the available models and give insights into the strengths and limitations of using ML methods for the prediction of health spending. Trial registration: Not applicable.

BMJ Open ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. e038832
Author(s):  
Constanza L Andaur Navarro ◽  
Johanna A A G Damen ◽  
Toshihiko Takada ◽  
Steven W J Nijman ◽  
Paula Dhiman ◽  
...  

Introduction Studies addressing the development and/or validation of diagnostic and prognostic prediction models are abundant in most clinical domains. Systematic reviews have shown that the methodological and reporting quality of prediction model studies is suboptimal. Due to the increasing availability of larger, routinely collected and complex medical data, and the rising application of Artificial Intelligence (AI) or machine learning (ML) techniques, the number of prediction model studies is expected to increase even further. Prediction models developed using AI or ML techniques are often labelled as a ‘black box’ and little is known about their methodological and reporting quality. Therefore, this comprehensive systematic review aims to evaluate the reporting quality, the methodological conduct, and the risk of bias of prediction model studies that applied ML techniques for model development and/or validation. Methods and analysis A search will be performed in PubMed to identify studies developing and/or validating prediction models using any ML methodology and across all medical fields. Studies will be included if they were published between January 2018 and December 2019, predict patient-related outcomes, use any study design or data source, and are available in English. Screening of search results and data extraction from included articles will be performed by two independent reviewers. The primary outcomes of this systematic review are: (1) the adherence of ML-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD), and (2) the risk of bias in such studies as assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis will be conducted for all included studies. 
Findings will be stratified by study type, medical field and prevalent ML methods, and will inform necessary extensions or updates of TRIPOD and PROBAST to better address prediction model studies that used AI or ML techniques. Ethics and dissemination Ethical approval is not required for this study because only available published data will be analysed. Findings will be disseminated through peer-reviewed publications and scientific conferences. Systematic review registration PROSPERO, CRD42019161764.


Author(s):  
Nghia H Nguyen ◽  
Dominic Picetti ◽  
Parambir S Dulai ◽  
Vipul Jairath ◽  
William J Sandborn ◽  
...  

Abstract Background and Aims There is increasing interest in machine learning-based prediction models in inflammatory bowel diseases (IBD). We synthesized and critically appraised studies comparing machine learning vs. traditional statistical models, using routinely available clinical data for risk prediction in IBD. Methods Through a systematic review up to January 1, 2021, we identified cohort studies that derived and/or validated machine learning models, based on routinely collected clinical data in patients with IBD, to predict the risk of harboring or developing adverse clinical outcomes, and that reported their predictive performance against a traditional statistical model for the same outcome. We appraised the risk of bias in these studies using the Prediction model Risk of Bias ASsessment (PROBAST) tool. Results We included 13 studies on machine learning-based prediction models in IBD, encompassing themes of predicting treatment response to biologics and thiopurines, longitudinal disease activity and complications, and outcomes in patients with acute severe ulcerative colitis. The most common machine learning models used were tree-based algorithms, which are classification approaches achieved through supervised learning. Machine learning models outperformed traditional statistical models in risk prediction. However, most models were at high risk of bias, and only one was externally validated. Conclusions Machine learning-based prediction models based on routinely collected data generally perform better than traditional statistical models in risk prediction in IBD, though they frequently have a high risk of bias. Future studies examining these approaches are warranted, with special focus on external validation and clinical applicability.
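The tree-based algorithms described above build predictions from simple threshold splits learned from labelled training data. As a rough illustration of that supervised-learning idea (not a reconstruction of any reviewed model; the feature and labels are hypothetical), a single-split decision "stump" can be fit in a few lines:

```python
def fit_stump(xs, ys):
    """Fit a one-split decision stump: choose the threshold on a single
    feature (and the side that predicts class 1) minimizing training error.
    Full tree-based models recursively stack many such splits."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (False, True):  # which side of the split predicts 1
            preds = [1 if (x > t) == polarity else 0 for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: 1 if (x > t) == polarity else 0

# Hypothetical biomarker values and binary outcomes
predict = fit_stump([1, 2, 3, 4], [0, 0, 1, 1])
```

Here the stump learns the split "biomarker > 2 predicts the adverse outcome"; real gradient-boosted or random-forest models combine hundreds of such splits across many features.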


2021 ◽  
Author(s):  
Jamie L. Miller ◽  
Masafumi Tada ◽  
Michihiko Goto ◽  
Nicholas Mohr ◽  
Sangil Lee

ABSTRACT Background Throughout 2020, the coronavirus disease 2019 (COVID-19) has become a threat to public health on a national and global level. There has been an immediate need for research to understand the clinical signs and symptoms of COVID-19 that can help predict deterioration, including mechanical ventilation, organ support, and death. Studies thus far have addressed the epidemiology of the disease, common presentations, and susceptibility to acquisition and transmission of the virus; however, an accurate prognostic model for severe manifestations of COVID-19 is still needed because of the limited healthcare resources available. Objective This systematic review aims to evaluate published reports of prediction models for severe illness caused by COVID-19. Methods Searches were developed by the primary author and a medical librarian using an iterative process of gathering and evaluating terms. Comprehensive strategies, including both index and keyword methods, were devised for PubMed and EMBASE. The data of confirmed COVID-19 patients from randomized controlled studies, cohort studies, and case-control studies published between January 2020 and July 2020 were retrieved. Studies were independently assessed for risk of bias and applicability using the Prediction Model Risk Of Bias Assessment Tool (PROBAST). We collected study type, setting, sample size, type of validation, and outcome, including intubation, ventilation, any other type of organ support, or death. The combination of the prediction model, scoring system, performance of predictive models, and geographic locations were summarized. Results A primary review found 292 articles relevant based on title and abstract. After further review, 246 were excluded based on the defined inclusion and exclusion criteria. Forty-six articles were included in the qualitative analysis. Interobserver agreement on inclusion was 0.86 (95% confidence interval: 0.79–0.93). 
When the PROBAST tool was applied, 44 of the 46 articles were identified as having high or unclear risk of bias, or high or unclear concern for applicability. Two studies reported prediction models, the 4C Mortality Score (from hospital data) and QCOVID (from general-population data in the UK), that were rated as having low risk of bias and low concern for applicability. Conclusion Several prognostic models are reported in the literature, but many of them have concerning risk of bias and applicability. For most of the studies, caution is needed before use, as many will require external validation before dissemination. However, the two models found to have low risk of bias and low concern for applicability may be useful tools.


2021 ◽  
Vol 79 (1) ◽  
Author(s):  
Médéa Locquet ◽  
Anh Nguyet Diep ◽  
Charlotte Beaudart ◽  
Nadia Dardenne ◽  
Christian Brabant ◽  
...  

Abstract Background The COVID-19 pandemic is putting significant pressure on the hospital system. To help clinicians in the rapid triage of patients at high risk of COVID-19 while waiting for RT-PCR results, different diagnostic prediction models have been developed. Our objective is to identify, compare, and evaluate the performance of prediction models for the diagnosis of COVID-19 in adult patients in a health care setting. Methods A search for relevant references was conducted on the MEDLINE and Scopus databases. Rigorous eligibility criteria were established (e.g., adult participants, suspicion of COVID-19, medical setting) and applied by two independent investigators to identify suitable studies at two different stages: (1) titles and abstracts screening and (2) full-text screening. Risk of bias (RoB) was assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Data synthesis is presented as a narrative report of findings. Results Out of the 2334 references identified by the literature search, 13 articles were included in our systematic review. The studies, carried out all over the world, were performed in 2020. The included articles proposed models developed using different methods, namely logistic regression, scoring systems, machine learning, and XGBoost. All the included models performed well in discriminating adults at high risk of presenting with COVID-19 (all area under the ROC curve (AUROC) > 0.500). The best AUROC was observed for the model of Kurstjens et al (AUROC = 0.940; 95% CI 0.910–0.960), which was also the model that achieved the highest sensitivity (98%). RoB was evaluated as low in general. Conclusion Thirteen models have been developed since the start of the pandemic to diagnose COVID-19 in suspected patients from health care centers. All these models are effective, to varying degrees, in identifying whether patients are at high risk of having COVID-19.
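The AUROC values compared above summarize discrimination: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case (0.5 is chance, 1.0 is perfect separation). A minimal rank-based (Mann–Whitney) computation, with entirely hypothetical scores and labels, illustrates the metric:

```python
def auroc(labels, scores):
    """Rank-based AUROC: the fraction of (positive, negative) pairs in which
    the positive case is scored higher (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for two COVID-positive (1) and two negative (0) patients
area = auroc([1, 0, 1, 0], [0.8, 0.8, 0.6, 0.2])  # 0.625: modest discrimination
```

This pairwise definition is why any AUROC above 0.500 indicates better-than-chance discrimination, as the review notes for all included models.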


Medicina ◽  
2021 ◽  
Vol 57 (6) ◽  
pp. 538
Author(s):  
Alexandru Burlacu ◽  
Adrian Iftene ◽  
Iolanda Valentina Popa ◽  
Radu Crisan-Dabija ◽  
Crischentian Brinza ◽  
...  

Background and objectives: cardiovascular complications (CVC) are the leading cause of death in patients with chronic kidney disease (CKD). Standard cardiovascular disease risk prediction models used in the general population are not validated in patients with CKD. We aim to systematically review the up-to-date literature on reported outcomes of computational methods such as artificial intelligence (AI) or regression-based models to predict CVC in CKD patients. Materials and methods: the electronic databases of MEDLINE/PubMed, EMBASE, and ScienceDirect were systematically searched. The risk of bias and reporting quality for each study were assessed against the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement and the prediction model risk of bias assessment tool (PROBAST). Results: sixteen papers were included in the present systematic review: 15 non-randomized studies and 1 ongoing clinical trial. Twelve studies performed AI- or regression-based predictions of CVC in CKD, through either single or composite endpoints. Four studies proposed computational solutions for other CV-related predictions in the CKD population. Conclusions: the identified studies represent palpable trends in areas of clinical promise with encouraging present-day performance. However, there is a clear need for more extensive application of rigorous methodologies. Following future prospective randomized clinical trials and thorough external validation, computational solutions may fill the gap in cardiovascular predictive tools for chronic kidney disease.


2021 ◽  
Vol 9 (1) ◽  
pp. 232596712096809
Author(s):  
Lauren N. Bockhorn ◽  
Angelina M. Vera ◽  
David Dong ◽  
Domenica A. Delgado ◽  
Kevin E. Varner ◽  
...  

Background: The Beighton score is commonly used to assess the degree of hypermobility in patients with hypermobility spectrum disorder. Since proper diagnosis and treatment in this challenging patient population require valid, reliable, and responsive clinical assessments such as the Beighton score, studies must properly evaluate efficacy and effectiveness. Purpose: To systematically determine the inter- and intrarater reliability of the Beighton score and the methodological quality of all analyzed studies for use in clinical applications. Study Design: Systematic review; Level of evidence, 3. Methods: A systematic review of the MEDLINE, Embase, CINAHL, and SPORTDiscus databases was performed. Studies that measured inter- or intrarater reliability of the Beighton score in humans with and without hypermobility were included. Non-English, animal, cadaveric, level 5 evidence, and studies utilizing the Beighton score self-assessment version were excluded. Data were extracted to compare scoring methods, population characteristics, and measurements of inter- and intrarater reliability. Risk of bias was assessed with the COSMIN (Consensus-Based Standards for the Selection of Health Measurement Instruments) 2017 checklist. Results: Twenty-four studies were analyzed (1333 patients; mean ± SD age, 28.19 ± 17.34 years [range, 4-71 years]; 640 females, 594 males, 273 unknown sex). Of the 24 studies, 18 reported that the raters were health care professionals or health care professional students. For interrater reliability, 5 of 8 (62.5%) intraclass correlation coefficients and 12 of 19 (63.2%) kappa values were substantial to almost perfect. Intrarater reliability was reported as excellent in all studies utilizing intraclass correlation coefficients, and 3 of the 7 articles using kappa values reported almost perfect values. 
Utilizing the COSMIN criteria, we determined that 1 study met “very good” criteria, 7 met “adequate,” 15 met “doubtful,” and 1 met “inadequate” for overall risk of bias in the reliability domain. Conclusion: The Beighton score is a highly reliable clinical tool that shows substantial to excellent inter- and intrarater reliability when used by raters of variable backgrounds and experience levels. While individual risk-of-bias components varied widely across studies, most items were rated adequate to very good.
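The kappa values reported above measure agreement between raters corrected for the agreement expected by chance (unlike raw percent agreement). A minimal Cohen's kappa for two raters, with hypothetical ratings, shows the calculation:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: observed agreement p_o corrected for chance
    agreement p_e, as (p_o - p_e) / (1 - p_e)."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two raters scoring four hypothetical patients as hypermobile (1) or not (0):
# they agree on 3 of 4 (75%), but kappa is only 0.5 after chance correction.
k = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

On the conventional Landis–Koch scale, 0.61–0.80 is "substantial" and 0.81–1.00 "almost perfect", which is the sense in which the review labels its kappa values.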


Author(s):  
Valentina Bellini ◽  
Marina Valente ◽  
Giorgia Bertorelli ◽  
Barbara Pifferi ◽  
Michelangelo Craca ◽  
...  

Abstract Background Risk stratification plays a central role in anesthetic evaluation. The use of Big Data and machine learning (ML) offers considerable advantages for the collection and evaluation of large amounts of complex health-care data. We conducted a systematic review to understand the role of ML in the development of predictive post-surgical outcome models and risk stratification. Methods Following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines, we searched for studies published from 1 January 2015 to 30 March 2021. A systematic search in Scopus, CINAHL, the Cochrane Library, PubMed, and MeSH databases was performed; the search strings included different combinations of the keywords “risk prediction,” “surgery,” “machine learning,” “intensive care unit (ICU),” “anesthesia,” and “perioperative.” We identified 36 eligible studies. We evaluated the quality of reporting of prediction models using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) checklist. Results The most frequently considered outcomes were mortality risk, systemic complications (pulmonary, cardiovascular, acute kidney injury (AKI), etc.), ICU admission, anesthesiologic risk, and prolonged length of hospital stay. Not all studies completely followed the TRIPOD checklist, but quality was overall acceptable, with 75% of studies showing an adherence rate to TRIPOD of more than 60%. The most frequently used algorithms were gradient boosting (n = 13), random forest (n = 10), logistic regression (LR; n = 7), artificial neural networks (ANNs; n = 6), and support vector machines (SVM; n = 6). The models with the best performance were random forest and gradient boosting, with AUC > 0.90. Conclusions The application of ML in medicine appears to have great potential. 
From our analysis, depending on the input features considered and on the specific prediction task, ML algorithms seem to predict outcomes more accurately than validated prognostic scores and traditional statistics. Thus, our review encourages the healthcare domain and artificial intelligence (AI) developers to adopt an interdisciplinary and systemic approach to evaluate the overall impact of AI on perioperative risk assessment and on further health care settings as well.


BMJ ◽  
2020 ◽  
pp. m1328 ◽  
Author(s):  
Laure Wynants ◽  
Ben Van Calster ◽  
Gary S Collins ◽  
Richard D Riley ◽  
Georg Heinze ◽  
...  

Abstract Objective To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at increased risk of becoming infected with covid-19 or being admitted to hospital with the disease. Design Living systematic review and critical appraisal by the COVID-PRECISE (Precise Risk Estimation to optimise covid-19 Care for Infected or Suspected patients in diverse sEttings) group. Data sources PubMed and Embase through Ovid, arXiv, medRxiv, and bioRxiv up to 5 May 2020. Study selection Studies that developed or validated a multivariable covid-19 related prediction model. Data extraction At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). Results 14 217 titles were screened, and 107 studies describing 145 prediction models were included. The review identified four models for identifying people at risk in the general population; 91 diagnostic models for detecting covid-19 (60 were based on medical imaging, nine to diagnose disease severity); and 50 prognostic models for predicting mortality risk, progression to severe disease, intensive care unit admission, ventilation, intubation, or length of hospital stay. The most frequently reported predictors of diagnosis and prognosis of covid-19 are age, body temperature, lymphocyte count, and lung imaging features. Flu-like symptoms and neutrophil count are frequently predictive in diagnostic models, while comorbidities, sex, C reactive protein, and creatinine are frequent prognostic factors. 
C index estimates ranged from 0.73 to 0.81 in prediction models for the general population, from 0.65 to more than 0.99 in diagnostic models, and from 0.68 to 0.99 in prognostic models. All models were rated at high risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, high risk of model overfitting, and vague reporting. Most reports did not include any description of the study population or intended use of the models, and calibration of the model predictions was rarely assessed. Conclusion Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that proposed models are poorly reported, at high risk of bias, and their reported performance is probably optimistic. Hence, we do not recommend any of these reported prediction models for use in current practice. Immediate sharing of well documented individual participant data from covid-19 studies and collaboration are urgently needed to develop more rigorous prediction models, and validate promising ones. The predictors identified in included models should be considered as candidate predictors for new models. Methodological guidance should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, studies should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline. Systematic review registration Protocol https://osf.io/ehc47/ , registration https://osf.io/wy245 . Readers’ note This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. 
This version is update 2 of the original article published on 7 April 2020 ( BMJ 2020;369:m1328), and previous updates can be found as data supplements ( https://www.bmj.com/content/369/bmj.m1328/related#datasupp ).


2020 ◽  
Author(s):  
Jingyu Zhong ◽  
Liping Si ◽  
Guangcheng Zhang ◽  
Jiayu Huo ◽  
Yue Xing ◽  
...  

Abstract Background: Osteoarthritis is the most common degenerative joint disease diagnosed in clinical practice. It is associated with significant socioeconomic burden and poor quality of life, a large proportion of which is due to knee osteoarthritis (KOA), mainly driven by total knee arthroplasty (TKA). Given the difficulty of early detection and the lack of disease-modifying drugs, the focus in KOA is shifting to disease prevention and treatment to delay its rapid progression. Thus, prognostic prediction models are called for to stratify individuals and guide clinical decision making. The aim of our review is to identify and characterize reported multivariable prognostic models for KOA that address three clinical concerns: (1) the risk of developing KOA in the general population; (2) the risk of receiving TKA in KOA patients; and (3) the outcome of TKA in KOA patients who plan to receive TKA. Methods: Studies will be identified by searching seven electronic databases. Title and abstract screening and full-text review will be accomplished by two independent reviewers. A data extraction instrument and a critical appraisal instrument will be developed before formal assessment and will be modified during a training phase in advance. Study reporting transparency, methodological quality, and risk of bias will be assessed according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS), and the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Prognostic prediction models will be summarized qualitatively. Quantitative metrics on the predictive performance of these models will be synthesized with meta-analyses if appropriate. Discussion: Our systematic review will collate evidence from prognostic prediction models that can be used throughout the whole process of KOA. 
The review may identify models capable of allowing personalized preventative and therapeutic interventions to be precisely targeted at those individuals at the highest risk. To help these prediction models cross the translational gap between an exploratory research method and a valued addition to precision medicine workflows, research recommendations relating to model development, validation, or impact assessment will be made. Systematic review registration: PROSPERO (registered, waiting for assessment, ID 203543)


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Constanza L. Andaur Navarro ◽  
Johanna A. A. Damen ◽  
Toshihiko Takada ◽  
Steven W. J. Nijman ◽  
Paula Dhiman ◽  
...  

Abstract Background While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. We aim to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. Methods We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields. We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item. Results Our search identified 24,814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0–46.4%) of TRIPOD items. No article fully adhered to complete reporting of the abstract, and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model’s predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3). Conclusion As with prediction model studies developed using conventional regression-based techniques, completeness of reporting is poor. Essential information needed to decide whether to use the model (i.e. model specification and its performance) is rarely reported. 
However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste. Systematic review registration PROSPERO, CRD42019161764.

