Evaluating the impact of covariate lookback times on performance of patient-level prediction models

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jill Hardin ◽  
Jenna M. Reps

Abstract Background The goal of our study is to examine the impact of lookback length when engineering features for developing predictive models from observational healthcare data. A longer lookback for feature engineering gives more insight about patients but increases the issue of left-censoring. Methods We used five US observational databases to develop patient-level prediction models. We developed a target cohort of subjects with hypertensive drug exposures and outcome cohorts of subjects with acute outcomes (stroke and gastrointestinal bleeding) and chronic outcomes (diabetes and chronic kidney disease). Candidate predictors occurring on or prior to the target index date were derived within the following lookback periods: 14, 30, 90, 180, 365, 730, and all days prior to index. We predicted the risk of outcomes occurring from 1 day to 365 days after index. Ten lasso logistic models were generated for each lookback period to create a distribution of area under the curve (AUC) metrics for evaluating the discriminative performance of the models. Calibration intercept and slope were also calculated. Impact on external validation performance was investigated across the five databases. Results The maximum difference in AUC between models developed using different lookback periods within a database was < 0.04 for diabetes (in MDCR, AUC of 0.593 with a 14-day lookback vs. 0.631 with an all-time lookback) and 0.012 for renal impairment (in MDCR, AUC of 0.675 with a 30-day lookback vs. 0.687 with a 365-day lookback). For the acute outcomes, the maximum difference in AUC across lookbacks within a database was 0.015 for stroke (in MDCD, AUC of 0.767 with a 14-day lookback vs. 0.782 with a 365-day lookback) and < 0.03 for gastrointestinal bleeding (in CCAE, AUC of 0.631 with a 14-day lookback vs. 0.660 with a 730-day lookback). Conclusions In general, the choice of covariate lookback had only a small impact on discrimination and calibration, with a short lookback (< 180 days) occasionally decreasing discrimination. Based on these results, when training a logistic regression model for prediction, a 365-day covariate lookback appears to be a good tradeoff between performance and interpretability.
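
For readers who want a concrete picture of the workflow described in this abstract, the following is a minimal Python sketch (not the authors' OHDSI/PatientLevelPrediction code) of lookback-window feature construction, lasso-penalised logistic regression, and AUC plus a simple calibration intercept/slope check. The table layout and column names (person_id, concept_id, record_date, index_date) are hypothetical placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def build_lookback_features(records, cohort, lookback_days):
    """One binary covariate per clinical concept observed within the lookback window before index."""
    joined = records.merge(cohort[["person_id", "index_date"]], on="person_id")
    days_before_index = (joined["index_date"] - joined["record_date"]).dt.days
    in_window = joined[(days_before_index >= 0) & (days_before_index <= lookback_days)]
    features = pd.crosstab(in_window["person_id"], in_window["concept_id"]).clip(upper=1)
    return features.reindex(cohort["person_id"], fill_value=0)

def fit_and_evaluate(X_train, y_train, X_test, y_test):
    model = LogisticRegression(penalty="l1", solver="liblinear")  # lasso-penalised logistic regression
    model.fit(X_train, y_train)
    p = np.clip(model.predict_proba(X_test)[:, 1], 1e-6, 1 - 1e-6)
    auc = roc_auc_score(y_test, p)
    # Rough calibration check: regress the observed outcome on the log-odds of the predictions.
    log_odds = np.log(p / (1 - p)).reshape(-1, 1)
    recal = LogisticRegression(C=1e9).fit(log_odds, y_test)  # effectively unpenalised
    return auc, recal.intercept_[0], recal.coef_[0, 0]
```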

2021 ◽  
Author(s):  
Jill Hardin ◽  
Jenna M. Reps

Abstract Background: The goal of our study is to provide guidance on which lookback length to use when engineering features for developing predictive models from observational healthcare data. A longer lookback for feature engineering gives more insight about patients but increases the issue of left-censoring. Methods: We used five US observational databases to develop patient-level prediction models. We developed a target cohort of subjects with hypertensive drug exposures and outcome cohorts of subjects with acute outcomes (stroke and gastrointestinal bleeding) and chronic outcomes (diabetes and chronic kidney disease). Candidate predictors occurring on or prior to the target index date were derived within the following lookback periods: 14, 30, 90, 180, 365, 730, and all days prior to index. We predicted the risk of outcomes occurring from 1 day to 365 days after index. Ten lasso logistic models were generated for each lookback period to create a distribution of area under the curve (AUC) metrics for evaluating the discriminative performance of the models. Impact on external validation performance was investigated across the five databases. Results: For the acute outcomes, stroke and gastrointestinal bleeding, a shorter lookback time resulted in performance equivalent to that of longer lookback times. For the chronic outcomes, diabetes and renal impairment, a lookback time of at least 365 days resulted in performance equivalent to that of lookback times greater than 365 days. Conclusions: Our results suggest that optimal model performance and the choice of lookback length depend on the outcome type (acute or chronic). Our study illustrates that using a lookback of at least 365 days yields performance equivalent to that of longer lookback periods. Researchers should evaluate lookback length in the context of the prediction question to determine optimal model performance.
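
A hypothetical sketch of how the per-lookback AUC distributions behind these equivalence claims could be produced, reusing the build_lookback_features and fit_and_evaluate helpers sketched after the previous abstract; the record, cohort and outcome inputs are again placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

LOOKBACKS = [14, 30, 90, 180, 365, 730, None]  # None stands for "all time prior to index"

def auc_distribution(records, cohort, outcome, lookback_days, n_models=10):
    """Train several models on resampled splits for one lookback setting and summarise the AUCs."""
    window = lookback_days if lookback_days is not None else 10**6
    X = build_lookback_features(records, cohort, window).values
    aucs = []
    for seed in range(n_models):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, outcome, test_size=0.25, random_state=seed, stratify=outcome)
        auc, _, _ = fit_and_evaluate(X_tr, y_tr, X_te, y_te)
        aucs.append(auc)
    return np.mean(aucs), np.std(aucs)

# results = {lb: auc_distribution(records, cohort, outcome, lb) for lb in LOOKBACKS}
```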


2020 ◽  
Author(s):  
Jenna Marie Reps ◽  
Ross Williams ◽  
Seng Chan You ◽  
Thomas Falconer ◽  
Evan Minty ◽  
...  

Abstract Objective: To demonstrate how the Observational Health Data Sciences and Informatics (OHDSI) collaborative network and standardization can be utilized to scale up external validation of patient-level prediction models by enabling validation across a large number of heterogeneous observational healthcare datasets. Materials & Methods: Five previously published prognostic models (ATRIA, CHADS2, CHA2DS2-VASc, Q-Stroke and Framingham) that predict future risk of stroke in patients with atrial fibrillation were replicated using the OHDSI frameworks. A network study was run that enabled the five models to be externally validated across nine observational healthcare datasets spanning three countries and five independent sites. Results: The five existing models were integrated into the OHDSI framework for patient-level prediction and obtained mean c-statistics ranging between 0.57 and 0.63 across the six databases with sufficient data to predict stroke within 1 year of initial atrial fibrillation diagnosis in females with atrial fibrillation. This was comparable with existing validation studies. The validation network study was run across nine datasets within 60 days once the models were replicated. An R package for the study was published at https://github.com/OHDSI/StudyProtocolSandbox/tree/master/ExistingStrokeRiskExternalValidation. Discussion: This study demonstrates the ability to scale up external validation of patient-level prediction models using a collaboration of researchers and a data standardization that enables models to be readily shared across data sites. External validation is necessary to understand the transportability or reproducibility of a prediction model, but without collaborative approaches it can take three or more years for a model to be validated by one independent researcher. Conclusion: In this paper we show it is possible to both scale up and speed up external validation by showing how validation can be done across multiple databases in less than 2 months. We recommend that researchers developing new prediction models use the OHDSI network to externally validate their models.
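
As an illustration of what externally validating one of these existing scores involves, here is a small Python sketch that applies the standard CHADS2 point scheme unchanged to a new dataset and measures the c-statistic; the column names are hypothetical placeholders and this is not the published R package.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def chads2(df: pd.DataFrame) -> pd.Series:
    """CHADS2: 1 point each for CHF, hypertension, age >= 75 and diabetes; 2 points for prior stroke/TIA."""
    return (df["chf"].astype(int)
            + df["hypertension"].astype(int)
            + (df["age"] >= 75).astype(int)
            + df["diabetes"].astype(int)
            + 2 * df["prior_stroke_or_tia"].astype(int))

def external_c_statistic(df: pd.DataFrame) -> float:
    # External validation applies the published score unchanged (no refitting)
    # and measures discrimination against the observed 1-year stroke outcome.
    return roc_auc_score(df["stroke_within_1y"], chads2(df))
```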


2020 ◽  
Author(s):  
Jenna Marie Reps ◽  
Ross D Williams ◽  
Seng Chan You ◽  
Thomas Falconer ◽  
Evan Minty ◽  
...  

Abstract Background To demonstrate how the Observational Health Data Sciences and Informatics (OHDSI) collaborative network and standardization can be utilized to scale up external validation of patient-level prediction models by enabling validation across a large number of heterogeneous observational healthcare datasets. Methods Five previously published prognostic models (ATRIA, CHADS2, CHA2DS2-VASc, Q-Stroke and Framingham) that predict future risk of stroke in patients with atrial fibrillation were replicated using the OHDSI frameworks. A network study was run that enabled the five models to be externally validated across nine observational healthcare datasets spanning three countries and five independent sites. Results The five existing models were integrated into the OHDSI framework for patient-level prediction and obtained mean c-statistics ranging between 0.57 and 0.63 across the six databases with sufficient data to predict stroke within 1 year of initial atrial fibrillation diagnosis in females with atrial fibrillation. This was comparable with existing validation studies. The validation network study was run across nine datasets within 60 days once the models were replicated. An R package for the study was published at https://github.com/OHDSI/StudyProtocolSandbox/tree/master/ExistingStrokeRiskExternalValidation. Conclusion This study demonstrates the ability to scale up external validation of patient-level prediction models using a collaboration of researchers and a data standardization that enables models to be readily shared across data sites. External validation is necessary to understand the transportability or reproducibility of a prediction model, but without collaborative approaches it can take three or more years for a model to be validated by one independent researcher. In this paper we show it is possible to both scale up and speed up external validation by showing how validation can be done across multiple databases in less than 2 months. We recommend that researchers developing new prediction models use the OHDSI network to externally validate their models.
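
The network-study aspect can be pictured as running one fixed validation routine over many standardized databases and collecting the c-statistics. The sketch below is a generic Python illustration with placeholder database names and assumed load_cohort/validate_model helpers (validate_model could be the CHADS2 check sketched earlier); it is not the OHDSI tooling itself.

```python
def run_network_validation(database_names, load_cohort, validate_model):
    """Run the same external validation routine against each standardized database."""
    c_statistics = {}
    for name in database_names:
        cohort = load_cohort(name)              # extract the target population from one database
        if cohort is None or len(cohort) == 0:  # skip databases without sufficient data
            continue
        c_statistics[name] = validate_model(cohort)  # existing model applied unchanged, no refitting
    return c_statistics

# Example usage with placeholder database names:
# results = run_network_validation(["claims_db_a", "ehr_db_b", "claims_db_c"],
#                                  load_cohort, external_c_statistic)
```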


2019 ◽  
Author(s):  
Jenna Marie Reps ◽  
Ross Williams ◽  
Seng Chan You ◽  
Thomas Falconer ◽  
Evan Minty ◽  
...  

Abstract Objective To demonstrate how the Observational Health Data Sciences and Informatics (OHDSI) collaborative network and standardization can be utilized to externally validate patient-level prediction models at scale. Materials & Methods Five previously published prognostic models (ATRIA, CHADS2, CHA2DS2-VASc, Q-Stroke and Framingham) that predict future risk of stroke in patients with atrial fibrillation were replicated using the OHDSI frameworks, and a network study was run that enabled the five models to be externally validated across nine datasets spanning three countries and five independent sites. Results The five existing models were integrated into the OHDSI framework for patient-level prediction, and their performances in predicting stroke within 1 year of initial atrial fibrillation diagnosis for females were comparable with existing studies. The validation network study took 60 days once the models were replicated, and an R package for the study was published to collaborators at https://github.com/OHDSI/StudyProtocolSandbox/tree/master/ExistingStrokeRiskExternalValidation. Discussion This study demonstrates the ability to scale up external validation of patient-level prediction models using a collaboration of researchers and data standardization that enables models to be readily shared across data sites. External validation is necessary to understand the transportability and reproducibility of prediction models, but without collaborative approaches it can take three or more years for a model to be validated by one independent researcher. Conclusion In this paper we show it is possible to both scale up and speed up external validation by showing how validation can be done across multiple databases in less than 2 months.


2021 ◽  
Vol 10 (1) ◽  
pp. 93
Author(s):  
Mahdieh Montazeri ◽  
Ali Afraz ◽  
Mitra Montazeri ◽  
Sadegh Nejatzadeh ◽  
Fatemeh Rahimi ◽  
...  

Introduction: Our aim in this study was to summarize information on the use of intelligent models for predicting and diagnosing coronavirus disease 2019 (COVID-19) to support early and timely diagnosis of the disease. Material and Methods: A systematic literature search included articles published up to 20 April 2020 in the PubMed, Web of Science, IEEE, ProQuest, Scopus, bioRxiv, and medRxiv databases. The search strategy consisted of two groups of keywords: A) novel coronavirus, B) machine learning. Two reviewers independently assessed original papers to determine eligibility for inclusion in this review. Studies were critically appraised for risk of bias using the prediction model risk of bias assessment tool. Results: We gathered 1650 articles through the database searches. After full-text assessment, 31 articles were included. Neural networks and deep neural network variants were the most popular types of machine learning model. Of the five models that the authors claimed were externally validated, we considered only four to be truly externally validated. The area under the curve (AUC) in internal validation of prognostic models varied from 0.94 to 0.97. AUC in diagnostic models varied from 0.84 to 0.99, and AUC in external validation of diagnostic models varied from 0.73 to 0.94. Our analysis found that all but two studies had a high risk of bias, for reasons such as a small number of participants and lack of external validation. Conclusion: Diagnostic and prognostic models for COVID-19 show good to excellent discriminative performance. However, these models are at high risk of bias, for reasons such as a small number of participants and lack of external validation. Future studies should address these concerns. Sharing data and experiences for the development, validation, and updating of COVID-19 related prediction models is needed.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Espen Jimenez-Solem ◽  
Tonny S. Petersen ◽  
Casper Hansen ◽  
Christian Hansen ◽  
Christina Lioma ◽  
...  

Abstract Patients with severe COVID-19 have overwhelmed healthcare systems worldwide. We hypothesized that machine learning (ML) models could be used to predict risks at different stages of management and thereby provide insights into drivers and prognostic markers of disease progression and death. From a cohort of approximately 2.6 million citizens in Denmark, SARS-CoV-2 PCR tests were performed on subjects suspected of COVID-19; 3944 cases had at least one positive test and were subjected to further analysis. SARS-CoV-2 positive cases from the United Kingdom Biobank were used for external validation. The ML models predicted the risk of death with a receiver operating characteristic area under the curve (ROC-AUC) of 0.906 at diagnosis, 0.818 at hospital admission, and 0.721 at intensive care unit (ICU) admission. Similar metrics were achieved for the predicted risks of hospital and ICU admission and use of mechanical ventilation. Common risk factors included age, body mass index and hypertension, although the top risk features shifted towards markers of shock and organ dysfunction in ICU patients. The external validation indicated fair predictive performance for mortality prediction but suboptimal performance for predicting ICU admission. ML may be used to identify drivers of progression to more severe disease and for prognostication in patients with COVID-19. We provide access to an online risk calculator based on these findings.
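
The abstract does not specify which ML algorithms were used, so the sketch below illustrates the general staged-modelling idea with a scikit-learn gradient-boosting classifier as a stand-in, plus permutation importance to mimic the reported shift in top risk features across stages; all stage labels, feature matrices and outcomes are hypothetical.

```python
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def staged_mortality_models(stage_data):
    """stage_data maps a care stage (e.g. 'diagnosis', 'hospital', 'icu') to a feature matrix X and labels y."""
    results = {}
    for stage, (X, y) in stage_data.items():
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
        model = HistGradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        # Permutation importance gives a rough view of which features dominate at each stage.
        imp = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=0)
        results[stage] = (auc, imp.importances_mean)
    return results
```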


Circulation ◽  
2019 ◽  
Vol 140 (Suppl_2) ◽  
Author(s):  
Marinos Kosmopoulos ◽  
Jason A Bartos ◽  
Demetris Yannopoulos

Introduction: Veno-arterial extracorporeal membrane oxygenation (VA ECMO) has emerged as a prominent tool for management of patients with inability to wean off cardiopulmonary bypass (IWOCB), extracorporeal cardiopulmonary resuscitation (eCPR) or refractory cardiogenic shock (RCS). The high mortality still associated with these conditions underscores the need for reliable models predicting mortality after cannulation. The Survival After VA ECMO (SAVE) score is one of the most widely used prediction tools and the only model with external validation; however, its predictive value is still under debate. Hypothesis: VA ECMO indication affects the predictive value of the SAVE score. Methods: 317 patients treated with VA ECMO in a quaternary center (n=52 for IWOCB, n=179 for eCPR and n=86 for RCS) were retrospectively assessed for differences in SAVE score and their primary outcomes. The receiver operating characteristic (ROC) curve for the SAVE score and mortality was calculated separately for each VA ECMO indication. Results: The three groups had significant differences in SAVE score (p<0.01) without significant differences in mortality (p=0.176). ROC curve calculation indicated significant differences in the predictive value of the SAVE score for survival among the different indications (area under the curve = 81.69% for IWOCB, 53.79% for eCPR and 69.46% for RCS). Conclusion: VA ECMO indication markedly affects the predictive value of the SAVE score. Prediction of the primary outcome in IWOCB patients was reliable. By contrast, routine application for survival estimation in eCPR patients is not supported by our results.
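
A minimal sketch of the per-indication analysis described here: a precomputed SAVE score is evaluated separately within each VA ECMO indication group, giving one area under the curve per group. Column names are hypothetical placeholders, not the authors' dataset.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def auc_by_indication(df: pd.DataFrame) -> pd.Series:
    """AUC of the (precomputed) SAVE score for predicting survival within each VA ECMO indication group."""
    # Expected columns: 'indication' in {'IWOCB', 'eCPR', 'RCS'}, 'save_score', 'survived' (0/1).
    return df.groupby("indication").apply(
        lambda g: roc_auc_score(g["survived"], g["save_score"]))
```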


2021 ◽  
Author(s):  
Eunsaem Lee ◽  
Se Young Jung ◽  
Hyung Ju Hwang ◽  
Jaewoo Jung

BACKGROUND Nationwide population-based cohorts provide a new opportunity to build automated risk prediction models at the patient level, and claims data are one of the more useful resources to this end. To avoid unnecessary diagnostic intervention after cancer screening tests, patient-level prediction models should be developed. OBJECTIVE We aimed to develop cancer prediction models using nationwide claims databases and machine learning algorithms that are explainable and easily applicable in real-world environments. METHODS As source data, we used the Korean National Insurance System Database. Every Korean aged ≥40 years undergoes a national health checkup every 2 years. We gathered all variables from the database, including demographic information, basic laboratory values, anthropometric values, and previous medical history. We applied conventional logistic regression, light gradient boosting, neural networks, survival analysis, and a one-class embedding classifier based on deep learning anomaly detection to effectively analyze high-dimensional data. Performance was measured with the area under the curve and the area under the precision-recall curve. We validated our models externally with a health checkup database from a tertiary hospital. RESULTS The one-class embedding classifier model achieved the highest area under the curve scores, with values of 0.868, 0.849, 0.798, 0.746, 0.800, 0.749, and 0.790 for liver, lung, colorectal, pancreatic, gastric, breast, and cervical cancers, respectively. For the area under the precision-recall curve, the light gradient boosting models had the highest scores, with values of 0.383, 0.401, 0.387, 0.300, 0.385, 0.357, and 0.296 for liver, lung, colorectal, pancreatic, gastric, breast, and cervical cancers, respectively. CONCLUSIONS Our results show that it is possible to develop applicable cancer prediction models with nationwide claims data using machine learning. The 7 models showed acceptable performance and explainability and thus can be deployed easily in real-world environments.
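
To make the two reported metrics concrete, the sketch below evaluates a single per-cancer model with both ROC-AUC and area under the precision-recall curve; scikit-learn's histogram gradient booster is used here only as a stand-in for the paper's light gradient boosting models, and the features and labels are hypothetical.

```python
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

def evaluate_cancer_model(X, y):
    """Report both metrics used above; the precision-recall metric is far more sensitive to rare outcomes."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
    model = HistGradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]
    return {"auc": roc_auc_score(y_te, p),
            "auprc": average_precision_score(y_te, p)}
```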


Author(s):  
Nancy McBride ◽  
Sara L. White ◽  
Lucilla Poston ◽  
Diane Farrar ◽  
Jane West ◽  
...  

Abstract Background: Prediction of pregnancy-related disorders is mostly done based on established and easily measured risk factors. However, these measures are at best moderate at discriminating between high- and low-risk women. Recent advances in metabolomics may provide earlier and more accurate prediction of women at risk of pregnancy-related disorders. Methods and Findings: We used data collected from women in the Born in Bradford (BiB; n=8,212) and UK Pregnancies Better Eating and Activity Trial (UPBEAT; n=859) studies to create and validate prediction models for pregnancy-related disorders. These were gestational diabetes mellitus (GDM), hypertensive disorders of pregnancy (HDP), small for gestational age (SGA), large for gestational age (LGA) and preterm birth (PTB). We used ten-fold cross-validation and penalised regression to create prediction models. We compared the predictive performance of 1) risk factors (maternal age, pregnancy smoking status, body mass index, ethnicity and parity), 2) nuclear magnetic resonance-derived metabolites (N = 156 quantified metabolites, collected at 24-28 weeks gestation) and 3) risk factors and metabolites combined. The multi-ethnic BiB cohort was used for training and testing the models, with independent validation conducted in UPBEAT, a study of obese pregnant women of multiple ethnicities. In BiB, discrimination for GDM, HDP, LGA and SGA was improved by adding metabolites to the risk-factors-only model. Risk factors (AUC (95% confidence interval (CI))): GDM (0.69 (0.64, 0.73)), HDP (0.74 (0.70, 0.78)), LGA (0.71 (0.66, 0.75)) and SGA (0.59 (0.56, 0.63)). Combined (AUC (95% CI)): GDM (0.78 (0.74, 0.81)), HDP (0.76 (0.73, 0.79)), LGA (0.75 (0.70, 0.79)) and SGA (0.66 (0.63, 0.70)). For GDM, HDP and LGA, but not SGA, calibration was good for the combined risk factor and metabolite model. Prediction of PTB was poor for all models. Independent validation in UPBEAT at 24-28 weeks and 15-18 weeks gestation confirmed similar patterns of results, but AUCs were attenuated. A key limitation was our inability to identify a large general pregnancy population for independent validation. Conclusions: Our results suggest that metabolomics combined with established risk factors improves prediction of GDM, HDP and LGA compared to risk factors alone. They also highlight the difficulty of predicting PTB, with all models performing poorly.
Author Summary. Background: Current methods used to predict pregnancy-related disorders exhibit modest discrimination and calibration. Metabolomics may enable improved prediction of pregnancy-related disorders. Why Was This Study Done? We require tools to identify women with high-risk pregnancies earlier on, so that antenatal care can be more appropriately targeted at women who need it most and tailored to women's needs, and to facilitate early intervention. It has been suggested that metabolomic markers might improve prediction of future pregnancy-related disorders. Previous studies tend to be small and rarely undertake external validation. What Did the Researchers Do and Find? Using BiB (8,212 pregnant women of multiple ethnicities), we created prediction models, using established risk factors and 156 NMR-derived metabolites, for five pregnancy-related disorders: gestational diabetes mellitus (GDM), hypertensive disorders of pregnancy (HDP), small for gestational age (SGA), large for gestational age (LGA) and preterm birth (PTB). We sought external validation in UPBEAT (859 obese pregnant women). We compared the predictive discrimination (area under the curve, AUC) and calibration (calibration slopes) of the models. The prediction models we compared were 1) established risk factors (pregnancy smoking, maternal age, body mass index (BMI), maternal ethnicity and parity), 2) NMR-derived metabolites measured in the second trimester and 3) a combined model of risk factors and metabolites. Inclusion of metabolites with risk factors improved prediction of GDM, HDP, LGA and SGA in BiB. Prediction of PTB was poor with all models. Result patterns were similar in validation using UPBEAT, particularly for GDM and HDP, but AUCs were attenuated. What Do These Findings Mean? These findings indicate that combining current risk factor and metabolomic data could improve the prediction of GDM, HDP, LGA and SGA. These findings need to be validated in larger, general populations of pregnant women.
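
A minimal sketch of the model comparison described above, assuming placeholder column names rather than the BiB/UPBEAT variables: penalised logistic regression with ten-fold cross-validation is fitted once on established risk factors alone and once with NMR metabolite columns added, and the two are compared on AUC in held-out data.

```python
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

RISK_FACTORS = ["maternal_age", "bmi", "smoking", "ethnicity_code", "parity"]  # placeholder names

def compare_feature_sets(train, test, metabolite_cols, outcome="gdm"):
    """Fit a penalised model with ten-fold CV on each feature set and compare held-out AUCs."""
    aucs = {}
    for label, cols in [("risk_factors", RISK_FACTORS),
                        ("combined", RISK_FACTORS + metabolite_cols)]:
        model = make_pipeline(
            StandardScaler(),
            LogisticRegressionCV(cv=10, penalty="l1", solver="liblinear", scoring="roc_auc"))
        model.fit(train[cols], train[outcome])
        p = model.predict_proba(test[cols])[:, 1]
        aucs[label] = roc_auc_score(test[outcome], p)
    return aucs
```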

