An empirical analysis of dealing with patients who are lost to follow-up when developing prognostic models using a cohort design

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jenna M. Reps ◽  
Peter Rijnbeek ◽  
Alana Cuthbert ◽  
Patrick B. Ryan ◽  
Nicole Pratt ◽  
...  

Abstract
Background: Researchers developing prediction models face numerous design choices that may impact model performance. One key decision is how to handle patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision, and we aim to provide guidelines for dealing with loss to follow-up.
Methods: We generate a partially synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. In addition to our synthetic data study, we investigate 21 real-world data prediction problems. We compare four simple strategies for developing models when using a cohort design that encounters loss to follow-up. Three strategies employ a binary classifier with data that (1) include all patients (including those lost to follow-up), (2) exclude all patients lost to follow-up, or (3) exclude only patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. We empirically evaluate discrimination and calibration performance.
Results: The partially synthetic data study shows that excluding patients who are lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. When loss to follow-up was completely at random, however, the choice of how to address it had negligible impact on model discrimination performance. Our real-world data results showed that the four design choices resulted in comparable performance when the time-at-risk was 1 year but demonstrated differential bias at a 3-year time-at-risk. Removing patients who are lost to follow-up before experiencing the outcome while keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided.
Conclusion: Based on this study we recommend (1) developing models using data that include patients who are lost to follow-up and (2) evaluating the discrimination and calibration of models twice: on a test set including patients lost to follow-up and on a test set excluding patients lost to follow-up.
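The four strategies amount to different cohort-construction rules. The sketch below is a minimal illustration, not the authors' implementation: it assumes a pandas DataFrame with hypothetical columns (followup_days, outcome, outcome_day) and a fixed time-at-risk window, and it shows cohort construction and labelling only; model fitting (e.g., logistic regression for strategies 1-3, a Cox model for strategy 4) is omitted.

```python
# Minimal sketch of the four loss-to-follow-up strategies (hypothetical
# column names; not the authors' code). Assumes one row per patient with:
#   followup_days - observed follow-up after the index date, in days
#   outcome       - 1 if the outcome was observed, else 0
#   outcome_day   - day of the outcome (NaN if no outcome)
import numpy as np
import pandas as pd

TAR = 365  # time-at-risk window in days (1-year setting from the paper)

def build_cohorts(df: pd.DataFrame, tar: int = TAR):
    """Return the analysis datasets for the four strategies."""
    lost = df["followup_days"] < tar                        # lost to follow-up
    outcome_in_tar = df["outcome"].eq(1) & (df["outcome_day"] <= tar)

    # Strategy 1: binary classifier, keep everyone (lost patients with no
    # observed outcome are labelled negative).
    s1 = df.assign(label=outcome_in_tar.astype(int))

    # Strategy 2: binary classifier, exclude all patients lost to follow-up.
    s2 = df[~lost].assign(label=outcome_in_tar[~lost].astype(int))

    # Strategy 3: binary classifier, exclude only lost patients *without*
    # a prior outcome (lost patients whose outcome occurred before
    # censoring are kept). The paper finds this biases the model.
    keep = ~lost | outcome_in_tar
    s3 = df[keep].assign(label=outcome_in_tar[keep].astype(int))

    # Strategy 4: survival model, keep everyone; censor at loss to
    # follow-up or at the end of the time-at-risk window.
    time = np.minimum(df["outcome_day"].fillna(np.inf),
                      np.minimum(df["followup_days"], tar))
    s4 = df.assign(time=time, event=outcome_in_tar.astype(int))

    return s1, s2, s3, s4
```

Per the paper's recommendation, discrimination (e.g., AUROC) and calibration would then be evaluated twice: once on a held-out set that includes patients lost to follow-up and once on a held-out set that excludes them.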

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 928-P
Author(s):  
Reema Mody ◽  
Maria Yu ◽  
Bal K. Nepal ◽  
Manige Konig ◽  
Michael Grabner

2020 ◽  
Vol 7 ◽  
pp. 237428952096822
Author(s):  
Erik J. Landaas ◽  
Ashley M. Eckel ◽  
Jonathan L. Wright ◽  
Geoffrey S. Baird ◽  
Ryan N. Hansen ◽  
...  

We describe the methods and decision from a health technology assessment of a new molecular test for bladder cancer (Cxbladder), which was proposed for adoption to our send-out test menu by urology providers. The Cxbladder health technology assessment report contained mixed evidence; the predominant concerns were the test's low specificity and high cost. The low specificity implied a high false-positive rate, which our laboratory formulary committee concluded would result in unnecessary confirmatory testing and follow-up. Our committee voted unanimously not to adopt the test system-wide for the initial diagnosis of bladder cancer but supported a pilot study for bladder cancer recurrence surveillance. The pilot study used real-world data from patient management in the scenario in which a patient is evaluated for possible recurrent bladder cancer after a finding of atypical cytopathology in the urine. We evaluated the type and number of follow-up tests conducted (urine cytopathology, imaging studies, repeat cystoscopy, biopsy, and repeat Cxbladder testing) and their results. The pilot identified ordering challenges and suggested potential use cases in which the results of Cxbladder effected a change in management. Our health technology assessment provided an objective process to efficiently review test performance and guide new test adoption. Based on our pilot, real-world data indicated improved clinician decision-making among select patients who underwent Cxbladder testing.
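The committee's specificity concern can be made concrete with a standard positive-predictive-value calculation. The figures below are hypothetical placeholders, not numbers from the assessment:

```python
# Hypothetical illustration of why low specificity drives follow-up burden.
# None of these figures come from the Cxbladder assessment itself.
sensitivity = 0.90   # assumed
specificity = 0.75   # assumed "low" specificity
prevalence = 0.05    # assumed pre-test probability in the tested population
n_tested = 1000

true_pos = sensitivity * prevalence * n_tested               # 45 correctly flagged
false_pos = (1 - specificity) * (1 - prevalence) * n_tested  # 237.5 flagged in error
ppv = true_pos / (true_pos + false_pos)                      # ~0.16

print(f"Positives: {true_pos + false_pos:.0f}, of which false: {false_pos:.0f}")
print(f"PPV: {ppv:.2f}")
```

Even with good sensitivity, modest specificity at a low pre-test probability means most positive results are false, each one potentially triggering confirmatory cystoscopy, imaging, or biopsy.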


2020 ◽  
Vol 22 (8) ◽  
pp. 602-612
Author(s):  
Dirk Sandig ◽  
Julia Grimsmann ◽  
Christina Reinauer ◽  
Andreas Melmer ◽  
Stefan Zimny ◽  
...  

2019 ◽  
Vol 13 (Supplement_1) ◽  
pp. S268-S269
Author(s):  
M Guerra Veloz ◽  
M Belvis Jimenez ◽  
T Valdes Delgado ◽  
L Castro Laria ◽  
B Maldonado Pérez ◽  
...  

2019 ◽  
Vol 60 (12) ◽  
pp. 2939-2945 ◽  
Author(s):  
Maria Dimou ◽  
Theodoros Iliakis ◽  
Vasileios Pardalis ◽  
Catherin Bitsani ◽  
Theodoros P. Vassilakopoulos ◽  
...  
