An empirical analysis of dealing with patients who are lost to follow-up when developing prognostic models using a cohort design

Abstract Background: Researchers developing prediction models are faced with numerous design choices that may impact model performance. One key decision is how to include patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision. In addition, we aim to provide guidelines for how to deal with loss to follow-up. Methods: We generate a partially synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. In addition to our synthetic data study we investigate 21 real-world data prediction problems. We compare four simple strategies for developing models when using a cohort design that encounters loss to follow-up. Three strategies employ a binary classifier with data that: i) include all patients (including those lost to follow-up), ii) exclude all patients lost to follow-up or iii) only exclude patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. We empirically evaluate the discrimination and calibration performance. Results: The partially synthetic data study results show that excluding patients who are lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. However, when loss to follow-up was completely at random, the choice of addressing it had negligible impact on the model performance. Our empirical real-world data results showed that the four design choices investigated to deal with loss to follow-up resulted in comparable performance when the time-at-risk was 1-year, but demonstrated differential bias when we looked into 3-year time-at-risk. Removing patients who are lost to follow-up before experiencing the outcome but keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided. Conclusion: Based on this study we therefore recommend i) developing models using data that include patients who are lost to follow-up and ii) evaluating the discrimination and calibration of models twice: on a test set including patients lost to follow-up and on a test set excluding patients lost to follow-up.
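The three binary-classifier strategies described in the abstract differ only in which patients are kept in the training data. The following is a minimal sketch of that cohort construction, not the authors' implementation. The `Patient` fields, the strategy names and the rule that a patient counts as lost when observation ends before the end of the time-at-risk are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical strategy names; the abstract only numbers the strategies.
STRATEGIES = ("include_all", "exclude_all_lost", "exclude_lost_without_outcome")


@dataclass
class Patient:
    pid: str
    outcome_day: Optional[int]  # days from index date to outcome, None if never observed
    last_observed_day: int      # days from index date to end of observation


def is_lost(p: Patient, tar_days: int) -> bool:
    # Assumption: "lost to follow-up" means observation ends before the time-at-risk does.
    return p.last_observed_day < tar_days


def has_outcome(p: Patient, tar_days: int) -> bool:
    # The outcome only counts if it is observed inside the time-at-risk
    # and before the patient leaves observation.
    if p.outcome_day is None:
        return False
    return p.outcome_day <= min(tar_days, p.last_observed_day)


def build_training_set(
    patients: List[Patient], tar_days: int, strategy: str
) -> List[Tuple[str, int]]:
    """Return (patient id, binary label) pairs for the chosen strategy."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    rows = []
    for p in patients:
        label = int(has_outcome(p, tar_days))
        lost = is_lost(p, tar_days)
        if strategy == "exclude_all_lost" and lost:
            continue
        if strategy == "exclude_lost_without_outcome" and lost and label == 0:
            continue
        rows.append((p.pid, label))
    return rows


cohort = [
    Patient("A", None, 365),  # complete follow-up, no outcome
    Patient("B", 100, 365),   # complete follow-up, outcome
    Patient("C", None, 200),  # lost before any outcome was observed
    Patient("D", 50, 200),    # outcome observed, then lost
]
for name in STRATEGIES:
    print(name, build_training_set(cohort, tar_days=365, strategy=name))
```

In this toy cohort the third strategy drops patient C but keeps patient D, which raises the apparent outcome rate among the retained patients. That is the selection effect the abstract warns can bias a model.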

An empirical analysis of dealing with patients who are lost to follow-up when developing prognostic models using a cohort design

10.21203/rs.3.rs-54715/v3 ◽

2020 ◽

Author(s):

Jenna M Reps ◽

Peter Rijnbeek ◽

Alana Cuthbert ◽

Patrick B Ryan ◽

Nicole Pratt ◽

...

Keyword(s):

At Risk ◽

Real World ◽

Synthetic Data ◽

Real World Data ◽

Test Set ◽

Loss To Follow Up ◽

World Data ◽

Cohort Design ◽

Lost To Follow Up

Abstract Background: Researchers developing prediction models are faced with numerous design choices that may impact model performance. One key decision is how to include patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision. In addition, we aim to provide guidelines for how to deal with loss to follow-up. Methods: We generate a partially synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. In addition to our synthetic data study we investigate 21 real-world data prediction problems. We compare four simple strategies for developing models when using a cohort design that encounters loss to follow-up. Three strategies employ a binary classifier with data that: i) include all patients (including those lost to follow-up), ii) exclude all patients lost to follow-up or iii) only exclude patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. We empirically evaluate the discrimination and calibration performance. Results: The partially synthetic data study results show that excluding patients who are lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. However, when loss to follow-up was completely at random, the choice of addressing it had negligible impact on model discrimination performance. Our empirical real-world data results showed that the four design choices investigated to deal with loss to follow-up resulted in comparable performance when the time-at-risk was 1-year but demonstrated differential bias when we looked into 3-year time-at-risk. Removing patients who are lost to follow-up before experiencing the outcome but keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided. Conclusion: Based on this study we therefore recommend i) developing models using data that include patients who are lost to follow-up and ii) evaluating the discrimination and calibration of models twice: on a test set including patients lost to follow-up and on a test set excluding patients lost to follow-up.

An empirical analysis of dealing with patients who are lost to follow-up when developing prognostic models using a cohort design

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01408-x ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jenna M. Reps ◽

Peter Rijnbeek ◽

Alana Cuthbert ◽

Patrick B. Ryan ◽

Nicole Pratt ◽

...

Keyword(s):

At Risk ◽

Real World ◽

Synthetic Data ◽

Real World Data ◽

Test Set ◽

Loss To Follow Up ◽

World Data ◽

Cohort Design ◽

Lost To Follow Up

Abstract Background Researchers developing prediction models are faced with numerous design choices that may impact model performance. One key decision is how to include patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision. In addition, we aim to provide guidelines for how to deal with loss to follow-up. Methods We generate a partially synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. In addition to our synthetic data study we investigate 21 real-world data prediction problems. We compare four simple strategies for developing models when using a cohort design that encounters loss to follow-up. Three strategies employ a binary classifier with data that: (1) include all patients (including those lost to follow-up), (2) exclude all patients lost to follow-up or (3) only exclude patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. We empirically evaluate the discrimination and calibration performance. Results The partially synthetic data study results show that excluding patients who are lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. However, when loss to follow-up was completely at random, the choice of addressing it had negligible impact on model discrimination performance. Our empirical real-world data results showed that the four design choices investigated to deal with loss to follow-up resulted in comparable performance when the time-at-risk was 1-year but demonstrated differential bias when we looked into 3-year time-at-risk. Removing patients who are lost to follow-up before experiencing the outcome but keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided. 
Conclusion Based on this study we therefore recommend (1) developing models using data that include patients who are lost to follow-up and (2) evaluating the discrimination and calibration of models twice: on a test set including patients lost to follow-up and on a test set excluding patients lost to follow-up.

To Include, or Not Include, that is the Question: An Empirical Analysis of Dealing with Patients who are Lost to Follow-up when Developing Prognostic Models Using a Cohort Design

10.21203/rs.3.rs-54715/v1 ◽

2020 ◽

Author(s):

Jenna M Reps ◽

Peter Rijnbeek ◽

Alana Cuthbert ◽

Patrick B Ryan ◽

Nicole Pratt ◽

...

Keyword(s):

At Risk ◽

Synthetic Data ◽

Model Performance ◽

Test Set ◽

Loss To Follow Up ◽

Using Data ◽

Lost To Follow Up ◽

The Impact ◽

Prediction Problems

Abstract Background: Researchers developing prediction models are faced with numerous design choices that may impact model performance. One of the main decisions is how to include patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision. In addition, we aim to provide guidelines for how to deal with loss to follow-up. Methods: We generate a synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. We investigate four simple strategies for developing models using data containing some patients with loss to follow-up. Three strategies employ a binary classifier with data that: i) include all patients (including those lost to follow-up), ii) exclude all patients lost to follow-up or iii) only exclude patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. In addition to our synthetic data study, we empirically evaluate the discrimination and calibration performance of these strategies across 21 prediction problems using real-world data. Results: The synthetic data study results show that excluding patients lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. However, when loss to follow-up was completely at random, the choice of addressing it had negligible impact on the model performance. Our empirical results showed that the four design choices investigated to deal with loss to follow-up resulted in comparable performance when the time-at-risk was 1-year, but demonstrated differential bias when we looked into 3-year time-at-risk. Removing patients who are lost to follow-up before the outcome but keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided.
Conclusion: Based on this study we therefore recommend i) developing models using data that include patients who are lost to follow-up and ii) evaluating the discrimination and calibration of models twice: on a test set including patients lost to follow-up and on a test set excluding patients lost to follow-up.

Estimating Real World Performance of a Predictive Model: A Case-Study in Predicting End-of-Life

10.1101/19008821 ◽

2019 ◽

Author(s):

Vincent J Major ◽

Neil Jethani ◽

Yindalon Aphinyanaphongs

Keyword(s):

Experimental Design ◽

End Of Life ◽

Real World ◽

Hospital Admissions ◽

Model Performance ◽

Training Data ◽

Real World Data ◽

Test Set ◽

Subsequent Effect ◽

One Year

Abstract Objective: The main criterion for choosing how models are built is the subsequent effect on future (estimated) model performance. In this work, we evaluate the effects of experimental design choices on both estimated and actual model performance. Materials and Methods: Four years of hospital admissions are used to develop a 1-year end-of-life prediction model. Two common methods to select appropriate prediction timepoints (backwards-from-outcome and forwards-from-admission) are introduced and combined with two ways of separating cohorts for training and testing (internal and temporal). Two models are trained in identical conditions, and their performances are compared. Finally, operating thresholds are selected in each test set and applied in a final, ‘real-world’ cohort consisting of one year of admissions. Results: Backwards-from-outcome cohort selection discards 75% of candidate admissions (n=23,579), whereas forwards-from-admission selection includes many more (n=92,148). Both selection methods produce similar global performances when applied to an internal test set. However, when applied to the temporally defined ‘real-world’ set, forwards-from-admission yields higher areas under the ROC and precision-recall curves (88.3 and 56.5% vs. 83.2 and 41.6%). Discussion: A backwards-from-outcome experiment effectively transforms the training data such that it no longer resembles real-world data. This results in optimistic estimates of test set performance, especially at high precision. In contrast, a forwards-from-admission experiment with a temporally separated test set consistently and conservatively estimates real-world performance. Conclusion: Experimental design choices impose bias upon selected cohorts. A forwards-from-admission experiment, validated temporally, can conservatively estimate real-world performance.

Untreated congenital hypothyroidism due to loss to follow-up: developing preventive strategies through quality improvement

Journal of Pediatric Endocrinology and Metabolism ◽

10.1515/jpem-2018-0149 ◽

2018 ◽

Vol 31 (9) ◽

pp. 987-994

Author(s):

Kristal Anne Matlock ◽

Sarah Dawn Corathers ◽

Nana-Hawa Yayah Jones

Keyword(s):

At Risk ◽

Quality Improvement ◽

Congenital Hypothyroidism ◽

Practice Guidelines ◽

Provider Education ◽

Loss To Follow Up ◽

Patients Âgés ◽

Provider Adherence ◽

Lost To Follow Up

Abstract Background Children with congenital hypothyroidism (CH) are at risk for preventable intellectual disability without adequate medical management. The purpose of this manuscript is to discuss quality improvement (QI)-based processes for improving provider adherence to practice guidelines and ultimately identifying at-risk patients with chronic illness prior to the occurrence of adverse events. Methods Our study population included patients ages ≤3 years diagnosed with CH; lost to follow-up was defined as >180 days since last evaluation by an endocrinology provider. Iterative testing of interventions focused on establishing standardized care through (1) registry-based identification, (2) scheduling future appointments during current visits, (3) outreach to patients lost to follow-up and (4) provider and family education of current practice guidelines. Results A population-validated, electronic medical registry identified approximately 100 patients ages ≤3 years diagnosed with CH; initially, 12% of patients met criteria for lost to follow-up. Through serial testing of interventions, the rate of loss to follow-up declined to the goal of <5% within 8 months. Additional measures showed improvement in provider adherence to standard of care. All patients identified as lost to follow-up initially were seen within the first 3 months of intervention. Conclusions Applying QI methodology, a multidisciplinary team implemented a process to identify and contact high-risk CH patients with inadequate follow-up. Focused interventions targeting population management, scheduling and patient/provider education yield sustained improvement in the percentage of patients with a chronic condition who are lost to follow-up.
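The registry rule in the Methods (lost to follow-up means more than 180 days since the last endocrinology evaluation) can be sketched as a simple date check. This is an illustrative sketch only: the function names and the dictionary-based registry are assumptions, not the study's actual registry logic.

```python
from datetime import date, timedelta
from typing import Dict

LTFU_THRESHOLD_DAYS = 180  # threshold stated in the abstract


def is_lost_to_follow_up(
    last_visit: date, as_of: date, threshold_days: int = LTFU_THRESHOLD_DAYS
) -> bool:
    # Strictly greater than the threshold, matching ">180 days".
    return (as_of - last_visit).days > threshold_days


def ltfu_rate(last_visits: Dict[str, date], as_of: date) -> float:
    """Fraction of registry patients currently flagged as lost to follow-up."""
    if not last_visits:
        return 0.0
    lost = sum(is_lost_to_follow_up(d, as_of) for d in last_visits.values())
    return lost / len(last_visits)


today = date(2018, 1, 1)
registry = {  # hypothetical patients
    "p1": today - timedelta(days=30),
    "p2": today - timedelta(days=180),  # exactly at the threshold, not yet lost
    "p3": today - timedelta(days=181),  # one day past the threshold, lost
    "p4": today - timedelta(days=400),
}
print(ltfu_rate(registry, today))
```

Tracking this rate over time is one way to monitor a target such as the <5% goal mentioned in the Results.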

928-P: Dulaglutide Has Higher Adherence and Persistence than Semaglutide and Exenatide QW: 6-Month Follow-Up from U.S. Real-World Data

Diabetes ◽

10.2337/db20-928-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 928-P

Author(s):

REEMA MODY ◽

MARIA YU ◽

BAL K. NEPAL ◽

MANIGE KONIG ◽

MICHAEL GRABNER

Keyword(s):

Real World ◽

Real World Data ◽

World Data ◽

Exenatide Qw

Predictors of loss to follow-up in art experienced patients in Nigeria: a 13 year review (2004–2017)

AIDS Research and Therapy ◽

10.1186/s12981-019-0241-3 ◽

2019 ◽

Vol 16 (1) ◽

Author(s):

Ahmad Aliyu ◽

Babatunde Adelekan ◽

Nifarta Andrew ◽

Eunice Ekong ◽

Stephen Dapiap ◽

...

Keyword(s):

Patient Adherence ◽

Cross Sectional Study ◽

Emergency Plan ◽

Cross Sectional ◽

Loss To Follow Up ◽

Viral Loads ◽

Review Period ◽

Lost To Follow Up ◽

Differentiated Care

Abstract Background Expanded access to antiretroviral therapy (ART) has led to improved HIV/AIDS treatment outcomes in Nigeria; however, increasing rates of loss to follow-up among those on ART threaten the achievement of optimal standards. This retrospective cross-sectional study therefore aimed to identify correlates and predictors of loss to follow-up in patients commencing ART in a large HIV program in Nigeria. Methods Records of all patients from 432 US CDC President's Emergency Plan for AIDS Relief (PEPFAR)-supported facilities across 10 states and the FCT who started ART from 2004 to 2017 were used for this study. Bivariate and multivariate analysis of the demographic and clinical parameters of all patients was conducted using STATA version 14 to determine correlates and predictors of loss to follow-up. Results Within the review period, 245,257 patients were ever enrolled on antiretroviral therapy. Of these, 150,191 (61.2%) remained on treatment, 10,960 (4.5%) were transferred out to other facilities, 6926 (2.8%) died, 2139 (0.9%) self-terminated treatment and 75,041 (30.6%) had a loss to follow-up event captured. Males (OR: 1.16), non-pregnant females (OR: 4.55), patients on ≥ 3-monthly ARV refills (OR: 1.32), patients with unsuppressed viral loads on ART (OR: 4.52), patients on an adult 2nd line regimen (OR: 1.23) and pediatric patients on a 1st line regimen (OR: 1.70) were significantly more likely to be lost to follow-up. Conclusion Despite increasing access to antiretroviral therapy, loss to follow-up is still a challenge in the HIV program in Nigeria. Differentiated care approaches that focus on males, non-pregnant females and paediatric patients are encouraged. Reducing the interval between antiretroviral drug refills to less than 3 months is advocated to increase patient adherence.

Long Term Real-World Outcomes of Trifluridine/Tipiracil in Metastatic Colorectal Cancer—A Single UK Centre Experience

Current Oncology ◽

10.3390/curroncol28030208 ◽

2021 ◽

Vol 28 (3) ◽

pp. 2260-2269

Author(s):

Daniel Tong ◽

Lei Wang ◽

Jeewaka Mendis ◽

Sharadah Essapen

Keyword(s):

Colorectal Cancer ◽

Metastatic Colorectal Cancer ◽

Real World ◽

Systemic Treatment ◽

Progression Free Survival ◽

Median Number ◽

Current Data ◽

Real World Data

In the UK, Trifluridine-tipiracil (Lonsurf) is used to treat metastatic colorectal cancer in the third-line setting, after prior exposure to fluoropyrimidine-based regimes. Current data on the real-world use of Lonsurf lack long-term follow-up data. A retrospective evaluation of patients receiving Lonsurf at our Cancer Centre in 2016–2017 was performed, all with a minimum of two-year follow-up. Fifty-six patients were included in the review. The median number of cycles of Lonsurf administered was 3. Median follow-up was 6.0 months, with all patients deceased at the time of analysis. Median progression-free survival (PFS) was 3.2 months, and overall survival (OS) was 5.8 months. The median interval from Lonsurf discontinuation to death was two months, but seven patients received further systemic treatment and median OS gained was 12 months. Lonsurf offered a slightly better PFS but inferior OS to that of the RECOURSE trial, with PFS similar to real-world data previously presented. Interestingly, 12.5% had a PFS > 9 months, and this cohort had primarily left-sided and RAS wild-type disease. A subset received further systemic treatment on Lonsurf discontinuation with good additional OS benefit. Lonsurf may alter the course of disease for a subset of patients, and further treatment on progression can be considered in carefully selected patients.

Boosting Instance Segmentation with Synthetic Data: A study to overcome the limits of real world data sets

10.1109/iccvw54120.2021.00110 ◽

2021 ◽

Author(s):

Florentin Poucin ◽

Andrea Kraus ◽

Martin Simon

Keyword(s):

Real World ◽

Synthetic Data ◽

Data Sets ◽

Real World Data ◽

World Data ◽

Instance Segmentation

Prospective cohort study of Kaposi sarcoma treated under real-world conditions in Malawi.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.11569 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. 11569-11569

Author(s):

Edwards Kasonkanji ◽

Yolanda Gondwe ◽

Morgan Dewey ◽

Joe Gumulira ◽

Matthew Painschab ◽

...

Keyword(s):

Cohort Study ◽

Kaposi Sarcoma ◽

Median Number ◽

Sub Saharan Africa ◽

Worst Case ◽

Loss To Follow Up ◽

Presenting Symptoms ◽

Grade 3 ◽

Lost To Follow Up

11569 Background: Kaposi sarcoma (KS) is the leading cancer in Malawi (34% of cancers). Outside of clinical trials, prospective KS studies from sub-Saharan Africa (SSA) are few and limited by loss to follow-up. We conducted a prospective KS cohort study of standard of care bleomycin/vincristine (BV) at Lighthouse HIV clinic, in Lilongwe, Malawi. Methods: We enrolled pathologically confirmed, newly diagnosed, HIV+ KS patients from Feb 2017 to Jun 2019. We collected clinical and treatment characteristics, toxicity, and outcomes of KS with follow-up censored Jun 2020. Patients were treated with bleomycin (25 mg/m²) and vincristine (0.4 mg/m²) every 14 days for a planned maximum of 16 cycles. STATA v13.0 was used to calculate descriptive statistics and Kaplan-Meier survival analysis. Toxicity was graded using NCI CTCAE v5.0. Results: We enrolled 138 participants, median age 36 (IQR 32-44) and 110 (80%) male. By ACTG staging, 107 (78%) were T1 (tumour severity), 46 (33%) were S1 (illness severity) and 46 (33%) had Karnofsky performance status ≤70. Presenting symptoms included edema in 69 (53%), visceral disease in 9 (7%), and oral involvement in 43 (33%). Prior to KS diagnosis, 70 (51%) participants were aware of being HIV+ for median 17 months (IQR 6-60) and had been on ART for median 16 months (IQR 6-60). Median CD4 count was 197 (IQR 99-339), median HIV viral load was 2.6 log copies/mL (IQR 1.6-4.8) and 57% were HIV-suppressed (<1000 HIV copies/mL). The median number of cycles was 16 (IQR 7-16). 62 (45%) participants missed at least one dose due to stock-out. Amongst patients with missed doses, the median number was 3 (IQR 2-4) for bleomycin and 2 (IQR 1-3) for vincristine. 14 (10%) participants experienced at least one reduced dose due to toxicity. 5 (4%) participants suffered grade ≥3 anaemia, 13 (9%) grade ≥3 neutropenia, and one participant had grade 4 bleomycin-induced dermatitis. There was no reported grade ≥3 bleomycin lung toxicity or vincristine-induced neuropathy. Of 115 evaluable participants, responses at the end of therapy were: complete response in 52 (45%), partial response in 27 (23%), stable disease in 5 (4%), and progressive disease in 31 (28%). Median duration of follow-up was 20 months. At censoring, 69 (50%) were alive, 36 (26%) dead, and 33 (24%) lost to follow-up. Overall survival is shown in the Table as crude and worst-case scenario; worst-case assumes all participants lost to follow-up died. Conclusions: Here, we present one of the most complete characterizations of KS presentation and treatment from SSA. As in other studies from the region, the majority of patients presented with advanced disease, chemotherapy stock-outs and loss to follow-up were common, and mortality was high. Studies are planned to understand the virologic characteristics, improve therapies, and better implement existing therapies. [Table: see text]
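The worst-case assumption (every participant lost to follow-up is counted as dead) can be illustrated with the status counts reported at censoring. This sketch computes simple proportions only; it is a simplification and does not reproduce the Kaplan-Meier estimates in the study's table. The function name is hypothetical.

```python
from typing import Tuple


def mortality_proportions(alive: int, dead: int, lost: int) -> Tuple[float, float]:
    """Return (crude, worst_case) proportions of deaths among all participants.

    crude:      only confirmed deaths are counted.
    worst_case: participants lost to follow-up are assumed to have died.
    """
    total = alive + dead + lost
    if total == 0:
        raise ValueError("no participants")
    crude = dead / total
    worst_case = (dead + lost) / total
    return crude, worst_case


# Counts at censoring as reported in the abstract: 69 alive, 36 dead, 33 lost.
crude, worst_case = mortality_proportions(alive=69, dead=36, lost=33)
print(f"crude: {crude:.0%}, worst case: {worst_case:.0%}")
```

With a quarter of the cohort lost to follow-up, the gap between the two figures shows how strongly the handling of those participants can move a mortality estimate.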
