Factors affecting the performance of brain arteriovenous malformation rupture prediction models

2021
Vol 21 (1)
Author(s):
Wengui Tao
Langchao Yan
Ming Zeng
Fenghua Chen

Abstract Background In many patients with brain arteriovenous malformations (bAVMs), neither the rupture rate nor the risk of endovascular or surgical treatment (when radiosurgery is not appropriate) is low, so the risk of rupture should be assessed carefully before treatment. Based on the current high-risk predictors and clinical data, we used different sample sizes, numbers of sampling repetitions, and algorithms to build prediction models for the risk of hemorrhage in bAVM, and we investigated the accuracy and stability of those models. Our purpose was to alert researchers to some pitfalls in developing similar prediction models. Methods The clinical data of 353 patients with bAVMs were collected. While creating the prediction models for bAVM rupture, we varied the ratio of the training dataset to the test dataset, increased the number of sampling repetitions, and built models with the logistic regression (LR) and random forest (RF) algorithms. The area under the curve (AUC) was used to evaluate the predictive performance of the models. Results The performance of the models built by both algorithms was not ideal (AUCs: 0.7 or less). Across different sample sizes, the AUCs of the models built with the LR algorithm were better than those built with the RF algorithm (0.70 vs 0.68, p < 0.001). The standard deviations (SDs) of the AUCs of both models across different sample sizes spanned wide ranges (maximum range > 0.1). Conclusions Based on the current risk predictors, it may be difficult to build a stable and accurate prediction model for the hemorrhagic risk of bAVMs. Compared with sample size and choice of algorithm, meaningful predictors are more important for establishing an accurate and stable prediction model.
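A minimal sketch of the resampling experiment described above, assuming synthetic stand-in data rather than the actual 353-patient cohort: re-split the data many times at several train/test ratios, fit LR and RF classifiers, and summarize the mean and standard deviation of the test AUCs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 353-patient bAVM cohort and its risk predictors.
X, y = make_classification(n_samples=353, n_features=10, n_informative=4,
                           random_state=0)

for test_size in (0.2, 0.3, 0.4):                    # vary the train/test ratio
    aucs = {"LR": [], "RF": []}
    for seed in range(100):                          # repeated random re-splits
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed)
        models = {"LR": LogisticRegression(max_iter=1000),
                  "RF": RandomForestClassifier(n_estimators=200, random_state=seed)}
        for name, model in models.items():
            model.fit(X_tr, y_tr)
            aucs[name].append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    for name, vals in aucs.items():
        print(f"test_size={test_size} {name}: mean AUC={np.mean(vals):.2f}, "
              f"SD={np.std(vals):.3f}")
```

A wide SD across re-splits, as reported in the abstract, signals that any single train/test split can give a misleadingly optimistic or pessimistic AUC.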

2020
Vol 4 (Supplement_1)
Author(s):
Akihiro Nomura
Sho Yamamoto
Yuta Hayakawa
Kouki Taniguchi
Takuya Higashitani
...  

Abstract Diabetes mellitus (DM) is a chronic disorder characterized by impaired glucose metabolism. It is linked to increased risks of several diseases, such as atrial fibrillation, cancer, and cardiovascular diseases. Therefore, DM prevention is essential. However, traditional regression-based DM-onset prediction methods cannot screen for future DM in generally healthy individuals without DM. Employing gradient-boosting decision trees, we developed a machine learning-based prediction model to identify DM signatures prior to the onset of DM. We used the nationwide annual specific health checkup records collected from 2008 to 2018 in Kanazawa city, Ishikawa, Japan. The data included physical examinations, blood and urine tests, and participant questionnaires. Individuals without DM at baseline who underwent more than two annual health checkups during the period were included. New cases of DM onset were recorded when participants were diagnosed with DM at an annual checkup. The dataset was divided into three subsets in a 6:2:2 ratio to constitute the training, tuning (internal validation), and testing datasets. The trained prediction model was evaluated on the testing dataset using the area under the curve (AUC), precision, recall, F1 score, and overall accuracy, with a two-sided 95% confidence interval (CI) computed for each metric by a 1,000-iteration bootstrap method. We included 509,153 annual health checkup records of 139,225 participants. Among them, 65,505 participants without DM were included, constituting 36,303 participants in the training dataset and 13,101 participants in each of the tuning and testing datasets. We identified a total of 4,696 new DM-onset patients (7.2%) in the study period. Our trained model predicted the future incidence of DM with an AUC, precision, recall, F1 score, and overall accuracy of 0.71 (95% CI 0.69-0.72), 75.3% (71.6-78.8), 42.2% (39.3-45.2), 54.1% (51.2-56.7), and 94.9% (94.5-95.2), respectively. In conclusion, the machine learning-based prediction model satisfactorily identified DM onset prior to the actual incidence.
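A hedged sketch of the 6:2:2 split and 1,000-iteration bootstrap described above. The study's checkup records and exact gradient-boosting implementation are not public here, so this uses scikit-learn's GradientBoostingClassifier on synthetic data with roughly the reported positive rate.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with ~7% positives, mimicking the reported DM-onset rate.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.93],
                           random_state=0)

# 6:2:2 train / tuning / test split.
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4,
                                              stratify=y, random_state=0)
X_tune, X_te, y_tune, y_te = train_test_split(X_rest, y_rest, test_size=0.5,
                                              stratify=y_rest, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]

# Percentile bootstrap for a two-sided 95% CI on the test AUC.
rng = np.random.default_rng(0)
boot = []
for _ in range(1000):
    idx = rng.integers(0, len(y_te), len(y_te))
    if len(np.unique(y_te[idx])) == 2:       # need both classes to score an AUC
        boot.append(roc_auc_score(y_te[idx], proba[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC {roc_auc_score(y_te, proba):.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The same bootstrap loop can wrap precision, recall, F1, and accuracy once a classification threshold is fixed.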


2021
Vol 20 (1)
pp. 4-14
Author(s):
K. Azijli
A.W.E. Lieveld
S.F.B. van der Horst
N. de Graaf
...  

Background: A recent systematic review recommends against the use of any of the current COVID-19 prediction models in clinical practice. To enable clinicians to appropriately profile and treat suspected COVID-19 patients at the emergency department (ED), externally validated models that predict poor outcome are desperately needed. Objective: Our aims were to identify predictors of poor outcome, defined as mortality or ICU admission within 30 days, in patients presenting to the ED with a clinical suspicion of COVID-19, and to develop and externally validate a prediction model for poor outcome. Methods: In this prospective, multi-centre study, we enrolled suspected COVID-19 patients presenting at the EDs of two hospitals in the Netherlands. We used backward logistic regression to develop a prediction model. We used the area under the curve (AUC), Brier score and pseudo-R2 to assess model performance. The model was externally validated in an Italian cohort. Results: We included 1193 patients between March 12 and May 27, 2020, of whom 196 (16.4%) had a poor outcome. We identified 10 predictors of poor outcome: current malignancy (OR 2.774; 95% CI 1.682-4.576), systolic blood pressure (OR 0.981; 95% CI 0.964-0.998), heart rate (OR 1.001; 95% CI 0.970-1.028), respiratory rate (OR 1.078; 95% CI 1.046-1.111), oxygen saturation (OR 0.899; 95% CI 0.850-0.952), body temperature (OR 0.505; 95% CI 0.359-0.710), serum urea (OR 1.404; 95% CI 1.198-1.645), C-reactive protein (OR 1.013; 95% CI 1.001-1.024), lactate dehydrogenase (OR 1.007; 95% CI 1.002-1.013) and SARS-CoV-2 PCR result (OR 2.456; 95% CI 1.526-3.953). The AUC was 0.86 (95% CI 0.83-0.89), with a Brier score of 0.32 and a pseudo-R2 of 0.41. The AUC in the external validation in 500 patients was 0.70 (95% CI 0.65-0.75). Conclusion: The COVERED risk score showed excellent discriminatory ability, including in external validation. It may aid clinical decision making and improve triage at the ED in health care environments with high patient throughput.
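A rough sketch of the pipeline named above (backward logistic regression, then AUC, Brier score, and pseudo-R2), assuming synthetic data and a p-value-based backward elimination rule; the COVERED cohort and its actual variable-selection criteria are not reproduced here.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, roc_auc_score

# Synthetic stand-in for the 1193-patient derivation cohort.
X, y = make_classification(n_samples=1193, n_features=15, n_informative=6,
                           random_state=1)
X = sm.add_constant(X)

# Backward elimination: drop the least significant predictor until all p < 0.10
# (the 0.10 cutoff is an assumption, not the paper's stated criterion).
cols = list(range(X.shape[1]))
while True:
    fit = sm.Logit(y, X[:, cols]).fit(disp=0)
    pvals = fit.pvalues[1:]               # skip the intercept
    worst = np.argmax(pvals)
    if pvals[worst] < 0.10 or len(cols) <= 2:
        break
    del cols[worst + 1]                   # +1 accounts for the intercept column

proba = fit.predict(X[:, cols])
ll_full = fit.llf
ll_null = sm.Logit(y, X[:, [0]]).fit(disp=0).llf   # intercept-only model
n = len(y)
# Nagelkerke pseudo-R2: Cox-Snell R2 rescaled to a [0, 1] maximum.
nagelkerke = (1 - np.exp(2 * (ll_null - ll_full) / n)) / (1 - np.exp(2 * ll_null / n))
print(f"AUC={roc_auc_score(y, proba):.2f}  Brier={brier_score_loss(y, proba):.2f}  "
      f"pseudo-R2={nagelkerke:.2f}")
```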


Author(s):  
Tara Lagu ◽  
Mihaela Stefan ◽  
Quinn Pack ◽  
Auras Atreya ◽  
Mohammad A Kashef ◽  
...  

Background: Mortality prediction models, developed with the goal of improving risk stratification in hospitalized heart failure (HF) patients, show good performance characteristics in the datasets in which they were developed but have not been validated in external populations. Methods: We used a novel multi-hospital dataset [HealthFacts (Cerner Corp)] derived from the electronic health record (years 2010-2012). We examined the performance of four published HF inpatient mortality prediction models developed using data from the Acute Decompensated Heart Failure National Registry (ADHERE), the Enhanced Feedback for Effective Cardiac Treatment (EFFECT) study, and the Get With the Guidelines-Heart Failure (GWTG-HF) registry. We compared these to an administrative HF mortality prediction model (Premier model) that includes selected patient demographics, comorbidities, prior heart failure admissions, and therapies administered (e.g., inotropes, mechanical ventilation) in the first 2 hospital days. We also compared them to a model that uses clinical data but is not heart failure-specific: the Laboratory-Based Acute Physiology Score (LAPS2). We included patients aged ≥18 years admitted with HF to one of 62 hospitals in the database. We applied all 6 models to the data and calculated the c-statistics. Results: We identified 13,163 patients ≥18 years old with a diagnosis of heart failure. Median age was 74 years; approximately half were women; 65% of patients were white and 27% were black. In-hospital mortality was 4.3%. Bland-Altman plots revealed that, at higher predicted mortality, the Premier model outperformed the clinical models. Discrimination of the models varied: ADHERE model (0.68); EFFECT (0.70); GWTG-HF, Peterson (0.69); GWTG-HF, Eapen (0.70); LAPS2 (0.74); Premier (0.81). Conclusions: Clinically derived inpatient heart failure mortality models exhibited similar performance, with c-statistics hovering around 0.70. A generic clinical mortality prediction model (LAPS2) had slightly better performance, as did a detailed administrative model. Any of these models may be useful for severity adjustment in comparative effectiveness studies of heart failure patients. When clinical data are not available, the administrative model performs similarly to clinical models.
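The external-validation pattern used here is simple to express in code: apply a frozen, previously published risk equation to new patients and compute the c-statistic. The sketch below uses invented placeholder coefficients and synthetic data, not the actual ADHERE/EFFECT/GWTG-HF/LAPS2/Premier parameters.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 13_163  # cohort size from the abstract; the variables below are synthetic
bun = rng.normal(30, 15, n)    # blood urea nitrogen, mg/dL
sbp = rng.normal(130, 25, n)   # systolic blood pressure, mmHg
age = rng.normal(74, 12, n)    # years

def published_model(bun, sbp, age):
    """Hypothetical frozen logistic model: logit(p) = b0 + b.x (placeholder b's)."""
    logit = -6.0 + 0.03 * bun - 0.01 * sbp + 0.04 * age
    return 1 / (1 + np.exp(-logit))

p = published_model(bun, sbp, age)
died = rng.random(n) < p       # synthetic outcomes, for demonstration only

# The c-statistic for a binary outcome is the AUC of the predicted risks.
print(f"c-statistic: {roc_auc_score(died, p):.2f}")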


2019
Vol 105 (5)
pp. 439-445
Author(s):
Bob Phillips
Jessica Elizabeth Morgan
Gabrielle M Haeusler
Richard D Riley

Background Risk-stratified approaches to managing cancer therapies and their consequent complications rely on accurate predictions to work effectively. The risk-stratified management of fever with neutropenia is one such very common area of management in paediatric practice. Such rules are frequently produced and promoted without adequate confirmation of their accuracy. Methods An individual participant data meta-analytic validation of the ‘Predicting Infectious ComplicatioNs In Children with Cancer’ (PICNICC) prediction model for microbiologically documented infection in paediatric fever with neutropenia was undertaken. Pooled estimates were produced using random-effects meta-analysis of the area under the receiver operating characteristic curve (AUC-ROC), calibration slope and ratio of expected versus observed cases (E/O). Results The PICNICC model was poorly predictive of microbiologically documented infection (MDI) in these validation cohorts. The pooled AUC-ROC was 0.59 (95% CI 0.41 to 0.78, tau2 = 0), compared with the derivation value of 0.72 (95% CI 0.71 to 0.76). There was poor calibration (pooled calibration slope estimate 0.03, 95% CI −0.19 to 0.26) and poor calibration-in-the-large (pooled E/O ratio 1.48, 95% CI 0.87 to 2.1). Three different simple recalibration approaches failed to improve performance meaningfully. Conclusion This meta-analysis shows the PICNICC model should not be used at admission to predict MDI. Further work should focus on validating alternative prediction models. Validation across multiple cohorts from diverse locations is essential before widespread clinical adoption of such rules, to avoid overtreating or undertreating children with fever with neutropenia.
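The pooled AUC-ROC, calibration slope, and E/O estimates above come from random-effects meta-analysis; a standard choice for this is the DerSimonian-Laird estimator, sketched below with invented per-cohort estimates and standard errors (the method is what is illustrated, not necessarily the authors' exact software).

```python
import numpy as np

def dersimonian_laird(est, se):
    """Pool study estimates with random effects; returns (pooled, 95% CI, tau2)."""
    est, se = np.asarray(est, float), np.asarray(se, float)
    w = 1 / se**2                             # fixed-effect (inverse-variance) weights
    fixed = np.sum(w * est) / np.sum(w)
    q = np.sum(w * (est - fixed) ** 2)        # Cochran's Q heterogeneity statistic
    df = len(est) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)             # between-study variance estimate
    w_re = 1 / (se**2 + tau2)                 # random-effects weights
    pooled = np.sum(w_re * est) / np.sum(w_re)
    se_pooled = np.sqrt(1 / np.sum(w_re))
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled), tau2

# Hypothetical per-cohort AUCs and standard errors, for illustration only.
aucs = [0.55, 0.62, 0.58, 0.61]
ses = [0.06, 0.05, 0.08, 0.07]
print(dersimonian_laird(aucs, ses))
```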


2021
Vol 49 (4)
pp. 030006052110045
Author(s):
Yibo Ma
Shuiqing Liu
Min Yang
Yun Zou
Dong Xue
...  

Objective To investigate the factors involved in early and mid-term complications after catheter insertion for peritoneal dialysis and to establish prediction models. Methods A total of 158 patients undergoing peritoneal dialysis in the Department of Nephrology of our hospital were retrospectively analyzed. General information, laboratory indices, early complications (within 1 month after the operation), mid-term complications (1–6 months after the operation), and other relevant data were recorded. Multivariate logistic regression analysis was performed to establish prediction models of complications and to generate a nomogram. Receiver operating characteristic (ROC) curve analysis was used to evaluate the efficacy of the models. Results Among the patients, 48 (30.8%) had early complications, mainly catheter-related complications, and 29 (18.4%) had mid-term complications, mainly abdominal infection and catheter migration. We constructed prediction models for early complications (area under the curve = 0.697, 95% confidence interval: 0.609–0.785) and mid-term complications (area under the curve = 0.730, 95% confidence interval: 0.622–0.839); the sensitivity was 0.750 and 0.607, and the specificity was 0.589 and 0.765, respectively. Conclusions Our prediction models have clinical significance for risk assessment of early and mid-term complications and for prevention of complications after catheterization for peritoneal dialysis.
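The sensitivity/specificity pairs reported above are operating points on each model's ROC curve; a common convention, assumed here, is the Youden-optimal cut point. A minimal sketch on synthetic stand-in data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve

# Synthetic stand-in for the 158-patient cohort (~30% complication rate).
X, y = make_classification(n_samples=158, n_features=8, weights=[0.7],
                           random_state=0)
proba = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]

fpr, tpr, thresholds = roc_curve(y, proba)
best = np.argmax(tpr - fpr)   # Youden's J = sensitivity + specificity - 1
print(f"AUC={roc_auc_score(y, proba):.3f}  cut={thresholds[best]:.2f}  "
      f"sensitivity={tpr[best]:.3f}  specificity={1 - fpr[best]:.3f}")
```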


2020
Author(s):
Yu Tian
Wei Zhao
Yuefu Wang
Chunrong Wang
Xiaolin Diao
...  

Abstract Background In developing scoring systems for acute kidney injury (AKI) following cardiac surgery, previous investigations have focused almost exclusively on preoperative risk factors, without considering surgery-derived physiopathology. We sought to internally derive and then validate risk score systems using pre- and intraoperative variables to predict the occurrence of any-stage (stage 1-3) and stage-3 AKI within 7 days. Methods Patients undergoing cardiac surgery from Jan 1, 2012, to Jan 1, 2019, were enrolled in our retrospective study. The clinical data were divided into a derivation cohort (n = 43,799) and a validation cohort (n = 14,600). Multivariable logistic regression analysis was used to develop the prediction models. Results The overall prevalence of any-stage and stage-3 AKI after cardiac surgery was 34.3% and 1.7%, respectively. Discrimination of the any-stage AKI prediction model, measured by the area under the curve (AUC), was acceptable (AUC = 0.69, 95% CI: 0.68, 0.69), and model calibration, measured by the Hosmer-Lemeshow test, was good (P = 0.95). The stage-3 AKI prediction model had an AUC of 0.84 (95% CI: 0.83, 0.85) and good calibration according to the Hosmer-Lemeshow test (P = 0.73). Conclusions Using pre- and intraoperative data, we developed two scoring systems, one for any-stage AKI and one for stage-3 AKI, in a cardiac surgery population. These scoring systems can potentially be adopted clinically for AKI recognition and therapeutic intervention.
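scikit-learn has no built-in Hosmer-Lemeshow test, so a hand-rolled version of the calibration check cited above might look like the following; decile grouping and groups − 2 degrees of freedom are the usual conventions, assumed here.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, proba, groups=10):
    """Chi-square over risk-sorted groups; a large p suggests good calibration."""
    order = np.argsort(proba)
    y, proba = np.asarray(y)[order], np.asarray(proba)[order]
    stat = 0.0
    for chunk_y, chunk_p in zip(np.array_split(y, groups),
                                np.array_split(proba, groups)):
        obs = chunk_y.sum()                # observed events in the group
        exp = chunk_p.sum()                # expected events = sum of risks
        n = len(chunk_y)
        stat += (obs - exp) ** 2 / (exp * (1 - exp / n) + 1e-12)
    return stat, chi2.sf(stat, groups - 2)  # df = groups - 2 for a fitted model

# Toy usage with deliberately well-calibrated predictions:
rng = np.random.default_rng(0)
p = rng.uniform(0.01, 0.6, 5000)
y = rng.random(5000) < p
print(hosmer_lemeshow(y, p))               # expect a non-significant p-value
```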


2020
Vol 11
pp. 374
Author(s):
Masahito Katsuki
Yukinari Kakizawa
Akihiro Nishikawa
Yasunaga Yamamoto
Toshiya Uchiyama

Background: Reliable prediction models of subarachnoid hemorrhage (SAH) outcomes are needed for treatment decision-making. The SAFIRE score, which uses only four variables, is a good prediction scoring system, but building such prediction models requires large samples and time-consuming statistical analysis. Deep learning (DL), a branch of artificial intelligence, is an attractive alternative, but there have been no reports of DL-based prediction models for SAH outcomes. We therefore built a prediction model using the DL software Prediction One (Sony Network Communications Inc., Tokyo, Japan) and compared it with the SAFIRE score. Methods: We used data from 153 consecutive aneurysmal SAH patients treated at our hospital between 2012 and 2019. A modified Rankin Scale (mRS) score of 0–3 at 6 months was defined as a favorable outcome. We randomly divided the patients into a training dataset of 102 patients and an external validation dataset of 51 patients. Prediction One built the prediction model from the training dataset with internal cross-validation. We used both the created model and the SAFIRE score to predict outcomes in the external validation set, and the areas under the curve (AUCs) were compared. Results: The model built by Prediction One using 28 variables had an AUC of 0.848, and its AUC for the validation dataset was 0.953 (95% CI 0.900–1.000). The AUCs calculated using the SAFIRE score were 0.875 for the training dataset and 0.960 for the validation dataset. Conclusion: We easily and quickly built prediction models using Prediction One, even with a small single-center dataset. The model's accuracy was not appreciably inferior to that of previous statistically derived prediction models.
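Prediction One is proprietary GUI software, so no code here can reproduce it exactly; as a loose stand-in, this sketch trains a small neural network on a 102/51 split and compares its test AUC with a simple four-variable logistic score. All data are synthetic, and the "four variables" are arbitrary placeholders for SAFIRE's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 153 synthetic "patients" with 28 variables, split 102 training / 51 validation.
X, y = make_classification(n_samples=153, n_features=28, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=102, stratify=y,
                                          random_state=0)

# Small neural network as a rough stand-in for the DL software's model.
deep = make_pipeline(StandardScaler(),
                     MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000,
                                   random_state=0)).fit(X_tr, y_tr)
print("28-variable model AUC:",
      round(roc_auc_score(y_te, deep.predict_proba(X_te)[:, 1]), 3))

# A four-variable logistic score in the spirit of SAFIRE (placeholder columns).
four = LogisticRegression(max_iter=1000).fit(X_tr[:, :4], y_tr)
print("4-variable score AUC:",
      round(roc_auc_score(y_te, four.predict_proba(X_te[:, :4])[:, 1]), 3))
```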


2018
Vol 49 (14)
pp. 2405-2413
Author(s):
Toshi A. Furukawa
Tadashi Kato
Yoshihiro Shinagawa
Kazuhira Miki
Hirokazu Fujita
...  

Abstract Background Depression is increasingly recognized as a chronic and relapsing disorder. However, an important minority of patients who start treatment for their major depressive episode recover to euthymia. It is clinically important to be able to identify such individuals in advance. Methods The study is a secondary analysis of a recently completed pragmatic megatrial examining first- and second-line treatments for hitherto untreated episodes of non-psychotic unipolar major depression (n = 2011). Using the first half of the cohort as the derivation set, we applied multiply-imputed stepwise logistic regression with backward selection to build a model predicting remission, defined as scoring 4 or less on the Patient Health Questionnaire-9 at week 9. We used three successively richer sets of predictors: at baseline only, up to week 1, and up to week 3. We examined the external validity of the derived prediction models in the second half of the cohort. Results In total, 37.0% (95% confidence interval 34.8–39.1%) were in remission at week 9. Only the models using data up to week 1 or week 3 showed reasonable performance. Age, education, length of episode and depression severity remained in the multivariable prediction models. In the validation set, the discrimination of these prediction models was satisfactory, with areas under the curve of 0.73 (0.70–0.77) and 0.82 (0.79–0.85), respectively, while calibration was excellent, with non-significant goodness-of-fit χ2 values (p = 0.41 and p = 0.29). Conclusions Patients and clinicians can use these prediction models to estimate the probability of achieving remission after acute antidepressant therapy.
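A compact sketch of the split-half derivation/validation design described above, assuming synthetic data: fit on the first half of the cohort and assess discrimination on the held-out second half.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the 2011-patient trial cohort.
X, y = make_classification(n_samples=2011, n_features=12, n_informative=5,
                           random_state=0)
half = len(y) // 2

model = LogisticRegression(max_iter=1000).fit(X[:half], y[:half])  # derivation set
auc = roc_auc_score(y[half:], model.predict_proba(X[half:])[:, 1])  # validation set
print(f"validation-half AUC: {auc:.2f}")
```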


2021
pp. 279-287
Author(s):
Qinyu Chen
Daniel R. Cherry
Vinit Nalawade
Edmund M. Qiao
Abhishek Kumar
...  

PURPOSE Pancreatic cancer is an aggressive malignancy, and patients often experience nonspecific symptoms before diagnosis. This study evaluates a machine learning approach to help identify patients with early-stage pancreatic cancer from clinical data within electronic health records (EHRs). MATERIALS AND METHODS From the Optum deidentified EHR data set, we identified early-stage (n = 3,322) and late-stage (n = 25,908) pancreatic cancer cases in patients over 40 years of age diagnosed between 2009 and 2017. Patients with early-stage pancreatic cancer were matched to noncancer controls (1:16 match). We constructed a prediction model using eXtreme Gradient Boosting (XGBoost) to identify early-stage patients on the basis of 18,220 features within the EHR, including diagnoses, procedures, information within clinical notes, and medications. Model accuracy was assessed with sensitivity, specificity, positive predictive value, and the area under the curve. RESULTS The final predictive model included 582 predictive features from the EHR: 248 (42.5%) physician note elements, 146 (25.0%) procedure codes, 91 (15.6%) diagnosis codes, 89 (15.3%) medications, and 9 (1.5%) demographic features. The final model area under the curve was 0.84. Choosing a model cut point with a sensitivity of 60% and specificity of 90% would enable early detection in 58% of late-stage patients, a median of 24 months before their actual diagnosis. CONCLUSION Prediction models using EHR data show promise for the early detection of pancreatic cancer. Although widespread use of this approach in an unselected population would produce high rates of false-positive tests, the technique may be rapidly impactful if deployed among high-risk patients or paired with other imaging or biomarker screening tools.
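A sketch of choosing a cut point like the one discussed above: train an XGBoost classifier, then walk its ROC curve to the highest threshold index that keeps specificity at or above 90% and read off the corresponding sensitivity. The imbalanced synthetic data stands in for the Optum EHR features, which are not public.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Imbalanced synthetic stand-in (~6% cases) for the matched case-control data.
X, y = make_classification(n_samples=20000, n_features=50, weights=[0.94],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]

fpr, tpr, thr = roc_curve(y_te, proba)
i = np.where(fpr <= 0.10)[0][-1]   # last cut point with specificity >= 90%
print(f"threshold={thr[i]:.3f}  sensitivity={tpr[i]:.2f}  "
      f"specificity={1 - fpr[i]:.2f}")
```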


2017
Vol 38 (8)
pp. 897-905
Author(s):
Yvette H. van Beurden
Marjolein P. M. Hensgens
Olaf M. Dekkers
Saskia Le Cessie
Chris J. J. Mulder
...  

OBJECTIVE Estimating the risk of a complicated course of Clostridium difficile infection (CDI) might help doctors guide treatment. We aimed to validate 3 published prediction models: Hensgens (2014), Na (2015), and Welfare (2011). METHODS The validation cohort comprised 148 patients diagnosed with CDI between May 2013 and March 2014. During this period, 70 endemic cases of CDI occurred, as well as 78 cases related to an outbreak of C. difficile ribotype 027. Model calibration and discrimination were assessed for the 3 prediction rules. RESULTS A complicated course (ie, death, colectomy, or ICU admission due to CDI) was observed in 31 patients (21%), and 23 patients (16%) died within 30 days of CDI diagnosis. The performance of all 3 prediction models was poor when applied to the total validation cohort, with estimated areas under the curve (AUCs) of 0.68 for the Hensgens model, 0.54 for the Na model, and 0.61 for the Welfare model. For patients diagnosed with CDI due to non-outbreak strains, the prediction model developed by Hensgens performed best, with an AUC of 0.78. CONCLUSION All 3 prediction models performed poorly when applied to our total cohort, which included both outbreak and endemic CDI cases. The Hensgens model performed relatively well for patients diagnosed with CDI due to non-outbreak strains and may be useful in endemic settings.
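A sketch of the subgroup validation pattern above, assuming invented data: score the whole cohort with a frozen published risk score, then repeat the discrimination check on the endemic (non-outbreak) subset. The published Hensgens/Na/Welfare scores themselves are not reproduced.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 148                                       # cohort size from the abstract
score = rng.normal(0, 1, n)                   # stand-in for a published risk score
outbreak = rng.random(n) < 78 / 148           # flag outbreak-strain cases
# Synthetic outcomes: complications relate to the score, modified by strain type.
complicated = rng.random(n) < 1 / (1 + np.exp(-(score - 1.3 + 0.8 * outbreak)))

print("total cohort AUC:", round(roc_auc_score(complicated, score), 2))
endemic = ~outbreak
print("endemic-only AUC:",
      round(roc_auc_score(complicated[endemic], score[endemic]), 2))
```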

