Prognostic Nomogram for Liver Metastatic Colon Cancer Based on Histological Type, Tumor Differentiation, and Tumor Deposit: A TRIPOD Compliant Large-Scale Survival Study

2021 ◽  
Vol 11 ◽  
Author(s):  
Le Kuai ◽  
Ying Zhang ◽  
Ying Luo ◽  
Wei Li ◽  
Xiao-dong Li ◽  
...  

Objective: A proportional hazards model was applied to develop a large-scale prognostic model and nomogram incorporating clinicopathological characteristics, histological type, tumor differentiation grade, and tumor deposit count, to provide clinicians and patients diagnosed with colon cancer liver metastases (CLM) a more comprehensive and practical outcome measure. Methods: Following the Transparent Reporting of multivariable prediction models for Individual Prognosis Or Diagnosis (TRIPOD) guidelines, this study identified 14,697 patients diagnosed with CLM from 1975 to 2017 in the Surveillance, Epidemiology, and End Results (SEER) 21-registry database. Patients were divided into a modeling group (n=9,800) and an internal validation group (n=4,897) using computerized randomization. An independent external validation cohort (n=60) was obtained. Univariable and multivariable Cox analyses were performed to identify prognostic predictors for overall survival (OS). The nomogram was then constructed, and verification was undertaken with receiver operating characteristic (ROC) curves, summarized by the area under the curve (AUC), and calibration curves. Results: Histological type, tumor differentiation grade, and tumor deposit count were independent prognostic predictors for CLM. The nomogram consisted of age, sex, primary site, T category, N category, metastasis to bone, brain, or lung, surgery, and chemotherapy. The model achieved excellent predictive power on both internal (mean AUC=0.811) and external validation (mean AUC=0.727), significantly higher than the American Joint Committee on Cancer (AJCC) TNM system. Conclusion: This study proposes a prognostic nomogram, developed according to TRIPOD guidelines, for predicting 1- and 2-year survival based on histopathological and population-based data of CLM patients. Compared with the TNM stage, our nomogram has better consistency and calibration for predicting the OS of CLM patients.
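Discrimination in survival models like this nomogram is usually summarized by the AUC or the closely related Harrell's concordance index, which scores how often the patient with the shorter event-free time also has the higher predicted risk. A minimal pure-Python sketch of the concordance computation on toy data (illustrative only, not the study's code):

```python
from itertools import combinations

def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when the subject with the shorter
    follow-up time experienced the event; the pair is concordant when
    that subject also has the higher predicted risk.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[j] < times[i]:
            i, j = j, i  # order the pair so i has the shorter follow-up
        if times[i] == times[j] or not events[i]:
            continue  # tie, or earlier subject censored: not comparable
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1
        elif risks[i] == risks[j]:
            concordant += 0.5
    return concordant / comparable

# toy data: higher risk score should mean shorter survival
times  = [5, 8, 12, 3, 9]
events = [1, 1, 0, 1, 1]
risks  = [0.9, 0.2, 0.1, 0.95, 0.4]
print(harrell_c(times, events, risks))  # 9 of 10 comparable pairs concordant -> 0.9
```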

2020 ◽  
Author(s):  
Jenna Marie Reps ◽  
Ross Williams ◽  
Seng Chan You ◽  
Thomas Falconer ◽  
Evan Minty ◽  
...  

Abstract Objective: To demonstrate how the Observational Health Data Sciences and Informatics (OHDSI) collaborative network and standardization can be utilized to scale up external validation of patient-level prediction models by enabling validation across a large number of heterogeneous observational healthcare datasets. Materials & Methods: Five previously published prognostic models (ATRIA, CHADS2, CHA2DS2-VASc, Q-Stroke, and Framingham) that predict future risk of stroke in patients with atrial fibrillation were replicated using the OHDSI frameworks. A network study was run that enabled the five models to be externally validated across nine observational healthcare datasets spanning three countries and five independent sites. Results: The five existing models were integrated into the OHDSI framework for patient-level prediction and obtained mean c-statistics ranging from 0.57 to 0.63 across the six databases with sufficient data to predict stroke within 1 year of initial atrial fibrillation diagnosis in females with atrial fibrillation. This was comparable with existing validation studies. The validation network study was run across nine datasets within 60 days once the models were replicated. An R package for the study was published at https://github.com/OHDSI/StudyProtocolSandbox/tree/master/ExistingStrokeRiskExternalValidation. Discussion: This study demonstrates the ability to scale up external validation of patient-level prediction models using a collaboration of researchers and a data standardization that enables models to be readily shared across data sites. External validation is necessary to understand the transportability and reproducibility of a prediction model, but without collaborative approaches it can take three or more years for a model to be validated by even one independent researcher.
Conclusion: In this paper we show that it is possible to both scale up and speed up external validation, demonstrating validation across multiple databases in less than 2 months. We recommend that researchers developing new prediction models use the OHDSI network to externally validate their models.
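Several of the replicated models are simple additive point scores. CHADS2, for instance, has a published definition compact enough to sketch directly; the function below illustrates the scoring rule only, not the OHDSI replication (which runs against standardized observational data):

```python
def chads2(chf, hypertension, age, diabetes, prior_stroke_or_tia):
    """CHADS2 stroke-risk score for atrial fibrillation (0-6).

    One point each for Congestive heart failure, Hypertension,
    Age >= 75, and Diabetes; two points for prior Stroke/TIA.
    """
    score = int(chf) + int(hypertension) + int(age >= 75) + int(diabetes)
    score += 2 * int(prior_stroke_or_tia)
    return score

# a 78-year-old with hypertension and a prior TIA scores 1 + 1 + 2 = 4
print(chads2(chf=False, hypertension=True, age=78, diabetes=False,
             prior_stroke_or_tia=True))
```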




Author(s):  
Jet M. J. Vonk ◽  
Jacoba P. Greving ◽  
Vilmundur Gudnason ◽  
Lenore J. Launer ◽  
Mirjam I. Geerlings

Abstract: We aimed to evaluate the external performance of prediction models for all-cause dementia or Alzheimer's disease (AD) in the general population, which can aid selection of high-risk individuals for clinical trials and prevention. We identified 17 of 36 eligible published prognostic models for external validation in the population-based AGES-Reykjavik Study. Predictive performance was assessed with c statistics and calibration plots. All five models with a c statistic > .75 (.76–.81) contained cognitive testing as a predictor, while the models with lower c statistics (.67–.75) did not. Calibration ranged from good to poor across models, including systematic risk overestimation or overestimation particularly for the highest-risk group. Models that overestimate risk may be acceptable for exclusion purposes but lack the ability to accurately identify individuals at higher dementia risk. Both updating existing models and developing new models aimed at identifying high-risk individuals, as well as more external validation studies of dementia prediction models, are warranted.
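The calibration plots referred to here compare predicted risk with observed event rates across ordered risk groups; systematic overestimation shows up as predicted > observed in every group. A minimal sketch of that grouping step on toy data (illustrative only):

```python
def calibration_by_group(predicted, observed, n_groups=10):
    """Mean predicted risk vs. observed event rate per risk group.

    Subjects are sorted by predicted risk and split into equal-sized
    groups (deciles in a typical calibration plot); any remainder
    beyond a full group is dropped in this toy version.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        idx = order[g * size:(g + 1) * size]
        mean_pred = sum(predicted[i] for i in idx) / len(idx)
        obs_rate = sum(observed[i] for i in idx) / len(idx)
        rows.append((mean_pred, obs_rate))
    return rows

# toy predictions and 0/1 outcomes for eight subjects
predicted = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.70, 0.90]
observed  = [0,    0,    0,    1,    0,    0,    1,    1]
for mean_pred, obs_rate in calibration_by_group(predicted, observed, n_groups=2):
    print(f"predicted {mean_pred:.2f} vs observed {obs_rate:.2f}")
```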


2021 ◽  
Vol 49 (5) ◽  
pp. 030006052110150
Author(s):  
Shuanhu Wang ◽  
Yakui Liu ◽  
Yi Shi ◽  
Jiajia Guan ◽  
Mulin Liu ◽  
...  

Objective To develop and externally validate a prognostic nomogram to predict overall survival (OS) in patients with resectable colon cancer. Methods Data for 50,996 patients diagnosed with non-metastatic colon cancer were retrieved from the Surveillance, Epidemiology, and End Results (SEER) database. Patients were randomly assigned to the training set (n = 34,168) or validation set (n = 16,828). Independent prognostic factors were identified by multivariate Cox proportional hazards regression analysis and used to construct the nomogram. Harrell’s C-index and calibration plots were calculated using the SEER validation set. Additional external validation was performed using a Chinese dataset (n = 342). Results Harrell’s C-index of the nomogram for OS in the SEER validation set was 0.71, superior to that of the 7th edition of the American Joint Committee on Cancer TNM staging system (0.59). Calibration plots showed consistency between actual observations and predicted 1-, 3-, and 5-year survival. Harrell’s C-index (0.72) and the calibration plot showed excellent predictive accuracy in the external validation set. Conclusions We developed a nomogram to predict OS after curative resection for colon cancer. Validation using the SEER and external datasets revealed good discrimination and calibration. This nomogram may help predict individual survival in patients with colon cancer.
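A nomogram lays out each predictor's contribution to the Cox linear predictor on a shared 0–100 points axis, with the most influential predictor spanning the full axis. A sketch of that rescaling step, using hypothetical coefficient contributions (not the study's fitted values):

```python
def nomogram_points(contributions):
    """Rescale Cox-model contributions to a 0-100 nomogram points axis.

    contributions maps each predictor to the largest absolute change
    |beta * (x_max - x_min)| it can produce in the linear predictor.
    The widest predictor spans 100 points; the rest scale in
    proportion, which is how the axes of a printed nomogram are laid out.
    """
    widest = max(contributions.values())
    return {name: round(100.0 * span / widest, 1)
            for name, span in contributions.items()}

# hypothetical contributions to the linear predictor (illustrative values)
contrib = {"T category": 0.8, "N category": 1.2, "age": 0.6, "grade": 0.4}
print(nomogram_points(contrib))
```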




2020 ◽  
Author(s):  
Robert Minařík ◽  
Daniel Žížala ◽  
Anna Juřicová

<p>Legacy soil data arising from traditional soil surveys are an important resource for digital soil mapping. In the Czech Republic, a large-scale (1:10,000) mapping of agricultural land was completed in 1970 after a decade of field investigation. It represents a database of soil samples that is unique worldwide in its national extent and detail. This study aimed to create detailed maps of soil properties (organic carbon, pH, texture, soil unit) using state-of-the-art digital soil mapping (DSM) methods. For this purpose we chose four geomorphologically different areas (2,440 km<sup>2</sup> in total). Selected ensemble machine learning techniques based on bagging, boosting, and stacking, with random hyperparameter tuning, were used to model each soil property. In addition to the soil sample data, a DEM and its derivatives were used as common covariate layers. The models were evaluated using both internal repeated cross-validation and external validation, and the best model was used for prediction of each soil property. The accuracy of the prediction models is comparable with other studies. The resulting maps were also compared with the available original soil maps of the Czech Republic. The new maps reveal more spatial detail and natural variability of soil properties resulting from the use of the DEM. This combination of highly detailed legacy data with DSM results in the production of more spatially detailed and accurate maps, which may be particularly beneficial in supporting the decision-making of stakeholders.</p><p>The research has been supported by project no. QK1820389, "Production of actual detailed maps of soil properties in the Czech Republic based on database of Large-scale Mapping of Agricultural Soils in Czechoslovakia and application of digital soil mapping", funded by the Ministry of Agriculture of the Czech Republic.</p>
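The internal evaluation scheme mentioned above, repeated cross-validation, reshuffles the data and repeats a k-fold split several times so every sample is tested more than once. A minimal index-generating sketch (illustrative; the study's actual tooling is not shown here):

```python
import random

def repeated_kfold(n_samples, k=5, repeats=3, seed=42):
    """Yield (train, test) index lists for repeated k-fold cross-validation.

    Each repeat reshuffles the indices and splits them into k folds, so
    every sample lands in a test fold exactly once per repeat.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(repeats):
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        for test in folds:
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield sorted(train), sorted(test)

splits = list(repeated_kfold(n_samples=20, k=5, repeats=2))
print(len(splits))  # 5 folds x 2 repeats = 10 splits
```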


2021 ◽  
Author(s):  
Benjamin S. Wessler ◽  
Jason Nelson ◽  
Jinny G. Park ◽  
Hannah McGinnes ◽  
Gaurav Gulati ◽  
...  

Abstract Background: There are many clinical prediction models (CPMs) available to inform treatment decisions for patients with cardiovascular disease. However, the extent to which they have been externally tested, and how well they generally perform, has not been broadly evaluated. Methods: A SCOPUS citation search was run on March 22, 2017 to identify external validations of cardiovascular CPMs in the Tufts PACE CPM Registry. We assessed the extent of external validation and performance heterogeneity across databases, and explored factors associated with model performance, including a global assessment of the clinical relatedness between the derivation and validation data. Results: 2,030 external validations of 1,382 CPMs were identified. 807 (58%) of the CPMs in the Registry have never been externally validated. On average there were 1.5 validations per CPM (range 0–94). The median external validation AUC was 0.73 (25th–75th percentile [IQR] 0.66, 0.79), representing a median percent decrease in discrimination of −11.1% (IQR −32.4%, +2.7%) compared with performance on the derivation data. 81% (n = 1,333) of validations reporting an AUC showed discrimination below that reported in the derivation dataset. 53% (n = 983) of the validations reported some measure of CPM calibration. For CPMs evaluated more than once, there was typically a large range of performance. Of 1,702 validations classified by relatedness, the median percent change in discrimination was −3.7% (IQR −13.2%, +3.1%) for ‘closely related’ validations (n = 123), −9.0% (IQR −27.6%, +3.9%) for ‘related’ validations (n = 862), and −17.2% (IQR −42.3%, 0%) for ‘distantly related’ validations (n = 717) (p < 0.001). Conclusion: Many published cardiovascular CPMs have never been externally validated, and for those that have, apparent performance during development is often overly optimistic. A single external validation appears insufficient to broadly understand the performance heterogeneity across different settings.
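The registry's headline numbers are medians and IQRs of per-validation percent changes in AUC. The sketch below computes such a summary for hypothetical (derivation, validation) AUC pairs; the plain relative change used here is an assumption, since the registry's exact definition of "percent change in discrimination" is not stated in the abstract:

```python
import statistics

def summarize_auc_changes(pairs):
    """Median and IQR of percent change in AUC from derivation to validation.

    Each pair is (derivation_auc, validation_auc). Plain relative change
    is one plausible convention -- an assumption, not the registry's
    documented definition.
    """
    changes = [100.0 * (v / d - 1.0) for d, v in pairs]
    q1, median, q3 = statistics.quantiles(changes, n=4)
    return median, (q1, q3)

# hypothetical (derivation AUC, external validation AUC) pairs
pairs = [(0.80, 0.73), (0.75, 0.70), (0.82, 0.66), (0.70, 0.72), (0.78, 0.60)]
median, (q1, q3) = summarize_auc_changes(pairs)
print(f"median {median:.1f}% (IQR {q1:.1f}%, {q3:.1f}%)")
```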


2021 ◽  
Author(s):  
Sara Khalid ◽  
Cynthia Yang ◽  
Clair Blacketer ◽  
Talita Duarte-Salles ◽  
Sergio Fernández-Bertolín ◽  
...  

Background and Objective: In response to the ongoing COVID-19 pandemic, several prediction models have been rapidly developed with the aim of providing evidence-based guidance. However, no COVID-19 prediction model in the existing literature has been found to be reliable. Models are commonly assessed to have a risk of bias, often due to insufficient reporting, use of non-representative data, and lack of large-scale external validation. In this paper, we present the Observational Health Data Sciences and Informatics (OHDSI) analytics pipeline for patient-level prediction as a standardized approach for rapid yet reliable development and validation of prediction models. We demonstrate how our analytics pipeline and open-source software can be used to answer important prediction questions while limiting potential causes of bias (e.g., by validating phenotypes, specifying the target population, performing large-scale external validation, and publicly providing all analytical source code). Methods: We show step by step how to implement the pipeline for the question: ‘In patients hospitalized with COVID-19, what is the risk of death 0 to 30 days after hospitalization?’ We develop models using six different machine learning methods in a US claims database containing over 20,000 COVID-19 hospitalizations and externally validate the models using data containing over 45,000 COVID-19 hospitalizations from South Korea, Spain, and the US. Results: Our open-source tools enabled us to go efficiently end-to-end from problem design to reliable model development and evaluation. When predicting death in patients hospitalized for COVID-19, AdaBoost, random forest, gradient boosting machine, and decision tree yielded similar or lower internal and external validation discrimination performance compared with L1-regularized logistic regression, whereas the MLP neural network consistently resulted in lower discrimination.
The L1-regularized logistic regression models were well calibrated. Conclusion: Our results show that following the OHDSI analytics pipeline for patient-level prediction can enable rapid development of reliable prediction models. The OHDSI tools and pipeline are open source and available to researchers around the world.
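L1-regularized logistic regression, the reference model the other learners struggled to beat, can be fit with proximal gradient descent (gradient steps on the log-loss followed by soft-thresholding of the weights). The toy sketch below is illustrative only; the actual OHDSI pipeline uses its own large-scale implementation rather than this optimizer:

```python
import math
import random

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty (shrinks w toward zero by t)."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_l1_logistic(X, y, lam=0.1, lr=0.1, epochs=500):
    """L1-regularized logistic regression via proximal gradient descent.

    Full-batch gradient steps on the log-loss, then soft-thresholding
    of the weights (ISTA). The intercept is left unpenalized, as is
    conventional.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            logit = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            for j in range(d):
                grad_w[j] += (p - yi) * xi[j] / n
            grad_b += (p - yi) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# toy data: feature 0 determines the label, feature 1 is pure noise,
# so the L1 penalty should drive the second weight toward zero
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if x0 > 0 else 0 for x0, _ in X]
w, b = fit_l1_logistic(X, y, lam=0.05)
print(w, b)
```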


2021 ◽  
Vol 11 ◽  
Author(s):  
Chuan Liu ◽  
Chuan Hu ◽  
Jiale Huang ◽  
Kanghui Xiang ◽  
Zhi Li ◽  
...  

Background: Among colon cancer patients, liver metastasis is a common and deadly event, but there are few prognostic models for these patients. Methods: The clinicopathologic data of colon cancer with liver metastasis (CCLM) patients were downloaded from the Surveillance, Epidemiology, and End Results (SEER) database. All patients were randomly divided into training and internal validation sets at a ratio of 7:3. A prognostic nomogram was established with Cox analysis in the training set and validated with two independent validation sets. Results: A total of 5,700 CCLM patients were included. Age, race, tumor size, tumor site, histological type, grade, AJCC N status, carcinoembryonic antigen (CEA), lung metastasis, bone metastasis, surgery, and chemotherapy were independently associated with the overall survival (OS) of CCLM in the training set and were used to establish the nomogram. The 1-, 2-, and 3-year AUCs were at or above 0.700 in the training, internal validation, and external validation sets, indicating the favorable performance of our nomogram. Moreover, in both overall and subgroup analyses, the risk score calculated by this nomogram divided CCLM patients into high-, middle-, and low-risk groups, suggesting that the nomogram can significantly distinguish patients with different prognoses and is suitable for different patients. Conclusion: Older age, Black race, larger tumor size, higher grade, histological type of mucinous adenocarcinoma or signet ring cell carcinoma, higher N stage, RCC, lung metastasis, bone metastasis, absence of surgery, absence of chemotherapy, and elevated CEA were independently associated with poor prognosis in CCLM patients. A nomogram incorporating these variables could accurately predict the prognosis of CCLM.
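Splitting patients into low-, middle-, and high-risk groups by nomogram score is typically done with empirical quantile cut points. A sketch of tertile assignment (the authors' actual cut points are not reported in the abstract):

```python
def risk_groups(scores, labels=("low", "middle", "high")):
    """Assign each patient to a risk tertile based on nomogram score.

    Cut points are the 1/3 and 2/3 empirical quantiles of the scores,
    one common way to form three roughly equal-sized risk groups.
    """
    ranked = sorted(scores)
    n = len(ranked)
    cut1, cut2 = ranked[n // 3], ranked[2 * n // 3]
    groups = []
    for s in scores:
        if s < cut1:
            groups.append(labels[0])
        elif s < cut2:
            groups.append(labels[1])
        else:
            groups.append(labels[2])
    return groups

# hypothetical nomogram scores for nine patients
scores = [12, 45, 33, 78, 51, 9, 62, 27, 90]
print(risk_groups(scores))
```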

