Value of External Data in the Extrapolation of Survival Data: A Study Using the NJR Data Set

2018 ◽  
Vol 21 (7) ◽  
pp. 822-829 ◽  
Author(s):  
Mark Pennington ◽  
Richard Grieve ◽  
Jan Van der Meulen ◽  
Neil Hawkins
Keyword(s):  
Data Set ◽


2019 ◽  
Vol 39 (8) ◽  
pp. 926-938
Author(s):  
Adrian Vickers

Objectives. Uncertainty in survival prediction beyond trial follow-up is highly influential in cost-effectiveness analyses of oncology products. This research provides an empirical evaluation of the accuracy of alternative methods and recommendations for their implementation. Methods. Mature (15-year) survival data were reconstructed from a published database study for “no treatment,” radiotherapy, surgery plus radiotherapy, and surgery in early stage non–small cell lung cancer in an elderly patient population. Censored data sets were created from these data to simulate immature trial data (for 1- to 10-year follow-up). A second data set with mature (9-year) survival data for no treatment was used to extrapolate the predictions from models fitted to the first data set. Six methodological approaches were used to fit models to the simulated data and extrapolate beyond trial follow-up. Model performance was evaluated by comparing the relative difference in mean survival estimates and the absolute error in the difference in mean survival v. the control with those from the original mature survival data set. Results. Model performance depended on the treatment comparison scenario. All models performed reasonably well when there was a small short-term treatment effect, with the Bayesian model coping better with shorter follow-up times. However, in other scenarios, the most flexible Bayesian model that could be estimated in practice appeared to fit the data less well than the models that used the external data separately. Where there was a large treatment effect (hazard ratio = 0.4), models that used external data separately performed best. Conclusions. Models that directly use mature external data can improve the accuracy of survival predictions. Recommendations on modeling strategies are made for different treatment benefit scenarios.
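The censoring-and-extrapolation setup can be made concrete with a minimal sketch. Under an exponential model (the simplest of the parametric families typically compared in such extrapolation studies, and an illustrative assumption here, not one of the paper's Bayesian or external-data models), the maximum-likelihood hazard from right-censored follow-up data has a closed form, and mean survival extrapolates beyond follow-up as 1/λ:

```python
import math

def exp_mle_hazard(times, events):
    """MLE hazard for an exponential survival model with right censoring:
    lambda_hat = (# events) / (total observed time at risk)."""
    return sum(events) / sum(times)

def mean_survival(hazard, horizon=None):
    """Extrapolated mean survival. With no horizon, E[T] = 1/lambda;
    with a horizon T, the restricted mean is (1 - exp(-lambda*T)) / lambda."""
    if horizon is None:
        return 1.0 / hazard
    return (1.0 - math.exp(-hazard * horizon)) / hazard

# Simulated "immature trial" follow-up: times in years, 1 = death, 0 = censored
times = [2.0, 3.0, 5.0, 7.0, 10.0]
events = [1, 1, 0, 1, 0]

lam = exp_mle_hazard(times, events)       # 3 events / 27 person-years
print(round(mean_survival(lam), 2))       # unrestricted mean survival
print(round(mean_survival(lam, 10.0), 2)) # mean restricted to a 10-year horizon
```

Comparing the unrestricted and restricted means shows how much of the estimated benefit lives in the extrapolated tail, which is exactly the quantity the follow-up-truncation scenarios above stress-test.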


2003 ◽  
Vol 42 (05) ◽  
pp. 564-571 ◽  
Author(s):  
M. Schumacher ◽  
E. Graf ◽  
T. Gerds

Summary Objectives: There is a lack of generally applicable tools for the assessment of predictions for survival data. Prediction error curves based on the Brier score, which have been suggested as a sensible approach, are illustrated by means of a case study. Methods: The concept of predictions made in terms of conditional survival probabilities given the patient’s covariates is introduced. Such predictions are derived from various statistical models for survival data, including artificial neural networks. The idea of how the prediction error of a prognostic classification scheme can be followed over time is illustrated with the data of two studies on the prognosis of node-positive breast cancer patients, one of them serving as an independent test data set. Results and Conclusions: The Brier score as a function of time is shown to be a valuable tool for assessing the predictive performance of prognostic classification schemes for survival data incorporating censored observations. Comparison with the prediction based on the pooled Kaplan-Meier estimator yields a benchmark value for any classification scheme incorporating patients’ covariate measurements. The problem of an overoptimistic assessment of prediction error caused by data-driven modelling, as is done, for example, with artificial neural nets, can be circumvented by an assessment in an independent test data set.
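The Brier score at a fixed time point, and the pooled Kaplan-Meier benchmark, can be sketched in a few lines. The sketch below deliberately ignores censoring for clarity (the censored case uses inverse-probability-of-censoring weighting, which this toy example omits); without censoring, the pooled Kaplan-Meier survival probability is just the fraction of patients surviving past the time point:

```python
def brier_at(t, times, predictions):
    """Brier score at time t for uncensored data: mean squared difference
    between the survival indicator I(T > t) and each patient's predicted
    probability of surviving past t."""
    outcomes = [1.0 if T > t else 0.0 for T in times]
    return sum((y - p) ** 2 for y, p in zip(outcomes, predictions)) / len(times)

times = [1.0, 3.0, 5.0, 7.0]   # observed event times (no censoring)
t_star = 4.0

# Benchmark: the pooled Kaplan-Meier prediction ignores covariates; without
# censoring it is simply the fraction of patients surviving past t_star.
km_pool = sum(1 for T in times if T > t_star) / len(times)
bs_benchmark = brier_at(t_star, times, [km_pool] * len(times))

# A covariate-based model predicting individual survival probabilities
model_preds = [0.1, 0.2, 0.9, 0.8]
bs_model = brier_at(t_star, times, model_preds)

print(bs_benchmark, bs_model)   # a useful model should beat the benchmark
```

Tracing the two scores over a grid of time points yields the prediction error curves described above.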


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Peter W. Eide ◽  
Seyed H. Moosavi ◽  
Ina A. Eilertsen ◽  
Tuva H. Brunsell ◽  
Jonas Langerud ◽  
...  

Abstract Gene expression-based subtypes of colorectal cancer have clinical relevance, but the representativeness of primary tumors and the consensus molecular subtypes (CMS) for metastatic cancers is not well known. We investigated the metastatic heterogeneity of CMS. The best approach to subtype translation was delineated by comparisons of transcriptomic profiles from 317 primary tumors and 295 liver metastases, including multi-metastatic samples from 45 patients and 14 primary-metastasis sets. Associations were validated in an external data set (n = 618). Projection of metastases onto principal components of primary tumors showed that metastases were depleted of CMS1-immune/CMS3-metabolic signals, enriched for CMS4-mesenchymal/stromal signals, and heavily influenced by the microenvironment. The tailored CMS classifier (available in an updated version of the R package CMScaller) therefore implemented an approach to regress out the liver tissue background. The majority of classified metastases were either CMS2 or CMS4. Nonetheless, subtype switching and inter-metastatic CMS heterogeneity were frequent and increased with sampling intensity. Poor-prognostic value of CMS1/3 metastases was consistent in the context of intra-patient tumor heterogeneity.


Circulation ◽  
2015 ◽  
Vol 132 (suppl_3) ◽  
Author(s):  
Luca Marengo ◽  
Wolfgang Ummenhofer ◽  
Pascal Gerster ◽  
Falko Harm ◽  
Marc Lüthy ◽  
...  

Introduction: Agonal respiration has been shown to be commonly associated with witnessed events, ventricular fibrillation, and increased survival during out-of-hospital cardiac arrest. There is little information on the incidence of gasping in in-hospital cardiac arrest (IHCA). Our “Rapid Response Team” (RRT) missions were monitored between December 2010 and March 2015, and the prevalence of gasping and survival data for IHCA were investigated. Methods: A standardized extended in-hospital Utstein data set of all RRT interventions occurring at the University Hospital Basel, Switzerland, from December 13, 2010 until March 31, 2015 was consecutively collected and recorded in Microsoft Excel (Microsoft Corp., USA). Data were analyzed using IBM SPSS Statistics 22.0 (IBM Corp., USA) and are presented as descriptive statistics. Results: The RRT was activated for 636 patients, 459 of whom had a life-threatening status (72%; 33 missing). 270 patients (59%) suffered IHCA. Ventricular fibrillation or pulseless ventricular tachycardia occurred in 42 patients (16% of CA) and was associated with improved return of spontaneous circulation (ROSC) (36 (97%) vs. 143 (67%; p<0.001)), hospital discharge (25 (68%) vs. 48 (23%; p<0.001)), and discharge with good neurological outcome (Cerebral Performance Category (CPC) of 1 or 2: 21 (55%) vs. 41 (19%; p<0.001)). Gasping was seen in 128 patients (57% of CA; 46 missing) and was associated with overall improved ROSC (99 (78%) vs. 55 (59%; p=0.003)). In CAs occurring on the ward (154, 57% of all CAs), gasping was associated with a higher proportion of shockable rhythms (11 (16%) vs. 2 (3%; p=0.019)), improved ROSC (62 (90%) vs. 34 (55%; p<0.001)), and hospital discharge (21 (32%) vs. 7 (11%; p=0.006)). Gasping was not associated with neurological outcome. Conclusions: Gasping was frequently observed accompanying IHCA. Faster access to in-hospital patients is probably the reason for the higher prevalence compared with the prehospital setting. For CA on the ward without continuous monitoring, gasping correlated with increased shockable rhythms, ROSC, and hospital discharge.


Risks ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 204
Author(s):  
Chamay Kruger ◽  
Willem Daniel Schutte ◽  
Tanja Verster

This paper proposes a methodology that utilises model performance as a metric to assess the representativeness of external or pooled data when it is used by banks in regulatory model development and calibration. There is currently no formal methodology to assess representativeness. The paper provides a review of existing regulatory literature on the requirements of assessing representativeness and emphasises that both qualitative and quantitative aspects need to be considered. We present a novel methodology, apply it to two case studies, and compare it with the Multivariate Prediction Accuracy Index. The first case study investigates whether a pooled data source from Global Credit Data (GCD) is representative when considering the enrichment of internal data with pooled data in the development of a regulatory loss given default (LGD) model. The second case study differs from the first by illustrating which other countries in the pooled data set could be representative when enriching internal data during the development of an LGD model. Using these case studies as examples, our proposed methodology provides users with a generalised framework to identify subsets of the external data that are representative of their country’s or bank’s data, making the results general and universally applicable.
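The core idea, using model performance as the representativeness metric, can be sketched as follows. The sketch assumes an already-fitted scorecard (scores are taken as given), measures discrimination by AUC, and flags an external subset as representative when its AUC sits within a tolerance of the internal benchmark; the 0.10 tolerance and the choice of AUC are illustrative assumptions, not the paper's full methodology:

```python
def auc(scores, labels):
    """Probability that a randomly chosen default (label 1) scores higher
    than a randomly chosen non-default (label 0); ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def representative(internal, subset, tol=0.10):
    """Flag a subset as representative if its discrimination under the
    internal model is close to the internal benchmark."""
    return abs(auc(*internal) - auc(*subset)) <= tol

# (scores, labels) for the internal portfolio and two external country subsets
internal  = ([0.9, 0.8, 0.3, 0.2],  [1, 1, 0, 0])
country_a = ([0.85, 0.7, 0.4, 0.1], [1, 1, 0, 0])  # scores rank like internal
country_b = ([0.9, 0.8, 0.3, 0.2],  [0, 1, 1, 0])  # scores are uninformative

print(representative(internal, country_a))   # True
print(representative(internal, country_b))   # False
```

The same loop over every candidate country subset yields the kind of subset-selection result described in the second case study.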


2021 ◽  
Author(s):  
Shenglan Li ◽  
Zhuang Kang ◽  
Jinyi Chen ◽  
Can Wang ◽  
Zehao Cai ◽  
...  

Abstract Background Medulloblastoma is a common intracranial tumor among children. In recent years, research on the cancer genome has established four distinct subtypes of medulloblastoma: WNT, SHH, Group 3, and Group 4. Each subtype has its own transcriptional profile and methylation changes, and treatment and prognosis vary by subtype. Methods Based on the methylation data of medulloblastoma samples, methylCIBERSORT was used to evaluate the level of immune cell infiltration, identifying 10 immune cell types whose infiltration differed between subtypes. Combined with the immune database, 293 immune-related differentially expressed genes (Imm-DEGs) were screened. The Imm-DEGs were used to construct a co-expression network, and the key modules related to the level of differential immune cell infiltration were identified. Three immune hub genes (GAB1, ABL1, CXCR4) were identified according to gene connectivity and correlation with phenotype in the key modules, as well as the PPI network involving the module genes. Results Subtype markers were recognized according to the immune hub genes and verified in an external data set. The methylation levels of the immune hub genes were compared across subtypes, tissue microarrays were used for immunohistochemical verification, and a multi-factor regulatory network of the hub genes was constructed. Conclusions Identifying subtype markers helps to accurately assign medulloblastoma patients to subtypes and to evaluate treatment and prognosis, so as to improve the overall survival of patients.
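Hub-gene selection by connectivity can be illustrated with a toy co-expression network, a simplified stand-in for the module analysis described above: each gene's connectivity is the sum of its absolute correlations with the other genes in the module, and the most connected genes are candidate hubs. The gene names are reused from the abstract but the profiles below are invented for illustration:

```python
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def connectivity(expr):
    """Sum of absolute pairwise correlations for each gene in a module."""
    return {g: sum(abs(pearson(expr[g], expr[h])) for h in expr if h != g)
            for g in expr}

# Toy profiles over five samples (hypothetical values, not real data)
expr = {
    "GAB1":   [1, 2, 3, 4, 5],
    "ABL1":   [2, 4, 6, 8, 10],  # tightly co-expressed with GAB1
    "CXCR4":  [1, 2, 3, 4, 6],   # strongly co-expressed as well
    "GENE_X": [5, 1, 4, 2, 3],   # weakly connected: not a hub candidate
}
conn = connectivity(expr)
hub = max(conn, key=conn.get)
print(hub, min(conn, key=conn.get))  # a tightly co-expressed gene wins; GENE_X loses
```

In the full analysis this ranking is combined with phenotype correlation and PPI evidence before a gene is declared a hub.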


Author(s):  
Sven Fuchs ◽  
Graeme Beardsmore ◽  
Paolo Chiozzi ◽  
Orlando Miguel Espinoza-Ojeda ◽  
Gianluca Gola ◽  
...  

Periodic revisions of the Global Heat Flow Database (GHFD) take place under the auspices of the International Heat Flow Commission (IHFC) of the International Association of Seismology and Physics of the Earth's Interior (IASPEI). A growing number of heat-flow values, advances in scientific methods, digitization, and improvements in database technologies all warrant a revision of the structure of the GHFD that was last amended in 1976. We present a new structure for the GHFD, which will provide a basis for a reassessment and revision of the existing global heat-flow data set. The database fields within the new structure are described in detail to ensure a common understanding of the respective database entries. The new structure of the database takes advantage of today's possibilities for data management. It supports FAIR and open data principles, including interoperability with external data services, and links to DOI and IGSN numbers and other data resources (e.g., world geological map, world stratigraphic system, and International Ocean Drilling Program data). Aligned with this publication, a restructured version of the existing database is published, which provides a starting point for the upcoming collaborative process of data screening, quality control and revision. In parallel, the IHFC will work on criteria for a new quality scheme that will allow future users of the database to evaluate the quality of the collated heat-flow data based on specific criteria.


2019 ◽  
Vol 8 (2) ◽  
pp. 231-263 ◽  
Author(s):  
Richard Valliant

Abstract Three approaches to estimation from nonprobability samples are quasi-randomization, superpopulation modeling, and doubly robust estimation. In the first, the sample is treated as if it were obtained via a probability mechanism, but unlike in probability sampling, that mechanism is unknown. Pseudo selection probabilities of being in the sample are estimated by using the sample in combination with some external data set that covers the desired population. In the superpopulation approach, observed values of analysis variables are treated as if they had been generated by some model. The model is estimated from the sample and, along with external population control data, is used to project the sample to the population. The specific techniques are the same or similar to ones commonly employed for estimation from probability samples and include binary regression, regression trees, and calibration. When quasi-randomization and superpopulation modeling are combined, this is referred to as doubly robust estimation. This article reviews some of the estimation options and compares them in a series of simulation studies.
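The quasi-randomization idea can be sketched in a few lines. The review above estimates pseudo-inclusion probabilities with binary regression or trees; the sketch below uses the simplest possible version, a single categorical covariate, where the estimated probability of appearing in the nonprobability sample is the sample count over the known population count within each cell, and the pseudo-weight is its inverse:

```python
def pseudo_weights(sample_counts, population_counts):
    """Pseudo-inclusion probability per cell = n_sample / N_population;
    pseudo-weight = 1 / probability = N_population / n_sample."""
    return {cell: population_counts[cell] / n
            for cell, n in sample_counts.items()}

def weighted_mean(cell_means, sample_counts, weights):
    """Population estimate: weight each cell's sample mean by its
    total pseudo-weight."""
    num = sum(sample_counts[c] * weights[c] * cell_means[c] for c in cell_means)
    den = sum(sample_counts[c] * weights[c] for c in cell_means)
    return num / den

# Known population sizes per age cell vs. who turned up in a volunteer sample
population_counts = {"young": 600, "old": 400}
sample_counts     = {"young": 60,  "old": 20}   # old people underrepresented
cell_means        = {"young": 2.0, "old": 4.0}  # sample means of the outcome

w = pseudo_weights(sample_counts, population_counts)
print(weighted_mean(cell_means, sample_counts, w))  # pseudo-weighted estimate
# The naive unweighted sample mean, (60*2 + 20*4) / 80 = 2.5, understates the
# outcome because the high-outcome "old" cell is undersampled.
```

This cell-based version is equivalent to fitting the selection model with a saturated binary regression on the single covariate; richer covariates require the regression, tree, or calibration machinery the article reviews.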


2020 ◽  
Vol 189 (11) ◽  
pp. 1408-1411 ◽  
Author(s):  
Stephen R Cole ◽  
Jessie K Edwards ◽  
Ashley I Naimi ◽  
Alvaro Muñoz

Abstract The Kaplan-Meier (KM) estimator of the survival function imputes event times for right-censored and left-truncated observations, but these imputations are hidden and therefore sometimes unrecognized by applied health scientists. Using a simple example data set and the redistribution algorithm, we illustrate how imputations are made by the KM estimator. We also discuss the assumptions necessary for valid analyses of survival data. Illustrating imputations hidden by the KM estimator helps to clarify these assumptions and therefore may reduce inappropriate inferences.
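The redistribution algorithm that exposes these hidden imputations can be written in a few lines. Each observation starts with mass 1/n; processing times in order, a censored observation's mass is split equally among all later observations, which is exactly the imputation the KM estimator performs implicitly. A minimal sketch on a four-observation example:

```python
def km_by_redistribution(times, events):
    """Kaplan-Meier survival curve via redistribute-to-the-right.
    times: observation times; events: 1 = event, 0 = right-censored.
    Returns [(event_time, S(t))] pairs."""
    n = len(times)
    data = sorted(zip(times, events))
    mass = [1.0 / n] * n
    for i, (t, e) in enumerate(data):
        if e == 0:                      # censored: push its mass to the right
            later = [j for j in range(n) if data[j][0] > t]
            if later:
                share = mass[i] / len(later)
                for j in later:
                    mass[j] += share
            mass[i] = 0.0               # mass after the last time is undefined
    curve, s = [], 1.0
    for i, (t, e) in enumerate(data):
        if e == 1:                      # survival drops only at event times
            s -= mass[i]
            curve.append((t, s))
    return curve

# Four observations: event at 1, censored at 2, events at 3 and 4
print(km_by_redistribution([1, 2, 3, 4], [1, 0, 1, 1]))
# Matches the product-limit form: S(1) = 3/4, S(3) = 3/8, S(4) = 0
```

The censored observation at time 2 donates half its 1/4 mass to each of times 3 and 4, which is why the drop at time 3 is 3/8 rather than 1/4: the KM estimator has silently imputed the censored patient's event time into the future.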


2017 ◽  
Vol 32 (5) ◽  
pp. 752-770 ◽  
Author(s):  
Tuomas Huikkola ◽  
Marko Kohtamäki

Purpose Drawing on the resource-based view of the firm, this study aims to analyze solution providers’ strategic capabilities that facilitate above-average returns. Design/methodology/approach The study applies a qualitative comparative case method. In addition to an extensive set of secondary data, the results are based on interviews with 35 executives from nine leading industrial solution providers, their strategic customers and suppliers. The analyzed solution providers were identified based on quantitative survey data. Findings By observing six distinctive resources and three strategic business processes, the present study identifies seven strategic capabilities that occur in different phases of solution development and deployment: fleet management capability, technology-development capability, mergers and acquisitions capability, value quantifying capability, project management capability, supplier network management capability and value co-creation capability. Research limitations/implications The study develops a generic model for the strategic capabilities of servitization. Application of the developed model to different contexts would further validate and enhance it. Practical implications Managers can use the developed model to benchmark, identify, build and manage solution providers’ strategic capabilities and associated practices. Originality/value The study develops a valuable conceptual model based on the comparative case data. Case firms were selected for the study based on a representative quantitative data set. The results were verified and triangulated with external data.

