Claims-Based Approach to Predict Cause-Specific Survival in Men With Prostate Cancer

2019 ◽  
pp. 1-7 ◽  
Author(s):  
Paul Riviere ◽  
Christopher Tokeshi ◽  
Jiayi Hou ◽  
Vinit Nalawade ◽  
Reith Sarkar ◽  
...  

PURPOSE Treatment decisions about localized prostate cancer depend on accurate estimation of the patient’s life expectancy. Current cancer and noncancer survival models use a limited number of predefined variables, which could restrict their predictive capability. We explored a technique to create more comprehensive survival prediction models using insurance claims data from a large administrative data set. These data contain substantial information about medical diagnoses and procedures, and thus may provide a broader reflection of each patient’s health. METHODS We identified 57,011 Medicare beneficiaries with localized prostate cancer diagnosed between 2004 and 2009. We constructed separate cancer survival and noncancer survival prediction models using a training data set and assessed performance on a test data set. Potential model inputs included clinical and demographic covariates, and 8,971 distinct insurance claim codes describing comorbid diseases, procedures, surgeries, and diagnostic tests. We used a least absolute shrinkage and selection operator technique to identify predictive variables in the final survival models. Each model’s predictive capacity was compared with existing survival models with a metric of explained randomness (ρ²) ranging from 0 to 1, with 1 indicating an ideal prediction. RESULTS Our noncancer survival model included 143 covariates and had improved survival prediction (ρ² = 0.60) compared with the Charlson comorbidity index (ρ² = 0.26) and Elixhauser comorbidity index (ρ² = 0.26). Our cancer-specific survival model included nine covariates, and had similar survival predictions (ρ² = 0.71) to the Memorial Sloan Kettering prediction model (ρ² = 0.68). CONCLUSION Survival prediction models using high-dimensional variable selection techniques applied to claims data show promise, particularly with noncancer survival prediction. After further validation, these analyses could inform clinical decisions for men with prostate cancer.
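The variable-selection step in this abstract relies on the least absolute shrinkage and selection operator (LASSO). The sketch below illustrates only the core mechanism, the soft-thresholding update that drives weak coefficients to exactly zero. It is a minimal illustration under stated assumptions, not the authors' implementation: it uses a least-squares loss in place of a survival likelihood, and the function names, the toy design matrix, and the penalty value `lam` are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumption: a least-squares loss stands in for the survival likelihood
    that a real survival model would use.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Mean squared value of each column (the curvature of the loss per coordinate).
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes j's own effect.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data (hypothetical): the outcome depends strongly on column 0, weakly on column 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.05, 1.95, -1.95, -2.05]  # equals 2.0 * col0 + 0.05 * col1
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)  # the weak coefficient is dropped, the strong one is shrunk
```

With thousands of candidate claim codes, this same shrink-to-zero behaviour is what would leave only a small subset of covariates in a fitted model.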

Open Medicine ◽  
2019 ◽  
Vol 14 (1) ◽  
pp. 593-606 ◽  
Author(s):  
Yi-Ting Lin ◽  
Michael Tian-Shyug Lee ◽  
Yen-Chun Huang ◽  
Chih-Kuang Liu ◽  
Yi-Tien Li ◽  
...  

Abstract Research has failed to resolve the dilemma experienced by localized prostate cancer patients who must choose between radical prostatectomy (RP) and external beam radiotherapy (RT). Because the Charlson Comorbidity Index (CCI) is a measurable factor that affects survival events, this research seeks to validate the potential of the CCI to improve the accuracy of various prediction models. Thus, we employed the Cox proportional hazard model and machine learning methods, including random forest (RF) and support vector machine (SVM), to model the data of medical records in the National Health Insurance Research Database (NHIRD). In total, 8581 individuals were enrolled, of whom 4879 had received RP and 3702 had received RT. Patients in the RT group were older and exhibited higher CCI scores and higher incidences of some CCI items. Moderate-to-severe liver disease, dementia, congestive heart failure, chronic pulmonary disease, and cerebrovascular disease all increase the risk of overall death in the Cox hazard model. The CCI-reinforced SVM and RF models are 85.18% and 81.76% accurate, respectively, whereas the SVM and RF models without the use of the CCI are relatively less accurate, at 75.81% and 74.83%, respectively. Therefore, CCI and some of its items are useful predictors of overall and prostate-cancer-specific survival and could constitute valuable features for machine-learning modeling.


2011 ◽  
Vol 41 (10) ◽  
pp. 1928-1935 ◽  
Author(s):  
Xiongqing Zhang ◽  
Yuancai Lei ◽  
Quang V. Cao ◽  
Xinmei Chen ◽  
Xianzhao Liu

The tree mortality model plays an important role in simulating stand dynamic processes. Past work has shown that the disaggregation method was successful in improving tree survival prediction. This method was used in this study to forecast tree survival probability of Chinese pine (Pinus tabulaeformis Carrière) in Beijing. Outputs from the tree survival model were adjusted from either the stand-level model prediction or the combined estimator from the forecast combination method. Our results show that the disaggregation approach improved the performance of tree survival models. We also showed that stand-level prediction played a crucial role in refining outputs from a tree survival model, especially when it is a very simple model. Because the forecast combination method produced better stand-level prediction, we prefer the use of this method in conjunction with the disaggregation approach, even though the performance gain in using the forecast combination method shown for this data set was modest.


2019 ◽  
pp. 109442811987745
Author(s):  
Hans Tierens ◽  
Nicky Dries ◽  
Mike Smet ◽  
Luc Sels

Multilevel paradigms have permeated organizational research in recent years, greatly advancing our understanding of organizational behavior and management decisions. Despite the advancements made in multilevel modeling, taking into account complex hierarchical structures in data remains challenging. This is particularly the case for models used for predicting the occurrence and timing of events and decisions—often referred to as survival models. In this study, the authors construct a multilevel survival model that takes into account subjects being nested in multiple environments—known as a multiple-membership structure. Through this article, the authors provide a step-by-step guide to building a multiple-membership survival model, illustrating each step with an application on a real-life, large-scale, archival data set. Easy-to-use R code is provided for each model-building step. The article concludes with an illustration of potential applications of the model to answer alternative research questions in the organizational behavior and management fields.


2020 ◽  
Author(s):  
Dongyan Ding ◽  
Tingyuan Lang ◽  
Dongling Zou ◽  
Jiawei Tan ◽  
Jia Chen ◽  
...  

Abstract Background: Accurately forecasting prognosis could improve the therapeutic management of cancer patients; however, the clinical features currently in use do not provide enough information. The purpose of this study is to develop a survival prediction model for cervical cancer patients with big data and machine learning algorithms. Results: The Cancer Genome Atlas cervical cancer data, including the expression of 1046 microRNAs and the clinical information of 309 cervical and endocervical cancer and 3 control samples, were downloaded. Preprocessing comprised imputation of missing values and outliers, sample normalization, log transformation and feature scaling; 3 control samples, 2 metastatic samples and 707 microRNAs with missing values ≥ 20% were excluded. By Cox proportional-hazards analysis, 55 prognosis-related microRNAs (20 positively and 35 negatively correlated with survival) were identified. K-means clustering analysis showed that the cervical cancer samples can be separated into two and three subgroups using the top 20 identified survival-related microRNAs for best stratification. Using the support vector machine algorithm, two prediction models were developed that segment the patients into two and three groups with different survival rates, respectively. The models exhibit high performance: for two classes, area under the curve (AUC) = 0.976 (training set), 0.972 (test set), 0.974 (whole data set); for three classes, AUC = 0.983, 0.996 and 0.991 (groups 1, 2 and 3 in training set), 0.955, 0.989 and 0.991 (groups 1, 2 and 3 in test set), 0.974, 0.993 and 0.991 (groups 1, 2 and 3 in whole data set). Conclusion: Survival prediction models for cervical cancer were developed. Patients with a very low survival rate (≤ 40%) can be separated first by the three-class prediction model. The remaining patients can be identified by the two-class prediction model as having a high survival rate (≈ 75%) or a low survival rate (≈ 50%).


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 6556-6556 ◽  
Author(s):  
Smita Agrawal ◽  
Vivek Vaidya ◽  
Prajwal Chandrashekaraiah ◽  
Hemant Kulkarni ◽  
Li Chen ◽  
...  

6556 Background: Survival prediction models for lung cancer patients could help guide their care and therapy decisions. The objectives of this study were to predict probability of survival beyond 90, 180 and 360 days from any point in a lung cancer patient’s journey. Methods: We developed a Gradient Boosting model (XGBoost) using data from 55k lung cancer patients in the ASCO CancerLinQ database that used 3958 unique variables including Dx and Rx codes, biomarkers, surgeries and lab tests from ≤1 year prior to the prediction point, which was chosen at random for each patient. We used 40% data for training, 25% for hyper-parameter tuning, 20% for testing and 15% for holdout validation. Death date available in the Electronic Health Record was cross checked by linkage to death registries. Results: The model was validated on the holdout set of 8,468 patients. The Area Under the Curve (AUC) for the model was 0.79. The precision and recall for predicting survival beyond the three time points were between 0.7-0.8 and 0.8-0.9 respectively (see table). This compares favourably to other lung cancer survival models created using different machine learning techniques (Jochems 2017, Dekker 2009). A Cox-PH model created using the top 20 variables also had a significantly lower performance (see table). Analysis of input variables yielded distinctive patterns for patient subgroups and time points. Tumor status, medications, lab values and functional status were found to be significant in patient sub cohorts. Conclusions: An AI model to predict survival of lung cancer patients built using a large real world dataset yielded high accuracy. This general model can further be used to predict survival of sub cohorts stratified by variables such as stage or various treatment effects. Such a model could be useful for assessing patient risk and treatment options, evaluating cost and quality of care or determining clinical trial eligibility. [Table: see text]
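The abstract describes a four-way partition of the data (40% training, 25% hyper-parameter tuning, 20% testing, 15% holdout). A minimal sketch of such a partition follows; the function name `split_patients`, the partition labels, the seed, and the use of integer patient identifiers are assumptions for illustration, and the abstract does not say how the authors actually performed the split.

```python
import random


def split_patients(patient_ids, fractions=(0.40, 0.25, 0.20, 0.15), seed=0):
    """Shuffle patient ids and cut them into four disjoint partitions.

    The last partition takes whatever remains, so rounding never drops
    or duplicates a patient.
    """
    names = ("train", "tune", "test", "holdout")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n = len(ids)
    parts, start = {}, 0
    for k, (name, frac) in enumerate(zip(names, fractions)):
        end = n if k == len(names) - 1 else start + round(n * frac)
        parts[name] = ids[start:end]
        start = end
    return parts


# Hypothetical cohort of 1,000 patient ids.
parts = split_patients(range(1000))
print({name: len(ids) for name, ids in parts.items()})
```

Splitting by patient rather than by record keeps all data from one patient in a single partition, which avoids leakage between training and holdout sets.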


Author(s):  
Chen Ji ◽  
Terry P Brown ◽  
Scott J Booth ◽  
Claire Hawkes ◽  
Jerry P Nolan ◽  
...  

Abstract Aims The out-of-hospital cardiac arrest (OHCA) outcomes project is a national research registry. One of its aims is to explore sources of variation in OHCA survival outcomes. This study reports the development and validation of risk prediction models for return of spontaneous circulation (ROSC) at hospital handover and survival to hospital discharge. Methods and results The study included OHCA patients who were treated during 2014 and 2015 by emergency medical services (EMS) from seven English National Health Service ambulance services. The 2014 data were used to identify important variables and to develop the risk prediction models, which were validated using the 2015 data. Model prediction was measured by area under the curve (AUC), Hosmer–Lemeshow test, Cox calibration regression, and Brier score. All analyses were conducted using mixed-effects logistic regression models. Important factors included age, gender, witness/bystander cardiopulmonary resuscitation (CPR) combined, aetiology, and initial rhythm. Interaction effects between witness/bystander CPR with gender, aetiology and initial rhythm and between aetiology and initial rhythm were significant in both models. The survival model achieved better discrimination and overall accuracy compared with the ROSC model (AUC = 0.86 vs. 0.67, Brier score = 0.072 vs. 0.194, respectively). Calibration tests showed over- and under-estimation for the ROSC and survival models, respectively. A sensitivity analysis individually assessing Index of Multiple Deprivation scores and location in the final models substantially improved overall accuracy with inconsistent impact on discrimination. Conclusion Our risk prediction models identified and quantified important pre-EMS intervention factors determining survival outcomes in England. The survival model had excellent discrimination.
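Two of the metrics reported here, the area under the curve (AUC) and the Brier score, can be computed directly from predicted probabilities and observed binary outcomes. The sketch below is a minimal pure-Python illustration with made-up data; the function names are hypothetical and it does not reproduce the study's mixed-effects modelling.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random negative case.

    Ties in predicted probability count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes (1 = survived to discharge) and predicted probabilities.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]
print(brier_score(y_true, y_prob))  # about 0.175
print(auc(y_true, y_prob))          # 0.75
```

The two metrics answer different questions: AUC measures discrimination (ranking), while the Brier score also penalises poorly calibrated probabilities, which is why a model can do well on one and less well on the other.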


2019 ◽  
Vol 26 (1) ◽  
pp. 8-20 ◽  
Author(s):  
Yi Guo ◽  
Jiang Bian ◽  
Francois Modave ◽  
Qian Li ◽  
Thomas J George ◽  
...  

Cancer is the second leading cause of death in the United States. To improve cancer prognosis and survival rates, a better understanding of multi-level contributory factors associated with cancer survival is needed. However, prior research on cancer survival has primarily focused on factors from the individual level due to limited availability of integrated datasets. In this study, we sought to examine how data integration impacts the performance of cancer survival prediction models. We linked data from four different sources and evaluated the performance of Cox proportional hazard models for breast, lung, and colorectal cancers under three common data integration scenarios. We showed that adding additional contextual-level predictors to survival models through linking multiple datasets improved model fit and performance. We also showed that different representations of the same variable or concept have differential impacts on model performance. When building statistical models for cancer outcomes, it is important to consider cross-level predictor interactions.


2003 ◽  
Vol 21 (24) ◽  
pp. 4568-4571 ◽  
Author(s):  
Michael W. Kattan ◽  
Michael J. Zelefsky ◽  
Patrick A. Kupelian ◽  
Daniel Cho ◽  
Peter T. Scardino ◽  
...  

Purpose: There are several nomograms for the patient considering radiation therapy for clinically localized prostate cancer. Because of the questionable clinical implications of prostate-specific antigen (PSA) recurrence, its use as an end point has been criticized in several of these nomograms. The goal of this study was to create and to externally validate a nomogram for predicting the probability that a patient will develop metastasis within 5 years after three-dimensional conformal radiation therapy (CRT). Patients and Methods: We conducted a retrospective, nonrandomized analysis of 1,677 patients treated with three-dimensional CRT at Memorial Sloan-Kettering Cancer Center (MSKCC) from 1988 to 2000. Clinical parameters examined were pretreatment PSA level, clinical stage, and biopsy Gleason sum. Patients were followed until their deaths, and the time at which they developed metastasis was noted. A nomogram for predicting the 5-year probability of developing metastasis was constructed from the MSKCC cohort and validated using the Cleveland Clinic series of 1,626 patients. Results: After three-dimensional CRT, 159 patients developed metastasis. At 5 years, 11% of patients experienced metastasis by cumulative incidence analysis (95% CI, 9% to 13%). A nomogram constructed from the data gathered from these men showed an excellent ability to discriminate among patients in an external validation data set, as shown by a concordance index of 0.81. Conclusion: A nomogram with reasonable accuracy and discrimination has been constructed and validated using an external data set to predict the probability that a patient will experience metastasis within 5 years after three-dimensional CRT.
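The concordance index quoted above (0.81) measures how often a model ranks two patients in the correct order. The sketch below computes a simplified version of Harrell's concordance index in pure Python. It is an illustration on made-up data, not the authors' validation code; the function name is hypothetical, and pairs with tied observed times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient has the event first.

    A pair is comparable only when the patient with the shorter observed
    time actually had the event (was not censored).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times or censored earlier patient: not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up follow-up times (years), event flags (1 = metastasis observed), and risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(times, events, risk))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.81 indicates that roughly four in five comparable pairs were ordered correctly.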

