Ascertainment of the number of samples in the validation set in Monte Carlo cross validation and the selection of model dimension with Monte Carlo cross validation

2006 ◽  
Vol 82 (1-2) ◽  
pp. 83-89 ◽  
Author(s):  
Yi Ping Du ◽  
Sumaporn Kasemsumran ◽  
Katsuhiko Maruo ◽  
Takehiro Nakagawa ◽  
Yukihiro Ozaki
2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A699-A699
Author(s):  
Wolfgang Beck ◽  
Tracy Rose ◽  
Matthew Milowsky ◽  
William Kim ◽  
Jeff Klomp ◽  
...  

Background: Urothelial cancer patients treated with immune checkpoint inhibitor (ICI) therapy have varied response and survival [1]. Clinical and immunogenomic biomarkers could help predict ICI response and survival to inform decisions about patient selection for ICI treatment.
Methods: The association of clinical metadata and immunogenomic signatures with response and survival was analyzed in a set of 347 urothelial cancer patients treated with the PD-L1 inhibitor atezolizumab as part of the IMvigor210 study [1]. Data were divided into a discovery set (2/3 of patients) and validation set (1/3 of patients). We analyzed as potential predictors 70 total variables, of which 16 were clinical metadata and 54 were immunogenomic signatures. Categorical variables were converted to dummy variables (89 total variables: 35 clinical, 54 immunogenomic). Using the discovery set, elastic net regression with Monte Carlo cross-validation was used to build optimal models for response (logistic regression) and survival (Cox proportional-hazards). Model performance was evaluated using the validation set.
Results: In the optimal model of response, 17 variables (10 clinical, 7 immunogenomic) were selected as informative predictors, including Baseline Eastern Cooperative Oncology Group (ECOG) Score = 0, Neoantigen Burden, Lymph Node Metastases, and Tumor Mutation Burden (figure 1). The final model predicted patient response with good performance (Area Under Curve = 0.828, p_AUC = 2.38e-3; True Negative Rate = 91.7%, True Positive Rate = 87.5%, p_confusion matrix = 0.0252). In the optimal model of survival, 32 variables (17 clinical, 15 immunogenomic) were selected as informative predictors, including baseline ECOG Score = 0, IC Level 2+, Race = Asian, and Consensus Tumor Subtype = Neuroendocrine (figure 2). The final model predicted patient survival with good performance (c-index_model = 0.652, p_c-index = 0.0290).
Abstract 662 Figure 1: Elastic Net Logistic Regression with Monte Carlo Cross-Validation to Predict Response to Atezolizumab in Urothelial Cancer. (A) Predictive variables with beta coefficient 95% confidence intervals that exclude 0, derived from Monte Carlo cross-validation. (B) Confusion matrix of actual vs. predicted response data in the validation set. (C) Total response proportions of actual and predicted response data in the validation set.
Abstract 662 Figure 2: Elastic Net Cox Proportional-Hazards Regression with Monte Carlo Cross-Validation to Predict Survival. (A) Predictor variables with beta coefficient 95% confidence intervals that exclude 0, derived from Monte Carlo cross-validation. (B) Predictions vs. survival outcomes in the validation set. (C) Loess models of density curves for survival outcomes in the validation set. 95% confidence intervals were generated through bootstrapping with replacement. (D) Loess fit of predictions vs. survival outcomes in the validation set. 95% confidence interval indicates strength of fit.
Conclusions: Models incorporating clinical metadata and immunogenomic signatures can predict response and survival for urothelial cancer patients treated with atezolizumab. Among predictors in those models, baseline performance status is the greatest and most positive predictor of response and survival.
Reference: 1. Mariathasan S, Turley S, Nickles D, et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature 2018;554:544–548.
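The Monte Carlo cross-validation step named in the Methods can be illustrated with a minimal sketch: the discovery data are split at random many times, a model is fitted on each training part, and the validation scores are averaged. The abstract does not state the split fraction, the number of repeats, or the software used, so `val_fraction`, `n_repeats`, the toy data, and the threshold classifier (a stand-in for the elastic net models) are all assumptions made for illustration.

```python
import random


def monte_carlo_splits(n_samples, val_fraction, n_repeats, seed=0):
    """Yield (train_idx, val_idx) for repeated random splits without replacement."""
    rng = random.Random(seed)
    n_val = max(1, round(n_samples * val_fraction))
    for _ in range(n_repeats):
        shuffled = list(range(n_samples))
        rng.shuffle(shuffled)
        yield sorted(shuffled[n_val:]), sorted(shuffled[:n_val])


def fit_threshold(xs, ys):
    """Placeholder model (assumption): midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, xs, ys):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = [(x >= threshold) == (y == 1) for x, y in zip(xs, ys)]
    return sum(hits) / len(hits)


def mccv_score(x, y, fit, score, val_fraction=0.2, n_repeats=100, seed=0):
    """Average a validation score over Monte Carlo splits."""
    scores = []
    for train, val in monte_carlo_splits(len(y), val_fraction, n_repeats, seed):
        model = fit([x[i] for i in train], [y[i] for i in train])
        scores.append(score(model, [x[i] for i in val], [y[i] for i in val]))
    return sum(scores) / len(scores), scores


# Toy data (hypothetical): one feature, label is 1 for the upper half.
x = list(range(30))
y = [1 if v >= 15 else 0 for v in x]
avg_accuracy, per_split = mccv_score(x, y, fit_threshold, accuracy, n_repeats=20)
```

Unlike k-fold cross-validation, the validation parts drawn here may overlap between repeats, so the number of repeats and the validation size can be chosen independently.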


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
A. Wong ◽  
Z. Q. Lin ◽  
L. Wang ◽  
A. G. Chung ◽  
B. Shen ◽  
...  

Abstract: A critical step in effective care and treatment planning for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause for the coronavirus disease 2019 (COVID-19) pandemic, is the assessment of the severity of disease progression. Chest x-rays (CXRs) are often used to assess SARS-CoV-2 severity, with two important assessment metrics being extent of lung involvement and degree of opacity. In this proof-of-concept study, we assess the feasibility of computer-aided scoring of CXRs of SARS-CoV-2 lung disease severity using a deep learning system. Data consisted of 396 CXRs from SARS-CoV-2 positive patient cases. Geographic extent and opacity extent were scored by two board-certified expert chest radiologists (with 20+ years of experience) and a 2nd-year radiology resident. The deep neural networks used in this study, which we name COVID-Net S, are based on a COVID-Net network architecture. 100 versions of the network were independently learned (50 to perform geographic extent scoring and 50 to perform opacity extent scoring) using random subsets of CXRs from the study, and we evaluated the networks using stratified Monte Carlo cross-validation experiments. The COVID-Net S deep neural networks yielded $R^2$ of $0.664 \pm 0.032$ and $0.635 \pm 0.044$ between predicted scores and radiologist scores for geographic extent and opacity extent, respectively, in stratified Monte Carlo cross-validation experiments. The best performing COVID-Net S networks achieved $R^2$ of 0.739 and 0.741 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively. The results are promising and suggest that the use of deep neural networks on CXRs could be an effective tool for computer-aided assessment of SARS-CoV-2 lung disease severity, although additional studies are needed before adoption for routine clinical use.
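A "mean ± spread" figure of the kind reported above can be obtained by repeating a stratified random split and summarising the per-split scores. The sketch below is an illustration only: it assumes the score is the coefficient of determination and that the spread is a sample standard deviation, neither of which the abstract specifies. The per-stratum mean predictor stands in for the neural networks, and the toy data and all function names are hypothetical.

```python
import random
from statistics import mean, stdev


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def stratified_split(strata, val_fraction, rng):
    """Draw one random split that samples the same fraction from every stratum."""
    by_stratum = {}
    for idx, s in enumerate(strata):
        by_stratum.setdefault(s, []).append(idx)
    train, val = [], []
    for members in by_stratum.values():
        members = members[:]
        rng.shuffle(members)
        n_val = max(1, round(len(members) * val_fraction))
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val


def stratum_mean_model(train_idx, y, strata):
    """Placeholder predictor (assumption): mean training score per stratum."""
    groups = {}
    for i in train_idx:
        groups.setdefault(strata[i], []).append(y[i])
    return {s: mean(v) for s, v in groups.items()}


def mccv_r2(y, strata, n_repeats=50, val_fraction=0.2, seed=0):
    """Return (mean, standard deviation, all values) of R^2 over the repeats."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_repeats):
        train, val = stratified_split(strata, val_fraction, rng)
        model = stratum_mean_model(train, y, strata)
        y_val = [y[i] for i in val]
        y_hat = [model[strata[i]] for i in val]
        values.append(r_squared(y_val, y_hat))
    return mean(values), stdev(values), values


# Toy data (hypothetical): four severity strata with small within-stratum variation.
strata = [i % 4 for i in range(40)]
y = [strata[i] * 2.0 + (i % 3) * 0.1 for i in range(40)]
r2_mean, r2_std, r2_values = mccv_r2(y, strata)
```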


2017 ◽  
Vol 12 (1) ◽  
pp. 473-480
Author(s):  
Han-Qing Cai ◽  
Shi-Hong Lv ◽  
Chun-Jing Shi

Abstract
Objective: To explore potential functional biomarkers in diabetes mellitus (DM) by utilizing gene pathway cross-talk.
Methods: Firstly, potential disrupted pathways that were enriched by differentially expressed genes (DEGs) were identified based on biological pathways downloaded from the Ingenuity Pathways Analysis (IPA) database. In addition, we quantified the pathway crosstalk for each pair of pathways based on Discriminating Score (DS). Random forest (RF) classification was then employed to find the top 10 pairs of pathways with a high area under the curve (AUC) value between DM samples versus normal samples based on 10-fold cross-validation. Finally, a Monte Carlo Cross-Validation was applied to demonstrate the identified pairs of pathways by a mutual information analysis.
Results: A total of 247 DEGs in normal and disease samples were identified. Based on the F-test, 50 disrupted pathways were obtained with false discovery rate (FDR) < 0.01. Simultaneously, after calculating the DS, the top 10 pairs of pathways were selected based on a higher AUC value as measured by RF classification. From the Monte Carlo Cross-Validation, we considered the top 10 pairs of pathways with higher AUC values ranked for all 50 bootstraps as the most frequently detected ones.
Conclusion: The pairs of pathways identified in our study might be key regulators in DM.
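One way to read the "most frequently detected" criterion in the Results is as a selection-frequency count: rank the pathway pairs by AUC in each repeat, keep the top k, and tally how often each pair makes the cut. The sketch below shows that tally only; the pair names, AUC values, and the function `top_k_frequency` are hypothetical, and the authors' actual procedure may differ.

```python
from collections import Counter


def top_k_frequency(runs, k=10):
    """Count how often each item ranks in the top k across repeated runs.

    `runs` is a list of dicts mapping an item (here, a pathway pair)
    to its AUC in one Monte Carlo repeat.
    """
    counts = Counter()
    for scores in runs:
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        counts.update(top)
    return counts


# Toy input (hypothetical): three repeats, three pathway pairs, keep the top 2.
runs = [
    {"A-B": 0.90, "A-C": 0.80, "B-C": 0.60},
    {"A-B": 0.85, "A-C": 0.50, "B-C": 0.70},
    {"A-B": 0.95, "A-C": 0.90, "B-C": 0.40},
]
frequency = top_k_frequency(runs, k=2)
most_frequent = frequency.most_common(2)  # [("A-B", 3), ("A-C", 2)]
```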

