Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Shuanglong Fan ◽  
Zhiqiang Zhao ◽  
Yanbo Zhang ◽  
Hongmei Yu ◽  
Chuchu Zheng ◽  
...  

Abstract

Background: Although many patients have a good prognosis with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical and computational intelligence models are powerful tools for assessing prognosis; however, many cannot generate accurate risk (probability) estimates. Thus, probability calibration-based versions of traditional machine learning algorithms are developed in this paper to predict the risk of relapse in patients with DLBCL.

Methods: Five machine learning algorithms were assessed, namely, naïve Bayes (NB), logistic regression (LR), random forest (RF), support vector machine (SVM) and feedforward neural network (FFNN). Three methods were used to develop probability calibration-based versions of each algorithm, namely, Platt scaling (Platt), isotonic regression (IsoReg) and shape-restricted polynomial regression (RPR). Performance comparisons were based on the average results of a stratified hold-out test repeated 500 times. We used the AUC to evaluate the discrimination ability (i.e., classification ability) of each model and assessed model calibration (i.e., risk prediction accuracy) using the H-L goodness-of-fit test, ECE, MCE and BS.

Results: Sex, stage, IPI, KPS, GCB, CD10 and rituximab were significant factors predicting the 3-year recurrence rate of patients with DLBCL. Among the 5 uncalibrated algorithms, the LR (ECE = 8.517, MCE = 20.100, BS = 0.188) and FFNN (ECE = 8.238, MCE = 20.150, BS = 0.184) models were well calibrated. The errors of the initial risk estimates of the NB (ECE = 15.711, MCE = 34.350, BS = 0.212), RF (ECE = 12.740, MCE = 27.200, BS = 0.201) and SVM (ECE = 9.872, MCE = 23.800, BS = 0.194) models were large. With probability calibration, the biased NB, RF and SVM models were well corrected. The calibration errors of the LR and FFNN models were not further improved by any of the probability calibration methods. Among the 3 calibration methods, RPR achieved the best calibration for both the RF and SVM models. The power of IsoReg was not obvious for the NB, RF or SVM models.

Conclusions: Although these algorithms all have good classification ability, several cannot generate accurate risk estimates. Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms. Our risk model of DLBCL demonstrates good discrimination and calibration ability and has the potential to help clinicians make optimal therapeutic decisions to achieve precision medicine.
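As a rough illustration of the calibration measures named in this abstract (ECE, MCE and BS), the following minimal sketch computes them for binary outcomes. The function name `calibration_metrics`, the equal-width binning and the choice of 10 bins are assumptions made for illustration; the abstract does not state how the authors binned predictions. ECE and MCE are returned as percentages on the assumption that this matches the scale of the reported values.

```python
import numpy as np

def calibration_metrics(y_true, y_prob, n_bins=10):
    """Return (ECE, MCE, Brier score) for binary outcomes.

    ECE and MCE are returned as percentages. Equal-width bins and
    n_bins=10 are illustrative assumptions, not taken from the abstract.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)

    # Brier score: mean squared difference between predicted risk and outcome.
    brier = float(np.mean((y_prob - y_true) ** 2))

    # Assign each prediction to one of n_bins equal-width bins on [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(y_prob, edges[1:-1])

    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between observed event rate and mean predicted risk in the bin.
        gap = abs(y_true[mask].mean() - y_prob[mask].mean())
        ece += mask.mean() * gap   # weighted by the fraction of samples in the bin
        mce = max(mce, gap)        # worst-case bin
    return 100.0 * ece, 100.0 * mce, brier


if __name__ == "__main__":
    # Toy data: predicted risks paired with observed relapse (1) or no relapse (0).
    ece, mce, bs = calibration_metrics([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75])
    print(f"ECE={ece:.3f}%  MCE={mce:.3f}%  BS={bs:.4f}")
```

A calibration method such as Platt scaling or isotonic regression would be fitted on held-out scores first, and these measures would then be compared before and after calibration.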

2021 ◽  
Vol 14 (10) ◽  
pp. 101188
Author(s):  
Raoul Santiago ◽  
Johanna Ortiz Jimenez ◽  
Reza Forghani ◽  
Nikesh Muthukrishnan ◽  
Olivier Del Corpo ◽  
...  

2015 ◽  
Vol 14 (11) ◽  
pp. 2947-2960 ◽  
Author(s):  
Sally J. Deeb ◽  
Stefka Tyanova ◽  
Michael Hummel ◽  
Marc Schmidt-Supprian ◽  
Juergen Cox ◽  
...  

2020 ◽  
Vol 38 (15_suppl) ◽  
pp. 8047-8047
Author(s):  
Selin Merdan ◽  
Kritika Subramanian ◽  
Turgay Ayer ◽  
Jean Louise Koff ◽  
Andres Chang ◽  
...  

8047 Background: The current clinical risk stratification of Diffuse Large B-cell Lymphoma (DLBCL) relies on the International Prognostic Index (IPI), which comprises a limited number of clinical variables and is imperfect in identifying high-risk disease. Our study aimed to: (1) develop a risk prediction model based on genetic and clinical features; and (2) evaluate the model’s biological implications in association with the estimated profiles of immune infiltration.

Methods: Gene-expression profiling was performed on 718 patients with DLBCL for whom RNA sequencing data and clinical covariates were available from Reddy et al. (2017). Unsupervised and supervised machine learning methods were used to discover and identify the best set of survival-associated gene signatures for prediction. A multivariate model of survival from these signatures was constructed in the training set and validated in an independent test set. The compositions of the tumor-infiltrating immune cells were enumerated using CIBERSORT for deconvolution analysis.

Results: A four gene-signature-based score was developed that separated patients into high- and low-risk groups with a significant difference in survival in the training, validation and complete cohorts (p < 0.001), independently of the IPI. The combination of the gene-expression-based score with the IPI improved the discrimination on the validation and complete sets. The area-under-the-curve at 2 and 5 years increased from 0.71 and 0.69 to 0.75 and 0.74 in the validation set, respectively.

Conclusions: By analyzing the gene-expression data with a systematic approach, we developed and validated a risk prediction model that outperforms existing risk assessment methods. Our study, which integrated the profiles of immune infiltration with prognostic prediction, unraveled important associations that have the potential to identify patients who could benefit from the various therapeutic interventions, as well as highlighting possible targets for new drugs.
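The area-under-the-curve figures quoted above measure discrimination. The minimal sketch below shows how an AUC can be computed from risk scores and a binary event indicator at a fixed horizon. It is a simplification: it treats event status as fully observed and ignores censoring, whereas the abstract does not say how its 2- and 5-year AUCs were estimated. The function name `auc_score` and the toy data are hypothetical.

```python
import numpy as np

def auc_score(y_true, risk):
    """AUC as the probability that a randomly chosen event case has a higher
    risk score than a randomly chosen non-event case (ties count as half).

    Assumes a fully observed binary outcome; censoring is not handled.
    """
    y_true = np.asarray(y_true).astype(bool)
    risk = np.asarray(risk, dtype=float)
    pos, neg = risk[y_true], risk[~y_true]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one event and one non-event")
    # Compare every event case against every non-event case.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


if __name__ == "__main__":
    # Toy data: 1 = event within the horizon, 0 = no event.
    events = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auc_score(events, scores))
```

Comparing the value obtained from one score alone with the value from a combined score, on the same validation cases, mirrors the kind of before-and-after comparison the abstract reports.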


2002 ◽  
Vol 8 (1) ◽  
pp. 68-74 ◽  
Author(s):  
Margaret A. Shipp ◽  
Ken N. Ross ◽  
Pablo Tamayo ◽  
Andrew P. Weng ◽  
Jeffery L. Kutok ◽  
...  

Cancers ◽  
2021 ◽  
Vol 13 (24) ◽  
pp. 6384
Author(s):  
Joaquim Carreras ◽  
Shinichiro Hiraiwa ◽  
Yara Yukie Kikuti ◽  
Masashi Miyaoka ◽  
Sakura Tomita ◽  
...  

Diffuse large B-cell lymphoma (DLBCL) is one of the most frequent subtypes of non-Hodgkin lymphomas. We used artificial neural networks (multilayer perceptron and radial basis function), machine learning, and conventional bioinformatics to predict the overall survival and molecular subtypes of DLBCL. The series included 106 cases and 730 genes of a pancancer immune oncology panel (nCounter) as predictors.

The multilayer perceptron predicted the outcome with high accuracy, with an area under the curve (AUC) of 0.98, and ranked all the genes according to their importance. In a multivariate analysis, ARG1, TNFSF12, REL, and NRP1 correlated with favorable survival (hazard risks: 0.3–0.5), and IFNA8, CASP1, and CTSG, with poor survival (hazard risks = 1.0–2.1). Gene set enrichment analysis (GSEA) showed enrichment toward poor prognosis. These high-risk genes were also associated with the gene expression of M2-like tumor-associated macrophages (CD163), and MYD88 expression. The prognostic relevance of this set of 7 genes was also confirmed within the IPI and MYC translocation strata, the EBER-negative cases, the DLBCL not-otherwise specified (NOS) (High-grade B-cell lymphoma with MYC and BCL2 and/or BCL6 rearrangements excluded), and an independent series of 414 cases of DLBCL in Europe and North America (GSE10846).

The perceptron analysis also predicted molecular subtypes (based on the Lymph2Cx assay) with high accuracy (AUC = 1). STAT6, TREM2, and REL were associated with the germinal center B-cell (GCB) subtype, and CD37, GNLY, CD46, and IL17B were associated with the activated B-cell (ABC)/unspecified subtype. The GSEA had a sinusoidal-like plot with association to both molecular subtypes, and immunohistochemistry analysis confirmed the correlation of MAPK3 with the GCB subtype in another series of 96 cases (notably, MAPK3 also correlated with LMO2, but not with M2-like tumor-associated macrophage markers CD163, CSF1R, TNFAIP8, CASP8, PD-L1, PTX3, and IL-10).

Finally, survival and molecular subtypes were successfully modeled using other machine learning techniques including logistic regression, discriminant analysis, SVM, CHAID, C5, C&R trees, KNN algorithm, and Bayesian network. In conclusion, prognoses and molecular subtypes were predicted with high accuracy using neural networks, and relevant genes were highlighted.

