A Nomogram Combining a Four-Gene Biomarker and Clinical Factors for Predicting Survival of Melanoma

2021 ◽  
Vol 11 ◽  
Author(s):  
Chuan Zhang ◽  
Dan Dang ◽  
Yuqian Wang ◽  
Xianling Cong

Background: Currently there is no effective prognostic indicator for melanoma, the deadliest skin cancer. Thus, we aimed to develop and validate a nomogram model for predicting survival in melanoma. Methods: Four hundred forty-nine melanoma cases with RNA sequencing (RNA-seq) data from TCGA were randomly divided into training set I (n = 224) and validation set I (n = 225); 210 melanoma cases with RNA-seq data from the Lund cohort of Lund University (available in GSE65904) were used as an external test set. The prognostic gene biomarker was developed and validated based on these three sets. The developed gene biomarker and clinical characteristics were then used as variables to develop and validate a nomogram predictive model based on 379 patients with complete clinical data from TCGA (among 470 cases, 91 with missing clinical data were excluded from the study), who were randomly divided into training set II (n = 189) and validation set II (n = 190). Area under the curve (AUC), concordance index (C-index), calibration curve, and Kaplan-Meier estimate were used to assess the predictive performance of the nomogram model. Results: Four genes, i.e., CLEC7A, CLEC10A, HAPLN3, and HCP5, comprise an immune-related prognostic biomarker. The predictive performance of the biomarker was validated using tROC and the log-rank test in training set I (n = 224, 5-year AUC of 0.683), validation set I (n = 225, 5-year AUC of 0.644), and test set I (n = 210, 5-year AUC of 0.645). The biomarker was also significantly associated with improved survival in the training set (P < 0.01), validation set (P < 0.05), and test set (P < 0.001).
In addition, a nomogram combining the four-gene biomarker and six clinical factors for predicting survival in melanoma was developed in training set II (n = 189) and validated in validation set II (n = 190), with a concordance index of 0.736 ± 0.041 and an AUC of 0.832 ± 0.071. Conclusion: We developed and validated a nomogram predictive model combining a four-gene biomarker and six clinical factors for melanoma patients, which could facilitate risk stratification and treatment planning.
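
The concordance index used above to assess the nomogram can be illustrated with a small pure-Python sketch. This is not the authors' code: the function name `concordance_index` and the toy data are assumptions, and the pairwise definition shown is the common Harrell-style one (a pair is comparable when the subject with the shorter follow-up time had an event).

```python
def concordance_index(times, events, risk):
    """Harrell-style C-index for right-censored survival data.

    times  : follow-up time per subject
    events : 1 if the event (death) was observed, 0 if censored
    risk   : predicted risk score (higher = expected shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable if i failed before j's follow-up ended.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical values, not from the study).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.736 should be read.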

2021 ◽  
Vol 94 (1117) ◽  
pp. 20200634
Author(s):  
Hang Chen ◽  
Ming Zeng ◽  
Xinglan Wang ◽  
Liping Su ◽  
Yuwei Xia ◽  
...  

Objectives: To assess the value of a radiomics method derived from CT images for predicting prognosis in patients with COVID-19. Methods: A total of 40 patients with COVID-19 were enrolled in the study. Baseline clinical data, CT images, and laboratory testing results were collected from all patients. ROIs in which the ground-glass opacity (GGO) decreased in density and extent were assigned to the absorption group, and ROIs that progressed to consolidation were assigned to the progress group. A total of 180 ROIs from the absorption group (n = 118) and the consolidation group (n = 62) were randomly divided into a training set (n = 145) and a validation set (n = 35) (8:2). Radiomics features were extracted from CT images, and the radiomics-based models were built with three classifiers. A radiomics score (Rad-score) was calculated as a linear combination of selected features. The Rad-score and clinical factors were incorporated into the radiomics nomogram. The prediction performance of the clinical factors model and the radiomics nomogram for prognosis was estimated. Results: A total of 15 radiomics features with their respective coefficients were selected. The AUC values of the radiomics models (kNN, SVM, and LR) were 0.88, 0.88, and 0.84, respectively, showing good performance. The C-index of the clinical factors model was 0.82 [95% CI (0.75–0.88)] in the training set and 0.77 [95% CI (0.59–0.90)] in the validation set. The radiomics nomogram showed the best prediction performance. In the training set, the C-index was 0.91 [95% CI (0.85–0.95)], and in the validation set, the C-index was 0.85 [95% CI (0.69–0.95)]. For the training set, the C-index of the radiomics nomogram was significantly higher than that of the clinical factors model (p = 0.0021). Decision curve analysis showed that the radiomics nomogram outperformed the clinical model in terms of clinical usefulness. Conclusions: The radiomics nomogram based on CT images showed favorable prediction performance in the prognosis of COVID-19.
The radiomics nomogram could be used as a potential biomarker for more accurate categorization of patients into different stages to support clinical decision-making. Advances in knowledge: Radiomics features based on chest CT images help clinicians to categorize patients with COVID-19 into different stages. A radiomics nomogram based on CT images has favorable predictive performance in the prognosis of COVID-19. Radiomics may act as a potential modality to supplement conventional medical examinations.
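
The Rad-score described above is a linear combination of selected features. A minimal sketch of that calculation follows; the feature names, coefficients, and intercept are hypothetical placeholders, since the abstract does not list the 15 selected features or their weights.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination: intercept + sum(coefficient * feature value).

    features     : dict of feature name -> measured value for one ROI
    coefficients : dict of feature name -> model coefficient
    """
    missing = set(coefficients) - set(features)
    if missing:
        raise KeyError(f"missing feature values: {sorted(missing)}")
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


# Hypothetical feature names and weights, for illustration only.
example_coefficients = {"glcm_entropy": 0.5, "firstorder_mean": -1.0}
example_roi = {"glcm_entropy": 2.0, "firstorder_mean": 0.25, "unused": 9.0}
print(rad_score(example_roi, example_coefficients, intercept=0.1))
```

Features that were not selected (here `unused`) are simply ignored, which mirrors how a sparse feature-selection step leaves most extracted features with zero weight.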


2021 ◽  
Author(s):  
Xiaobo Wen ◽  
Biao Zhao ◽  
Meifang Yuan ◽  
Jinzhi Li ◽  
Mengzhen Sun ◽  
...  

Objectives: To explore the performance of Multi-scale Fusion Attention U-net (MSFA-U-net) in thyroid gland segmentation on CT localization images for radiotherapy. Methods: CT localization images for radiotherapy of 80 patients with breast cancer or head and neck tumors were selected; label images were manually delineated by experienced radiologists. The data set was randomly divided into the training set (n=60), the validation set (n=10), and the test set (n=10). Data augmentation was performed in the training set, and the performance of the MSFA-U-net model was evaluated using the Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), positive predictive value (PPV), sensitivity (SE), and Hausdorff distance (HD). Results: With the MSFA-U-net model, the DSC, JSC, PPV, SE, and HD of the segmented thyroid gland in the test set were 0.8967±0.0935, 0.8219±0.1115, 0.9065±0.0940, 0.8979±0.1104, and 2.3922±0.5423, respectively. Compared with U-net, HR-net, and Attention U-net, MSFA-U-net increased DSC by 0.052, 0.0376, and 0.0346, respectively; increased JSC by 0.0569, 0.0805, and 0.0433, respectively; increased SE by 0.0361, 0.1091, and 0.0831, respectively; and decreased HD by 0.208, 0.1952, and 0.0548, respectively. The test set image results showed that the thyroid edges segmented by the MSFA-U-net model were closer to the standard thyroid delineated by the experts than those segmented by the other three models. Moreover, the edges were smoother, resistance to noise interference was stronger, and oversegmentation and undersegmentation were reduced. Conclusion: The MSFA-U-net model can meet basic clinical requirements and improve the efficiency of physicians' clinical work.
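
For readers less familiar with the overlap metrics reported above, the following sketch computes the Dice and Jaccard coefficients for a pair of binary masks. It is an illustration only: the function name and the flattened toy masks are assumptions, and real segmentation masks would be 2D or 3D arrays flattened in the same way.

```python
def dice_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks empty: treat as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


# Toy flattened masks (hypothetical): 1 = thyroid voxel, 0 = background.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_jaccard(predicted, reference)
print(dice, jaccard)
```

For any single mask pair the two metrics are linked by J = D / (2 - D); averages over several cases, such as the means reported in the abstract, need not satisfy this identity exactly.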


Molecules ◽  
2019 ◽  
Vol 24 (10) ◽  
pp. 2006 ◽  
Author(s):  
Liadys Mora Lagares ◽  
Nikola Minovski ◽  
Marjana Novič

P-glycoprotein (P-gp) is a transmembrane protein that actively transports a wide variety of chemically diverse compounds out of the cell. It is highly associated with the ADMET (absorption, distribution, metabolism, excretion and toxicity) properties of drugs/drug candidates and contributes to decreasing toxicity by eliminating compounds from cells, thereby preventing intracellular accumulation. Therefore, in the drug discovery and toxicological assessment process it is advisable to check whether a compound under development could be transported by P-gp. In this study, an in silico multiclass classification model capable of predicting the probability that a compound interacts with P-gp was developed using a counter-propagation artificial neural network (CP ANN) based on a set of 2D molecular descriptors and an extensive dataset of 2512 compounds (1178 P-gp inhibitors, 477 P-gp substrates and 857 P-gp non-active compounds). The model provided good classification performance, producing non-error rate (NER) values of 0.93 for the training set and 0.85 for the test set, while the average precision (AvPr) was 0.93 for the training set and 0.87 for the test set. An external validation set of 385 compounds was used to challenge the model’s performance. On the external validation set, the NER and AvPr were both 0.70. We believe that this in silico classifier could be effectively used as a reliable virtual screening tool for identifying potential P-gp ligands.
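
The two indices reported above can be sketched from a multiclass confusion matrix. The sketch assumes the definitions commonly used in chemometrics, namely NER as the mean of the per-class sensitivities and AvPr as the mean of the per-class precisions; the abstract itself does not spell these out, and the 3x3 matrix below is a hypothetical example, not the study's results.

```python
def ner_and_avpr(confusion):
    """Return (NER, AvPr) from a square confusion matrix.

    confusion[i][j] = number of compounds of true class i predicted as class j.
    NER  = mean of per-class sensitivities (recall)
    AvPr = mean of per-class precisions
    """
    k = len(confusion)
    sensitivities = []
    precisions = []
    for c in range(k):
        true_positive = confusion[c][c]
        row_total = sum(confusion[c])                        # all true class c
        col_total = sum(confusion[r][c] for r in range(k))   # all predicted c
        sensitivities.append(true_positive / row_total if row_total else 0.0)
        precisions.append(true_positive / col_total if col_total else 0.0)
    return sum(sensitivities) / k, sum(precisions) / k


# Hypothetical counts for inhibitors, substrates, and non-actives.
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
ner, avpr = ner_and_avpr(toy_confusion)
print(round(ner, 3), round(avpr, 3))
```

Because both indices average over classes and not over compounds, they are not dominated by the largest class, which matters for an unbalanced set such as 1178 inhibitors versus 477 substrates.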


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 197-197
Author(s):  
Ricky D Edmondson ◽  
Shweta S. Chavan ◽  
Christoph Heuck ◽  
Bart Barlogie

Abstract 197. We and others have used gene expression profiling (GEP) to classify multiple myeloma into high and low risk groups; here, we report the first combined GEP and proteomics study of a large number of baseline samples (n=85) of highly enriched tumor cells from patients with newly diagnosed myeloma. Peptide expression levels from MS data on CD138-selected plasma cells from a discovery set of 85 patients with newly diagnosed myeloma were used to identify proteins that were linked to short survival (OS < 3 years vs OS ≥ 3 years). The proteomics dataset consisted of intensity values for 11,006 peptides (representing 2,155 proteins), where intensity is the quantitative measure of peptide abundance. Peptide intensities were normalized by Z score transformation, and significance analysis of microarray (SAM) was applied, resulting in the identification of 24 peptides as differentially expressed between the two groups (OS < 3 years vs OS ≥ 3 years), with fold change ≥1.5 and FDR <5%. The 24 peptides mapped to 19 unique proteins, and all were present at higher levels in the group with shorter overall survival than in the group with longer overall survival. An independent SAM analysis with parameters identical to the proteomics analysis (fold change ≥1.5; FDR <5%) was performed with the Affymetrix U133Plus2 microarray chip based expression data. This analysis identified 151 probe sets that were differentially expressed between the two groups; 144 probe sets were present at higher levels and seven at lower levels in the group with shorter overall survival. Comparing the SAM analyses of proteomics and GEP data, we identified nine probe sets, corresponding to seven genes, with increased levels of both protein and mRNA in the short-lived group. In order to validate these findings from the discovery experiment, we used GEP data from a randomized subset of the TT3 patient population as a training set for determining the optimal cut-points for each of the nine probe sets.
Thus, the TT3 population was randomized into two sub-populations for the training set (two-thirds of the population; n=294) and test set (one-third of the population; n=147); the Total Therapy 2 (TT2) patient population was used as an additional test set (n=441). A running log-rank test was performed on the training set for each of the nine probe sets to determine its optimal gene expression cut-point. The cut-points derived from the training set were then applied to the TT3 and TT2 test sets to investigate survival differences for the groups separated by the optimal cut-point for each probe set. The overall survival of the groups was visualized using the method of Kaplan and Meier, and a P-value was calculated (based on the log-rank test) to determine whether there was a statistically significant difference in survival between the two groups (P ≤0.05). We performed univariate regression analysis using a Cox proportional hazards model with the nine probe sets as variables on the TT3 test set. To identify which of the genes corresponding to these nine probe sets had independent prognostic value, we performed a multivariate stepwise Cox regression analysis, in which CACYBP, FABP5, and IQGAP2 retained significance after competing with the remaining probe sets in the analysis. CACYBP had the highest hazard ratio (HR 2.70, P-value 0.01). We then performed the univariate and multivariate analyses on the TT2 test set, where CACYBP, CORO1A, ENO1, and STMN1 were selected by the multivariate analysis, and CACYBP had the highest hazard ratio (HR 1.93, P-value 0.004). CACYBP was the only gene selected by multivariate analyses of both test sets. Disclosures: No relevant conflicts of interest to declare.
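
A running log-rank search of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `min_group` guard, and the toy data are all hypothetical, and the statistic is the standard two-group log-rank chi-square with hypergeometric variance.

```python
def logrank_chi2(times, events, in_high_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        n = n1 = d = d1 = 0
        for tt, e, g in zip(times, events, in_high_group):
            if tt >= t:                 # still at risk just before time t
                n += 1
                n1 += 1 if g else 0
                if tt == t and e:       # event occurring exactly at t
                    d += 1
                    d1 += 1 if g else 0
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance


def best_cutpoint(expression, times, events, min_group=2):
    """Try every observed expression value as a cut-point; keep the best."""
    best_cut, best_stat = None, -1.0
    for cut in sorted(set(expression)):
        high = [x > cut for x in expression]
        n_high = sum(high)
        # Skip splits that leave one group too small to be meaningful.
        if n_high < min_group or len(high) - n_high < min_group:
            continue
        stat = logrank_chi2(times, events, high)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy data (hypothetical): higher expression goes with shorter survival.
toy_expression = [1, 2, 3, 4, 5, 6, 7, 8]
toy_times = [30, 28, 26, 24, 6, 5, 4, 3]
toy_events = [1, 1, 1, 1, 1, 1, 1, 1]
print(best_cutpoint(toy_expression, toy_times, toy_events))
```

Because the cut-point is chosen to maximize the statistic, the resulting P-value on the training set is optimistic; this is one reason the cut-points are then applied unchanged to separate test sets, as the abstract describes.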


2021 ◽  
Vol 11 ◽  
Author(s):  
Yinghao Meng ◽  
Hao Zhang ◽  
Qi Li ◽  
Fang Liu ◽  
Xu Fang ◽  
...  

Purpose: To develop and validate a machine learning classifier based on multidetector computed tomography (MDCT) for the preoperative prediction of tumor–stroma ratio (TSR) expression in patients with pancreatic ductal adenocarcinoma (PDAC). Materials and Methods: In this retrospective study, 227 patients with PDAC underwent an MDCT scan and surgical resection. We quantified the TSR by using hematoxylin and eosin staining and extracted 1409 arterial and portal venous phase radiomics features for each patient, respectively. Moreover, we used the least absolute shrinkage and selection operator logistic regression algorithm to reduce the features. The extreme gradient boosting (XGBoost) classifier was developed using a training set consisting of 167 consecutive patients, admitted between December 2016 and December 2017. The model was validated in 60 consecutive patients, admitted between January 2018 and April 2018. We determined the XGBoost classifier performance based on its discriminative ability, calibration, and clinical utility. Results: We observed low and high TSR in 91 (40.09%) and 136 (59.91%) patients, respectively. A log-rank test revealed significantly longer survival for patients in the TSR-low group than for those in the TSR-high group. The prediction model revealed good discrimination in the training set (area under the curve [AUC] = 0.93) and moderate discrimination in the validation set (AUC = 0.63). While the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value for the training set were 94.06%, 81.82%, 0.89, 0.89, and 0.90, respectively, those for the validation set were 85.71%, 48.00%, 0.70, 0.70, and 0.71, respectively. Conclusions: The CT radiomics-based XGBoost classifier provides a potentially valuable noninvasive tool to predict TSR in patients with PDAC and optimize risk stratification.
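
The five classification metrics quoted above all derive from one 2x2 confusion matrix. The sketch below shows the arithmetic. The counts in the usage lines are not reported in the abstract; they are back-calculated here as the integer counts that reproduce the published values (and that sum to the stated 167 and 60 patients), so they should be read as an inference, with high TSR assumed to be the positive class.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts inferred from the reported percentages (an assumption, see above).
training = binary_metrics(tp=95, fn=6, tn=54, fp=12)     # 167 patients
validation = binary_metrics(tp=30, fn=5, tn=12, fp=13)   # 60 patients

for name, metrics in (("training", training), ("validation", validation)):
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

Under these inferred counts the validation specificity of 48.00% corresponds to 12 of 25 negative cases, which helps explain how an accuracy of 0.70 can coexist with an AUC of 0.63.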


Author(s):  
Ade Nurhopipah ◽  
Uswatun Hasanah

The performance of classification models in machine learning is influenced by many factors, one of which is the dataset splitting method. To avoid overfitting, it is important to apply a suitable dataset splitting strategy. This study presents a comparison of four dataset splitting techniques, namely Random Sub-sampling Validation (RSV), k-Fold Cross Validation (k-FCV), Bootstrap Validation (BV) and Moralis Lima Martin Validation (MLMV). The comparison is carried out on face classification in CCTV images using a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), and is applied to two image datasets. The results are reviewed using model accuracy on the training set, validation set and test set, as well as the bias and variance of the model. The experiment shows that the k-FCV technique has more stable performance and provides high accuracy on the training set as well as good generalization on the validation set and test set. Meanwhile, data splitting using the MLMV technique performs worse than the other three techniques, since it yields lower accuracy. This technique also shows higher bias and variance values and builds overfitted models, especially when it is applied to the validation set.
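
Of the four splitting techniques compared above, k-fold cross-validation is the easiest to show compactly. The sketch below generates index splits using only the standard library; the function name, the seed, and the interleaved fold assignment are illustrative choices, not details taken from the study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [
            idx
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for idx in fold
        ]
        yield train, validation


for train_idx, val_idx in k_fold_indices(n_samples=10, k=5):
    print(len(train_idx), len(val_idx))
```

Every sample is used for validation exactly once, which is one plausible reason for the more stable accuracy estimates the study attributes to k-FCV.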


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 509-509 ◽  
Author(s):  
Matthew J Hartwell ◽  
Umut Ozbek ◽  
Ernst Holler ◽  
Anne S. Renteria ◽  
Pavan R. Reddy ◽  
...  

No laboratory test can predict non-relapse mortality (NRM) after hematopoietic cellular transplantation (HCT) prior to the onset of graft-versus-host disease (GVHD). Recently, we have shown that a signature of three GVHD plasma biomarkers (TNFR1, ST2, and REG3α) can predict response to GVHD therapy and NRM at the onset of clinical GVHD (Levine, Lancet Haem, 2015). Our goal in the current study was to identify a blood biomarker signature that could predict lethal GVHD and six-month NRM well in advance of the onset of GVHD symptoms. Patient samples on day +7 after HCT were obtained from 1,287 patients from 11 HCT centers in the Mount Sinai Acute GVHD International Consortium (MAGIC). Samples from two large centers (n = 929) were combined and randomly assigned to a training set (n = 620) and test set (n = 309). A further 358 patients from nine other centers constituted an independent validation set. The overall cumulative incidences of 6-month NRM were 11%, 12%, and 13% for the training, test, and validation sets, respectively. The incidences of lethal GVHD, defined as death without preceding relapse while under steroid treatment for acute GVHD, were 18%, 24%, and 14% in the same groups, respectively. The median day of GVHD onset was 28 days in the training set and 29 days in the test and validation sets. We measured four GVHD-related biomarkers [ST2, REG3α, TNFR1, and IL2Rα] in all samples and used the training set alone to develop competing risks regression models that used all 13 possible combinations of one to four biomarkers to predict 6-month NRM. The best algorithm, which we rigorously confirmed through Monte Carlo cross-validation of 75 different combinations of training sets, included ST2 and REG3α. No combination of one, three, or four biomarkers was superior to the combination of these two biomarkers. The day 7 algorithm identified high risk (HR) and low risk (LR) groups with 6-month NRMs of 28% and 7%, respectively (p<0.001) (Fig 1A).
The relapse rates did not differ between risk groups, so that overall survival (OS) was 60% for HR and 84% for LR (p<0.001) (Fig 1B). When applied to the test set (Fig 1C/D), the algorithm identified 54/309 (17%) of the patients as HR with an NRM of 33% vs 7% for LR patients (p<0.001) and 6-month OS of 57% and 81% for HR and LR patients, respectively (p<0.001). In the independent validation set (Fig 1 E/F), the algorithm identified 72/358 (20%) of the patients as HR with an NRM of 26% vs 10% for LR patients (p<0.001) and OS of 68% and 85% for HR and LR patients, respectively (p<0.001). High risk patients were three times more likely to die from GVHD than LR patients in each cohort (p<0.001) (Fig 2). The GI tract is the GVHD target organ that is most resistant to treatment and represents a major cause of NRM, and we observed twice as much severe GI GVHD (stage 3 or 4) in HR patients as in LR patients (p<0.001, data not shown). The algorithm successfully separated HR and LR strata for 6-month NRM in several groups with differing risks for GVHD and NRM, including donor type, degree of HLA match, age group, and conditioning regimen intensity (Fig 3). In conclusion, we have developed a blood biomarker algorithm that predicts the development of lethal GVHD seven days after HCT, which performed successfully in large multicenter validation sets. The GVH reaction is already in progress by day +7, even though clinical symptoms may not occur until days or weeks later. We speculate that the blood biomarker concentrations at this early time point reflect subclinical GI pathology, a notion that is reinforced by the fact that ST2 and REG3α, the two biomarkers in the algorithm, are closely associated with GI GVHD. The algorithm identified HR and LR strata in several patient groups with different overall risk for lethal GVHD (donor, HLA match, conditioning regimen intensity, age).
This day +7 algorithm should prove useful in clinical BMT research by identifying patients at high risk for lethal GVHD who might benefit from aggressive preemptive treatment strategies. Disclosures: Chen: Novartis: Research Funding; Incyte Corporation: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding. Jagasia: Therakos: Consultancy. Kitko: Therakos: Honoraria, Speakers Bureau. Kroeger: Novartis: Honoraria, Research Funding. Levine: Viracor: Patents & Royalties: GVHD biomarkers patent. Ferrara: Viracor: Patents & Royalties: GVHD biomarkers patent.
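
The Monte Carlo cross-validation mentioned above amounts to repeatedly re-drawing the training set at random. A minimal sketch follows, using the pooled sample size (929), training size (620), and number of repetitions (75) from the abstract; the function name and seed are assumptions, and the model-fitting step that would run inside the loop is deliberately left out.

```python
import random


def monte_carlo_splits(n_samples, n_train, n_repeats, seed=0):
    """Yield n_repeats independent random (train, test) index splits."""
    if not 0 < n_train < n_samples:
        raise ValueError("n_train must be between 1 and n_samples - 1")
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]


# 929 pooled patients, 620 for training, 75 repetitions (from the abstract).
for train_idx, test_idx in monte_carlo_splits(929, 620, 75):
    # A model would be refit on train_idx and scored on test_idx here.
    pass
print(len(train_idx), len(test_idx))
```

Unlike k-fold cross-validation, the held-out sets from different repetitions may overlap, so the number of repetitions can be chosen independently of the split ratio.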


2020 ◽  
Vol 2020 ◽  
pp. 1-19
Author(s):  
Hongxia Ma ◽  
Lihong Tong ◽  
Qian Zhang ◽  
Wenjun Chang ◽  
Fengsen Li

Background. Lung squamous cell carcinoma (LSCC) is a frequently diagnosed cancer worldwide, and it has a poor prognosis. The current study aimed to develop a prognostic model for LSCC by integrating multiomics data, including transcriptome, copy number variation, and mutation data, so as to predict patients’ survival and discover new therapeutic targets. Methods. RNASeq, SNP, and CNV data and LSCC patients’ clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA), and the samples were randomly divided into two groups, namely, the training set and the validation set. In the training set, the genes related to prognosis and those with different copy numbers or with different SNPs were integrated to extract features using random forests, and finally, robust biomarkers were screened. In addition, a gene-related prognostic model was established and further verified in the test set and GEO validation set. Results. We obtained a total of 804 prognosis-related genes, 535 copy amplification genes, 621 copy deletion genes, and 388 significantly mutated genes in genomic variants; notably, these genomic variant genes were found to be closely related to tumor development. A total of 51 candidate genes were obtained by integrating genomic variants and prognostic genes, and 5 characteristic genes (HIST1H2BH, SERPIND1, COL22A1, LCE3C, and ADAMTS17) were screened through random forest feature selection; we found that many of those genes had been reported to be related to LSCC progression. Cox regression analysis was performed to establish a 5-gene signature that could serve as an independent prognostic factor for LSCC patients and could stratify risk samples in the training set, test set, and external validation set (p<0.01), and the 5-year survival areas under the curve (AUC) of both the training set and validation set were > 0.67. Conclusion.
In the current study, a 5-gene signature was constructed as a novel prognostic marker to predict the survival of LSCC patients. The present findings provide new diagnostic and prognostic biomarkers and therapeutic targets for LSCC treatment.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Sung Yeon Sarah Han ◽  
Jason D. Cooper ◽  
Sureyya Ozcan ◽  
Nitin Rustogi ◽  
Brenda W.J.H. Penninx ◽  
...  

Individuals with subthreshold depression have an increased risk of developing major depressive disorder (MDD). The aim of this study was to develop a prediction model to predict the probability of MDD onset in subthreshold individuals, based on their proteomic, sociodemographic and clinical data. To this end, we analysed 198 features (146 peptides representing 77 serum proteins (measured using MRM-MS), 22 sociodemographic factors and 30 clinical features) in 86 first-episode MDD patients (training set patient group), 37 subthreshold individuals who developed MDD within two or four years (extrapolation test set patient group), and 86 subthreshold individuals who did not develop MDD within four years (shared reference group). To ensure the development of a robust and reproducible model, we applied feature extraction and model averaging across a set of 100 models obtained from repeated application of group LASSO regression with ten-fold cross-validation on the training set. This resulted in a 12-feature prediction model consisting of six serum proteins (AACT, APOE, APOH, FETUA, HBA and PHLD), three sociodemographic factors (body mass index, childhood trauma and education level) and three depressive symptoms (sadness, fatigue and leaden paralysis). Importantly, the model demonstrated a fair performance in predicting future MDD diagnosis of subthreshold individuals in the extrapolation test set (AUC = 0.75), which involved going beyond the scope of the model. These findings suggest that it may be possible to detect disease indications in subthreshold individuals up to four years prior to diagnosis, which has important clinical implications regarding the identification and treatment of high-risk individuals.
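
The AUC quoted above has a simple probabilistic reading: it is the chance that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = later developed the condition, 0 = did not.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ordered correctly: 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size (37 cases and 86 references) but would be replaced by a rank-based computation for large datasets.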


2021 ◽  
Vol 8 ◽  
Author(s):  
Sam Polesie ◽  
Martin Gillstedt ◽  
Gustav Ahlgren ◽  
Hannah Ceder ◽  
Johan Dahlén Gyllencreutz ◽  
...  

Background: Melanomas are often easy to recognize clinically, but determining whether a melanoma is in situ (MIS) or invasive is often more challenging, even with the aid of dermoscopy. Recently, convolutional neural networks (CNNs) have made significant and rapid advances within dermatology image analysis. The aims of this investigation were to create a de novo CNN for differentiating between MIS and invasive melanomas based on clinical close-up images and to compare its performance on a test set to that of seven dermatologists. Methods: A retrospective study including clinical images of MIS and invasive melanomas obtained from our department during a five-year time period (2016–2020) was conducted. Overall, 1,551 images [819 MIS (52.8%) and 732 invasive melanomas (47.2%)] were available. The images were randomized into three groups: training set (n = 1,051), validation set (n = 200), and test set (n = 300). A de novo CNN model with seven convolutional layers and a single dense layer was developed. Results: The area under the curve was 0.72 for the CNN (95% CI 0.66–0.78) and 0.81 for dermatologists (95% CI 0.76–0.86) (P < 0.001). The CNN correctly classified 208 out of 300 lesions (69.3%), whereas the corresponding number for dermatologists was 216 (72.0%). When comparing the CNN performance to each individual reader, three dermatologists significantly outperformed the CNN. Conclusions: For this classification problem, the CNN was outperformed by the dermatologists. However, since the algorithm was only trained and validated on 1,251 images, future refinement and development could make it useful for dermatologists in a real-world setting.

