Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma

2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Nicholas Nuechterlein ◽  
Linda G. Shapiro ◽  
Eric C. Holland ◽  
Patrick J. Cimino

Abstract Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling have been used for surrogate mutational determination. Previous studies by our group suggest that 1p/19q-codeletion and IDH mutational status can be predicted by genome-wide somatic copy number alteration (SCNA) data alone; however, a rigorous model to accomplish this task has yet to be established. In this study, we used SCNA data from 786 adult diffuse gliomas in The Cancer Genome Atlas (TCGA) to develop a two-stage classification system that identifies 1p/19q-codeleted oligodendrogliomas and predicts the IDH mutational status of astrocytic tumors using a machine-learning model. Cross-validated results on TCGA SCNA data showed near-perfect classification. Furthermore, our astrocytic IDH mutation model validated well on four additional datasets (AUC = 0.97, AUC = 0.99, AUC = 0.95, AUC = 0.96), as did our 1p/19q-codeleted oligodendroglioma screen on the two datasets that contained oligodendrogliomas (MCC = 0.97, MCC = 0.97). We then retrained our system using data from these validation sets and applied it to a cohort of REMBRANDT study subjects for whom SCNA data, but not IDH mutational status, are available. Overall, using genome-wide SCNAs, we successfully developed a system to robustly predict 1p/19q-codeletion and IDH mutational status in diffuse gliomas. This system can assign molecular subtype labels to tumor samples of retrospective diffuse glioma cohorts that lack 1p/19q-codeletion and IDH mutational status, such as the REMBRANDT study, recasting these datasets as validation cohorts for diffuse glioma research.
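The two metrics quoted in this abstract, AUC and MCC (Matthews correlation coefficient), can be computed for any set of binary labels and predictions. The following is a minimal sketch, not the authors' code: the function names `mcc` and `auc` and the toy labels are illustrative assumptions, and the AUC uses the pairwise (Mann-Whitney) definition.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def auc(y_true, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive sample has the higher score; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical labels: 1 = IDH-mutant, 0 = IDH-wildtype).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(auc(labels, scores))
print(mcc(labels, [1 if s >= 0.5 else 0 for s in scores]))
```

MCC is a useful companion to AUC when classes are imbalanced, because it uses all four cells of the confusion matrix at a fixed decision threshold.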

2019 ◽  
Author(s):  
Quanhua Mu ◽  
Jiguang Wang

Abstract Copy number alteration (CNA), the abnormal number of copies of genomic regions, plays a key role in cancer initiation and progression. Current high-throughput CNA detection methods, including DNA arrays and genomic sequencing, are relatively expensive and require DNA samples at the microgram level, which is not achievable in some settings, such as clinical biopsies or single-cell genomes. Here we propose an alternative method, CNAPE, to computationally infer CNA from gene expression data. A prior knowledge-aided machine learning model was trained and tested on the transcriptomic profiles, with matched CNA data, of 9,740 cancers from The Cancer Genome Atlas. Using brain tumors as a proof-of-concept study, CNAPE achieved over 90% accuracy in the prediction of arm-level CNAs. Prediction performance for 12 gene-level CNAs (commonly altered genes in glioma) was also evaluated, and CNAPE achieved reasonable accuracy. CNAPE is available as an easy-to-use tool at http://wang-lab.ust.hk/software/Software.html.


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ramon Gonzalez Manzano ◽  
Ana Catalan-Latorre ◽  
Antonio Brugarolas

Abstract Background Muscle-invasive urothelial bladder carcinomas (MIBCs) present RB1 and TP53 somatic alterations in a variable percentage of tumors throughout all molecular subtypes. MIBCs with neuroendocrine features have a high response rate to immune checkpoint inhibitors (ICIs). Whether the presence of somatic co-alterations in these 2 genes in MIBCs is relevant to their responsiveness to ICIs is not known. Methods The potential correlation of different genomic biomarkers of response to ICIs, such as tumor mutational burden (TMB), single nucleotide variant (SNV) predicted neoantigens, DNA damage response (DDR) genes, DNA somatic signatures and TILs infiltrate, was explored in patients with somatic co-alterations in RB1 and TP53 (RB1&TP53) as compared with patients with no alterations in either gene (double wild type, DWT) or with alterations in just one of the 2 genes. The Cancer Genome Atlas (TCGA) pancancer BLCA dataset of cystectomy specimens (n = 407) with mutation, copy number alteration and transcriptomic (RNA sequencing) data, as well as the IMVigor 210 study (n = 348) of metastatic urothelial bladder cancers treated with atezolizumab (PD-L1 inhibitor) with clinical response and transcriptomic (RNA sequencing) data, along with a subset (n = 274) with mutation and copy number data, were used for this purpose. A novel tumor microenvironment metascore (TMM) with predictive and prognostic ability was developed based on a LASSO regularized Cox model. Results Samples with co-altered RB1&TP53: a) were enriched in immunity effectors (CD8 cytotoxic lymphocytes, NK cells) and displayed higher scores of a T cell inflamed signature; b) had a higher TMB, a higher number of SNV predicted neoantigens and higher TILs fractions; c) had a higher number of DDR mutated and deep deleted DDR genes; d) had DNA somatic signatures 2 and 13, related to APOBEC mutagenesis.
Using the IMVigor 210 dataset, RB1&TP53 samples had the highest response rate to atezolizumab and a strong correlation with TMB and TMM. The consensus molecular subtype classification in the IMVigor 210 dataset showed a significant correlation with both the response to treatment (p = 0.001, chi-square) and the presence of RB1 and TP53 genomic alterations (p < 0.001, chi-square). Conclusions RB1&TP53 co-alterations are strongly associated with genomic biomarkers of response to ICIs in MIBCs.


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi120-vi120
Author(s):  
Nicholas Nuechterlein ◽  
Patrick Cimino

Abstract Inactivating mutations in NOTCH1 occur in many cancer types and are frequently observed in IDH-mutant, 1p/19q-codeleted oligodendroglioma. Although the role of NOTCH1 as a tumor suppressor in diffuse glioma has become appreciated in human tissue and small animal models, the spectrum of inactivating mutations in Notch pathway genes in diffuse astrocytic gliomas has not been well described. To address this, we queried the TCGA lower-grade glioma and glioblastoma datasets to establish the extent of inactivation of Notch pathway genes, specifically by cataloging single nucleotide variants and those with copy number loss or deletion. Key alteration frequencies were found to be similar in two independent glioma cohorts (Col, MSK). Notch pathway genes with inactivating alterations (overwhelmingly copy number loss) were present in 77% of TCGA diffuse gliomas. Across all diffuse gliomas, DLL3 loss was the most common alteration (TCGA 31%). For IDH-mutant diffuse astrocytic gliomas, JAG2 loss was the most common alteration (TCGA 23.0%, Col 35%, MSK 27%). DLL1 loss and MAML1 loss were mutually exclusive (p < 0.001) in TCGA IDH-mutant astrocytomas with a combined frequency of 39% (Col 47%, MSK 56%). The presence of any alteration in the top 10 altered Notch pathway genes indicated a shorter progression-free survival (p = 0.028) for TCGA IDH-mutant diffuse astrocytomas. For IDH-wildtype diffuse astrocytic gliomas, EP300 loss was the most common inactivating alteration (TCGA 35.4%, Col 49%, MSK 38%). EP300 loss, DLL1 loss, and DLL4 loss were mutually exclusive (p = 0.006) in TCGA IDH-wildtype diffuse astrocytic gliomas with a combined frequency of 61% (Col 72%, MSK 66%). The presence of alterations in any of these three genes indicated a decreased overall survival (p = 0.045) in TCGA IDH-wildtype diffuse astrocytic gliomas. Overall, loss of differential Notch pathway genes has prognostic implications in both IDH-wildtype and IDH-mutant diffuse astrocytic gliomas.
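The abstract reports mutual exclusivity p-values but does not say which test produced them. One common choice for a pair of alterations is a one-sided Fisher's exact test on the 2 × 2 co-occurrence table, sketched below under that assumption. The function name `mutual_exclusivity_p` and the counts are hypothetical and are not taken from the study.

```python
from math import comb

def mutual_exclusivity_p(both, only_a, only_b, neither):
    """One-sided Fisher's exact test for mutual exclusivity.

    Returns the probability, under independence with fixed margins, of
    observing `both` or fewer samples carrying alterations A and B together.
    """
    n = both + only_a + only_b + neither
    a_total = both + only_a          # samples with alteration A
    b_total = both + only_b          # samples with alteration B
    lowest = max(0, a_total + b_total - n)   # smallest feasible overlap
    favourable = sum(
        comb(a_total, k) * comb(n - a_total, b_total - k)
        for k in range(lowest, both + 1)
    )
    return favourable / comb(n, b_total)

# Hypothetical cohort: 5 samples with A only, 5 with B only, none with both.
print(mutual_exclusivity_p(both=0, only_a=5, only_b=5, neither=0))
```

A small p-value indicates fewer co-occurring samples than expected by chance. Testing three genes jointly, as in the IDH-wildtype result above, would need a different procedure than this pairwise sketch.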


Author(s):  
Yinglei Lai ◽  
Joseph L. Gastwirth

Abstract Copy number alteration (CNA) data have been collected to study disease-related chromosomal amplifications and deletions. The CUSUM procedure and related plots have been used to explore CNA data. In practice, outliers may be observed, and modifications of the CUSUM procedure may then be required. An outlier reset modification of the CUSUM (ORCUSUM) procedure is developed in this paper. The threshold value for detecting outliers or significant CUSUMs can be derived using results for sums of independent truncated normal random variables. Bartels’ non-parametric test for autocorrelation is also introduced to the analysis of copy number variation data. Our simulation results indicate that the ORCUSUM procedure can still be used even when the degree of autocorrelation is low. Furthermore, the results show the outlier’s impact on the traditional CUSUM’s performance and illustrate the advantage of the ORCUSUM’s outlier reset feature. Additionally, we discuss how the ORCUSUM can be applied to examine CNA data with a simulated data set. To illustrate the procedure, recently collected single nucleotide polymorphism (SNP) based CNA data from The Cancer Genome Atlas (TCGA) Research Network are analyzed. The method is applied to a data set collected in an ovarian cancer study. Three cytogenetic bands (cytobands) are considered to illustrate the method. The cytobands 11q13 and 9p21 have been shown to be related to ovarian cancer. They are presented as positive examples. The cytoband 3q22, which is less likely to be disease related, is presented as a negative example. These results illustrate the usefulness of the ORCUSUM procedure as an exploratory tool for the analysis of SNP based CNA data.
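To make the idea of a cumulative sum with an outlier reset concrete, here is a minimal sketch of a one-sided CUSUM. It is not the ORCUSUM procedure itself, whose reset rule and thresholds are not given in the abstract. The reset rule used here (zero the running sum whenever a value lies more than `outlier_cutoff` from `target`) and all parameter names are assumptions made for illustration.

```python
def cusum_with_reset(values, target=0.0, slack=0.0,
                     outlier_cutoff=float("inf")):
    """One-sided (upper) CUSUM path with a simple, assumed outlier rule.

    Each step adds (x - target - slack) to the running sum, floored at 0.
    A value farther than `outlier_cutoff` from `target` is treated as an
    outlier: it is not accumulated and the running sum restarts at 0.
    """
    running = 0.0
    path = []
    for x in values:
        if abs(x - target) > outlier_cutoff:
            running = 0.0            # assumed reset rule, see lead-in
        else:
            running = max(0.0, running + (x - target - slack))
        path.append(running)
    return path

# Hypothetical log-ratio-like signal with one extreme probe value.
signal = [1, 10, 1]
print(cusum_with_reset(signal))                     # plain CUSUM
print(cusum_with_reset(signal, outlier_cutoff=5))   # with the reset rule
```

In the plain path the single extreme value dominates every later sum, whereas the reset variant keeps it from propagating. That is the kind of outlier effect the abstract describes for the traditional CUSUM.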


2020 ◽  
Vol 22 (11) ◽  
pp. 1602-1613
Author(s):  
Jeanette E Eckel-Passow ◽  
Kristen L Drucker ◽  
Thomas M Kollmeyer ◽  
Matt L Kosel ◽  
Paul A Decker ◽  
...  

Abstract Background Twenty-five germline variants are associated with adult diffuse glioma, and some of these variants have been shown to be associated with particular subtypes of glioma. We hypothesized that additional germline variants could be identified if a genome-wide association study (GWAS) were performed by molecular subtype. Methods A total of 1320 glioma cases and 1889 controls were used in the discovery set and 799 glioma cases and 808 controls in the validation set. Glioma cases were classified into molecular subtypes based on combinations of isocitrate dehydrogenase (IDH) mutation, telomerase reverse transcriptase (TERT) promoter mutation, and 1p/19q codeletion. Logistic regression was applied to the discovery and validation sets to test for associations of variants with each of the subtypes. A meta-analysis was subsequently performed using a genome-wide P-value threshold of 5 × 10⁻⁸. Results Nine variants in or near D-2-hydroxyglutarate dehydrogenase (D2HGDH) on chromosome 2 were genome-wide significant in IDH-mutated glioma (most significant was rs5839764, meta P = 2.82 × 10⁻¹⁰). Further stratifying by 1p/19q codeletion status, one variant in D2HGDH was genome-wide significant in IDH-mutated non-codeleted glioma (rs1106639, meta P = 4.96 × 10⁻⁸). Further stratifying by TERT mutation, one variant near FAM20C (family with sequence similarity 20, member C) on chromosome 7 was genome-wide significant in gliomas that have IDH mutation, TERT mutation, and 1p/19q codeletion (rs111976262, meta P = 9.56 × 10⁻⁹). Thirty-six variants in or near GMEB2 on chromosome 20 near regulator of telomere elongation helicase 1 (RTEL1) were genome-wide significant in IDH wild-type glioma (most significant was rs4809313, meta P = 2.60 × 10⁻¹⁰). Conclusions Performing a GWAS by molecular subtype identified 2 new regions and a candidate independent region near RTEL1, which were associated with specific glioma molecular subtypes.
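The abstract does not state which meta-analysis method combined the discovery and validation results. A common choice for per-variant logistic regression estimates is a fixed-effect, inverse-variance weighted combination of log odds ratios, sketched below under that assumption. The function name `fixed_effect_meta` and the input numbers are hypothetical; only the 5 × 10⁻⁸ threshold comes from the abstract.

```python
import math

GENOME_WIDE_ALPHA = 5e-8   # threshold stated in the abstract

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study log odds ratios for one variant
    ses:   their standard errors
    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Hypothetical discovery and validation estimates for a single variant.
beta, se, p = fixed_effect_meta([0.5, 0.5], [0.1, 0.1])
print(beta, se, p, p < GENOME_WIDE_ALPHA)
```

Neither hypothetical study reaches the genome-wide threshold on its own here, but the pooled estimate does, which is the usual motivation for combining discovery and validation sets.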

