Pan-cancer analysis of differential DNA methylation patterns

Background: DNA methylation is a key epigenetic regulator contributing to cancer development. To understand the role of DNA methylation in tumorigenesis, it is important to investigate and compare differential methylation (DM) patterns between normal and case samples across different cancer types. However, current pan-cancer analyses call DM separately for each cancer type, which lowers statistical power and fails to provide a comprehensive view of patterns across cancers. Methods: In this work, we propose a rigorous statistical model, PanDM, to jointly characterize DM patterns across diverse cancer types. PanDM exploits the hidden correlations in the combined dataset to improve statistical power through joint modeling. It takes summary statistics from separate analyses as input and performs methylation site clustering, differential methylation detection, and pan-cancer pattern discovery. We demonstrate the favorable performance of PanDM using simulated data and apply the model to methylome data for 12 cancer types collected from The Cancer Genome Atlas (TCGA) project. We further conduct ontology- and pathway-enrichment analyses to gain new biological insights into the pan-cancer DM patterns learned by PanDM. Results: In the simulation study, PanDM outperforms two types of separate analyses in the power of DM calling. Application of PanDM to TCGA data reveals 37 pan-cancer DM patterns in the 12 cancer methylomes, including both common and cancer-type-specific patterns. These 37 patterns are in turn used to group cancer types. Functional ontology terms and biological pathways enriched in the non-common patterns not only underpin the cancer-type-specific etiology and pathogenesis but also unveil common environmental risk factors shared by multiple cancer types. We also identify PanDM-specific DM CpG sites that the common strategy fails to detect. Conclusions: PanDM is a powerful tool that provides a systematic way to investigate aberrant methylation patterns across multiple cancer types. Results from real data analyses suggest a novel angle from which to understand the common and specific DM patterns in different cancers. Moreover, because PanDM works on the summary statistics for each cancer type, the same framework can in principle be applied to pan-cancer analyses of other functional genomic profiles. We implement PanDM as an R package, which is freely available at http://www.sta.cuhk.edu.hk/YWei/PanDM.html.
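
For orientation, the separate-analysis baseline that the abstract contrasts PanDM with can be sketched as follows. This is not the PanDM model, which is distributed as an R package and is not reproduced here. It is a minimal Python illustration, assuming each cancer type supplies one p-value per CpG site and that calls are made per cancer type with Benjamini-Hochberg correction. The function names, the 0.05 cutoff, and the toy cancer labels are assumptions made for the example.

```python
from collections import Counter


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def dm_patterns(p_by_cancer, alpha=0.05):
    """Call DM separately per cancer type, then tabulate per-site binary patterns.

    p_by_cancer maps a cancer label to a list of per-site p-values
    (hypothetical input layout; all lists must have the same length).
    """
    cancers = sorted(p_by_cancer)
    calls = {
        c: [q <= alpha for q in benjamini_hochberg(p_by_cancer[c])]
        for c in cancers
    }
    n_sites = len(p_by_cancer[cancers[0]])
    patterns = [tuple(int(calls[c][s]) for c in cancers) for s in range(n_sites)]
    return cancers, patterns, Counter(patterns)


if __name__ == "__main__":
    # Toy p-values for four CpG sites in two cancer types (made-up numbers).
    toy = {"BRCA": [0.001, 0.5, 0.002, 0.8], "LUAD": [0.001, 0.6, 0.9, 0.7]}
    cancers, patterns, counts = dm_patterns(toy)
    print(cancers, patterns, dict(counts))
```

Each site's tuple records in which cancer types it was called DM. Because every cancer type is tested in isolation, a site with moderate evidence in several cancers can miss the cutoff in all of them, which is the loss of power that joint modeling is meant to address.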


Systematic Investigation of DNA Methylation Associated With Platinum Chemotherapy Resistance Across 13 Cancer Types

Frontiers in Pharmacology ◽ 10.3389/fphar.2021.616529 ◽ 2021 ◽ Vol 12

Author(s): Ruizheng Sun ◽ Chao Du ◽ Jiaxin Li ◽ Yanhong Zhou ◽ Wei Xiong ◽ ...

Keyword(s): DNA Methylation ◽ Systematic Investigation ◽ Support Vector ◽ Platinum Resistance ◽ Chemotherapy Response ◽ Aberrant Methylation ◽ Cancer Type ◽ Cancer Types ◽ Methylation Patterns ◽ Platinum Chemotherapy

Background: Platinum resistance poses a significant problem for oncology clinicians. As a result, the role of epigenetics and DNA methylation in platinum-based chemoresistance has gained increasing attention from researchers in recent years. A systematic investigation of aberrant methylation patterns related to platinum resistance across various cancer types is urgently needed. Methods: We analyzed methylation patterns related to platinum chemotherapy response from several perspectives in 618 patients across 13 cancer types and integrated transcriptional and clinical data. Spearman’s test was used to evaluate the correlation between methylation and gene expression. Cox analysis, the Kaplan-Meier method, and log-rank tests were performed to identify potential risk biomarkers based on differentially methylated positions (DMPs) and to compare survival based on DMP values. Support vector machines and receiver operating characteristic curves were used to identify the DMPs predictive of platinum response. Results: A total of 3,703 DMPs (p value < 0.001 and absolute delta beta > 0.10) were identified, and the number of DMPs varied by cancer type. Of these DMPs, 39.83% were hypermethylated and 60.17% were hypomethylated in platinum-resistant patients. Among them, 405 DMPs (Benjamini-Hochberg adjusted p value < 0.05) were associated with prognosis in tumor patients treated with platinum-based regimens, and 664 DMPs displayed the potential to predict platinum chemotherapy response. In addition, we defined six DMPs mapping to four genes (mesothelin, protein kinase cAMP-dependent type II regulatory subunit beta, msh homeobox 1, and par-6 family cell polarity regulator alpha) that may have favorable prognostic and predictive value for platinum chemotherapy. Conclusion: The methylation-transcription axis exists and participates in the complex biological mechanism of platinum resistance in various cancers. Six DMPs and four associated genes may have the potential to serve as promising epigenetic biomarkers for platinum-based chemotherapy and guide clinical selection of optimal treatment.
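
The DMP selection rule quoted in the Results (p value < 0.001 and absolute delta beta > 0.10) can be illustrated with a short Python sketch. It assumes that delta beta is the mean beta value in platinum-resistant patients minus the mean in platinum-sensitive patients, and that the per-site p-value has already been computed by whatever test the authors used. The record layout, site identifiers, and function name are hypothetical.

```python
from statistics import mean


def call_dmps(sites, p_cutoff=0.001, delta_cutoff=0.10):
    """Keep sites passing both cutoffs and label their direction.

    Each site is a dict with keys "id", "p", "resistant", "sensitive"
    (hypothetical layout); "resistant" and "sensitive" hold beta values.
    Delta beta is assumed to be resistant mean minus sensitive mean.
    """
    dmps = []
    for site in sites:
        delta = mean(site["resistant"]) - mean(site["sensitive"])
        if site["p"] < p_cutoff and abs(delta) > delta_cutoff:
            direction = "hyper" if delta > 0 else "hypo"
            dmps.append((site["id"], round(delta, 3), direction))
    return dmps


if __name__ == "__main__":
    # Made-up beta values for three illustrative CpG sites.
    toy_sites = [
        {"id": "site_a", "p": 0.0001, "resistant": [0.8, 0.7], "sensitive": [0.5, 0.4]},
        {"id": "site_b", "p": 0.0001, "resistant": [0.2, 0.3], "sensitive": [0.5, 0.6]},
        {"id": "site_c", "p": 0.0100, "resistant": [0.9, 0.9], "sensitive": [0.1, 0.1]},
    ]
    for dmp in call_dmps(toy_sites):
        print(dmp)
```

Requiring both a small p-value and a minimum effect size keeps sites that are statistically significant but differ only trivially in methylation level out of the DMP list.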


Exploration of the MSI landscape in Chinese pan-cancer patient by Next-Generation Sequencing.

Journal of Clinical Oncology ◽ 10.1200/jco.2021.39.15_suppl.e14576 ◽ 2021 ◽ Vol 39 (15_suppl) ◽ pp. e14576-e14576

Author(s): Xinlu Liu ◽ Jiasheng Xu ◽ Jian Sun ◽ Deng Wei ◽ Xinsheng Zhang ◽ ...

Keyword(s): Colorectal Cancer ◽ Correlation Analysis ◽ Cancer Patients ◽ Cancer Type ◽ Multiple Tumor ◽ Treatment Plans ◽ Cancer Types ◽ Chinese Cancer Patients ◽ Colorectal Cancer Patients ◽ Pan Cancer

e14576. Background: Clinically, MSI has been used as an important molecular marker for the prognosis of colorectal cancer and other solid tumors and for the formulation of adjuvant treatment plans, and it has been used to assist in screening for Lynch syndrome. However, there are currently few reports on the incidence of MSI-H in Chinese pan-cancer patients. This study describes the occurrence of MSI in a large multi-center pan-cancer cohort in China and explores the correlation between MSI and patients' TMB, age, PD-L1 expression, and other indicators. Methods: The study included 8361 patients with 8 cancer types from multiple tumor centers. Immunohistochemistry was used to detect the expression of MMR proteins (MLH1, MSH2, MSH6 and PMS2) in patients with the various cancer types to determine MSI status, and to detect PD-L1 expression. Using NGS, 831 genes were sequenced in the 8361 Chinese cancer patients and the tumor mutation burden of each patient was calculated. MSI status was analyzed across the 8 cancer types, together with its correlation with patient age, TMB, and PD-L1 expression. Results: MSI patients accounted for 1.66% of the pan-cancer cohort. MSI-H patients accounted for the highest proportion in intestinal cancer, reaching 7.2%. Correlation analysis between MSI and TMB was performed for each cancer type. In every cancer type, MSI-H patients had a TMB greater than 10, and among colorectal cancer patients 26.83% of MSI-H patients had a TMB greater than 100. There was no significant correlation between patient age and the risk of MSI (P > 0.05). Except in PAAD and LUAD, PD-L1 expression was higher in MSI-H patients than in MSS patients (P < 0.05). Correlation analysis between PD-L1 expression and TMB found that, in colorectal cancer, higher PD-L1 expression was associated with higher TMB (P < 0.05). Conclusions: In this study, we explored the incidence of MSI-H in pan-cancer patients in China and found that TMB was greater than 10 in patients with MSI-H. Compared with MSS patients, MSI-H patients have higher PD-L1 expression, and in colorectal cancer higher PD-L1 expression is associated with higher TMB.


Multi-omic data helps improve prediction of personalised tumor suppressors and oncogenes

10.1101/2022.01.13.476163 ◽ 2022

Author(s): Malvika Sudhakar ◽ Raghunathan Rengaswamy ◽ Karthik Raman

Keyword(s): Tumour Progression ◽ Suppressor Gene ◽ Tumour Suppressor Gene ◽ Driver Mutations ◽ Cancer Type ◽ Expression Data ◽ Multiple Cancer ◽ Driver Genes ◽ Cancer Types ◽ Omic Data

The progression of tumorigenesis starts with a few mutational and structural driver events in the cell. Various cohort-based computational tools exist to identify driver genes but require a large number of samples to produce reliable results. Many studies use different methods to distinguish driver mutations/genes from mutations that have no impact on tumour progression; however, a small fraction of patients show no mutational events in any known driver genes. Current unsupervised methods map somatic and expression data onto a network to identify perturbations in the network. Our method is the first machine learning model to classify genes as tumour suppressor gene (TSG), oncogene (OG) or neutral, thus assigning the functional impact of the gene in the patient. In this study, we develop a multi-omic approach, PIVOT (Personalised Identification of driVer OGs and TSGs), to train on experimentally or computationally validated mutational and structural driver events. Given the lack of any gold standard for the identification of personalised driver genes, we label the data using four strategies and, based on classification metrics, show that gene-based labelling strategies perform best. We build different models using SNV, RNA, and multi-omic features, to be used according to the data available. Our models trained on multi-omic data improved predictions compared to mutation and expression data, achieving an accuracy >0.99 for BRCA, LUAD and COAD datasets. We show that network and expression-based features contribute the most to PIVOT. Our predictions on BRCA, COAD and LUAD cancer types reveal commonly altered genes such as TP53 and PIK3CA, which are predicted drivers for multiple cancer types. Along with known driver genes, our models also identify new driver genes such as PRKCA, SOX9 and PSMD4. Our multi-omic model labels both CNVs and mutations, with CNV alterations contributing more. While predicting labels for genes mutated in multiple samples, we also label rare driver events occurring in as few as one sample. We also identify genes with dual roles within the same cancer type. Overall, PIVOT labels personalised driver genes as TSGs and OGs and also identifies rare driver genes. PIVOT is available at https://github.com/RamanLab/PIVOT.


Systematic identification of non-coding somatic single nucleotide variants associated with altered transcription and DNA methylation in adult and pediatric cancers

NAR Cancer ◽ 10.1093/narcan/zcab001 ◽ 2021 ◽ Vol 3 (1)

Author(s): Fengju Chen ◽ Yiqun Zhang ◽ Chad J Creighton

Keyword(s): DNA Methylation ◽ Brain Tumor ◽ Low Grade ◽ Single Nucleotide Variants ◽ Multiple Cancer ◽ Single Nucleotide ◽ Altered Expression ◽ Wide Range ◽ Mutational Hotspots ◽ Cancer Types

Whole-genome sequencing combined with transcriptomics can reveal impactful non-coding single nucleotide variants (SNVs) in cancer. Here, we developed an integrative analytical approach that, as a first step, identifies genes altered in expression or DNA methylation in association with nearby somatic SNVs, in contrast to alternative approaches that first identify mutational hotspots. Using genomic datasets from the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium and the Children's Brain Tumor Tissue Consortium (CBTTC), we identified hundreds of genes and associated CpG islands for which the nearby presence of a non-coding somatic SNV was recurrently associated with altered expression or DNA methylation, respectively. Genomic regions upstream or downstream of genes, gene introns and gene untranslated regions were all involved. The PCAWG adult cancer cohort yielded different significant SNV-expression associations from the CBTTC pediatric brain tumor cohort. The SNV-expression associations involved a wide range of cancer types and histologies, as well as potential gain or loss of transcription factor binding sites. Notable genes with SNV-associated increased expression include TERT, COPS3, POLE2 and HDAC2 (involving multiple cancer types); MYC, BCL2, PIM1 and IGLL5 (involving lymphomas); and CYHR1 (involving pediatric low-grade gliomas). Non-coding somatic SNVs show a major role in shaping the cancer transcriptome, not limited to mutational hotspots.


Dynamic DNA Methylation in Plant Growth and Development

International Journal of Molecular Sciences ◽ 10.3390/ijms19072144 ◽ 2018 ◽ Vol 19 (7) ◽ pp. 2144 ◽ Cited By ~ 59

Author(s): Arthur Bartels ◽ Qiang Han ◽ Pooja Nair ◽ Liam Stacey ◽ Hannah Gaynier ◽ ...

Keyword(s): DNA Methylation ◽ Plant Growth ◽ Growth And Development ◽ Genome Stability ◽ Epigenetic Modification ◽ Plant Growth And Development ◽ Complex Dynamic ◽ Dynamic Changes ◽ The Common ◽ Methylation Patterns

DNA methylation is an epigenetic modification required for transposable element (TE) silencing, genome stability, and genomic imprinting. Although DNA methylation has been intensively studied, the dynamic nature of methylation among different species has just begun to be understood. Here we summarize the recent progress in research on the wide variation of DNA methylation in different plants, organs, tissues, and cells; dynamic changes of methylation are also reported during plant growth and development as well as changes in response to environmental stresses. Overall DNA methylation is quite diverse among species, and it occurs in CG, CHG, and CHH (H = A, C, or T) contexts of genes and TEs in angiosperms. Moderately expressed genes are most likely methylated in gene bodies. Methylation levels decrease significantly just upstream of the transcription start site and around transcription termination sites; its levels in the promoter are inversely correlated with the expression of some genes in plants. Methylation can be altered by different environmental stimuli such as pathogens and abiotic stresses. It is likely that methylation existed in the common eukaryotic ancestor before fungi, plants and animals diverged during evolution. In summary, DNA methylation patterns in angiosperms are complex, dynamic, and an integral part of genome diversity after millions of years of evolution.


Integrative Analysis Reveals Comprehensive Altered Metabolic Genes Linking with Tumor Epigenetics Modification in Pan-Cancer

BioMed Research International ◽ 10.1155/2019/6706354 ◽ 2019 ◽ Vol 2019 ◽ pp. 1-17 ◽ Cited By ~ 1

Author(s): Yahui Shi ◽ Jinfen Wei ◽ Zixi Chen ◽ Yuchen Yuan ◽ Xingsong Li ◽ ...

Keyword(s): Gene Expression ◽ DNA Methylation ◽ Histone Acetylation ◽ Epigenetic Modification ◽ Metabolic Reprogramming ◽ The Cancer Genome Atlas ◽ Altered Expression ◽ Metabolic Genes ◽ Cancer Types ◽ Pan Cancer

Background. Cancer cells undergo extensive rewiring of metabolism and dysregulation of epigenetic modification to support their biosynthetic needs. Although the major features of metabolic reprogramming have been elucidated, the metabolic genes linked to epigenetics have been overlooked at the pan-cancer level. Objectives. Identifying the critical differentially expressed metabolic signatures that contribute to epigenetic alterations across cancer types is an urgent issue for providing potential targets for cancer therapy. Method. Differential gene expression and DNA methylation were analyzed using data from 5726 samples from The Cancer Genome Atlas (TCGA). Results. First, we analyzed the differential expression of metabolic genes and found that cancers underwent overall metabolic reprogramming, with an expression trend similar to that seen in data from the Gene Expression Omnibus (GEO) database. Second, we summarized the regulatory network of histone acetylation and DNA methylation associated with altered expression of metabolic genes. Survival analysis then showed that high expression of DNMT3B was associated with poorer overall survival in 5 cancer types. Integrating altered methylation and expression revealed specific genes influenced by DNMT3B through DNA methylation across cancers. These genes do not overlap across cancer types and are involved in different functional annotations depending on the tissue, which indicates that DNMT3B might influence DNA methylation in a tissue-specific manner. Conclusions. Our research identifies key metabolic genes, ACLY, SLC2A1, KAT2A, and DNMT3B, which are the most dysregulated and indirectly contribute to the dysfunction of histone acetylation and DNA methylation in cancer. We also found potential genes in different cancer types influenced by DNMT3B. Our study highlights possible epigenetic disorders resulting from the deregulation of metabolic genes in pan-cancer and suggests potential therapeutic targets for the clinical treatment of human cancer.


DNA Methylation Changes in Human Papillomavirus-Driven Head and Neck Cancers

Cells ◽ 10.3390/cells9061359 ◽ 2020 ◽ Vol 9 (6) ◽ pp. 1359 ◽ Cited By ~ 1

Author(s): Chameera Ekanayake Weeramange ◽ Kai Dun Tang ◽ Sarju Vasani ◽ Julian Langton-Lockton ◽ Liz Kenny ◽ ...

Keyword(s): DNA Methylation ◽ Human Papillomavirus ◽ Head And Neck ◽ DNA Methyltransferase ◽ Long Control Region ◽ Cancer Types ◽ E6 And E7 ◽ HPV Oncoproteins ◽ Cellular DNA ◽ Methylation Patterns

Disruption of DNA methylation patterns is one of the hallmarks of cancer. Similar to other cancer types, human papillomavirus (HPV)-driven head and neck cancer (HNC) also reveals alterations in its methylation profile. The intrinsic ability of HPV oncoproteins E6 and E7 to interfere with DNA methyltransferase activity contributes to these methylation changes. There are many genes that have been reported to be differentially methylated in HPV-driven HNC. Some of these genes are involved in major cellular pathways, indicating that DNA methylation, at least in certain instances, may contribute to the development and progression of HPV-driven HNC. Furthermore, the HPV genome itself becomes a target of the cellular DNA methylation machinery. Some of these methylation changes appearing in the viral long control region (LCR) may contribute to uncontrolled oncoprotein expression, leading to carcinogenesis. Consistent with these observations, demethylation therapy appears to have significant effects on HPV-driven HNC. This review article comprehensively summarizes DNA methylation changes and their diagnostic and therapeutic indications in HPV-driven HNC.


Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics

Scientific Reports ◽ 10.1038/srep24949 ◽ 2016 ◽ Vol 6 (1) ◽ Cited By ~ 13

Author(s): Erdogan Taskesen ◽ Sjoerd M. H. Huisman ◽ Ahmed Mahfouz ◽ Jesse H. Krijthe ◽ Jeroen de Ridder ◽ ...

Keyword(s): Therapy Response ◽ Visual Exploration ◽ Molecular Characteristics ◽ Cancer Type ◽ Breast Cancers ◽ Data Types ◽ Multiple Cancer ◽ Genome Wide ◽ Genome Wide Data ◽ Cancer Types

The use of genome-wide data in cancer research to identify groups of patients with similar molecular characteristics has become a standard approach for applications in therapy response, prognosis prediction, and drug development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer type towards measuring several different molecular characteristics across multiple cancer types. Although current approaches shed light on the molecular characteristics of various cancer types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics and supports both visual exploration of the data in a two-dimensional landscape and inspection of the contribution of the different genome-wide data types. We integrated 4,434 samples across 19 cancer types, derived from TCGA, containing gene expression, DNA methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, of which three showed a complex collection of cancer types: squamous cell carcinomas, colorectal cancers, and a novel grouping of kidney cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for acute myeloid leukemias and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.


Genome-Wide DNA Methylation Analysis Shows Enrichment of Differential Methylation in “Open Seas” and Enhancers and Reveals Hypomethylation in DNMT3A Mutated Cytogenetically Normal AML (CN-AML)

Blood ◽ 10.1182/blood.v120.21.653.653 ◽ 2012 ◽ Vol 120 (21) ◽ pp. 653-653 ◽ Cited By ~ 2

Author(s): Ying Qu ◽ Andreas Lennartsson ◽ Verena I. Gaidzik ◽ Stefan Deneberg ◽ Sofia Bengtzén ◽ ...

Keyword(s): Gene Expression ◽ DNA Methylation ◽ CpG Island ◽ CpG Islands ◽ Differential Methylation ◽ Methylation Analysis ◽ CpG Sites ◽ Genome Wide ◽ Genomic Regions ◽ Methylation Patterns

Abstract 653. DNA methylation is involved in multiple biologic processes including normal cell differentiation and tumorigenesis. In AML, methylation patterns have been shown to differ significantly from those of normal hematopoietic cells. Most previous studies of DNA methylation in AML have focused on CpG islands within gene promoters, which represent only a very small proportion of the DNA methylome. In this study, we performed genome-wide methylation analysis of 62 patients with CN-AML and of CD34-positive cells from healthy controls using the Illumina HumanMethylation450K array, which covers 450,000 CpG sites in CpG islands as well as in genomic regions far from CpG islands. Differentially methylated CpG sites (DMS) between CN-AML and normal hematopoietic cells were calculated, and the most significant enrichment of DMS was found in regions more than 4 kb from CpG islands, the so-called open sea, where hypomethylation was the dominant form of aberrant methylation. In contrast, CpG islands were not enriched for DMS, and DMS in CpG islands were dominated by hypermethylation. DMS successively further away from CpG islands, in CpG island shores (up to 2 kb from a CpG island) and shelves (2 kb to 4 kb from an island), showed an increasing degree of hypomethylation in AML cells. Among regions defined by their relation to gene structures, CpG dinucleotides located in theoretic enhancers were the most enriched for DMS (χ² test, p < 0.0001), with the majority of DMS showing decreased methylation compared to CD34-positive normal controls. To address the relation to gene expression, gene expression profiling (GEP) by microarray was carried out on 32 of the CN-AML patients. In total, 339,723 CpG sites covering 18,879 genes were addressed on both platforms. CpG methylation in CpG islands showed the most pronounced anti-correlation with gene expression level (Spearman ρ = -0.4145), followed by CpG island shores (mean Spearman ρ across both shores = -0.2350). As transcription factors (TFs) have been shown to be crucial for AML development, we especially studied differential methylation of an unbiased selection of 1638 TFs. The most enriched differential methylation between CN-AML and normal CD34-positive cells was found in TFs known to be involved in hematopoiesis, with Wilms tumor protein-1 (WT1), activator protein 1 (AP-1) and runt-related transcription factor 1 (RUNX1) being the most differentially methylated TFs. The differential methylation in WT1 and RUNX1 was located in intragenic regions, which was confirmed by pyrosequencing. AML cases were characterized with respect to mutations in FLT3, NPM1, IDH1, IDH2 and DNMT3A. Correlation analysis between genome-wide methylation patterns and mutational status showed a statistically significant association between the presence of DNMT3A mutations and hypomethylation of CpG islands (p < 0.0001) and, to a lesser extent, CpG island shores (p < 0.001). This links DNMT3A mutations for the first time to a hypomethylated phenotype. Further analyses correlating methylation patterns to other clinical data such as clinical outcome are ongoing. In conclusion, our study revealed that non-CpG island regions, and in particular enhancers, are the most aberrantly methylated genomic regions in AML and that WT1 and RUNX1 are the most differentially methylated TFs. Furthermore, our data suggest a hypomethylated phenotype in DNMT3A-mutated AML. Disclosures: No relevant conflicts of interest to declare.
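
The distance-based CpG categories used in the abstract (island, shore up to 2 kb, shelf from 2 kb to 4 kb, open sea beyond 4 kb) can be made concrete with a small Python sketch. It assumes each site comes with its distance in base pairs to the nearest CpG island (0 meaning inside an island) and a signed methylation difference, AML minus control. How the exact boundary values are assigned, and the function names, are assumptions made for the example rather than details from the study.

```python
from collections import defaultdict


def cpg_context(distance_bp):
    """Classify a CpG site by its distance to the nearest CpG island."""
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    if distance_bp == 0:
        return "island"
    if distance_bp <= 2000:
        return "shore"
    if distance_bp <= 4000:
        return "shelf"
    return "open sea"


def hypo_fraction_by_context(sites):
    """Return the fraction of hypomethylated sites in each context.

    sites is an iterable of (distance_bp, delta) pairs, where delta is the
    methylation difference AML minus control (negative = hypomethylated).
    """
    totals = defaultdict(int)
    hypo = defaultdict(int)
    for distance_bp, delta in sites:
        context = cpg_context(distance_bp)
        totals[context] += 1
        if delta < 0:
            hypo[context] += 1
    return {context: hypo[context] / totals[context] for context in totals}


if __name__ == "__main__":
    # Made-up differentially methylated sites: (distance to island, delta).
    toy = [(0, 0.20), (1500, -0.15), (3000, -0.25), (5000, -0.30), (9000, -0.10)]
    print(hypo_fraction_by_context(toy))
```

Tabulating the hypomethylated fraction per category is one simple way to check the trend the abstract reports, with hypomethylation increasing from islands through shores and shelves to the open sea.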


Multi-cancer classification; an analysis of neural network complexity

10.1101/2022.01.10.475759 ◽ 2022

Author(s): James W. Webber ◽ Kevin M. Elias

Keyword(s): Neural Network ◽ The Body ◽ Cancer Classification ◽ Control Group ◽ Specific Information ◽ Cancer Type ◽ Feed Forward ◽ Cancer Prediction ◽ Cancer Types ◽ Pan Cancer

Background: Cancer identification is generally framed as binary classification, normally discrimination of a control group from a single cancer group. However, such models lack any cancer-specific information, as they are only trained on one cancer type. The models fail to account for competing cancer risks. For example, an ostensibly healthy individual may have any number of different cancer types, and a tumor may originate from one of several primary sites. Pan-cancer evaluation requires a model trained on multiple cancer types, and controls, simultaneously, so that a physician can be directed to the correct area of the body for further testing. Methods: We introduce novel neural network models to address multi-cancer classification problems across several data types commonly applied in cancer prediction, including circulating miRNA expression, protein, and mRNA. In particular, we present an analysis of neural network depth and complexity, and investigate how this relates to classification performance. Comparisons of our models with state-of-the-art neural networks from the literature are also presented. Results: Our analysis evidences that shallow, feed-forward neural net architectures offer greater performance when compared to more complex deep feed-forward, Convolutional Neural Network (CNN), and Graph CNN (GCNN) architectures considered in the literature. Conclusion: The results show that multiple cancers and controls can be classified accurately using the proposed models, across a range of expression technologies in cancer prediction. Impact: This study addresses the important problem of pan-cancer classification, which is often overlooked in the literature. The promising results highlight the urgency for further research.
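
To make the multi-class framing concrete, the forward pass of a shallow feed-forward classifier with a single hidden layer and a softmax output over several cancer classes plus a control class can be sketched in Python. This is not the authors' architecture or training procedure. The layer sizes, the hand-set weights, and the class labels are all assumptions chosen so the example runs without a training step.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases):
    """Apply one fully connected layer: one weight row and bias per output."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def predict(x, params, classes):
    """Run input -> ReLU hidden layer -> softmax over all classes."""
    hidden = [max(0.0, h) for h in dense(x, params["W1"], params["b1"])]
    probs = softmax(dense(hidden, params["W2"], params["b2"]))
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], probs


if __name__ == "__main__":
    # Hypothetical labels: one control class and two cancer classes.
    classes = ["control", "cancer_A", "cancer_B"]
    # Hand-set illustrative weights (a trained model would learn these).
    params = {
        "W1": [[1.0, 0.0], [0.0, 1.0]],
        "b1": [0.0, 0.0],
        "W2": [[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]],
        "b2": [0.5, 0.0, 0.0],
    }
    print(predict([2.0, 0.1], params, classes))
```

Because controls and every cancer type compete in the same softmax, a single prediction both separates cases from controls and points to a cancer type, which is the property the abstract argues binary classifiers lack.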
