Integration of public DNA methylation and expression networks via eQTMs improves prediction of functional gene-gene associations

2021 ◽  
Author(s):  
Shuang Li ◽  
Cancan Qi ◽  
Patrick Deelen ◽  
Floranne Boulogne ◽  
Niek de Klein ◽  
...  

Gene co-expression networks can be used to infer functional relationships between genes, but they do not work well for all genes. We investigated whether DNA methylation can provide complementary information for such genes. We first carried out an eQTM meta-analysis of 3,574 gene expression and methylation samples from blood, brain and nasal epithelial brushed cells to identify links between methylated CpG sites and genes. This revealed 6,067 significant eQTM genes, and we observed that histone modification information is predictive of both eQTM direction and presence, enabling us to link many CpG sites to genes. We then generated a co-methylation network - MethylationNetwork - using 27,720 publicly available methylation profiles and integrated it with a public RNA-seq co-expression dataset of 31,499 samples. Here, we observed that MethylationNetwork can identify experimentally validated interacting pairs of genes that could not be identified in the RNA-seq datasets. We then developed a novel integration pipeline based on CCA and used the integrated methylation and gene networks to predict gene pairs reported in the STRING database. The integrated network showed significantly improved prediction performance compared to using a DNA co-methylation or a gene co-expression network alone. This is the first study to integrate data from two -omics layers from unmatched public samples across different tissues and diseases, and our results highlight the issues and potential of integrating public datasets from multiple molecular phenotypes. The eQTMs we identified can be used as an annotation resource for epigenome-wide association, and we believe that our integration pipeline can be used as a framework for future -omics integration analyses of public datasets. 
We provide supporting materials and results, including the harmonized DNA methylation data from multiple tissues and diseases in https://data.harmjanwestra.nl/comethylation/, the discovered and predicted eQTMs, the corresponding CCA components and the trained prediction models in a Zenodo repository (https://zenodo.org/record/4666994). We provide notebooks to facilitate use of the proposed pipeline in a GitHub repository (https://github.com/molgenis/methylationnetwork).
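As a rough illustration of what a co-methylation network encodes, the sketch below computes pairwise Pearson correlations between CpG methylation profiles across samples and keeps strongly correlated pairs as edges. The choice of Pearson correlation, the 0.8 cutoff, the CpG labels and the beta values are all assumptions made for illustration. The abstract does not state how MethylationNetwork was constructed, and the authors' actual pipeline is in the linked GitHub repository.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def co_methylation_edges(profiles, min_abs_r=0.8):
    """Return (cpg_a, cpg_b, r) for every pair with |r| >= min_abs_r.

    `profiles` maps a CpG label to its beta values across the same samples.
    The cutoff is a hypothetical choice, not taken from the paper.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= min_abs_r:
            edges.append((a, b, r))
    return edges


# Toy beta values (invented for illustration) for three placeholder CpGs
# measured in the same five samples.
toy_profiles = {
    "cpg_A": [0.10, 0.20, 0.30, 0.40, 0.50],
    "cpg_B": [0.15, 0.25, 0.35, 0.45, 0.55],  # cpg_A shifted by a constant
    "cpg_C": [0.50, 0.10, 0.40, 0.20, 0.30],  # only weakly related
}

edges = co_methylation_edges(toy_profiles)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.2f}")
```

In this toy input only the cpg_A/cpg_B pair passes the cutoff. A real analysis over hundreds of thousands of CpG sites would need a vectorized implementation rather than this pairwise loop.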

2019 ◽  
Vol 21 (Supplement_3) ◽  
pp. iii65-iii66
Author(s):  
M Q S Mosella ◽  
T S Sabedot ◽  
T M Malta ◽  
J Rock ◽  
M Felicella ◽  
...  

Abstract BACKGROUND Despite being histologically benign, pituitary tumors (PT) may invade important adjacent neurovascular structures, which can cause significant comorbidities, preventing complete surgical resection and contributing to resistance to medical treatment. DNA methylation clearly stratifies PT by functional status, i.e. nonfunctioning PTs (NFPTs) versus functioning PTs (FPTs). However, the association of methylation aberrations with invasive behavior is less clear. MATERIAL AND METHODS To evaluate whether DNA methylation alterations in regulatory regions other than promoter and coding regions are associated with invasive behavior, we performed a meta-analysis of the genome-wide methylome of three publicly available PT cohorts plus our own (Illumina HumanMethylation platforms, 450K/EPIC). Pituitary specimens comprised 43 invasive pituitary tumors (InvPT) and 37 noninvasive pituitary tumors (NInvPT); 12 FPTs and 68 NFPTs; and 20 non-tumor pituitaries. RNA-seq data were available for one cohort (n=23; 12 InvPT, 11 NInvPT) and were integrated with DNA methylation. Invasiveness criteria were Knosp grade >= 2 and/or sphenoid or dural invasion. RESULTS A Wilcoxon rank-sum test (Δβ=0.15; p-value <0.001) identified 58 differentially methylated CpG sites in InvPT that were mainly hypomethylated (95%) relative to NInvPT. The NInvPT methylation profile was similar to that of non-tumor specimens, despite its heterogeneity. Thirty-four percent (n=20) of the differentially methylated CpG sites were located within predicted enhancer regions distributed in intronic (40%), intergenic (40%) and promoter (20%) regions. Predicted enhancer-target genes were enriched for actin filament cell movement, response to starvation, growth factor stimulus and protein autophosphorylation pathways.
Among them, ZNF625 and INO80E were the most negatively correlated between methylation and expression data (-0.50 and -0.48, respectively), and DOC2A was identified as a potentially differentially expressed gene under enhancer control (log2FC > 0.2, p-value <0.05). CONCLUSION Our results suggest that methylation alterations in predicted regulatory regions, such as enhancers, annotated in non-promoter regions (introns and intergenic) may contribute to the invasive behavior of PT.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Maaike de Vries ◽  
Ivana Nedeljkovic ◽  
Diana A. van der Plaat ◽  
Alexandra Zhernakova ◽  
...  

Abstract Background Active smoking is the main risk factor for COPD. Here, epigenetic mechanisms may play a role, since cigarette smoking is associated with differential DNA methylation in whole blood. So far, it is unclear whether epigenetics also plays a role in subjects with COPD who never smoked. Therefore, we aimed to identify differential DNA methylation associated with lung function in never smokers. Methods We determined epigenome-wide DNA methylation levels of 396,243 CpG sites (Illumina 450K) in blood of never smokers in four independent cohorts: LifeLines COPD&C (N = 903), LifeLines DEEP (N = 166), Rotterdam Study (RS)-III (N = 150) and RS-BIOS (N = 206). We meta-analyzed the cohort-specific methylation results to identify CpG sites differentially methylated with respect to FEV1/FVC. Expression Quantitative Trait Methylation (eQTM) analysis was performed in the Biobank-based Integrative Omics Studies (BIOS). Results A total of 36 CpG sites were associated with FEV1/FVC in never smokers at p-value < 0.0001, but the meta-analysis did not reveal any epigenome-wide significant CpG sites. Of interest, 35 of these 36 CpG sites have not been associated with lung function before in studies including subjects irrespective of smoking history. Among the top hits were cg10012512, cg02885771, annotated to the gene LTV1 Ribosome Biogenesis factor (LTV1), and cg25105536, annotated to Kelch Like Family Member 32 (KLHL32). Moreover, a total of 11 eQTMs were identified. Conclusions With the identification of 35 CpG sites that are unique to never smokers, our study shows that DNA methylation is also associated with FEV1/FVC in subjects who never smoked and is therefore not merely related to smoking.
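The abstract above does not say which meta-analysis model was used to combine the four cohorts. One common choice is a fixed-effect, inverse-variance weighted model, sketched below under that assumption. The function name, the effect sizes and the standard errors are hypothetical and are not results from the study.

```python
from math import erfc, sqrt


def fixed_effect_meta(estimates, std_errors):
    """Combine per-cohort effects with inverse-variance weights.

    Returns (pooled_estimate, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    z = pooled / pooled_se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p-value under a normal approximation
    return pooled, pooled_se, p


# Invented per-cohort effect sizes and standard errors for a single CpG site,
# one entry per cohort. These are NOT values from the paper.
toy_betas = [0.020, 0.035, 0.010, 0.025]
toy_ses = [0.010, 0.020, 0.025, 0.020]

pooled, pooled_se, p = fixed_effect_meta(toy_betas, toy_ses)
print(f"pooled = {pooled:.4f}, se = {pooled_se:.4f}, p = {p:.3g}")
```

Cohorts with smaller standard errors receive larger weights, so the pooled standard error is always smaller than that of the most precise single cohort. A random-effects model would be the usual alternative when between-cohort heterogeneity is expected.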


2021 ◽  
Author(s):  
Daniel Osorio ◽  
Marieke Lydia Kuijjer ◽  
James J. Cai

Motivation: Characterizing cells with rare molecular phenotypes is one of the promises of high-throughput single-cell RNA sequencing (scRNA-seq) techniques. However, collecting enough cells with the desired molecular phenotype in a single experiment is challenging, requiring several sample preprocessing steps to filter and collect the desired cells experimentally before sequencing. Data integration of multiple public single-cell experiments is a solution to this problem, allowing the collection of enough cells exhibiting the desired molecular signatures. By increasing the sample size of the desired cell type, this approach enables a robust cell type transcriptome characterization. Results: Here, we introduce rPanglaoDB, an R package to download and merge the uniformly processed and annotated scRNA-seq data provided by the PanglaoDB database. To show the potential of rPanglaoDB for collecting rare cell types by integrating multiple public datasets, we present a biological application collecting and characterizing a set of 157 fibrocytes. Fibrocytes are a rare monocyte-derived cell type that exhibits both the inflammatory features of macrophages and the tissue remodeling properties of fibroblasts. This constitutes the first report of an unbiased transcriptome profile of fibrocytes. We compared the transcriptomic profile of the fibrocytes against the fibroblasts collected from the same tissue samples and confirmed their association with healing processes in tissue damage and infection through the activation of the prostaglandin biosynthesis and regulation pathway. Availability and Implementation: rPanglaoDB is implemented as an R package available through the CRAN repositories https://CRAN.R-project.org/package=rPanglaoDB.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Nicholas D. Johnson ◽  
Xiumei Wu ◽  
Christopher D. Still ◽  
Xin Chu ◽  
Anthony T. Petrick ◽  
...  

Abstract Background Non-alcoholic fatty liver disease (NAFLD) is characterized by changes in cell composition that occur throughout disease pathogenesis, which includes the development of fibrosis in a subset of patients. DNA methylation (DNAm) is a plausible mechanism underlying these shifts, considering that DNAm profiles differ across tissues and cell types, and DNAm may play a role in cell-type differentiation. Previous work investigating the relationship between DNAm and fibrosis in NAFLD has been limited by sample size and the number of CpG sites interrogated. Results Here, we performed an epigenome-wide analysis using Infinium MethylationEPIC array data from 325 individuals with NAFLD, including 119 with severe fibrosis and 206 with no histological evidence of fibrosis. After adjustment for latent confounders, we identified 7 CpG sites whose DNAm was associated with fibrosis (p < 5.96 × 10^-8). Analysis of RNA-seq data collected from a subset of individuals (N = 56) revealed that expression of 288 genes was associated with DNAm at one or more of the 7 fibrosis-related CpGs. DNAm-based estimates of cell-type proportions showed that estimated proportions of natural killer cells increased, while epithelial cell proportions decreased, with disease stage. Finally, we used an elastic net regression model to assess DNAm as a biomarker of fibrotic stage and found that our model predicted fibrosis with a sensitivity of 0.93 and provided information beyond a model based solely on cell-type proportions. Conclusion These findings are consistent with DNAm as a mechanism underpinning or marking fibrosis-related shifts in cell composition and demonstrate the potential of DNAm as a possible biomarker of NAFLD fibrosis.


2019 ◽  
Author(s):  
Cong Liang ◽  
Xiaoqing Yu ◽  
Bo Li ◽  
Y. Ann Chen ◽  
Jose R. Conejo-Garcia ◽  
...  

Abstract Understanding the structure of the tumor microenvironment (TME) is likely to have a profound and immediate impact on therapeutic interventions as well as on the development of signatures for diagnostic and prognostic evaluations. DNA methylation arrays are among the most reproducible molecular assays across replicates and studies, but their value for profiling tumor-infiltrating immune lymphocytes (TILs) has not been intensively investigated. Here we report a model-based evaluation of tumor TIL levels using DNA methylation profiles. By employing a hybrid method of stability selection and elastic net, we show that methylation array data in ten TCGA cancer types provide a strikingly accurate prediction of immune cell abundance, in particular the levels of T cells, B cells and cytotoxic cells in skin cutaneous melanoma (SKCM). The immune-informative CpG sites showed significant prognostic value, representing important candidates for further functional validation. Further, we present regression models, each using only ten CpG sites, to estimate the levels of infiltrated immune cell types in melanoma. To validate these models, we performed matched methylation EPIC array and RNA-seq profiling on 30 new melanoma samples. We observed high concordance between methylation-predicted and gene expression-predicted tumor immune infiltration levels in our new dataset. Our study demonstrates that DNA methylation data are a valuable resource for reliably evaluating tumor immune responses. The selected methylation panels provide candidate targets for future clinical research. Our prediction models are easy to implement and will provide a reference for future clinical practice.


2019 ◽  
Author(s):  
Alicia K Smith ◽  
Andrew Ratanatharathorn ◽  
Adam X Maihofer ◽  
Robert K Naviaux ◽  
Allison E Aiello ◽  
...  

Abstract Differences in susceptibility to posttraumatic stress disorder (PTSD) may be related to epigenetic differences between PTSD cases and trauma-exposed controls. Such epigenetic differences may provide insight into the biological processes underlying the disorder. Here we describe the results of the largest DNA methylation meta-analysis of PTSD to date, with data from the Psychiatric Genomics Consortium (PGC) PTSD Epigenetics Workgroup. Ten cohorts, military and civilian, contributed blood-derived DNA methylation data (HumanMethylation450 BeadChip) from 1,896 PTSD cases (42%) and trauma-exposed controls (58%). Utilizing a common QC and analysis strategy, we identified ten CpG sites associated with PTSD (4.72E-11<p<9.61E-07) after adjustment for multiple comparisons (FDR<.05). Several CpGs were located in genes previously implicated in PTSD and other psychiatric disorders. The top four CpG sites fell within the aryl-hydrocarbon receptor repressor (AHRR) locus and were associated with lower DNA methylation in PTSD cases relative to controls. Interestingly, this association appeared to be uncorrelated with smoking status and was most pronounced in non-smokers with PTSD. Additional evaluation of metabolomics data supported our findings and revealed that AHRR methylation was associated with kynurenine levels, which were lower among subjects with PTSD relative to controls. Overall, this study supports epigenetic differences in those with PTSD and suggests a role for decreased kynurenine as a contributor to immune dysregulation in PTSD.


2018 ◽  
Author(s):  
Shan V. Andrews ◽  
Brooke Sheppard ◽  
Gayle C. Windham ◽  
Laura A. Schieve ◽  
Diana E. Schendel ◽  
...  

Abstract Background Several reports have suggested a role for epigenetic mechanisms in autism spectrum disorder (ASD) etiology. Epigenome-wide association studies (EWAS) in ASD may shed light on particular biological mechanisms. However, studies of ASD cases versus controls have been limited by post-mortem timing and very small sample sizes. Reports from in-life sampling of blood or saliva have also been very limited in sample size and/or genomic coverage. We present the largest case-control EWAS for ASD to date, combining data from population-based case-control and case-sibling pair studies. Methods DNA from 968 blood samples from children in the Study to Explore Early Development (SEED 1) was used to generate epigenome-wide array DNA methylation (DNAm) data at 485,512 CpG sites for 453 cases and 515 controls, using the Illumina 450K Beadchip. The Simons Simplex Collection (SSC) provided 450K array DNAm data on an additional 343 cases and their unaffected siblings. We performed EWAS meta-analysis across results from the two data sets, with adjustment for sex and surrogate variables that reflect major sources of biological variation and technical confounding such as cell type, batch, and ancestry. We compared top EWAS results to those from a previous brain-based analysis. We also tested for enrichment of ASD EWAS CpGs for being targets of meQTL associations using available SNP genotype data in the SEED sample. Findings In this meta-analysis of blood-based DNA from 796 cases and 858 controls, no single CpG met a Bonferroni discovery threshold of p < 1.12×10^-7. Seven CpGs showed differences at p < 1×10^-5 and 48 at p < 1×10^-4. Of the top 7, 5 showed brain-based ASD associations as well, often with larger effect sizes, and the top 48 overall showed modest concordance (r = 0.31) in direction of effect with cerebellum samples.
Finally, we observed suggestive evidence for enrichment of CpG sites controlled by SNPs (meQTL targets) among the EWAS CpG hits, which was consistent across EWAS and meQTL discovery p-value thresholds. Conclusions We report the largest case-control EWAS of ASD to date. No single CpG site showed a large enough DNAm difference between cases and controls to achieve epigenome-wide significance in this sample size. However, our results suggest the potential to observe disease associations from blood-based samples. Among the 7 sites achieving suggestive statistical significance, we observed consistent, and stronger, effects at the same sites among brain samples. Discovery-oriented EWAS for ASD using blood samples will likely need even larger samples and unified genetic data to further understand DNAm differences in ASD.
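The sample totals and the Bonferroni threshold quoted in the abstract above can be checked with simple arithmetic. The sketch below assumes a family-wise alpha of 0.05, which the abstract does not state, and the helper names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


# Case and control counts as reported in the abstract.
seed_cases, seed_controls = 453, 515
ssc_cases = ssc_siblings = 343  # each SSC case is paired with one unaffected sibling

total_cases = seed_cases + ssc_cases            # matches the reported 796
total_controls = seed_controls + ssc_siblings   # matches the reported 858

# alpha = 0.05 is an assumption; only the threshold itself is reported.
n_implied = implied_tests(0.05, 1.12e-7)
print(total_cases, total_controls, round(n_implied))
```

Under this assumption the threshold corresponds to roughly 446,000 tests, fewer than the 485,512 sites on the array. That gap would be consistent with probes being removed during quality control, although the abstract does not say so.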


2020 ◽  
Author(s):  
Doretta Caramaschi ◽  
James Jungius ◽  
Christian M. Page ◽  
Boris Novakovic ◽  
Richard Saffery ◽  
...  

Abstract Study question: Is DNA methylation at birth associated with having been conceived by assisted reproductive technologies (ART)? Summary answer: This study does not provide strong evidence of an association of conception by ART with variation in infant blood cell DNA methylation. What is known already: Assisted reproductive technologies (ART) are procedures used to help infertile/subfertile couples conceive. Due to its importance in gene regulation during early developmental programming, DNA methylation and its perturbations associated with ART could reveal new insights into the biological effects of ART and potential adverse offspring outcomes. Study design: We investigated the association of DNA methylation and ART using a case-control study design (N=205 ART cases and N=2439 non-ART controls in discovery cohorts; N=149 ART cases and N=58 non-ART controls in replication cohort). Participants/materials, settings, method: We assessed the association between ART and DNA methylation at birth in cord blood (205 ART conceptions and 2439 naturally conceived controls) at >450000 CpG sites across the genome in two sub-samples of the UK Avon Longitudinal Study of Parents and Children (ALSPAC) and two sub-samples of the Norwegian Mother, Father and Child Cohort Study (MoBa) by meta-analysis. We explored replication of findings in the Australian Clinical review of the Health of adults conceived following Assisted Reproductive Technologies (CHART) study (N=149 ART conceptions and N=58 controls). Main results and the role of chance: The ALSPAC and MoBa meta-analysis revealed evidence of association between conception by ART and DNA methylation (false-discovery-rate-corrected p-value < 0.05) at 5 CpG sites which are annotated to 2 genes. Methylation at 3 of these sites has been previously linked to cancer, aging, HIV infection and neurological diseases. None of these associations replicated in the CHART cohort.
There was evidence of a functional role of ART-induced hypermethylation at CpG sites located within regulatory regions, as shown by putative transcription factor binding and chromatin remodelling. Limitations, reasons for caution: While insufficient power is likely, heterogeneity in types of ART and between populations may also contribute to the lack of replication. Larger studies might identify replicable variation in DNA methylation at birth due to ART. Wider implications of the findings: ART-conceived newborns present with divergent DNA methylation in cord blood white cells. If these associations are true and causal, they might have long-term consequences for offspring health.


Hypertension ◽  
2020 ◽  
Vol 76 (1) ◽  
pp. 195-205 ◽  
Author(s):  
Yisong Huang ◽  
Miina Ollikainen ◽  
Maheswary Muniandy ◽  
Tao Zhang ◽  
Jenny van Dongen ◽  
...  

We conducted an epigenome-wide association study meta-analysis on blood pressure (BP) in 4820 individuals of European and African ancestry aged 14 to 69. Genome-wide DNA methylation data from peripheral leukocytes were obtained using the Infinium Human Methylation 450k BeadChip. The epigenome-wide association study meta-analysis identified 39 BP-related CpG sites with P < 1×10^-5. In silico replication in the CHARGE consortium of 17 010 individuals validated 16 of these CpG sites. Out of the 16 CpG sites, 13 showed novel association with BP. Conversely, out of the 126 CpG sites identified as being associated (P < 1×10^-7) with BP in the CHARGE consortium, 21 were replicated in the current study. Methylation levels of all the 34 CpG sites that were cross-validated by the current study and the CHARGE consortium were heritable and 6 showed association with gene expression. Furthermore, 9 CpG sites also showed association with BP with P < 0.05 and consistent direction of the effect in the meta-analysis of the Finnish Twin Cohort (199 twin pairs and 4 singletons; 61% monozygous) and the Netherlands Twin Register (266 twin pairs and 62 singletons; 84% monozygous). Bivariate quantitative genetic modeling of the twin data showed that a majority of the phenotypic correlations between methylation levels of these CpG sites and BP could be explained by shared unique environmental rather than genetic factors, with 100% of the correlations of systolic BP with cg19693031 (TXNIP) and cg00716257 (JDP2) determined by environmental effects acting on both systolic BP and methylation levels.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Alicia K. Smith ◽  
Andrew Ratanatharathorn ◽  
Adam X. Maihofer ◽  
Robert K. Naviaux ◽  
...  

Abstract Epigenetic differences may help to distinguish between PTSD cases and trauma-exposed controls. Here, we describe the results of the largest DNA methylation meta-analysis of PTSD to date. Ten cohorts, military and civilian, contribute blood-derived DNA methylation data from 1,896 PTSD cases and trauma-exposed controls. Four CpG sites within the aryl-hydrocarbon receptor repressor (AHRR) associate with PTSD after adjustment for multiple comparisons, with lower DNA methylation in PTSD cases relative to controls. Although AHRR methylation is known to associate with smoking, the AHRR association with PTSD is most pronounced in non-smokers, suggesting the result is independent of smoking status. Evaluation of metabolomics data reveals that AHRR methylation is associated with kynurenine levels, which are lower among subjects with PTSD. This study supports epigenetic differences in those with PTSD and suggests a role for decreased kynurenine as a contributor to immune dysregulation in PTSD.

