Updates to data versions and analytic methods influence the reproducibility of results from epigenome-wide association studies

2021 ◽  
Author(s):  
Alexandre A. Lussier ◽  
Yiwen Zhu ◽  
Brooke J. Smith ◽  
Andrew J. Simpkin ◽  
Andrew D.A.C. Smith ◽  
...  

ABSTRACT Introduction: Biomedical research has grown increasingly cooperative, with several large consortia compiling and sharing epigenomic data. Since data are typically preprocessed by consortia prior to distribution, the implementation of new pipelines can lead to different versions of the same dataset. Analytic frameworks also constantly evolve to incorporate cutting-edge methods and shifting best practices. However, it remains unknown how differences in data and analytic versions alter the results of epigenome-wide analyses, which has broad implications for the replicability of epigenetic associations. Thus, we assessed the impact of these changes using a subsample of the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Methods: We analyzed two versions of DNA methylation data, processed using separate preprocessing and analytic pipelines, to examine associations of childhood adversity and prenatal smoking exposure with DNA methylation at age 7. We performed two sets of analyses: (1) epigenome-wide association studies (EWAS); (2) Structured Life Course Modeling Approach (SLCMA), a two-stage method that models time-dependent effects. We also compared results from the SLCMA using more recent methodological recommendations. Results: Differences between ALSPAC data versions impacted both EWAS and SLCMA analyses, yielding different sets of associations at conventional p-value thresholds. However, the magnitude and direction of associations were generally consistent between data versions, regardless of significance thresholds. Updating the SLCMA analytic version similarly altered top associations, but time-dependent effects remained concordant. Conclusions: Changes to data and analytic versions influenced the results of epigenome-wide studies, particularly when using p-value thresholds as reference points for successful replication and stability.
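The contrast drawn in the Results, between overlap at a p-value threshold and agreement in effect direction, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Assoc` record, the `compare_versions` helper, the CpG labels, the effect sizes, and the threshold of 1e-7 are all hypothetical toy values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assoc:
    """One CpG-level association from one data version (hypothetical record)."""
    cpg: str
    beta: float  # effect estimate
    p: float     # p-value


def compare_versions(v1, v2, alpha):
    """Compare two result sets on the CpGs they share.

    Returns the Jaccard overlap of the 'significant' sets at threshold alpha
    and the fraction of shared CpGs whose effect estimates have the same sign.
    """
    by_cpg = {a.cpg: a for a in v2}
    shared = [(a, by_cpg[a.cpg]) for a in v1 if a.cpg in by_cpg]
    if not shared:
        raise ValueError("no CpGs in common between the two versions")
    sig1 = {a.cpg for a, _ in shared if a.p < alpha}
    sig2 = {b.cpg for _, b in shared if b.p < alpha}
    union = sig1 | sig2
    overlap = len(sig1 & sig2) / len(union) if union else float("nan")
    same_sign = sum(1 for a, b in shared if a.beta * b.beta > 0)
    return {
        "n_shared": len(shared),
        "sig_overlap": overlap,
        "direction_concordance": same_sign / len(shared),
    }


# Toy values only; they do not come from ALSPAC.
version_1 = [
    Assoc("cpg_1", 0.020, 1e-8),
    Assoc("cpg_2", 0.015, 2e-7),
    Assoc("cpg_3", -0.010, 0.03),
    Assoc("cpg_4", 0.005, 0.4),
]
version_2 = [
    Assoc("cpg_1", 0.018, 3e-6),
    Assoc("cpg_2", 0.016, 5e-8),
    Assoc("cpg_3", -0.012, 0.01),
    Assoc("cpg_4", -0.001, 0.9),
]

result = compare_versions(version_1, version_2, alpha=1e-7)
print(result)
```

In this toy case the two versions share no threshold-passing CpG, yet three of the four effect estimates agree in sign, which is the kind of pattern the abstract describes.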

2021 ◽  
Author(s):  
Thomas Battram ◽  
Paul Yousefi ◽  
Gemma Crawford ◽  
Claire Prince ◽  
Mahsa Sheikhali Babei ◽  
...  

Epigenome-wide association studies (EWAS) seek to quantify associations between traits/exposures and DNA methylation measured at thousands or millions of CpG sites across the genome. In recent years, the increase in availability of DNA methylation measures in population-based cohorts and case-control studies has resulted in a dramatic expansion of the number of EWAS being performed and published. To make this rich source of results more accessible, we have manually curated a database of CpG-trait associations (with p < 1×10⁻⁴) from published EWAS, each assaying over 100,000 CpGs in at least 100 individuals. As of 2021-01-29, The EWAS Catalog contained 1,045,303 associations from over 1000 EWAS. This includes 652,530 associations from 264 peer-reviewed publications. In addition, it contains summary statistics for 392,773 associations from 428 EWAS, performed in data from the Avon Longitudinal Study of Parents and Children (ALSPAC) and the Gene Expression Omnibus (GEO). The database is accompanied by a web-based tool and R package, giving researchers the opportunity to quickly and easily query EWAS associations and gain insight into the molecular underpinnings of disease as well as the impact of traits and exposures on the DNA methylome. The EWAS Catalog is available at: http://www.ewascatalog.org.


2021 ◽  
Author(s):  
Alexandre A Lussier ◽  
Yiwen Zhu ◽  
Brooke J Smith ◽  
Janine Cerutti ◽  
Andrew Simpkin ◽  
...  

Background: Childhood adversity influences long-term health, particularly if experienced during sensitive periods in development when physiological systems are more responsive to environmental influences. Although the underlying mechanisms remain unclear, prior studies suggest that DNA methylation (DNAm) may capture these time-dependent effects of childhood adversity. However, it remains unknown whether DNAm alterations persist into adolescence and how the timing of adversity might influence DNAm trajectories across development. Methods: We examined the relationship between time-dependent adversity and genome-wide DNAm measured at three waves from birth to adolescence using prospective data from the Avon Longitudinal Study of Parents and Children. We first assessed the relationship between the timing of exposure to seven types of adversity (measured 5-8 times between ages 0-11) and DNAm at age 15 using a structured life course modeling approach. We also characterized the persistence into adolescence of associations identified from age 7 DNAm, as well as the influence of adversity on DNAm trajectories from ages 0-15. Results: Adversity was associated with differences in age 15 DNAm at 24 loci (FDR<0.05). Most loci (19 of 24) were associated with adversity (i.e., physical/sexual abuse, one-adult households, caregiver abuse) that occurred between ages 3-5. Although no DNAm differences present at age 7 persisted into adolescence, we identified seven unique types of DNAm trajectories across development, which highlighted diverse effects of childhood adversity on DNAm. Conclusions: Our results suggest that childhood adversity, particularly between ages 3-5, can influence the trajectories of DNAm across development, exerting both immediate and latent effects on the epigenome.


2017 ◽  
Author(s):  
R.C. Richmond ◽  
M. Suderman ◽  
R. Langdon ◽  
C.L. Relton ◽  
G. Davey Smith

Abstract Prenatal cigarette smoke is an environmental stressor that has a profound effect on DNA methylation in the exposed offspring. We have previously shown that some of these effects persist throughout childhood and into adolescence. Of interest is whether these signals persist into adulthood. We conducted an analysis to investigate associations between reported maternal smoking in pregnancy and DNA methylation in peripheral blood of women in the Avon Longitudinal Study of Parents and Children (ALSPAC) (n=754; mean age 30 years). We observed associations at 15 CpG sites in 11 gene regions: MYO1G, FRMD4A, CYP1A1, CNTNAP2, ARL4C, AHRR, TIFAB, MDM4, AX748264, DRD1, FTO (FDR < 5%). All but two of these CpG sites have previously been identified in relation to prenatal smoke exposure in the offspring at birth and the majority showed persistent hypermethylation among the offspring of smokers. We confirmed that most of these associations were not driven by own smoking and that they were still present 18 years later (N = 656; mean age 48 years). In addition, we replicated findings of a persistent methylation signal related to prenatal smoke exposure in peripheral blood among men in the ALSPAC cohort (N = 230; mean age 53 years). For both participant groups, there was a strong signal of association above that expected by chance at CpG sites previously associated with prenatal smoke exposure in newborns (Wilcoxon rank sum p-value < 2.2 × 10⁻⁴). Furthermore, we found that a prenatal smoking score, derived by combining methylation values at these CpG sites, could predict whether the mothers of the ALSPAC women smoked during pregnancy with an AUC of 0.69 (95% CI 0.67, 0.73).
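The abstract does not say how the prenatal smoking score was combined or evaluated beyond reporting an AUC. As a hedged sketch of one common approach, the code below assumes a weighted sum of methylation values and computes the AUC as the probability that an exposed sample scores higher than an unexposed one. The site names, weights, methylation values, and labels are hypothetical.

```python
def methylation_score(betas, weights):
    """Weighted sum of methylation values over the sites named in `weights`."""
    return sum(weights[site] * betas[site] for site in weights)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (exposed, unexposed) pairs in which the exposed
    sample has the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical weights: positive for a site assumed hypermethylated with
# exposure, negative for a site assumed hypomethylated.
weights = {"site_a": 1.0, "site_b": -1.0}

# Toy samples: (methylation values, 1 = mother smoked in pregnancy).
samples = [
    ({"site_a": 0.60, "site_b": 0.30}, 1),
    ({"site_a": 0.55, "site_b": 0.40}, 1),
    ({"site_a": 0.50, "site_b": 0.45}, 0),
    ({"site_a": 0.58, "site_b": 0.38}, 0),
]

scores = [methylation_score(betas, weights) for betas, _ in samples]
labels = [label for _, label in samples]
auc_value = auc(scores, labels)
print(auc_value)
```

The pairwise form is quadratic in sample size but makes the interpretation of the AUC explicit; a rank-based formula would give the same value.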


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Darina Czamara ◽  
Elleke Tissink ◽  
Johanna Tuhkanen ◽  
Jade Martins ◽  
Yvonne Awaloff ◽  
...  

Abstract Lasting effects of adversity, such as exposure to childhood adversity (CA) on disease risk, may be embedded via epigenetic mechanisms but findings from human studies investigating the main effects of such exposure on epigenetic measures, including DNA methylation (DNAm), are inconsistent. Studies in perinatal tissues indicate that variability of DNAm at birth is best explained by the joint effects of genotype and prenatal environment. Here, we extend these analyses to postnatal stressors. We investigated the contribution of CA, cis genotype (G), and their additive (G + CA) and interactive (G × CA) effects to DNAm variability in blood or saliva from five independent cohorts with a total sample size of 1074 ranging in age from childhood to late adulthood. Of these, 541 were exposed to CA, which was assessed retrospectively using self-reports or verified through social services and registries. For the majority of sites (over 50%) in the adult cohorts, variability in DNAm was best explained by G + CA or G × CA but almost never by CA alone. Across ages and tissues, 1672 DNAm sites showed consistency of the best model in all five cohorts, with G × CA interactions explaining most variance. The consistent G × CA sites mapped to genes enriched in brain-specific transcripts and Gene Ontology terms related to development and synaptic function. Interaction of CA with genotypes showed the strongest contribution to DNAm variability, with stable effects across cohorts in functionally relevant genes. This underscores the importance of including genotype in studies investigating the impact of environmental factors on epigenetic marks.


2021 ◽  
Vol 17 (3) ◽  
pp. e1008819
Author(s):  
Héctor Climente-González ◽  
Christine Lonjou ◽  
Fabienne Lesueur ◽  
Dominique Stoppa-Lyonnet ◽  
Nadine Andrieu ◽  
...  

Genome-wide association studies (GWAS) explore the genetic causes of complex diseases. However, classical approaches ignore the biological context of the genetic variants and genes under study. To address this shortcoming, one can use biological networks, which model functional relationships, to search for functionally related susceptibility loci. Many such network methods exist, each arising from different mathematical frameworks, pre-processing steps, and assumptions about the network properties of the susceptibility mechanism. Unsurprisingly, this results in disparate solutions. To explore how to exploit these heterogeneous approaches, we selected six network methods and applied them to GENESIS, a nationwide French study on familial breast cancer. First, we verified that network methods recovered more interpretable results than a standard GWAS. We addressed the heterogeneity of their solutions by studying their overlap, computing what we called the consensus. The key gene in this consensus solution was COPS5, a gene related to multiple cancer hallmarks. Another issue we observed was that network methods were unstable, selecting very different genes on different subsamples of GENESIS. Therefore, we proposed a stable consensus solution formed by the 68 genes most consistently selected across multiple subsamples. This solution was also enriched in genes known to be associated with breast cancer susceptibility (BLM, CASP8, CASP10, DNAJC1, FGFR2, MRPS30, and SLC4A7, P-value = 3 × 10⁻⁴). The most connected gene was CUL3, a regulator of several genes linked to cancer progression. Lastly, we evaluated the biases of each method and the impact of their parameters on the outcome. In general, network methods preferred highly connected genes, even after random rewirings that stripped the connections of any biological meaning. In conclusion, we present the advantages of network-guided GWAS, characterize their shortcomings, and provide strategies to address them.
To compute the consensus networks, implementations of all six methods are available at https://github.com/hclimente/gwas-tools.
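The idea of a stable consensus, keeping the genes that are selected most consistently across subsamples, can be sketched as follows. This is an illustration under stated assumptions, not the code in the repository above: the `stable_consensus` helper, the 0.75 selection-frequency cutoff, and the placeholder gene names are hypothetical.

```python
from collections import Counter


def stable_consensus(selections, min_fraction):
    """Return genes selected in at least `min_fraction` of the subsamples.

    `selections` holds one collection of selected genes per subsample.
    Each gene counts at most once per subsample.
    """
    if not selections:
        raise ValueError("at least one subsample is required")
    counts = Counter(gene for sel in selections for gene in set(sel))
    n = len(selections)
    return sorted(g for g, c in counts.items() if c / n >= min_fraction)


# Toy selections from four hypothetical subsamples.
selections = [
    {"gene_a", "gene_b", "gene_c"},
    {"gene_a", "gene_b"},
    {"gene_a", "gene_d"},
    {"gene_a", "gene_b", "gene_e"},
]

consensus = stable_consensus(selections, min_fraction=0.75)
print(consensus)
```

A frequency cutoff is only one way to define "most consistently selected"; taking the top-k genes by selection count would be an equally plausible reading.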


2021 ◽  
Author(s):  
Sangeetha Muthamilselvan ◽  
Abirami Raghavendran ◽  
Ashok Palaniappan

Abstract Background: Aberrant DNA methylation acts epigenetically to skew the gene transcription rate up or down, with causative roles in the etiology of cancers. However, research on the role of DNA methylation in driving the progression of cancers is limited. In this study, we have developed a comprehensive computational framework for the stage-differentiated modelling of DNA methylation landscapes in colorectal cancer (CRC), and unravelled significant stagewise signposts of CRC progression. Methods: The methylation β-matrix was derived from the public-domain TCGA data, converted into an M-value matrix, annotated with AJCC stages, and analysed for stage-salient genes using multiple approaches involving stage-differentiated linear modelling of methylation patterns and/or expression patterns. Differentially methylated genes (DMGs) were identified using a contrast against controls (adjusted p-value <0.001 and |log fold-change of M-value| >2). These results were filtered using a series of all possible pairwise stage contrasts (p-value <0.05) to obtain stage-salient DMGs. These were then subjected to a consensus analysis, followed by Kaplan–Meier survival analysis to evaluate the impact of methylation patterns of consensus stage-salient biomarkers on disease prognosis. Results: We found significant genome-wide changes in methylation patterns in cancer cases relative to controls agnostic of stage. Our stage-differentiated analysis yielded the following stage-salient genes: one stage-I gene (FBN1), one stage-II gene (FOXG1), one stage-III gene (HCN1) and four stage-IV genes (NELL1, ZNF135, FAM123A, LAMA1). All the biomarkers were hypermethylated, indicating down-regulation and signifying a CpG Island Methylator Phenotype (CIMP) manifestation. A significant prognostic signature consisting of FBN1 and FOXG1 survived all the steps of our analysis pipeline, and represents a novel early-stage biomarker.
Conclusions: We have designed a workflow for stage-differentiated consensus analysis, and identified stage-salient diagnostic biomarkers and an early-stage prognostic biomarker panel. Our studies further yield a novel CIMP-like signature of potential clinical import underlying CRC progression.
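The Methods mention converting the β-matrix to M-values and filtering on adjusted p-value and fold-change. A minimal sketch of those two steps is given below, assuming the standard logit transform M = log2(β / (1 − β)) and reading the "log fold-change of M-value" as a difference in mean M-values between cases and controls. The clipping offset `eps` and the `is_dmg` helper are assumptions for illustration, not the authors' code.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value in [0, 1] to an M-value.

    Beta is clipped to [eps, 1 - eps] so that 0 and 1 do not produce
    infinite M-values (the offset is an assumption of this sketch).
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def is_dmg(adj_p, mean_m_case, mean_m_control, p_cut=0.001, lfc_cut=2.0):
    """Apply the thresholds quoted in the abstract to one gene."""
    return adj_p < p_cut and abs(mean_m_case - mean_m_control) > lfc_cut


m_half = beta_to_m(0.5)    # unmethylated and methylated signal balanced
m_high = beta_to_m(0.8)    # 0.8 / 0.2 = 4, so M is about 2
m_low = beta_to_m(0.2)     # 0.2 / 0.8 = 0.25, so M is about -2

print(m_half, m_high, m_low)
print(is_dmg(adj_p=1e-5, mean_m_case=m_high, mean_m_control=m_low))
```

M-values are unbounded and roughly symmetric around zero, which is why they are usually preferred over β values as input to linear models.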


Blood ◽  
2004 ◽  
Vol 104 (11) ◽  
pp. 926-926 ◽  
Author(s):  
Guido Tricot ◽  
Maureen Reiner ◽  
Jeffrey Sawyer ◽  
John Crowley ◽  
Bart Barlogie

Abstract In acute leukemia, prolonged survival is impossible without obtaining a CR. Based on the acute leukemia model, myeloma therapy has gradually been intensified with the aim to increase the CR rate as a first important step to improve overall survival (OS). Although patients with abnormal metaphase cytogenetics have a significantly inferior outcome in terms of event-free and overall survival, the CR rate is similar for patients with and without cytogenetic abnormalities, indicating that CR may not be a good prognostic indicator of ultimate outcome. To address the importance of obtaining a CR for OS, we analyzed our Total Therapy I (VADx3-high dose cyclophosphamide 6 g/m² with stem cell collection-EDAP-melphalan-based tandem transplants-α interferon maintenance) data in those patients who had not received any treatment prior to enrollment (N=155), received at least one transplant (N=135) and were alive one year after the first transplant (N=132). Kaplan-Meier curves were generated using a 1-year landmark to compensate for the guaranteed time of CR patients, but thereby excluding patients who died within the first year after the first autotransplant (N=3). The 1-year landmark was chosen because the large majority of CR patients (75%) had achieved their CR at 1 year after the first transplant. In addition, a time-dependent co-variate analysis for CR was performed, including the 135 patients. The median follow-up of these patients was 10.5 years. The 9-year OS after the landmark, i.e., 10 years after the first transplant, was 41% (95% confidence interval: 26, 55) for CR patients versus 37% (26, 47) for no CR patients (i.e., PR and <PR) with a logrank p value of 0.71 (Figure 1). Using a time-dependent co-variate analysis for CR, achieving a CR was not significantly related to OS (Hazard Ratio: 0.83; p value 0.39).
Only the presence of metaphase cytogenetic abnormalities (HR: 2.0; p=0.005), LDH > 190 U/L (upper limit of normal) (HR: 2.0; p=0.01) and CRP > 4.0 mg/L (HR: 1.6; p=0.03) were significant for OS. When the importance of CR was assessed separately for patients with (N=43) and without (N=84) abnormal cytogenetics (cytogenetic information was missing on 5 patients), no survival benefit for CR patients was seen in either subgroup (p values 0.52 and 0.32, respectively) and similarly, using the time-dependent co-variate analysis for CR, there was no significant benefit for OS of attaining a CR in either group (p value: 0.7 and 0.5, respectively). We conclude that prolonged survival (>10 years) is observed in a substantial proportion of myeloma patients receiving a tandem autotransplant-based regimen, irrespective of the completeness of response to tandem transplants. The inherent genetic features of the myeloma and the impact on the micro-environment of the myeloma cells appear to be more important than the absolute tumor burden reduction accomplished by tandem transplants. Our findings may also be a reflection of the insensitivity of CR as an assessment of remaining tumor burden in myeloma and a new definition of CR may be required.
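The landmark approach described in this abstract can be sketched in a few lines: patients who are no longer at risk at the landmark are dropped, the clock is reset to the landmark, and response status is fixed as of that time before Kaplan-Meier curves are estimated per group. The helper names and the five toy patient records below are hypothetical and are not drawn from the Total Therapy I data.

```python
def apply_landmark(records, landmark):
    """Keep patients still under follow-up after `landmark`; reset time zero.

    Each record is (years_from_transplant, died, cr_by_landmark).
    Patients who died or were censored at or before the landmark are excluded.
    """
    return [(t - landmark, died, cr) for t, died, cr in records if t > landmark]


def kaplan_meier(times_events):
    """Kaplan-Meier estimate as a list of (time, survival) at each death time."""
    survival = 1.0
    curve = []
    at_risk = len(times_events)
    for t in sorted({time for time, _ in times_events}):
        deaths = sum(1 for time, died in times_events if time == t and died)
        leaving = sum(1 for time, _ in times_events if time == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Toy records: (years from first transplant, died, CR achieved by 1 year).
records = [
    (0.5, True, False),   # died before the landmark: excluded
    (3.0, True, True),
    (12.0, False, True),
    (4.0, True, False),
    (11.0, False, False),
]

landmarked = apply_landmark(records, landmark=1.0)
cr_curve = kaplan_meier([(t, d) for t, d, cr in landmarked if cr])
no_cr_curve = kaplan_meier([(t, d) for t, d, cr in landmarked if not cr])
print(cr_curve, no_cr_curve)
```

Fixing group membership at the landmark is what removes the guaranteed survival time that responders would otherwise accrue before achieving their response.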


Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 3802-3802
Author(s):  
Elias Jabbour ◽  
Hagop M. Kantarjian ◽  
Xuemei Wang ◽  
Lynne V. Abruzzo ◽  
A Megan Cornelison ◽  
...  

Abstract 3802 Background and Aim: The impact of CA on prognosis and transformation into acute myeloid leukemia among pts with low and int-1 risk MDS is not known. The aims of the study were to assess the impact of CA on the natural history of pts with lower risk MDS and to identify factors associated with its development. Methods: We reviewed the clinical records of 721 pts with low and intermediate risk MDS from 2000–2010 and conducted a retrospective analysis of all pts with at least two consecutive cytogenetic analyses (365 patients, 51%). The acquisition of CA was defined by structural change or gain in at least 2 metaphases and loss in 3 metaphases, or otherwise confirmed by FISH. Cox proportional hazards regression models were fit to assess the association between transformation-free survival (TFS) or overall survival (OS) and pt characteristics. The acquisition of CA was fitted in the Cox model as a time-dependent covariate. The association between the acquisition of CA and pt characteristics was assessed through univariate and multiple logistic regression models. Results: CA was detected in 107 pts (29%) after a median follow-up of 34 months (mos). CA was observed in a median number of 4 metaphases (range, 2–30). At diagnosis, 21% and 79% of pts who acquired CA were low- and int-1-risk MDS; 50% were diploid, 22% harbored chromosome 5/7 abnormalities. At the time of acquisition of CA, the median percentage of bone marrow blasts was 4% (range, 0% to 89%), the median WBC, hemoglobin and platelets were 3.1 × 10⁹/L, 9.5 g/dL, and 65 × 10⁹/L, respectively; pts were low, int-1, int-2, and high-risk MDS in 3%, 42%, 26%, 29%, respectively. The median TFS and OS were 31 (95% CI: 27–37) and 34 (95% CI: 30–44) mos respectively. Assessing CA as a time-dependent covariate, patients with CA had a worse TFS and OS, with a median TFS and OS of 16 and 18 mos compared to 56 and 60 mos, respectively, in pts without CA.
Based on the multivariable Cox model and after adjusting for effects of all other covariates, pts who had acquired CA had an increased risk of transformation (HR=1.46; p-value = 0.01) or death (HR=1.50; p-value = 0.01). By multivariate analysis, female pts with prior chemotherapy had an increased risk of developing CA (OR= 5.26; p-value <0.0001). 96 pts had a history of previous malignancy treated with chemotherapy +/− radiation therapy. Of those, 34 (35%) patients acquired CA. Median time from previous chemotherapy to the acquisition of CA was 61 mos (range, 11 to 180). Pts previously treated who did not acquire CA had similar outcomes to those who had never been treated and did not develop CA, while those who did develop CA, whether they were previously treated or not, had worse TFS and OS. Conclusion: CA occurs in 29% of pts with lower risk MDS, is more common among pts with previously treated malignancy, and has a significant impact on TFS and OS, possibly reflecting genomic instability in the natural history of MDS. Disclosures: Cortes: BMS: Consultancy, Research Funding; Novartis: Consultancy, Research Funding; Pfizer: Consultancy, Research Funding.


Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 2406-2406
Author(s):  
Mira Jeong ◽  
Deqiang Sun ◽  
Min Luo ◽  
Yun Huang ◽  
Myunggon Ko ◽  
...  

Abstract Identification of recurrent leukemia-associated mutations in genes encoding regulators of DNA methylation such as DNMT3A and TET2 has underscored the critical importance of DNA methylation in maintenance of normal physiology. To gain insight into how DNA methylation exerts this central role, we sought to determine the genome-wide pattern of DNA methylation in the normal precursors of leukemia cells, the hematopoietic stem cell (HSC), and to investigate the factors that affect alterations in DNA methylation and gene expression. We performed whole genome bisulfite sequencing (WGBS) on purified murine HSCs achieving a total of 1,121M reads, resulting in a combined average of 40X coverage. Using a hidden Markov model, we identified 32,325 under-methylated regions (UMRs) with average proportion of methylation ≤ 10% and, by inspecting the UMR size distribution, we discovered exceptionally large “methylation Canyons” which span highly conserved domains frequently containing transcription factors and are quite distinct from CpG islands and shores. Methylation Canyons are a distinct genomic feature that is stable, albeit with subtle differences, across cell-types and species. Canyon-associated genes showed a striking pattern of enrichment for genes involved in transcriptional regulation (318 genes, P=6.2 × 10⁻¹²³), as well as genes containing a homeobox domain (111 genes, P=3.9 × 10⁻⁸⁵). We compared Canyons with TF binding sites as identified from more than 150 ChIP-seq data sets across a variety of blood lineages (>10) and found that TF binding peaks for 10 HSC pluripotency TFs are significantly enriched across the entirety of Canyons compared with their surrounding regions. Low DNA methylation is usually associated with active gene expression. However, half of Canyon genes associated with H3K27me3 showed low or no expression regardless of their H3K4me3 association while H3K4me3-only Canyon genes were highly expressed.
Because DNMT3A is mutated in a high frequency of human leukemias, we examined the impact of loss of Dnmt3a on Canyon size. Upon knockout of Dnmt3a, the edges of the Canyons are hotspots of differential methylation while regions inside Canyons are relatively resistant. The methylation loss in Dnmt3a KO HSCs led to Canyon edge erosion, Canyon size expansion and addition of 861 new Canyons for a total of 1787 Canyons. Canyons marked with H3K4me3 only were most likely to expand after Dnmt3a KO and the Canyons marked only with H3K27me3 or with both marks were more likely to contract. This suggests Dnmt3a specifically is acting to restrain Canyon size where active histone marks (and active transcription) are already present. WGBS cannot distinguish between 5mC and 5hmC, so we determined the genome-wide distribution of 5hmC in WT and Dnmt3a KO HSCs using the cytosine-5-methylenesulphonate (CMS)-Seq method, in which sodium bisulfite treatment converts 5hmC to CMS; CMS-containing DNA fragments are then immunoprecipitated using a CMS-specific antiserum. Strikingly, 5hmC peaks were enriched specifically at the borders of Canyons. In particular, expanding Canyons, typically associated with highest H3K4me3 marking, were highly enriched at the edges for the 5hmC signal, suggesting a model in which Tet proteins and Dnmt3a act concomitantly on Canyon borders, opposing each other in alternately effacing and restoring methylation at the edges, particularly at sites of active chromatin marks. Using Oncomine data, we tested whether Canyon-associated genes were likely to be associated with hematologic malignancy development and found Canyon genes were highly enriched in seven signatures of genes over-expressed in leukemia patients compared to normal bone marrow; in contrast, four sets of control genes were not similarly enriched.
Further, using TCGA data, we found that expressed Canyon genes are significantly enriched for differentially expressed genes between patients with and without DNMT3A mutation (p value < 0.05). Overall, 76 expressed Canyon genes, including multiple HOX genes, are significantly changed in patients with DNMT3A mutation (p=0.0031). Methylation Canyons, the novel epigenetic landscape we describe, may provide a mechanism for the regulation of hematopoiesis and may contribute to leukemia development. Disclosures: No relevant conflicts of interest to declare.


2020 ◽  
Vol 12 (1) ◽  
Author(s):  
H. Toinét Cronjé ◽  
Hannah R. Elliott ◽  
Cornelie Nienaber-Rousseau ◽  
Marlien Pieters

Abstract Background: DNA methylation is associated with non-communicable diseases (NCDs) and related traits. Methylation data on continental African ancestries are currently scarce, even though there are known genetic and epigenetic differences between ancestral groups and a high burden of NCDs in Africans. Furthermore, the degree to which current literature can be extrapolated to the understudied African populations, who have limited resources to conduct independent large-scale analysis, is not yet known. To this end, this study examines the reproducibility of previously published epigenome-wide association studies of DNA methylation conducted in different ethnicities, on factors related to NCDs, by replicating findings in 120 South African Batswana men aged 45 to 88 years. In addition, novel associations between methylation and NCD-related factors are investigated using the Illumina EPIC BeadChip. Results: Up to 86% of previously identified epigenome-wide associations with NCD-related traits (alcohol consumption, smoking, body mass index, waist circumference, C-reactive protein, blood lipids and age) overlapped with those observed here and a further 13% were directionally consistent. Only 1% of the replicated associations presented with effects opposite to findings in other ancestral groups. The majority of these inconsistencies were associated with population-specific genomic variance. In addition, we identified eight new 450K array CpG associations not previously reported in other ancestries, and 11 novel EPIC CpG associations with alcohol consumption. Conclusions: The successful replication of existing EWAS findings in this African population demonstrates that blood-based 450K EWAS findings from commonly investigated ancestries can largely be extrapolated to ethnicities for which epigenetic data are not yet available.
Possible population-specific differences in 14% of the tested associations do, however, motivate the need to include a diversity of ethnic groups in future epigenetic research. The novel associations found with the enhanced coverage of the Illumina EPIC array support its usefulness to expand epigenetic literature.

