Data-Driven-Based Approach to Identifying Differentially Methylated Regions Using Modified 1D Ising Model

2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Yuanyuan Zhang ◽  
Shudong Wang ◽  
Xinzeng Wang

Background. DNA methylation is essential for regulating gene expression, and changes in DNA methylation status are commonly found in disease. Identifying differential methylation patterns, especially differentially methylated regions (DMRs), between two groups is therefore important for understanding the mechanisms of complex diseases. A few tools identify DMRs by considering features of methylation data, but no current method comprehensively integrates the characteristics of DNA methylation data. Results. Accounting for the characteristics of methylation data, such as the correlation between neighboring CpG sites and the high heterogeneity of DNA methylation data, we propose a data-driven approach that identifies DMRs by evaluating the energy of a single site using a modified 1D Ising model. We apply our approach to both simulated and publicly available datasets and compare its performance with that of other popular methods. Simulation results show that our method is more sensitive than competing methods. Applied to real data, our method identifies more common DMRs than DMRcate, ProbeLasso, and Wang’s methods, with a high overlapping ratio. The necessity of integrating heterogeneity and correlation when identifying DMRs is also shown by comparing against results that consider only mean or only variance signals, and against results that ignore the relationship between neighboring CpG sites. By analyzing the genomic locations of the DMRs identified in real data, we find that about 90% of DMRs are located in CpG islands (CGIs), which always regulate the expression of genes. This may help us understand the functional effect of DNA methylation on disease.

Epigenomics ◽  
2020 ◽  
Vol 12 (9) ◽  
pp. 747-755
Author(s):  
Veronika Suni ◽  
Fatemeh Seyednasrollah ◽  
Bishwa Ghimire ◽  
Sini Junttila ◽  
Asta Laiho ◽  
...  

Aim: DNA methylation is a key epigenetic mechanism regulating gene expression. Identifying differentially methylated regions is integral to DNA methylation analysis and there is a need for robust tools reliably detecting regions with significant differences in their methylation status. Materials & methods: We present here a reproducibility-optimized test statistic (ROTS) for detection of differential DNA methylation from high-throughput sequencing or array-based data. Results: Using both simulated and real data, we demonstrate the ability of ROTS to identify differential methylation between sample groups. Conclusion: Compared with state-of-the-art methods, ROTS shows competitive sensitivity and specificity in detecting consistently differentially methylated regions.


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Jun Li ◽  
Siyuan Li ◽  
Ying Hu ◽  
Guolei Cao ◽  
Siyao Wang ◽  
...  

Objective. We investigated the expression levels of both FOSL2 mRNA and protein and evaluated DNA methylation in the blood of Uyghur patients with type 2 diabetes mellitus (T2DM) from Xinjiang. This study also evaluated whether FOSL2 gene expression was associated with clinical and biochemical indicators of T2DM. Methods. One hundred Uyghur subjects were divided into two groups, T2DM and nonimpaired glucose tolerance (NGT). DNA methylation of FOSL2 was analyzed by MassARRAY Spectrometry, and methylation data of individual units were generated by the EpiTyper v1.0.5 software. The mRNA and protein expression levels of FOS-like antigen 2 (FOSL2) were analyzed. Results. Significant differences in mRNA and protein levels were observed relative to the NGT group, while methylation rates of eight CpG units within the FOSL2 gene were higher in the T2DM group. Methylation of CpG sites was found to inversely correlate with expression of other markers. Conclusions. The results show that mRNA, protein, and DNA methylation of the FOSL2 gene are correlated among Uyghur T2DM patients. FOSL2 protein and mRNA were downregulated and the DNA became hypermethylated, all of which may be involved in T2DM pathogenesis in this population.


Circulation ◽  
2017 ◽  
Vol 135 (suppl_1) ◽  
Author(s):  
Xiaoling Wang ◽  
Yue Pan ◽  
Haidong Zhu ◽  
Guang Hao ◽  
Xin Wang ◽  
...  

Background: Several large-scale epigenome-wide association studies on obesity-related DNA methylation changes have been published and in total identified 46 CpG sites. These studies were conducted in middle-aged and older adults of Caucasians and African Americans (AAs) using leukocytes. To what extent these signals are independent of cell composition, and to what extent they may influence gene expression, has not been systematically investigated. Furthermore, the high prevalence of obesity comorbidities in middle-aged or older populations may hide or bias DNA methylation changes related to obesity itself. Methods: In this study of healthy AA youth and young adults, genome-wide DNA methylation data from leukocytes were obtained from three independent studies: EpiGO study (96 obese cases vs. 92 lean controls, aged 14-21, 50% females, test of interest is obesity status), LACHY study (284 participants from general population, aged 14-18, 50% females, test of interest is BMI), and Georgia Stress and Heart study (298 participants from general population, aged 18-38, 52% females, test of interest is BMI) using the Infinium HumanMethylation450 BeadChip. Genome-wide DNA methylation data from purified neutrophils as well as genome-wide gene expression data from leukocytes using the Illumina HT12 V4 array were also obtained for the EpiGO samples. Results: The meta-analysis of the 3 cohorts identified 76 obesity-related CpG sites in leukocytes with p<1×10⁻⁷. Out of the 46 previously identified CpG sites, 36 can be replicated in this AA youth and young adult sample with the same direction and p<0.05. Out of the 107 CpG sites including the 36 replicated ones and the 71 newly identified ones, 71 CpG sites (66%) had their relationship with obesity replicated in purified neutrophils (p<0.05). The analysis of the cis regulation of gene expression by the 107 CpG sites showed that 59 CpG sites had at least one gene within 250kb with an expression difference between obese cases and lean controls. Furthermore, out of the 59 CpG sites, 6 showed significantly negative correlations and 1 showed a significantly positive correlation with the differentially expressed genes. These CpG sites are located in the SOCS3, CISH, ABCG1, PIM3 and PTGDS genes. Conclusion: In this study of AA youth and young adults, we identified novel CpG sites associated with obesity and replicated the majority of the CpG sites previously identified in middle-aged and older adults. For the first time, we showed that the majority of the obesity-related CpG sites identified from leukocytes are not driven by cell composition and provided the direct link between DNA methylation-gene expression-obesity status for 7 CpG sites in 5 genes.


Author(s):  
Xiangyu Luo ◽  
Joel Schwartz ◽  
Andrea Baccarelli ◽  
Zhonghua Liu

Abstract Epigenome-wide mediation analysis aims to identify DNA methylation CpG sites that mediate the causal effects of genetic/environmental exposures on health outcomes. However, DNA methylation in peripheral blood is usually measured at the bulk level, based on a heterogeneous population of white blood cells. Using bulk-level DNA methylation data in mediation analysis might cause confounding bias and reduce study power. Therefore, it is crucial to get fine-grained results by detecting mediation CpG sites in a cell-type-specific way. However, there is a lack of methods and software to achieve this goal. We propose a novel method (Mediation In a Cell-type-Specific fashion, MICS) to identify cell-type-specific mediation effects in genome-wide epigenetic studies using only the bulk-level DNA methylation data. MICS follows the standard mediation analysis paradigm and consists of three key steps. In step 1, we assess the exposure-mediator association for each cell type; in step 2, we assess the mediator-outcome association for each cell type; in step 3, we combine the cell-type-specific exposure-mediator and mediator-outcome associations using a multiple testing procedure named MultiMed [Sampson JN, Boca SM, Moore SC, et al. FWER and FDR control when testing multiple mediators. Bioinformatics 2018;34:2418–24] to identify significant CpGs with cell-type-specific mediation effects. We conduct simulation studies to demonstrate that our method has correct FDR control. We also apply the MICS procedure to the Normative Aging Study and identify nine DNA methylation CpG sites in the lymphocytes that might mediate the effect of cigarette smoking on lung function.


Blood ◽  
2008 ◽  
Vol 112 (11) ◽  
pp. 4466-4466
Author(s):  
Margaret Dellett ◽  
Michelle Lazenby ◽  
Alan K Burnett ◽  
Ken I Mills

Abstract Acute myeloid leukemia (AML) accounts for ~30% of adult leukaemia cases and is expected to increase as the population ages, because the median age of onset is ~60 years. Recent evidence suggests that DNA methylation is actively involved in AML and myelodysplastic syndrome (MDS). Tumor suppressor genes, such as p16, have been shown to be silenced by methylation in AML. However, epigenetic events such as DNA methylation are reversible and are therefore targets for chemotherapeutic intervention. It has been reported that ~30% of MDS patients with an abnormal karyotype show normalization of their methylation status after receiving a demethylating drug during early stages of their therapy. The UK NCRI AML16 programme for elderly patients (>60 years old at diagnosis) with AML and high risk MDS has several therapeutic questions for patients considered fit for intensive treatment, one of which is to compare the use of azacytidine demethylation maintenance treatment with no maintenance therapy. Samples obtained at diagnosis from patients entered into the AML16 trial, and samples from patients in the intensive arm of the trial who were randomized to receive azacytidine maintenance therapy, were analyzed for alterations in genomic methylation. Pyrosequencing was used to determine methylation within 17 CpG sites within p16, MLH1, and MGMT, whilst LINE1 was used as a measure of global methylation. To date, approximately 714 patients have been entered into AML16. Of these, 195 diagnostic samples have been analyzed, of which 103 were in the intensive arm of the trial. At the second randomization stage, 34 patient samples were analyzed and a further 26 samples were obtained following 3, 6 or 9 courses of azacytidine therapy. Statistical comparison of the methylation levels at each individual CpG, or for the averaged CpG in each gene studied, indicated that there was no difference whether the sample was derived from bone marrow or peripheral blood. This allowed the direct comparison of peripheral blood samples obtained at 2nd randomization and during azacytidine maintenance courses. Differential levels of methylation at individual CpG sites within each gene were seen at diagnosis. Higher levels of average p16 methylation were observed in the AML patients when compared to a small cohort of “well elderly” individuals. No difference was noted in the individual or averaged CpG methylation status for MGMT or LINE1 during the maintenance course of azacytidine. However, the methylation status of the CpG sites within the p16 and MLH1 genes reduced during maintenance by a median of 19% and 25% respectively. The number of patients completing three courses of azacytidine was only about 20% of those entering the intensive arm of AML16; however, sequential samples from the same individual also showed demethylation of the CpG sites in p16 and MLH1. This study shows that azacytidine maintenance therapy in elderly AML patients does reduce the methylation status of some genes whilst other genes show no response. This is being investigated further using arrays containing 12,000 CpG sites, which will be correlated with gene expression microarrays on the diagnostic samples from AML16.


Blood ◽  
2014 ◽  
Vol 124 (21) ◽  
pp. 3549-3549
Author(s):  
Yang Xi ◽  
Velizar Shivarov ◽  
Gur Yaari ◽  
Steven Kleinstein ◽  
Matthew P. Strout

Abstract DNA methylation and demethylation at cytosine residues are epigenetic modifications that regulate gene expression associated with early cell development, somatic cell differentiation, cellular reprogramming and malignant transformation. While the process of DNA methylation and maintenance by DNA methyltransferases is well described, the nature of DNA demethylation remains poorly understood. The current model of DNA demethylation proposes modification of 5-methylcytosine followed by DNA repair-dependent cytosine substitution. Although there is debate on the extent of its involvement in DNA demethylation, activation-induced cytidine deaminase (AID) has recently emerged as an enzyme that is capable of deaminating 5-methylcytosine to thymine, creating a T:G mismatch which can be repaired back to cytosine through DNA repair pathways. AID is expressed at low levels in many tissue types but is most highly expressed in germinal center B cells where it deaminates cytidine to uracil during somatic hypermutation and class switch recombination of the immunoglobulin genes. In addition to this critical role in immune diversification, aberrant targeting of AID contributes to oncogenic point mutations and chromosome translocations associated with B cell malignancies. Thus, to explore a role for AID in DNA demethylation in B cell lymphoma, we performed genome-wide methylation profiling in BL2 and AID-deficient (AID-/-) BL2 cell lines (Burkitt lymphoma that can be induced to express high levels of AID). Using Illumina’s Infinium II DNA Methylation assay combined with the Infinium Human Methylation 450 Bead Chip, we analyzed over 450,000 methylation (CpG) sites at single nucleotide resolution in each line. BL2 AID-/- cells had a median average beta value (ratio of the methylated probe intensity to overall intensity) of 0.76 compared with 0.73 in AID-expressing BL2 cells (P < 0.00001), indicating a significant reduction in global methylation in the presence of AID. 
Using a delta average beta value of ≥ 0.3 (high stringency cut-off whereby a difference of 0.3 or more defines a CpG site as hypomethylated), we identified 5883 CpG sites in 3347 genes that were hypomethylated in BL2 versus BL2 AID-/- cells. Using the Illumina HumanHT-12 v4 Expression BeadChip and Genome Studio software, we then integrated gene expression and methylation profiles from both lines to generate a list of genes that met the following criteria: 1) contained at least 4 methylation sites within the first 1500 bases downstream of the primary transcriptional start site (TSS 1500; AID is most active in this region during somatic hypermutation); 2) average beta value increased by >0.1 in the TSS 1500 region in BL2 compared with BL2 AID-/- cells; and 3) down-regulated by >50% in BL2 compared with BL2 AID-/- cells. This analysis identified 31 candidate genes targeted for AID-dependent demethylation with consequent changes in gene expression. Interestingly, 15 of these genes have been reported to be bound by AID in association with stalled RNA polymerase II in activated mouse B cells. After validating methylation status in a subset of genes (APOBEC3B, BIN1, DEM1, GRN, GNPDA1) through bisulfite sequencing, we selected DEM1 for further analysis. DEM1 encodes an exonuclease involved in DNA repair and contains 16 CpG sites within its TSS1500, with only one site >50% methylated in BL2 cells compared with 8 of 16 in BL2 AID-/- cells. To assess a direct role for AID in DEM1 methylation status, a retroviral construct (AIDΔL189-L198ER) driving tamoxifen-inducible expression of a C-terminal deletion mutant of AID (increases time spent in the nucleus) was introduced into BL2 AID-/- cells. BL2, BL2 AID-/-, and BL2 AIDΔL189-L198ER cells were cultured continuously for 21 days in the presence of tamoxifen, 100 nM. Bisulfite sequencing of DEM1 TSS 1500 did not demonstrate any significant changes in methylation at day 7. 
However, at day 21, 13 of the 16 DEM1 TSS 1500 methylation sites in BL2 AIDΔL189-L198ER cells were found to have a ratio of unmethylated to methylated clones ~10-25% above that of BL2 AID-/- cells. By qPCR, this correlated with a 1.75-fold increase in DEM1 gene expression to levels that were equivalent to those seen in BL2 cells (P = 0.003). Although further investigations are needed, these data support the notion that AID is able to regulate target gene expression in B cell malignancy through active DNA demethylation. Disclosures No relevant conflicts of interest to declare.
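The beta value and delta-beta cut-off described in this abstract can be illustrated with a minimal Python sketch. It uses the abstract's definition (methylated probe intensity divided by overall intensity) and its cut-off of 0.3. The function names, the optional `offset` parameter, the site IDs, and the intensities are all hypothetical and chosen for illustration only; this is not the authors' pipeline.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Ratio of methylated probe intensity to overall intensity.

    `offset` is an assumption added for illustration: some pipelines add a
    small constant to the denominator to stabilise low-intensity probes.
    The default of 0 matches the definition given in the abstract.
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("overall intensity must be positive")
    return methylated / total


def hypomethylated_sites(beta_reference, beta_test, min_delta=0.3):
    """Return site IDs whose beta value in `beta_test` is lower than in
    `beta_reference` by at least `min_delta`."""
    return [
        site
        for site in beta_reference
        if site in beta_test
        and beta_reference[site] - beta_test[site] >= min_delta
    ]


# Hypothetical (methylated, unmethylated) intensities for two made-up sites.
reference_line = {"site_a": (800, 200), "site_b": (700, 300)}
test_line = {"site_a": (400, 600), "site_b": (650, 350)}

beta_ref = {s: beta_value(m, u) for s, (m, u) in reference_line.items()}
beta_tst = {s: beta_value(m, u) for s, (m, u) in test_line.items()}

print(hypomethylated_sites(beta_ref, beta_tst))
```

With these made-up intensities only `site_a` passes the cut-off, since its beta value drops from 0.8 to 0.4 while `site_b` changes by only 0.05.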


Author(s):  
Lajmi Lakhal-Chaieb ◽  
Celia M.T. Greenwood ◽  
Mohamed Ouhourane ◽  
Kaiqiong Zhao ◽  
Belkacem Abdous ◽  
...  

Abstract We consider the assessment of DNA methylation profiles for sequencing-derived data from a single cell type or from cell lines. We derive a kernel-smoothed EM algorithm capable of analyzing an entire chromosome at once, which simultaneously corrects for experimental errors arising from either the pre-treatment steps or the sequencing stage and takes into account spatial correlations between DNA methylation profiles at neighbouring CpG sites. The outcomes of our algorithm are then used to (i) call the true methylation status at each CpG site, (ii) provide accurate smoothed estimates of DNA methylation levels, and (iii) detect differentially methylated regions. Simulations show that the proposed methodology outperforms existing analysis methods that either ignore the correlation between DNA methylation profiles at neighbouring CpG sites or do not correct for errors. The use of the proposed inference procedure is illustrated through the analysis of a publicly available data set from a cell line of induced pluripotent H9 human embryonic stem cells and also a data set where methylation measures were obtained for a small genomic region in three different immune cell types separated from whole blood.


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. 11034-11034
Author(s):  
Shengyang Wu ◽  
Benjamin Thomas Cooper ◽  
Fang Bu ◽  
Christopher Bowman ◽  
Keith Killian ◽  
...  

Background: Bone sarcomas present a unique diagnostic challenge because of the considerable morphologic overlap between different entities. The choice of optimal treatment, however, is dependent upon accurate diagnosis. Genome-wide DNA methylation profiling has emerged as a new approach to aid in the diagnosis of brain tumors, with diagnostic accuracy exceeding standard histopathology. In this work we developed and validated a methylation-based classifier to differentiate between osteosarcoma, Ewing’s sarcoma, and synovial sarcoma. Methods: DNA methylation status of 482,421 CpG sites in 15 osteosarcoma, 10 Ewing’s sarcoma, and 11 synovial sarcoma samples was measured using the Illumina HumanMethylation450 array. From this training set of 36 samples we developed a random forest classifier using the 400 most differentially methylated CpG sites (FDR q value < 0.001). This classifier was then validated on 10 synovial sarcoma samples from TCGA, 86 osteosarcoma samples from TARGET-OS, and 15 Ewing’s sarcoma samples from a recently published series (Huertas-Martinez et al., Cancer Letters 2016). Results: Methylation profiling revealed three distinct molecular clusters, each enriched with a single sarcoma subtype. Within the validation cohorts, all samples from TCGA were correctly classified as synovial sarcoma (10/10, sensitivity and specificity 100%). All but one sample from TARGET-OS were classified as osteosarcoma (85/86, sensitivity 98%, specificity 100%) and all but one sample from the Ewing’s sarcoma series were classified as Ewing’s sarcoma (14/15, sensitivity 93%, specificity 100%). The single misclassified osteosarcoma sample was classified as Ewing’s sarcoma, and was later determined to be a misdiagnosed Ewing’s sarcoma based on RNA-Seq demonstrating high EWSR1 and ETV1 expression. An additional clinical sample that was misdiagnosed as a synovial sarcoma by initial histopathology was accurately recognized as osteosarcoma by the methylation classifier. Conclusions: Osteosarcoma, Ewing’s sarcoma and synovial sarcoma have distinct epigenetic profiles. Our validated methylation-based classifier can be used to provide an accurate diagnosis when histological and standard techniques are inconclusive.
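The per-subtype sensitivities quoted in this abstract follow from the stated counts of correctly classified validation samples. The short Python sketch below recomputes them. The function names are hypothetical, and the specificity helper is included only to show the standard definition, because the abstract does not give the counts needed to recompute specificity.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of samples of a class that the classifier assigns to it."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no samples of this class")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Fraction of samples outside a class that are not assigned to it."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no samples outside this class")
    return true_negatives / total


# (correctly classified, total) per validation cohort, as stated in the abstract.
validation = {
    "synovial sarcoma": (10, 10),
    "osteosarcoma": (85, 86),
    "Ewing's sarcoma": (14, 15),
}

for subtype, (correct, total) in validation.items():
    value = sensitivity(correct, total - correct)
    print(f"{subtype}: {correct}/{total} = {value:.1%}")
```

The unrounded values are 100.0%, 98.8%, and 93.3%, which the abstract reports as 100%, 98%, and 93%.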


2019 ◽  
Author(s):  
Lara Nonell ◽  
Juan R González

Abstract DNA methylation plays an important role in the development and progression of disease. Beta-values are the standard methylation measures. Different statistical methods have been proposed to assess differences in methylation between conditions. However, most of them do not completely account for the distribution of beta-values. The simplex distribution can accommodate beta-values data. We hypothesize that simplex is a quite flexible distribution which is able to model methylation data. To test our hypothesis, we conducted several analyses using four real data sets obtained from microarrays and sequencing technologies. Standard data distributions were studied and modelled in comparison to the simplex. Besides, some simulations were conducted in different scenarios encompassing several distribution assumptions, regression models and sample sizes. Finally, we compared DNA methylation between females and males in order to benchmark the assessed methodologies under different scenarios. According to the results obtained by the simulations and real data analyses, DNA methylation data are concordant with the simplex distribution in many situations. Simplex regression models work well in small sample size data sets. However, when sample size increases, other models such as the beta regression or even the linear regression can be employed to assess group comparisons and obtain unbiased results. Based on these results, we can provide some practical recommendations when analyzing methylation data: 1) use data sets of at least 10 samples per studied condition for microarray data sets or 30 in NGS data sets, 2) apply a simplex or beta regression model for microarray data, 3) apply a linear model in any other case.


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Amit Tirosh ◽  
Jonathan Keith Killian ◽  
Petersen David ◽  
Yuelin Jack Zhu ◽  
Jenny Blau ◽  
...  

Abstract Objective There is scant data on the genome-wide methylome alterations in neuroendocrine tumors (NET). Thus, the goal of this study was to compare the DNA methylation signature of NETs with respect to various primary sites and inherited genetic predisposition syndromes including von Hippel-Lindau (VHL) and multiple endocrine neoplasia type 1 (MEN1). Methods Genome-wide DNA methylation analysis of 96 NETs (primary and metastatic) was performed using the Illumina Infinium EPIC Array. Principal component analysis (PCA) and unsupervised clustering analyses were performed to identify distinct methylome signatures. The methylation status of genetic drivers such as APC was assessed by primary site. Results A total of 835,424 CpG methylation sites were quantified. Hypermethylated CpG sites were detected more frequently in sporadic vs. MEN1-related vs. VHL-related NETs, respectively (p < 0.001 for all comparisons), while hypomethylated CpG sites were more common in VHL-related NETs vs. sporadic and MEN1-related NETs (p < 0.001 for both comparisons). Small-intestinal NETs (SINETs) had the most differences at CpGs with the highest number of hyper- and hypomethylated CpG sites, followed by duodenal NETs (DNETs) and pancreatic NETs (PNETs, p < 0.001 for all comparisons). PCA showed distinct clustering of SINETs and three NETs of unknown primary. Sporadic, VHL-related and MEN1-related PNETs formed distinct groups on PCA. VHL-related NETs clustered separately showing pronounced CpG hypomethylation, while sporadic and MEN1-related NETs clustered together showing relative CpG hypermethylation. In a subgroup analysis, MEN1-related SINETs, DNETs and gastric NETs had distinct methylome signatures, respectively, with complete separation by PCA and unsupervised hierarchical clustering. Furthermore, we found CpG hypermethylation in the APC (adenomatous polyposis coli) gene, specifically in the 1A promoter, with higher methylation levels in gastric NETs and DNETs vs. SINETs, PNETs and NETs of unknown primary (p < 0.001 for all comparisons). Conclusion Various primary NET sites and genetically predisposed MEN1-related NETs have distinct DNA CpG methylation signatures. The methylome signatures identified in this study may be useful for non-invasive molecular characterization of NETs, through DNA methylation profiling of biopsy samples or circulating tumor DNA.

