Comparison of Methylation Capture Sequencing and Infinium EPIC Methylation Array in Peripheral Blood Mononuclear Cells

2020 ◽  
Author(s):  
Chang Shu ◽  
Xinyu Zhang ◽  
Bradley E. Aouizerat ◽  
Ke Xu

Abstract Background: Epigenome-wide association studies (EWAS) have been widely applied to identify methylation CpG sites associated with human disease. To date, the Infinium Methylation EPIC array (EPIC) is commonly used for high-throughput DNA methylation profiling. However, the EPIC array covers only 30% of the human methylome. Methylation Capture bisulfite sequencing (MC-seq) captures target regions of methylome and has advantages of extensive coverage in the methylome at an affordable price. Methods: Epigenome-wide DNA methylation in four peripheral blood mononuclear cell samples was profiled by using SureSelectXT Methyl-Seq for MC-seq and EPIC platforms separately. CpG site-based reproducibility of MC-seq was assessed with DNA sample inputs ranging in quantity of high (> 1000ng), medium (300-1000ng), and low (150ng-300ng). To compare the performance of MC-seq and the EPIC arrays, we conducted a Pearson correlation and methylation value difference at each CpG site that was detected by both MC-seq and EPIC. We compared the percentage and counts in each CpG island and gene annotation between MC-seq and the EPIC array. Results: After quality control, an average of 3,708,550 CpG sites per sample was detected by MC-seq with DNA quantity >1000ng. Reproducibility of MC-seq detected CpG sites was high with strong correlation estimates for CpG methylation among samples with high, medium, and low DNA inputs (r > 0.96). The EPIC array captured an average of 846,464 CpG sites per sample. Compared with the EPIC array, MC-seq detected more CpGs in coding regions and CpG islands. Among the 472,540 CpG sites captured by both platforms, methylation of a majority of CpG sites was highly correlated in the same sample (r: 0.98~0.99). However, methylation for a small proportion of CpGs (N=235) differed significantly between the two platforms, with differences in beta values of greater than 0.5. 
Conclusions: Our results show that MC-seq is an efficient and reliable platform for methylome profiling, with broader coverage of the methylome than the array-based platform. Although methylation measurements at the majority of CpGs are highly correlated, a number of CpG sites show large discrepancies between the two platforms, which warrant further investigation and cautious interpretation.
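The platform comparison described in the Methods (a Pearson correlation over CpG sites detected by both platforms, plus a per-site difference in beta values) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary layout, and the toy beta values are all hypothetical, and only the 0.5 difference threshold comes from the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def compare_platforms(mcseq, epic, delta=0.5):
    """Compare beta values from two platforms for a single sample.

    mcseq, epic: dicts mapping a CpG identifier to a beta value in [0, 1]
    (hypothetical layout). Returns the correlation over shared CpGs and the
    sorted list of shared CpGs whose absolute beta difference exceeds delta.
    """
    shared = sorted(set(mcseq) & set(epic))
    r = pearson([mcseq[c] for c in shared], [epic[c] for c in shared])
    discordant = [c for c in shared if abs(mcseq[c] - epic[c]) > delta]
    return r, discordant


# Hypothetical toy data: four CpGs are shared, one of them is discordant.
mcseq = {"cpg_a": 0.10, "cpg_b": 0.85, "cpg_c": 0.50, "cpg_d": 0.95, "cpg_e": 0.30}
epic = {"cpg_a": 0.12, "cpg_b": 0.80, "cpg_c": 0.55, "cpg_d": 0.20, "cpg_f": 0.40}
r, discordant = compare_platforms(mcseq, epic)
print(round(r, 3), discordant)
```

Restricting the comparison to the intersection of detected sites mirrors the abstract's focus on CpGs captured by both platforms; sites seen by only one platform are ignored here.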

2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Chang Shu ◽  
Xinyu Zhang ◽  
Bradley E. Aouizerat ◽  
Ke Xu

Abstract Background Epigenome-wide association studies (EWAS) have been widely applied to identify methylation CpG sites associated with human disease. To date, the Infinium MethylationEPIC array (EPIC) is commonly used for high-throughput DNA methylation profiling. However, the EPIC array covers only 30% of the human methylome. Methylation Capture bisulfite sequencing (MC-seq) captures target regions of methylome and has advantages of extensive coverage in the methylome at an affordable price. Methods Epigenome-wide DNA methylation in four peripheral blood mononuclear cell samples was profiled by using SureSelectXT Methyl-Seq for MC-seq and EPIC platforms separately. CpG site-based reproducibility of MC-seq was assessed with DNA sample inputs ranging in quantity of high (> 1000 ng), medium (300–1000 ng), and low (150 ng–300 ng). To compare the performance of MC-seq and the EPIC arrays, we conducted a Pearson correlation and methylation value difference at each CpG site that was detected by both MC-seq and EPIC. We compared the percentage and counts in each CpG island and gene annotation between MC-seq and the EPIC array. Results After quality control, an average of 3,708,550 CpG sites per sample were detected by MC-seq with DNA quantity > 1000 ng. Reproducibility of DNA methylation in MC-seq-detected CpG sites was high among samples with high, medium, and low DNA inputs (r > 0.96). The EPIC array captured an average of 846,464 CpG sites per sample. Compared with the EPIC array, MC-seq detected more CpGs in coding regions and CpG islands. Among the 472,540 CpG sites captured by both platforms, methylation of a majority of CpG sites was highly correlated in the same sample (r: 0.98–0.99). However, methylation for a small proportion of CpGs (N = 235) differed significantly between the two platforms, with differences in beta values of greater than 0.5. 
Conclusions Our results show that MC-seq is an efficient and reliable platform for methylome profiling, with broader coverage of the methylome than the array-based platform. Although methylation measurements at the majority of CpGs are highly correlated, a number of CpG sites show large discrepancies between the two platforms, which warrant further investigation and cautious interpretation.


2019 ◽  
Author(s):  
Kathleen Cheung ◽  
Marjolein J. Burgers ◽  
David A. Young ◽  
Simon Cockell ◽  
Louise N. Reynard

Abstract Background DNA methylation of CpG sites is commonly measured using Illumina Infinium BeadChip platforms. The Infinium MethylationEPIC array has replaced the Infinium Methylation450K array. The two arrays use the same technology, with the EPIC array assaying 865,859 CpG sites, almost double the number of sites present on the 450K array. In this study, we compare DNA methylation values of shared CpGs of the same human cartilage samples assayed using both platforms. Methods DNA methylation was measured in 21 human cartilage samples using the Illumina Infinium Methylation450K BeadChip and the Infinium MethylationEPIC array. Additional matched 450K and EPIC data in whole tumour and whole blood were downloaded from GEO GSE92580 and GSE86833, respectively. Data were processed using the Bioconductor package Minfi. Additionally, DNA methylation of six CpG sites was validated for the same 21 cartilage samples by use of pyrosequencing. Results In cartilage samples, overall sample correlations between methylation values generated by the two arrays were high (Pearson correlation coefficient r > 0.96). However, 50.5% of CpG sites showed poor correlation (r < 0.2) between arrays. Sites with limited variance and with either very high or very low methylation levels in cartilage exhibited lower correlation values, corroborating prior studies in whole blood. Bisulfite pyrosequencing did not highlight one array as generating more accurate methylation values than the other. For a specific CpG site, the array methylation correlation coefficient differed between cartilage, tumour and whole blood, reflecting the difference in methylation variance between cell types. These patterns can be observed across different tissues with different CpG site variances. When performing differential methylation analysis, the mean probe correlation coefficient increased with increasing Δβ threshold used. Conclusion CpG sites with low variability within a tissue showed poor reproducibility between arrays. 
However, variance and thus reproducibility differs across different tissue types. Therefore, researchers should be cautious when analysing methylation of CpG sites that show low methylation variance within the cell type of interest, regardless of platform or method used to assay methylation.
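The link between low per-site variance and poor between-array correlation can be illustrated with a small sketch. The numbers below are hypothetical toy data, not values from the study: the same small measurement noise is added to a site whose true methylation varies across samples and to a site that is nearly invariant, and the per-site correlation between the two simulated arrays is computed for each.

```python
from math import sqrt
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def measure(true_values, noise):
    """Simulate one array's readout as true beta value plus measurement noise."""
    return [t + e for t, e in zip(true_values, noise)]


# Hypothetical noise for the same five samples on two arrays.
noise_a = [0.02, -0.01, 0.01, -0.02, 0.00]
noise_b = [-0.01, 0.02, -0.02, 0.01, 0.01]

true_variable = [0.2, 0.4, 0.6, 0.8, 0.5]   # methylation differs between samples
true_invariant = [0.95] * 5                  # methylation is the same in every sample

sites = {
    "variable_site": (measure(true_variable, noise_a), measure(true_variable, noise_b)),
    "invariant_site": (measure(true_invariant, noise_a), measure(true_invariant, noise_b)),
}

# Per-site variance on array A and correlation between arrays A and B.
summary = {
    name: {"variance": pvariance(a), "r": pearson(a, b)}
    for name, (a, b) in sites.items()
}
for name, stats in summary.items():
    print(name, round(stats["variance"], 5), round(stats["r"], 2))
```

For the invariant site, the between-array correlation reflects only the correlation of the noise terms, which is why a low value there says little about which array is more accurate.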


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xiaolei Wang ◽  
Jin Huang ◽  
Yixiang Zheng ◽  
Sisi Long ◽  
Huijun Lin ◽  
...  

Abstract Genome-wide DNA methylation profiling has been used to find maternal CpG sites related to the occurrence of gestational diabetes mellitus (GDM). However, none of the differential sites found has been verified in a larger sample. Here, our aim was to evaluate whether first-trimester changes in target CpG sites in the peripheral blood of pregnant women predict subsequent development of GDM. This nested case–control study was based upon an early pregnancy follow-up cohort (ChiCTR1900020652). Target CpG sites were extracted from related published literature and bioinformatics analysis. The DNA methylation levels at 337 CpG sites of 80 GDM cases and 80 matched healthy controls during early pregnancy (10–15 weeks) were assessed using MethylTarget sequencing. The best cut-off level for methylation of each CpG site was determined using the generated ROC curve. The independent effect of CpG site methylation status on GDM was analyzed using conditional logistic regression. Methylation levels at 6 CpG sites were significantly higher in the GDM group than in controls, whereas those at another 6 CpG sites were significantly lower (FDR < 0.05). The area under the ROC curve at each methylation level of the significant CpG sites ranged between 0.593 and 0.650 for the occurrence of GDM. After adjusting for possible confounders, the hypermethylation status of CpG site 68167324 (OR = 3.168, 1.038–9.666) and 24837915 (OR = 5.232, 1.659–16.506) was identified as more strongly associated with GDM; meanwhile, the hypermethylation of CpG site 157130156 (OR = 0.361, 0.135–0.966) and 89438648 (OR = 0.206, 0.065–0.655) might indicate lower risk of GDM. The methylation status of target CpG sites in the peripheral blood of pregnant women during the first trimester may be associated with GDM pathogenesis, and has potential as a predictor of GDM.
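The abstract above derives a methylation cut-off and an area under the curve from an ROC analysis but does not say which criterion defined the "best" cut-off. The sketch below assumes Youden's J (sensitivity + specificity − 1), a common choice, and assumes that higher methylation indicates a case; both are assumptions for illustration, and the methylation values are hypothetical toy data.

```python
def auc(cases, controls):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case has a higher value
    than a randomly chosen control, counting ties as one half.
    """
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def youden_cutoff(cases, controls):
    """Return (threshold, J) maximising Youden's J.

    A sample is called positive when its value is >= threshold.
    Youden's J is an assumed criterion, not one stated in the abstract.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(cases) | set(controls)):
        sensitivity = sum(v >= t for v in cases) / len(cases)
        specificity = sum(v < t for v in controls) / len(controls)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical methylation levels at one CpG site.
cases = [0.62, 0.70, 0.55, 0.66, 0.48]
controls = [0.50, 0.45, 0.53, 0.40, 0.52]

area = auc(cases, controls)
threshold, j = youden_cutoff(cases, controls)
print(round(area, 2), threshold, round(j, 2))
```

Dichotomising each site at such a threshold yields the hyper- or hypomethylation status that a downstream regression model could take as input.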


2018 ◽  
Author(s):  
Mairead L Bermingham ◽  
Rosie M Walker ◽  
Riccardo E. Marioni ◽  
Stewart M Morris ◽  
Konrad Rawlik ◽  
...  

Abstract Background The causes of poor respiratory function and COPD are incompletely understood, but it is clear that genes and the environment play a role. As DNA methylation is under both genetic and environmental control, we hypothesised that investigation of differential methylation associated with these phenotypes would permit mechanistic insights, and improve prediction of COPD. We investigated genome-wide differential DNA methylation patterns using the recently released 850K Illumina EPIC array in the largest single population sample to date. Methods Epigenome-wide association studies (EWASs) of respiratory function and COPD were performed in peripheral blood samples from the Generation Scotland: Scottish Family Health Study (GS:SFHS) cohort (N=3,791; 274 COPD cases and 2,928 controls). In independent COPD incidence data (N=150), significantly differentially methylated sites (DMSs; p<3.6×10⁻⁸) were evaluated for their added predictive power when added to a model including clinical variables, age, sex, height and smoking history using receiver operating characteristic analysis. The Lothian Birth Cohort 1936 (LBC1936) was used to replicate association (N=895) and prediction (N=178) results. Findings We identified 29 respiratory function and/or COPD associated DMSs, which mapped to genes involved in alternative splicing, JAK-STAT signalling, and axon guidance. 
In prediction analyses, we observed significant improvement in discrimination between COPD cases and controls (p<0.05) in independent GS:SFHS (p=0.014) and LBC1936 (p=0.018) datasets by adding DMSs to a clinical model. Interpretation Identification of novel DMSs has provided insight into the molecular mechanisms regulating respiratory function and aided prediction of COPD risk. Funding Wellcome Trust Strategic Award 10436/Z/14/Z. Research in context. Evidence before this study: We searched for articles in PubMed published in English up to July 25, 2018, with the search terms “DNA methylation” and “respiratory function”, or “COPD”. We found some evidence for association between differential DNA methylation and both respiratory function and COPD. Of the twelve previous studies identified, eight used peripheral blood samples (sample size [N] range = 100-1,085) and four used lung tissue samples (N range = 24-160). The number of CpG loci analysed ranged from 27,578 to 485,512. These studies have not identified consistent changes in methylation, most likely due to a combination of factors including small sample sizes, technical issues, phenotypic definitions, and study design. In addition, no previous study has: analysed a sample from a large single cohort; used the recently released Illumina EPIC array (which assesses ~850,000 CpG loci); adjusted methylation data and phenotype for smoking history; or used both prevalent and incident COPD electronic health record data. Added value of this study: To our knowledge, this is the largest single cohort epigenome-wide association study (EWAS) of respiratory function and COPD to date (N=3,791). After applying stringent genome-wide significance criteria (P<3.6×10⁻⁸), we found that DNA methylation levels at 29 CpG sites in peripheral blood were associated with respiratory function or COPD. 
Of these 29, seven were testable in an independent population sample: all seven showed consistent direction of effect between the two samples and three showed replication (p<0.007 [0.05/7 CpG sites tested]). Our results suggest that adjustment of both the phenotypic and the DNA methylation probe data for smoking history, which has not been carried out in previous studies, reduces the confounding effects of smoking, identifies larger numbers of associations, and reduces the heterogeneity of effects across smoking strata. We used gene set enrichment and pathway analyses, together with an approach that combines DNA methylation results with gene expression data, to provide evidence for enrichment of differentially methylated sites in genes linked to alternative splicing, JAK-STAT signalling, and axon guidance. Finally, we demonstrated that the inclusion of DNA methylation data improves COPD risk prediction over established clinical variables alone in two independent datasets. Implications of all the available evidence: There is now accumulating evidence that DNA methylation in peripheral blood is associated with respiratory function and COPD. Our study has shown that DNA methylation levels at 29 CpG sites are robustly associated with respiratory function and COPD, provide mechanistic insights, and can improve prediction of COPD risk. Further studies are warranted to improve understanding of the aetiology of COPD and to assess the utility of DNA methylation profiling in the clinical management of this condition.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Qingqin S. Li ◽  
Aparna Vasanthakumar ◽  
Justin W. Davis ◽  
Kenneth B. Idler ◽  
Kwangsik Nho ◽  
...  

Abstract Background Identifying biomarkers associated with Alzheimer’s disease (AD) progression may enable patient enrichment and improve clinical trial designs. Epigenome-wide association studies have revealed correlations between DNA methylation at cytosine-phosphate-guanine (CpG) sites and AD pathology and diagnosis. Here, we report relationships between peripheral blood DNA methylation profiles measured using Infinium® MethylationEPIC BeadChip and AD progression in participants from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort. Results The rate of cognitive decline from initial DNA sampling visit to subsequent visits was estimated by the slopes of the modified Preclinical Alzheimer Cognitive Composite (mPACC; mPACCdigit and mPACCtrailsB) and Clinical Dementia Rating Scale Sum of Boxes (CDR-SB) plots using robust linear regression in cognitively normal (CN) participants and patients with mild cognitive impairment (MCI), respectively. In addition, diagnosis conversion status was assessed using a dichotomized endpoint. Two CpG sites were significantly associated with the slope of mPACC in CN participants (P < 5.79 × 10⁻⁸ [Bonferroni correction threshold]); cg00386386 was associated with the slope of mPACCdigit, and cg09422696 annotated to RP11-661A12.5 was associated with the slope of CDR-SB. No significant CpG sites associated with diagnosis conversion status were identified. Genes involved in cognition and learning were enriched. A total of 19, 13, and 5 differentially methylated regions (DMRs) associated with the slopes of mPACCtrailsB, mPACCdigit, and CDR-SB, respectively, were identified by both comb-p and DMRcate algorithms; these included DMRs annotated to HOXA4. Furthermore, 5 and 19 DMRs were associated with conversion status in CN and MCI participants, respectively. 
The most significant DMR was annotated to the AD-associated gene PM20D1 (chr1: 205,818,956 to 205,820,014 [13 probes], Sidak-corrected P = 7.74 × 10⁻²⁴), which was associated with both the slope of CDR-SB and the MCI conversion status. Conclusion Candidate CpG sites and regions in peripheral blood were identified as associated with the rate of cognitive decline in participants in the ADNI cohort. While we did not identify a single CpG site with sufficient clinical utility to be used by itself due to the observed effect size, a biosignature composed of DNA methylation changes may have utility as a prognostic biomarker for AD progression.


2018 ◽  
Author(s):  
T Battram ◽  
RC Richmond ◽  
L Baglietto ◽  
P Haycock ◽  
V Perduca ◽  
...  

Abstract DNA methylation changes in peripheral blood have been identified in relation to lung cancer risk. However, the causal nature of these associations remains to be fully elucidated. Meta-analysis of four epigenome-wide association studies (918 cases, 918 controls) revealed differential methylation at 16 CpG sites (FDR < 0.05) in relation to lung cancer risk. A two-sample Mendelian randomization analysis, using genetic instruments for methylation at 14 of the 16 CpG sites, and 29,863 cases and 55,586 controls from the TRICL-ILCCO lung cancer consortium, was performed to appraise the causal role of methylation at these sites on lung cancer. This approach provided little evidence that DNA methylation in peripheral blood at the 14 CpG sites plays a causal role in lung cancer development, including for cg05575921 (AHRR), where methylation is strongly associated with lung cancer risk. Further studies are needed to investigate the causal role played by DNA methylation in lung tissue.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Jochen Kruppa ◽  
Miriam Sieg ◽  
Gesa Richter ◽  
Anne Pohrt

Abstract Background In DNA methylation analyses like epigenome-wide association studies, effects in differentially methylated CpG sites are assessed. Two kinds of outcomes can be used for statistical analysis: Beta-values and M-values. M-values follow a normal distribution and help to detect differentially methylated CpG sites. As biological effect measures, differences of M-values are more or less meaningless. Beta-values are of more interest since they can be interpreted directly as differences in percentage of DNA methylation at a given CpG site, but they have poor statistical properties. Different frameworks are proposed for reporting estimands in DNA methylation analysis, relying on Beta-values, M-values, or both. Results We present and discuss four possible approaches to obtaining estimands in DNA methylation analysis. In addition, we present the usage of M-values or Beta-values in the context of bioinformatics pipelines, which often demand a predefined outcome. We show how differences in M-values depend on differences in Beta-values in two data simulations: an analysis with and an analysis without a confounder effect. When no confounder effects are present, M-values can be used for the statistical analysis and Beta-value statistics for the reporting. If confounder effects exist, we demonstrate the deviations and correct the effects by the intercept method. Finally, we demonstrate the theoretical problem on two large human genome-wide DNA methylation datasets to verify the results. Conclusions The usage of M-values in the analysis of DNA methylation data will produce effect estimates that cannot be biologically interpreted. The parallel usage of Beta-value statistics ignores possible confounder effects and therefore cannot be recommended. Hence, if the differences in Beta-values are the focus of the study, the intercept method is recommended. Hyper- or hypomethylated CpG sites must then be carefully evaluated. 
If an exploratory analysis of possible CpG sites is the aim of the study, M-values can be used for inference.
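The difficulty of interpreting M-value differences can be seen from the usual logit-style conversion, M = log2(β / (1 − β)), which is the standard definition but is not spelled out in the abstract. The sketch below is only an illustration with hypothetical helper names; it shows that the same one-unit shift in M corresponds to very different shifts in Beta depending on the baseline methylation level. It does not implement the intercept method discussed in the abstract.

```python
from math import log2


def beta_to_m(beta):
    """Convert a Beta-value in the open interval (0, 1) to an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Convert an M-value back to a Beta-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# The same shift of one M-value unit, starting from two different baselines.
delta_mid = m_to_beta(1.0) - m_to_beta(0.0)    # from beta = 0.5
delta_high = m_to_beta(4.0) - m_to_beta(3.0)   # from beta = 8/9

print(round(delta_mid, 4), round(delta_high, 4))
```

Because the mapping is nonlinear, an effect estimated on the M scale has no single equivalent in percentage methylation, which is consistent with the abstract's caution about reporting M-value differences as biological effect sizes.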


Author(s):  
Annelie Angerfors ◽  
Martina Olsson Lindvall ◽  
Björn Andersson ◽  
Staffan Nilsson ◽  
Marcela Davila Lopez ◽  
...  

Abstract DNA methylation has become increasingly recognized in the etiology of complex diseases, including thrombotic disorders. Blood is often collected in epidemiological studies for genotyping and has recently also been used to examine DNA methylation in epigenome-wide association studies. DNA methylation patterns are often tissue-specific; thus, peripheral blood may not accurately reflect the methylation pattern in the tissue of relevance. Here, we collected paired liver and blood samples concurrently from 27 individuals undergoing liver surgery. We performed targeted bisulfite sequencing for a set of 35 hemostatic genes primarily expressed in liver to analyze DNA methylation levels of >10,000 cytosine-phosphate-guanine (CpG) dinucleotides. We evaluated whether DNA methylation in blood could serve as a proxy for DNA methylation in liver at individual CpGs. Approximately 30% of CpGs were nonvariable and were predominantly hypo- (<25%) or hypermethylated (>70%) in both tissues. While blood can serve as a proxy for liver at these CpGs, the low variability renders these unlikely to explain phenotypic differences. We therefore focused on CpG sites with variable methylation levels in liver. The level of blood–liver tissue correlation varied widely across these variable CpGs; moderate correlations (0.5 ≤ r < 0.75) were detected for 6% and strong correlations (r ≥ 0.75) for a further 4%. Our findings indicate that it is essential to study the concordance of DNA methylation between blood and liver at individual CpGs. This paired blood–liver dataset is intended as a resource to aid interpretation of blood-based DNA methylation results.


Genes ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 870
Author(s):  
Jiansheng Zhang ◽  
Hongli Fu ◽  
Yan Xu

In recent years, scientists have found a close correlation between DNA methylation and aging. Building on in-depth research in the field of DNA methylation, researchers have established quantitative statistical relationships to predict an individual's age. This work used human blood tissue samples to study the association between age and DNA methylation. We built two predictors based on healthy and disease data, respectively. For the healthy data, we retrieved a total of 1191 samples from four previous reports. By calculating the Pearson correlation coefficient between age and DNA methylation values, 111 age-related CpG sites were selected. Gradient boosting regression was used to build the predictive model and achieved an R² of 0.86 and a MAD of 3.90 years on the testing dataset, which were better than those of four other regression methods as well as Horvath’s results. For the disease data, 354 rheumatoid arthritis samples were retrieved from a previous study. Then, 45 CpG sites were selected to build the predictor, and the corresponding MAD and R² were 3.11 years and 0.89 on the testing dataset, respectively, which showed the robustness of our predictor. Our results were better than those from four other regression methods. Finally, we also analyzed the twenty-four CpG sites common to both the healthy and disease datasets, which illustrated the functional relevance of the selected CpG sites.

