An analytical pipeline for DNA Methylation Array Biomarker Studies

2021 ◽  
Author(s):  
Jennifer Lu ◽  
Darren Korbie ◽  
Matt Trau

DNA methylation is one of the most commonly studied epigenetic biomarkers, owing to its role in disease and development. The Illumina Infinium methylation arrays remain the most common method for interrogating methylation across the human genome, because they can screen over 480,000 loci simultaneously. Accordingly, initiatives such as The Cancer Genome Atlas (TCGA) have used this technology to examine the methylation profiles of over 20,000 cancer samples. There is a growing body of methods for pre-processing, normalisation and analysis of array-based DNA methylation data. However, the shape and sampling distribution of probe-wise methylation, which could influence how the data should be examined, have rarely been discussed. Therefore, this article introduces a pipeline that predicts the shape and distribution of normalised methylation patterns before selecting the most appropriate inferential statistics to screen for differential methylation. Additionally, we put forward an alternative pipeline that employs feature selection, and demonstrate its ability to select biomarkers with outstanding differences in methylation without requiring the shape or distribution of the data to be determined in advance. Availability: The Distribution test and the feature selection pipelines are available for download at: https://github.com/uqjlu8/DistributionTest Keywords: DNA methylation, Biomarkers, Cancers, Data Distribution, TCGA, 450K
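The idea of checking the shape of probe-wise methylation before choosing an inferential test can be illustrated with a minimal sketch. This is not the authors' Distribution test: the abstract does not specify how shape is assessed, so the moment-based skewness check, the `max_abs_skew` threshold and all function names below are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def skewness(xs):
    """Moment-based (biased) sample skewness; 0.0 for constant data."""
    m = mean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for sample a (ties count as 0.5)."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


def choose_test(a, b, max_abs_skew=1.0):
    """Pick a test from a crude shape check.

    max_abs_skew is a hypothetical cut-off, not a value from the article.
    Roughly symmetric groups get a parametric statistic; otherwise a
    rank-based statistic is used.
    """
    if abs(skewness(a)) <= max_abs_skew and abs(skewness(b)) <= max_abs_skew:
        return "welch_t", welch_t(a, b)
    return "mann_whitney_u", mann_whitney_u(a, b)
```

Here `a` and `b` would be the beta values of one probe in two sample groups. Probes with no variation in both groups would need to be filtered out beforehand, since Welch's statistic is undefined for them.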

2021 ◽  
Vol 12 ◽  
Author(s):  
Zhengang Hu ◽  
Hao Zhang ◽  
Fan Fan ◽  
Zeyu Wang ◽  
Jiahao Xu ◽  
...  

DNA methylation patterns are essential in understanding carcinogenesis. However, the relationship between DNA methylation and the immune process has not been clearly established. This study aimed to elucidate the interaction between glioma and DNA methylation, consolidating glioma classification and prognosis. A total of 2,483 immune-related genes and 24,556 corresponding immune-related methylation probes were identified. From the Cancer Genome Atlas (TCGA) glioma cohort, a total of 683 methylation samples were stratified into two different clusters using unsupervised clustering, and eight types of other cancer samples from the TCGA database were shown to exhibit excellent distributions. A total of 3,562 differentially methylated probes (DMPs) were selected and used for machine learning. A five-probe signature was established to evaluate the prognosis of glioma as well as the potential benefits of radiotherapy and Procarbazine, CCNU, Vincristine (PCV) treatment. Other prognostic clinical models, such as nomogram and decision tree, were also evaluated. Our findings confirmed the interactions between immune-related methylation patterns and glioma. This novel approach for cancer molecular characterization and prognosis should be validated in further studies.


Epigenomics ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 599-612
Author(s):  
Jie Wu ◽  
Deng Lin ◽  
Liandi Jiu ◽  
Qi Liu ◽  
Zhenglong Gu ◽  
...  

Aim: To explore the mechanism of cancer by employing a comprehensive analysis of DNA methylation patterns and variations among pan-cancer cohorts. Materials & methods: This research focused on the discovery of universally specific or common biomarkers by mathematical statistics and machine learning methods in The Cancer Genome Atlas. Results: We found 138 differently methylated CpGs (DMCs) with a common methylation trend and eight common differently methylated regions in different cancer cohorts. Additionally, we found 99 DMCs to distinguish 32 different cancer cohorts in random forest analysis because of the specificity mechanism, but each DMC still had high instability. Conclusion: Our results could facilitate the development of biomarkers that are universally specific and common features across pan-cancer cohorts.


2020 ◽  
Vol 21 (S6) ◽  
Author(s):  
Xinyu Hu ◽  
Li Tang ◽  
Linconghua Wang ◽  
Fang-Xiang Wu ◽  
Min Li

Background: DNA methylation in the human genome is acknowledged to be widely associated with biological processes and complex diseases. The Illumina Infinium methylation arrays have been recognized as one of the most efficient and universal technologies for investigating genome-wide changes in methylation patterns. As methylation arrays may remain the dominant method for detecting methylation in the foreseeable future, it is crucial to develop a reliable workflow to analyze methylation array data. Results: In this study, we develop a web service, MADA, for the whole process of methylation array data analysis, which includes the steps of a comprehensive differential methylation analysis pipeline: pre-processing (data loading, quality control, data filtering, and normalization), batch effect correction, differential methylation analysis, and downstream analysis. In addition, we provide visualization of pre-processing, differentially methylated probes or regions, gene ontology, pathway and cluster analysis results. Moreover, MADA provides a customization function that lets users define their own workflow. Conclusions: With the analysis of two case studies, we have shown that MADA can complete the whole procedure of methylation array data analysis. MADA provides a graphical user interface and enables users with no computational skills and limited bioinformatics background to carry out complicated methylation array data analysis. The web server is available at: http://120.24.94.89:8080/MADA
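To make the ordering of such a workflow concrete, the sketch below chains two of the listed stages, data filtering and a naive differential methylation step, on a toy table of beta values. It is not MADA's implementation: normalization and batch effect correction are deliberately omitted, and the function names, the `"case"`/`"control"` labels, the probe IDs and the `min_abs_delta` threshold are all hypothetical.

```python
from statistics import mean


def filter_probes(betas, max_missing=0):
    """Keep probes with at most max_missing missing (None) values."""
    return {
        probe: values
        for probe, values in betas.items()
        if sum(v is None for v in values) <= max_missing
    }


def delta_beta(values, groups):
    """Mean beta in 'case' minus mean beta in 'control', skipping None."""
    case = [v for v, g in zip(values, groups) if g == "case" and v is not None]
    ctrl = [v for v, g in zip(values, groups) if g == "control" and v is not None]
    return mean(case) - mean(ctrl)


def rank_probes(betas, groups, min_abs_delta=0.2):
    """Return (probe, delta) pairs with |delta| >= threshold, largest first.

    min_abs_delta is an illustrative cut-off, not a recommended value.
    """
    kept = filter_probes(betas)
    deltas = {probe: delta_beta(values, groups) for probe, values in kept.items()}
    hits = [(p, d) for p, d in deltas.items() if abs(d) >= min_abs_delta]
    return sorted(hits, key=lambda pair: abs(pair[1]), reverse=True)
```

With `groups = ["case", "case", "control", "control"]`, a probe with beta values `[0.8, 0.9, 0.2, 0.3]` would be reported with a difference of about 0.6, while a probe containing a missing value would be dropped at the filtering stage. A real analysis would replace the mean difference with a proper statistical test and multiple-testing correction.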


2021 ◽  
Author(s):  
Dylane Detilleux ◽  
Yannick G Spill ◽  
Delphine Balaramane ◽  
Michaël Weber ◽  
Anaïs Flore Bardet

Aberrant DNA methylation has emerged as a hallmark of cancer cells and profiling their epigenetic landscape has widely been carried out in many types of cancer. However, the mechanisms underlying changes in DNA methylation remain elusive. Transcription factors, initially thought to be repressed from binding by DNA methylation, have recently emerged as potential drivers of DNA methylation patterns. Here we perform a rigorous bioinformatic analysis integrating the massive amount of data available from The Cancer Genome Atlas to identify transcription factors driving aberrant DNA methylation. We predict TFs known to be involved in cancer as well as novel candidates to drive hypo-methylated regions such as FOXA1 and GATA3 in breast cancer, FOXA1 and TWIST1 in prostate cancer and NFE2L2 in lung cancer. We also predict TFs that lead to hyper-methylated regions upon TF loss such as EGR1 in several cancer types. Finally, we validate experimentally that FOXA1 and GATA3 mediate hypo-methylated regions in breast cancer cells. Our work shows the importance of TFs as upstream regulators shaping DNA methylation patterns in cancer.


2019 ◽  
Vol 48 (D1) ◽  
pp. D890-D895 ◽  
Author(s):  
Zhuang Xiong ◽  
Mengwei Li ◽  
Fei Yang ◽  
Yingke Ma ◽  
Jian Sang ◽  
...  

Epigenome-Wide Association Study (EWAS) has become an effective strategy to explore the epigenetic basis of complex traits. Over the past decade, a large amount of epigenetic data, especially data sourced from DNA methylation arrays, has accumulated as the result of numerous EWAS projects. We present EWAS Data Hub (https://bigd.big.ac.cn/ewas/datahub), a resource for collecting and normalizing DNA methylation array data as well as archiving associated metadata. The current release of EWAS Data Hub integrates a comprehensive collection of DNA methylation array data from 75 344 samples and employs an effective normalization method to remove batch effects among different datasets. Accordingly, taking advantage of both massive high-quality DNA methylation data and standardized metadata, EWAS Data Hub provides reference DNA methylation profiles under different contexts, involving 81 tissues/cell types (including 25 brain parts and 25 blood cell types), six ancestry categories, and 67 diseases (including 39 cancers). In summary, EWAS Data Hub bears great promise to aid the retrieval and discovery of methylation-based biomarkers for phenotype characterization, clinical treatment and health care.


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii400-iii401
Author(s):  
Kuo-Sheng Wu ◽  
Tai-Tong Wong

Background: Medulloblastoma (MB) has been classified into 4 molecular subgroups, WNT, SHH, group 3 (G3), and group 4 (G4), with demographic and clinical differences. In 2017, heterogeneity within MB was proposed, with 12 subtypes having distinct molecular and clinical characteristics. Patients and methods: We retrieved 52 MBs in children to perform RNA-Seq and DNA methylation array. Subtype cluster analysis was performed by the similarity network fusion (SNF) method. With clinical results and molecular profiles, the characteristics including age, gender, histological variants, tumor location, metastasis status, survival, cytogenetic and genetic aberrations among MB subtypes were identified. Results: In this cohort series, 52 childhood MBs were classified into 11 subtypes by SNF cluster analysis. WNT tumors showed no metastasis and a 100% survival rate. All WNT tumors were located on the midline in the 4th ventricle. Monosomy 6 was present in the WNT α subtype, but not in the β subtype. SHH α and β occurred in children, while SHH γ occurred in infants. Among SHH tumors, the α subtype showed the worst outcome. G3 γ showed the highest metastatic rate and worst survival, associated with MYC amplification. G4 α had the highest metastatic rate; however, G4 γ showed the worst survival. Conclusion: We identified molecular subgroups and subtypes of MBs based on gene expression and DNA methylation profiles in children in our cohort series. The results may contribute to the establishment of nation-wide correlated optimal diagnosis and treatment strategies for MBs in infants and children.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Yu Kong ◽  
Christopher M. Rose ◽  
Ashley A. Cass ◽  
Alexander G. Williams ◽  
Martine Darwish ◽  
...  

Profound global loss of DNA methylation is a hallmark of many cancers. One potential consequence of this is the reactivation of transposable elements (TEs) which could stimulate the immune system via cell-intrinsic antiviral responses. Here, we develop REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observe increased expression of over 400 TE subfamilies, of which 262 appear to result from a proximal loss of DNA methylation. The most recurrent TEs are among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent results in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing inflammation and the display of potentially immunogenic neoantigens.


Author(s):  
Marina Bibikova ◽  
Bret Barnes ◽  
Chan Tsan ◽  
Vincent Ho ◽  
Brandy Klotzle ◽  
...  
