Prediction Of Histone Post-Translational Modifications Using Deep Learning

Author(s):  
Dipankar Ranjan Baisya ◽  
Stefano Lonardi

Motivation: Histone post-translational modifications (PTMs) are involved in a variety of essential regulatory processes in the cell, including transcription control. Recent studies have shown that histone PTMs can be accurately predicted from transcription factor binding or DNase hypersensitivity data. Similarly, it has been shown that PTMs can be predicted from the underlying primary DNA sequence. Results: In this study, we introduce a deep learning architecture called DeepPTM for predicting histone PTMs from transcription factor binding data and the primary DNA sequence. Extensive experimental results show that our deep learning model achieves higher prediction accuracy than the model proposed in Benveniste et al. (PNAS 2014) and DeepHistone (BMC Genomics 2019). The competitive advantage of our framework lies in the synergistic use of deep learning combined with an effective pre-processing step. Our classification framework has also enabled the discovery that a small subset of transcription factors (which are histone-PTM and cell-type specific) can provide almost the same prediction accuracy as the full set of transcription factor data. Availability: https://github.com/dDipankar/DeepPTM

2018 ◽  
Author(s):  
Mehran Karimzadeh ◽  
Michael M. Hoffman

Motivation: Identifying transcription factor binding sites is the first step in pinpointing non-coding mutations that disrupt the regulatory function of transcription factors and promote disease. ChIP-seq is the most common method for identifying binding sites, but performing it on patient samples is hampered by the amount of available biological material and the cost of the experiment. Existing methods for computational prediction of regulatory elements primarily predict binding in genomic regions with sequence similarity to known transcription factor sequence preferences. This has limited efficacy since most binding sites do not resemble known transcription factor sequence motifs, and many transcription factors are not even sequence-specific. Results: We developed Virtual ChIP-seq, which predicts binding of individual transcription factors in new cell types using an artificial neural network that integrates ChIP-seq results from other cell types and chromatin accessibility data in the new cell type. Virtual ChIP-seq also uses learned associations between gene expression and transcription factor binding at specific genomic regions. This approach outperforms methods that predict TF binding solely based on sequence preference, predicting binding for 36 transcription factors (Matthews correlation coefficient > 0.3). Availability: The datasets we used for training and validation are available at https://virchip.hoffmanlab.org. We have deposited in Zenodo the current version of our software (http://doi.org/10.5281/zenodo.1066928), datasets (http://doi.org/10.5281/zenodo.823297), predictions for 36 transcription factors on Roadmap Epigenomics cell types (http://doi.org/10.5281/zenodo.1455759), and predictions in Cistrome as well as ENCODE-DREAM in vivo TF Binding Site Prediction Challenge (http://doi.org/10.5281/zenodo.1209308).
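For readers unfamiliar with the Matthews correlation coefficient (MCC) threshold quoted above, the following is a minimal sketch of how MCC can be computed from binary binding labels. It is not taken from the Virtual ChIP-seq software; the function names and the toy labels are assumptions made purely for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = bound, 0 = unbound)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels (hypothetical, not real binding data).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(round(matthews_corrcoef(y_true, y_pred), 3))  # 0.577
```

MCC ranges from -1 to 1 and stays informative when bound regions are far rarer than unbound ones, which is the usual situation for genome-wide binding prediction.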


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 219256-219274
Author(s):  
Yuanqi Zeng ◽  
Meiqin Gong ◽  
Meng Lin ◽  
Dongrui Gao ◽  
Yongqing Zhang

2016 ◽  
Vol 2016 ◽  
pp. 1-27 ◽  
Author(s):  
Kristopher J. L. Irizarry ◽  
Randall L. Bryden

Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1) that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.
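The abstract does not say how the binding-site clusters were called. As a rough illustration of the idea of grouping predicted sites that fall within a ~200-nucleotide interval of a 1500-nucleotide upstream region, here is a minimal sketch using a greedy left-to-right grouping. The function name, the `min_sites` threshold, the placeholder factor names, and the positions are all hypothetical, and cross-species conservation is not modeled.

```python
def cluster_sites(sites, max_span=200, min_sites=2):
    """Group binding sites into clusters spanning at most `max_span` nt.

    `sites` is a list of (position, factor_name) tuples, with positions
    measured within the upstream region. Greedy strategy (an assumption,
    not the authors' method): start a cluster at the leftmost unused site
    and absorb following sites while they lie within `max_span` of it.
    Groups with fewer than `min_sites` sites are discarded.
    """
    ordered = sorted(sites)
    clusters = []
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][0] - ordered[i][0] <= max_span:
            j += 1
        group = ordered[i:j + 1]
        if len(group) >= min_sites:
            clusters.append(group)
        i = j + 1
    return clusters


# Made-up positions in a 1500-nt upstream region (placeholder factor names).
toy_sites = [
    (100, "TF_A"), (180, "TF_B"), (290, "TF_C"),
    (900, "TF_D"),
    (1400, "TF_E"), (1450, "TF_A"),
]
for cluster in cluster_sites(toy_sites):
    print(cluster[0][0], cluster[-1][0], [name for _, name in cluster])
```

On this toy input the isolated site at position 900 is dropped, and two clusters remain, one spanning positions 100 to 290 and one spanning 1400 to 1450.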


Cells ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 1435
Author(s):  
Yu-Chin Lien ◽  
Paul Zhiping Wang ◽  
Xueqing Maggie Lu ◽  
Rebecca A. Simmons

Intrauterine growth retardation (IUGR), which induces epigenetic modifications and permanent changes in gene expression, has been associated with the development of type 2 diabetes. Using a rat model of IUGR, we performed ChIP-seq to identify and map genome-wide histone modifications and gene dysregulation in islets from 2- and 10-week-old rats. IUGR induced significant changes in the enrichment of H3K4me3, H3K27me3, and H3K27Ac marks in both 2-week and 10-week islets, which were correlated with expression changes of multiple genes critical for islet function in IUGR islets. ChIP-seq analysis showed that IUGR-induced histone mark changes were enriched at critical transcription factor binding motifs, such as C/EBPs, Ets1, Bcl6, Thrb, Ebf1, Sox9, and Mitf. These transcription factors were also identified as top upstream regulators in our previously published transcriptome study. In addition, our ChIP-seq data revealed more than 1000 potential bivalent genes, identified by enrichment of both H3K4me3 and H3K27me3. The poised state of many potential bivalent genes was altered by IUGR, particularly Acod1, Fgf21, Serpina11, Cdh16, Lrrc27, and Lrrc66, key islet genes. Collectively, our findings suggest alterations of histone modification in key transcription factors and genes that may contribute to long-term gene dysregulation and an abnormal islet phenotype in IUGR rats.
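The bivalency criterion mentioned above (enrichment of both H3K4me3 and H3K27me3 at the same gene) amounts to a set intersection once each mark has been reduced to a list of enriched genes. The sketch below assumes that reduction has already been done; the function name and the placeholder gene identifiers are hypothetical and do not come from the study.

```python
def bivalent_genes(h3k4me3_genes, h3k27me3_genes):
    """Return genes enriched for both marks, sorted for stable output."""
    return sorted(set(h3k4me3_genes) & set(h3k27me3_genes))


# Placeholder gene identifiers (not real data).
h3k4me3 = ["gene1", "gene2", "gene3", "gene5"]
h3k27me3 = ["gene2", "gene4", "gene5", "gene6"]
print(bivalent_genes(h3k4me3, h3k27me3))  # ['gene2', 'gene5']
```

In practice the per-mark gene lists would come from peak calling and peak-to-gene assignment, neither of which is shown here.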


2021 ◽  
Vol 49 (17) ◽  
pp. 9809-9820
Author(s):  
Wakana Koda ◽  
Satoshi Senmatsu ◽  
Takuya Abe ◽  
Charles S Hoffman ◽  
Kouji Hirota

Transcriptional regulation, a pivotal biological process by which cells adapt to environmental fluctuations, is achieved by the binding of transcription factors to target sequences in a sequence-specific manner. However, how transcription factors recognize the correct target from amongst the numerous candidates in a genome has not been fully elucidated. Here we show that, in the fission-yeast fbp1 gene, when transcription factors bind to target sequences in close proximity, their binding is reciprocally stabilized, thereby integrating distinct signal transduction pathways. The fbp1 gene is massively induced upon glucose starvation by the activation of two transcription factors, Atf1 and Rst2, mediated via distinct signal transduction pathways. Atf1 and Rst2 bind to the upstream-activating sequence 1 region, carrying two binding sites located 45 bp apart. Their binding is reciprocally stabilized due to the close proximity of the two target sites, which destabilizes the independent binding of Atf1 or Rst2. Tup11/12 (Tup-family co-repressors) suppress independent binding. These data demonstrate a previously unappreciated mechanism by which two transcription factor binding sites, in close proximity, integrate two independent signal pathways, thereby behaving as a hub for signal integration.


2020 ◽  
Author(s):  
Hye Kyung Lee ◽  
Chengyu Liu ◽  
Lothar Hennighausen

Enhancers are transcription factor platforms that synergize with promoters to activate gene expression up to several-thousand-fold. While genome-wide structural studies are used to predict enhancers, the in vivo significance is less clear. Specifically, the biological importance of individual transcription factors within enhancer complexes remains to be understood. Here we investigate the structural and biological importance of individual transcription factor binding sites and redundancy among transcription components within a complex enhancer in vivo. The Csn1s2b gene is expressed exclusively in mammary tissue and activated several thousand-fold during pregnancy and lactation. Using ChIP-seq we identified a complex lactation-specific candidate enhancer that binds multiple transcription factors and coincides with activating histone marks. Using experimental mouse genetics, we determined that deletion of canonical binding motifs for the transcription factors NFIB and STAT5, individually and combined, had a limited biological impact. Loss of these sites led to a shift of transcription factor binding to juxtaposed sites, suggesting exceptional plasticity that does not require direct protein-DNA interactions. Additional deletions revealed the critical importance of a non-canonical STAT5 binding site for enhancer activity. Our data also suggest that enhancer RNAs are not required for the activity of this specific enhancer. While ChIP-seq experiments predicted an additional candidate intronic enhancer, its deletion did not adversely affect gene expression, emphasizing the limited biological information provided by structural data. Our study provides comprehensive insight into the anatomy and biology of a composite mammary enhancer that activates its target gene several hundred-fold during lactation.

