Mapping DNA sequence to transcription factor binding energy in vivo

2018 ◽  
Author(s):  
Stephanie L. Barnes ◽  
Nathan M. Belliveau ◽  
William T. Ireland ◽  
Justin B. Kinney ◽  
Rob Phillips

Abstract: Despite the central importance of transcriptional regulation in systems biology, it has proven difficult to determine the regulatory mechanisms of individual genes, let alone entire gene networks. It is particularly difficult to analyze a promoter sequence and identify the locations, regulatory roles, and energetic properties of binding sites for transcription factors and RNA polymerase. In this work, we present a strategy for interpreting transcriptional regulatory sequences using in vivo methods (i.e. the massively parallel reporter assay Sort-Seq) to formulate quantitative models that map a transcription factor binding site’s DNA sequence to transcription factor-DNA binding energy. We use these models to predict the binding energies of transcription factor binding sites to within 1 kBT of their measured values. We further explore how such a sequence-energy mapping relates to the mechanisms of transcriptional regulation in various promoter contexts. Specifically, we show that our models can be used to design specific induction responses, analyze the effects of amino acid mutations on DNA sequence preference, and determine how regulatory context affects a transcription factor’s sequence specificity.
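One simple way to picture a sequence-to-energy mapping is an additive model in which each position of the binding site contributes independently to the total energy. The sketch below assumes that form; the abstract does not specify the model, and the function name `binding_energy`, the 4-bp `toy_matrix`, and all numeric values are hypothetical and chosen purely for illustration.

```python
BASES = "ACGT"  # column order assumed for every matrix row

def binding_energy(seq, matrix, offset=0.0):
    """Return the energy of `seq` under an additive (per-position) model.

    `matrix[i][j]` is the assumed energy contribution (in units of k_B T)
    of base BASES[j] at position i; `offset` is the assumed energy of the
    reference sequence.
    """
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match matrix length")
    return offset + sum(row[BASES.index(base)] for base, row in zip(seq, matrix))

# Invented values: the reference site ACGT has zero penalty at every position.
toy_matrix = [
    [0.0, 1.2, 0.8, 1.5],
    [1.0, 0.0, 1.3, 0.7],
    [0.9, 1.1, 0.0, 1.4],
    [1.6, 0.5, 1.0, 0.0],
]

reference = binding_energy("ACGT", toy_matrix, offset=-15.0)
mutant = binding_energy("ACGA", toy_matrix, offset=-15.0)
```

Under these made-up numbers, a single mismatch at the last position weakens binding by the corresponding matrix entry, which is the kind of prediction that could be compared against a measured energy.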

2019 ◽  
Vol 15 (2) ◽  
pp. e1006226 ◽  
Author(s):  
Stephanie L. Barnes ◽  
Nathan M. Belliveau ◽  
William T. Ireland ◽  
Justin B. Kinney ◽  
Rob Phillips

2015 ◽  
Vol 9S4 ◽  
pp. BBI.S29330
Author(s):  
Stephen A. Ramsey

A Bayesian method for sampling from the distribution of matches to a precompiled transcription factor binding site (TFBS) sequence pattern (conditioned on an observed nucleotide sequence and the sequence pattern) is described. The method takes a position frequency matrix as input for a set of representative binding sites for a transcription factor and two sets of noncoding, 5’ regulatory sequences for gene sets that are to be compared. An empirical prior on the frequency λ (per base pair of gene-vicinal, noncoding DNA) of TFBSs is developed using data from the ENCODE project and incorporated into the method. In addition, a probabilistic model for binding site occurrences conditioned on λ is developed analytically, taking into account the finite-width effects of binding sites. The count of TFBSs, β (conditioned on the observed sequence), is sampled using Metropolis-Hastings with an information entropy-based move generator. The derivation of the method is presented in a step-by-step fashion, starting from specific conditional independence assumptions. Empirical results show that the newly proposed prior on λ improves accuracy for estimating the number of TFBSs within a set of promoter sequences.
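To illustrate the general idea of sampling a site count with Metropolis-Hastings, here is a minimal sketch. It is not the method described above: the entropy-based move generator is replaced by a plain symmetric ±1 random-walk proposal, and the target is assumed to be a Poisson distribution with a fixed, hypothetical rate rather than the posterior derived in the paper. The names `log_poisson` and `sample_counts` are invented for this example.

```python
import math
import random

def log_poisson(n, rate):
    """Log of the Poisson pmf at n; -inf for negative counts."""
    if n < 0:
        return float("-inf")
    return n * math.log(rate) - rate - math.lgamma(n + 1)

def sample_counts(rate, steps, seed=0):
    """Random-walk Metropolis-Hastings over non-negative integer counts.

    Assumption: the target is Poisson(rate), a stand-in for a real
    posterior over the number of binding sites.
    """
    rng = random.Random(seed)
    n = 0
    samples = []
    for _ in range(steps):
        proposal = n + rng.choice((-1, 1))  # symmetric proposal
        log_ratio = log_poisson(proposal, rate) - log_poisson(n, rate)
        # Accept with probability min(1, target(proposal) / target(n)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            n = proposal
        samples.append(n)
    return samples

samples = sample_counts(rate=5.0, steps=20000, seed=1)
mean_count = sum(samples) / len(samples)
```

Because the proposal is symmetric, the acceptance rule reduces to the ratio of target probabilities; a non-symmetric move generator such as the entropy-based one would also need the proposal ratio in that expression.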


2008 ◽  
Vol 2008 ◽  
pp. 1-9 ◽  
Author(s):  
J. Sunil Rao ◽  
Suresh Karanam ◽  
Colleen D. McCabe ◽  
Carlos S. Moreno

Background. The computational identification of functional transcription factor binding sites (TFBSs) remains a major challenge of computational biology. Results. We have analyzed the conserved promoter sequences for the complete set of human RefSeq genes using our conserved transcription factor binding site (CONFAC) software. CONFAC identified 16296 human-mouse ortholog gene pairs, and of those pairs, 9107 genes contained conserved TFBSs in the 3 kb proximal promoter and first intron. To attempt to predict in vivo occupancy of transcription factor binding sites, we developed a novel marginal effect isolator algorithm that builds upon Bayesian methods for multigroup TFBS filtering and predicted the in vivo occupancy of two transcription factors with an overall accuracy of 84%. Conclusion. Our analyses show that integration of chromatin immunoprecipitation data with conserved TFBS analysis can be used to generate accurate predictions of functional TFBSs. They also show that TFBS co-occurrence can be used to predict transcription factor binding to promoters in vivo.


2020 ◽  
Author(s):  
Jiayue-Clara Jiang ◽  
Joseph Rothnagel ◽  
Kyle Upton

Abstract: Transposons, a type of repetitive DNA element, can contribute cis-regulatory sequences and regulate the expression of human genes. L1PA2 is a hominoid-specific subfamily of LINE1 transposons, with approximately 4,940 copies in the human genome. Individual transposons have been demonstrated to contribute specific biological functions, such as cancer-specific alternate promoter activity for the MET oncogene, which is correlated with enhanced malignancy and poor prognosis in cancer. Given the sequence similarity between L1PA2 elements, we hypothesise that transposons within the L1PA2 subfamily likely have a common regulatory potential and may provide a mechanism for global genome regulation. Here we show that in breast cancer, the regulatory potential of L1PA2 is not limited to single transposons, but is common within the subfamily. We demonstrate that the L1PA2 subfamily is an abundant reservoir of transcription factor binding sites, the majority of which cluster in the LINE1 5’UTR. In MCF7 breast cancer cells, over 27% of L1PA2 transposons harbour binding sites of functionally interacting, cancer-associated transcription factors. The ubiquitous and replicative nature of L1PA2 transposons makes them an exemplary vector to disperse co-localised transcription factor binding sites, facilitating the co-ordinated regulation of genes. In MCF7 cells, L1PA2 transposons also supply transcription start sites to up-regulated transcripts. These transcriptionally active L1PA2 transposons display a cancer-specific active epigenetic profile, and likely play an oncogenic role in breast cancer aetiology. Overall, we show that the L1PA2 subfamily contributes abundant regulatory sequences in breast cancer cells, and likely plays a global role in modulating the tumorigenic state in breast cancer.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Lianggang Huang ◽  
Xuejie Li ◽  
Liangbo Dong ◽  
Bin Wang ◽  
Li Pan

Abstract
Background: The identification of open chromatin regions and transcription factor binding sites (TFBs) is an important step in understanding the regulation of gene expression in diverse species. ATAC-seq is a technique used for this purpose, providing high-resolution measurements of chromatin accessibility revealed through integration of Tn5 transposase. However, the existence of cell walls in filamentous fungi and the associated difficulty in purifying nuclei have precluded the routine application of this technique, leading to a lack of experimentally determined and computationally inferred data on the identity of genome-wide cis-regulatory elements (CREs) and TFBs. In this study, we constructed an ATAC-seq platform suitable for filamentous fungi and generated ATAC-seq libraries of Aspergillus niger and Aspergillus oryzae grown under a variety of conditions.
Results: We applied the ATAC-seq assay for filamentous fungi to delineate the syntenic orthologue and differentially changed chromatin accessibility regions among different Aspergillus species, during different culture conditions, and among specific TF-deleted strains. The syntenic orthologues of accessible regions were responsible for the conservative functions across Aspergillus species, while regions differentially changed between culture conditions and TF mutants drove differential gene expression programs. Importantly, we suggest criteria to determine TFBs through the analysis of unbalanced cleavage of distinct TF-bound DNA strands by Tn5 transposase. Based on these criteria, we constructed data libraries of the in vivo genomic footprint of A. niger under distinct conditions, and generated a database of novel transcription factor binding motifs through comparison of footprints in TF-deleted strains. Furthermore, we validated the novel TFBs in vivo through an artificial synthetic minimal promoter system.
Conclusions: We characterized the chromatin accessibility regions of filamentous fungi species, and identified a complete TFBs map by ATAC-seq, which provides valuable data for future analyses of transcriptional regulation in filamentous fungi.

