target: an R package to predict combined function of transcription factors

F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 344
Author(s):  
Mahmoud Ahmed ◽  
Deok Ryong Kim

Researchers use ChIP binding data to identify potential transcription factor binding sites. Similarly, they use gene expression data from sequencing or microarrays to quantify the effect of transcription factor overexpression or knockdown on its targets. Integrating the binding and expression data can therefore improve the understanding of a transcription factor's function. Here, we implemented the binding and expression target analysis (BETA) in an R/Bioconductor package. This algorithm ranks the targets based on the distances of their assigned peaks from the transcription factor ChIP experiment and the signed statistics from gene expression profiling with transcription factor perturbation. We further extend BETA to integrate two sets of data from two transcription factors to predict their targets and their combined functions. In this article, we briefly describe the workings of the algorithm and provide a workflow with a real dataset for using it. The gene targets and the aggregate functions of transcription factors YY1 and YY2 in HeLa cells were identified. Using the same datasets, we identified the shared targets of the two transcription factors, which were found to be, on average, more cooperatively regulated.
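
The ranking idea described above can be illustrated with a minimal Python sketch. It is not the package's code: the exponential distance weighting, the 100 kb scale, the use of the absolute statistic for the expression rank, and all function names are assumptions chosen for illustration, since the abstract only says that targets are ranked from peak distances and signed expression statistics.

```python
import math


def regulatory_potential(peak_distances, scale=100_000):
    """Sum a weight over a gene's assigned peaks that decays with distance.

    The exponential form and the 100 kb scale are assumptions of this sketch.
    """
    return sum(math.exp(-(0.5 + 4 * abs(d) / scale)) for d in peak_distances)


def rank_descending(values):
    """Rank values so the largest gets rank 1 (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def rank_product(peaks_by_gene, stat_by_gene):
    """Combine binding and expression ranks; a smaller product ranks higher."""
    genes = sorted(peaks_by_gene)
    n = len(genes)
    binding = rank_descending(
        [regulatory_potential(peaks_by_gene[g]) for g in genes]
    )
    # The magnitude of the signed statistic orders the genes; its sign would
    # indicate the direction of regulation (up or down).
    expression = rank_descending([abs(stat_by_gene[g]) for g in genes])
    return {g: (b / n) * (e / n) for g, b, e in zip(genes, binding, expression)}


# Hypothetical toy data: peak distances from each gene (bp) and a signed
# statistic from the perturbation experiment.
peaks = {"A": [1000], "B": [50000, 80000], "C": [90000]}
stats = {"A": 5.0, "B": -1.0, "C": 0.5}
scores = rank_product(peaks, stats)
```

In this toy input, gene A has the closest peak and the largest absolute statistic, so it receives the smallest rank product and would be reported as the top candidate target.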


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Mahmoud Ahmed ◽  
Do Sik Min ◽  
Deok Ryong Kim

Abstract. Background: Transcription factor binding to the regulatory region of a gene induces or represses its expression. Transcription factors share their binding sites with other factors, co-factors and/or DNA-binding proteins. These proteins form complexes which bind to the DNA as single units. The binding of two factors to a shared site does not always lead to a functional interaction. Results: We propose a method, target, to predict the combined functions of two factors using comparable binding and expression data. We based this method on binding and expression target analysis (BETA), which we re-implemented in R and extended for this purpose. target ranks the factor's targets by importance and predicts the dominant type of interaction between two transcription factors. We applied the method to simulated and real datasets of transcription factor-binding sites and gene expression under perturbation of factors. We found that Yin Yang 1 transcription factor (YY1) and YY2 have antagonistic and independent regulatory targets in HeLa cells, but they may cooperate on a few shared targets. Conclusion: We developed an R package and a web application to integrate binding (ChIP-seq) and expression (microarrays or RNA-seq) data to determine the cooperative or competitive combined function of two transcription factors.
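
One way to picture the cooperative versus competitive call is the short Python sketch below. It assumes, purely for illustration, that the interaction for a shared target is scored as the product of the two factors' signed statistics, so that agreement in sign counts as cooperative and disagreement as competitive. The abstract does not state the scoring rule, and every name here is hypothetical.

```python
from collections import Counter


def interaction_term(stat_a, stat_b):
    """Product of the two signed statistics (an assumption of this sketch)."""
    return stat_a * stat_b


def classify(stat_a, stat_b):
    """Label one shared target by the sign of its interaction term."""
    term = interaction_term(stat_a, stat_b)
    if term > 0:
        return "cooperative"
    if term < 0:
        return "competitive"
    return "none"


def dominant_interaction(stat_pairs):
    """Return the most frequent label across the shared targets."""
    labels = Counter(classify(a, b) for a, b in stat_pairs)
    return labels.most_common(1)[0][0]


# Hypothetical signed statistics for three shared targets under perturbation
# of factor A and factor B.
pairs = [(2.0, 1.5), (-1.0, -3.0), (2.0, -0.5)]
dominant = dominant_interaction(pairs)
```

Here two of the three targets change in the same direction under both perturbations, so the dominant label is "cooperative". A real analysis would also weight each target by its binding evidence rather than counting labels equally.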


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 152
Author(s):  
Benjamin J. Stubbs ◽  
Shweta Gopaulakrishnan ◽  
Kimberly Glass ◽  
Nathalie Pochet ◽  
Celine Everaert ◽  
...  

DNA transcription is intrinsically complex. Bioinformatic work with transcription factors (TFs) is complicated by a multiplicity of data resources and annotations. The Bioconductor package TFutils includes data structures and functions to enhance the precision and utility of integrative analyses that have components involving TFs. TFutils provides catalogs of human TFs from three reference sources (CISBP, HOCOMOCO, and GO), a catalog of TF targets derived from MSigDb, and multiple approaches to enumerating TF binding sites. Aspects of integration of TF binding patterns and genome-wide association study results are explored in examples.


2007 ◽  
Vol 4 (2) ◽  
pp. 1-23
Author(s):  
Amitava Karmaker ◽  
Kihoon Yoon ◽  
Mark Doderer ◽  
Russell Kruzelock ◽  
Stephen Kwek

Summary: Revealing the complex interaction between trans- and cis-regulatory elements and identifying these potential binding sites are fundamental problems in understanding gene expression. Progress in ChIP-chip technology facilitates identifying DNA sequences that are recognized by a specific transcription factor. However, protein-DNA binding is a necessary, but not sufficient, condition for transcription regulation. To further confirm a regulatory relationship, we need to demonstrate that the expression levels of the transcription factor and its target gene are correlated. Here, instead of using a linear correlation coefficient, we used a non-linear function that seems to better capture possible regulatory relationships. By analyzing tissue-specific gene expression profiles of human and mouse, we delineate a list of pairs of transcription factor and gene with highly correlated expression levels, which may have regulatory relationships. Using two closely related species (human and mouse), we perform comparative genome analysis to cross-validate the quality of our prediction. Our findings are confirmed by matching publicly available TFBS databases (like TRANSFAC and ConSite) and by reviewing biological literature. For example, according to our analysis, 80% and 85.71% of the target genes associated with the E2F5 and RELB transcription factors, respectively, have the corresponding known binding sites. We also substantiated our results on some oncogenes with the biomedical literature. Moreover, we performed further analysis on them and found that BCR and DEK may be regulated by some common transcription factors. Similar results for BTG1, FCGR2B and LCK genes were also reported.


Development ◽  
2002 ◽  
Vol 129 (19) ◽  
pp. 4387-4397
Author(s):  
Fiona C. Wardle ◽  
Daniel H. Wainstock ◽  
Hazel L. Sive

The cement gland marks the extreme anterior ectoderm of the Xenopus embryo, and is determined through the overlap of several positional domains. In order to understand how these positional cues activate cement gland differentiation, the promoter of Xag1, a marker of cement gland differentiation, was analyzed. Previous studies have shown that Xag1 expression can be activated by the anterior-specific transcription factor Otx2, but that this activation is indirect. 102 bp of upstream genomic Xag1 sequence restricts reporter gene expression specifically to the cement gland. Within this region, putative binding sites for Ets and ATF/CREB transcription factors are both necessary and sufficient to drive cement gland-specific expression, and cooperate to do so. Furthermore, while the putative ATF/CREB factor is activated by Otx2, a factor acting through the putative Ets-binding site is not. These results suggest that Ets-like and ATF/CREB-like family members play a role in regulating Xag1 expression in the cement gland, through integration of Otx2 dependent and independent pathways.


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Alexandre Daly ◽  
Leonard Cheung ◽  
Michelle Brinkmeier ◽  
Sally Ann Camper

Abstract: Recent genome-wide association studies have begun to identify loci that are risk factors for sporadic pituitary adenomas, but the genes associated with these loci are unknown. In general, ~90% of GWAS hits are in noncoding regions, making it difficult to transition from genetic mapping to a biological understanding of risk factors. Recent studies that identify enhancer regions by undertaking large-scale functional genomic annotation of non-coding elements, like the Encyclopedia of DNA Elements (ENCODE), have begun to yield a better understanding of some complex diseases. Dense molecular profiling maps of the transcriptome and epigenome have been generated for more than 250 cell lines and 150 tissues, but pituitary cell lines or tissues were not included. Epigenetic and gene expression data are emerging for somatotropes, gonadotropes and corticotropes, but there is very little available data on thyrotropes. We identified the transcription factors and epigenetic changes in chromatin that are associated with differentiation of POU1F1-expressing progenitors into thyrotropes using cell lines that represent an early, undifferentiated Pou1f1 lineage progenitor (GHF-T1) and a committed thyrotrope (TαT1). TαT1 is an excellent cell line for this purpose because it responds to TRH and retinoids, and secretes TSH in response to diurnal cues. We have also used genetic labeling and fluorescence-activated cell sorting to purify thyrotropes from pituitaries of young mice and analyzed gene expression using single-cell transcriptomics. We used the Assay for Transposase-Accessible Chromatin with sequencing (ATAC-seq) and Cleavage Under Target and Release Using Nuclease (CUT&RUN) to identify POU1F1 binding sites and histone marks associated with active enhancers, H3K27Ac and H3K4Me1, or inactive regions, H3K27Me3, in GHF-T1 and TαT1 cells.
We integrated DNA accessibility, histone modification patterns, transcription factor binding and RNA expression data to identify regulatory elements and candidate transcriptional regulators. We identified POU1F1 binding sites that were unique to each cell line. For example, POU1F1 binds sites in and around Cga and Tshb only in TαT1 cells and Twist1 and Gli3 only in GHF-T1 cells. POU1F1 binding sites are commonly associated with bZIP factor consensus binding sites in GHF-T1 cells and Helix-Turn-Helix or basic Helix-Loop-Helix in TαT1 cells, suggesting classes of transcription factors that may recruit POU1F1 to unique sites. We validated enhancer function of novel elements we mapped near Tshb, Gata2, and Pitx1 by transfection in TαT1 cells. Finally, we confirmed that an enhancer element near Tshb can drive expression in thyrotropes of transgenic mice. These data extend the ENCODE analysis to an organ that is critical for growth and metabolism. This information could be valuable for understanding pituitary development and disease pathogenesis.


2020 ◽  
Author(s):  
Evan Witt ◽  
Nicolas Svetec ◽  
Sigi Benjamin ◽  
Li Zhao

Abstract: Evolutionarily young genes are usually preferentially expressed in the testis across species. While it is known that older genes are generally more broadly expressed than younger genes, the properties that shaped this pattern are unknown. Older genes may gain expression across other tissues uniformly, or faster in certain tissues than others. Using Drosophila gene expression data, we confirmed previous findings that younger genes are disproportionately testis-biased and older genes are disproportionately ovary-biased. We found that the relationship between gene age and expression is stronger in the ovary than any other tissue, and weakest in testis. We performed ATAC-seq on Drosophila testis and found that while genes of all ages are more likely to have open promoter chromatin in testis than in ovary, promoter chromatin alone does not explain the ovary-bias of older genes. Instead, we found that upstream transcription factor (TF) expression is highly predictive of gene expression in ovary, but not in testis. In ovary, TF expression is more predictive of gene expression than open promoter chromatin, whereas testis gene expression is similarly influenced by both TF expression and open promoter chromatin. We propose that the testis is uniquely able to express younger genes controlled by relatively few TFs, while older genes with more TF partners are broadly expressed with peak expression most likely in ovary. The testis allows widespread baseline expression that is relatively unresponsive to regulatory changes, whereas the ovary transcriptome is more responsive to trans-regulation and has a higher ceiling for gene expression.


2005 ◽  
Vol 03 (02) ◽  
pp. 281-301 ◽  
Author(s):  
PATRICK C. H. MA ◽  
KEITH C. C. CHAN ◽  
DAVID K. Y. CHIU

The combined interpretation of gene expression data and gene sequences is important for the investigation of the intricate relationships of gene expression at the transcription level. The expression data produced by microarray hybridization experiments can lead to the identification of clusters of co-expressed genes that are likely co-regulated by the same regulatory mechanisms. By analyzing the promoter regions of co-expressed genes, the common regulatory patterns characterized by transcription factor binding sites can be revealed. Many clustering algorithms have been used to uncover inherent clusters in gene expression data. In this paper, based on experiments using simulated and real data, we show that the performance of these algorithms could be further improved. For the clustering of expression data typically characterized by a lot of noise, we propose to use a two-phase clustering algorithm consisting of an initial clustering phase and a second re-clustering phase. The proposed algorithm has several desirable features: (i) it utilizes both local and global information by computing both a "local" pairwise distance between two gene expression profiles in Phase 1 and a "global" probabilistic measure of interestingness of cluster patterns in Phase 2, (ii) it distinguishes between relevant and irrelevant expression values when performing re-clustering, and (iii) it makes explicit the patterns discovered in each cluster for possible interpretations. Experimental results show that the proposed algorithm can be an effective algorithm for discovering clusters in the presence of very noisy data. The patterns that are discovered in each cluster are found to be meaningful and statistically significant, and cannot otherwise be easily discovered. Based on these discovered patterns, genes co-expressed under the same experimental conditions and range of expression levels have been identified and evaluated. 
When identifying regulatory patterns at the promoter regions of the co-expressed genes, we also discovered well-known transcription factor binding sites in them. These binding sites can provide explanations for the co-expressed patterns.


2018 ◽  
Vol 115 (30) ◽  
pp. E7222-E7230 ◽  
Author(s):  
Sharon R. Grossman ◽  
Jesse Engreitz ◽  
John P. Ray ◽  
Tung H. Nguyen ◽  
Nir Hacohen ◽  
...  

Gene expression is controlled by sequence-specific transcription factors (TFs), which bind to regulatory sequences in DNA. TF binding occurs in nucleosome-depleted regions of DNA (NDRs), which generally encompass regions with lengths similar to those protected by nucleosomes. However, less is known about where within these regions specific TFs tend to be found. Here, we characterize the positional bias of inferred binding sites for 103 TFs within ∼500,000 NDRs across 47 cell types. We find that distinct classes of TFs display different binding preferences: Some tend to have binding sites toward the edges, some toward the center, and some at other positions within the NDR. These patterns are highly consistent across cell types, suggesting that they may reflect TF-specific intrinsic structural or functional characteristics. In particular, TF classes with binding sites at NDR edges are enriched for those known to interact with histones and chromatin remodelers, whereas TFs with central enrichment interact with other TFs and cofactors such as p300. Our results suggest distinct regiospecific binding patterns and functions of TF classes within enhancers.

