T-Gene: Improved target gene prediction

2019 ◽  
Author(s):  
Timothy O’Connor ◽  
Charles E. Grant ◽  
Mikael Bodén ◽  
Timothy L. Bailey

Motivation: Identifying the genes regulated by a given transcription factor (its “target genes”) is a key step in developing a comprehensive understanding of gene regulation. Previously we developed a method for predicting the target genes of a transcription factor (TF) based solely on the correlation between a histone modification at the TF’s binding site and the expression of the gene across a set of tissues. That approach is limited to organisms for which extensive histone and expression data is available, and does not explicitly incorporate the genomic distance between the TF and the gene. Results: We present the T-Gene algorithm, which overcomes these limitations. T-Gene can be used to predict which genes are most likely to be regulated by a TF, and which of the TF’s binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene’s promoter, achieving median positive predictive value (PPV) above 50%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median PPV above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. Availability: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://[email protected]

2020 ◽  
Vol 36 (12) ◽  
pp. 3902-3904
Author(s):  
Timothy O’Connor ◽  
Charles E Grant ◽  
Mikael Bodén ◽  
Timothy L Bailey

Motivation: Identifying the genes regulated by a given transcription factor (TF) (its ‘target genes’) is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF’s binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. Results: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF’s binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene’s promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. Availability and implementation: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. Supplementary information: Supplementary data are available at Bioinformatics online.
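
The correlation signal described in the abstract above (a histone modification at a TF binding site versus the expression of a gene, measured across tissues) can be illustrated with a minimal sketch. This is not the T-Gene or CisMapper implementation: the choice of Pearson correlation, the function name `pearson`, and the per-tissue values are all assumptions made for illustration, and the abstract does not specify how the correlation is combined with distance.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A constant signal has no defined correlation; this sketch returns 0.
        return 0.0
    return cov / math.sqrt(vx * vy)

# Hypothetical data: one value per tissue (assumed, not from the paper).
histone_at_site = [1.0, 2.0, 3.0, 4.0]
expression_of_gene = [2.0, 4.0, 6.0, 8.0]
r = pearson(histone_at_site, expression_of_gene)
print(f"histone/expression correlation: {r:.3f}")
```

A binding site whose histone signal rises and falls with a gene's expression across tissues gets a correlation near 1, which is the kind of evidence the abstract describes as one input to the prediction score.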


Development ◽  
2002 ◽  
Vol 129 (13) ◽  
pp. 3115-3126 ◽  
Author(s):  
Ron Galant ◽  
Christopher M. Walsh ◽  
Sean B. Carroll

Homeotic (Hox) genes regulate the identity of structures along the anterior-posterior axis of most animals. The low DNA-binding specificities of Hox proteins have raised the question of how these transcription factors selectively regulate target gene expression. The discovery that the Extradenticle (Exd)/Pbx and Homothorax (Hth)/Meis proteins act as cofactors for several Hox proteins has advanced the view that interactions with cofactors are critical to the target selectivity of Hox proteins. It is not clear, however, to what extent Hox proteins also regulate target genes in the absence of cofactors. In Drosophila melanogaster, the Hox protein Ultrabithorax (Ubx) promotes haltere development and suppresses wing development by selectively repressing many genes of the wing-patterning hierarchy, and this activity requires neither Exd nor Hth function. Here, we show that Ubx directly regulates a flight appendage-specific cis-regulatory element of the spalt (sal) gene. We find that multiple monomer Ubx-binding sites are required to completely repress this cis-element in the haltere, and that individual Ubx-binding sites are sufficient to mediate its partial repression. These results suggest that Hox proteins can directly regulate target genes in the absence of the cofactor Extradenticle. We propose that the regulation of some Hox target genes evolves via the accumulation of multiple Hox monomer binding sites. Furthermore, because the development and morphological diversity of the distal parts of most arthropod and vertebrate appendages involve Hox, but not Exd/Pbx or Hth/Meis proteins, this mode of target gene regulation appears to be important for distal appendage development and the evolution of appendage diversity.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1933 ◽  
Author(s):  
Ruipeng Lu ◽  
Peter K. Rogan

Background: The distribution and composition of cis-regulatory modules composed of transcription factor (TF) binding site (TFBS) clusters in promoters substantially determine gene expression patterns and TF targets. TF knockdown experiments have revealed that TF binding profiles and gene expression levels are correlated. We use TFBS features within accessible promoter intervals to predict genes with similar tissue-wide expression patterns and TF targets. Methods: Genes with correlated expression patterns across 53 tissues and TF targets were respectively identified from Bray-Curtis Similarity and TF knockdown experiments. Corresponding promoter sequences were reduced to DNase I-accessible intervals; TFBSs were then identified within these intervals using information theory-based position weight matrices for each TF (iPWMs) and clustered. Features from information-dense TFBS clusters predicted these genes with machine learning classifiers, which were evaluated for accuracy, specificity and sensitivity. Mutations in TFBSs were analyzed in silico to examine their impact on cluster densities and the regulatory states of target genes. Results: We initially chose the glucocorticoid receptor gene (NR3C1), whose regulation has been extensively studied, to test this approach. SLC25A32 and TANK were found to exhibit the most similar expression patterns to NR3C1. A Decision Tree classifier exhibited the largest area under the Receiver Operating Characteristic (ROC) curve in detecting such genes. Target gene prediction was confirmed using siRNA knockdown of TFs, which was found to be more accurate than those predicted after CRISPR/CAS9 inactivation. In silico mutation analyses of TFBSs also revealed that one or more information-dense TFBS clusters in promoters are required for accurate target gene prediction.
Conclusions: Machine learning based on TFBS information density, organization, and chromatin accessibility accurately identifies gene targets with comparable tissue-wide expression patterns. Multiple information-dense TFBS clusters in promoters appear to protect promoters from effects of deleterious binding site mutations in a single TFBS that would otherwise alter regulation of these genes.
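
The Bray-Curtis similarity used above to compare expression patterns across tissues can be sketched as follows. This is a minimal illustration of the standard definition (one minus the Bray-Curtis dissimilarity), not the authors' code; the function name and the expression vectors are hypothetical, and the sketch assumes non-negative expression values.

```python
def bray_curtis_similarity(u, v):
    """Bray-Curtis similarity of two non-negative, equal-length vectors.

    Defined as 1 - sum(|u_i - v_i|) / sum(u_i + v_i).
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        # Both profiles are all zero; this sketch treats them as identical.
        return 1.0
    diff = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 - diff / total

# Hypothetical expression levels of two genes across three tissues.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 2.0, 2.0]
print(bray_curtis_similarity(gene_a, gene_b))
```

A value of 1 means the two expression profiles are identical and 0 means they share no expression in any tissue, so ranking genes by this value against a reference gene gives the "most similar expression patterns" described in the abstract.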


F1000Research ◽  
2019 ◽  
Vol 7 ◽  
pp. 1933 ◽  
Author(s):  
Ruipeng Lu ◽  
Peter K. Rogan

Background: The distribution and composition of cis-regulatory modules composed of transcription factor (TF) binding site (TFBS) clusters in promoters substantially determine gene expression patterns and TF targets. TF knockdown experiments have revealed that TF binding profiles and gene expression levels are correlated. We use TFBS features within accessible promoter intervals to predict genes with similar tissue-wide expression patterns and TF targets using Machine Learning (ML). Methods: Bray-Curtis Similarity was used to identify genes with correlated expression patterns across 53 tissues. TF targets from knockdown experiments were also analyzed by this approach to set up the ML framework. TFBSs were selected within DNase I-accessible intervals of corresponding promoter sequences using information theory-based position weight matrices (iPWMs) for each TF. Features from information-dense clusters of TFBSs were input to ML classifiers which predict these gene targets along with their accuracy, specificity and sensitivity. Mutations in TFBSs were analyzed in silico to examine their impact on TFBS clustering and predict changes in gene regulation. Results: The glucocorticoid receptor gene (NR3C1), whose regulation has been extensively studied, was selected to test this approach. SLC25A32 and TANK exhibited the most similar expression patterns to NR3C1. A Decision Tree classifier exhibited the best performance in detecting such genes, based on Area Under the Receiver Operating Characteristic curve (ROC). TF target gene prediction was confirmed using siRNA knockdown, which was more accurate than CRISPR/CAS9 inactivation. TFBS mutation analyses revealed that accurate target gene prediction required at least 1 information-dense TFBS cluster. Conclusions: ML based on TFBS information density, organization, and chromatin accessibility accurately identifies gene targets with comparable tissue-wide expression patterns. Multiple information-dense TFBS clusters in promoters appear to protect promoters from effects of deleterious binding site mutations in a single TFBS that would otherwise alter regulation of these genes.
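
The accuracy, sensitivity and specificity reported for the classifiers above follow from a confusion matrix. The sketch below shows the standard definitions only; the function name and the toy labels are hypothetical and are not taken from the study.

```python
def classifier_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = target gene)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # None when the class is absent and the ratio is undefined.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }

# Hypothetical labels: two true targets, two non-targets.
print(classifier_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

Sensitivity is the fraction of true target genes that the classifier recovers, and specificity is the fraction of non-targets it correctly rejects, so the two can diverge sharply when one class is much larger than the other.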


2018 ◽  
Author(s):  
Ruipeng Lu ◽  
Peter K. Rogan

Background: The distribution and composition of cis-regulatory modules (e.g. transcription factor binding site (TFBS) clusters) in promoters substantially determine gene expression patterns and TF targets, whose expression levels are significantly regulated by TF binding. TF knockdown experiments have revealed correlations between TF binding profiles and gene expression levels. We present a general framework capable of predicting genes with similar tissue-wide expression patterns from activated or repressed TF targets using machine learning to combine TF binding and epigenetic features. Methods: Genes with correlated expression patterns across 53 tissues were identified according to their Bray-Curtis similarity. DNase I HyperSensitive region (DHS)-accessible promoter intervals of direct TF target genes were scanned with previously derived information theory-based position weight matrices (iPWMs) of 82 TFs. Features from information density-based TFBS clusters were used to predict target genes with machine learning classifiers. The accuracy, specificity and sensitivity of the classifiers were determined for different feature sets. Mutations in TFBSs were also introduced to examine their impact on cluster densities and the regulatory states of predicted target genes. Results: We initially chose the glucocorticoid receptor gene (NR3C1), whose regulation has been extensively studied, to test this approach. SLC25A32 and TANK were found to exhibit the most similar expression patterns to this gene across 53 tissues. Prediction of other genes with similar expression profiles was significantly improved by eliminating inaccessible promoter intervals based on DHSs. A Random Forest classifier exhibited the best performance in detecting such coordinately regulated genes (accuracy was 0.972 for training, 0.976 for testing). Target gene prediction was confirmed using CRISPR knockdown data of TFs, which was more accurate than siRNA inactivation. Mutation analyses of TFBSs also revealed that one or more information-dense TFBS clusters in promoters are required for accurate target gene prediction. Conclusions: Machine learning based on TFBS information density, organization, and chromatin accessibility accurately identifies gene targets with comparable tissue-wide expression patterns. Multiple, information-dense TFBS clusters in promoters appear to protect promoters from the effects of deleterious binding site mutations in a single TFBS that would effectively alter the expression state of these genes.


Cells ◽  
2019 ◽  
Vol 8 (4) ◽  
pp. 338 ◽  
Author(s):  
Xiaoqiong Duan ◽  
Xiao Liu ◽  
Wenting Li ◽  
Jacinta A. Holmes ◽  
Annie J. Kruger ◽  
...  

We previously identified that miR-130a downregulates HCV replication through two independent pathways: restoration of host immune responses and regulation of pyruvate metabolism. In this study, we further sought to explore host antiviral target genes regulated by miR-130a. We performed an RT² Profiler™ PCR array to identify the host antiviral genes regulated by miR-130a. The putative binding sites between miR-130a and its downregulated genes were predicted by miRanda. miR-130a and predicted target genes were over-expressed or knocked down by siRNA or CRISPR/Cas9 gRNA. Selected gene mRNAs and their proteins, together with HCV replication in JFH1 HCV-infected Huh7.5.1 cells, were monitored by qRT-PCR and Western blot. We identified 32 genes that were significantly differentially expressed more than 1.5-fold following miR-130a overexpression, 28 of which were upregulated and 4 downregulated. We found that ATG5, a target gene for miR-130a, significantly upregulated HCV replication and downregulated interferon stimulated gene expression. miR-130a downregulated ATG5 expression and its conjugation complex with ATG12. ATG5 and the ATG5-ATG12 complex affected the expression of interferon stimulated genes (ISGs) such as MX1 and OAS3, and subsequently HCV replication. We concluded that miR-130a regulates host antiviral response and HCV replication through targeting ATG5 via the ATG5-dependent autophagy pathway.


2001 ◽  
Vol 21 (19) ◽  
pp. 6418-6428 ◽  
Author(s):  
Shelley Lane ◽  
Song Zhou ◽  
Ting Pan ◽  
Qian Dai ◽  
Haoping Liu

Candida albicans undergoes a morphogenetic switch from budding yeast to hyphal growth form in response to a variety of stimuli and growth conditions. Multiple signaling pathways, including a Cph1-mediated mitogen-activated protein kinase pathway and an Efg1-mediated cyclic AMP/protein kinase A pathway, regulate the transition. Here we report the identification of a basic helix-loop-helix transcription factor of the Myc subfamily (Cph2) by its ability to promote pseudohyphal growth in Saccharomyces cerevisiae. Like sterol response element binding protein 1, Cph2 has a Tyr instead of a conserved Arg in the basic DNA binding region. Cph2 regulates hyphal development in C. albicans, as cph2/cph2 mutant strains show medium-specific impairment in hyphal development and in the induction of hypha-specific genes. However, many hypha-specific genes do not have potential Cph2 binding sites in their upstream regions. Interestingly, upstream sequences of all known hypha-specific genes are found to contain potential binding sites for Tec1, a regulator of hyphal development. Northern analysis shows that TEC1 transcription is highest in the medium in which cph2/cph2 displays a defect in hyphal development, and Cph2 is necessary for this transcriptional induction of TEC1. In vitro gel mobility shift experiments show that Cph2 directly binds to the two sterol regulatory element 1-like elements upstream of TEC1. Furthermore, the ectopic expression of TEC1 suppresses the defect of cph2/cph2 in hyphal development. Therefore, the function of Cph2 in hyphal transcription is mediated, in part, through Tec1. We further show that this function of Cph2 is independent of the Cph1- and Efg1-mediated pathways.


2013 ◽  
Vol 368 (1632) ◽  
pp. 20130018 ◽  
Author(s):  
Andrea I. Ramos ◽  
Scott Barolo

In the era of functional genomics, the role of transcription factor (TF)–DNA binding affinity is of increasing interest: for example, it has recently been proposed that low-affinity genomic binding events, though frequent, are functionally irrelevant. Here, we investigate the role of binding site affinity in the transcriptional interpretation of Hedgehog (Hh) morphogen gradients. We noted that enhancers of several Hh-responsive Drosophila genes have low predicted affinity for Ci, the Gli family TF that transduces Hh signalling in the fly. Contrary to our initial hypothesis, improving the affinity of Ci/Gli sites in enhancers of dpp, wingless and stripe, by transplanting optimal sites from the patched gene, did not result in ectopic responses to Hh signalling. Instead, we found that these enhancers require low-affinity binding sites for normal activation in regions of relatively low signalling. When Ci/Gli sites in these enhancers were altered to improve their binding affinity, we observed patterning defects in the transcriptional response that are consistent with a switch from Ci-mediated activation to Ci-mediated repression. Synthetic transgenic reporters containing isolated Ci/Gli sites confirmed this finding in imaginal discs. We propose that the requirement for gene activation by Ci in the regions of low-to-moderate Hh signalling results in evolutionary pressure favouring weak binding sites in enhancers of certain Hh target genes.


2011 ◽  
Vol 286 (21) ◽  
pp. 18641-18649 ◽  
Author(s):  
Lishan Chen ◽  
Jiashun Zheng ◽  
Nan Yang ◽  
Hao Li ◽  
Su Guo
