DeepTFactor: A deep learning-based tool for the prediction of transcription factors

2020 ◽  
Vol 118 (2) ◽  
pp. e2021171118
Author(s):  
Gi Bae Kim ◽  
Ye Gao ◽  
Bernhard O. Palsson ◽  
Sang Yup Lee

A transcription factor (TF) is a sequence-specific DNA-binding protein that modulates the transcription of a set of particular genes, and thus regulates gene expression in the cell. TFs have commonly been predicted by analyzing sequence homology with the DNA-binding domains of TFs already characterized. Thus, TFs that do not show homologies with the reported ones are difficult to predict. Here we report the development of a deep learning-based tool, DeepTFactor, that predicts whether a protein in question is a TF. DeepTFactor uses a convolutional neural network to extract features of a protein. It showed high performance in predicting TFs of both eukaryotic and prokaryotic origins, resulting in F1 scores of 0.8154 and 0.8000, respectively. Analysis of the gradients of prediction score with respect to input suggested that DeepTFactor detects DNA-binding domains and other latent features for TF prediction. DeepTFactor predicted 332 candidate TFs in Escherichia coli K-12 MG1655. Among them, 84 candidate TFs belong to the y-ome, which is a collection of genes that lack experimental evidence of function. We experimentally validated the results of DeepTFactor prediction by further characterizing genome-wide binding sites of three predicted TFs, YqhC, YiaU, and YahB. Furthermore, we made available the list of 4,674,808 TFs predicted from 73,873,012 protein sequences in 48,346 genomes. DeepTFactor will serve as a useful tool for predicting TFs, which is necessary for understanding the regulatory systems of organisms of interest. We provide DeepTFactor as a stand-alone program, available at https://bitbucket.org/kaistsystemsbiology/deeptfactor.
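For readers unfamiliar with the F1 metric quoted above, the following minimal sketch shows how such a score is computed from a binary classifier's confusion counts. It is not taken from DeepTFactor's code: the function name `f1_score` and the example counts are hypothetical, chosen only for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    tp: true positives  (TFs correctly predicted as TFs)
    fp: false positives (non-TFs predicted as TFs)
    fn: false negatives (TFs predicted as non-TFs)
    """
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not results reported in the paper.
    print(round(f1_score(tp=8, fp=2, fn=2), 4))
```

Because F1 ignores true negatives, it is a common choice when positives (here, TFs) are rare relative to the rest of the proteome.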

RNA Biology ◽  
2018 ◽  
Vol 15 (12) ◽  
pp. 1468-1476 ◽  
Author(s):  
Fan Wang ◽  
Pranik Chainani ◽  
Tommy White ◽  
Jin Yang ◽  
Yu Liu ◽  
...  

1994 ◽  
Vol 14 (10) ◽  
pp. 6570-6583 ◽  
Author(s):  
N D Perkins ◽  
A B Agranoff ◽  
E Pascal ◽  
G J Nabel

Induction of human immunodeficiency virus type 1 (HIV-1) gene expression in stimulated T cells has been attributed to the activation of the transcription factor NF-kappa B. The twice-repeated kappa B sites within the HIV-1 long terminal repeat are in close proximity to three binding sites for Sp1. We have previously shown that a cooperative interaction of NF-kappa B with Sp1 is required for the efficient stimulation of HIV-1 transcription. In this report, we define the domains of each protein responsible for this effect. Although the transactivation domains seemed likely to mediate this interaction, we find, surprisingly, that this interaction occurs through the putative DNA-binding domains of both proteins. Sp1 specifically interacted with the amino-terminal region of RelA(p65). Similarly, RelA bound directly to the zinc finger region of Sp1. This interaction was specific and resulted in cooperative DNA binding to the kappa B and Sp1 sites in the HIV-1 long terminal repeat. Furthermore, the amino-terminal region of RelA did not associate with several other transcription factors, including MyoD, E12, or Kox15, another zinc finger protein. These findings suggest that the juxtaposition of DNA-binding sites promotes a specific protein interaction between the DNA-binding regions of these transcription factors. This interaction is required for HIV transcriptional activation and may provide a mechanism to allow for selective activation of kappa B-regulated genes.


PLoS ONE ◽  
2016 ◽  
Vol 11 (9) ◽  
pp. e0162681 ◽  
Author(s):  
Yuriy D. Korostelev ◽  
Ilya A. Zharov ◽  
Andrey A. Mironov ◽  
Alexandra B. Rakhmaininova ◽  
Mikhail S. Gelfand

2020 ◽  
Author(s):  
Audrey Pelletier ◽  
Alexandre Mayran ◽  
Arthur Gouhier ◽  
James G Omichinski ◽  
Aurelio Balsalobre ◽  
...  

Abstract: The pioneer transcription factor Pax7 contains two DNA binding domains (DBD), a paired and a homeo domain. Previous work on Pax7 and the related Pax3 had shown that each DBD can bind a cognate DNA sequence, thus defining two targets of binding and possibly modalities of action. Genomic targets of Pax7 pioneer action leading to chromatin opening are enriched for composite DNA target sites containing juxtaposed binding sites for both paired and homeo domains. The present work investigated the implication of both DBDs in pioneer action. We now show that the composite sequence is a higher affinity Pax7 binding site compared to either paired or homeo binding sites and that efficient binding to this site involves both DBDs. We also show that a Pax7 monomer binds composite sites and that methylation of cytosines within the binding site does not affect binding, which is consistent with pioneer action exerted at methylated DNA sites within nucleosomal heterochromatin. Finally, introduction of single amino acid mutations in either the paired or homeo domain that impair binding to cognate DNA sequences showed that both DBDs must be intact for pioneer action. In contrast, only the paired domain is required for low affinity binding of heterochromatin sites. Thus, Pax7 pioneer action on heterochromatin requires unique protein:DNA interactions that are more complex compared to its simpler DNA binding modalities at accessible enhancer target sites.

Significance Statement: Pioneer transcription factors have the unique ability to recognize DNA target sites within closed heterochromatin and to trigger chromatin opening. Only a fraction of the heterochromatin recruitment sites of pioneers are subject to chromatin opening. The molecular basis for this selectivity is unknown and the present work addressed the importance of DNA sequence affinity for selection of sites to open. The pioneering ability of the pioneer factor Pax7 is not strictly determined by affinity or DNA sequence of binding sites, nor by number or methylation status of DNA sites. Mutation analyses showed that recruitment to heterochromatin is primarily dependent on the Pax7 paired domain whereas the ability to open chromatin requires both paired and homeo DNA binding domains.


2018 ◽  
Vol 109 (6) ◽  
pp. 845-864 ◽  
Author(s):  
Elisabeth Härtig ◽  
Claudia Frädrich ◽  
Maren Behringer ◽  
Anja Hartmann ◽  
Meina Neumann‐Schaal ◽  
...  

Methods ◽  
1993 ◽  
Vol 5 (2) ◽  
pp. 125-137 ◽  
Author(s):  
Jingdong Liu ◽  
Thomas E. Wilson ◽  
Jeffrey Milbrandt ◽  
Mark Johnston

1994 ◽  
Vol 14 (3) ◽  
pp. 1786-1795 ◽  
Author(s):  
J F Morris ◽  
R Hromas ◽  
F J Rauscher

The myeloid zinc finger gene 1, MZF1, encodes a transcription factor which is expressed in hematopoietic progenitor cells that are committed to myeloid lineage differentiation. MZF1 contains 13 C2H2 zinc fingers arranged in two domains which are separated by a short glycine- and proline-rich sequence. The first domain consists of zinc fingers 1 to 4, and the second domain is formed by zinc fingers 5 to 13. We have determined that both sets of zinc finger domains bind DNA. Purified, recombinant MZF1 proteins containing either the first set of zinc fingers or the second set were prepared and used to affinity select DNA sequences from a library of degenerate oligonucleotides by using successive rounds of gel shift followed by PCR amplification. Surprisingly, both DNA-binding domains of MZF1 selected similar DNA-binding consensus sequences containing a core of four or five guanine residues, reminiscent of an NF-kappa B half-site: 1-4, 5'-AGTGGGGA-3'; 5-13, 5'-CGGGnGAGGGGGAA-3'. The full-length MZF1 protein containing both sets of zinc finger DNA-binding domains recognizes synthetic oligonucleotides containing either the 1-4 or 5-13 consensus binding sites in gel shift assays. Thus, we have identified the core DNA consensus binding sites for each of the two DNA-binding domains of a myeloid-specific zinc finger transcription factor. Identification of these DNA-binding sites will allow us to identify target genes regulated by MZF1 and to assess the role of MZF1 as a transcriptional regulator of hematopoiesis.
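As an illustration of how the two consensus sites reported above could be used to look for candidate binding sites in a sequence, here is a minimal sketch. It is not the authors' method (they selected sites experimentally by gel shift and PCR). It assumes that the lowercase `n` in the 5-13 consensus stands for any base, and the helper names `consensus_to_pattern` and `find_sites` are hypothetical. Only the given strand is scanned; the reverse complement would need a separate pass.

```python
import re

# Map each consensus symbol to a regex fragment; "N" is assumed to mean any base.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def consensus_to_pattern(consensus: str) -> str:
    """Translate a consensus string such as 'CGGGnGAGGGGGAA' into a regex."""
    return "".join(SYMBOLS[base] for base in consensus.upper())


def find_sites(consensus: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile(f"(?=({consensus_to_pattern(consensus)}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Consensus sites quoted in the abstract; the test sequences are made up.
    print(find_sites("AGTGGGGA", "TTAGTGGGGATT"))
    print(find_sites("CGGGnGAGGGGGAA", "ACGGGTGAGGGGGAAC"))
```

Exact matching against a consensus is a crude filter; a position weight matrix would capture partial matches, but the abstract does not provide the data needed to build one.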


2015 ◽  
Author(s):  
Sonja Hänzelmann ◽  
Chao-Chung Kuo ◽  
Marie Kalwa ◽  
Wolfgang Wagner ◽  
Ivan G. Costa

Abstract: Long (>200 bp) non-coding RNAs (lncRNAs) can act as scaffolds promoting the interaction of several proteins, RNA and DNA. Some lncRNAs interact with DNA via triple helix formation. Triple helices are formed by a single-stranded RNA/DNA molecule, which binds to the major groove of a double helix following a canonical code. Recently, sequence analysis methods have been proposed to detect triple helices for given RNA and DNA sequences. We propose the Triplex Domain Finder (TDF) to detect DNA binding domains in RNA molecules. For a candidate lncRNA and potential target DNA regions, i.e., promoters of genes differentially regulated after the knockdown of the lncRNA, TDF evaluates whether particular RNA regions are likely to form DNA binding domains (DBDs). Moreover, the DNA binding sites from the predicted DBDs are used to indicate potential target DNA regions, i.e., genes with high binding site coverage in their promoters. The command line tool provides results in a user-friendly, graphical HTML interface. A case study on FENDRR, an lncRNA known to form triple helices, demonstrates that TDF is able to recover both previously discovered DBDs and DNA binding sites. Source code, tutorial and case studies are available at www.regulatory-genomics.org/tdf.

