Global properties of regulatory sequences are predicted by transcription factor recognition mechanisms

2021
Vol 22 (1)
Author(s):  
Zain M. Patel ◽  
Timothy R. Hughes

Abstract. Background: Mammalian genomes contain millions of putative regulatory sequences, which are delineated by binding of multiple transcription factors. The degree to which spacing and orientation constraints among transcription factor binding sites contribute to the recognition and identity of regulatory sequence is an unresolved but important question that impacts our understanding of genome function and evolution. Global mechanisms that underlie phenomena including the size of regulatory sequences, their uniqueness, and their evolutionary turnover remain poorly described. Results: Here, we ask whether models incorporating different degrees of spacing and orientation constraints among transcription factor binding sites are broadly consistent with several global properties of regulatory sequence. These properties include length, sequence diversity, turnover rate, and dominance of specific TFs in regulatory site identity and cell type specification. Models with and without spacing and orientation constraints are generally consistent with all observed properties of regulatory sequence, and with regulatory sequences being fundamentally small (~1 nucleosome). Uniqueness of regulatory regions and their rapid evolutionary turnover are expected under all models examined. An intriguing issue we identify is that the complexity of eukaryotic regulatory sites must scale with the number of active transcription factors in order to accomplish observed specificity. Conclusions: Models of transcription factor binding with or without spacing and orientation constraints predict that regulatory sequences should be fundamentally short, unique, and turn over rapidly. We posit that the existence of master regulators may be, in part, a consequence of evolutionary pressure to limit the complexity and increase evolvability of regulatory sites.

2016
Vol 2016
pp. 1-27
Author(s):  
Kristopher J. L. Irizarry ◽  
Randall L. Bryden

Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1) that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.
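The idea of looking for clusters of binding sites within ~200-nucleotide intervals of an upstream region can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `find_clusters`, the `min_sites` cutoff, and the example positions are hypothetical, and the sketch ignores the cross-species conservation step entirely.

```python
def find_clusters(site_positions, window=200, min_sites=3):
    """Group predicted site positions into non-overlapping clusters.

    A cluster is a run of sorted positions whose first and last members
    are less than `window` nucleotides apart and that contains at least
    `min_sites` sites. Both parameters are illustrative assumptions.
    """
    positions = sorted(site_positions)
    clusters = []
    i = 0
    while i < len(positions):
        j = i
        # Extend the run while the next site still fits in the window.
        while j + 1 < len(positions) and positions[j + 1] - positions[i] < window:
            j += 1
        size = j - i + 1
        if size >= min_sites:
            clusters.append((positions[i], positions[j], size))
            i = j + 1  # continue after this cluster
        else:
            i += 1  # too few sites; try the next starting position
    return clusters


# Hypothetical site positions within a 1500-nucleotide upstream region.
example_sites = [100, 150, 250, 900, 1300, 1350, 1420]
example_clusters = find_clusters(example_sites)
```

Each returned tuple holds the first site position, the last site position, and the number of sites in the cluster. The greedy left-to-right grouping is one simple choice; a sliding window that reports every qualifying interval would be an equally reasonable alternative.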


2020
Author(s):  
Jiayue-Clara Jiang ◽  
Joseph Rothnagel ◽  
Kyle Upton

Abstract: Transposons, a type of repetitive DNA element, can contribute cis-regulatory sequences and regulate the expression of human genes. L1PA2 is a hominoid-specific subfamily of LINE1 transposons, with approximately 4,940 copies in the human genome. Individual transposons have been demonstrated to contribute specific biological functions, such as cancer-specific alternate promoter activity for the MET oncogene, which is correlated with enhanced malignancy and poor prognosis in cancer. Given the sequence similarity between L1PA2 elements, we hypothesise that transposons within the L1PA2 subfamily likely have a common regulatory potential and may provide a mechanism for global genome regulation. Here we show that in breast cancer, the regulatory potential of L1PA2 is not limited to single transposons, but is common within the subfamily. We demonstrate that the L1PA2 subfamily is an abundant reservoir of transcription factor binding sites, the majority of which cluster in the LINE1 5’UTR. In MCF7 breast cancer cells, over 27% of L1PA2 transposons harbour binding sites of functionally interacting, cancer-associated transcription factors. The ubiquitous and replicative nature of L1PA2 makes them an exemplary vector to disperse co-localised transcription factor binding sites, facilitating the co-ordinated regulation of genes. In MCF7 cells, L1PA2 transposons also supply transcription start sites to up-regulated transcripts. These transcriptionally active L1PA2 transposons display a cancer-specific active epigenetic profile, and likely play an oncogenic role in breast cancer aetiology. Overall, we show that the L1PA2 subfamily contributes abundant regulatory sequences in breast cancer cells, and likely plays a global role in modulating the tumorigenic state in breast cancer.


2009
Vol 2009
pp. 1-8
Author(s):  
K. Shameer ◽  
S. Ambika ◽  
Susan Mary Varghese ◽  
N. Karaba ◽  
M. Udayakumar ◽  
...  

Elucidating the key players in the molecular mechanisms that mediate complex stress responses in plant systems is an important step toward developing improved varieties of stress-tolerant crops. Understanding the effects of different types of biotic and abiotic stress is a rapidly emerging domain of plant research aimed at developing better, stress-tolerant plants. Information about the transcription factors, transcription factor binding sites, and functional annotation of proteins encoded by genes expressed during the abiotic stress response (for example, drought, cold, salinity, excess light, abscisic acid, and oxidative stress) will provide a better understanding of this phenomenon. STIFDB is a database of abiotic stress responsive genes and their predicted abiotic transcription factor binding sites in Arabidopsis thaliana. We integrated 2269 genes upregulated in different stress-related microarray experiments, surveyed their 1000 bp and 100 bp upstream regions and 5′UTR regions using the STIF algorithm, and identified putative abiotic stress responsive transcription factor binding sites, which are compiled in the STIFDB database. STIFDB provides extensive information about various stress responsive genes and stress inducible transcription factors of Arabidopsis thaliana. STIFDB will be a useful resource for researchers to understand the abiotic stress regulome and transcriptome of this important model plant system.
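Surveying a fixed-length upstream region, as described above, first requires extracting that region in the orientation of the gene. The following is a minimal sketch of that step only, and it is not the STIF algorithm: the function name `upstream_region`, the 0-based coordinate convention, and the toy sequence are assumptions made for illustration.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def upstream_region(chromosome, tss, strand, length):
    """Return up to `length` bases upstream of a transcription start site.

    `tss` is a 0-based index into `chromosome` (an assumed convention).
    For minus-strand genes the region lies at higher coordinates and is
    returned reverse-complemented, so it reads 5' to 3' relative to the gene.
    The region is truncated at the ends of the sequence.
    """
    if strand == "+":
        start = max(0, tss - length)
        return chromosome[start:tss]
    if strand == "-":
        end = min(len(chromosome), tss + 1 + length)
        return chromosome[tss + 1:end].translate(COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")


# Toy sequence; a real survey would use lengths such as 1000 or 100.
toy = "ACGTTGCAAT"
plus_upstream = upstream_region(toy, 4, "+", 3)
minus_upstream = upstream_region(toy, 4, "-", 3)
```

Returning the minus-strand region as a reverse complement means a downstream motif scanner can treat every extracted region the same way, regardless of which strand the gene is on.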


2008
Vol 294 (6)
pp. G1354-G1361
Author(s):  
Ramesh Kekuda ◽  
Prosenjit Saha ◽  
Uma Sundaram

In a rabbit model of chronic intestinal inflammation, we previously demonstrated that the activity of Na-glucose cotransporter (SGLT1), SLC5A1, is inhibited. This inhibition is secondary to a decrease in the number of cotransporters, indicating that the regulation of SGLT1 during chronic inflammation is at the level of transcription. However, the regulation of SGLT1 expression and the transcription factors involved in the regulation are not yet known. In this report, we describe the cloning and characterization of the rabbit SGLT1 promoter and the identification of transcription factors affected in villus cells during chronic intestinal inflammation. The promoter sequence for the SGLT1 gene was identified by using the publicly available rabbit genomic sequence. Even though the rabbit SGLT1 promoter did not have considerable overall homology with other mammalian SGLT1 promoters, two specificity protein 1 (Sp1) and a hepatocyte nuclear factor 1 (HNF1) binding sites were highly conserved among the species. Rabbit SGLT1 cDNA was encoded by 15 exons. Minimal promoter region determination showed that 196 nucleotides upstream of the transcription start site were sufficient for optimal promoter activity. This region encompassed two transcription factor binding sites, Sp1 and HNF1. For maximal SGLT1 promoter activity, these two transcription factor binding sites were essential, and their effect was synergistic, indicating that two separate regulatory pathways might be involved in their regulation. Using mobility shift assays, we further demonstrated that the binding of both Sp1 and HNF1 transcription factors to SGLT1 promoter regions was affected during chronic intestinal inflammation. Thus, this report demonstrates that Sp1 and HNF1 transcription factors act in concert to regulate SGLT1 transcription in the chronically inflamed intestine.


2004
Vol 128 (12)
pp. 1364-1371
Author(s):  
Ximbo Zhang ◽  
Frederick L. Kiechle

Abstract Context.—The pyrimidine nucleoside analog, cytosine arabinoside (Ara-C), is an effective therapeutic agent for acute leukemia. The phosphorylated triphosphate, cytosine arabinoside triphosphate, competes with deoxycytosine triphosphate as a substrate for incorporation into DNA. Once incorporated into DNA, it inhibits DNA polymerase and topoisomerase I and modifies the tertiary structure of DNA. Objective.—To determine if the substitution of Ara-C for cytosine in double-stranded oligonucleotides that contain 4 specific transcription factor binding sites (TATA, GATA, C/EBP, and AP-2α) alters transcription factor binding to their respective DNA binding elements. Design.—Transcription factors were obtained from nuclear extracts from human promyelocytic leukemia HL-60 cells. [32P]-end-labeled double-stranded oligonucleotides that contained 1 or 2 specific transcription factor binding sites with or without Ara-C substitution for cytosine were used to assess transcription factor binding by electrophoretic mobility shift assay. Results.—The substitution of Ara-C for cytosine within and outside the transcription factor binding element (AP-2α, C/EBP), outside the binding element only (GATA, TATA), or within the binding element only (AP-2α) all result in a reduction in transcription factor binding to their respective DNA binding element. Conclusion.—The reduction of the binding capacity of transcription factors with their respective DNA binding elements may depend on structural changes within oligonucleotides induced by Ara-C incorporation. This altered binding capacity of transcription factors to their DNA binding elements may represent one mechanism for Ara-C cytotoxicity secondary to inhibition of transcription of new messenger RNAs and, subsequently, translation of new proteins.


2015
Vol 2015
pp. 1-7
Author(s):  
Guohua Wang ◽  
Fang Wang ◽  
Qian Huang ◽  
Yu Li ◽  
Yunlong Liu ◽  
...  

Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. Transcription factor binding sites are short DNA sequences (5–20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and the prediction of their function continue to be challenging problems in computational biology. In this study, transcription factor binding sites in gene regulatory regions are identified by integrating DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HelaS3-ifnα4h cells (HeLaS3 cells treated with interferon for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors (IRF, IRF-2, IRF-9, IRF-1, IRF-3, and ICSBP) belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model are consistent across different selection criteria for transcription factor binding sites. Our model demonstrates the potential to computationally identify the functional transcription factors in gene regulation.
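Scanning a sequence with a position weight matrix, the basic operation behind the site identification described above, can be sketched as follows. This is a minimal illustration and not the authors' model: it does not use TRANSFAC matrices or DNase I data, and the count matrix, the helper names `log_odds_matrix` and `scan`, the pseudocount, the uniform background, and the score threshold are all assumptions.

```python
import math

# Hypothetical count matrix for a 4 bp motif with consensus GATA.
# One dict per motif position, giving the count of each base.
COUNTS = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = log_odds_matrix(COUNTS)
hits = scan("CCGATACC", pwm, threshold=4.0)
```

In practice, a scan like this would be restricted to accessible regions (for example, DNase I hypersensitive sites) and run on both strands; the sketch scans only the given strand to stay short.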

