Comparison of Promoter Sequences Based on Inter Motif Distance

Author(s):  
A. Meera ◽  
Lalitha Rangarajan

Understanding how the regulation of gene networks is orchestrated is an important challenge for characterizing complex biological processes. The DNA sequences that comprise promoters do not provide much direct information about regulation. A substantial part of the regulation results from the interaction of transcription factors (TFs) with specific cis-regulatory DNA sequences. These regulatory sequences are organized in a modular fashion, with each module (enhancer) containing one or more binding sites for a specific combination of TFs. In the present work, the authors investigate the inter-motif distances between the important motifs in the citrate synthase promoter sequences of different mammals, using a new distance measure to compare the promoter sequences. The results reveal greater similarity between organisms in the same chromosome.


2021 ◽  
Author(s):  
Timothy T. Harden ◽  
Ben J. Vincent ◽  
Angela H. DePace

SUMMARY: Most animal transcription factors are categorized as activators or repressors without specifying their mechanisms of action. Defining their specific roles is critical for deciphering the logic of transcriptional regulation and predicting the function of regulatory sequences. Here, we define the kinetic roles of three activating transcription factors in the Drosophila embryo (Zelda, Bicoid and Stat92E) by introducing their binding sites into the even skipped stripe 2 enhancer and measuring transcriptional output with live imaging. We find that these transcription factors act on different subsets of kinetic parameters, and these subsets can change over the course of nuclear cycle (NC) 14. These transcription factors all increase the fraction of active nuclei. Zelda dramatically shortens the time interval between the start of NC 14 and initial activation, and Stat92E increases the duration of active transcription intervals throughout NC 14. Zelda also decreases the time intervals between instances of active transcription early in NC 14, while Stat92E does so later. Different transcription factors therefore play distinct kinetic roles in activating transcription; this has consequences for understanding both regulatory DNA sequences and the biochemical function of transcription factors.


2021 ◽  
Author(s):  
Eeshit Dhaval Vaishnav ◽  
Carl G. de Boer ◽  
Moran Yassour ◽  
Jennifer Molinet ◽  
Lin Fan ◽  
...  

Mutations in non-coding cis-regulatory DNA sequences can alter gene expression, organismal phenotype, and fitness. Fitness landscapes, which map DNA sequence to organismal fitness, are a long-standing goal in biology, but have remained elusive because it is challenging to generalize accurately to the vast space of possible sequences using models built on measurements from a limited number of endogenous regulatory sequences. Here, we construct a sequence-to-expression model for such a landscape and use it to decipher principles of cis-regulatory evolution. Using tens of millions of randomly sampled promoter DNA sequences and their measured expression levels in the yeast Saccharomyces cerevisiae, we construct a deep transformer neural network model that generalizes with exceptional accuracy, and enables sequence design for gene expression engineering. Using our model, we predict and experimentally validate expression divergence under random genetic drift and strong selection weak mutation regimes, show that conflicting expression objectives in different environments constrain expression adaptation, and find that stabilizing selection on gene expression leads to the moderation of regulatory complexity. We present an approach for detecting selective constraint on gene expression using our model and natural sequence variation, and validate it using observed cis-regulatory diversity across 1,011 yeast strains, cross-species RNA-seq from three different clades, and measured expression-to-fitness curves. Finally, we develop a characterization of regulatory evolvability, use it to visualize fitness landscapes in two dimensions, discover evolvability archetypes, quantify the mutational robustness of individual sequences and highlight the mutational robustness of extant natural regulatory sequence populations. Our work provides a general framework that addresses key questions in the evolution of cis-regulatory sequences.


PLoS ONE ◽  
2019 ◽  
Vol 14 (6) ◽  
pp. e0218073 ◽  
Author(s):  
Rajiv Movva ◽  
Peyton Greenside ◽  
Georgi K. Marinov ◽  
Surag Nair ◽  
Avanti Shrikumar ◽  
...  

1987 ◽  
Vol 7 (5) ◽  
pp. 1807-1814 ◽  
Author(s):  
A B Chepelinsky ◽  
B Sommer ◽  
J Piatigorsky

Previous experiments have indicated that 5' flanking DNA sequences (nucleotides -366 to +46) are capable of regulating the lens-specific transcription of the murine alpha A-crystallin gene. Here we have analyzed these 5' regulatory sequences by transfecting explanted embryonic chicken lens epithelia with different alpha A-crystallin-CAT (chloramphenicol acetyltransferase) hybrid genes (alpha A-crystallin promoter sequences fused to the bacterial CAT gene in the pSVO-CAT expression vector). The results indicated the presence of a proximal (-88 to +46) and a distal (-111 to -88) domain which must interact for promoter function. Deletion experiments showed that the sequence between -88 and -60 was essential for function of the proximal domain in the explanted epithelia. A synthetic oligonucleotide containing the sequence between -111 and -84 activated the proximal domain when placed in either orientation 57 base pairs upstream from position -88 of the alpha A-crystallin-CAT hybrid gene.


1991 ◽  
Vol 96 (2) ◽  
pp. 162-167 ◽  
Author(s):  
Chuan-Kui Jiang ◽  
Howard S Epstein ◽  
Marjana Tomic ◽  
Irwin M Freedberg ◽  
Miroslav Blumenberg

2018 ◽  
Author(s):  
Rajiv Movva ◽  
Peyton Greenside ◽  
Georgi K. Marinov ◽  
Surag Nair ◽  
Avanti Shrikumar ◽  
...  

Abstract: The relationship between noncoding DNA sequence and gene expression is not well understood. Massively parallel reporter assays (MPRAs), which quantify the regulatory activity of large libraries of DNA sequences in parallel, are a powerful approach to characterize this relationship. We present MPRA-DragoNN, a convolutional neural network (CNN)-based framework to predict and interpret the regulatory activity of DNA sequences as measured by MPRAs. While our method is generally applicable to a variety of MPRA designs, here we trained our model on the Sharpr-MPRA dataset that measures the activity of ~500,000 constructs tiling 15,720 regulatory regions in human K562 and HepG2 cell lines. MPRA-DragoNN predictions were moderately correlated (Spearman ρ = 0.28) with measured activity and were within range of replicate concordance of the assay. State-of-the-art model interpretation methods revealed high-resolution predictive regulatory sequence features that overlapped transcription factor (TF) binding motifs. We used the model to investigate the cell type and chromatin state preferences of predictive TF motifs. We explored the ability of our model to predict the allelic effects of regulatory variants in an independent MPRA experiment and fine map putative functional SNPs in loci associated with lipid traits. Our results suggest that interpretable deep learning models trained on MPRA data have the potential to reveal meaningful patterns in regulatory DNA sequences and prioritize regulatory genetic variants, especially as larger, higher-quality datasets are produced.

