Probing 3′UTRs as modular regulators of gene expression

2021 ◽  
Author(s):  
Jamie Y Auxillos ◽  
Samuel J Haynes ◽  
Abhishek Jain ◽  
Clemence Alibert ◽  
Weronika Danecka ◽  
...  

Genes are commonly abstracted into a coding sequence and cis-regulatory elements (CREs), such as promoter and terminator regions, and short sequence motifs within these regions. Modern cloning techniques allow easy assembly of synthetic genetic constructs from discrete cis-regulatory modules. However, it is unclear how much the contributions of CREs to gene expression depend on other CREs in the host gene. Using budding yeast, we probe the extent of composability, or independent effects, of distinct CREs. We confirm that the quantitative effect of a terminator on gene expression depends on both promoter and coding sequence. We then explore whether individual cis-regulatory motifs within terminator regions display similar context dependence, focusing on putative regulatory motifs inferred using transcriptome-wide datasets of mRNA decay. We construct a library of diverse reporter genes, consisting of different combinations of motifs within various terminator contexts, paired with different promoters, to test the extent of composability. Our results show that the effect of a motif on RNA abundance depends both on its host terminator, and also on the associated promoter sequence. Consequently, this emphasises the need for improved motif inference algorithms that include both local and global context effects, which in turn could aid researchers in the accurate use of diverse CREs for the engineering of synthetic genetic constructs.

2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Lin Zhang ◽  
Zhiqiang Song ◽  
Fangfang Li ◽  
Xixi Li ◽  
Haikun Ji ◽  
...  

Abstract Background Drought stress is one of the major abiotic stresses that affects plant growth and productivity. The GAPCp genes play important roles in drought stress tolerance in multiple species. The aim of this experiment was to identify the core cis-regulatory elements that may respond to drought stress in the GAPCp2 and GAPCp3 promoter sequences. Results In this study, the promoters of GAPCp2 and GAPCp3 were cloned. The promoter activities were significantly improved under abiotic stress via regulation of Rluc reporter gene expression, while promoter sequence analysis indicated that these fragments were not almost identical. In transgenic Arabidopsis with the expression of the GUS reporter gene under the control of one of these promoters, the activities of GUS were strong in almost all tissues except the seeds, and the activities were induced after abiotic stress. The yeast one-hybrid system and EMSA demonstrated that TaMYB bound TaGAPCp2P/3P. By analyzing different 5′ deletion mutants of these promoters, it was determined that TaGAPCp2P (− 1312~ − 528) and TaGAPCp3P (− 2049~ − 610), including the MYB binding site, contained enhancer elements that increased gene expression levels under drought stress. We used an effector and a reporter to co-transform tobacco and found that TaMYB interacted with the specific MYB binding sites of TaGAPCp2P (− 1197~ − 635) and TaGAPCp3P (− 1456~ − 1144 and − 718~ − 610) in plant cells. Then, the Y1H system and EMSA assay demonstrated that these MYB binding sites in TaGAPCp2P (− 1135 and − 985) and TaGAPCp3P (− 1414 and − 665) were the target cis-elements of TaMYB. The deletion of the specific MYB binding sites in the promoter fragments significantly restrained the drought response, and these results confirmed that these MYB binding sites (AACTAAA/C) play vital roles in improving the transcription levels under drought stress. 
The results of qRT-PCR in wheat protoplasts transiently overexpressing TaMYB indicated that the expression of TaGAPCp2/3 induced by abiotic stress was upregulated by TaMYB. Conclusion The MYB binding sites (AACTAAA/C) in TaGAPCp2P/3P were identified as the key cis-elements for responding to drought stress and were bound by the transcription factor TaMYB.
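The abstract above reports a MYB binding site with the consensus AACTAAA/C. As a rough illustration of how such a short motif could be located in a promoter fragment, the following is a minimal sketch that scans both strands with a regular expression. The function names and the toy sequence are hypothetical and are not taken from the study; the only detail drawn from the abstract is the consensus itself, read here as AACTAA followed by A or C.

```python
import re

# Consensus from the abstract: AACTAAA/C, read as AACTAA followed by A or C.
# The lookahead lets overlapping occurrences be reported as well.
MYB_SITE = re.compile(r"(?=(AACTAA[AC]))")
MOTIF_LENGTH = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_myb_sites(seq: str):
    """Return sorted (start, strand, match) tuples.

    `start` is the 0-based forward-strand coordinate of the leftmost base
    covered by the site, whichever strand the motif is read from.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in MYB_SITE.finditer(seq)]
    for m in MYB_SITE.finditer(reverse_complement(seq)):
        # Map the reverse-strand start back to forward-strand coordinates.
        hits.append((n - (m.start() + MOTIF_LENGTH), "-", m.group(1)))
    return sorted(hits)


# Hypothetical toy sequence, not a real TaGAPCp promoter fragment.
toy_promoter = "GGAACTAAACCTTTAGTTGG"
print(find_myb_sites(toy_promoter))
```

A scan like this only flags candidate sites; the abstract's conclusions rest on experimental evidence (Y1H, EMSA, and deletion mutants), which a sequence match cannot replace.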


2021 ◽  
Author(s):  
Travis L La Fleur ◽  
Ayaan Hossain ◽  
Howard M Salis

Transcription rates are regulated by the interactions between RNA polymerase, sigma factor, and promoter DNA sequences in bacteria. However, it remains unclear how non-canonical sequence motifs collectively control transcription rates. Here, we combined massively parallel assays, biophysics, and machine learning to develop a 346-parameter model that predicts site-specific transcription initiation rates for any σ70 promoter sequence, validated across 17396 bacterial promoters with diverse sequences. We applied the model to predict genetic context effects, design σ70 promoters with desired transcription rates, and identify undesired promoters inside engineered genetic systems. The model provides a biophysical basis for understanding gene regulation in natural genetic systems and precise transcriptional control for engineering synthetic genetic systems.


2021 ◽  
Author(s):  
Paola Cornejo-Paramo ◽  
Kathrein E Roper ◽  
Sandie M Degnan ◽  
Bernard Degnan ◽  
Emily Wong

The chromatin environment plays a central role in regulating developmental gene expression in metazoans. Yet, the basal regulatory landscape of metazoan embryogenesis is unknown. Here, we generate chromatin accessibility profiles for six embryonic, plus larval and adult stages in the sponge Amphimedon queenslandica. These profiles are reproducible within stages, reflect histone modifications, and identify transcription factor (TF) binding sequence motifs predictive of cis-regulatory elements during embryogenesis in other metazoans but not the unicellular relative Capsaspora. Motif analysis of chromatin accessibility profiles across Amphimedon embryogenesis identifies three major developmental periods. As in bilaterian embryogenesis, early development in Amphimedon involves activating and repressive chromatin in regions both proximal and distal to transcription start sites. Transcriptionally repressive elements (silencers) are prominent during late embryogenesis and coincide with an increase in cis-regulatory regions harbouring metazoan TF binding motifs, and an increase in the expression of metazoan-specific genes. Changes in chromatin state and gene expression in Amphimedon suggest the conservation of distal enhancers, dynamically silenced chromatin, and TF-DNA binding specificity in animal embryogenesis.


2021 ◽  
Vol 22 (8) ◽  
pp. 3914
Author(s):  
Costantino Parisi ◽  
Shikha Vashisht ◽  
Cecilia Lanny Winata

Precise control of gene expression is crucial to ensure proper development and biological functioning of an organism. Enhancers are non-coding DNA elements which play an essential role in regulating gene expression. They contain specific sequence motifs serving as binding sites for transcription factors which interact with the basal transcription machinery at their target genes. Heart development is regulated by an intricate gene regulatory network that ensures a precise spatiotemporal gene expression program. Mutations affecting enhancers have been shown to result in devastating forms of congenital heart defect. Therefore, identifying enhancers implicated in heart biology and understanding their mechanism is key to improving diagnosis and therapeutic options. Despite their crucial role, enhancers are poorly studied, mainly due to the lack of a reliable way to identify them and determine their function. Nevertheless, recent technological advances have allowed rapid progress in enhancer discovery. Model organisms such as the zebrafish have contributed significant insights into the genetics of heart development by enabling functional analyses of genes and their regulatory elements in vivo. Here, we summarize the current state of knowledge on heart enhancers gained through studies in model organisms, discuss various approaches to discover and study their function, and finally suggest methods that could further advance research in this field.


2020 ◽  
Author(s):  
Baoxing Song ◽  
Hai Wang ◽  
Yaoyao Wu ◽  
Evan Rees ◽  
Daniel J Gates ◽  
...  

Abstract DNA sequencing technology has advanced so quickly that identifying key functional regions using evolutionary approaches is required to understand how those genomes work. This research develops a sensitive sequence alignment approach to identify functionally constrained non-coding sequences in the Andropogoneae tribe. The grass tribe Andropogoneae contains several crop species descended from a common ancestor ~18 million years ago. Despite broadly similar phenotypes, they have tremendous genomic diversity with a broad range of ploidy levels and transposons. These features make Andropogoneae a powerful system for studying conserved non-coding sequence (CNS); here we used it to understand the function of CNS in maize. We find that 86% of CNS comprise known genomic elements, e.g., cis-regulatory elements, chromosome interactions, introns, and several transposable element superfamilies, and are linked to genomic regions related to DNA replication initiation, DNA methylation and histone modification. In maize, we show that CNSs regulate gene expression, that variants in CNS are associated with phenotypic variance, and that rare CNS absence contributes to loss of gene expression. Furthermore, we find the evolution of CNS is associated with the functional diversification of duplicated genes in the context of the maize subgenomes. Our results provide a quantitative understanding of constrained non-coding elements and identify functional non-coding variation in maize.


Author(s):  
Peng He ◽  
Brian A. Williams ◽  
Diane Trout ◽  
Georgi K. Marinov ◽  
Henry Amrhein ◽  
...  

Abstract In mammalian embryogenesis, differential gene expression gradually builds the identity and complexity of each tissue and organ system. We systematically quantified mouse polyA-RNA from embryo day E10.5 to birth, sampling 17 whole tissues, enhanced with single-cell measurements for the developing limb. The resulting developmental transcriptome is globally structured by dynamic cytodifferentiation, body-axis and cell-proliferation gene sets, characterized by their promoters’ transcription factor (TF) motif codes. We decomposed the tissue-level transcriptome using scRNA-seq and found that neurogenesis and haematopoiesis dominate at both the gene and cellular levels, jointly accounting for 1/3 of differential gene expression and over 40% of identified cell types. Integrating promoter sequence motifs with companion ENCODE epigenomic profiles identified a promoter de-repression mechanism unique to neuronal expression clusters and attributable to known and novel repressors. Focusing on the developing limb, scRNA-seq identified 25 known and candidate novel cell types, including progenitor and differentiating states with computationally inferred lineage relationships. We extracted cell type TF networks and complementary sets of candidate enhancer elements by de-convolving whole-tissue IDEAS epigenome chromatin state models. These ENCODE reference data, computed network components and IDEAS chromatin segmentations, are companion resources to the matching epigenomic developmental matrix, available for researchers to further mine and integrate.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Karolina Stępniak ◽  
Magdalena A. Machnicka ◽  
Jakub Mieczkowski ◽  
Anna Macioszek ◽  
Bartosz Wojtaś ◽  
...  

Abstract Chromatin structure and accessibility, and combinatorial binding of transcription factors to regulatory elements in genomic DNA control transcription. Genetic variations in genes encoding histones, epigenetics-related enzymes or modifiers affect chromatin structure/dynamics and result in alterations in gene expression contributing to cancer development or progression. Gliomas are brain tumors frequently associated with epigenetics-related gene deregulation. We perform whole-genome mapping of chromatin accessibility, histone modifications, DNA methylation patterns and transcriptome analysis simultaneously in multiple tumor samples to unravel epigenetic dysfunctions driving gliomagenesis. Based on the results of the integrative analysis of the acquired profiles, we create an atlas of active enhancers and promoters in benign and malignant gliomas. We explore these elements and intersect with Hi-C data to uncover molecular mechanisms instructing gene expression in gliomas.


Author(s):  
Yanrong Ji ◽  
Zhihan Zhou ◽  
Han Liu ◽  
Ramana V Davuluri

Abstract Motivation Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationships, which previous informatics methods often fail to capture, especially in data-scarce scenarios. Results To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture a global and transferable understanding of genomic DNA sequences based on upstream and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory element prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformer model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationships within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that DNABERT pre-trained on the human genome can even be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned for many other sequence analysis tasks. Availability and implementation The source code and the pretrained and finetuned models for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). Supplementary information Supplementary data are available at Bioinformatics online.

