Massively parallel analysis of human 3′ UTRs reveals that AU-rich element length and registration predict mRNA destabilization

Author(s):  
David A Siegel ◽  
Olivier Le Tonqueze ◽  
Anne Biton ◽  
Noah Zaitlen ◽  
David J Erle

Abstract AU-rich elements (AREs) are 3′ UTR cis-regulatory elements that regulate the stability of mRNAs. Consensus ARE motifs have been determined, but little is known about how differences in 3′ UTR sequences that conform to these motifs affect their function. Here we use functional annotation of sequences from 3′ UTRs (fast-UTR), a massively parallel reporter assay (MPRA), to investigate the effects of 41,288 3′ UTR sequence fragments from 4,653 transcripts on gene expression and mRNA stability in Jurkat and Beas2B cells. Our analyses demonstrate that the length of an ARE and its registration (the first and last nucleotides of the repeating ARE motif) have significant effects on gene expression and stability. Based on this finding, we propose improved ARE classification and concomitant methods to categorize and predict the effect of AREs on gene expression and stability. Finally, to investigate the advantages of our general experimental design we examine other motifs including constitutive decay elements (CDEs), where we show that the length of the CDE stem-loop has a significant impact on steady-state expression and mRNA stability. We conclude that fast-UTR, in conjunction with our analytical approach, can produce improved yet simple sequence-based rules for predicting the activity of human 3′ UTRs.

Author(s):  
David A. Siegel ◽  
Olivier Le Tonqueze ◽  
Anne Biton ◽  
Noah Zaitlen ◽  
David J. Erle

Abstract AU-rich elements (AREs) are 3′ UTR cis-regulatory elements that regulate the stability of mRNAs. Consensus ARE motifs have been determined, but little is known about how differences in 3′ UTR sequences that conform to these motifs affect their function. Here we use functional annotation of sequences from 3′ UTRs (fast-UTR), a massively parallel reporter assay (MPRA), to investigate the effects of 41,288 3′ UTR sequence fragments from 4,653 transcripts on gene expression and mRNA stability. The library included 9,142 AREs, and incorporated a set of fragments bearing mutations in each ARE. Our analyses demonstrate that the length of an ARE and its registration (the first and last nucleotides of the repeating ARE motif) have significant effects on gene expression and stability. Based on this finding, we propose improved ARE classification and concomitant methods to categorize and predict the effect of AREs on gene expression and stability. Our new approach explains 64±13% of the contribution of AREs to the stability of human 3′ UTRs in Jurkat cells and predicts ARE activity in an unrelated cell type. Finally, to investigate the advantages of our general experimental design for annotating 3′ UTR elements we examine other motifs including constitutive decay elements (CDEs), where we show that the length of the CDE stem-loop has a significant impact on steady-state expression and mRNA stability. We conclude that fast-UTR, in conjunction with our analytical approach, can produce improved yet simple sequence-based rules for predicting the activity of human 3′ UTRs containing functional motifs.
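The two ARE features named in the abstract, length and registration, can be illustrated with a minimal Python sketch. It rests on an assumption that is not taken from the paper: an ARE is treated here as any stretch that reads as a substring of the endless repeat ...AUUUAUUUAUUU... and contains at least one AUUUA pentamer. The pattern and the function name `find_ares` are hypothetical and are not the authors' classification method.

```python
import re

# Assumption (not from the paper): an ARE-like run is up to three leading U's,
# one or more AUUU units, then a closing A with up to three trailing U's.
ARE_PATTERN = re.compile(r"U{0,3}(?:AUUU)+AU{0,3}")


def find_ares(sequence):
    """Return (start, length, registration) for each ARE-like run.

    Registration is reported as the (first, last) nucleotides of the run,
    following the abstract's description of the term.
    """
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for match in ARE_PATTERN.finditer(rna):
        run = match.group()
        hits.append((match.start(), len(run), (run[0], run[-1])))
    return hits


if __name__ == "__main__":
    for hit in find_ares("GGCATTTATTTATTTAGGC"):
        print(hit)
```

Grouping fragments by the returned length and registration would be one way to set up the kind of comparison the abstract describes, although the authors' actual categories may differ.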


2015 ◽  
Author(s):  
Ilias Georgakopoulos-Soares ◽  
Naman Jain ◽  
Jesse Gray ◽  
Martin Hemberg

DNA regulatory elements contain short motifs where transcription factors (TFs) can bind to modulate gene expression. Although the broad principles of TF regulation are well understood, the rules that dictate how combinatorial TF binding translates into transcriptional activity remain largely unknown. With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as Massively Parallel Reporter Assays (MPRAs) and similar methods remains challenging. We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the user to investigate the rules that govern TF occupancy. MPRA SNP design can be used to investigate the functional effects of single SNPs or combinations of SNPs at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs.
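The four negative-control operations listed for the Transmutation tool (scrambling, reversing, complementing, random mutation) can be sketched in a few lines of Python. This is an illustration of the operations only; the function names are hypothetical and do not reflect MPRAnator's actual interface, and input is assumed to be uppercase DNA over A, C, G and T.

```python
import random

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse(seq):
    """Reverse the sequence without complementing it."""
    return seq[::-1]


def complement(seq):
    """Complement each base without reversing the sequence."""
    return seq.translate(_COMPLEMENT)


def scramble(seq, rng):
    """Shuffle the bases, preserving base composition."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def mutate(seq, n_mutations, rng):
    """Substitute exactly n_mutations distinct positions with a different base."""
    if n_mutations > len(seq):
        raise ValueError("more mutations requested than positions available")
    letters = list(seq)
    for pos in rng.sample(range(len(seq)), n_mutations):
        letters[pos] = rng.choice([b for b in BASES if b != letters[pos]])
    return "".join(letters)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the controls are reproducible
    motif = "ACGTTGCAAC"
    print(reverse(motif), complement(motif), scramble(motif, rng), mutate(motif, 3, rng))
```

Passing an explicit `random.Random` instance keeps the generated controls reproducible, which matters when the same library design has to be regenerated later.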


2016 ◽  
Author(s):  
Avanthi Raghavan ◽  
Xiao Wang ◽  
Peter Rogov ◽  
Li Wang ◽  
Xiaolan Zhang ◽  
...  

Abstract Genome-wide association studies have identified a number of novel genetic loci linked to serum cholesterol and triglyceride levels. The causal DNA variants at these loci and the mechanisms by which they influence phenotype and disease risk remain largely unexplored. Expression quantitative trait locus analyses of patient liver and fat biopsies indicate that many lipid-associated variants influence gene expression in a cis-regulatory manner. However, linkage disequilibrium among neighboring SNPs at a genome-wide association study-implicated locus makes it challenging to pinpoint the actual variant underlying an association signal. We used a methodological framework for causal variant discovery that involves high-throughput identification of putative disease-causal loci through a functional reporter-based screen, the massively parallel reporter assay, followed by validation of prioritized variants in genome-edited human pluripotent stem cell models generated with CRISPR-Cas9. We complemented the stem cell models with CRISPR interference experiments in vitro and in knock-in mice in vivo. We provide validation for two high-priority SNPs, rs2277862 and rs10889356, being causal for lipid-associated expression quantitative trait loci. We also highlight the challenges inherent in modeling common genetic variation with these experimental approaches.

Author Summary Genome-wide association studies have identified numerous loci linked to a variety of clinical phenotypes. It remains a challenge to identify and validate the causal DNA variants in these loci. We describe the use of a high-throughput technique called the massively parallel reporter assay to analyze thousands of candidate causal DNA variants for their potential effects on gene expression. We use a combination of genome editing in human pluripotent stem cells, “CRISPR interference” experiments in other cultured human cell lines, and genetically modified mice to analyze the two highest-priority candidate DNA variants to emerge from the massively parallel reporter assay, and we confirm the relevance of the variants to nearby gene expression. These findings highlight a methodological framework with which to identify and functionally validate causal DNA variants.


2021 ◽  
Author(s):  
Nicolai von Kuegelgen ◽  
Samantha Mendonsa ◽  
Sayaka Dantsuji ◽  
Maya Ron ◽  
Marieluise Kirchner ◽  
...  

Cells adopt highly polarized shapes and form distinct subcellular compartments largely due to the localization of many mRNAs to specific areas, where they are translated into proteins with local functions. This mRNA localization is mediated by specific cis-regulatory elements in mRNAs, commonly called "zipcodes." Their recognition by RNA-binding proteins (RBPs) leads to the integration of the mRNAs into macromolecular complexes and their localization. While there are hundreds of localized mRNAs, only a few zipcodes have been characterized. Here, we describe a novel neuronal zipcode identification protocol (N-zip) that can identify zipcodes across hundreds of 3'UTRs. This approach combines a method of separating the principal subcellular compartments of neurons - cell bodies and neurites - with a massively parallel reporter assay. Our analysis identifies the let-7 binding site and (AU)n motif as de novo zipcodes in mouse primary cortical neurons and suggests a strategy for detecting many more.


Author(s):  
Hsin-Yen Larry Wu ◽  
Polly Yingshan Hsu

ABSTRACT Upstream ORFs (uORFs) are widespread cis-regulatory elements in the 5’ untranslated regions of eukaryotic genes. Translation of uORFs could negatively regulate protein synthesis by repressing main ORF (mORF) translation and by reducing mRNA stability presumably through nonsense-mediated decay (NMD). While the above expectations were supported in animals, they have not been extensively tested in plants. Using ribosome profiling, we systematically identified 2093 Actively Translated uORFs (ATuORFs) in Arabidopsis seedlings and examined their roles in gene expression regulation by integrating multiple genome-wide datasets. Compared with genes without uORFs, we found ATuORFs result in 38%, 14%, and 43% reductions in translation efficiency, mRNA stability, and protein levels, respectively. The effects of predicted but not actively translated uORFs are much weaker than those of ATuORFs. Interestingly, ATuORF-containing genes are also expressed at higher levels and encode longer proteins with conserved domains, features that are common in evolutionarily older genes. Moreover, we provide evidence that uORF translation in plants, unlike in vertebrates, generally does not trigger NMD. We found ATuORF-containing transcripts are degraded through 5’ to 3’ decay, while NMD targets are degraded through both 5’ to 3’ and 3’ to 5’ decay, suggesting uORF-associated mRNA decay and NMD have distinct genetic requirements. Furthermore, we showed ATuORFs and NMD repress translation through separate mechanisms. Our results reveal that the potent inhibitory effects of uORFs on mORF translation and mRNA stability in plants are independent of NMD, highlighting a fundamental difference in gene expression regulation by uORFs in the plant and animal kingdoms.


Circulation ◽  
2015 ◽  
Vol 132 (suppl_3) ◽  
Author(s):  
Nathan R Tucker ◽  
Jiangchuan Ye ◽  
Honghuang Lin ◽  
Michael A McLellan ◽  
Emelia J Benjamin ◽  
...  

Introduction: Genome-wide association studies have identified 14 independent loci for atrial fibrillation (AF). The 4q25 locus upstream of the left-right asymmetry gene PITX2 is, by far, the strongest association signal for AF. However, as with most GWAS loci, the functional variants are noncoding, presumed to be regulatory, and remain unknown. We therefore sought to rapidly identify the functional variants at an AF locus by combining high throughput sequencing and massively parallel reporter assays. Methods and Results: We sequenced a ~750kb region encompassing the PITX2 locus in 462 individuals with early-onset AF from the MGH AF Study and 464 referents from the Framingham Heart Study. The SNP most significantly associated with AF in our sequenced sample was rs2129983, which is 140kb from PITX2 (OR = 2.43, P = 8.9 × 10⁻¹⁶). rs2129983 is approximately 1.7kb from the most significantly associated SNP in a prior AF GWAS, rs6817105 (r² = 0.52). From the targeted sequencing analysis, we identified 262 SNVs with a MAF >0.5% within a genomic region bounded by SNPs with an r² greater than 0.4 with the top variant. To identify functional variants, we then utilized a massively parallel reporter assay (MPRA) in order to measure enhancer activity at each SNP across the entire AF locus. In both HL-1 and C2C12 myoblasts, MPRA identified many distinct SNP regions with differential enhancer activity. Using AF-association status as a standard, we were able to identify a series of variants that have both differential activity in either cell line tested and also a high level of association (rs17042076, rs4469143). Mechanistically, these functional SNPs are predicted to alter transcription factor binding. Conclusions: We have comprehensively identified the AF-associated variation at 4q25 and determined which of these variants are functional through differential enhancer activity. Here, in addition to identifying the causative variation for AF at 4q25, we provide a generalizable pathway for translating this work to other loci, a method that could expedite the identification of causative genetic variants at other disease loci.


Blood ◽  
2009 ◽  
Vol 114 (22) ◽  
pp. 4059-4059
Author(s):  
Osheiza Abdulmalik ◽  
J. Eric Russell

Abstract 4059, Poster Board III-994
Transgenic approaches to β thalassemia and sickle cell disease require viral vectors that express high levels of therapeutic β-like globin proteins. We recently proposed that the overall expression of these transgenes would likely be improved by structural modifications that prolong the cytoplasmic half-lives of their encoded mRNAs. Relevant experiments from our laboratory have previously linked the constitutively high stability of β-globin mRNA to a region of its 3'UTR that appears to interact with at least two distinct cytoplasmic mRNA-stabilizing factors, and is predicted to form an imperfect stem-loop (SL) structure. Based upon these findings, we conducted enzymatic secondary-structure mapping studies of the β-globin 3'UTR, unequivocally validating the existence of the predicted functional stem-loop element. We subsequently reasoned that the constitutive half-life of β-globin mRNA might be prolonged by the insertion of multiple SL motifs into its 3'UTR, resulting in increased levels of the mRNA, and its encoded β-globin product, in terminally differentiating erythroid cells. To test this hypothesis, we constructed full-length β-globin genes containing either wild-type 3'UTRs, or variant 3'UTRs that had been modified to contain either two or three tandem SL motifs. Each gene was identically linked to a tetracycline-suppressible promoter, permitting pulse-chase mRNA stability analyses to be conducted in vivo in intact cultured cells. Erythroid-phenotype K562 cells were transiently transfected with SL-variant and control wild-type β-globin genes, exposed to tetracycline, and levels of β-globin mRNA determined by qRT-PCR at defined intervals using tet-indifferent β-actin mRNA as internal control. Relative to wild-type β-globin mRNA, SL-duplicate β-globin mRNAs displayed a position-dependent two-fold increase in cytoplasmic half-life; SL-triplicate β-globin mRNAs did not exhibit any additional stability.
These experiments confirm the existence of a defined SL structure within the β-globin 3'UTR, and demonstrate that duplication of this motif can substantially increase the stability of β-globin mRNA. We subsequently designed a series of experiments to elucidate post-transcriptional processes involved in mRNA hyperstability. These studies required the construction of HeLa cells that stably express either wild-type β-globin mRNA (11 subclones) or SL-duplicate β-globin mRNAs (10 subclones). Preliminary analyses indicate an approximate 1.5-fold increase in the median steady-state expression of SL-duplicate genes, consistent with a prolongation in the half-life of its encoded mRNA. While formal mRNA stability studies are not yet complete, early data appear to replicate results from experiments conducted in transiently transfected cells. We have also initiated structural studies to link differences in the stability of SL-variant β-globin mRNA to alterations in its poly(A) tail. Using an RNase H-based strategy, we identified a previously unknown poly(A)-site heterogeneity, of undetermined significance, affecting both wild-type and SL-duplicate β-globin mRNAs. Finally, we recently isolated fifty-four K562 subclones expressing SL-duplicate or control β-globin mRNAs; parallel analyses of these cells will permit the cell-specificity of β-globin SL-directed mRNA stabilization to be investigated in detail. Results from each of these studies will be immediately applicable to the design of high-efficiency therapeutic transgenes for β thalassemia and sickle-cell disease. Disclosures: No relevant conflicts of interest to declare.
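A half-life comparison from a transcriptional shut-off time course of the kind described above can be sketched as follows. The abstract does not state how half-lives were calculated; this sketch assumes simple first-order decay fitted by least squares on log-transformed levels, and the function name `half_life` and the example numbers are hypothetical, not data from the study.

```python
import math


def half_life(times, levels):
    """Fit ln(level) = ln(level0) - k * t by least squares; return ln(2) / k.

    times: sampling times after transcription shut-off (any consistent unit).
    levels: normalized mRNA levels (e.g. target / internal control), all > 0.
    """
    if len(times) != len(levels) or len(times) < 2:
        raise ValueError("need at least two matching time points")
    logs = [math.log(v) for v in levels]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx  # equals -k for a decaying transcript
    if slope >= 0:
        raise ValueError("levels do not decay over the time course")
    return math.log(2) / -slope


if __name__ == "__main__":
    # Hypothetical normalized levels, for illustration only.
    control = half_life([0, 2, 4, 8], [1.0, 0.5, 0.25, 0.0625])
    variant = half_life([0, 4, 8], [1.0, 0.5, 0.25])
    print(control, variant, variant / control)
```

The ratio of the two fitted half-lives is the fold change in stability; real time courses would also need replicate measurements and a check that decay is in fact first-order.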


2021 ◽  
Author(s):  
Andrew R. Norman ◽  
Ann H. Ryu ◽  
Kirsty Jamieson ◽  
Sean Thomas ◽  
Yin Shen ◽  
...  

ABSTRACT Human accelerated regions (HARs) are sequences that have evolved at an accelerated rate in the human lineage. Some HARs are developmental enhancers. We used a massively parallel reporter assay (MPRA) to identify HARs with enhancer activity in a mammalian testis cell line. A subset of HARs exhibited differential activity between the human and chimpanzee orthologs, representing candidates for underlying unique human male reproductive biology. We further characterized one of these candidate testis enhancers, 2xHAR.238. CRISPR/Cas9-mediated deletion in a testis cell line and mice revealed that 2xHAR.238 enhances expression of Gli2, encoding a Hedgehog pathway effector, in testis Leydig cells. 4C-seq revealed that 2xHAR.238 contacts the Gli2 promoter, consistent with enhancer function. In adult male mice, deletion of 2xHAR.238 disrupted mouse male-typical behavior and male interest in female odor. Combined, our work identifies a HAR that promotes the expression of Gli2 in Leydig cells and may have contributed to the evolution of human male reproductive biology.


2017 ◽  
Author(s):  
Chinmay J Shukla ◽  
Alexandra L McCorkindale ◽  
Chiara Gerhardinger ◽  
Keegan D Korthauer ◽  
Moran N Cabili ◽  
...  

Summary One of the biggest surprises since the sequencing of the human genome has been the discovery of thousands of long noncoding RNAs (lncRNAs) [1–6]. Although lncRNAs and mRNAs are similar in many ways, they differ with lncRNAs being more nuclear-enriched and in several cases exclusively nuclear [7,8]. Yet, the RNA-based sequences that determine nuclear localization remain poorly understood [9–11]. Towards the goal of systematically dissecting the lncRNA sequences that impart nuclear localization, we developed a massively parallel reporter assay (MPRA). Unlike previous MPRAs [12–15] that determine motifs important for transcriptional regulation, we have modified this approach to identify sequences sufficient for RNA nuclear enrichment for 38 human lncRNAs. Using this approach, we identified 109 unique, conserved nuclear enrichment regions, originating from 29 distinct lncRNAs. We also discovered two shorter motifs within our nuclear enrichment regions. We further validated the sufficiency of several regions to impart nuclear localization by single molecule RNA fluorescence in situ hybridization (smRNA-FISH). Taken together, these results provide a first systematic insight into the sequence elements responsible for the nuclear enrichment of lncRNA molecules.

