Structure of the full SARS-CoV-2 RNA genome in infected cells

Author(s):  
Tammy C. T. Lan ◽  
Matthew F. Allan ◽  
Lauren E. Malsick ◽  
Stuti Khandwala ◽  
Sherry S. Y. Nyeo ◽  
...  

SUMMARY SARS-CoV-2 is a betacoronavirus with a single-stranded, positive-sense, 30-kilobase RNA genome responsible for the ongoing COVID-19 pandemic. Currently, there are no antiviral drugs or vaccines with proven efficacy, and development of these treatments is hampered by our limited understanding of the molecular and structural biology of the virus. Like many other RNA viruses, RNA structures in coronaviruses regulate gene expression and are crucial for viral replication. Although genome and transcriptome data were recently reported, there is to date little experimental data on predicted RNA structures in SARS-CoV-2 and most putative regulatory sequences are uncharacterized. Here we report the secondary structure of the entire SARS-CoV-2 genome in infected cells at single nucleotide resolution using dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq). Our results reveal previously undescribed structures within critical regulatory elements such as the genomic transcription-regulating sequences (TRSs). Contrary to previous studies, our in-cell data show that the structure of the frameshift element, which is a major drug target, is drastically different from prevailing in vitro models. The genomic structure detailed here lays the groundwork for coronavirus RNA biology and will guide the design of SARS-CoV-2 RNA-based therapeutics.

2021 ◽  
Author(s):  
Silvi Rouskin ◽  
Tammy Lan ◽  
Matthew Allan ◽  
Lauren Malsick ◽  
Stuti Khandwala ◽  
...  

Abstract SARS-CoV-2 is a betacoronavirus with a single-stranded, positive-sense, 30-kilobase RNA genome responsible for the ongoing COVID-19 pandemic. Currently, there are no antiviral drugs with proven efficacy, and development of these treatments is hampered by our limited understanding of the molecular and structural biology of the virus. Like many other RNA viruses, RNA structures in coronaviruses regulate gene expression and are crucial for viral replication. Although genome and transcriptome data were recently reported, there is to date little experimental data on native RNA structures in SARS-CoV-2 and most putative regulatory sequences are functionally uncharacterized. Here we report secondary structure ensembles of the entire SARS-CoV-2 genome in infected cells at single nucleotide resolution using dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq) and the algorithm ‘detection of RNA folding ensembles using expectation–maximization’ clustering (DREEM). Our results reveal previously undescribed alternative RNA conformations across the genome, including structures of the frameshift stimulating element (FSE), a major drug target, that are drastically different from prevailing in vitro population average models. Importantly, we find that this structural ensemble promotes frameshifting rates (~40%) similar to in vivo ribosome profiling studies and much higher than the canonical minimal FSE (~20%). Overall, our results highlight the value of studying RNA folding in its native, dynamic and cellular context. The genomic structures detailed here lay the groundwork for coronavirus RNA biology and will guide the design of SARS-CoV-2 RNA-based therapeutics.


1986 ◽  
Vol 6 (12) ◽  
pp. 4548-4557
Author(s):  
J Hirsh ◽  
B A Morgan ◽  
S B Scholnick

We delimited sequences necessary for in vivo expression of the Drosophila melanogaster dopa decarboxylase gene Ddc. The expression of in vitro-altered genes was assayed following germ line integration via P-element vectors. Sequences between -209 and -24 were necessary for normally regulated expression, although genes lacking these sequences could be expressed at 10 to 50% of wild-type levels at specific developmental times. These genes showed components of normal developmental expression, which suggests that they retain some regulatory elements. All Ddc genes lacking the normal immediate 5'-flanking sequences were grossly deficient in larval central nervous system expression. Thus, this upstream region must contain at least one element necessary for this expression. A mutated Ddc gene without a normal TATA boxlike sequence used the normal RNA start points, indicating that this sequence is not required for start point specificity.


2021 ◽  
Author(s):  
Giuliano Crispatzu ◽  
Rizwan Rehimi ◽  
Tomas Pachano ◽  
Tore Bleckwehl ◽  
Sara de la Cruz Molina ◽  
...  

Abstract Poised enhancers (PEs) represent a limited and genetically distinct set of distal regulatory elements that control the induction of developmental genes in a hierarchical and non-redundant manner. Before becoming activated in differentiating cells, PEs are already bookmarked in pluripotent cells with unique chromatin and topological features that could contribute to their privileged regulatory properties. However, since PEs were originally identified and subsequently characterized using embryonic stem cells (ESC) as an in vitro differentiation system, it is currently unknown whether PEs are functionally conserved in vivo. Here, we generate and mine various types of genomic data to show that the chromatin and 3D structural features of PEs are conserved among mouse pluripotent cells both in vitro and in vivo. We also uncovered that, in mouse pluripotent cells, the interactions between PEs and their bivalent target genes are globally controlled by the combined action of Polycomb, Trithorax and architectural proteins. Moreover, distal regulatory sequences located close to developmental genes and displaying the typical genetic (i.e. proximity to CpG islands) and chromatin (i.e. high accessibility and H3K27me3 levels) features of PEs are commonly found across vertebrates. These putative PEs show high sequence conservation, preferentially within specific vertebrate clades, with only a small subset being evolutionarily conserved across all vertebrates. Lastly, by genetically disrupting evolutionarily conserved PEs in mouse and chicken embryos, we demonstrate that these regulatory elements play essential and non-redundant roles during the induction of major developmental genes in vivo.


2016 ◽  
Author(s):  
Molly Gasperini ◽  
Gregory M. Findlay ◽  
Aaron McKenna ◽  
Jennifer H. Milbank ◽  
Choli Lee ◽  
...  

Abstract The extent to which distal non-coding mutations contribute to Mendelian disease remains a major unknown in human genetics. Given that a gene’s in vivo function can be appropriately modeled in vitro, CRISPR/Cas9 genome editing enables the large-scale perturbation of distal non-coding regions to identify functional elements in their native context. However, early attempts at such screens have relied on one individual guide RNA (gRNA) per cell, resulting in sparse mutagenesis with minimal redundancy across regions of interest. To address this, we developed a system that uses pairs of gRNAs to program thousands of kilobase-scale deletions that scan across a targeted region in a tiling fashion (“ScanDel”). As a proof-of-concept, we applied ScanDel to program 4,342 overlapping 1- and 2-kilobase (Kb) deletions that tile a 206 Kb region centered on HPRT1, the gene underlying Lesch-Nyhan syndrome, with median 27-fold redundancy per base. Programmed deletions were functionally assayed by selecting for loss of HPRT1 function with 6-thioguanine. HPRT1 exons served as positive controls, and all were successfully identified as functionally critical by the screen. Remarkably, HPRT1 function appeared robust to deletion of any intergenic or deeply intronic non-coding region across the 206 Kb locus, indicating that proximal regulatory sequences are sufficient for its expression. A sparser mutagenesis screen of the same 206 Kb with individual gRNAs also failed to identify critical distal regulatory elements. Although our screen did find programmed deletions and individual gRNAs with putative functional consequences that targeted exon-proximal non-coding sequences (e.g. the promoter), long-read sequencing revealed that this signal was driven almost entirely by rare, unexpected deletions that extended into exonic sequence. These targeted validation experiments defined a small region surrounding the transcriptional start site as the only non-coding sequence essential to HPRT1 function. Overall, our results suggest that distal regulatory elements are not critical for HPRT1 expression, and underscore the necessity of comprehensive edited-locus genotyping for validating the results of CRISPR screens. The application of ScanDel to additional loci will enable more insight into the extent to which the disruption of distal non-coding elements contributes to Mendelian diseases. In addition, dense, redundant, large-scale deletion scanning with gRNA pairs will facilitate a deeper understanding of endogenous gene regulation in the human genome.


Development ◽  
1996 ◽  
Vol 122 (2) ◽  
pp. 627-635 ◽  
Author(s):  
D.L. Song ◽  
G. Chalepakis ◽  
P. Gruss ◽  
A.L. Joyner

The temporally and spatially restricted expression of the mouse Engrailed (En) genes is essential for development of the midbrain and cerebellum. The regulation of En-2 expression was studied using in vitro protein-DNA binding assays and in vivo expression analysis in transgenic mice to gain insight into the genetic events that lead to regionalization of the developing brain. A minimum En-2 1.0 kb enhancer fragment was defined and found to contain multiple positive and negative regulatory elements that function in concert to establish the early embryonic mid-hindbrain expression. Furthermore, the mid-hindbrain regulatory sequences were shown to be structurally and functionally conserved in humans. The mouse paired-box-containing genes Pax-2, Pax-5 and Pax-8 show overlapping expression with the En genes in the developing brain. Significantly, two DNA-binding sites for Pax-2, Pax-5 and Pax-8 proteins were identified in the 1.0 kb En-2 regulatory sequences, and mutation of the binding sites disrupted initiation and maintenance of expression in transgenic mice. These results present strong molecular evidence that the Pax genes are direct upstream regulators of En-2 in the genetic cascade controlling mid-hindbrain development. These mouse studies, taken together with others in Drosophila and zebrafish on the role of Pax genes in controlling expression of En family members, indicate that a Pax-En genetic pathway has been conserved during evolution.


2020 ◽  
Vol 48 (22) ◽  
pp. 12436-12452 ◽  
Author(s):  
Ilaria Manfredonia ◽  
Chandran Nithin ◽  
Almudena Ponce-Salvatierra ◽  
Pritha Ghosh ◽  
Tomasz K Wirecki ◽  
...  

Abstract SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome, whose outbreak caused the ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally-conserved coronavirus structural RNA elements have been identified to date. Here, we performed RNA structure probing to obtain single-base resolution secondary structure maps of the full SARS-CoV-2 coronavirus genome both in vitro and in living infected cells. Probing data recapitulate the previously described coronavirus RNA elements (5′ UTR and s2m), and reveal new structures. Of these, ∼10.2% show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. Secondary structure-restrained 3D modeling of these segments further allowed for the identification of putative druggable pockets. In addition, we identify a set of single-stranded segments in vivo, showing high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics. Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Giuliano Crispatzu ◽  
Rizwan Rehimi ◽  
Tomas Pachano ◽  
Tore Bleckwehl ◽  
Sara Cruz-Molina ◽  
...  

Abstract Poised enhancers (PEs) represent a genetically distinct set of distal regulatory elements that control the expression of major developmental genes. Before becoming activated in differentiating cells, PEs are already bookmarked in pluripotent cells with unique chromatin and topological features that could contribute to their privileged regulatory properties. However, since PEs were originally characterized in embryonic stem cells (ESC), it is currently unknown whether PEs are functionally conserved in vivo. Here, we show that the chromatin and 3D structural features of PEs are conserved among mouse pluripotent cells both in vitro and in vivo. We also uncovered that the interactions between PEs and their target genes are globally controlled by the combined action of Polycomb, Trithorax and architectural proteins. Moreover, distal regulatory sequences located close to developmental genes and displaying the typical genetic (i.e. CpG islands) and chromatin (i.e. high accessibility and H3K27me3 levels) features of PEs are commonly found across vertebrates. These putative PEs show high sequence conservation within specific vertebrate clades, with only a few being evolutionarily conserved across all vertebrates. Lastly, by genetically disrupting PEs in mouse and chicken embryos, we demonstrate that these regulatory elements play essential roles during the induction of major developmental genes in vivo.


2017 ◽  
Author(s):  
Clarence Y. Cheng ◽  
Wipapat Kladwang ◽  
Joseph Yesselman ◽  
Rhiju Das

ABSTRACT Despite the critical roles RNA structures play in regulating gene expression, sequencing-based methods for experimentally determining RNA base pairs have remained inaccurate. Here, we describe a multidimensional chemical mapping method called M2-seq (mutate-and-map read out through next-generation sequencing) that takes advantage of sparsely mutated nucleotides to induce structural perturbations at partner nucleotides and then detects these events through dimethyl sulfate (DMS) probing and mutational profiling. In special cases, fortuitous errors introduced during DNA template preparation and RNA transcription are sufficient to give M2-seq helix signatures; these signals were previously overlooked or mistaken for correlated double DMS events. When mutations are enhanced through error-prone PCR, in vitro M2-seq experimentally resolves 33 of 68 helices in diverse structured RNAs including ribozyme domains, riboswitch aptamers, and viral RNA domains with a single false positive. These inferences do not require energy minimization algorithms and can be made by either direct visual inspection or by a new neural-net-inspired algorithm called M2-net. Measurements on the P4-P6 domain of the Tetrahymena group I ribozyme embedded in Xenopus egg extract demonstrate the ability of M2-seq to detect RNA helices in a complex biological environment.
SIGNIFICANCE STATEMENT The intricate structures of RNA molecules are crucial to their biological functions but have been difficult to accurately characterize. Multidimensional chemical mapping methods improve accuracy but have so far involved painstaking experiments and reliance on secondary structure prediction software. A methodology called M2-seq now lifts these limitations. Mechanistic studies clarify the origin of serendipitous M2-seq-like signals that were recently discovered but not correctly explained and also provide mutational strategies that enable robust M2-seq for new RNA transcripts. The method detects dozens of Watson-Crick helices across diverse RNA folds in vitro and within frog egg extract, with low false positive rate (< 5%). M2-seq opens a route to unbiased discovery of RNA structures in vitro and beyond.


PLoS Biology ◽  
2021 ◽  
Vol 19 (10) ◽  
pp. e3001425
Author(s):  
Amanda Jack ◽  
Luke S. Ferro ◽  
Michael J. Trnka ◽  
Eddie Wehri ◽  
Amrut Nadgir ◽  
...  

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection causes Coronavirus Disease 2019 (COVID-19), a pandemic that seriously threatens global health. SARS-CoV-2 propagates by packaging its RNA genome into membrane enclosures in host cells. The packaging of the viral genome into the nascent virion is mediated by the nucleocapsid (N) protein, but the underlying mechanism remains unclear. Here, we show that the N protein forms biomolecular condensates with viral genomic RNA both in vitro and in mammalian cells. While the N protein forms spherical assemblies with homopolymeric RNA substrates that do not form base pairing interactions, it forms asymmetric condensates with viral RNA strands. Cross-linking mass spectrometry (CLMS) identified a region that forms interactions between N proteins in condensates, and truncation of this region disrupts phase separation. We also identified small molecules that alter the formation of N protein condensates and inhibit the proliferation of SARS-CoV-2 in infected cells. These results suggest that the N protein may utilize biomolecular condensation to package the SARS-CoV-2 RNA genome into a viral particle.

