RNA structure inference through chemical mapping after accidental or intentional mutations

2017 ◽  
Author(s):  
Clarence Y. Cheng ◽  
Wipapat Kladwang ◽  
Joseph Yesselman ◽  
Rhiju Das

ABSTRACT Despite the critical roles RNA structures play in regulating gene expression, sequencing-based methods for experimentally determining RNA base pairs have remained inaccurate. Here, we describe a multidimensional chemical mapping method called M2-seq (mutate-and-map read out through next-generation sequencing) that takes advantage of sparsely mutated nucleotides to induce structural perturbations at partner nucleotides and then detects these events through dimethyl sulfate (DMS) probing and mutational profiling. In special cases, fortuitous errors introduced during DNA template preparation and RNA transcription are sufficient to give M2-seq helix signatures; these signals were previously overlooked or mistaken for correlated double DMS events. When mutations are enhanced through error-prone PCR, in vitro M2-seq experimentally resolves 33 of 68 helices in diverse structured RNAs including ribozyme domains, riboswitch aptamers, and viral RNA domains with a single false positive. These inferences do not require energy minimization algorithms and can be made by either direct visual inspection or by a new neural-net-inspired algorithm called M2-net. Measurements on the P4-P6 domain of the Tetrahymena group I ribozyme embedded in Xenopus egg extract demonstrate the ability of M2-seq to detect RNA helices in a complex biological environment.
SIGNIFICANCE STATEMENT The intricate structures of RNA molecules are crucial to their biological functions but have been difficult to accurately characterize. Multidimensional chemical mapping methods improve accuracy but have so far involved painstaking experiments and reliance on secondary structure prediction software. A methodology called M2-seq now lifts these limitations. Mechanistic studies clarify the origin of serendipitous M2-seq-like signals that were recently discovered but not correctly explained and also provide mutational strategies that enable robust M2-seq for new RNA transcripts. The method detects dozens of Watson-Crick helices across diverse RNA folds in vitro and within frog egg extract, with low false positive rate (< 5%). M2-seq opens a route to unbiased discovery of RNA structures in vitro and beyond.


2017 ◽  
Vol 114 (37) ◽  
pp. 9876-9881 ◽  
Author(s):  
Clarence Y. Cheng ◽  
Wipapat Kladwang ◽  
Joseph D. Yesselman ◽  
Rhiju Das

Despite the critical roles RNA structures play in regulating gene expression, sequencing-based methods for experimentally determining RNA base pairs have remained inaccurate. Here, we describe a multidimensional chemical-mapping method called “mutate-and-map read out through next-generation sequencing” (M2-seq) that takes advantage of sparsely mutated nucleotides to induce structural perturbations at partner nucleotides and then detects these events through dimethyl sulfate (DMS) probing and mutational profiling. In special cases, fortuitous errors introduced during DNA template preparation and RNA transcription are sufficient to give M2-seq helix signatures; these signals were previously overlooked or mistaken for correlated double-DMS events. When mutations are enhanced through error-prone PCR, in vitro M2-seq experimentally resolves 33 of 68 helices in diverse structured RNAs including ribozyme domains, riboswitch aptamers, and viral RNA domains with a single false positive. These inferences do not require energy minimization algorithms and can be made by either direct visual inspection or by a neural-network–inspired algorithm called M2-net. Measurements on the P4–P6 domain of the Tetrahymena group I ribozyme embedded in Xenopus egg extract demonstrate the ability of M2-seq to detect RNA helices in a complex biological environment.



2021 ◽  
Vol 17 (3) ◽  
pp. e1009345
Author(s):  
Susan M. Brewer ◽  
Christian Twittenhoff ◽  
Jens Kortmann ◽  
Sky W. Brubaker ◽  
Jared Honeycutt ◽  
...  

Sensing and responding to environmental signals is critical for bacterial pathogens to successfully infect and persist within hosts. Many bacterial pathogens sense temperature as an indication they have entered a new host and must alter their virulence factor expression to evade immune detection. Using secondary structure prediction, we identified an RNA thermosensor (RNAT) in the 5’ untranslated region (UTR) of tviA encoded by the typhoid fever-causing bacterium Salmonella enterica serovar Typhi (S. Typhi). Importantly, tviA is a transcriptional regulator of the critical virulence factors Vi capsule, flagellin, and type III secretion system-1 expression. By introducing point mutations to alter the mRNA secondary structure, we demonstrate that the 5’ UTR of tviA contains a functional RNAT using in vitro expression, structure probing, and ribosome binding methods. Mutational inhibition of the RNAT in S. Typhi causes aberrant virulence factor expression, leading to enhanced innate immune responses during infection. In conclusion, we show that S. Typhi regulates virulence factor expression through an RNAT in the 5’ UTR of tviA. Our findings demonstrate that limiting inflammation through RNAT-dependent regulation in response to host body temperature is important for S. Typhi’s “stealthy” pathogenesis.



2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jaswinder Singh ◽  
Jack Hanson ◽  
Kuldip Paliwal ◽  
Yaoqi Zhou

Abstract The majority of the human genome is transcribed into noncoding RNAs with unknown structures and functions. Obtaining functional clues for noncoding RNAs requires accurate base-pairing or secondary-structure prediction. However, the performance of such predictions by current folding-based algorithms has stagnated for more than a decade. Here, we propose the use of deep contextual learning for base-pair prediction, including noncanonical and non-nested (pseudoknot) base pairs stabilized by tertiary interactions. Since only <250 nonredundant, high-resolution RNA structures are available for model training, we utilize transfer learning from a model initially trained with a recent high-quality bpRNA dataset of >10,000 nonredundant RNAs made available through comparative analysis. The resulting method achieves large, statistically significant improvement in predicting all base pairs, noncanonical and non-nested base pairs in particular. The proposed method (SPOT-RNA), with a freely available server and standalone software, should be useful for improving RNA structure modeling, sequence alignment, and functional annotations.



2020 ◽  
Author(s):  
Tycho Marinus ◽  
Adam B. Fessler ◽  
Craig A. Ogle ◽  
Danny Incarnato

ABSTRACT Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art SHAPE reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.



Biologia ◽  
2014 ◽  
Vol 69 (5) ◽  
Author(s):  
Tingzhang Hu ◽  
Junnian Yang ◽  
Yongwei Yang ◽  
Yingmei Wu

Abstract Late embryogenesis abundant (LEA) proteins are closely associated with resistance to abiotic stresses. Here we characterized a rice LEA protein, OsLEA3-1, by bioinformatics analysis and heterologous expression in Escherichia coli. Bioinformatics analysis showed that OsLEA3-1 contains a 603-bp open reading frame encoding a putative polypeptide of 200 amino acids, which contains a “LEA_4” motif at positions 5–48 and belongs to the typical group 3 LEA proteins. The OsLEA3-1 polypeptide is rich in Ala, Lys, and Thr but depleted in Cys, Pro, and Trp residues, and is strongly hydrophilic. Secondary structure prediction showed that the OsLEA3-1 polypeptide contains an α-helical domain at positions 4–195 but no β-sheet domain. The OsLEA3-1 gene is expressed in the shoot and root of germinating seeds, seedlings, panicles, mature embryos, seeds, and callus, and is up-regulated by ultraviolet (UV) radiation, heat, cold, salt, and emergency drought. The OsLEA3-1 gene was introduced into E. coli, and a fusion protein of about 28.03 kDa was expressed in recombinant E. coli cells after induction with isopropylthio-β-D-galactoside. Compared with control E. coli cells harbouring pET30a, accumulation of the OsLEA3-1 fusion protein increased the tolerance of the recombinant E. coli to diverse abiotic stresses: high salinity, metal ions, hyperosmotic conditions, heat, and UV radiation. OsLEA3-1 also protected lactate dehydrogenase activity under heating, drying, and MnCl2 treatment in vitro. These findings suggest that the OsLEA3-1 gene may contribute to the ability of plants to adapt to stressful environments.



2021 ◽  
Author(s):  
Tycho Marinus ◽  
Adam B Fessler ◽  
Craig A Ogle ◽  
Danny Incarnato

Abstract Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.



2017 ◽  
Author(s):  
Andrew Watkins ◽  
Caleb Geniesse ◽  
Wipapat Kladwang ◽  
Paul Zakrevsky ◽  
Luc Jaeger ◽  
...  

Abstract Prediction of RNA structure from nucleotide sequence remains an unsolved grand challenge of biochemistry and requires distinct concepts from protein structure prediction. Despite extensive algorithmic development in recent years, modeling of noncanonical base pairs of new RNA structural motifs has not been achieved in blind challenges. We report herein a stepwise Monte Carlo (SWM) method with a unique add-and-delete move set that enables predictions of noncanonical base pairs of complex RNA structures. A benchmark of 82 diverse motifs establishes the method’s general ability to recover noncanonical pairs ab initio, including multistrand motifs that have been refractory to prior approaches. In a blind challenge, SWM models predicted nucleotide-resolution chemical mapping and compensatory mutagenesis experiments for three in vitro selected tetraloop/receptors with previously unsolved structures (C7.2, C7.10, and R1). As a final test, SWM blindly and correctly predicted all noncanonical pairs of a Zika virus double pseudoknot during a recent community-wide RNA-puzzle. Stepwise structure formation, as encoded in the SWM method, enables modeling of noncanonical RNA structure in a variety of previously intractable problems.



2021 ◽  
Author(s):  
Mehdi Saman Booy ◽  
Alexander Ilin ◽  
Pekka Orponen

Predicting the secondary, i.e. base-pairing structure of a folded RNA strand is an important problem in synthetic and computational biology. First-principle algorithmic approaches to this task are challenging because existing models of the folding process are inaccurate, and even if a perfect model existed, finding an optimal solution would be in general NP-complete. In this paper, we propose a simple, yet extremely effective data-driven approach. We represent RNA sequences in the form of three-dimensional tensors in which we encode possible relations between all pairs of bases in a given sequence. We then use a convolutional neural network to predict a two-dimensional map which represents the correct pairings between the bases. Our model achieves significant accuracy improvements over existing methods on two standard datasets. Our experiments show excellent performance of the model across a wide range of sequence lengths and RNA families. We also observe considerable improvements in predicting complex pseudoknotted RNA structures, as compared to previous approaches.
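
The pairwise tensor representation mentioned in this abstract can be illustrated with a small sketch. The abstract does not specify the encoding, so the channel layout below (one channel per ordered Watson-Crick or G-U wobble pair, plus one "cannot pair" channel) and the name `encode_pairs` are assumptions chosen for illustration, not the authors' scheme.

```python
# Hypothetical channel layout (an assumption, not taken from the paper):
# one channel per ordered base pair that can form, plus a "cannot pair" channel.
PAIR_CHANNELS = {
    ("A", "U"): 0, ("U", "A"): 1,
    ("G", "C"): 2, ("C", "G"): 3,
    ("G", "U"): 4, ("U", "G"): 5,
}
NO_PAIR_CHANNEL = 6
NUM_CHANNELS = 7


def encode_pairs(sequence):
    """Return an L x L x NUM_CHANNELS nested list, one-hot along the last axis.

    Entry [i][j] describes the relation between base i and base j.
    """
    seq = sequence.upper().replace("T", "U")  # accept DNA-style input
    n = len(seq)
    tensor = [[[0] * NUM_CHANNELS for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            channel = PAIR_CHANNELS.get((seq[i], seq[j]), NO_PAIR_CHANNEL)
            tensor[i][j][channel] = 1
    return tensor


if __name__ == "__main__":
    t = encode_pairs("GACU")
    print(len(t), len(t[0]), len(t[0][0]))  # tensor dimensions
    print(t[0][2])  # relation between G (index 0) and C (index 2)
```

A tensor of this shape is the kind of input a convolutional network could map to a two-dimensional pairing matrix; the real model's input channels may differ.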



2019 ◽  
Author(s):  
Kexin Zhang ◽  
Aaron T. Frank

ABSTRACT Inspired by methods that utilize chemical-mapping data to guide secondary structure prediction, we sought to develop a framework for using assigned chemical shift data to guide RNA secondary structure prediction. We first used machine learning to develop classifiers which predict the base-pairing status of individual residues in an RNA based on their assigned chemical shifts. Then, we used these base-pairing status predictions as restraints to guide RNA folding algorithms. Our results showed that we could recover the correct secondary folds for nearly all of the 108 RNAs in our dataset with remarkable accuracy. Finally, we assessed whether we could conditionally predict the structure of the model RNA, microRNA-20b (miR-20b), by folding it using folding restraints derived from chemical shifts associated with two distinct conformational states, one a free (apo) state and the other a protein-bound (holo) state. For this test, we found that by using folding restraints derived from chemical shifts, we could recover the two distinct structures of the miR-20b, confirming our ability to conditionally predict its secondary structure. A command-line tool for Chemical Shifts to Base-Pairing Status (CS2BPS) predictions in RNA has been incorporated into our CS2Structure Git repository and can be accessed via: https://github.com/atfrank/CS2Structure.



10.2196/25995 ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. e25995
Author(s):  
Emilio Mastriani ◽  
Alexey V Rakov ◽  
Shu-Lin Liu

Background: COVID-19, caused by the novel SARS-CoV-2, is considered the most threatening respiratory infection in the world, with over 40 million people infected and over 0.934 million related deaths reported worldwide. It is speculated that epidemiological and clinical features of COVID-19 may differ across countries or continents. Genomic comparison of 48,635 SARS-CoV-2 genomes has shown that the average number of mutations per sample was 7.23, and most SARS-CoV-2 strains belong to one of 3 clades characterized by geographic and genomic specificity: Europe, Asia, and North America.
Objective: The aim of this study was to compare the genomes of SARS-CoV-2 strains isolated from Italy, Sweden, and Congo, that is, 3 different countries in the same meridian (longitude) but with different climate conditions, and from Brazil (as an outgroup country), to analyze similarities or differences in patterns of possible evolutionary pressure signatures in their genomes.
Methods: We obtained data from the Global Initiative on Sharing All Influenza Data repository by sampling all genomes available on that date. Using HyPhy, we achieved the recombination analysis by genetic algorithm recombination detection method, trimming, removal of the stop codons, and phylogenetic tree and mixed effects model of evolution analyses. We also performed secondary structure prediction analysis for both sequences (mutated and wild-type) and “disorder” and “transmembrane” analyses of the protein. We analyzed both protein structures with an ab initio approach to predict their ontologies and 3D structures.
Results: Evolutionary analysis revealed that codon 9628 is under episodic selective pressure for all SARS-CoV-2 strains isolated from the 4 countries, suggesting it is a key site for virus evolution. Codon 9628 encodes the P0DTD3 (Y14_SARS2) uncharacterized protein 14. Further investigation showed that the codon mutation was responsible for helical modification in the secondary structure. The codon was positioned in the more ordered region of the gene (41-59) and near to the area acting as the transmembrane (54-67), suggesting its involvement in the attachment phase of the virus. The predicted protein structures of both wild-type and mutated P0DTD3 confirmed the importance of the codon to define the protein structure. Moreover, ontological analysis of the protein emphasized that the mutation enhances the binding probability.
Conclusions: Our results suggest that RNA secondary structure may be affected and, consequently, the protein product changes T (threonine) to G (glycine) in position 50 of the protein. This position is located close to the predicted transmembrane region. Mutation analysis revealed that the change from G (glycine) to D (aspartic acid) may confer a new function to the protein—binding activity, which in turn may be responsible for attaching the virus to human eukaryotic cells. These findings can help design in vitro experiments and possibly facilitate a vaccine design and successful antiviral strategies.


