scholarly journals Estimating the power of sequence covariation for detecting conserved RNA structure

2020 ◽  
Vol 36 (10) ◽  
pp. 3072-3076 ◽  
Author(s):  
Elena Rivas ◽  
Jody Clements ◽  
Sean R Eddy

Abstract Pairwise sequence covariations are a signal of conserved RNA secondary structure. We describe a method for distinguishing when lack of covariation signal can be taken as evidence against a conserved RNA structure, as opposed to when a sequence alignment merely has insufficient variation to detect covariations. We find that alignments for several long non-coding RNAs previously shown to lack covariation support do have adequate covariation detection power, providing additional evidence against their proposed conserved structures. Availability and implementation The R-scape web server is at eddylab.org/R-scape, with a link to download the source code. Supplementary information Supplementary data are available at Bioinformatics online.

2020 ◽  
Author(s):  
Masaki Tagashira

AbstractMotivationThe simultaneous consideration of sequence alignment and RNA secondary structure, or structural alignment, is known to help predict more accurate secondary structures of homologs. However, the consideration is heavy and can be done only roughly to decompose structural alignments.ResultsThe PhyloFold method, which predicts secondary structures of homologs considering likely pairwise structural alignments, was developed in this study. The method shows the best prediction accuracy while demanding comparable running time compared to conventional methods.AvailabilityThe source code of the programs implemented in this study is available on “https://github.com/heartsh/phylofold” and “https://github.com/heartsh/phyloalifold“.Contact“[email protected]”.Supplementary informationSupplementary data are available.


2019 ◽  
Author(s):  
Elena Rivas ◽  
Jody Clements ◽  
Sean R. Eddy

AbstractPairwise sequence covariations are a signal of conserved RNA secondary structure. We describe a method for distinguishing when lack of covariation signal can be taken as evidence against a conserved RNA structure, as opposed to when a sequence alignment merely has insufficient variation to detect covariations. We find that alignments for several long noncoding RNAs previously shown to lack covariation support do have adequate covariation detection power, providing additional evidence against their proposed conserved structures.


Author(s):  
Jaswinder Singh ◽  
Kuldip Paliwal ◽  
Tongchuan Zhang ◽  
Jaspreet Singh ◽  
Thomas Litfin ◽  
...  

Abstract Motivation The recent discovery of numerous non-coding RNAs (long non-coding RNAs, in particular) has transformed our perception about the roles of RNAs in living organisms. Our ability to understand them, however, is hampered by our inability to solve their secondary and tertiary structures in high resolution efficiently by existing experimental techniques. Computational prediction of RNA secondary structure, on the other hand, has received much-needed improvement, recently, through deep learning of a large approximate data, followed by transfer learning with gold-standard base-pairing structures from high-resolution 3-D structures. Here, we expand this single-sequence-based learning to the use of evolutionary profiles and mutational coupling. Results The new method allows large improvement not only in canonical base-pairs (RNA secondary structures) but more so in base-pairing associated with tertiary interactions such as pseudoknots, non-canonical and lone base-pairs. In particular, it is highly accurate for those RNAs of more than 1000 homologous sequences by achieving >0.8 F1-score (harmonic mean of sensitivity and precision) for 14/16 RNAs tested. The method can also significantly improve base-pairing prediction by incorporating artificial but functional homologous sequences generated from deep mutational scanning without any modification. The fully automatic method (publicly available as server and standalone software) should provide the scientific community a new powerful tool to capture not only the secondary structure but also tertiary base-pairing information for building three-dimensional models. It also highlights the future of accurately solving the base-pairing structure by using a large number of natural and/or artificial homologous sequences. Availability and implementation Standalone-version of SPOT-RNA2 is available at https://github.com/jaswindersingh2/SPOT-RNA2. Direct prediction can also be made at https://sparks-lab.org/server/spot-rna2/. The datasets used in this research can also be downloaded from the GITHUB and the webserver mentioned above. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Grace Meng ◽  
Marva Tariq ◽  
Swati Jain ◽  
Shereef Elmetwaly ◽  
Tamar Schlick

Abstract Summary We launch a webserver for RNA structure prediction and design corresponding to tools developed using our RNA-As-Graphs (RAG) approach. RAG uses coarse-grained tree graphs to represent RNA secondary structure, allowing the application of graph theory to analyze and advance RNA structure discovery. Our webserver consists of three modules: (a) RAG Sampler: samples tree graph topologies from an RNA secondary structure to predict corresponding tertiary topologies, (b) RAG Builder: builds three-dimensional atomic models from candidate graphs generated by RAG Sampler, and (c) RAG Designer: designs sequences that fold onto novel RNA motifs (described by tree graph topologies). Results analyses are performed for further assessment/selection. The Results page provides links to download results and indicates possible errors encountered. RAG-Web offers a user-friendly interface to utilize our RAG software suite to predict and design RNA structures and sequences. Availability and implementation The webserver is freely available online at: http://www.biomath.nyu.edu/ragtop/. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Blake A. Sweeney ◽  
David Hoksza ◽  
Eric P. Nawrocki ◽  
Carlos Eduardo Ribas ◽  
Fábio Madeira ◽  
...  

AbstractNon-coding RNAs (ncRNA) are essential for all life, and their functions often depend on their secondary (2D) and tertiary structure. Despite the abundance of software for the visualisation of ncRNAs, few automatically generate consistent and recognisable 2D layouts, which makes it challenging for users to construct, compare and analyse structures. Here, we present R2DT, a method for predicting and visualising a wide range of RNA structures in standardised layouts. R2DT is based on a library of 3,647 templates representing the majority of known structured RNAs. R2DT has been applied to ncRNA sequences from the RNAcentral database and produced >13 million diagrams, creating the world’s largest RNA 2D structure dataset. The software is amenable to community expansion, and is freely available at https://github.com/rnacentral/R2DT and a web server is found at https://rnacentral.org/r2dt.


Author(s):  
Pavel Beran ◽  
Dagmar Stehlíková ◽  
Stephen P Cohen ◽  
Vladislav Čurn

Abstract Summary Searching for amino acid or nucleic acid sequences unique to one organism may be challenging depending on size of the available datasets. K-mer elimination by cross-reference (KEC) allows users to quickly and easily find unique sequences by providing target and non-target sequences. Due to its speed, it can be used for datasets of genomic size and can be run on desktop or laptop computers with modest specifications. Availability and implementation KEC is freely available for non-commercial purposes. Source code and executable binary files compiled for Linux, Mac and Windows can be downloaded from https://github.com/berybox/KEC. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Tomasz Zok

Abstract Motivation Biomolecular structures come in multiple representations and diverse data formats. Their incompatibility with the requirements of data analysis programs significantly hinders the analytics and the creation of new structure-oriented bioinformatic tools. Therefore, the need for robust libraries of data processing functions is still growing. Results BioCommons is an open-source, Java library for structural bioinformatics. It contains many functions working with the 2D and 3D structures of biomolecules, with a particular emphasis on RNA. Availability and implementation The library is available in Maven Central Repository and its source code is hosted on GitHub: https://github.com/tzok/BioCommons Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Kengo Sato ◽  
Manato Akiyama ◽  
Yasubumi Sakakibara

RNA secondary structure prediction is one of the key technologies for revealing the essential roles of functional non-coding RNAs. Although machine learning-based rich-parametrized models have achieved extremely high performance in terms of prediction accuracy, the risk of overfitting for such models has been reported. In this work, we propose a new algorithm for predicting RNA secondary structures that uses deep learning with thermodynamic integration, thereby enabling robust predictions. Similar to our previous work, the folding scores, which are computed by a deep neural network, are integrated with traditional thermodynamic parameters to enable robust predictions. We also propose thermodynamic regularization for training our model without overfitting it to the training data. Our algorithm (MXfold2) achieved the most robust and accurate predictions in computational experiments designed for newly discovered non-coding RNAs, with significant 2–10 % improvements over our previous algorithm (MXfold) and standard algorithms for predicting RNA secondary structures in terms of F-value.


2020 ◽  
Author(s):  
David Heller ◽  
Martin Vingron

AbstractMotivationWith the availability of new sequencing technologies, the generation of haplotype-resolved genome assemblies up to chromosome scale has become feasible. These assemblies capture the complete genetic information of both parental haplotypes, increase structural variant (SV) calling sensitivity and enable direct genotyping and phasing of SVs. Yet, existing SV callers are designed for haploid genome assemblies only, do not support genotyping or detect only a limited set of SV classes.ResultsWe introduce our method SVIM-asm for the detection and genotyping of six common classes of SVs from haploid and diploid genome assemblies. Compared against the only other existing SV caller for diploid assemblies, DipCall, SVIM-asm detects more SV classes and reached higher F1 scores for the detection of insertions and deletions on two recently published assemblies of the HG002 individual.Availability and ImplementationSVIM-asm has been implemented in Python and can be easily installed via bioconda. Its source code is available at github.com/eldariont/[email protected] informationSupplementary data are available online.


Sign in / Sign up

Export Citation Format

Share Document