ASPRAlign: a tool for the alignment of RNA secondary structures with arbitrary pseudoknots

Abstract

Summary: Current methods for comparing RNA secondary structures are based on tree representations and exploit edit distance or alignment algorithms. Most of them can only process structures without pseudoknots. To overcome this limitation, we introduce ASPRAlign, a Java tool that aligns particular algebraic tree representations of RNA. These trees neglect the primary sequence and can handle structures with arbitrary pseudoknots. A measure of comparison, called ASPRA distance, is computed with a worst-case time complexity of O(n^2), where n is the number of nucleotides of the longer structure.

Availability and implementation: ASPRAlign is implemented in Java and source code is released under the GNU GPLv3 license. Code and documentation are freely available at https://github.com/bdslab/aspralign.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
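To make the term "pseudoknot" concrete, here is a minimal Python sketch that reads a structure in extended dot-bracket notation and reports crossing base pairs. It is an illustration only and is not ASPRAlign's algebraic tree construction or the ASPRA distance; the function names `parse_pairs`, `crossing_pairs` and `has_pseudoknot` are hypothetical, and the input is assumed to use only the bracket families `()`, `[]`, `{}` and `<>`.

```python
# Closing bracket -> opening bracket for each family of extended dot-bracket notation.
BRACKETS = {")": "(", "]": "[", "}": "{", ">": "<"}


def parse_pairs(structure):
    """Return the sorted list of base pairs (i, j), i < j, encoded by the string."""
    stacks = {opener: [] for opener in BRACKETS.values()}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:                      # opening bracket: remember its position
            stacks[ch].append(pos)
        elif ch in BRACKETS:                  # closing bracket: match the latest opener
            opener = BRACKETS[ch]
            if not stacks[opener]:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stacks[opener].pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character '{ch}' at position {pos}")
    if any(stacks.values()):
        raise ValueError("unclosed opening bracket")
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return all couples of base pairs (i, j), (k, l) with i < k < j < l."""
    crossings = []
    for a in range(len(pairs)):
        i, j = pairs[a]
        for b in range(a + 1, len(pairs)):
            k, l = pairs[b]
            # pairs are sorted by opening position, so i < k already holds
            if k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(structure):
    """True when at least two base pairs cross."""
    return bool(crossing_pairs(parse_pairs(structure)))
```

For example, `has_pseudoknot("((..))")` is `False` because the pairs are nested, while `has_pseudoknot("(.[.).]")` is `True` because pair (0, 4) crosses pair (2, 6). The pairwise scan is quadratic in the number of base pairs; this is unrelated to the O(n^2) bound the abstract states for the ASPRA distance.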

Comparison of Pseudoknotted RNA Secondary Structures by Topological Centroid Identification and Tree Edit Distance

Journal of Computational Biology ◽ 10.1089/cmb.2019.0512 ◽ 2020 ◽ Vol 27 (9) ◽ pp. 1443-1451

Author(s): Feiqi Wang ◽ Tatsuya Akutsu ◽ Tomoya Mori

Keyword(s): Edit Distance ◽ Secondary Structures ◽ Tree Edit Distance ◽ RNA Secondary Structures

BiORSEO: a bi-objective method to predict RNA secondary structures with pseudoknots using RNA 3D modules

Bioinformatics ◽ 10.1093/bioinformatics/btz962 ◽ 2020 ◽ Vol 36 (8) ◽ pp. 2451-2457

Author(s): Louis Becquey ◽ Eric Angel ◽ Fariza Tahi

Keyword(s): Structure Prediction ◽ Secondary Structure Prediction ◽ State Of The Art ◽ Secondary Structures ◽ Supplementary Information ◽ Large Set ◽ Objective Method ◽ RNA Secondary Structures ◽ Knowledge Based ◽ Module Size

Abstract

Motivation: RNA loops have been modelled and clustered from solved 3D structures into ordered collections of recurrent non-canonical interactions called ‘RNA modules’, available in databases. This work explores what information from such modules can be used to improve secondary structure prediction. We propose a bi-objective method for predicting RNA secondary structures by minimizing both an energy-based and a knowledge-based potential. The tool, called BiORSEO, outputs secondary structures corresponding to the optimal solutions from the Pareto set.

Results: We compare several approaches to predict secondary structures using inserted RNA modules information: two module data sources, Rna3Dmotif and the RNA 3D Motif Atlas, and different ways to score the module insertions: module size, module complexity or module probability according to models like JAR3D and BayesPairing. We benchmark them against a large set of known secondary structures, including some state-of-the-art tools, and comment on the usefulness of the half physics-based, half data-based approach.

Availability and implementation: The software is available for download on the EvryRNA website, as well as the datasets.

Supplementary information: Supplementary data are available at Bioinformatics online.
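The idea of a Pareto set for two minimized objectives can be sketched in a few lines of Python. This is only an illustration of Pareto dominance under assumed inputs: the function name `pareto_front`, the candidate labels and the numeric scores are hypothetical, and the sketch says nothing about how BiORSEO actually enumerates or scores structures.

```python
def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    candidates: list of (label, energy, knowledge) tuples, where both
    numeric objectives are to be minimized (assumed convention).
    A candidate is dominated when another one is at least as good on
    both objectives and strictly better on at least one.
    """
    front = []
    for label, energy, knowledge in candidates:
        dominated = any(
            (e2 <= energy and k2 <= knowledge) and (e2 < energy or k2 < knowledge)
            for _, e2, k2 in candidates
        )
        if not dominated:
            front.append((label, energy, knowledge))
    return front


# Hypothetical scores for four candidate structures.
example = [
    ("A", -10.0, 5.0),
    ("B", -8.0, 2.0),
    ("C", -7.0, 6.0),
    ("D", -10.0, 4.0),
]
front = pareto_front(example)
```

With these made-up values, `front` contains `B` and `D`: `A` is dominated by `D` (same energy, lower knowledge-based score) and `C` is dominated by `B`. Neither `B` nor `D` beats the other on both objectives, which is why a bi-objective method returns a set of solutions rather than a single one.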

PhyloFold: Precise and Swift Prediction of RNA Secondary Structures to Incorporate Phylogeny among Homologs

10.1101/2020.03.05.975797 ◽ 2020

Author(s): Masaki Tagashira

Keyword(s): Secondary Structure ◽ RNA Secondary Structure ◽ Prediction Accuracy ◽ Structural Alignment ◽ Source Code ◽ Secondary Structures ◽ Supplementary Information ◽ Supplementary Data ◽ Link Type ◽ Structural Alignments

Abstract

Motivation: The simultaneous consideration of sequence alignment and RNA secondary structure, or structural alignment, is known to help predict more accurate secondary structures of homologs. However, the consideration is heavy and can be done only roughly to decompose structural alignments.

Results: The PhyloFold method, which predicts secondary structures of homologs considering likely pairwise structural alignments, was developed in this study. The method shows the best prediction accuracy while demanding comparable running time compared to conventional methods.

Availability: The source code of the programs implemented in this study is available on “https://github.com/heartsh/phylofold” and “https://github.com/heartsh/phyloalifold”.

Contact: “[email protected]”.

Supplementary information: Supplementary data are available.

Dynamic Iterated Algorithm for RNA Pseudoknots Prediction

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.44-47.3365 ◽ 2010 ◽ Vol 44-47 ◽ pp. 3365-3369

Author(s): Heng Wu Li

Keyword(s): Time Complexity ◽ Secondary Structures ◽ The Other ◽ Matching Method ◽ RNA Secondary Structures ◽ Weighted Matching ◽ RNA Pseudoknots ◽ Individual Sequences ◽ Similar Accuracy ◽ Comparative Information

Pseudoknots have generally been excluded from the prediction of RNA secondary structures because they are difficult to model. Here we present an algorithm, dynamic iterated matching, that predicts RNA secondary structures including pseudoknots in O(n^4) time. The method can use thermodynamic information, comparative information, or both, and can therefore predict pseudoknots for both aligned and individual sequences. We have tested the algorithm on a number of RNA families. Comparisons show that our algorithm and the loop matching method have similar accuracy and time complexity, and are more sensitive than the maximum weighted matching method and the Rivas algorithm. Among the four methods, our algorithm has the best prediction specificity. The results show that our algorithm is more reliable and efficient than the other methods.

Classifying Conserved RNA Secondary Structures With Pseudoknots by Vector-Edit Distance

IEEE Access ◽ 10.1109/access.2021.3058263 ◽ 2021 ◽ Vol 9 ◽ pp. 32008-32018

Author(s): Liyu Huang ◽ Qingfeng Chen ◽ Yongjie Li ◽ Cheng Luo

Keyword(s): Edit Distance ◽ Secondary Structures ◽ RNA Secondary Structures

BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures

Bioinformatics ◽ 10.1093/bioinformatics/bty930 ◽ 2018 ◽ Vol 35 (13) ◽ pp. 2306-2308 ◽ Cited By ~ 2

Author(s): Jikai Zhang ◽ Haidong Lan ◽ Yuandong Chan ◽ Yuan Shang ◽ Bertil Schmidt ◽ ...

Keyword(s): Sequence Alignment ◽ Large Scale ◽ Edit Distance ◽ Pairwise Alignment ◽ Supplementary Information ◽ Xeon Phi ◽ Supplementary Data ◽ Alignment Algorithms ◽ Scoring Schemes ◽ Many Core

Abstract

Motivation: Modern bioinformatics tools for analyzing large-scale NGS datasets often need to include fast implementations of core sequence alignment algorithms in order to achieve reasonable execution times. We address this need by presenting the BGSA toolkit for optimized implementations of popular bit-parallel global pairwise alignment algorithms on modern microprocessors.

Results: BGSA outperforms Edlib, SeqAn and BitPAl for pairwise edit distance computations and Parasail, SeqAn and BitPAl when using more general scoring schemes for pairwise alignments of a batch of sequence reads on both standard multi-core CPUs and Xeon Phi many-core CPUs. Furthermore, banded edit distance performance of BGSA on a Xeon Phi-7210 outperforms the highly optimized NVBio implementation on a Titan X GPU for the seed verification stage of a read mapper by a factor of 4.4.

Availability and implementation: BGSA is open-source and available at https://github.com/sdu-hpcl/BGSA.

Supplementary information: Supplementary data are available at Bioinformatics online.
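As an illustration of what "bit-parallel" means for global edit distance, the following Python sketch uses the well-known bit-vector formulation due to Myers, in the global-distance variant described by Hyyrö. It is not BGSA's code: the abstract does not say which algorithms the toolkit implements, the function name `bit_parallel_edit_distance` is hypothetical, and Python's arbitrary-precision integers stand in for the machine words or SIMD registers an optimized implementation would use.

```python
def bit_parallel_edit_distance(pattern, text):
    """Global (Levenshtein) edit distance, one column of the DP matrix per step.

    Each DP column is encoded by two bit-vectors of vertical differences:
    pv marks rows where the value grows by 1 going down, mv rows where it
    shrinks by 1. Only the score of the last row is tracked explicitly.
    """
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1          # keeps every vector at exactly m bits
    top = 1 << (m - 1)           # bit of the last pattern row
    # peq[c] has bit i set when pattern[i] == c
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    pv, mv, score = mask, 0, m   # first column: D[i][0] = i
    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = mv | (~(xh | pv) & mask)    # horizontal +1 differences
        mh = pv & xh                     # horizontal -1 differences
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        # Shift in a 1: the top boundary row D[0][j] = j always grows by 1.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = mh | (~(xv | ph) & mask)
        mv = ph & xv
    return score
```

Calling `bit_parallel_edit_distance("kitten", "sitting")` returns 3, the same value as the classic quadratic dynamic program, but each text character costs a constant number of word operations per block of pattern rows instead of one cell update per row.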

New opportunities for designing effective small interfering RNAs

Scientific Reports ◽ 10.1038/s41598-019-52303-5 ◽ 2019 ◽ Vol 9 (1)

Author(s): James J. Valdés ◽ Andrew D. Miller

Keyword(s): Sequence Data ◽ Secondary Structures ◽ Design Guidelines ◽ Small Interfering RNAs ◽ Primary Sequence ◽ RNA Secondary Structures ◽ siRNA Design ◽ Adequate Dosage ◽ Primary Sequence Data

Abstract

Small interfering RNAs (siRNAs) that silence genes of infectious diseases are potentially potent drugs. A continuing obstacle for siRNA-based drugs is how to improve their efficacy for adequate dosage. To overcome this obstacle, the interactions of antiviral siRNAs, tested in vivo, were computationally examined within the RNA-induced silencing complex (RISC). Thermodynamics data show that a persistent RISC cofactor is significantly more exothermic for effective antiviral siRNAs than their ineffective counterparts. Detailed inspection of viral RNA secondary structures reveals that effective antiviral siRNAs target hairpin or pseudoknot loops. These structures are critical for initial RISC interactions since they partially lack intramolecular complementary base pairing. Importing two temporary RISC cofactors from magnesium-rich hairpins and/or pseudoknots then kickstarts full RNA hybridization and hydrolysis. Current siRNA design guidelines are based on RNA primary sequence data. Herein, the thermodynamics of RISC cofactors and targeting magnesium-rich RNA secondary structures provide additional guidelines for improving siRNA design.

A Robust Implementation for Three-Dimensional Delaunay Triangulations

International Journal of Computational Geometry & Applications ◽ 10.1142/s0218195998000138 ◽ 1998 ◽ Vol 08 (02) ◽ pp. 255-276 ◽ Cited By ~ 18

Author(s): Ernst P. Mücke

Keyword(s): Time Complexity ◽ Source Code ◽ Three Dimensional ◽ The Internet ◽ Perturbation Scheme ◽ Worst Case ◽ Delaunay Triangulations ◽ Exact Arithmetic ◽ Robust Implementation ◽ Geometric Primitives

This paper presents an implementation for Delaunay triangulations of three-dimensional point sets. The code uses a variant of the randomized incremental flip algorithm and employs symbolic perturbation to achieve robustness. The algorithm's theoretical time complexity is quadratic in n, the number of input points, and this is optimal in the worst case. However, empirical running times are proportional to the number of triangles in the final triangulation, which is typically linear in n. Even though the symbolic perturbation scheme relies on exact arithmetic, the resulting code is efficient in practice. This is due to a careful implementation of the geometric primitives and the arithmetic module. The source code is available on the Internet.

KEC: unique sequence search by K-mer exclusion

Bioinformatics ◽ 10.1093/bioinformatics/btab196 ◽ 2021

Author(s): Pavel Beran ◽ Dagmar Stehlíková ◽ Stephen P Cohen ◽ Vladislav Čurn

Keyword(s): Amino Acid ◽ Nucleic Acid ◽ Source Code ◽ Unique Sequence ◽ Supplementary Information ◽ Supplementary Data ◽ Laptop Computers ◽ Sequence Search ◽ Target Sequences ◽ Cross Reference

Abstract

Summary: Searching for amino acid or nucleic acid sequences unique to one organism may be challenging depending on the size of the available datasets. K-mer elimination by cross-reference (KEC) allows users to quickly and easily find unique sequences by providing target and non-target sequences. Due to its speed, it can be used for datasets of genomic size and can be run on desktop or laptop computers with modest specifications.

Availability and implementation: KEC is freely available for non-commercial purposes. Source code and executable binary files compiled for Linux, Mac and Windows can be downloaded from https://github.com/berybox/KEC.

Supplementary information: Supplementary data are available at Bioinformatics online.
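The general idea of excluding k-mers by cross-reference can be sketched with Python sets. This is a toy illustration under assumptions, not KEC's implementation: the function names `kmers` and `unique_kmers` are hypothetical, sequences are plain strings, and details a real tool would need (reverse complements, merging adjacent k-mers into longer regions, memory-efficient storage) are left out.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def unique_kmers(targets, non_targets, k):
    """Return k-mers that occur in some target sequence and in no non-target.

    targets, non_targets: iterables of sequence strings (assumed format).
    """
    target_set = set()
    for seq in targets:
        target_set |= kmers(seq, k)
    excluded = set()
    for seq in non_targets:
        excluded |= kmers(seq, k)
    # Cross-reference: drop every target k-mer also seen in a non-target.
    return target_set - excluded


# Toy example with made-up sequences.
result = unique_kmers(["ACGTAC"], ["CGTA"], k=3)
```

Here the target contains the 3-mers ACG, CGT, GTA and TAC, the non-target contains CGT and GTA, so `result` is `{"ACG", "TAC"}`.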

BioCommons: a robust java library for RNA structural bioinformatics

Bioinformatics ◽ 10.1093/bioinformatics/btab069 ◽ 2021

Author(s): Tomasz Zok

Keyword(s): Source Code ◽ Structural Bioinformatics ◽ Supplementary Information ◽ Supplementary Data ◽ Bioinformatic Tools ◽ Data Formats ◽ Central Repository ◽ Diverse Data ◽ 2D And 3D ◽ Java Library

Abstract

Motivation: Biomolecular structures come in multiple representations and diverse data formats. Their incompatibility with the requirements of data analysis programs significantly hinders the analytics and the creation of new structure-oriented bioinformatic tools. Therefore, the need for robust libraries of data processing functions is still growing.

Results: BioCommons is an open-source Java library for structural bioinformatics. It contains many functions working with the 2D and 3D structures of biomolecules, with a particular emphasis on RNA.

Availability and implementation: The library is available in Maven Central Repository and its source code is hosted on GitHub: https://github.com/tzok/BioCommons

Supplementary information: Supplementary data are available at Bioinformatics online.
