Predicting Consensus Structures for RNA Alignments via Pseudo-Energy Minimization

Algorithms that predict the secondary structure of a single RNA sequence often do so by free energy minimization, using thermodynamic models with free energy parameters. Although the results from these algorithms are promising, single-sequence methods achieve only moderate accuracy, and improving RNA secondary structure prediction requires additional information, such as covariance scores obtained from multiple sequence alignments. In this paper we present a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes sequence covariation into account and employs effective heuristics to improve accuracy. RSpredict accepts as input a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna-style Dot Bracket format and the Connectivity Table format. We compared our method with several widely used tools, including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets, including Rfam sequence alignments and a multiple sequence alignment obtained from our study of the Drosophila X chromosome, shows that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict.
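The two output formats named in the abstract carry the same information, and a minimal sketch may make the relationship concrete. The code below is not RSpredict's code; the function names and the header label "consensus" are hypothetical, and it assumes a pseudoknot-free structure written with only "(", ")" and ".", plus the common six-column Connectivity Table layout (index, base, previous index, next index, pairing partner or 0, index).

```python
def dot_bracket_to_pairs(structure: str) -> dict[int, int]:
    """Map each paired position (1-based) to its partner position."""
    stack: list[int] = []
    pairs: dict[int, int] = {}
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            partner = stack.pop()  # the most recent unmatched '('
            pairs[partner] = pos
            pairs[pos] = partner
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def to_connectivity_table(sequence: str, structure: str) -> list[str]:
    """Render a sequence and its dot-bracket structure as CT-style lines."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    pairs = dot_bracket_to_pairs(structure)
    n = len(sequence)
    lines = [f"{n} consensus"]  # header: length and a (hypothetical) title
    for i, base in enumerate(sequence, start=1):
        next_index = i + 1 if i < n else 0  # 0 marks the end of the chain
        lines.append(f"{i} {base} {i - 1} {next_index} {pairs.get(i, 0)} {i}")
    return lines


if __name__ == "__main__":
    for line in to_connectivity_table("GGAACC", "((..))"):
        print(line)
```

A stack is enough here because nested brackets cannot describe crossing pairs; representing pseudoknots would need additional bracket types and one stack per type.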

Integrating Protein Secondary Structure Prediction and Multiple Sequence Alignment

Current Protein and Peptide Science ◽

10.2174/1389203043379675 ◽

2004 ◽

Vol 5 (4) ◽

pp. 249-266 ◽

Cited By ~ 35

Author(s):

V. Simossis ◽

J. Heringa

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Multiple Sequence

A Novel Comparative Sequence Analysis Method for ncRNA Secondary Structure Prediction without Multiple Sequence Alignment

2008 Fourth International Conference on Natural Computation ◽

10.1109/icnc.2008.446 ◽

2008 ◽

Author(s):

Quan Zou ◽

Mao-Zu Guo ◽

Yang Liu ◽

Zhi-An Xing

Keyword(s):

Sequence Analysis ◽

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Comparative Sequence Analysis ◽

Analysis Method ◽

Multiple Sequence ◽

Comparative Sequence

Application of multiple sequence alignment profiles to improve protein secondary structure prediction

Proteins Structure Function and Bioinformatics ◽

10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q ◽

2000 ◽

Vol 40 (3) ◽

pp. 502-511 ◽

Cited By ~ 484

Author(s):

James A. Cuff ◽

Geoffrey J. Barton

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Multiple Sequence

The Limits of Protein Secondary Structure Prediction Accuracy from Multiple Sequence Alignment

Journal of Molecular Biology ◽

10.1006/jmbi.1993.1649 ◽

1993 ◽

Vol 234 (4) ◽

pp. 951-957 ◽

Cited By ~ 58

Author(s):

Robert B. Russell ◽

Geoffrey J. Barton

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Prediction Accuracy ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Multiple Sequence

Protein Multiple Sequence Alignment Benchmarking through Secondary Structure Prediction

Bioinformatics ◽

10.1093/bioinformatics/btw840 ◽

2017 ◽

pp. btw840 ◽

Cited By ~ 5

Author(s):

Quan Le ◽

Fabian Sievers ◽

Desmond G. Higgins

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Multiple Sequence ◽

Protein Multiple Sequence Alignment

Protein secondary structure prediction based on the GOR algorithm incorporating multiple sequence alignment information

Polymer ◽

10.1016/s0032-3861(01)00425-6 ◽

2002 ◽

Vol 43 (2) ◽

pp. 441-449 ◽

Cited By ~ 21

Author(s):

A. Kloczkowski ◽

K.-L. Ting ◽

R.L. Jernigan ◽

J. Garnier

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Multiple Sequence

Tailor-made multiple sequence alignments using the PRALINE 2 alignment toolkit

Bioinformatics ◽

10.1093/bioinformatics/btz572 ◽

2019 ◽

Vol 35 (24) ◽

pp. 5315-5317 ◽

Cited By ~ 1

Author(s):

Maurits J J Dijkstra ◽

Atze J van der Ploeg ◽

K Anton Feenstra ◽

Wan J Fokkink ◽

Sanne Abeln ◽

...

Keyword(s):

Secondary Structure ◽

Open Source ◽

Sequence Alignment ◽

Open Source Software ◽

Multiple Sequence Alignment ◽

Multiple Alignment ◽

Sequence Alignments ◽

Multiple Sequence ◽

Dna Motifs ◽

Multiple Sequence Alignments

Summary: PRALINE 2 is a toolkit for custom multiple sequence alignment workflows. It can be used to incorporate sequence annotations, such as secondary structure or (DNA) motifs, into the alignment scoring, as well as to customize many other aspects of a progressive multiple alignment workflow. Availability and implementation: PRALINE 2 is implemented in Python and available as open source software on GitHub: https://github.com/ibivu/PRALINE/.

Energy-Based RNA Consensus Secondary Structure Prediction in Multiple Sequence Alignments

Methods in Molecular Biology - RNA Sequence, Structure, and Function: Computational and Bioinformatic Methods ◽

10.1007/978-1-62703-709-9_7 ◽

2013 ◽

pp. 125-141

Author(s):

Stefan Washietl ◽

Stephan H. Bernhart ◽

Manolis Kellis

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Sequence Alignments ◽

Multiple Sequence ◽

Consensus Secondary Structure ◽

Multiple Sequence Alignments

Molecular homology and multiple-sequence alignment: an analysis of concepts and practice

Australian Systematic Botany ◽

10.1071/sb15001 ◽

2015 ◽

Vol 28 (1) ◽

pp. 46 ◽

Cited By ~ 20

Author(s):

David A. Morrison ◽

Matthew J. Morgan ◽

Scot A. Kelchner

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Molecular Data ◽

Simple Relationship ◽

Sequence Alignments ◽

Multiple Sequence ◽

Molecular Change ◽

Nucleotide Homology ◽

Tree Building ◽

Molecular Homology

Sequence alignment is just as much a part of phylogenetics as is tree building, although it is often viewed solely as a necessary tool to construct trees. However, alignment for the purpose of phylogenetic inference is primarily about homology, as it is the procedure that expresses homology relationships among the characters, rather than the historical relationships of the taxa. Molecular homology is rather vaguely defined and understood, despite its importance in the molecular age. Indeed, homology has rarely been evaluated with respect to nucleotide sequence alignments, in spite of the fact that nucleotides are the only data that directly represent genotype. All other molecular data represent phenotype, just as do morphology and anatomy. Thus, efforts to improve sequence alignment for phylogenetic purposes should involve a more refined use of the homology concept at a molecular level. To this end, we present examples of molecular-data levels at which homology might be considered, and arrange them in a hierarchy. The concept that we propose has many levels, which link directly to the developmental and morphological components of homology. Of note, there is no simple relationship between gene homology and nucleotide homology. We also propose terminology with which to better describe and discuss molecular homology at these levels. Our over-arching conceptual framework is then used to shed light on the multitude of automated procedures that have been created for multiple-sequence alignment. Sequence alignment needs to be based on aligning homologous nucleotides, without necessary reference to homology at any other level of the hierarchy. In particular, inference of nucleotide homology involves deriving a plausible scenario for molecular change among the set of sequences. 
Our clarifications should allow the development of a procedure that specifically addresses homology, which is required when performing alignment for phylogenetic purposes, but which does not yet exist.

Benchmarking Statistical Multiple Sequence Alignment

10.1101/304659 ◽

2018 ◽

Cited By ~ 1

Author(s):

Michael Nute ◽

Ehsan Saleh ◽

Tandy Warnow

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structural Alignment ◽

Estimation Method ◽

Simulated Data ◽

Protein Sequences ◽

Data Sets ◽

Sequence Alignments ◽

Multiple Sequence ◽

Simulated Data Sets

Abstract: The estimation of multiple sequence alignments of protein sequences is a basic step in many bioinformatics pipelines, including protein structure prediction, protein family identification, and phylogeny estimation. Statistical co-estimation of alignments and trees under stochastic models of sequence evolution has long been considered the most rigorous technique for estimating alignments and trees, but little is known about the accuracy of such methods on biological benchmarks. We report the results of an extensive study evaluating the most popular protein alignment methods as well as the statistical co-estimation method BAli-Phy on 1192 protein data sets from established benchmarks as well as on 120 simulated data sets. Our study (which used more than 230 CPU years for the BAli-Phy analyses alone) shows that BAli-Phy is dramatically more accurate than the other alignment methods on the simulated data sets, but is among the least accurate on the biological benchmarks. There are several potential causes for this discordance, including model misspecification, errors in the reference alignments, and conflicts between structural alignment and evolutionary alignments; future research is needed to understand the most likely explanation for our observations.

Keywords: multiple sequence alignment, BAli-Phy, protein sequences, structural alignment, homology
