Secondary Structure Prediction for RNA Sequences Including N6-methyladenosine

Abstract

There is increasing interest in the roles played by covalently modified nucleotides in mRNAs and non-coding RNAs. New high-throughput sequencing technologies localize these modifications to exact nucleotide positions. There has been, however, an inability to account for these modifications in secondary structure prediction because of a lack of software tools for handling modifications and a lack of thermodynamic parameters for modifications. Here, we report that we solved these issues for N6-methyladenosine (m6A), for the first time allowing secondary structure prediction for a nucleotide alphabet of A, C, G, U, and m6A. We revised the RNAstructure software package to work with any user-defined alphabet of nucleotides. We also developed a set of nearest neighbor parameters for helices and loops containing m6A, using a set of 45 optical melting experiments. Interestingly, N6-methylation decreases the folding stability of structures with adenosines in the middle of a helix, has little effect on the folding stability of adenosines at the ends of helices, and increases the folding stability of structures with unpaired adenosines stacked on the end of a helix. The parameters were tested against an additional two melting experiments, including a consensus sequence for methylation and an m6A dangling end. The utility of the new software was tested using predictions of the structure of a molecular switch in the MALAT1 lncRNA, for which a conformational change is triggered by methylation. Additionally, human transcriptome-wide calculations for the effect of N6-methylation on the probability of an adenosine being buried in a helix compare favorably with PARS structure mapping data. Now users of RNAstructure are able to develop hypotheses for structure-function relationships for RNAs with m6A, including conformational switching triggered by methylation.
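
As a rough illustration of what a nearest neighbor model over an extended nucleotide alphabet involves, the sketch below sums stacking free energies over adjacent base pairs in a helix. Everything in it is an assumption made for illustration: the symbol `M` for m6A, the function name `helix_energy`, and the energy values, which are placeholders and not the parameters reported in this work or shipped with RNAstructure.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]

# Hypothetical alphabet: the four standard nucleotides plus "M" for m6A.
ALPHABET = {"A", "C", "G", "U", "M"}

# Placeholder stacking energies in kcal/mol (NOT measured values).
# Key: (base pair, next base pair along the helix).
STACK: Dict[Tuple[Pair, Pair], float] = {
    (("G", "C"), ("A", "U")): -2.0,
    (("A", "U"), ("C", "G")): -2.0,
    (("G", "C"), ("M", "U")): -1.5,
    (("M", "U"), ("C", "G")): -1.5,
}


def helix_energy(pairs: List[Pair],
                 table: Dict[Tuple[Pair, Pair], float] = STACK) -> float:
    """Sum the stacking terms for each pair of adjacent base pairs."""
    for x, y in pairs:
        if x not in ALPHABET or y not in ALPHABET:
            raise ValueError(f"nucleotide outside alphabet: {x}-{y}")
    return sum(table[(pairs[i], pairs[i + 1])] for i in range(len(pairs) - 1))


# Same helix with and without a methylated adenosine in the middle.
unmodified = helix_energy([("G", "C"), ("A", "U"), ("C", "G")])
methylated = helix_energy([("G", "C"), ("M", "U"), ("C", "G")])
print(unmodified, methylated)
```

With these placeholder numbers the methylated helix comes out less stable, which only mirrors the qualitative trend described for adenosines in the middle of a helix. A real calculation would also need initiation, terminal, and loop terms.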

A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

RNA, 2011, Vol 18 (2), pp. 193-212

DOI: 10.1261/rna.030049.111

Cited By ~ 50

Author(s): E. Rivas, R. Lang, S. R. Eddy

Keyword(s): Secondary Structure, Structure Prediction, RNA Secondary Structure, Probabilistic Models, Nearest Neighbor, Secondary Structure Prediction, RNA Secondary Structure Prediction

Evaluation of the information content of RNA structure mapping data for secondary structure prediction

RNA, 2010, Vol 16 (6), pp. 1108-1117

DOI: 10.1261/rna.1988510

Cited By ~ 39

Author(s): S. Quarrier, J. S. Martin, L. Davis-Neulander, A. Beauregard, A. Laederach

Keyword(s): Secondary Structure, Information Content, RNA Structure, Structure Prediction, Secondary Structure Prediction, Structure Mapping, Mapping Data, RNA Structure Mapping

PROTEIN SECONDARY STRUCTURE PREDICTION USING NMR CHEMICAL SHIFT DATA

Journal of Bioinformatics and Computational Biology, 2010, Vol 08 (05), pp. 867-884

DOI: 10.1142/s0219720010004987

Cited By ~ 17

Author(s): YUZHONG ZHAO, BABAK ALIPANAHI, SHUAI CHENG LI, MING LI

Keyword(s): Chemical Shift, Secondary Structure, Structure Prediction, Nearest Neighbor, Tertiary Structure, Secondary Structure Prediction, Accurate Determination, Protein Secondary Structure, Sequence Information, K Nearest Neighbor

Accurate determination of protein secondary structure from chemical shift information is a key step for NMR tertiary structure determination. Relatively little work has been done on this subject. There needs to be a systematic investigation of algorithms that are (a) robust for large datasets; (b) easily extendable to (the dynamic) new databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm for the basic prediction and the BCJR algorithm to smooth the predictions and to combine predictions based on chemical shifts with those based on sequence information only. Our new system, SUCCES, improves on the accuracy of all existing methods on a large dataset of 805 proteins (at 86% Q3 accuracy and at 92.6% accuracy when the boundary residues are ignored), and it is easily extendable to any new dataset without requiring any new training. The software is publicly available at .
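
For readers unfamiliar with the basic k-nearest neighbor step mentioned above, here is a minimal sketch of majority-vote classification over per-residue feature vectors. It is not the SUCCES implementation: the function name `knn_predict`, the two-dimensional features, and the training values are all hypothetical, and the BCJR smoothing stage is not covered.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# Each training example: (feature vector, secondary structure label).
# Labels follow the usual three-state convention: H (helix), E (strand), C (coil).
Example = Tuple[Sequence[float], str]


def knn_predict(query: Sequence[float], training: List[Example], k: int = 3) -> str:
    """Return the majority label among the k training points closest to query."""
    nearest = sorted(training, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up feature vectors standing in for chemical-shift-derived features.
training_set: List[Example] = [
    ((0.0, 0.0), "H"),
    ((0.1, 0.2), "H"),
    ((1.0, 1.0), "E"),
    ((1.1, 0.9), "E"),
    ((5.0, 5.0), "C"),
]

print(knn_predict((0.05, 0.1), training_set))
print(knn_predict((1.0, 0.95), training_set))
```

Because the prediction is computed directly from stored examples, adding new database entries only means appending to `training_set`, which is one way to read the claim that the approach extends to new datasets without retraining.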

A fast and efficient nearest neighbor method for protein secondary structure prediction

2011 3rd International Conference on Advanced Computer Control, 2011

DOI: 10.1109/icacc.2011.6016402

Cited By ~ 1

Author(s): Wei Yang, Kuanquan Wang, Wangmeng Zuo

Keyword(s): Secondary Structure, Structure Prediction, Nearest Neighbor, Secondary Structure Prediction, Protein Secondary Structure, Protein Secondary Structure Prediction

Quantitative high-throughput tests of ubiquitous RNA secondary structure prediction algorithms via RNA/protein binding

2019

DOI: 10.1101/571588

Cited By ~ 2

Author(s): Winston R. Becker, Inga Jarmoskaite, Kalli Kappel, Pavanapuresan P. Vaidyanathan, Sarah K. Denny, ...

Keyword(s): Secondary Structure, Protein Binding, RNA Structure, Structure Prediction, RNA Secondary Structure, Nearest Neighbor, Secondary Structure Prediction, Structural Features, Vast Number, Prediction Algorithms

Abstract

Nearest-neighbor (NN) rules provide a simple and powerful quantitative framework for RNA structure prediction that is strongly supported for canonical Watson-Crick duplexes from a plethora of thermodynamic measurements. Predictions of RNA secondary structure based on nearest-neighbor (NN) rules are routinely used to understand biological function and to engineer and control new functions in biotechnology. However, NN applications to RNA structural features such as internal and terminal loops rely on approximations and assumptions, with sparse experimental coverage of the vast number of possible sequence and structural features. To test to what extent NN rules accurately predict thermodynamic stabilities across RNAs with non-WC features, we tested their predictions using a quantitative high-throughput assay platform, RNA-MaP. Using a thermodynamic assay with coupled protein binding, we carried out equilibrium measurements for over 1000 RNAs with a range of predicted secondary structure stabilities. Our results revealed substantial scatter and systematic deviations between NN predictions and observed stabilities. Solution salt effects and incorrect or omitted loop parameters contribute to these observed deviations. Our results demonstrate the need to independently and quantitatively test NN computational algorithms to identify their capabilities and limitations. RNA-MaP and related approaches can be used to test computational predictions and can be adapted to obtain experimental data to improve RNA secondary structure and other prediction algorithms.

Significance statement

RNA secondary structure prediction algorithms are routinely used to understand, predict and design functional RNA structures in biology and biotechnology. Given the vast number of RNA sequence and structural features, these predictions rely on a series of approximations, and independent tests are needed to quantitatively evaluate the accuracy of predicted RNA structural stabilities. Here we measure the stabilities of over 1000 RNA constructs by using a coupled protein binding assay. Our results reveal substantial deviations from the RNA stabilities predicted by popular algorithms, and identify factors contributing to the observed deviations. We demonstrate the importance of quantitative, experimental tests of computational RNA structure predictions and present an approach that can be used to routinely test and improve the prediction accuracy.

PROFILES AND FUZZY K-NEAREST NEIGHBOR ALGORITHM FOR PROTEIN SECONDARY STRUCTURE PREDICTION

Proceedings of the 3rd Asia-Pacific Bioinformatics Conference, 2005

DOI: 10.1142/9781860947322_0009

Cited By ~ 6

Author(s): RAJKUMAR BONDUGULA, OGNEN DUZLEVSKI, DONG XU

Keyword(s): Secondary Structure, Structure Prediction, Nearest Neighbor, Secondary Structure Prediction, Protein Secondary Structure, K Nearest Neighbor, Protein Secondary Structure Prediction, Nearest Neighbor Algorithm, K Nearest Neighbor Algorithm

The ‘30K’ superfamily of viral movement proteins

Microbiology, 2000, Vol 81 (1), pp. 257-266

DOI: 10.1099/0022-1317-81-1-257

Cited By ~ 206

Author(s): Ulrich Melcher

Keyword(s): Secondary Structure, Structure Prediction, Secondary Structure Prediction, Consensus Sequence, Amino Acid Sequences, Movement Proteins, Consensus Sequences, Viral Movement, Secondary Structure Predictions, α Helix

Relationships among the amino acid sequences of viral movement proteins related to the 30 kDa (‘30K’) movement protein of tobacco mosaic virus – the 30K superfamily – were explored. Sequences were grouped into 18 families. A comparison of secondary structure predictions for each family revealed a common predicted core structure flanked by variable N- and C-terminal domains. The core consisted of a series of β-elements flanked by an α-helix on each end. Consensus sequences for each of the families were generated and aligned with one another. From this alignment an overall secondary structure prediction was generated and a consensus sequence that can recognize each family in database searches was obtained. The analysis led to criteria that were used to evaluate other virus-encoded proteins for possible membership of the 30K superfamily. A rhabdoviral and a tenuiviral protein were identified as 30K superfamily members, as were plant-encoded phloem proteins. Parsimony analysis grouped tubule-forming movement proteins separately from others. Establishing the alignment of residues of diverse families facilitates comparison of mutagenesis experiments done on different movement proteins and should serve as a guide for further such experiments.
