RNALL: AN EFFICIENT ALGORITHM FOR PREDICTING RNA LOCAL SECONDARY STRUCTURAL LANDSCAPE IN GENOMES

2006 ◽  
Vol 04 (05) ◽  
pp. 1015-1031 ◽  
Author(s):  
XIU-FENG WAN ◽  
GUOHUI LIN ◽  
DONG XU

Background: The information of RNA local secondary structures (LSSs) can help retrieve biologically important motifs and study functions of RNA molecules. Most of the current RNA secondary structure prediction tools are not suitable for RNA LSS prediction on the genome scale due to high computational complexity. Methods: We developed a new computer package Rnall based on a dynamic programming technique, which scans an RNA sequence with a sliding window and extracts all RNA LSSs with sizes no larger than the window size using the nearest neighbor thermodynamic parameters. The worst case running time of Rnall is O(W^3 L), where W is the window size and L is the query sequence length. In practice we observed a running time of O(W^2 L). We further introduced the concept of energy landscape for illustrating RNA LSS, which may facilitate RNA motif mining on the genomic scale. Results: Rnall shows better prediction accuracy than two other prediction tools Lfold and Quickfold. Rnall is also applied to scan for RNA LSSs in three genomes, and the prediction maps well with known RNA motifs. Conclusions: Rnall is designed for RNA LSS prediction and together with the energy landscape, it has unique features that could be used for RNA structural motif mining. Rnall is freely available for download at or .
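The sliding-window idea described in this abstract can be illustrated with a minimal sketch. This is not Rnall's algorithm: it swaps the nearest neighbor thermodynamic model for simple Nussinov-style base-pair maximization, and it refolds every window from scratch. The function names, the `min_loop` parameter, and the pairing rules (Watson-Crick plus G-U wobble) are assumptions made for illustration only.

```python
# Hypothetical sketch: a per-window base-pair "landscape" for an RNA sequence.
# Assumption: pair counting (Nussinov-style) stands in for free-energy scoring.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    Runs in O(n^3) time for a sequence of length n.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: base j stays unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def window_landscape(seq, window):
    """Score every window of the given size; one value per start position."""
    starts = range(max(1, len(seq) - window + 1))
    return [max_pairs(seq[s:s + window]) for s in starts]


if __name__ == "__main__":
    print(window_landscape("GGGAAACCCAAAA", 9))
```

Because each window is folded independently at O(W^3) cost, this naive version does O(W^3 L) work for every input. A tool aimed at genome-scale scans would be expected to reuse work between overlapping windows instead.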

2019 ◽  
Vol 16 (2) ◽  
pp. 159-172 ◽  
Author(s):  
Elaheh Kashani-Amin ◽  
Ozra Tabatabaei-Malazy ◽  
Amirhossein Sakhteman ◽  
Bagher Larijani ◽  
Azadeh Ebrahim-Habibi

Background: Prediction of proteins’ secondary structure is one of the major steps in the generation of homology models. These models provide structural information which is used to design suitable ligands for potential medicinal targets. However, selecting a proper tool among multiple Secondary Structure Prediction (SSP) options is challenging. The current study offers insight into currently favored methods and tools, within various contexts. Objective: A systematic review was performed to give comprehensive access to recent (2013-2016) studies which used or recommended protein SSP tools. Methods: Three databases, Web of Science, PubMed and Scopus, were systematically searched and 99 out of the 209 studies were finally found eligible for data extraction. Results: Four categories of applications for 59 retrieved SSP tools were: (I) prediction of structural features of a given sequence, (II) evaluation of a method, (III) providing input for a new SSP method and (IV) integrating an SSP tool as a component for a program. PSIPRED was found to be the most popular tool in all four categories. JPred and tools utilizing the PHD (Profile network from HeiDelberg) method occupied second and third places of popularity in categories I and II. JPred was only found in the first two categories, while PHD was present in three fields. Conclusion: This study provides a comprehensive insight into the recent usage of SSP tools which could be helpful for selecting a proper tool.


2019 ◽  
Vol 100 (12) ◽  
pp. 1663-1673 ◽  
Author(s):  
Jyoti Rana ◽  
José Luis Slon Campos ◽  
Monica Poggianella ◽  
Oscar R. Burrone

The assembly and secretion of flaviviruses are part of an elegantly regulated process. During maturation, the viral polyprotein undergoes several co- and post-translational cleavages mediated by both viral and host proteases. Among these, sequential cleavage at the N and C termini of the hydrophobic capsid anchor (Ca) is crucial in deciding the fate of viral infection. Here, using a refined dengue pseudovirus production system, along with cleavage and furin inhibition assays, immunoblotting and secondary structure prediction analysis, we show that Ca plays a key role in the processing efficiency of dengue virus type 2 (DENV2) structural proteins and viral particle assembly. Replacement of the DENV2 Ca with the homologous regions from West Nile or Zika viruses or, alternatively, increasing its length, improved cleavage and hence particle assembly. Further, we showed that substitution of the Ca conserved proline residue (P110) to alanine abolishes pseudovirus production, regardless of the Ca sequence length. Besides providing the results of a biochemical analysis of DENV2 structural polyprotein processing, this study also presents a system for efficient production of dengue pseudoviruses.


2019 ◽  
Author(s):  
Diksha Priya Lotun ◽  
Charlotte Cochard ◽  
Fabio R.J Vieira ◽  
Juliana Silva Bernardes

2dSS is a web-server for visualising and comparing secondary structure predictions. It provides two main functionalities: 2D-alignment and compare predictions. The “2D-alignment” has been designed to visualise conserved secondary structure elements in a multiple sequence alignment (MSA). From this we can study the secondary structure content of homologous proteins (a protein family) and highlight its structural patterns. The “compare predictions” has been designed to compare the output of several secondary structure prediction tools, and check their accuracy when compared with real secondary structure elements extracted from 3D-structure. 2dSS provides a comprehensive representation of protein secondary structure elements, and it can be used to visualise and compare secondary structures of any prediction tool. Availability: http://genome.lcqb.upmc.fr/2dss/


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Yin-Fu Huang ◽  
Shu-Ying Chen

We propose a protein secondary structure prediction method based on position-specific scoring matrix (PSSM) profiles and four physicochemical features: conformation parameters, net charges, hydrophobicity, and side chain mass. First, the SVM with the optimal window size and the optimal parameters of the kernel function is found. Then, we train the SVM using the PSSM profiles generated from PSI-BLAST and the physicochemical features extracted from the CB513 data set. Finally, we use the filter to refine the predicted results from the trained SVM. Across the performance measures of our method, Q3 reaches 79.52, SOV94 reaches 86.10, and SOV99 reaches 74.60; all the measures are higher than those of the SVMpsi method and the SVMfreq method. This indicates that considering these physicochemical features when predicting protein secondary structure improves performance.
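The window-based input that such an SVM consumes can be sketched as follows. The abstract does not say how windows are encoded or how sequence ends are handled, so the zero-padding, the flat concatenation of rows, and the function name `window_features` are assumptions for illustration. SVM training and the refinement filter are not shown.

```python
# Hypothetical sketch: build one feature vector per residue from a PSSM-like
# profile by concatenating the rows inside a centered sliding window.
# Assumption: positions beyond the sequence ends are zero-padded.

def window_features(profile, window):
    """profile: list of per-residue rows (lists of floats), all the same width.

    window: odd window size, centered on each residue.
    Returns a list with one flat vector of length window * width per residue.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    width = len(profile[0]) if profile else 0
    pad = [0.0] * width
    features = []
    for i in range(len(profile)):
        vec = []
        for j in range(i - half, i + half + 1):
            vec.extend(profile[j] if 0 <= j < len(profile) else pad)
        features.append(vec)
    return features


if __name__ == "__main__":
    toy_profile = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 residues, 2 columns
    for row in window_features(toy_profile, 3):
        print(row)
```

Per-residue physicochemical values could be appended as extra columns of each row before windowing, which keeps the window logic unchanged.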


2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Weizhong Lu ◽  
Ye Tang ◽  
Hongjie Wu ◽  
Hongmei Huang ◽  
Qiming Fu ◽  
...  

Background: RNA secondary structure prediction is an important issue in structural bioinformatics, and RNA pseudoknotted secondary structure prediction represents an NP-hard problem. Recently, many different machine-learning methods, Markov models, and neural networks have been employed for this problem, with encouraging results regarding their predictive accuracy; however, their performances are usually limited by the requirements of the learning model and over-fitting, which requires use of a fixed number of training features. Because most natural biological sequences have variable lengths, the sequences have to be truncated before the features are employed by the learning model, which not only leads to the loss of information but also destroys biological-sequence integrity. Results: To address this problem, we propose an adaptive sequence length based on deep-learning model and integrate an energy-based filter to remove the over-fitting base pairs. Conclusions: Comparative experiments conducted on an authoritative dataset RNA STRAND (RNA secondary STRucture and statistical Analysis Database) revealed a 12% higher accuracy relative to three currently used methods.


Viruses ◽  
2019 ◽  
Vol 11 (5) ◽  
pp. 401 ◽  
Author(s):  
Kiening ◽  
Ochsenreiter ◽  
Hellinger ◽  
Rattei ◽  
Hofacker ◽  
...  

RNA secondary structure in untranslated and protein coding regions has been shown to play an important role in regulatory processes and the viral replication cycle. While structures in non-coding regions have been investigated extensively, a thorough overview of the structural repertoire of protein coding mRNAs, especially for viruses, is lacking. Secondary structure prediction of large molecules, such as long mRNAs, remains a challenging task, as the contingent of structures a sequence can theoretically fold into grows exponentially with sequence length. We applied a structure prediction pipeline to Viral Orthologous Groups that first identifies the local boundaries of potentially structured regions and subsequently predicts their functional importance. Using this procedure, the orthologous groups were split into structurally homogeneous subgroups, which we call subVOGs. This is the first compilation of potentially functional conserved RNA structures in viral coding regions, covering the complete RefSeq viral database. We were able to recover structural elements from previous studies and discovered a variety of novel structured regions. The subVOGs are available through our web resource RNASIV (RNA structure in viruses; http://rnasiv.bio.wzw.tum.de).


2021 ◽  
Author(s):  
Christoph Flamm ◽  
Julia Wielach ◽  
Michael T. Wolfinger ◽  
Stefan Badelt ◽  
Ronny Lorenz ◽  
...  

Machine learning (ML) and in particular deep learning techniques have gained popularity for predicting structures from biopolymer sequences. An interesting case is the prediction of RNA secondary structures, where well established biophysics based methods exist. These methods even yield exact solutions under certain simplifying assumptions. Nevertheless, the accuracy of these classical methods is limited and has seen little improvement over the last decade. This makes it an attractive target for machine learning and consequently several deep learning models have been proposed in recent years. In this contribution we discuss limitations of current approaches, in particular due to biases in the training data. Furthermore, we propose to study capabilities and limitations of ML models by first applying them on synthetic data that can not only be generated in arbitrary amounts, but are also guaranteed to be free of biases. We apply this idea by testing several ML models of varying complexity. Finally, we show that the best models are capable of capturing many, but not all, properties of RNA secondary structures. Most severely, the number of predicted base pairs scales quadratically with sequence length, even though a secondary structure can only accommodate a linear number of pairs.
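The linear bound mentioned at the end of this abstract follows from each base having at most one partner, so a structure on L bases holds at most floor(L/2) pairs. A minimal sketch of checking a prediction against that bound is given below. The dot-bracket input format and both function names are assumptions for illustration and are not taken from the abstract.

```python
# Hypothetical sketch: count base pairs in a dot-bracket string and compare
# the count with the upper bound floor(L / 2) for a structure of length L.

def count_pairs(dot_bracket):
    """Number of base pairs in a balanced dot-bracket string of '(', ')' and '.'."""
    open_count = 0
    pairs = 0
    for ch in dot_bracket:
        if ch == "(":
            open_count += 1
        elif ch == ")":
            if open_count == 0:
                raise ValueError("unbalanced structure: unmatched ')'")
            open_count -= 1
            pairs += 1
        elif ch != ".":
            raise ValueError(f"unexpected character: {ch!r}")
    if open_count:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def max_possible_pairs(length):
    """Each base pairs with at most one partner, so at most length // 2 pairs."""
    return length // 2


if __name__ == "__main__":
    structure = "(((...)))"
    print(count_pairs(structure), "pairs; bound is", max_possible_pairs(len(structure)))
```

A model whose predicted pair count grows quadratically with sequence length will exceed this bound once the sequence is long enough, which is the inconsistency the abstract describes.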

