Using small samples to estimate neutral component size and robustness in the genotype–phenotype map of RNA secondary structure

2020 ◽  
Vol 17 (166) ◽  
pp. 20190784 ◽  
Author(s):  
Marcel Weiß ◽  
Sebastian E. Ahnert

In genotype–phenotype (GP) maps, the genotypes that map to the same phenotype are usually not randomly distributed across the space of genotypes, but instead are predominantly connected through one-point mutations, forming network components that are commonly referred to as neutral components (NCs). Because of their impact on evolutionary processes, the characteristics of these NCs, like their size or robustness, have been studied extensively. Here, we introduce a framework that allows the estimation of NC size and robustness in the GP map of RNA secondary structure. The advantage of this framework is that it only requires small samples of genotypes and their local environment, which also allows experimental realizations. We verify our framework by applying it to the exhaustively analysable GP map of RNA sequence length L = 15, and benchmark it against an existing method by applying it to longer, naturally occurring functional non-coding RNA sequences. Although it is specific to the RNA secondary structure GP map in the first place, our framework can probably be transferred and adapted to other sequence-to-structure GP maps.
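As a rough illustration of estimating robustness from a small sample of genotypes and their local environment, the Python sketch below averages the fraction of one-point mutants that keep the phenotype. It is not the authors' framework: `toy_phenotype` is a hypothetical stand-in for a real folding routine (it only pairs the two sequence ends inward while they are complementary), and all function names are assumptions made for this example.

```python
ALPHABET = "ACGU"
# Watson-Crick and G-U wobble pairs
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_phenotype(seq):
    """Hypothetical stand-in for a folding routine (NOT a real energy model).

    Pairs position i with its mirror position j, moving inward from both
    ends, while the bases are complementary and at least three unpaired
    bases remain between them.
    """
    structure = ["."] * len(seq)
    i, j = 0, len(seq) - 1
    while j - i > 3 and (seq[i], seq[j]) in PAIRS:
        structure[i], structure[j] = "(", ")"
        i, j = i + 1, j - 1
    return "".join(structure)


def one_point_mutants(seq):
    """Yield all 3L sequences that differ from seq at exactly one position."""
    for pos, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:pos] + alt + seq[pos + 1:]


def genotype_robustness(seq, phenotype=toy_phenotype):
    """Fraction of one-point mutants that map to the same phenotype as seq."""
    target = phenotype(seq)
    mutants = list(one_point_mutants(seq))
    neutral = sum(1 for m in mutants if phenotype(m) == target)
    return neutral / len(mutants)


def estimate_robustness(sample, phenotype=toy_phenotype):
    """Average genotype robustness over a (small) sample of genotypes."""
    return sum(genotype_robustness(s, phenotype) for s in sample) / len(sample)
```

Calling `estimate_robustness(["GGGAAAACCC", "GCGAAAAUGC"])` averages over two genotypes that share the toy phenotype `(((....)))`. In practice, `phenotype` would be replaced by a call to an actual secondary structure prediction tool.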

2020 ◽  
Vol 36 (9) ◽  
pp. 2920-2922
Author(s):  
Matan Drory Retwitzer ◽  
Vladimir Reinharz ◽  
Alexander Churkin ◽  
Yann Ponty ◽  
Jérôme Waldispühl ◽  
...  

Summary: RNA design has conceptually evolved from the inverse RNA folding problem. In the classical inverse RNA problem, the user inputs an RNA secondary structure and receives an output RNA sequence that folds into it. Although modern RNA design methods are based on the same principle, finer control over the resulting sequences is sought. As an important example, a substantial number of non-coding RNA families show high preservation in specific regions while being more flexible in others, and this information should be utilized in the design. By using the additional information, RNA design tools can help solve problems of practical interest in the growing fields of synthetic biology and nanotechnology. incaRNAfbinv 2.0 utilizes a fragment-based approach, enabling control of specific RNA secondary structure motifs. The new version allows significantly more control over the general RNA shape and also allows specific restrictions to be expressed for each motif separately, in addition to other advanced features. Availability and implementation: incaRNAfbinv 2.0 is available through a standalone package and a web-server at https://www.cs.bgu.ac.il/incaRNAfbinv. Source code, command-line and GUI wrappers can be found at https://github.com/matandro/RNAsfbinv. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 15 (2) ◽  
pp. 135-143
Author(s):  
Sha Shi ◽  
Xin-Li Zhang ◽  
Le Yang ◽  
Wei Du ◽  
Xian-Li Zhao ◽  
...  

Background: The prediction of RNA secondary structure using optimization algorithms is key to understanding the real structure of an RNA. Evolutionary algorithms (EAs) are popular strategies for RNA secondary structure prediction. However, compared to most state-of-the-art software based on dynamic programming algorithms (DPAs), the performance of EAs is far from satisfactory. Objective: Therefore, a more powerful strategy is required to improve the performance of EAs when applied to the prediction of RNA secondary structures. Methods: The idea of quantum computing is introduced here, yielding a new strategy to find all possible legal paired bases under the constraint of minimum free energy. The state of a stem pool of size N is encoded as a population of a quantum genetic algorithm (QGA), which is represented by N quantum bits rather than classical bits. The populations are updated by so-called quantum crossover, quantum mutation and quantum rotation operations. Results: The numerical results show that the performance of traditional EAs is significantly improved by using QGA with regard to not only prediction accuracy and sensitivity but also complexity. Moreover, for RNA sequences of short to medium length, QGA even improves on the state-of-the-art software based on DPAs in terms of both prediction accuracy and sensitivity. Conclusion: This work sheds an interesting light on the applications of quantum computing to RNA structure prediction.


2014 ◽  
Vol 4 (3) ◽  
Author(s):  
Mária Šimalová ◽  
Gabriela Andrejková

In this paper, we describe and develop more effective solutions to two important problems in bioinformatics. The first is the multiple sequence alignment problem and the second is the RNA secondary structure prediction (folding) problem. Each of these problems can be solved with better results if the solution to the other is known, but usually we only have the sequences and know neither the alignment nor the secondary structure. Exact algorithms that solve both problems simultaneously are computationally demanding because of the long length of RNA sequences. In this paper, we describe a method for speeding up Sankoff's simultaneous alignment and folding algorithm, using the Carrillo-Lipman approach to cut off those computations that can never lead to an optimal solution.


Science ◽  
2010 ◽  
Vol 327 (5962) ◽  
pp. 202-206 ◽  
Author(s):  
Maximillian H. Bailor ◽  
Xiaoyan Sun ◽  
Hashim M. Al-Hashimi

Thermodynamic rules that link RNA sequences to secondary structure are well established, but the link between secondary structure and three-dimensional global conformation remains poorly understood. We constructed comprehensive three-dimensional maps depicting the orientation of A-form helices across RNA junctions in the Protein Data Bank and rationalized our findings with modeling and nuclear magnetic resonance spectroscopy. We show that the secondary structures of junctions encode readily computable topological constraints that accurately predict the three-dimensional orientation of helices across all two-way junctions. Our results suggest that RNA global conformation is largely defined by topological constraints encoded at the secondary structural level and that tertiary contacts and intermolecular interactions serve to stabilize specific conformers within the topologically allowed ensemble.


2020 ◽  
Vol 17 (171) ◽  
pp. 20200608
Author(s):  
Marcel Weiß ◽  
Sebastian E. Ahnert

Genotype–phenotype (GP) maps describe the relationship between biological sequences and structural or functional outcomes. They can be represented as networks in which genotypes are the nodes, and one-point mutations between them are the edges. The genotypes that map to the same phenotype form subnetworks consisting of one or multiple disjoint connected components–so-called neutral components (NCs). For the GP map of RNA secondary structure, the NCs have been found to exhibit distinctive network features that can affect the dynamical processes taking place on them. Here, we focus on the community structure of RNA secondary structure NCs. Building on previous findings, we introduce a method to reveal the hierarchical community structure solely from the sequence constraints and composition of the genotypes that form a given NC. Thereby, we obtain modularity values similar to common community detection algorithms, which are much more complex. From this knowledge, we endorse a sampling method that allows a fast exploration of the different communities of a given NC. Furthermore, we introduce a way to estimate the community structure from genotype samples, which is useful when an exhaustive analysis of the NC is not feasible, as is the case for longer sequence lengths.
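The network picture described above can be sketched in Python for a sequence length small enough to enumerate exhaustively. This is only an illustration under stated assumptions: `toy_phenotype` is a hypothetical stand-in for a folding routine (it pairs the two ends inward while they are complementary), and `neutral_components` is a name chosen for this example, not a function from the paper. The sketch groups genotypes with the same phenotype into connected components by breadth-first search over one-point mutations.

```python
from collections import deque
from itertools import product

ALPHABET = "ACGU"
# Watson-Crick and G-U wobble pairs
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_phenotype(seq):
    """Hypothetical stand-in for a folding routine (NOT a real energy model)."""
    structure = ["."] * len(seq)
    i, j = 0, len(seq) - 1
    while j - i > 3 and (seq[i], seq[j]) in PAIRS:
        structure[i], structure[j] = "(", ")"
        i, j = i + 1, j - 1
    return "".join(structure)


def one_point_mutants(seq):
    """Yield all sequences that differ from seq at exactly one position."""
    for pos, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:pos] + alt + seq[pos + 1:]


def neutral_components(length, phenotype=toy_phenotype):
    """Return a list of (phenotype, members) pairs, one per neutral component.

    Nodes are all 4**length genotypes; edges are one-point mutations between
    genotypes that share a phenotype.
    """
    phenotype_of = {}
    for letters in product(ALPHABET, repeat=length):
        genotype = "".join(letters)
        phenotype_of[genotype] = phenotype(genotype)

    seen = set()
    components = []
    for start, target in phenotype_of.items():
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            genotype = queue.popleft()
            members.append(genotype)
            for mutant in one_point_mutants(genotype):
                if mutant not in seen and phenotype_of[mutant] == target:
                    seen.add(mutant)
                    queue.append(mutant)
        components.append((target, members))
    return components
```

On this toy map with `length=6`, the phenotype `(....)` splits into two disjoint components of 768 genotypes each, because no single mutation converts an A-U, G-U or G-C closing pair into a U-A, U-G or C-G one without passing through an unpaired state. The numbers are a property of the toy map only, not of real RNA folding.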


2019 ◽  
Vol 17 (05) ◽  
pp. 1950031 ◽  
Author(s):  
Abdelhakim El Fatmi ◽  
M. Ali Bekri ◽  
Said Benhlima

The prediction of the optimal secondary structure for a given RNA sequence represents a challenging computational problem in bioinformatics. This challenge becomes harder with the discovery of different pseudoknot classes, which are complex topologies that play diverse roles in biological processes. Many recent methods have been proposed to predict RNA secondary structure with some pseudoknot classes, but only a few of them have reached satisfying results in terms of both complexity and accuracy. Here we present RNAknot, a new method for predicting RNA secondary structure that contains the following components: stems, hairpin loops, multi-branched loops or multi-loops, bulge loops, and internal loops, in addition to two types of pseudoknots, the H-type pseudoknot and the kissing hairpin. RNAknot is based on a genetic algorithm and the Greedy Randomized Adaptive Search Procedure (GRASP), and it uses the free energy as the fitness function to evaluate the obtained structures. To validate the performance of the presented method, 131 tests have been performed using two datasets of 26 and 105 RNA sequences, taken from the databases RNAstrand and Pseudobase, respectively. The obtained results are compared with those of other RNA secondary structure prediction programs: Vs_subopt, CyloFold, IPknot, Kinefold, RNAstructure, and Sfold. The results of this comparative study show that the prediction accuracy of our proposed approach is significantly improved compared to that of the other programs. For the first dataset, RNAknot has the highest average specificity (SP) (71.23%) and sensitivity (SN) (72.15%) of all the programs. For the second dataset, the RNA secondary structure predictions obtained by RNAknot correspond to the highest averages of SP (85.49%) and F-measure (79.97%).
The program is available as a jar file at www.bachmek.umi.ac.ma/wp-content/uploads/RNAknot.0.0.2.rar


2013 ◽  
Vol 16 (01) ◽  
pp. 1250052 ◽  
Author(s):  
Susanna C. Manrubia ◽  
Rafael Sanjuán

A suitable model for exploring the properties of genotype-phenotype landscapes is the relationship between RNA sequences and their corresponding minimum free energy secondary structures. Relevant issues related to molecular evolvability and robustness to mutations have been studied in this framework. Here, we analyze the one-mutant neighborhood of the predicted secondary structure of 46 different RNAs, including tRNAs, viroids, larger molecules such as Hepatitis-δ virus, and several random sequences. The probability distribution of the effect of point mutations in linear structural motifs of the secondary structure is well fit by Pareto or Lognormal probability distribution functions, independent of the origin of the RNA molecule. This extends previous results to the case of natural sequences of diverse origins. We introduce a new indicator of robustness, the average weighted length of linear motifs (AwL), and demonstrate that it correlates with the average effect of point mutations in RNA secondary structures. Structures with a high AwL value display the highest structural robustness and cluster in particular regions of sequence space.


2015 ◽  
Vol 5 (6) ◽  
pp. 20150053 ◽  
Author(s):  
Kamaludin Dingle ◽  
Steffen Schaper ◽  
Ard A. Louis

The prevalence of neutral mutations implies that biological systems typically have many more genotypes than phenotypes. But, can the way that genotypes are distributed over phenotypes determine evolutionary outcomes? Answering such questions is difficult, in part because the number of genotypes can be hyper-astronomically large. By solving the genotype–phenotype (GP) map for RNA secondary structure (SS) for systems up to length L = 126 nucleotides (where the set of all possible RNA strands would weigh more than the mass of the visible universe), we show that the GP map strongly constrains the evolution of non-coding RNA (ncRNA). Simple random sampling over genotypes predicts the distribution of properties such as the mutational robustness or the number of stems per SS found in naturally occurring ncRNA with surprising accuracy. Because we ignore natural selection, this strikingly close correspondence with the mapping suggests that structures allowing for functionality are easily discovered, despite the enormous size of the genetic spaces. The mapping is extremely biased: the majority of genotypes map to an exponentially small portion of the morphospace of all biophysically possible structures. Such strong constraints provide a non-adaptive explanation for the convergent evolution of structures such as the hammerhead ribozyme. These results present a particularly clear example of bias in the arrival of variation strongly shaping evolutionary outcomes and may be relevant to Mayr's distinction between proximate and ultimate causes in evolutionary biology.
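The idea of simple random sampling over genotypes can be sketched as follows in Python. This is a minimal illustration, not the authors' pipeline: `toy_phenotype` is a hypothetical stand-in for a folding routine, and `sample_phenotype_frequencies` and its parameters are names assumed for this example. The sketch draws genotypes uniformly at random and estimates what fraction of sequence space maps to each phenotype.

```python
import random
from collections import Counter

ALPHABET = "ACGU"
# Watson-Crick and G-U wobble pairs
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_phenotype(seq):
    """Hypothetical stand-in for a folding routine (NOT a real energy model)."""
    structure = ["."] * len(seq)
    i, j = 0, len(seq) - 1
    while j - i > 3 and (seq[i], seq[j]) in PAIRS:
        structure[i], structure[j] = "(", ")"
        i, j = i + 1, j - 1
    return "".join(structure)


def sample_phenotype_frequencies(length, n_samples, seed=0, phenotype=toy_phenotype):
    """Estimate phenotype frequencies from uniformly sampled genotypes."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    counts = Counter()
    for _ in range(n_samples):
        genotype = "".join(rng.choice(ALPHABET) for _ in range(length))
        counts[phenotype(genotype)] += 1
    return {p: c / n_samples for p, c in counts.items()}
```

Sorting the returned dictionary by frequency shows how unevenly genotypes are spread over phenotypes. With a real folding routine in place of the toy function, the same loop applies to sequence lengths where exhaustive enumeration is impossible.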

