Neutral components show a hierarchical community structure in the genotype–phenotype map of RNA secondary structure

2020 ◽  
Vol 17 (171) ◽  
pp. 20200608
Author(s):  
Marcel Weiß ◽  
Sebastian E. Ahnert

Genotype–phenotype (GP) maps describe the relationship between biological sequences and structural or functional outcomes. They can be represented as networks in which genotypes are the nodes and one-point mutations between them are the edges. The genotypes that map to the same phenotype form subnetworks consisting of one or multiple disjoint connected components, so-called neutral components (NCs). For the GP map of RNA secondary structure, the NCs have been found to exhibit distinctive network features that can affect the dynamical processes taking place on them. Here, we focus on the community structure of RNA secondary structure NCs. Building on previous findings, we introduce a method to reveal the hierarchical community structure solely from the sequence constraints and composition of the genotypes that form a given NC. Thereby, we obtain modularity values similar to those of common community detection algorithms, which are much more complex. From this knowledge, we endorse a sampling method that allows a fast exploration of the different communities of a given NC. Furthermore, we introduce a way to estimate the community structure from genotype samples, which is useful when an exhaustive analysis of the NC is not feasible, as is the case for longer sequence lengths.
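As a rough illustration of the network view described in this abstract, the following Python sketch enumerates a small genotype space, treats one-point mutations as edges, and groups genotypes that share a phenotype into connected components. It is not the authors' method: the binary alphabet, the toy function `parity_phenotype` and all helper names are hypothetical stand-ins, and a real analysis would replace the toy function with an RNA folding routine.

```python
from collections import deque
from itertools import product


def one_point_mutants(genotype, alphabet):
    """Yield every genotype that differs from `genotype` at exactly one position."""
    for i, original in enumerate(genotype):
        for letter in alphabet:
            if letter != original:
                yield genotype[:i] + letter + genotype[i + 1:]


def neutral_components(length, alphabet, phenotype_of):
    """Return {phenotype: [set_of_genotypes, ...]}, one set per neutral component."""
    genotypes = ["".join(g) for g in product(alphabet, repeat=length)]
    phenotype = {g: phenotype_of(g) for g in genotypes}
    seen = set()
    components = {}
    for start in genotypes:
        if start in seen:
            continue
        # Breadth-first search restricted to neighbours with the same phenotype.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in one_point_mutants(current, alphabet):
                if neighbour not in seen and phenotype[neighbour] == phenotype[start]:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.setdefault(phenotype[start], []).append(component)
    return components


def parity_phenotype(genotype):
    """Hypothetical toy phenotype (assumption): parity of the number of '1' letters."""
    return genotype.count("1") % 2


ncs = neutral_components(3, "01", parity_phenotype)
```

In this toy map every one-point mutation flips the parity, so each phenotype splits into isolated single-genotype components. That is an extreme case of the disjoint NCs mentioned above and is not representative of RNA, where NCs are typically large.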

2020 ◽  
Vol 17 (166) ◽  
pp. 20190784 ◽  
Author(s):  
Marcel Weiß ◽  
Sebastian E. Ahnert

In genotype–phenotype (GP) maps, the genotypes that map to the same phenotype are usually not randomly distributed across the space of genotypes, but instead are predominantly connected through one-point mutations, forming network components that are commonly referred to as neutral components (NCs). Because of their impact on evolutionary processes, the characteristics of these NCs, like their size or robustness, have been studied extensively. Here, we introduce a framework that allows the estimation of NC size and robustness in the GP map of RNA secondary structure. The advantage of this framework is that it only requires small samples of genotypes and their local environment, which also allows experimental realizations. We verify our framework by applying it to the exhaustively analysable GP map of RNA sequence length L = 15, and benchmark it against an existing method by applying it to longer, naturally occurring functional non-coding RNA sequences. Although it is specific to the RNA secondary structure GP map in the first place, our framework can probably be transferred and adapted to other sequence-to-structure GP maps.
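To make the notion of robustness concrete, the sketch below computes the fraction of one-point mutants of a genotype that keep its phenotype, and averages that fraction over a small sample. This is only a naive sample average under stated assumptions, not the estimation framework of the paper (which also estimates NC size and may treat sampling differently). The function `first_letter_phenotype` and the sample are hypothetical placeholders for a folding routine and real sequences.

```python
def one_point_mutants(genotype, alphabet):
    """Yield every genotype that differs from `genotype` at exactly one position."""
    for i, original in enumerate(genotype):
        for letter in alphabet:
            if letter != original:
                yield genotype[:i] + letter + genotype[i + 1:]


def genotype_robustness(genotype, alphabet, phenotype_of):
    """Fraction of one-point mutants that map to the same phenotype."""
    target = phenotype_of(genotype)
    neighbours = list(one_point_mutants(genotype, alphabet))
    neutral = sum(1 for n in neighbours if phenotype_of(n) == target)
    return neutral / len(neighbours)


def mean_sample_robustness(sample, alphabet, phenotype_of):
    """Naive estimate: average genotype robustness over a sample of genotypes."""
    values = [genotype_robustness(g, alphabet, phenotype_of) for g in sample]
    return sum(values) / len(values)


def first_letter_phenotype(genotype):
    """Hypothetical toy phenotype (assumption): only the first letter matters."""
    return genotype[0]


sample = ["AUGC", "ACCC", "AGGU"]  # illustrative genotypes sharing phenotype 'A'
estimate = mean_sample_robustness(sample, "AUGC", first_letter_phenotype)
```

Each length-4 genotype has 12 one-point mutants, of which only the 3 at the first position change this toy phenotype, so only the local neighbourhood of each sampled genotype needs to be evaluated.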


2015 ◽  
Vol 12 (113) ◽  
pp. 20150724 ◽  
Author(s):  
S. F. Greenbury ◽  
S. E. Ahnert

Biological information is stored in DNA, RNA and protein sequences, which can be understood as genotypes that are translated into phenotypes. The properties of genotype–phenotype (GP) maps have been studied in great detail for RNA secondary structure. These include a highly biased distribution of genotypes per phenotype, negative correlation of genotypic robustness and evolvability, positive correlation of phenotypic robustness and evolvability, shape-space covering, and a roughly logarithmic scaling of phenotypic robustness with phenotypic frequency. More recently, similar properties have been discovered in other GP maps, suggesting that they may be fundamental to biological GP maps in general, rather than specific to the RNA secondary structure map. Here we propose that the above properties arise from the fundamental organization of biological information into ‘constrained’ and ‘unconstrained’ sequences, in the broadest possible sense. As ‘constrained’ we describe sequences that affect the phenotype more immediately, and are therefore more sensitive to mutations, such as protein-coding DNA or the stems in RNA secondary structure. ‘Unconstrained’ sequences, on the other hand, can mutate more freely without affecting the phenotype, such as intronic or intergenic DNA or the loops in RNA secondary structure. To test our hypothesis we consider a highly simplified GP map that has genotypes with ‘coding’ and ‘non-coding’ parts. We term this the Fibonacci GP map, as it is equivalent to the Fibonacci code in information theory. Despite its simplicity, the Fibonacci GP map exhibits all the above properties of much more complex and biologically realistic GP maps. These properties are therefore likely to be fundamental to many biological GP maps.
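The abstract does not spell out the construction of the Fibonacci GP map, so the following Python sketch rests on an assumption borrowed from the Fibonacci code: genotypes are binary strings, the first occurrence of `11` acts as a stop codon, the letters before it form the ‘coding’ part (taken here as the phenotype), and everything after it is ‘non-coding’. Genotypes without a stop codon are given the placeholder phenotype `None`. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

STOP = "11"  # assumed stop codon, by analogy with the Fibonacci code


def fibonacci_phenotype(genotype):
    """Return the coding part before the first stop codon, or None if there is none."""
    stop_at = genotype.find(STOP)
    if stop_at == -1:
        return None
    return genotype[:stop_at]


def phenotype_frequencies(length):
    """Count how many binary genotypes of the given length map to each phenotype."""
    counts = Counter()
    for bits in product("01", repeat=length):
        counts[fibonacci_phenotype("".join(bits))] += 1
    return counts


counts = phenotype_frequencies(4)
```

For length 4 the empty coding part is reached by four genotypes (`11` followed by any two letters), the phenotype `0` by two, and `00` and `10` by one each. Under this assumed construction, shorter coding regions leave more unconstrained positions, which gives a small example of the biased distribution of genotypes per phenotype described above.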


2013 ◽  
Vol 16 (01) ◽  
pp. 1250052 ◽  
Author(s):  
SUSANNA C. MANRUBIA ◽  
RAFAEL SANJUÁN

A suitable model for exploring the properties of genotype–phenotype landscapes is the relationship between RNA sequences and their corresponding minimum free energy secondary structures. Relevant issues related to molecular evolvability and robustness to mutations have been studied in this framework. Here, we analyze the one-mutant neighborhood of the predicted secondary structure of 46 different RNAs, including tRNAs, viroids, larger molecules such as Hepatitis-δ virus, and several random sequences. The probability distribution of the effect of point mutations in linear structural motifs of the secondary structure is well fit by Pareto or lognormal probability distribution functions, independent of the origin of the RNA molecule. This extends previous results to the case of natural sequences of diverse origins. We introduce a new indicator of robustness, the average weighted length of linear motifs (AwL), and demonstrate that it correlates with the average effect of point mutations in RNA secondary structures. Structures with a high AwL value display the highest structural robustness and cluster in particular regions of sequence space.


2001 ◽  
Vol 82 (6) ◽  
pp. 1339-1348 ◽  
Author(s):  
Charlotta Polacek ◽  
A. Michael Lindberg

The secondary structure of the 3′ untranslated region (3′UTR) of picornaviruses is thought to be important for the initiation of negative-strand RNA synthesis. In this study, genetic and biological analyses of the 3′ terminus of coxsackievirus B2 (CVB2), which differs from that of other enteroviruses due to the presence of five additional nucleotides prior to the poly(A) tail, are reported. The importance of this extension was investigated using a 3′UTR mutant lacking the five nucleotides prior to the poly(A) tail and containing two point mutations. The predicted secondary structure within the 3′UTR of this mutant was less energetically favourable than that of the wild-type (wt) genotype. This mutant clone was transfected into green monkey kidney cells in four parallel experiments and propagated for multiple passages, enabling the virus to establish a stable revertant genotype. Genetic analysis of the virus progeny from these different passages revealed two major types of revertant. Both types showed wt-like growth properties and more stable and wt-like predicted secondary structures than the parent mutant clone. The first type of revertant neutralized the introduced point mutation with a compensatory second-site mutation, whereas the second type of revertant partly compensated for the deletion of the five proximal nucleotides by the insertion of nucleotides that matched the wt sequence. Therefore, the extended 3′ end of CVB2 may be considered to be a stabilizing sequence for RNA secondary structure and an important feature for the virus.

