Insertions and deletions in the RNA sequence–structure map

2021 ◽  
Vol 18 (183) ◽  
Author(s):  
Nora S. Martin ◽  
Sebastian E. Ahnert

Genotype–phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence–structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
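
The generalized robustness described above can be illustrated with a minimal sketch: for each mutation type, robustness is taken as the fraction of one-step mutants that keep the phenotype of the original sequence. The function names and the `toy_shape` phenotype are hypothetical stand-ins introduced here (the abstract uses folded structures abstracted with RNAshapes), and counting mutational events rather than distinct mutant sequences is an assumption of this sketch.

```python
from typing import Callable, Iterator

ALPHABET = "ACGU"


def substitutions(seq: str) -> Iterator[str]:
    """Yield all 3L single-nucleotide substitutions of seq."""
    for i, base in enumerate(seq):
        for new in ALPHABET:
            if new != base:
                yield seq[:i] + new + seq[i + 1:]


def insertions(seq: str) -> Iterator[str]:
    """Yield all 4(L+1) single-nucleotide insertion events (duplicates kept)."""
    for i in range(len(seq) + 1):
        for new in ALPHABET:
            yield seq[:i] + new + seq[i:]


def deletions(seq: str) -> Iterator[str]:
    """Yield all L single-nucleotide deletion events (duplicates kept)."""
    for i in range(len(seq)):
        yield seq[:i] + seq[i + 1:]


def robustness(seq: str,
               neighbours: Callable[[str], Iterator[str]],
               phenotype: Callable[[str], str]) -> float:
    """Fraction of mutants of the given type that keep the phenotype of seq."""
    reference = phenotype(seq)
    mutants = list(neighbours(seq))
    if not mutants:
        raise ValueError("sequence has no mutants of this type")
    neutral = sum(phenotype(m) == reference for m in mutants)
    return neutral / len(mutants)


def toy_shape(seq: str) -> str:
    """Hypothetical length-independent phenotype (NOT a real fold):
    '[]' if some C occurs after the first G, otherwise '_'."""
    if "G" in seq and "C" in seq[seq.index("G") + 1:]:
        return "[]"
    return "_"


if __name__ == "__main__":
    for name, move in [("substitutions", substitutions),
                       ("insertions", insertions),
                       ("deletions", deletions)]:
        print(name, robustness("GAC", move, toy_shape))
```

Because the phenotype is passed in as a function, the toy could be swapped for a real folding routine followed by a shape abstraction; the phenotype must be comparable across sequence lengths for the insertion and deletion cases to be meaningful.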

2020 ◽  
Vol 17 (166) ◽  
pp. 20190784 ◽  
Author(s):  
Marcel Weiß ◽  
Sebastian E. Ahnert

In genotype–phenotype (GP) maps, the genotypes that map to the same phenotype are usually not randomly distributed across the space of genotypes, but instead are predominantly connected through one-point mutations, forming network components that are commonly referred to as neutral components (NCs). Because of their impact on evolutionary processes, the characteristics of these NCs, like their size or robustness, have been studied extensively. Here, we introduce a framework that allows the estimation of NC size and robustness in the GP map of RNA secondary structure. The advantage of this framework is that it only requires small samples of genotypes and their local environment, which also allows experimental realizations. We verify our framework by applying it to the exhaustively analysable GP map of RNA sequence length L = 15, and benchmark it against an existing method by applying it to longer, naturally occurring functional non-coding RNA sequences. Although it is specific to the RNA secondary structure GP map in the first place, our framework can probably be transferred and adapted to other sequence-to-structure GP maps.
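
To make the notion of a neutral component concrete, the sketch below enumerates one exhaustively by breadth-first search over one-point mutations. This is the brute-force approach that is only feasible for short sequences, not the sampling-based estimation framework the abstract introduces. The `toy_phenotype` function is a hypothetical stand-in for RNA secondary structure prediction.

```python
from collections import deque
from typing import Callable, Iterator, Set

ALPHABET = "ACGU"
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def point_mutants(seq: str) -> Iterator[str]:
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for new in ALPHABET:
            if new != base:
                yield seq[:i] + new + seq[i + 1:]


def neutral_component(start: str, phenotype: Callable[[str], str]) -> Set[str]:
    """Return all genotypes reachable from start through one-point
    mutations without ever leaving the phenotype of start."""
    target = phenotype(start)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for mutant in point_mutants(current):
            if mutant not in seen and phenotype(mutant) == target:
                seen.add(mutant)
                queue.append(mutant)
    return seen


def toy_phenotype(seq: str) -> str:
    """Hypothetical phenotype (NOT a real fold): 'paired' if the first and
    last bases form a Watson-Crick pair, otherwise 'open'."""
    return "paired" if (seq[0], seq[-1]) in WATSON_CRICK else "open"


if __name__ == "__main__":
    component = neutral_component("GAC", toy_phenotype)
    print(sorted(component))
```

In this toy map, the 16 length-3 genotypes with phenotype 'paired' split into separate components, because switching from one terminal pair to another requires changing both ends and passes through an 'open' genotype. Component robustness could then be estimated by averaging the neutral fraction of `point_mutants` over the members of the returned set.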


Blood ◽  
2009 ◽  
Vol 114 (22) ◽  
pp. 2591-2591
Author(s):  
Josef Davidsson ◽  
Kajsa Paulsson ◽  
David Lindgren ◽  
Henrik Lilljebjörn ◽  
Tracy Chaplin ◽  
...  

Abstract 2591, Poster Board II-567. Although childhood high hyperdiploid acute lymphoblastic leukemia is associated with a favorable outcome, 20% of patients relapse. This makes it important to identify these patients already at diagnosis to ensure proper risk-stratification. To identify changes associated with relapse and ascertain the genetic evolution patterns, SNP array and mutation analyses of FLT3, KRAS, NRAS, and PTPN11 were performed on 11 paired diagnostic/relapse samples. The “triple trisomies” +4, +10, and +17 were detected in 64%, a frequency similar to the one generally observed at diagnosis, thus questioning their favorable prognostic impact. Structural changes, mainly cryptic hemizygous deletions, were significantly more common at relapse (P<0.05). No single aberration was linked to relapse, but four deletions, involving IKZF1, PAX5, CDKN2A/B or AK3, were recurrent. Based on the genetic relationship between the paired samples, three groups were delineated: 1) identical genetic changes at diagnosis and relapse (18%), 2) clonal evolution with all changes at diagnosis being present at relapse (18%), and 3) clonal evolution with some changes conserved, lost or gained (64%), suggesting the presence of a preleukemic clone. This ancestral clone was characterized by numerical changes only, with structural changes and RTK-RAS mutations being secondary to the high hyperdiploid pattern and perhaps necessary for overt leukemia. Disclosures: No relevant conflicts of interest to declare.


2021 ◽  
Author(s):  
Stephane Emond ◽  
Florian Hollfelder

Insertions and deletions (InDels) are among the most frequent changes observed in natural protein evolution, yet their potential has hardly been harnessed in directed evolution experiments. Here we describe the standard protocol for TRIAD (Transposition-based Random Insertion And Deletion mutagenesis), a simple and efficient Mu transposon mutagenesis approach for generating libraries of single InDel variants with one, two or three triplet nucleotide insertions or deletions. This method has recently been employed in three published examples of InDel-based directed evolution of proteins, including a phosphotriesterase, a scFv antibody and an ancestral luciferase.


2021 ◽  
Vol 3 (1) ◽  
pp. 229-238
Author(s):  
Long-Ni Liang ◽  
Ming-Xu Wang ◽  
Weast-Siu Siu ◽  
...  

From 2007 to 2017, Guangdong exports grew at an average rate of 9.6%, while the energy consumption and carbon emissions embodied in this trade showed a declining trend. Does the total real pollution embodied in exports show the same trend? If so, what accounts for these changes? Prior studies have provided three explanations: producing a greater amount of goods (“the scale effect”), adopting cleaner technologies in production processes (“the technology effect”), and producing proportionally more goods that are environmentally friendly (“the structural effect”). The question then arises as to which factor is the driving force behind this cleanup of the export business. To answer these questions, an EIO-LMDI (Environmental Input-Output and Logarithmic Mean Divisia Index) model is built to conduct a structural decomposition analysis of the pollution embodied in Guangdong exports. We calculate that the pollution embodied in Guangdong exports fell by 63 to 85 percent, depending on the pollutant. We further conclude that these pollution reductions are primarily driven by technological advancement, with some industries, including the clothing industry and communications, computers and other electronic equipment, being more sensitive to changes in technology than others. The structural effect is more ambiguous: it contributes to pollution reduction only when the industry itself is pollution intensive.
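
The three effects named above can be illustrated with a generic additive LMDI decomposition. The sketch assumes pollution is written as total export volume times sector share times sector pollution intensity, summed over sectors; this three-factor identity, the function names and all numbers are illustrative assumptions, not the EIO-LMDI specification or data of the study.

```python
import math
from typing import Dict, Sequence


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean of two positive numbers."""
    if math.isclose(a, b):
        return a
    return (a - b) / (math.log(a) - math.log(b))


def lmdi_effects(scale0: float, scale_t: float,
                 share0: Sequence[float], share_t: Sequence[float],
                 intensity0: Sequence[float],
                 intensity_t: Sequence[float]) -> Dict[str, float]:
    """Split the change in total pollution between period 0 and period t
    into scale, structure and technology effects (additive LMDI)."""
    effects = {"scale": 0.0, "structure": 0.0, "technology": 0.0}
    for s0, st, i0, it in zip(share0, share_t, intensity0, intensity_t):
        v0 = scale0 * s0 * i0    # sector pollution in period 0
        vt = scale_t * st * it   # sector pollution in period t
        weight = log_mean(vt, v0)
        effects["scale"] += weight * math.log(scale_t / scale0)
        effects["structure"] += weight * math.log(st / s0)
        effects["technology"] += weight * math.log(it / i0)
    return effects


if __name__ == "__main__":
    # Made-up two-sector example: exports grow, intensities fall.
    result = lmdi_effects(100.0, 250.0,
                          [0.6, 0.4], [0.5, 0.5],
                          [2.0, 0.5], [0.8, 0.3])
    for name, value in result.items():
        print(f"{name}: {value:+.2f}")
    print(f"sum of effects: {sum(result.values()):+.2f}")
```

With logarithmic-mean weights the three effects sum exactly to the total change in pollution, leaving no residual term. All shares and intensities must be strictly positive for the logarithms to be defined.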


2019 ◽  
Author(s):  
Alejandro Berrio ◽  
Ralph Haygood ◽  
Gregory A Wray

Adaptive changes in cis-regulatory elements are an essential component of evolution by natural selection. Identifying adaptive and functional noncoding DNA elements throughout the genome is therefore crucial for understanding the relationship between phenotype and genotype. Here, we introduce a method we call adaptiPhy, which adds significant improvements to our earlier method that tests for branch-specific directional selection in noncoding sequences. The motivation for these improvements is to provide a more sensitive and better targeted characterization of directional selection and neutral evolution across the genome. We use ENCODE annotations to identify appropriate proxy neutral sequences and demonstrate that the conservativeness of the test can be modulated during the filtration of reference alignments. We apply the method to noncoding Human Accelerated Elements as well as open chromatin elements previously identified in 125 human tissues and cell lines to demonstrate its utility. We also simulate sequence alignments under different classes of evolution in order to validate the ability of adaptiPhy to distinguish positive selection from relaxation of constraint and neutral evolution. Finally, we evaluate the impact of query region length, proxy neutral sequence length, and branch count on test sensitivity.


2015 ◽  
Vol 86 (11) ◽  
pp. e4.154-e4
Author(s):  
WO Pickrell ◽  
CHD Hope ◽  
AT Higgins ◽  
JGL Mullins ◽  
PEM Smith ◽  
...  

Background: We identified a family with autosomal dominant lateral temporal lobe epilepsy (ADLTLE). Given that LGI1 mutations account for around 50% of families with ADLTLE, we screened family members for LGI1 variants. Method: We sequenced all exonic regions of LGI1 and used in-silico analysis tools to assess the potential effect of the novel variant. We screened 106 control samples for the variant and assessed the structural effect of the variant using a protein modelling platform. Results: The proband's seizures consist of a unilateral ‘buzzing’ sensation which progresses to unilateral limb numbness and secondarily generalised seizures. Some noises can provoke seizures. Her mother also has epilepsy with identical seizure semiology. We identified a novel heterozygous missense LGI1 variant in the proband and her mother which was not present in other family members or control samples. This variant is close to the splice site region of LGI1 exon 4 and is predicted to be deleterious. Protein modelling suggests that the variant causes conformational structural changes. Conclusion: We present a family with ADLTLE caused by a novel variant in LGI1. This variant is predicted to be deleterious, alters protein function and provides additional evidence for the role of LGI1 in ADLTLE.


Cells ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 958
Author(s):  
Lee ◽  
Chang ◽  
Russell ◽  
Lipsitch ◽  
Maurer-Stroh

Animal studies aimed at understanding influenza virus mutations that change host specificity to adapt to replication in mammalian hosts are necessarily limited in sample numbers due to high cost and safety requirements. As a safe, higher-throughput alternative, we explore the possibility of using readily available passage bias data obtained mostly from seasonal H1 and H3 influenza strains that were differentially grown in mammalian (MDCK) and avian cells (eggs). Using a statistical approach over 80,000 influenza hemagglutinin sequences with passage information, we found that passage bias sites are most commonly found in three regions: (i) the globular head domain around the receptor binding site, (ii) the region that undergoes pH-dependent structural changes and (iii) the unstructured N-terminal region harbouring the signal peptide. Passage bias sites were consistent among different passage cell types as well as between influenza A subtypes. We also find epistatic interactions of site pairs supporting the notion of host-specific dependency of mutations on virus genomic background. The sites identified from our large-scale sequence analysis substantially overlap with known host adaptation sites in the WHO H5N1 genetic changes inventory suggesting information from passage bias can provide candidate sites for host specificity changes to aid in risk assessment for emerging strains.


2016 ◽  
Vol 30 (02) ◽  
pp. 1550255
Author(s):  
Sing-Guan Kong ◽  
Hong-Da Chen ◽  
Andrew Torda ◽  
H. C. Lee

We propose an order index, [Formula: see text], which quantifies the notion of “life at the edge of chaos” when applied to genome sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length and base composition. The 786 complete genomic sequences in GenBank were found to have [Formula: see text] values in a very narrow range, 0.037 ± 0.027. We show this implies that genomes are halfway towards being completely random, namely, at the edge of chaos. We argue that this narrow range represents the neighborhood of a fixed-point in the space of sequences, and genomes are driven there by the dynamics of a robust, predominantly neutral evolution process.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Fenix W. Huang ◽  
Christopher L. Barrett ◽  
Christian M. Reidys

Background: Genotype-phenotype maps provide a meaningful filtration of sequence space and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a given RNA structure, play an important role in the context of neutral evolution. Sequences that are simultaneously compatible with two given structures (bicompatible sequences), are beacons in phenotypic transitions, induced by erroneously replicating populations of RNA sequences. RNA riboswitches, which are capable of expressing two distinct secondary structures without changing the underlying sequence, are one example of bicompatible sequences in living organisms. Results: We present a full loop energy model Boltzmann sampler of bicompatible sequences for pairs of structures. The sequence sampler employs a dynamic programming routine whose time complexity is polynomial when assuming the maximum number of exposed vertices, κ, is a constant. The parameter κ depends on the two structures and can be very large. We introduce a novel topological framework encapsulating the relations between loops that sheds light on the understanding of κ. Based on this framework, we give an algorithm to sample sequences with minimum κ on a particular topologically classified case as well as giving hints to the solution in the other cases. As a result, we utilize our sequence sampler to study some established riboswitches. Conclusion: Our analysis of riboswitch sequences shows that a pair of structures needs to satisfy key properties in order to facilitate phenotypic transitions and that pairs of random structures are unlikely to do so. Our analysis observes a distinct signature of riboswitch sequences, suggesting a new criterion for identifying native sequences and sequences subjected to evolutionary pressure. Our free software is available at: https://github.com/FenixHuang667/Bifold.
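
The definition of a bicompatible sequence can be illustrated with a short check against two dot-bracket structures. This sketch only tests compatibility; it is not the Boltzmann sampler of the abstract and is unrelated to the Bifold code. The function names are hypothetical, and allowing G-U wobble pairs alongside Watson-Crick pairs is an assumption about the base-pairing rules.

```python
from typing import List, Tuple

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' in structure")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def is_compatible(seq: str, structure: str) -> bool:
    """True if every base pair of the structure joins an allowed pair of bases."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))


def is_bicompatible(seq: str, first: str, second: str) -> bool:
    """True if seq is compatible with both structures at once."""
    return is_compatible(seq, first) and is_compatible(seq, second)


if __name__ == "__main__":
    first, second = "(...)....", "....(...)"
    for seq in ("GAAAUAAAA", "GAAACAAAA"):
        print(seq, is_bicompatible(seq, first, second))
```

In the example, position 4 pairs with position 0 in the first structure and with position 8 in the second, so the base placed there has to be able to pair with both partners. Constraints of this kind, chained across loops, are what make sampling bicompatible sequences harder than sampling for a single structure.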

