Emergence of function from single RNA sequences by Darwinian evolution

2021 ◽  
Author(s):  
Falk Wachowius ◽  
Benjamin T. Porebski ◽  
Christopher M. Johnson ◽  
Philipp Holliger

The spontaneous emergence of function from pools of random sequence RNA is widely considered an important transition in the origin of life. However, the plausibility of this hypothetical process and the number of productive evolutionary trajectories in sequence space are unknown. Here we demonstrate that function can arise starting from a single RNA sequence by an iterative process of mutation and selection. Specifically, we describe the discovery of specific ATP and GTP aptamers, each with micromolar affinity for its nucleotide ligand, starting in each case from a single, homopolymeric poly-A sequence flanked by conserved primer binding sites. Our results indicate that the ab initio presence of large, diverse random sequence pools is not a prerequisite for the emergence of functional RNAs and that the process of Darwinian evolution has the capacity to generate function even from single, largely unstructured RNA sequences with minimal molecular and informational complexity.
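
The iterative mutation-and-selection process described in this abstract can be pictured with a toy simulation. The sketch below is only an illustration under loud assumptions: the scoring function (best match to an arbitrary motif), the motif itself, the sequence length, and all rates and pool sizes are hypothetical and are not taken from the study, where selection was for ligand binding in the laboratory.

```python
import random

ALPHABET = "ACGU"

def mutate(seq, rate, rng):
    """Redraw each position at random with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)

def score(seq, motif):
    """Hypothetical stand-in for function: best motif match at any offset."""
    return max(sum(a == b for a, b in zip(seq[i:], motif))
               for i in range(len(seq) - len(motif) + 1))

def evolve(start, motif, rounds=30, pool_size=200, keep=20, rate=0.05, seed=0):
    """Repeated rounds of mutation followed by selection of the top scorers."""
    rng = random.Random(seed)
    pool = [start]  # the pool begins as a single sequence
    for _ in range(rounds):
        offspring = [mutate(rng.choice(pool), rate, rng) for _ in range(pool_size)]
        # Parents stay in the pool, so the best score never decreases.
        pool = sorted(offspring + pool, key=lambda s: score(s, motif),
                      reverse=True)[:keep]
    return pool[0]

start = "A" * 30        # homopolymeric starting point (length is an assumption)
motif = "GCGUACGU"      # arbitrary illustrative target, not a real aptamer motif
best = evolve(start, motif)
print(score(start, motif), score(best, motif), best)
```

Keeping the parents in each round is a design choice made here for simplicity; it guarantees the top score is non-decreasing but is not a claim about the experimental protocol.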

2020 ◽  
Vol 117 (11) ◽  
pp. 5741-5748 ◽  
Author(s):  
Travis Walton ◽  
Saurja DasGupta ◽  
Daniel Duzdevich ◽  
Seung Soo Oh ◽  
Jack W. Szostak

The hypothesized central role of RNA in the origin of life suggests that RNA propagation predated the advent of complex protein enzymes. A critical step of RNA replication is the template-directed synthesis of a complementary strand. Two experimental approaches have been extensively explored in the pursuit of demonstrating protein-free RNA synthesis: template-directed nonenzymatic RNA polymerization using intrinsically reactive monomers and ribozyme-catalyzed polymerization using more stable substrates such as biological 5′-triphosphates. Despite significant progress in both approaches in recent years, the assembly and copying of functional RNA sequences under prebiotic conditions remains a challenge. Here, we explore an alternative approach to RNA-templated RNA copying that combines ribozyme catalysis with RNA substrates activated with a prebiotically plausible leaving group, 2-aminoimidazole (2AI). We applied in vitro selection to identify ligase ribozymes that catalyze phosphodiester bond formation between a template-bound primer and a phosphor-imidazolide–activated oligomer. Sequencing revealed the progressive enrichment of 10 abundant sequences from a random sequence pool. Ligase activity was detected in all 10 RNA sequences; all required activation of the ligator with 2AI and generated a 3′-5′ phosphodiester bond. We propose that ribozyme catalysis of phosphodiester bond formation using intrinsically reactive RNA substrates, such as imidazolides, could have been an evolutionary step connecting purely nonenzymatic to ribozyme-catalyzed RNA template copying during the origin of life.


2006 ◽  
Vol 34 (4) ◽  
pp. 560-561 ◽  
Author(s):  
R.A. Watson ◽  
D.M. Weinreich ◽  
J. Wakeley

Whereas spontaneous point mutation operates on nucleotides individually, sexual recombination manipulates the set of nucleotides within an allele as an essentially particulate unit. In principle, these two different scales of variation enable selection to follow fitness gradients in two different spaces: in nucleotide sequence space and allele sequence space respectively. Epistasis for fitness at these two scales, between nucleotides and between genes, may be qualitatively different and may significantly influence the advantage of mutation-based and recombination-based evolutionary trajectories respectively. We examine scenarios where the genetic sequence within a gene strongly influences the fitness effect of a mutation in that gene, whereas epistatic interactions between sites in different genes are weak or absent. We find that, in cases where beneficial alleles of a gene differ from one another at several nucleotide sites, sexual populations can exhibit enormous benefit compared with asexual populations: not only discovering fit genotypes faster than asexual populations, but also discovering high-fitness genotypes that are effectively not evolvable in asexual populations.


2008 ◽  
Vol 73 (1) ◽  
pp. 41-53
Author(s):  
Aleksandra Rakic ◽  
Petar Mitrasinovic

The present study uses molecular dynamics simulations to characterize the behavior of the GAA (1186-1188) hairpin triloops with their closing c-g base pairs in large ribonucleoligand complexes (PDB IDs: 1njn, 1nwy, 1jzx). The relative energies of the motifs in the complexes with respect to that in the reference structure (unbound form of rRNA; PDB ID: 1njp) display trends that agree with those of the conformational parameters reported in a previous study [1] utilizing the de novo pseudotorsional (η,θ) approach. The RNA regions around the actual RNA-ligand contacts, which experience the most substantial conformational changes upon formation of the complexes, were identified. The thermodynamic parameters, based on a two-state conformational model of RNA sequences containing 15, 21 and 27 nucleotides in the immediate vicinity of the particular binding sites, were evaluated. From a structural standpoint, the strain of a triloop that is far from the specific contacts and interacts primarily with other parts of the ribosome was established as a structural feature that conforms to the trend of the average values of the thermodynamic variables corresponding to the three motifs defined by the 15-, 21- and 27-nucleotide sequences. From a functional standpoint, RNA-ligand recognition is suggested to be dictated by the types of ligands in the complexes.


Molecules ◽  
2021 ◽  
Vol 26 (6) ◽  
pp. 1671
Author(s):  
Ráchel Sgallová ◽  
Edward A. Curtis

Methods of artificial evolution such as SELEX and in vitro selection have made it possible to isolate RNA and DNA motifs with a wide range of functions from large random sequence libraries. Once the primary sequence of a functional motif is known, the sequence space around it can be comprehensively explored using a combination of random mutagenesis and selection. However, methods to explore the sequence space of a secondary structure are not as well characterized. Here we address this question by describing a method to construct libraries in a single synthesis which are enriched for sequences with the potential to form a specific secondary structure, such as that of an aptamer, ribozyme, or deoxyribozyme. Although interactions such as base pairs cannot be encoded in a library using conventional DNA synthesizers, it is possible to modulate the probability that two positions will have the potential to pair by biasing the nucleotide composition at these positions. Here we show how to maximize this probability for each of the possible ways to encode a pair (in this study defined as A-U or U-A or C-G or G-C or G.U or U.G). We then use these optimized coding schemes to calculate the number of different variants of model stems and secondary structures expected to occur in a library for a series of structures in which the number of pairs and the extent of conservation of unpaired positions is systematically varied. Our calculations reveal a tradeoff between maximizing the probability of forming a pair and maximizing the number of possible variants of a desired secondary structure that can occur in the library. They also indicate that the optimal coding strategy for a library depends on the complexity of the motif being characterized. 
Because this approach provides a simple way to generate libraries enriched for sequences with the potential to form a specific secondary structure, we anticipate that it should be useful for the optimization and structural characterization of functional nucleic acid motifs.
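
The tradeoff described in this abstract can be made concrete with a short calculation. The following is a minimal sketch, assuming the nucleotides at two positions are drawn independently from per-position compositions; the function names and the example compositions are illustrative choices and are not the optimized coding schemes reported in the paper. The set of allowed pairs follows the definition given in the abstract.

```python
# Pairs as defined in the abstract: Watson-Crick plus G.U wobble, both orientations.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("C", "G"),
                 ("G", "C"), ("G", "U"), ("U", "G")}

def pair_probability(p, q):
    """Probability that independently drawn nucleotides at two positions can pair.

    p and q map nucleotides to their fractions at the two positions.
    """
    return sum(p.get(x, 0.0) * q.get(y, 0.0) for x, y in ALLOWED_PAIRS)

def pair_variants(p, q):
    """Number of distinct allowed pairs that can occur at these two positions."""
    return sum(1 for x, y in ALLOWED_PAIRS
               if p.get(x, 0.0) > 0 and q.get(y, 0.0) > 0)

uniform = {n: 0.25 for n in "ACGU"}   # fully random position
only_g = {"G": 1.0}                   # hypothetical biased position
c_or_u = {"C": 0.5, "U": 0.5}         # hypothetical biased partner

print(pair_probability(uniform, uniform), pair_variants(uniform, uniform))
print(pair_probability(only_g, c_or_u), pair_variants(only_g, c_or_u))
```

With two fully random positions, 6 of the 16 equally likely combinations can pair, so the probability is 0.375 with all 6 pair variants represented. Fixing one position as G and splitting the other between C and U raises the probability to 1.0 but leaves only 2 variants, which is the kind of tradeoff the abstract refers to.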


2011 ◽  
Vol 2011 ◽  
pp. 1-6
Author(s):  
Junji Kawakami ◽  
Yoshie Yamaguchi ◽  
Naoki Sugimoto

We developed a novel method for analyzing RNA sequences, termed triplet analysis, and applied the method in an in vitro RNA selection experiment in which HIV-1 Tat was the target. Aptamers are nucleic acids that bind a desired target (bait), and to date, many aptamers have been identified by in vitro selection from sufficiently concentrated libraries in which many RNAs had an obvious consensus primary sequence after sufficient cycles of the selection. Therefore, the higher-order structural features of the aptamers that are indispensable for interaction with the bait must be determined by additional investigation of the aptamers. In contrast, our triplet analysis enabled us to extract important information on functional primary and secondary structure from minimally concentrated RNA libraries. As a result, by using our method, an important unpaired region that is similar to the bulge of TAR was readily predicted from a partially concentrated library in which no consensus sequence was revealed by a conventional sequence analysis. Moreover, our analysis method may be used to assess a variety of structural motifs with desired function.


1992 ◽  
Vol 118 (1) ◽  
pp. 11-21 ◽  
Author(s):  
C Kambach ◽  
I W Mattaj

Nuclear transport of the U1 snRNP-specific protein U1A has been examined. U1A moves to the nucleus by an active process which is independent of interaction with U1 snRNA. Nuclear localization requires an unusually large sequence element situated between amino acids 94 and 204 of the protein. U1A transport is not unidirectional. The protein shuttles between nucleus and cytoplasm. At equilibrium, the concentration of the protein in the nucleus and cytoplasm is not, however, determined solely by transport rates, but can be perturbed by introducing RNA sequences that can specifically bind U1A in either the nuclear or cytoplasmic compartment. Thus, U1A represents a novel class of protein which shuttles between cytoplasm and nucleus and whose intracellular distribution can be altered by the number of free binding sites for the protein present in the cytoplasm or the nucleus.


2010 ◽  
Vol 10 (3) ◽  
pp. M110.000786 ◽  
Author(s):  
Rebecca F. Halperin ◽  
Phillip Stafford ◽  
Stephen Albert Johnston

2018 ◽  
Author(s):  
Kaiming Zhang ◽  
Xiaoyong Pan ◽  
Yang Yang ◽  
Hong-Bin Shen

Circular RNAs (circRNAs), with their crucial roles in gene regulation and disease development, have become a rising star in the RNA world. Many previous wet-lab studies focused on the interaction mechanisms between circRNAs and RNA-binding proteins (RBPs), as knowledge of circRNA-RBP association is very important for understanding the functions of circRNAs. Recently, abundant CLIP-Seq experimental data have made the large-scale identification and analysis of circRNA-RBP interactions possible, but no computational tool based on machine learning has been developed yet. We present a new deep learning-based method, CRIP (CircRNAs Interact with Proteins), for the prediction of RBP binding sites on circRNAs, using only the RNA sequences. In order to fully exploit the sequence information, we propose a stacked codon-based encoding scheme and a hybrid deep learning architecture, in which a convolutional neural network (CNN) learns high-level abstract features and a recurrent neural network (RNN) learns long dependency in the sequences. We construct 37 datasets including sequence fragments of binding sites on circRNAs, and each set corresponds to one RBP. The experimental results show that the new encoding scheme is superior to the existing feature representation methods for RNA sequences, and the hybrid network outperforms conventional classifiers by a large margin, where both the CNN and RNN components contribute to the performance improvement. To the best of our knowledge, CRIP is the first machine learning-based tool specialized in the prediction of circRNA-RBP interactions, which is expected to play an important role in large-scale function analysis of circRNAs.
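
The abstract does not spell out the stacked codon-based encoding, so the following is only one plausible reading, offered as a sketch: the sequence is cut into overlapping triplets (stride 1) and each triplet is one-hot encoded. The 64-dimensional one-hot layout, the handling of unknown symbols, and all names below are assumptions and may differ from what CRIP actually uses.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Map each of the 64 possible triplets to a column index (assumed layout).
CODON_INDEX = {"".join(c): i
               for i, c in enumerate(product(NUCLEOTIDES, repeat=3))}

def overlapping_codons(seq):
    """Split a sequence into overlapping triplets with a stride of one."""
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

def one_hot_codons(seq):
    """Return one row per triplet; each row is a 64-element one-hot list."""
    rows = []
    for codon in overlapping_codons(seq):
        row = [0] * len(CODON_INDEX)
        idx = CODON_INDEX.get(codon)
        if idx is not None:  # triplets with unknown symbols (e.g. N) stay all-zero
            row[idx] = 1
        rows.append(row)
    return rows

encoded = one_hot_codons("AUGGC")
print(overlapping_codons("AUGGC"), len(encoded), len(encoded[0]))
```

A matrix of this shape (one row per triplet) is the kind of input a convolutional layer followed by a recurrent layer could consume, in line with the hybrid architecture the abstract describes.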


2021 ◽  
Author(s):  
Joseph M Taft ◽  
Cedric R Weber ◽  
Beichen Gao ◽  
Roy A Ehling ◽  
Jiami Han ◽  
...  

The continual evolution of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) and the emergence of variants that show resistance to vaccines and neutralizing antibodies threaten to prolong the coronavirus disease 2019 (COVID-19) pandemic. Selection and emergence of SARS-CoV-2 variants are driven in part by mutations within the viral spike protein and in particular the ACE2 receptor-binding domain (RBD), a primary target site for neutralizing antibodies. Here, we develop deep mutational learning (DML), a machine learning-guided protein engineering technology, which is used to interrogate a massive sequence space of combinatorial mutations, representing billions of RBD variants, by accurately predicting their impact on ACE2 binding and antibody escape. A highly diverse landscape of possible SARS-CoV-2 variants is identified that could emerge from a multitude of evolutionary trajectories. DML may be used for predictive profiling on current and prospective variants, including highly mutated variants such as omicron (B.1.1.529), thus supporting decision making for public health as well as guiding the development of therapeutic antibody treatments and vaccines for COVID-19.

