An iterated learning approach to the origins of the standard genetic code can help to explain its sequence of amino acid assignments

2018 ◽  
Author(s):  
Tom Froese ◽  
Jorge I. Campos ◽  
Nathaniel Virgo
2016 ◽  
Vol 14 (3) ◽  
pp. 275-298 ◽  
Author(s):  
Natasa Misic

This paper presents preliminary results and conclusions on one of the fundamental questions about the genetic code: the selective mechanisms underlying its origin and evolution, and in particular their hypothesized different natures, originally considered in [1,2,3]. A novel approach is introduced, based on known arithmetic regularities within the genetic code that are determined by the nucleon balances of amino acids and their divisibility by the decimal number 37 [4]. An aggregate nucleon number of an amino acid and its cognate codon is introduced as a parameter for systematizing the genetic code, and divisibility is tested not only by 37 but also by 13.7, the self-similarity constant of decimal scaling [5]. Relevant nucleon sums were obtained for the most prominent divisions of the standard genetic code (SGC) according to the p-adic model of the vertebrate mitochondrial code (VMC) in [6]. The nucleon number divisibility pattern by 37 and 13.7 is also analyzed for the RNA and DNA codon spaces and for the amino acid space. The results obtained, particularly the generally higher divisibility of the nucleon sums by 37 and 13.7 in the SGC than in the VMC, and the correspondence of the nucleon number divisibility patterns of the RNA codon space and the amino acid space of the SGC, both separately and jointly, with the code degeneracy pattern, suggest several conclusions. They support the hypothesis [1,2,3,7] that the selective driving forces acting during the emergence (an ancient phase) and the evolution (a modern phase) of the genetic code are different, imply the existence of an environment-dependent stereochemical mechanism throughout the entire period of the genetic code's emergence, and support a mineral-mediated origin of the genetic code [7,8].


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 409
Author(s):  
Tamara L. Hendrickson ◽  
Whitney N. Wood ◽  
Udumbara M. Rathnayake

The twenty amino acids in the standard genetic code were fixed prior to the last universal common ancestor (LUCA). Factors that guided this selection included the establishment of pathways for their metabolic synthesis and the concomitant fixation of substrate specificities in the emerging aminoacyl-tRNA synthetases (aaRSs). In this conceptual paper, we propose that the chemical reactivity of some amino acid side chains (e.g., lysine, cysteine, homocysteine, ornithine, homoserine, and selenocysteine) delayed or prohibited the emergence of the corresponding aaRSs and helped define the amino acids in the standard genetic code. We also consider the possibility that amino acid chemistry delayed the emergence of the glutaminyl- and asparaginyl-tRNA synthetases, neither of which is ubiquitous in extant organisms. We argue that fundamental chemical principles played critical roles in the fixation of some aspects of the genetic code pre- and post-LUCA.


2019 ◽  
Vol 464 ◽  
pp. 21-32 ◽  
Author(s):  
Paweł Błażej ◽  
Małgorzata Wnętrzak ◽  
Dorota Mackiewicz ◽  
Przemysław Gagat ◽  
Paweł Mackiewicz

2021 ◽  
Author(s):  
Louis R Nemzer

A three-dimensional representation of the twenty proteinogenic amino acids in a physicochemical space is presented. Vectors corresponding to amino acid substitutions are classified based on whether they are accessible via a single-nucleotide mutation. It is shown that the standard genetic code establishes a "choice architecture" that permits nearly independent tuning of the properties related to size and those related to hydrophobicity. This work sheds light on the metarules of evolvability that may have shaped the standard genetic code to increase the probability that adaptive point mutations will be generated. The usefulness of visualizing amino acid substitutions in a 3D physicochemical space is illustrated using data collected from the SARS-CoV-2 receptor binding domain. The substitutions most responsible for antibody escape are almost always inaccessible via single-nucleotide mutation, and also change multiple properties concurrently. The results of this research can extend our understanding of certain hereditary disorders caused by point mutations, as well as guide rational protein and vaccine design.
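
The classification step described in this abstract (whether a substitution is accessible via a single-nucleotide mutation) can be sketched in a few lines. The following is a minimal illustration only, not the author's implementation; the function names are hypothetical, and the code table is the standard genetic code written in the usual TCAG ordering.

```python
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest); "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def accessible_substitutions():
    """Return the set of unordered amino acid pairs linked by one point mutation."""
    pairs = set()
    for codon, aa in CODON_TABLE.items():
        for neighbor in single_nucleotide_neighbors(codon):
            other = CODON_TABLE[neighbor]
            # Skip synonymous changes and anything involving a stop codon.
            if other != aa and "*" not in (aa, other):
                pairs.add(frozenset((aa, other)))
    return pairs


ACCESSIBLE = accessible_substitutions()


def is_accessible(aa_from, aa_to):
    """Hypothetical helper: True if some codon pair for the two residues differs by one base."""
    return frozenset((aa_from, aa_to)) in ACCESSIBLE
```

For example, `is_accessible("F", "L")` is true (TTT to TTA), while `is_accessible("W", "M")` is false because TGG and ATG differ at two positions.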


2018 ◽  
Vol 2 (4) ◽  
pp. 607-618 ◽  
Author(s):  
Jean-François Brugère ◽  
John F. Atkins ◽  
Paul W. O'Toole ◽  
Guillaume Borrel

The 22nd amino acid discovered to be directly encoded, pyrrolysine, is specified by UAG. Until recently, pyrrolysine was only known to be present in archaea from a methanogenic lineage (Methanosarcinales), where it is important in enzymes catalysing anoxic methylamine metabolism, and in a few anaerobic bacteria. Relatively new discoveries have revealed a wider presence in archaea, deepened functional understanding, shown remarkable carbon source-dependent expression of expanded decoding and extended exploitation of the pyrrolysine machinery for synthetic code expansion. At the same time, other studies have shown the presence of pyrrolysine-containing archaea in the human gut and this has prompted health considerations. The article reviews our knowledge of this fascinating exception to the ‘standard’ genetic code.


Life ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 773
Author(s):  
Ádám Radványi ◽  
Ádám Kun

The genetic code evolved, to some extent, to minimize the effects of mutations. The effects of mutations depend on the amino acid repertoire, the structure of the genetic code and the frequencies of amino acids in proteomes. The amino acid compositions of proteins and the corresponding codon usages are still under selection, which allows us to ask what kind of environment the standard genetic code is adapted to. Using simple computational models and comprehensive datasets comprising genomic and environmental data from all three domains of Life, we estimate the expected severity of non-synonymous genomic mutations in proteins, measured by the change in amino acid physicochemical properties. We show that the fidelity in these physicochemical properties is expected to deteriorate with extremophilic codon usages, especially in thermophiles. These findings suggest that the genetic code performs better under non-extremophilic conditions, which explains the low substitution rates encountered in halophiles and thermophiles; the revealed relationship between the genetic code and habitat also allows us to reflect on earlier phases in the history of Life.
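
The kind of estimate described in this abstract (expected severity of non-synonymous mutations, weighted by codon usage) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' model: the Kyte-Doolittle hydropathy scale is an assumed stand-in for their physicochemical properties, all point mutations are assumed equally likely, and `expected_severity` is a hypothetical name.

```python
from itertools import product

# Standard genetic code in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy values (assumed property; the paper may use others).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def expected_severity(codon_usage=None):
    """Usage-weighted mean |hydropathy change| over non-synonymous point mutations.

    `codon_usage` maps codons to relative frequencies; uniform usage is
    assumed when it is omitted. Mutations to or from stop codons are ignored.
    """
    if codon_usage is None:
        codon_usage = {codon: 1.0 for codon in CODON_TABLE}
    total_weight = 0.0
    total_change = 0.0
    for codon, weight in codon_usage.items():
        aa = CODON_TABLE[codon]
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if mutant == "*" or mutant == aa:
                    continue
                total_weight += weight
                total_change += weight * abs(HYDROPATHY[aa] - HYDROPATHY[mutant])
    return total_change / total_weight
```

Passing codon usage tables from different organisms to `expected_severity` would then allow their expected severities to be compared; the usage tables themselves would have to come from genomic data and are not supplied here.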


Life ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 975
Author(s):  
Alexander Nesterov-Mueller ◽  
Roman Popov

The combinatorial fusion cascade was proposed as a transition stage between prebiotic chemistry and early forms of life. The combinatorial fusion cascade consists of three stages: eight initial complementary pairs of amino acids, four protocodes, and the standard genetic code. The initial complementary pairs and the protocodes are divided into dominant and recessive entities. The transitions between these stages obey the same combinatorial fusion rules for all amino acids. The combinatorial fusion cascade mathematically describes the codon assignments in the standard genetic code. It explains the existence of amino acids with even and odd numbers of codons, the appearance of stop codons, the inclusion of novel canonical amino acids, the exceptionally high numbers of codons for the amino acids arginine, leucine, and serine, and the temporal order of amino acid inclusion into the genetic code. The temporal order of amino acids within the cascade is congruent with the consensus temporal order previously derived from the similarities between the available hypotheses. Control over combinatorial fusion cascades would open the road to a novel technology for developing artificial microorganisms.


2021 ◽  
Author(s):  
Michael Yarus

Minimally-evolved codes are constructed with randomly chosen Standard Genetic Code (SGC) triplets, and completed with completely random triplet assignments. Such “genetic codes” have not evolved, but retain SGC qualities. Retained qualities are inescapable, part of the logic of code evolution. For example, sensitivity of coding to arbitrary assignments, which must be <≈ 10%, is intrinsic. Such sensitivity comes from elementary combinatorial properties of coding, and constrains any SGC evolution hypothesis. Similarly, evolution of last-evolved functions is difficult, due to late kinetic phenomena, likely common across codes. A census of minimally-evolved code assignments shows that the shape and size of wobble domains control packing into a coding table, shifting the accuracy of codon assignments. Access to the SGC therefore requires a plausible pathway to limited randomness, avoiding difficult completion while packing a highly ordered, degenerate code into a fixed three-dimensional space. Late Crick wobble in a 3-dimensional genetic code previously assembled by lateral transfer satisfies these varied, simultaneous requirements. By allowing parallel evolution of SGC domains, it can yield shortened evolution to SGC-level order, and allow the code to arise in smaller populations. It effectively yields full codes. Less obviously, it unifies well-studied sources for order in amino acid coding, including a minority of stereochemical triplet-amino acid associations. Finally, fusion of its intermediates into the definitive SGC is credible, mirroring broadly-accepted later events in cellular evolution.


2016 ◽  
Author(s):  
Xiaolong Wang ◽  
Quanjiang Dong ◽  
Gang Chen ◽  
Jianye Zhang ◽  
Yongqiang Liu ◽  
...  

Frameshift mutations yield truncated, dysfunctional product proteins, leading to loss of function, genetic disorders or even death. Frameshift mutations have been considered mostly harmful and of little importance for the molecular evolution of proteins. Frameshift protein sequences, encoded by the alternative reading frames of a coding gene, have therefore been considered meaningless. However, existing studies have shown that frameshift genes/proteins are widespread and sometimes functional. It is puzzling how a frameshift protein keeps its structure and functionality while its amino acid sequence is changed substantially. Here we demonstrate that the protein sequences of the frameshifts are highly conservative when compared with the wild-type protein sequence, and that the similarities among the three protein sequences encoded in the three reading frames of a coding gene are defined mainly by the genetic code. In the standard genetic code, amino acid substitutions assigned to frameshift codon substitutions are far more conservative than those assigned to random substitutions. The frameshift tolerability of the standard genetic code ranks in the top 1.0-5.0% of all possible genetic codes, showing that the genetic code is optimal in terms of frameshift tolerance. In some higher species, the shiftability is further optimized at the gene or genome level by a biased usage of codons and codon pairs, in which frameshift-tolerable codons/codon pairs are overrepresented in their genomes.
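
The three protein sequences encoded in the three forward reading frames of a coding sequence, which this abstract compares, can be obtained with a short sketch like the one below. It is only an illustration of the translation step under the standard genetic code, not the authors' analysis pipeline; `translate` and `reading_frames` are hypothetical names, and no similarity scoring is included.

```python
from itertools import product

# Standard genetic code in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, dropping any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reading_frames(dna):
    """Return translations of the three forward frames (offsets 0, 1 and 2)."""
    return [translate(dna[offset:]) for offset in range(3)]
```

For the toy input `"ATGGCCAAG"`, `reading_frames` returns `["MAK", "WP", "GQ"]`: the in-frame product followed by the two shifted frames. Comparing such sequences with a substitution scoring matrix would be the next step, which is left out here.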


2017 ◽  
Author(s):  
Miloje M. Rakocevic

This work shows that the 20 protein amino acids ("the canonical amino acids" within the genetic code) appear to form a whole and highly symmetrical system in many respects, all based on strict chemical distinctions in terms of their similarity, complexity, stereochemistry and diversity types. All of these distinctions are accompanied by specific arithmetical and algebraic regularities, including the existence of amino acid ordinal numbers from 1 to 20. The classification of amino acids into two decades (1-10 and 11-20) appears to be in strict correspondence with the atom number balances. From the presented "ideal" and "intelligent" structures and arrangements, it is concluded that the genetic code was complete even in prebiotic conditions (as a set of 20 canonical amino acids and a set of 2+2 pyrimidine / purine canonical bases, respectively), and that the notion of "evolution" of the genetic code can only mean the degree of freedom of the standard genetic code, i.e. the possible exceptions to and deviations from the standard genetic code. [This is the second version, with minimal interventions in the text. In addition, one passage was added in front of the second star, quoting T. Jukes. Remark 4 was added, along with more adequate shading in the Table inside Box 2.]

