An iterated learning approach to the origins of the standard genetic code can help to explain its sequence of amino acid assignments

Standard genetic code: p-adic modelling, nucleon balances and selfsimilarity

2016 ◽

Vol 14 (3) ◽

pp. 275-298 ◽

Cited By ~ 2

Author(s):

Natasa Misic

Keyword(s):

Amino Acid ◽

Genetic Code ◽

Driving Forces ◽

Standard Genetic Code ◽

Origin And Evolution ◽

Novel Approach ◽

Nucleon Number ◽

The One ◽

Mitochondrial Code ◽

Rna And Dna

This paper presents preliminary results and conclusions on one of the fundamental questions about the genetic code: the selective mechanisms underlying its origin and evolution, and in particular their hypothetically different nature, originally considered in [1,2,3]. A novel approach is introduced, based on known arithmetic regularities within the genetic code that are determined by the nucleon balances of amino acids and their divisibility by the decimal number 37 [4]. An aggregate nucleon number of an amino acid and its cognate codon is introduced as a parameter for systematizing the genetic code, and divisibility is tested not only by 37 but also by 13.7, the self-similarity constant of decimal scaling [5]. Relevant nucleon sums were obtained for the most prominent divisions of the standard genetic code (SGC) according to the p-adic model of the vertebrate mitochondrial code (VMC) in [6]. The nucleon number divisibility pattern for 37 and 13.7 is also analyzed for the RNA and DNA codon spaces, as well as for the amino acid space. The results, particularly a generally higher divisibility of the nucleon sums by 37 and 13.7 in the SGC than in the VMC, and a correspondence between the code degeneracy pattern and the nucleon number divisibility pattern of the RNA codon space and the amino acid space of the SGC, both separately and jointly, suggest several conclusions. They support the hypothesis [1,2,3,7] that the selective driving forces acting during the emergence (an ancient phase) and the evolution (a modern phase) of the genetic code are different; they imply the existence of an environment-dependent stereochemical mechanism throughout the entire period of the genetic code's emergence; and they support a mineral-mediated origin of the genetic code [7,8].

Did Amino Acid Side Chain Reactivity Dictate the Composition and Timing of Aminoacyl-tRNA Synthetase Evolution?

Genes ◽

10.3390/genes12030409 ◽

2021 ◽

Vol 12 (3) ◽

pp. 409

Author(s):

Tamara L. Hendrickson ◽

Whitney N. Wood ◽

Udumbara M. Rathnayake

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Chemical Reactivity ◽

Trna Synthetase ◽

Amino Acid Side Chain ◽

Standard Genetic Code ◽

Last Universal Common Ancestor ◽

Trna Synthetases ◽

Universal Common Ancestor

The twenty amino acids in the standard genetic code were fixed prior to the last universal common ancestor (LUCA). Factors that guided this selection included establishment of pathways for their metabolic synthesis and the concomitant fixation of substrate specificities in the emerging aminoacyl-tRNA synthetases (aaRSs). In this conceptual paper, we propose that the chemical reactivity of some amino acid side chains (e.g., lysine, cysteine, homocysteine, ornithine, homoserine, and selenocysteine) delayed or prohibited the emergence of the corresponding aaRSs and helped define the amino acids in the standard genetic code. We also consider the possibility that amino acid chemistry delayed the emergence of the glutaminyl- and asparaginyl-tRNA synthetases, neither of which is ubiquitous in extant organisms. We argue that fundamental chemical principles played critical roles in the fixation of some aspects of the genetic code pre- and post-LUCA.

Many alternative and theoretical genetic codes are more robust to amino acid replacements than the standard genetic code

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2018.12.030 ◽

2019 ◽

Vol 464 ◽

pp. 21-32 ◽

Cited By ~ 11

Author(s):

Paweł Błażej ◽

Małgorzata Wnętrzak ◽

Dorota Mackiewicz ◽

Przemysław Gagat ◽

Paweł Mackiewicz

Keyword(s):

Amino Acid ◽

Genetic Code ◽

Standard Genetic Code ◽

Genetic Codes

Visualizing Amino Acid Substitutions in a Physicochemical Vector Space

10.1101/2021.07.15.452549 ◽

2021 ◽

Author(s):

Louis R Nemzer

Keyword(s):

Amino Acid ◽

Genetic Code ◽

Three Dimensional ◽

Point Mutations ◽

Amino Acid Substitutions ◽

Standard Genetic Code ◽

Single Nucleotide ◽

Single Nucleotide Mutation ◽

Nucleotide Mutation ◽

Hereditary Disorders

A three-dimensional representation of the twenty proteinogenic amino acids in a physicochemical space is presented. Vectors corresponding to amino acid substitutions are classified based on whether they are accessible via a single-nucleotide mutation. It is shown that the standard genetic code establishes a "choice architecture" that permits nearly independent tuning of the properties related to size and those related to hydrophobicity. This work sheds light on the metarules of evolvability that may have shaped the standard genetic code to increase the probability that adaptive point mutations will be generated. The usefulness of visualizing amino acid substitutions in a 3D physicochemical space is illustrated using data collected from the SARS-CoV-2 receptor binding domain. The substitutions most responsible for antibody escape are almost always inaccessible via a single-nucleotide mutation, and they also change multiple properties concurrently. The results of this research can extend our understanding of certain hereditary disorders caused by point mutations, as well as guide rational protein and vaccine design.
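
The accessibility classification mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the author's code: the function names (`codons_for`, `single_nucleotide_accessible`) are hypothetical, and the only data assumed is the standard genetic code table in the usual T, C, A, G ordering. A substitution counts as accessible if any codon of the source amino acid differs from any codon of the target amino acid at exactly one position.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_for(aa):
    """Return all codons that encode the one-letter amino acid `aa`."""
    return [codon for codon, a in CODON_TABLE.items() if a == aa]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def single_nucleotide_accessible(src, dst):
    """True if some codon of `src` is one nucleotide change away from some codon of `dst`."""
    return any(
        hamming(c1, c2) == 1
        for c1 in codons_for(src)
        for c2 in codons_for(dst)
    )


if __name__ == "__main__":
    for src, dst in [("E", "K"), ("F", "L"), ("W", "D")]:
        print(src, "->", dst, single_nucleotide_accessible(src, dst))
```

For example, glutamate to lysine is accessible (GAA to AAA), whereas tryptophan to aspartate is not, because TGG differs from both GAT and GAC at all three positions.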

Pyrrolysine in archaea: a 22nd amino acid encoded through a genetic code expansion

Emerging Topics in Life Sciences ◽

10.1042/etls20180094 ◽

2018 ◽

Vol 2 (4) ◽

pp. 607-618 ◽

Cited By ~ 3

Author(s):

Jean-François Brugère ◽

John F. Atkins ◽

Paul W. O'Toole ◽

Guillaume Borrel

Keyword(s):

Amino Acid ◽

Carbon Source ◽

Genetic Code ◽

Anaerobic Bacteria ◽

Standard Genetic Code ◽

Human Gut ◽

Functional Understanding ◽

Genetic Code Expansion

The 22nd amino acid discovered to be directly encoded, pyrrolysine, is specified by UAG. Until recently, pyrrolysine was only known to be present in archaea from a methanogenic lineage (Methanosarcinales), where it is important in enzymes catalysing anoxic methylamine metabolism, and in a few anaerobic bacteria. Relatively new discoveries have revealed a wider presence in archaea, deepened functional understanding, shown remarkable carbon source-dependent expression of expanded decoding and extended exploitation of the pyrrolysine machinery for synthetic code expansion. At the same time, other studies have shown the presence of pyrrolysine-containing archaea in the human gut, and this has prompted health considerations. The article reviews our knowledge of this fascinating exception to the ‘standard’ genetic code.

The Mutational Robustness of the Genetic Code and Codon Usage in Environmental Context: A Non-Extremophilic Preference?

Life ◽

10.3390/life11080773 ◽

2021 ◽

Vol 11 (8) ◽

pp. 773

Author(s):

Ádám Radványi ◽

Ádám Kun

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Physicochemical Properties ◽

Genetic Code ◽

Computational Models ◽

Environmental Data ◽

Standard Genetic Code ◽

Mutational Robustness ◽

Domains Of Life ◽

History Of

The genetic code evolved, to some extent, to minimize the effects of mutations. The effects of mutations depend on the amino acid repertoire, the structure of the genetic code and the frequencies of amino acids in proteomes. The amino acid compositions of proteins and the corresponding codon usages are still under selection, which allows us to ask what kind of environment the standard genetic code is adapted to. Using simple computational models and comprehensive datasets comprising genomic and environmental data from all three domains of Life, we estimate the expected severity of non-synonymous genomic mutations in proteins, measured by the change in amino acid physicochemical properties. We show that the fidelity in these physicochemical properties is expected to deteriorate with extremophilic codon usages, especially in thermophiles. These findings suggest that the genetic code performs better under non-extremophilic conditions. This not only explains the low substitution rates encountered in halophiles and thermophiles; the revealed relationship between the genetic code and habitat also allows us to ponder earlier phases in the history of Life.
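
A rough Python sketch of the kind of estimate described here follows. It is an illustration under stated assumptions, not the authors' model: the single property used (Kyte-Doolittle hydropathy), the squared-difference severity measure, the uniform treatment of all nucleotide changes, and the names `mutation_cost` and `codon_usage` are all choices made for this example. The function averages the squared hydropathy change over all single-nucleotide, non-synonymous substitutions that do not create a stop codon, weighting each source codon by its usage frequency.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy index (assumed here as the single property of interest).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mutation_cost(codon_usage=None):
    """Usage-weighted mean squared hydropathy change per non-synonymous point mutation.

    `codon_usage` maps codons to relative frequencies; None means uniform usage.
    Synonymous changes and changes to or from stop codons are skipped.
    """
    total = 0.0
    weight_sum = 0.0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        weight = 1.0 if codon_usage is None else codon_usage.get(codon, 0.0)
        for pos, base in product(range(3), BASES):
            if base == codon[pos]:
                continue
            mutant_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if mutant_aa == "*" or mutant_aa == aa:
                continue
            total += weight * (HYDROPATHY[aa] - HYDROPATHY[mutant_aa]) ** 2
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    print("uniform codon usage:", round(mutation_cost(), 3))
```

Passing different `codon_usage` dictionaries (for instance, one derived from a thermophile genome and one from a mesophile genome) would give a crude comparison in the spirit of the abstract, though real analyses would use several properties and realistic mutation spectra.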

The Combinatorial Fusion Cascade to Generate the Standard Genetic Code

Life ◽

10.3390/life11090975 ◽

2021 ◽

Vol 11 (9) ◽

pp. 975

Author(s):

Alexander Nesterov-Mueller ◽

Roman Popov

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Temporal Order ◽

Prebiotic Chemistry ◽

Standard Genetic Code ◽

The Road ◽

Novel Technology ◽

Three Stages ◽

Combinatorial Fusion

The combinatorial fusion cascade was proposed as a transition stage between prebiotic chemistry and early forms of life. The cascade consists of three stages: eight initial complementary pairs of amino acids, four protocodes, and the standard genetic code. The initial complementary pairs and the protocodes are divided into dominant and recessive entities. The transitions between these stages obey the same combinatorial fusion rules for all amino acids. The combinatorial fusion cascade mathematically describes the codon assignments in the standard genetic code. It explains the existence of amino acids with even and odd numbers of codons, the appearance of stop codons, the inclusion of novel canonical amino acids, the exceptionally high numbers of codons for the amino acids arginine, leucine, and serine, and the temporal order of amino acid inclusion into the genetic code. The temporal order of amino acids within the cascade is congruent with the consensus temporal order previously derived from the similarities between the available hypotheses. Control over combinatorial fusion cascades would open the road to a novel technology for developing artificial microorganisms.

Packing the Standard Genetic Code in its box: 3-dimensional late Crick wobble

10.1101/2021.01.18.427168 ◽

2021 ◽

Author(s):

Michael Yarus

Keyword(s):

Amino Acid ◽

Genetic Code ◽

Dimensional Space ◽

Parallel Evolution ◽

Three Dimensional ◽

Standard Genetic Code ◽

3 Dimensional ◽

Combinatorial Properties ◽

Genetic Codes ◽

Level Order

Minimally-evolved codes are constructed with randomly chosen Standard Genetic Code (SGC) triplets, and completed with completely random triplet assignments. Such “genetic codes” have not evolved, but retain SGC qualities. Retained qualities are inescapable, part of the logic of code evolution. For example, sensitivity of coding to arbitrary assignments, which must be ≲ 10%, is intrinsic. Such sensitivity comes from elementary combinatorial properties of coding, and constrains any SGC evolution hypothesis. Similarly, evolution of last-evolved functions is difficult, due to late kinetic phenomena, likely common across codes. Census of minimally-evolved code assignments shows that shape and size of wobble domains controls packing into a coding table, shifting the accuracy of codon assignments. Access to the SGC therefore requires a plausible pathway to limited randomness, avoiding difficult completion while packing a highly ordered, degenerate code into a fixed three-dimensional space. Late Crick wobble in a 3-dimensional genetic code previously assembled by lateral transfer satisfies these varied, simultaneous requirements. By allowing parallel evolution of SGC domains, it can yield shortened evolution to SGC-level order, and allow the code to arise in smaller populations. It effectively yields full codes. Less obviously, it unifies well-studied sources for order in amino acid coding, including a minority of stereochemical triplet-amino acid associations. Finally, fusion of its intermediates into the definitive SGC is credible, mirroring broadly-accepted later events in cellular evolution.

Frameshifts and wild-type protein sequences are always highly similar because the genetic code is optimal for frameshift tolerance

10.1101/067736 ◽

2016 ◽

Cited By ~ 3

Author(s):

Xiaolong Wang ◽

Quanjiang Dong ◽

Gang Chen ◽

Jianye Zhang ◽

Yongqiang Liu ◽

...

Keyword(s):

Amino Acid ◽

Genetic Code ◽

Protein Sequences ◽

Loss Of Function ◽

Wild Type ◽

Standard Genetic Code ◽

Type Protein ◽

Wild Type Protein ◽

Codon Pairs ◽

Reading Frames

Frameshift mutations yield truncated, dysfunctional product proteins, leading to loss of function, genetic disorders or even death. Frameshift mutations have been considered mostly harmful and of little importance for the molecular evolution of proteins. Frameshift protein sequences, encoded by the alternative reading frames of a coding gene, have therefore been considered meaningless. However, existing studies have shown that frameshift genes/proteins are widespread and sometimes functional. It is puzzling how a frameshift can keep its structure and functionality while its amino acid sequence is changed substantially. Here we demonstrate that the protein sequences of the frameshifts are highly conserved when compared with the wild-type protein sequence, and that the similarities among the three protein sequences encoded in the three reading frames of a coding gene are defined mainly by the genetic code. In the standard genetic code, amino acid substitutions assigned to frameshift codon substitutions are far more conservative than those assigned to random substitutions. The frameshift tolerability of the standard genetic code ranks in the top 1.0-5.0% of all possible genetic codes, showing that the genetic code is optimal in terms of frameshift tolerance. In some higher species, the shiftability is further optimized at the gene or genome level by a biased usage of codons and codon pairs, in which frameshift-tolerable codons/codon pairs are overrepresented in their genomes.
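
The comparison of the three reading frames can be sketched in a few lines of Python. This is only an illustration: the names `translate` and `identity` and the example sequence are hypothetical, and plain positional identity is used as a crude stand-in for whatever similarity scoring the authors applied. Stop codons in shifted frames are kept as `*` rather than terminating translation.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame=0):
    """Translate a DNA or RNA sequence starting at offset `frame` (0, 1 or 2).

    Trailing nucleotides that do not fill a complete codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def identity(a, b):
    """Fraction of identical residues over the positions shared by both sequences."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


if __name__ == "__main__":
    gene = "ATGGCCAAAGGTTTCGAC"  # hypothetical coding sequence, for illustration only
    wild_type = translate(gene)
    for frame in (1, 2):
        shifted = translate(gene, frame)
        print(frame, shifted, round(identity(wild_type, shifted), 2))
```

Replacing `identity` with a score based on a substitution matrix would be closer to the notion of conservative substitutions used in the abstract.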

Genetic code: Chemical Distinctions of Protein Amino Acids

10.31219/osf.io/86rjt ◽

2017 ◽

Author(s):

Miloje M. Rakocevic

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Standard Genetic Code ◽

Atom Number ◽

Canonical Bases ◽

Ordinal Numbers ◽

Intelligent Structures ◽

Protein Amino Acids

This work shows that the 20 protein amino acids ("the canonical amino acids" within the genetic code) appear to form a whole and highly symmetrical system in many respects, all based on strict chemical distinctions in terms of their similarity, complexity, stereochemical and diversity types. All of these distinctions are accompanied by specific arithmetical and algebraic regularities, including the existence of amino acid ordinal numbers from 1 to 20. The classification of amino acids into two decades (1-10 and 11-20) appears to be in strict correspondence with the atom number balances. From the presented "ideal" and "intelligent" structures and arrangements it follows that the genetic code was complete even in prebiotic conditions (as a set of 20 canonical amino acids and a set of 2+2 pyrimidine/purine canonical bases, respectively), and that the notion of "evolution" of the genetic code can only mean the degree of freedom of the standard genetic code, i.e. the possible exceptions and deviations from it. [This is the second version, with minimal interventions in the text. In addition, one passage quoting T. Jukes was added in front of the second star. Remark 4 and a more adequate shading in the Table inside Box 2 were also added.]

Standard genetic code: p-adic modelling, nucleon balances and selfsimilarity
