Basic principles of the genetic code extension

Royal Society Open Science ◽ 10.1098/rsos.191384 ◽ 2020 ◽ Vol 7 (2) ◽ pp. 191384

Author(s): Paweł Błażej ◽ Małgorzata Wnetrzak ◽ Dorota Mackiewicz ◽ Paweł Mackiewicz

Keyword(s): Amino Acids ◽ Genetic Code ◽ Point Mutations ◽ Coding System ◽ Base Pairs ◽ Induced Subgraphs ◽ Single Nucleotide ◽ Basic Principles ◽ Code Extension ◽ Incremental Addition

Compounds that include non-canonical amino acids (ncAAs) or other artificially designed molecules have many applications in medicine, industry and biotechnology. They can be produced by modifying or extending the standard genetic code (SGC). Organisms with a customized genetic code can deliver peptides or proteins containing ncAAs continuously and stably. Among the methods of engineering the code, using non-canonical base pairs is especially promising because it generates many new codons, which can be used to encode any new amino acid. Since even one pair of new bases yields a six-letter nucleotide alphabet and thus extends the SGC to up to 216 codons, the extension can be achieved in many ways. Here, we proposed a stepwise procedure for extending the SGC with one pair of non-canonical bases that minimizes the consequences of point mutations. We described the relationships between codons in the framework of graph theory. All 216 codons were represented as nodes of a graph whose edges correspond to all possible single-nucleotide mutations between codons. Therefore, every set of canonical and newly added codons induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. This allowed us to describe a procedure for incrementally adding sets of meaningful codons up to the full coding system consisting of three pairs of bases. The gradual extension of the SGC makes the whole system robust to changes in genetic information caused by mutations. It is also compatible with the view that codons and amino acids were added successively to the primordial SGC, which evolved to minimize the harmful consequences of mutations or mistranslations of encoded proteins.
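The graph representation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the letters X and Y are placeholder names for one hypothetical pair of non-canonical bases, and the helper names `point_mutation_graph` and `edge_count` are introduced here only for illustration.

```python
from itertools import product

CANONICAL = "ACGT"
# X and Y are placeholder symbols for one hypothetical non-canonical base pair.
EXTENDED = CANONICAL + "XY"


def point_mutation_graph(codons):
    """Map each codon to the codons of the same set that differ in one position.

    The result is the subgraph induced by `codons` in the full
    single-nucleotide-mutation graph.
    """
    codons = set(codons)
    alphabet = {base for codon in codons for base in codon}
    graph = {}
    for codon in codons:
        neighbours = set()
        for pos in range(3):
            for base in alphabet:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    if mutant in codons:
                        neighbours.add(mutant)
        graph[codon] = neighbours
    return graph


def edge_count(graph):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(adjacent) for adjacent in graph.values()) // 2


all_codons = ["".join(c) for c in product(EXTENDED, repeat=3)]
full_graph = point_mutation_graph(all_codons)

canonical_codons = ["".join(c) for c in product(CANONICAL, repeat=3)]
sgc_subgraph = point_mutation_graph(canonical_codons)

print(len(full_graph), edge_count(full_graph))      # 216 1620
print(len(sgc_subgraph), edge_count(sgc_subgraph))  # 64 288
```

In the six-letter graph every codon has 3 × 5 = 15 neighbours, whereas inside the subgraph induced by the 64 canonical codons each codon keeps only 3 × 3 = 9 of them. Any intermediate codon set can be passed to the same function to inspect the subgraph it induces.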

Some theoretical aspects of reprogramming the standard genetic code

10.1101/2020.09.12.294553 ◽ 2020

Author(s): Kuba Nowak ◽ Paweł Błażej ◽ Małgorzata Wnetrzak ◽ Dorota Mackiewicz ◽ Paweł Mackiewicz

Keyword(s): Amino Acids ◽ Genetic Code ◽ DNA Sequences ◽ Theoretical Perspective ◽ Optimal Number ◽ Optimal Size ◽ Coding System ◽ Standard Genetic Code ◽ Nucleotide Mutation ◽ Code Extension

Reprogramming the standard genetic code to include non-canonical amino acids (ncAAs) opens a new perspective in medicine, industry and biotechnology. Several methods of engineering the code allow us to store new genetic information in DNA sequences and transmit it into the protein world. Here, we investigate the problem of optimal genetic code extension from a theoretical perspective. We assume that the new coding system should encode both canonical amino acids and new ncAAs using the 64 classical codons. Moreover, the extended genetic code should be robust to point mutations and minimize the possibility of reversion from new to old information. To this end, we use graph theory to study the properties of optimal codon sets that can encode the 20 canonical amino acids and the stop signal. Finally, we describe the set of vacant codons that could be assigned to new amino acids. We also discuss the optimal number of newly incorporated ncAAs and the optimal size of the codon blocks assigned to them.

Some theoretical aspects of reprogramming the standard genetic code

Genetics ◽ 10.1093/genetics/iyab040 ◽ 2021

Author(s): Kuba Nowak ◽ Paweł Błażej ◽ Małgorzata Wnetrzak ◽ Dorota Mackiewicz ◽ Paweł Mackiewicz

Keyword(s): Amino Acids ◽ Genetic Code ◽ DNA Sequences ◽ Point Mutations ◽ Theoretical Background ◽ Optimal Number ◽ Optimal Size ◽ Coding System ◽ Standard Genetic Code ◽ Formal Procedure

Reprogramming the standard genetic code to include non-canonical amino acids (ncAAs) opens new prospects for medicine, industry, and biotechnology. Several methods of code engineering allow us to store new genetic information in DNA sequences and to produce proteins with new properties. Here, we provided a theoretical background for optimal genetic code expansion, which may find application in the experimental design of the genetic code. We assumed that the expanded genetic code includes both canonical and non-canonical information stored in the 64 classical codons. Moreover, the new coding system should be robust to point mutations and minimize the possibility of reversion from new to old information. To find such codes, we applied graph theory to analyze the properties of optimal codon sets. We presented a formal procedure for finding optimal codes with various numbers of vacant codons that could be assigned to new amino acids. Finally, we discussed the optimal number of newly incorporated ncAAs and the optimal size of the codon groups that can be assigned to them.
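One simple way to make the notion of "reversion from new to old information" concrete is to count the single-nucleotide substitutions that lead from a vacant (reassigned) codon to a codon that keeps its canonical meaning. The sketch below is only an illustration under that assumption, not the formal procedure of the paper. The two 16-codon sets and the helper `reversion_edges` are hypothetical choices made for the example.

```python
from itertools import product

BASES = "ACGT"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def neighbours(codon):
    """Yield the nine codons that differ from `codon` in exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def reversion_edges(vacant):
    """Count point mutations that lead from a vacant codon to a non-vacant one."""
    vacant = set(vacant)
    return sum(1 for codon in vacant for n in neighbours(codon) if n not in vacant)


# Hypothetical compact choice: all 16 codons that start with G.
block = [c for c in CODONS if c[0] == "G"]

# Hypothetical scattered choice: 16 codons, no two of which are neighbours
# (a single substitution always changes the sum of base indices modulo 4).
scattered = [c for c in CODONS if sum(BASES.index(b) for b in c) % 4 == 0]

print(len(block), reversion_edges(block))          # 16 48
print(len(scattered), reversion_edges(scattered))  # 16 144
```

With the same number of vacant codons, the compact block exposes a third as many reversion routes as the scattered set, which illustrates why the structure of the vacant set matters and not only its size.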

Making Sense of “Nonsense” and More: Challenges and Opportunities in the Genetic Code Expansion, in the World of tRNA Modifications

International Journal of Molecular Sciences ◽ 10.3390/ijms23020938 ◽ 2022 ◽ Vol 23 (2) ◽ pp. 938

Author(s): Olubodun Michael Lateef ◽ Michael Olawale Akintubosun ◽ Olamide Tosin Olaoba ◽ Sunday Ocholi Samson ◽ Malgorzata Adamczyk

Keyword(s): Amino Acids ◽ Genetic Code ◽ Protein Design ◽ Side Chains ◽ Vast Number ◽ tRNA Modifications ◽ Wide Range ◽ Challenges And Opportunities ◽ Code Extension ◽ Genetic Code Expansion

The evolutionary development of the RNA translation process, which leads to protein synthesis based on naturally occurring amino acids, is continued by synthetic biology, the so-called rational bioengineering. Genetic code expansion (GCE) goes beyond natural translational processes to further enhance the structural properties and augment the functionality of a wide range of proteins. Prokaryotic and eukaryotic ribosomal machinery has been shown to accept engineered tRNAs from orthogonal organisms and to efficiently incorporate noncanonical amino acids (ncAAs) with rationally designed side chains. These side chains can be reactive or functional groups, which can be used extensively in biochemical, biophysical, and cellular studies. Genetic code extension offers the possibility of introducing more than one ncAA into a protein through frameshift suppression and multi-site-specific incorporation of ncAAs, thereby increasing the number of possible applications. However, various factors reduce the yield and efficiency of ncAA incorporation into synthetic proteins. In this review, we comment on recent advances in genetic code expansion to highlight the relevance of systems biology in improving ncAA incorporation efficiency. We discuss the emerging impact of tRNA modifications and metabolism on protein design. We also provide examples of recent successes in synthetic protein therapeutics and show how codon expansion has been employed in various scientific and biotechnological applications.

Refactoring the Genetic Code for Increased Evolvability

10.1101/128058 ◽ 2017

Author(s): Gur Pines ◽ James D. Winkler ◽ Assaf Pines ◽ Ryan T. Gill

Keyword(s): Genetic Code ◽ Directed Evolution ◽ Single Gene ◽ Point Mutations ◽ Saturation Mutagenesis ◽ Mutagenic Potential ◽ Single Nucleotide ◽ Common Error ◽ Mutational Landscape ◽ Genetic Codes

Abstract: The standard genetic code is robust to mutations and base-pairing errors during transcription and translation. Point mutations are most likely to be synonymous or to preserve the chemical properties of the original amino acid. Saturation mutagenesis experiments suggest that in some cases the best-performing mutant requires the replacement of more than a single nucleotide within a codon. Because adjacent mutations are extremely rare, these replacements are essentially inaccessible to common error-based laboratory engineering techniques, which alter a single nucleotide per mutation event. In this theoretical study, we suggest a radical reordering of the genetic code that maximizes the mutagenic potential of single-nucleotide replacements. We explore several possible genetic codes that allow a greater degree of accessibility to the mutational landscape and may result in a hyper-evolvable organism serving as an ideal platform for directed evolution experiments. We conclude by evaluating potential applications for recoded organisms within the synthetic biology field.

Significance Statement: The conservative nature of the genetic code prevents bioengineers from efficiently accessing the full mutational landscape of a gene using common error-prone methods. Here we present two computational approaches to generate alternative genetic codes with increased accessibility. These new codes allow mutational transitions to a larger pool of amino acids, with a greater degree of chemical difference, using a single nucleotide replacement within the codon, thus increasing evolvability at both the single-gene and the genome level. Given the widespread use of these techniques for strain and protein improvement, as well as for more fundamental questions in evolutionary biology, the use of recoded organisms that maximize evolvability should significantly improve the efficiency of directed evolution, library generation and fitness maximization.
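The accessibility idea behind this study can be made concrete with a short Python sketch that lists, for a given codon of the standard genetic code, the amino acids reachable by a single nucleotide replacement. This is an illustration only and not one of the two computational approaches mentioned above. The function name `accessible` is hypothetical, and the code table is the standard one written in the usual TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base varies slowest); '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def accessible(codon, code=CODE):
    """Return the amino acids reachable from `codon` by one substitution.

    Synonymous changes and changes to a stop codon are not counted.
    """
    own = code[codon]
    found = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                aa = code[codon[:pos] + base + codon[pos + 1:]]
                if aa != own and aa != "*":
                    found.add(aa)
    return found


print(sorted(accessible("ATG")))  # ['I', 'K', 'L', 'R', 'T', 'V']

# Average number of accessible amino acids over the 61 sense codons.
sense = [c for c, aa in CODE.items() if aa != "*"]
print(sum(len(accessible(c)) for c in sense) / len(sense))
```

Passing a different dictionary as `code` gives the same statistic for an alternative codon assignment, so reordered codes can be compared with the standard one on this simple measure.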

Basic principles of genetic disease

ESC CardioMed ◽ 10.1093/med/9780198784906.003.0148 ◽ 2018 ◽ pp. 669-671

Author(s): Eric Schulze-Bahr

Keyword(s): Genetic Disease ◽ Copy Number ◽ Copy Number Variants ◽ Single Nucleotide Variants ◽ Individual Genome ◽ Base Pairs ◽ Single Nucleotide ◽ Basic Principles ◽ Bona Fide ◽ Genomic Regions

The human genome consists of approximately 3 billion (3 × 10^9) base pairs of DNA (around 20,000 genes), organized as 23 chromosomes (diploid parental set), and a small mitochondrial genome (37 genes, 13 of them protein-coding; 16,569 base pairs) of maternal origin. Most human genetic variation is natural, that is, common or rare (minor allele frequency >0.1%), and does not cause disease. Apart from any true disease-causing (bona fide) mutation, each individual genome harbours more than 3.5 million single nucleotide variants (including >10,000 non-synonymous changes causing amino acid substitutions) and 200–300 large structural or copy number variants (insertions/deletions of up to several thousand base pairs). These variants do not cause disease and are scattered throughout coding and non-coding genomic regions.

Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects

Life ◽ 10.3390/life11121338 ◽ 2021 ◽ Vol 11 (12) ◽ pp. 1338

Author(s): Elena Fimmel ◽ Markus Gumbel ◽ Martin Starman ◽ Lutz Strüngmann

Keyword(s): Genetic Code ◽ Computational Analysis ◽ Point Mutations ◽ Weighted Graph ◽ Single Nucleotide Variants ◽ Negative Effects ◽ Standard Genetic Code ◽ Single Nucleotide ◽ Random Code ◽ Optimal Weights

It is believed that the codon–amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph whose weights correspond to the probabilities of these mutations. The robustness of a code against point mutations can then be described by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position, as with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first and, even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared with single nucleotide variants (SNVs) in coding sequences, which support our findings. Additionally, we analyzed which genetic code structure evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.
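As a rough illustration of the conductance idea on a weighted point-mutation graph, the following Python sketch computes the share of mutation weight that leaves a codon set. It uses a simplified set conductance, namely the weight of edges leaving the set divided by the total weight of edges incident to it, and assumes one weight per codon position. The weights shown are arbitrary placeholders, not values estimated in the paper, and `set_conductance` is a hypothetical helper name.

```python
from fractions import Fraction

BASES = "ACGT"


def set_conductance(codon_set, weights=(1, 1, 1)):
    """Weight of mutations leaving `codon_set` divided by all mutation weight
    incident to it; weights[i] applies to substitutions at codon position i."""
    codon_set = set(codon_set)
    leaving = Fraction(0)
    volume = Fraction(0)
    for codon in codon_set:
        for pos, weight in enumerate(weights):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                volume += Fraction(weight)
                if mutant not in codon_set:
                    leaving += Fraction(weight)
    return leaving / volume


# The four glycine codons GGN form a block that differs only in the third base.
glycine = ["GGA", "GGC", "GGG", "GGT"]

print(set_conductance(glycine))             # 2/3
print(set_conductance(glycine, (1, 1, 2)))  # 1/2
```

Giving third-position substitutions a larger weight lowers the conductance of this block, because those substitutions stay inside it. A lower value means that a smaller share of mutations changes the encoded amino acid.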

Visualizing Amino Acid Substitutions in a Physicochemical Vector Space

10.1101/2021.07.15.452549 ◽ 2021

Author(s): Louis R Nemzer

Keyword(s): Amino Acid ◽ Genetic Code ◽ Three Dimensional ◽ Point Mutations ◽ Amino Acid Substitutions ◽ Standard Genetic Code ◽ Single Nucleotide ◽ Single Nucleotide Mutation ◽ Nucleotide Mutation ◽ Hereditary Disorders

A three-dimensional representation of the twenty proteinogenic amino acids in a physicochemical space is presented. Vectors corresponding to amino acid substitutions are classified according to whether they are accessible via a single-nucleotide mutation. It is shown that the standard genetic code establishes a "choice architecture" that permits nearly independent tuning of the properties related to size and those related to hydrophobicity. This work sheds light on the metarules of evolvability that may have shaped the standard genetic code to increase the probability that adaptive point mutations will be generated. The usefulness of visualizing amino acid substitutions in a 3D physicochemical space is illustrated using data collected from the SARS-CoV-2 receptor binding domain. The substitutions most responsible for antibody escape are almost always inaccessible via a single-nucleotide mutation and also change multiple properties concurrently. The results of this research can extend our understanding of certain hereditary disorders caused by point mutations and guide the development of rational protein and vaccine design.
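The classification of substitutions by single-nucleotide accessibility can be sketched in a few lines of Python. This is an illustration under the standard genetic code, not the author's implementation, and the helper name `single_step` is hypothetical.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base varies slowest); '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Group the 64 codons by the amino acid (or stop signal) they encode.
CODONS_OF = defaultdict(list)
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_OF[aa].append("".join(codon))


def single_step(aa_from, aa_to):
    """True if some codon of `aa_from` is one substitution away from a codon of `aa_to`."""
    return any(
        sum(a != b for a, b in zip(c1, c2)) == 1
        for c1 in CODONS_OF[aa_from]
        for c2 in CODONS_OF[aa_to]
    )


print(single_step("M", "I"))  # True: ATG -> ATA, ATC or ATT
print(single_step("M", "W"))  # False: ATG -> TGG needs two substitutions
```

Substitutions for which `single_step` returns False require at least two nucleotide changes within the codon, which is the category the abstract describes as inaccessible via a single-nucleotide mutation.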

BE4max and AncBE4max Are Efficient in Germline Conversion of C:G to T:A Base Pairs in Zebrafish

Cells ◽ 10.3390/cells9071690 ◽ 2020 ◽ Vol 9 (7) ◽ pp. 1690

Author(s): Blake Carrington ◽ Rachel N. Weinstein ◽ Raman Sood

Keyword(s): Mammalian Cells ◽ Disease Modeling ◽ Gene Knockout ◽ Point Mutations ◽ Ease Of Use ◽ Nucleotide Substitutions ◽ Base Pairs ◽ Single Nucleotide ◽ Base Editing ◽ Highly Active

The ease of use and robustness of genome editing by CRISPR/Cas9 have led to the successful use of gene-knockout zebrafish for disease modeling. However, it remains a challenge to precisely edit the zebrafish genome to create single-nucleotide substitutions, which account for ~60% of human disease-causing mutations. Recently developed base editing nucleases provide an excellent alternative to CRISPR/Cas9-mediated homology-dependent repair for the generation of zebrafish with point mutations. A new set of cytosine base editors, termed BE4max and AncBE4max, demonstrated improved base editing efficiency in mammalian cells but has not been evaluated in zebrafish. Therefore, we undertook this study to evaluate their efficiency in converting C:G to T:A base pairs in zebrafish by somatic and germline analysis, using highly active sgRNAs targeting the twist and ntl genes. Our data demonstrated that this improved BE4max set of plasmids provides the desired base substitutions at similar efficiency and without any indels compared to the previously reported BE3 and Target-AID plasmids in zebrafish. Our data also showed that AncBE4max produces fewer incorrect and bystander edits, suggesting that it can be further improved by codon optimization of its components for use in zebrafish.

Formation of the Codon Degeneracy during Interdependent Development between Metabolism and Replication

Genes ◽ 10.3390/genes12122023 ◽ 2021 ◽ Vol 12 (12) ◽ pp. 2023

Author(s): Dirson Jian Li

Keyword(s): Amino Acids ◽ Genetic Code ◽ Driving Force ◽ Sequence Evolution ◽ Base Pairs ◽ Random Sequences ◽ Relative Stabilities ◽ Base Substitutions ◽ Codon Degeneracy

Nirenberg’s genetic code chart shows a profound correspondence between codons and amino acids. The aim of this article is to explain the primordial formation of the codon degeneracy. It remains a puzzle how informative molecules arose from the supposed prebiotic random sequences. If an initial driving force based on the relative stabilities of triplex base pairs is introduced, the prebiotic sequence evolution becomes innately nonrandom. On this basis, the primordial assignment of the 64 codons to the 20 amino acids is explained in detail in terms of base substitutions during the coevolution of tRNAs with aaRSs; the classification of aaRSs is explained as well.
