Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon.

Abstract Synonymous codons translate into chemically identical amino acids. Although synonymous codon choice was once considered inconsequential to the formation of the protein product, there is now significant evidence that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge the inclusion of exact genetic information in structural databases.


Codon usage of highly expressed genes affects proteome-wide translation efficiency

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1719375115 ◽

2018 ◽

Vol 115 (21) ◽

pp. E4940-E4949 ◽

Cited By ~ 57

Author(s):

Idan Frumkin ◽

Marc J. Lajoie ◽

Christopher J. Gregg ◽

Gil Hornung ◽

George M. Church ◽

...

Keyword(s):

Escherichia Coli ◽

Amino Acid ◽

Codon Usage ◽

Codon Usage Bias ◽

Protein Translation ◽

Translation Efficiency ◽

Synonymous Codons ◽

Codon Composition ◽

Theoretical Predictions ◽

Highly Expressed Genes

Although the genetic code is redundant, synonymous codons for the same amino acid are not used with equal frequencies in genomes, a phenomenon termed “codon usage bias.” Previous studies have demonstrated that synonymous changes in a coding sequence can exert significant cis effects on the gene’s expression level. However, whether the codon composition of a gene can also affect the translation efficiency of other genes has not been thoroughly explored. To study how codon usage bias influences the cellular economy of translation, we massively converted abundant codons to their rare synonymous counterpart in several highly expressed genes in Escherichia coli. This perturbation reduces both the cellular fitness and the translation efficiency of genes that have high initiation rates and are naturally enriched with the manipulated codon, in agreement with theoretical predictions. Interestingly, we could alleviate the observed phenotypes by increasing the supply of the tRNA for the highly demanded codon, thus demonstrating that the codon usage of highly expressed genes was selected in evolution to maintain the efficiency of global protein translation.


A global perspective of codon usage

10.1101/076679 ◽

2016 ◽

Author(s):

Bohdan B. Khomtchouk ◽

Claes Wahlestedt ◽

Wolfgang Nonner

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Codon Usage ◽

Genetic Code ◽

Common Ancestor ◽

Tree Of Life ◽

Last Universal Common Ancestor ◽

Synonymous Codons ◽

Universal Common Ancestor ◽

Evolutionary Progression

Codon usage in 2730 genomes is analyzed for evolutionary patterns in the usage of synonymous codons and amino acids across prokaryotic and eukaryotic taxa. We group genomes together that have similar amounts of intra-genomic bias in their codon usage, and then compare how usage of particular different codons is diversified across each genome group, and how that usage varies from group to group. Inter-genomic diversity of codon usage increases with intra-genomic usage bias, following a universal pattern. The frequencies of the different codons vary in robust mutual correlation, and the implied synonymous codon and amino acid usages drift together. This kind of correlation indicates that the variation of codon usage across organisms is chiefly a consequence of lateral DNA transfer among diverse organisms. The group of genomes with the greatest intra-genomic bias comprises two distinct subgroups, with each one restricting its codon usage to essentially one unique half of the genetic code table. These organisms include eubacteria and archaea thought to be closest to the hypothesized last universal common ancestor (LUCA). Their codon usages imply genetic diversity near the hypothesized base of the tree of life. There is a continuous evolutionary progression across taxa from the two extremely diversified usages toward balanced usage of different codons (as approached, e.g. in mammals). In that progression, codon frequency variations are correlated as expected from a blending of the two extreme codon usages seen in prokaryotes.
Author Summary: The redundancy intrinsic to the genetic code allows different amino acids to be encoded by up to six synonymous codons. Genomes of different organisms prefer different synonymous codons, a phenomenon known as ‘codon usage bias.’ The phenomenon of codon usage bias is of fundamental interest for evolutionary biology, and is important in a variety of applied settings (e.g., transgene expression). The spectrum of codon usage biases seen in current organisms is commonly thought to have arisen by the combined actions of mutations and selective pressures. This view focuses on codon usage in specific genomes and the consequences of that usage for protein expression.
Here we investigate an unresolved question of molecular genetics: are there global rules governing the usage of synonymous codons made by genomic DNA across organisms? To answer this question, we employed a data-driven approach to surveying 2730 species from all kingdoms of the ‘tree of life’ in order to classify their codon usage. A first major result was that the large majority of these organisms use codons rather uniformly on the genome-wide scale, without giving preference to particular codons among possible synonymous alternatives. A second major result was that two compartments of codon usage seem to co-exist and to be expressed in different proportions by different organisms. As such, we investigate how individual different codons are used in different organisms from all taxa. Whereas codon usage is generally believed to be the evolutionary result of both mutations and natural selection, our results suggest a different perspective: the usage of different codons (and amino acids) by different organisms follows a superposition of two distinct patterns of usage. One distinction locates to the third base pair of all different codons, which in one pattern is U or A, and in the other pattern is G or C.
This result has two major implications: (1) the variation of codon usage as seen across different organisms is best accounted for by lateral gene transfer among diverse organisms; (2) the organisms that are by protein homology grouped near the base of the ‘tree of life’ comprise two genetically distinct lineages.
We find that, over evolutionary time, codon usages have converged from two distinct, non-overlapping usages (e.g., as evident in bacteria and archaea) to a near-uniform, balanced usage of synonymous codons (e.g., in mammals). This shows that the variations of codon (and amino acid) biases reveal a distinct evolutionary progression. We also find that codon usage in bacteria and archaea is most diverse between organisms thought to be closest to the hypothesized last universal common ancestor (LUCA). The dichotomy in codon (and amino acid) usages present near the origin of the current ‘tree of life’ might provide information about the evolutionary development of the genetic code.


Codon Usage Patterns in Corynebacterium glutamicum: Mutational Bias, Natural Selection and Amino Acid Conservation

Comparative and Functional Genomics ◽

10.1155/2010/343569 ◽

2010 ◽

Vol 2010 ◽

pp. 1-7 ◽

Cited By ~ 8

Author(s):

Guiming Liu ◽

Jinyu Wu ◽

Huanming Yang ◽

Qiyu Bao

Keyword(s):

Amino Acid ◽

Codon Usage ◽

Mutational Bias ◽

Multivariate Statistical ◽

Synonymous Codons ◽

Usage Patterns ◽

Highly Expressed Genes ◽

Leading Strand ◽

Synonymous And Nonsynonymous Substitutions ◽

Amino Acid Conservation

The alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, have been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C and/or G ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference in the two sets of genes was that the C ending codons predominate in putatively highly expressed genes, suggesting that the C ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes have a tendency to locate on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes by synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors such as the length of genes and hydrophobicity that might influence codon usage and found their contributions to be weak.


Codon harmonization reduces amino acid misincorporation in bacterially expressed P. falciparum proteins and improves their immunogenicity

AMB Express ◽

10.1186/s13568-019-0890-6 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Neeraja Punde ◽

Jennifer Kooken ◽

Dagmar Leary ◽

Patricia M. Legler ◽

Evelina Angov

Keyword(s):

Protein Structure ◽

Amino Acid ◽

Codon Usage ◽

Dna Sequences ◽

Structural Integrity ◽

Host Cells ◽

Loss Of Function ◽

Species Specific ◽

And Function ◽

The Impact

Abstract Codon usage frequency influences protein structure and function. The frequency with which codons are used potentially impacts primary, secondary and tertiary protein structure. Poor expression, loss of function, insolubility, or truncation can result from species-specific differences in codon usage. “Codon harmonization” more closely aligns native codon usage frequencies with those of the expression host particularly within putative inter-domain segments where slower rates of translation may play a role in protein folding. Heterologous expression of Plasmodium falciparum genes in Escherichia coli has been a challenge due to their AT-rich codon bias and the highly repetitive DNA sequences. Here, codon harmonization was applied to the malarial antigen, CelTOS (Cell-traversal protein for ookinetes and sporozoites). CelTOS is a highly conserved P. falciparum protein involved in cellular traversal through mosquito and vertebrate host cells. It reversibly refolds after thermal denaturation making it a desirable malarial vaccine candidate. Protein expressed in E. coli from a codon harmonized sequence of P. falciparum CelTOS (CH-PfCelTOS) was compared with protein expressed from the native codon sequence (N-PfCelTOS) to assess the impact of codon usage on protein expression levels, solubility, yield, stability, structural integrity, recognition with CelTOS-specific mAbs and immunogenicity in mice. While the translated proteins were expected to be identical, the translated products produced from the codon-harmonized sequence differed in helical content and showed a smaller distribution of polypeptides in mass spectra indicating lower heterogeneity of the codon harmonized version and fewer amino acid misincorporations. Substitutions of hydrophobic-to-hydrophobic amino acids were observed more commonly than any other. CH-PfCelTOS induced significantly higher antibody levels compared with N-PfCelTOS; however, no significant differences in either IFN-γ or IL-4 cellular responses were detected between the two antigens.


How Optimized Is the Translational Machinery in Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae?

Genetics ◽

10.1093/genetics/149.1.37 ◽

1998 ◽

Vol 149 (1) ◽

pp. 37-44 ◽

Cited By ~ 1

Author(s):

Xuhua Xia

Keyword(s):

Escherichia Coli ◽

Saccharomyces Cerevisiae ◽

Amino Acid ◽

Salmonella Typhimurium ◽

Codon Usage ◽

Translational Efficiency ◽

Square Root ◽

Translational Machinery ◽

Mutual Adaptation ◽

Trna Species

Abstract The optimization of the translational machinery in cells requires the mutual adaptation of codon usage and tRNA concentration, and the adaptation of tRNA concentration to amino acid usage. Two predictions were derived based on a simple deterministic model of translation which assumes that elongation of the peptide chain is rate-limiting. The highest translational efficiency is achieved when the codon recognized by the most abundant tRNA reaches the maximum frequency. For each codon family, the tRNA concentration is optimally adapted to codon usage when the concentration of different tRNA species matches the square-root of the frequency of their corresponding synonymous codons. When tRNA concentration and codon usage are well adapted to each other, the optimal content of all tRNA species carrying the same amino acid should match the square-root of the frequency of the amino acid. These predictions are examined against empirical data from Escherichia coli, Salmonella typhimurium, and Saccharomyces cerevisiae.
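The square-root prediction stated in this abstract can be illustrated with a minimal sketch. It assumes that "matches the square-root" means "is proportional to the square root" within one synonymous codon family; the function name and the codon frequencies below are hypothetical and are not taken from the paper.

```python
import math


def sqrt_rule_trna_shares(codon_freqs):
    """Return tRNA shares proportional to sqrt(codon frequency).

    codon_freqs maps each codon of one synonymous family to its
    relative frequency (illustrative values, not empirical data).
    """
    roots = {codon: math.sqrt(freq) for codon, freq in codon_freqs.items()}
    total = sum(roots.values())
    if total == 0:
        raise ValueError("at least one codon frequency must be positive")
    # Normalise so the shares within the family sum to 1.
    return {codon: root / total for codon, root in roots.items()}


# Hypothetical two-codon family used 80% / 20% of the time.
shares = sqrt_rule_trna_shares({"GAA": 0.8, "GAG": 0.2})
print(shares)
```

Under this reading, a 4:1 ratio in codon usage corresponds to only a 2:1 ratio in the predicted tRNA shares, because the square root compresses the difference.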


Disruptive mutations in TANC2 define a neurodevelopmental syndrome associated with psychiatric disorders

Nature Communications ◽

10.1038/s41467-019-12435-8 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 5

Author(s):

Hui Guo ◽

Elisa Bettella ◽

Paul C. Marcogliese ◽

Rongjuan Zhao ◽

...

Keyword(s):

Psychiatric Disorders ◽

Glial Cells ◽

Behavioral Problems ◽

Motor Development ◽

Statistical Significance ◽

Behavioral Outcomes ◽

Protein Product ◽

Facial Dysmorphism ◽

Variable Degree ◽

Restricted Pattern

Abstract Postsynaptic density (PSD) proteins have been implicated in the pathophysiology of neurodevelopmental and psychiatric disorders. Here, we present detailed clinical and genetic data for 20 patients with likely gene-disrupting mutations in TANC2—whose protein product interacts with multiple PSD proteins. Pediatric patients with disruptive mutations present with autism, intellectual disability, and delayed language and motor development. In addition to a variable degree of epilepsy and facial dysmorphism, we observe a pattern of more complex psychiatric dysfunction or behavioral problems in adult probands or carrier parents. Although this observation requires replication to establish statistical significance, it also suggests that mutations in this gene are associated with a variety of neuropsychiatric disorders consistent with its postsynaptic function. We find that TANC2 is expressed broadly in the human developing brain, especially in excitatory neurons and glial cells, but shows a more restricted pattern in Drosophila glial cells where its disruption affects behavioral outcomes.


The effect of expression levels on codon usage in Plasmodium falciparum

Parasitology ◽

10.1017/s0031182003004517 ◽

2004 ◽

Vol 128 (3) ◽

pp. 245-251 ◽

Cited By ~ 26

Author(s):

L. PEIXOTO ◽

V. FERNÁNDEZ ◽

H. MUSTO

Keyword(s):

Amino Acids ◽

Plasmodium Falciparum ◽

Natural Selection ◽

Codon Usage ◽

Complete Sequence ◽

Expression Data ◽

Expression Levels ◽

Synonymous Codons ◽

Translational Selection ◽

Highly Expressed Genes

The usage of alternative synonymous codons in the completely sequenced, extremely A+T-rich parasite Plasmodium falciparum was studied. Confirming previous studies obtained with less than 3% of the total genes recently described, we found that A- and U-ending triplets predominate but translational selection increases the frequency of a subset of codons in highly expressed genes. However, some new results come from the analysis of the complete sequence. First, there is more variation in GC3 than previously described; second, the effect of natural selection acting at the level of translation has been analysed with real expression data at 4 different stages and third, we found that highly expressed proteins increment the frequency of energetically less expensive amino acids. The implications of these results are discussed.
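GC3, mentioned in this abstract, is commonly taken to be the fraction of codons whose third position is G or C. The sketch below uses that simple definition as an assumption; some analyses also exclude stop codons or codons without synonyms, which is not done here. The function name and example sequence are hypothetical.

```python
def gc3(cds):
    """Fraction of complete codons in a coding sequence that end in G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Codons ATG, AAA, TTC have third bases G, A, C.
example_gc3 = gc3("ATGAAATTC")
print(example_gc3)
```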


Glycosylation of prions and its effects on protein conformation relevant to amino acid mutations

Journal of Molecular Graphics and Modelling ◽

10.1016/s1093-3263(00)00044-9 ◽

2000 ◽

Vol 18 (2) ◽

pp. 126-134 ◽

Cited By ~ 13

Author(s):

Nicky K.C Wong ◽

David V Renouf ◽

Sylvain Lehmann ◽

Elizabeth F Hounsell

Keyword(s):

Amino Acid ◽

Protein Conformation ◽

Amino Acid Mutations


Codon usage bias creates a ramp of hydrogen bonding at the 5′-end in prokaryotic ORFeomes

10.1101/811612 ◽

2019 ◽

Author(s):

Juan C. Villada ◽

Maria F. Duran ◽

Patrick K. H. Lee

Keyword(s):

Hydrogen Bonding ◽

Codon Usage ◽

Codon Usage Bias ◽

Translation Efficiency ◽

Molecular Processes ◽

Molecular Feature ◽

Web Based ◽

Synonymous Codons ◽

Double Stranded Dna ◽

Codon Positions

Codon usage bias exerts control over a wide variety of molecular processes. The positioning of synonymous codons within coding sequences (CDSs) dictates protein expression by mechanisms such as local translation efficiency, mRNA Gibbs free energy, and protein co-translational folding. In this work, we explore how codon variants affect the position-dependent content of hydrogen bonding, which in turn influences energy requirements for unwinding double-stranded DNA. By analyzing over 14,000 bacterial, archaeal, and fungal ORFeomes, we found that Bacteria and Archaea exhibit an exponential ramp of hydrogen bonding at the 5′-end of CDSs, while a similar ramp was not found in Fungi. The ramp develops within the first 20 codon positions in prokaryotes, eventually reaching a steady carrying capacity of hydrogen bonding that does not differ from Fungi. Selection against uniformity tests proved that selection acts against synonymous codons with high content of hydrogen bonding at the 5′-end of prokaryotic ORFeomes. Overall, this study provides novel insights into the molecular feature of hydrogen bonding that is governed by the genetic code at the 5′-end of CDSs. A web-based application to analyze the position-dependent hydrogen bonding of ORFeomes has been developed and is publicly available (https://juanvillada.shinyapps.io/hbonds/).
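A position-dependent hydrogen-bonding profile of the kind described here can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes the standard Watson-Crick counts of two hydrogen bonds per A-T pair and three per G-C pair, and averages the per-codon total over a set of coding sequences for the first 20 codon positions. All function names and sequences are hypothetical.

```python
# Hydrogen bonds formed by each base with its Watson-Crick partner.
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def codon_hbonds(codon):
    """Total hydrogen bonds of one codon in double-stranded DNA."""
    return sum(HBONDS[base] for base in codon.upper())


def positional_hbond_profile(cds_list, n_codons=20):
    """Mean hydrogen bonds per codon position over many CDSs.

    Positions not covered by any sequence are reported as None.
    """
    totals = [0] * n_codons
    counts = [0] * n_codons
    for cds in cds_list:
        for i in range(min(n_codons, len(cds) // 3)):
            totals[i] += codon_hbonds(cds[3 * i:3 * i + 3])
            counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# Two toy sequences: ATG-GCC and ATG-AAA.
profile = positional_hbond_profile(["ATGGCC", "ATGAAA"], n_codons=2)
print(profile)
```

Plotting such a profile against codon position is one way to look for a 5′-end ramp like the one the abstract reports for prokaryotes.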


Analysis of computational codon usage models and their association with translationally slow codons

10.1101/2020.03.26.010488 ◽

2020 ◽

Author(s):

Gabriel Wright ◽

Anabel Rodriguez ◽

Jun Li ◽

Patricia L. Clark ◽

Tijana Milenković ◽

...

Keyword(s):

Codon Usage ◽

Computational Models ◽

Selective Pressure ◽

Synonymous Codon ◽

Ground Truth ◽

Protein Translation ◽

Weak Correlation ◽

Experimental Conditions ◽

Synonymous Codons ◽

Genome Wide

Abstract Improved computational modeling of protein translation rates, including better prediction of where translational slowdowns along an mRNA sequence may occur, is critical for understanding co-translational folding. Because codons within a synonymous codon group are translated at different rates, many computational translation models rely on analyzing synonymous codons. Some models rely on genome-wide codon usage bias (CUB), believing that globally rare and common codons are the most informative of slow and fast translation, respectively. Others use the CUB observed only in highly expressed genes, which should be under selective pressure to be translated efficiently (and whose CUB may therefore be more indicative of translation rates). No prior work has analyzed these models for their ability to predict translational slowdowns. Here, we evaluate five models for their association with slowly translated positions as denoted by two independent ribosome footprint (RFP) count experiments from S. cerevisiae, because RFP data is often considered as a “ground truth” for translation rates across mRNA sequences. We show that all five considered models strongly associate with the RFP data and therefore have potential for estimating translational slowdowns. However, we also show that there is a weak correlation between RFP counts for the same genes originating from independent experiments, even when their experimental conditions are similar. This raises concerns about the efficacy of using current RFP experimental data for estimating translation rates and highlights a potential advantage of using computational models to understand translation rates instead.
