Phylogenetic analysis of mutational robustness based on codon usage supports that the standard genetic code does not prefer extreme environments

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ádám Radványi ◽  
Ádám Kun

Abstract: The mutational robustness of the genetic code is rarely discussed in the context of biological diversity, such as codon usage and related factors, and is often considered independent of the actual organism's proteome. Here we put living organisms back into the picture and use distortion as a metric of mutational robustness. Distortion estimates the expected severity of non-synonymous mutations, measured by amino acid physicochemical properties and weighted by codon usage. Using the biological variance of codon frequencies, we interpret the mutational robustness of the standard genetic code with regard to the corresponding environments and genomic compositions (GC-content). Employing phylogenetic analyses, we show that coding fidelity in physicochemical properties can deteriorate with codon usages adapted to extreme environments, and that these putative effects are not artefacts of phylogenetic bias. High-temperature environments select for codon usages with decreased mutational robustness of hydrophobic, volumetric, and isoelectric properties. Selection at high saline concentrations also leads to reduced fidelity in polar and isoelectric patterns. These results show that the genetic code performs best with mesophilic codon usages, strengthening the view that LUCA or its ancestors preferred lower-temperature environments. Taxonomic implications, such as rooting the tree of life, are also discussed.
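The distortion idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' formula: it assumes every single-base substitution is equally likely, uses the Kyte-Doolittle hydropathy scale as the only physicochemical property, measures severity as an absolute difference, and ignores substitutions to stop codons. The names `codon_cost` and `distortion` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index, used here as an example property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def neighbours(codon):
    """Yield the nine codons reachable by one base substitution."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def codon_cost(codon, prop):
    """Mean absolute property change over non-synonymous sense neighbours."""
    aa = CODE[codon]
    diffs = [
        abs(prop[aa] - prop[CODE[n]])
        for n in neighbours(codon)
        if CODE[n] not in ("*", aa)  # skip stop codons and synonymous changes
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def distortion(usage, prop=HYDROPATHY):
    """Codon-usage-weighted average of the per-codon costs.

    `usage` maps codons to counts or frequencies; stop codons are dropped
    and the remaining weights are normalised to sum to one.
    """
    sense = {c: w for c, w in usage.items() if CODE[c] != "*"}
    total = sum(sense.values())
    return sum(w / total * codon_cost(c, prop) for c, w in sense.items())


# Example: equal usage of all 61 sense codons.
uniform = {c: 1 for c, aa in CODE.items() if aa != "*"}
print(distortion(uniform))
```

Replacing `uniform` with codon counts from a genome gives a usage-specific value under the same assumptions, which is the kind of comparison across organisms that the abstract describes.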

Life ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 773
Author(s):  
Ádám Radványi ◽  
Ádám Kun

The genetic code evolved, to some extent, to minimize the effects of mutations. The effects of mutations depend on the amino acid repertoire, the structure of the genetic code, and the frequencies of amino acids in proteomes. The amino acid compositions of proteins and the corresponding codon usages are still under selection, which allows us to ask what kind of environment the standard genetic code is adapted to. Using simple computational models and comprehensive datasets comprising genomic and environmental data from all three domains of Life, we estimate the expected severity of non-synonymous genomic mutations in proteins, measured by the change in amino acid physicochemical properties. We show that the fidelity in these physicochemical properties is expected to deteriorate with extremophilic codon usages, especially in thermophiles. These findings suggest that the genetic code performs better under non-extremophilic conditions. This explains the low substitution rates encountered in halophiles and thermophiles, and the revealed relationship between the genetic code and habitat allows us to ponder earlier phases in the history of Life.


2000 ◽  
Vol 81 (9) ◽  
pp. 2313-2325 ◽  
Author(s):  
David B. Levin ◽  
Beatrixe Whittome

Phylogenetic analyses based on baculovirus polyhedrin nucleotide and amino acid sequences revealed two major nucleopolyhedrovirus (NPV) clades, designated Group I and Group II. Subsequent phylogenetic analyses have revealed three Group II subclades, designated A, B and C. Variations in amino acid frequencies determine the extent of dissimilarity for divergent but structurally and functionally conserved genes and therefore significantly influence the analysis of phylogenetic relationships. Hence, it is important to consider variations in amino acid codon usage. The Genome Hypothesis postulates that genes in any given genome use the same coding pattern with respect to synonymous codons and that genes in phylogenetically related species generally show the same pattern of codon usage. We have examined codon usage in six genes from six NPVs and found that: (1) there is significant variation in codon use by genes within the same virus genome; (2) there is significant variation in the codon usage of homologous genes encoded by different NPVs; (3) there is no correlation between the level of gene expression and codon bias in NPVs; (4) there is no correlation between gene length and codon bias in NPVs; and (5) that while codon use bias appears to be conserved between viruses that are closely related phylogenetically, the patterns of codon usage also appear to be a direct function of the GC-content of the virus-encoded genes.


2020 ◽  
Author(s):  
Martin Schwersensky ◽  
Marianne Rooman ◽  
Fabrizio Pucci

Abstract: The question of how natural evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability has been asked for decades without a definitive answer. We tackled this issue through a structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures. The validity of our results is supported by a very good agreement with experimental mutagenesis data. At the amino acid level, we found the protein surface to be more robust to mutations than the core, in a protein length-dependent manner. About 4% of all mutations were shown to be stabilizing, and a majority of mutations on the surface and in the core to be neutral and destabilizing, respectively. At the nucleobase level, single base substitutions were shown to yield on average less destabilizing amino acid mutations than multiple base substitutions. More precisely, the smallest average destabilization occurs for substitutions of base III in the codon, followed by base I, bases I+III, and base II. This ranking highly anticorrelates with the frequency of codon-anticodon mispairing, and suggests that the standard genetic code is optimized more to limit translation errors than the impact of random mutations. Moreover, the codon usage also appears to be optimized for minimizing the errors at the protein level, especially for surface residues that evolve faster and have therefore been under stronger selection, and for biased codons, suggesting that the codon usage bias also partly aims to optimize protein mutational robustness.


BMC Biology ◽  
2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Martin Schwersensky ◽  
Marianne Rooman ◽  
Fabrizio Pucci

Abstract

Background: How, and the extent to which, evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability is a long-standing open question in the field of molecular evolution. We addressed this issue through the first structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures, as well as through available experimental stability and fitness data.

Results: At the amino acid level, we found the protein surface to be more robust against random mutations than the core, this difference being stronger for small proteins. The destabilizing and neutral mutations are more numerous in the core and on the surface, respectively, whereas the stabilizing mutations are about 4% in both regions. At the genetic code level, we observed the smallest destabilization for mutations that are due to substitutions of base III in the codon, followed by base I, bases I+III, base II, and other multiple base substitutions. This ranking highly anticorrelates with the codon-anticodon mispairing frequency in the translation process. This suggests that the standard genetic code is optimized to limit the impact of random mutations, but even more so to limit translation errors. At the codon level, both the codon usage and the usage bias appear to optimize mutational robustness and translation accuracy, especially for surface residues.

Conclusion: Our results highlight the non-universality of mutational robustness and its multiscale dependence on protein features, the structure of the genetic code, and the codon usage. Our analyses and approach are strongly supported by available experimental mutagenesis data.
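The ranking of codon positions reported above rests on folding free energy estimates, which cannot be reproduced in a few lines. As a much cruder proxy, the sketch below counts, for each codon position, the share of single-base substitutions between sense codons that leave the amino acid unchanged under the standard genetic code. The function name `synonymous_share_by_position` is hypothetical, and treating all substitutions as equally likely is an assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_share_by_position():
    """Return [share_I, share_II, share_III] of synonymous substitutions.

    Only substitutions from a sense codon to another sense codon are
    counted; mutations to or from stop codons are ignored.
    """
    same = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == "*":
                    continue
                total[pos] += 1
                same[pos] += CODE[mutant] == aa
    return [s / t for s, t in zip(same, total)]


for label, share in zip(("I", "II", "III"), synonymous_share_by_position()):
    print(f"base {label}: {share:.3f} synonymous")
```

Under these assumptions the third base tolerates the most substitutions, the first base a few (within the leucine and arginine codon families), and the second base none, which matches the order III, I, II given in the abstract.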


2020 ◽  
Author(s):  
Stefan Wichmann ◽  
Siegfried Scherer ◽  
Zachary Ardern

Abstract: Overlapping genes (OLGs) with long protein-coding overlapping sequences are often excluded by genome annotation programs, with the exception of virus genomes. A recent study used a novel algorithm to construct OLGs from arbitrary protein domain pairs and concluded that virus genes are best suited for creating OLGs, a result which fitted with common assumptions. However, improving sequence evaluation using Hidden Markov Models shows that the previous result is an artifact originating from dataset-database biases. When parameters for OLG design and evaluation are optimized we find that 94.5% of the constructed OLG pairs score at least as highly as naturally occurring sequences, while 9.6% of the artificial OLGs cannot be distinguished from typical sequences in their protein family. Constructed OLG sequences are also indistinguishable from natural sequences in terms of amino acid identity and secondary structure, while the minimum nucleotide change required for overprinting an overlapping sequence can be as low as 1.8% of the sequence. Separate analysis of datasets containing only sequences from either archaea, bacteria, eukaryotes or viruses showed that, surprisingly, virus genes are much less suitable for designing OLGs than bacterial or eukaryotic genes. An important factor influencing OLG design is the structure of the standard genetic code. Success rates in different reading frames strongly correlate with their code-determined respective amino acid constraints. There is a tendency indicating that the structure of the standard genetic code could be optimized in its ability to create OLGs while conserving mutational robustness. The findings reported here add to the growing evidence that OLGs should no longer be excluded in prokaryotic genome annotations. Determining the factors facilitating the computational design of artificial overlapping genes may improve our understanding of the origin of these remarkable genetic constructs and may also open up exciting possibilities for synthetic biology.


2012 ◽  
Vol 12 (5) ◽  
pp. 623-632 ◽  
Author(s):  
Adam S. Lauring ◽  
Ashley Acevedo ◽  
Samantha B. Cooper ◽  
Raul Andino

Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1169
Author(s):  
Xin Li ◽  
Xiaocen Wang ◽  
Pengtao Gong ◽  
Nan Zhang ◽  
Xichen Zhang ◽  
...  

Giardia duodenalis, a flagellated parasitic protozoan, is the most common cause of parasite-induced diarrheal diseases worldwide. Codon usage bias (CUB) is an important evolutionary character in most species. However, G. duodenalis CUB remains unclear. Thus, this study analyzes codon usage patterns to assess the factors that shape and constrain G. duodenalis CUB. The neutrality analysis indicates that G. duodenalis has a wide GC3 distribution, which significantly correlates with GC12. The ENC-plot shows that most genes were close to the expected curve, with only a few points straying away from it. This indicates that mutational pressure and natural selection played an important role in the development of CUB. The Parity Rule 2 (PR2) plot demonstrates that the usage of GC and AT was out of proportion. Interestingly, we identified 26 optimal codons in the G. duodenalis genome, ending with G or C. In addition, GC content, gene expression, and protein size also influence G. duodenalis CUB formation. This study systematically analyzes the G. duodenalis codon usage pattern and clarifies the mechanisms of G. duodenalis CUB. These results will be very useful for identifying new genes, for molecular genetic manipulation, and for the study of G. duodenalis evolution.
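The two quantities compared in a neutrality analysis, GC12 and GC3, can be computed per gene as in the sketch below. It assumes an in-frame coding sequence containing only A, C, G and T, and the function name `gc12_gc3` is hypothetical. Whether the study excluded start or stop codons before counting is not stated in the abstract, so none are excluded here.

```python
def gc12_gc3(cds):
    """Return (GC12, GC3) for an in-frame coding sequence.

    GC12 is the mean GC fraction of the first and second codon positions;
    GC3 is the GC fraction of the third codon position.
    """
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")

    def gc_fraction(position):
        # Every third base, starting at the given codon position (0, 1 or 2).
        bases = seq[position::3]
        return sum(b in "GC" for b in bases) / len(bases)

    gc12 = (gc_fraction(0) + gc_fraction(1)) / 2
    return gc12, gc_fraction(2)


# Two codons, ATG and GCC: half of positions 1 and 2 are G or C,
# and both third positions are G or C.
print(gc12_gc3("ATGGCC"))  # (0.5, 1.0)
```

Plotting GC12 against GC3 across all genes and fitting a regression line is the usual next step; a slope near one is commonly read as mutational pressure dominating, and a slope near zero as selection dominating.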


Biomedicines ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 342
Author(s):  
Ahmed R. Sofy ◽  
Noha K. El-Dougdoug ◽  
Ehab E. Refaey ◽  
Rehab A. Dawoud ◽  
Ahmed A. Hmed

Klebsiella pneumoniae is a hazardous opportunistic pathogen that is involved in many serious human diseases and is considered to be an important foodborne pathogen found in many food types. Multidrug-resistant (MDR) K. pneumoniae strains have recently spread and increased, making bacteriophage therapy an effective alternative against multidrug-resistant pathogens. As a consequence, this research was conducted to describe the genome and basic biological characteristics of a novel phage capable of lysing MDR K. pneumoniae isolated from food samples in Egypt. The host range revealed that KPP-5 phage had potent lytic activity and was able to infect all selected MDR K. pneumoniae strains from different sources. Electron microscopy images showed that the lytic phage KPP-5 had a podovirus morphology. The one-step growth curve showed that KPP-5 phage had a relatively short latent period of 25 min, and the burst size was about 236 PFU per infected cell. In addition, KPP-5 phage showed high stability at different temperatures and pH levels. KPP-5 phage has a linear dsDNA genome with a length of 38,245 bp, a GC content of 50.8%, and 40 predicted open reading frames (ORFs). Comparative genomics and phylogenetic analyses showed that KPP-5 is most closely associated with the Teetrevirus genus in the Autographiviridae family. No tRNA genes have been identified in the KPP-5 phage genome. In addition, phage-borne virulence genes or drug resistance genes were not present, suggesting that KPP-5 could be used safely as a phage biocontrol agent.

