Elucidation of Codon Usage Signatures across the Domains of Life

2018 ◽  
Author(s):  
Eva Maria Novoa ◽  
Olivier Jaillon ◽  
Irwin Jungreis ◽  
Manolis Kellis

Abstract Due to the degeneracy of the genetic code, multiple codons are translated into the same amino acid. Despite being ‘synonymous’, these codons are not equally used. Selective pressures are thought to drive the choice among synonymous codons within a genome, while GC content, which is generally attributed to mutational drift, is the major determinant of interspecies codon usage bias. Here we find that in addition to the bias caused by GC content, inter-species codon usage signatures can also be detected. More specifically, we show that a single amino acid, arginine, is the major contributor to codon usage bias differences across domains of life. We then exploit this finding, and show that the identified domain-specific codon bias signatures can be used to classify a given sequence into its corresponding domain with high accuracy. Considering that species belonging to the same domain share similar tRNA decoding strategies, we then wondered whether the inclusion of codon autocorrelation patterns might improve the classification performance of our algorithm. However, we find that autocorrelation patterns are not domain-specific, and surprisingly, are unrelated to tRNA reusage, in contrast to the common belief. Instead, our results reveal that codon autocorrelation patterns are a consequence of codon optimality throughout a sequence, where highly expressed genes display autocorrelated ‘optimal’ codons, whereas lowly expressed genes display autocorrelated ‘non-optimal’ codons.

2019 ◽  
Vol 36 (10) ◽  
pp. 2328-2339 ◽  
Author(s):  
Eva Maria Novoa ◽  
Irwin Jungreis ◽  
Olivier Jaillon ◽  
Manolis Kellis

Abstract Because of the degeneracy of the genetic code, multiple codons are translated into the same amino acid. Despite being “synonymous,” these codons are not equally used. Selective pressures are thought to drive the choice among synonymous codons within a genome, while GC content, which is typically attributed to mutational drift, is the major determinant of variation across species. Here, we find that in addition to GC content, interspecies codon usage signatures can also be detected. More specifically, we show that a single amino acid, arginine, is the major contributor to codon usage bias differences across domains of life. We then exploit this finding and show that domain-specific codon bias signatures can be used to classify a given sequence into its corresponding domain of life with high accuracy. We then wondered whether the inclusion of codon autocorrelation patterns, which reflect the nonrandom distribution of codon occurrences throughout a transcript, might improve the classification performance of our algorithm. However, we find that autocorrelation patterns are not domain-specific, and surprisingly, are unrelated to tRNA reusage, in contrast to previous reports. Instead, our results suggest that codon autocorrelation patterns are a by-product of codon optimality throughout a sequence, where highly expressed genes display autocorrelated “optimal” codons, whereas lowly expressed genes display autocorrelated “nonoptimal” codons.
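
The abstract above singles out arginine codon choice as the main interspecies signal. As a minimal sketch of the kind of feature such an analysis could start from, the Python snippet below computes the relative use of the six standard arginine codons in one coding sequence. The function name `arginine_codon_fractions` and the assumption that the input is an in-frame coding sequence are illustrative only; this is not the authors' classifier or feature set.

```python
from collections import Counter

# The six arginine codons of the standard genetic code (DNA alphabet).
ARG_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def arginine_codon_fractions(cds):
    """Return the fraction of arginine positions encoded by each Arg codon.

    `cds` is assumed to be an in-frame coding sequence (hypothetical input);
    trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in ARG_CODONS)
    if total == 0:
        # No arginine in this sequence: report zeros rather than dividing by 0.
        return {c: 0.0 for c in ARG_CODONS}
    return {c: counts[c] / total for c in ARG_CODONS}


if __name__ == "__main__":
    # Toy sequence: Met, Arg(AGA), Arg(AGA), Arg(CGT), stop.
    for codon, fraction in arginine_codon_fractions("ATGAGAAGACGTTAA").items():
        print(codon, round(fraction, 3))
```

Vectors of this kind, computed per gene or per genome, could then be compared across species; how the published method weights or combines them is not stated in the abstract.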


2020 ◽  
Vol 21 (11) ◽  
Author(s):  
Redi Aditama ◽  
Zulfikar Achmad Tanjung ◽  
Widyartini Made Sudania ◽  
Yogo Adhi Nugroho ◽  
Condro Utomo ◽  
...  

Abstract. Aditama R, Tanjung ZA, Sudania WM, Nugroho YA, Utomo C, Liwang T. 2020. Analysis of codon usage bias reveals optimal codons in Elaeis guineensis. Biodiversitas 21: 5331-5337. Codon usage bias of the oil palm genome was characterized using several indices, including GC content, relative synonymous codon usage (RSCU), the effective number of codons (ENC), and codon adaptation index (CAI). A unimodal distribution of GC content was observed, matching the characteristics of non-grass monocots. Correspondence analysis (COA) on synonymous codon usage bias showed that the main axis was strongly driven by GC content. The ENC and neutrality plots of oil palm genes indicated that natural selection played a more vital role than mutational bias in shaping codon usage bias. A positive correlation between calculated CAI and experimental data of oil palm gene expression was detected, indicating the good predictive ability of this index. Finally, eighteen codons were defined as “optimal codons” that may provide a useful reference for heterogeneous expression and genome editing studies.
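
Several of the abstracts listed here rely on relative synonymous codon usage (RSCU). As a rough illustration, the sketch below uses the commonly cited definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code and an in-frame coding sequence. The helper names `count_codons` and `rscu` are hypothetical, and the papers may have used different software or conventions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codons(cds):
    """Count in-frame codons, skipping any triplet with ambiguous bases."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(cds):
    """Return {codon: RSCU} for amino acids with more than one codon.

    RSCU = observed count * (number of synonymous codons) / (family total).
    Families that never occur in the sequence are omitted.
    """
    counts = count_codons(cds)
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue  # unused family, or Met/Trp with a single codon
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Toy example: four leucine codons, three CTG and one CTA.
    result = rscu("CTGCTGCTGCTA")
    print(result["CTG"], result["CTA"], result["TTA"])
```

An RSCU above 1 means a codon is used more often than expected under equal usage, and a value below 1 means it is used less often.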


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Hoda Mirsafian ◽  
Adiratna Mat Ripen ◽  
Aarti Singh ◽  
Phaik Hwan Teo ◽  
Amir Feisal Merican ◽  
...  

Synonymous codon usage bias is an inevitable phenomenon in organismic taxa across the three domains of life. Though the frequency of codon usage is not equal across species and within a genome in the same species, the phenomenon is nonrandom and is tissue-specific. Several factors such as GC content, nucleotide distribution, protein hydropathy, protein secondary structure, and translational selection are reported to contribute to codon usage preference. The synonymous codon usage patterns can be helpful in revealing the expression pattern of genes as well as the evolutionary relationship between the sequences. In this study, synonymous codon usage bias patterns were determined for the evolutionarily close proteins of the albumin superfamily, namely, albumin, α-fetoprotein, afamin, and vitamin D-binding protein. Our study demonstrated that the genes of the four albumin superfamily members have low GC content and high values of effective number of codons (ENC), suggesting high expressivity of these genes and less bias in codon usage preferences. This study also provided evidence that the albumin superfamily members are not subjected to mutational selection pressure.


Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1169
Author(s):  
Xin Li ◽  
Xiaocen Wang ◽  
Pengtao Gong ◽  
Nan Zhang ◽  
Xichen Zhang ◽  
...  

Giardia duodenalis, a flagellated parasitic protozoan, is the most common cause of parasite-induced diarrheal diseases worldwide. Codon usage bias (CUB) is an important evolutionary character in most species. However, G. duodenalis CUB remains unclear. Thus, this study analyzes codon usage patterns to assess the restriction factors and obtain useful information on what shapes G. duodenalis CUB. The neutrality analysis indicates that G. duodenalis has a wide GC3 distribution, which significantly correlates with GC12. The ENC-plot shows that most genes lie close to the expected curve, with only a few points straying from it. This indicates that mutational pressure and natural selection played an important role in the development of CUB. The Parity Rule 2 (PR2) plot demonstrates that the usage of GC and AT was out of proportion. Interestingly, we identified 26 optimal codons in the G. duodenalis genome, all ending with G or C. In addition, GC content, gene expression, and protein size also influence G. duodenalis CUB formation. This study systematically analyzes the G. duodenalis codon usage pattern and clarifies the mechanisms of G. duodenalis CUB. These results will be very useful for identifying new genes, for molecular genetic manipulation, and for studying G. duodenalis evolution.
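
The neutrality analysis mentioned above compares GC content at the first two codon positions (GC12) with GC content at the third position (GC3) across genes. The following is a minimal sketch of that comparison, assuming in-frame coding sequences and an ordinary least-squares fit of GC12 on GC3. The function names are hypothetical and the study's exact procedure (for example, whether GC3 or GC3s was used) is not given in the abstract.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) for one in-frame coding sequence.

    GC12 is the mean GC fraction over codon positions 1 and 2;
    GC3 is the GC fraction at codon position 3.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence shorter than one codon")
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    return (gc[0] + gc[1]) / (2 * n_codons), gc[2] / n_codons


def neutrality_slope(genes):
    """Least-squares slope of GC12 (y axis) against GC3 (x axis)."""
    points = [gc_by_position(g) for g in genes]
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 does not vary across the genes supplied")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    toy_genes = ["ATAATA", "ATGGCC", "GCGGCG"]  # made-up sequences
    print(neutrality_slope(toy_genes))
```

In the usual reading of such plots, a slope close to 1 is taken to mean that mutational pressure dominates, while a slope close to 0 points to selective constraint on the first two positions. The toy sequences here are invented for illustration and carry no biological meaning.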


2021 ◽  
Author(s):  
Neetu Tyagi ◽  
Rahila Sardar ◽  
Dinesh Gupta

Abstract The Coronavirus disease 2019 (COVID-19) outbreak caused by Severe Acute Respiratory Syndrome Coronavirus 2 virus (SARS-CoV-2) poses a worldwide human health crisis, causing respiratory illness with a high mortality rate. To investigate the factors governing codon usage bias in respiratory viruses, codon usage bias (CUBs) analysis was performed on all the respiratory viruses, including SARS-CoV-2 isolates from different geographical locations (~62K) and two recently emerged strains, VUI202012/01 from the United Kingdom (UK) and 501.Y.V2 from South Africa (SA). The analysis includes RSCU analysis, GC content calculation, ENC analysis, dinucleotide frequency and neutrality plot analysis. We were motivated to conduct the study to fulfil two primary aims: first, to identify the difference in codon usage bias amongst all SARS-CoV-2 genomes and, secondly, to compare their CUBs properties with other respiratory viruses. A biased nucleotide composition was found, as most of the highly preferred codons were A/U-ending in all the respiratory viruses studied here. Compared with the human host, the RSCU analysis led to the identification of 11 over-represented codons and 9 under-represented codons in SARS-CoV-2 genomes. Correlation analysis of ENC and GC3s revealed that mutational pressure is the leading force determining the CUBs. The present study results yield a better understanding of codon usage preferences for SARS-CoV-2 genomes and identify the possible evolutionary determinants responsible for the biases found among the respiratory viruses, thus unveiling a unique feature of SARS-CoV-2 evolution and adaptation. To the best of our knowledge, this is the first attempt at comparative CUBs analysis on the worldwide genomes of SARS-CoV-2, including novel emerged strains and other respiratory viruses.


mBio ◽  
2014 ◽  
Vol 5 (2) ◽  
Author(s):  
Wenqi Ran ◽  
David M. Kristensen ◽  
Eugene V. Koonin

ABSTRACT The relationship between the selection affecting codon usage and selection on protein sequences of orthologous genes in diverse groups of bacteria and archaea was examined by using the Alignable Tight Genome Clusters database of prokaryote genomes. The codon usage bias is generally low, with 57.5% of the gene-specific optimal codon frequencies (F_opt) being below 0.55. This apparent weak selection on codon usage contrasts with the strong purifying selection on amino acid sequences, with 65.8% of the gene-specific dN/dS ratios being below 0.1. For most of the genomes compared, a limited but statistically significant negative correlation between F_opt and dN/dS was observed, which is indicative of a link between selection on protein sequence and selection on codon usage. The strength of the coupling between the protein level selection and codon usage bias showed a strong positive correlation with the genomic GC content. Combined with previous observations on the selection for GC-rich codons in bacteria and archaea with GC-rich genomes, these findings suggest that selection for translational fine-tuning could be an important factor in microbial evolution that drives the evolution of genome GC content away from mutational equilibrium. This type of selection is particularly pronounced in slowly evolving, “high-status” genes. A significantly stronger link between the two aspects of selection is observed in free-living bacteria than in parasitic bacteria and in genes encoding metabolic enzymes and transporters than in informational genes. These differences might reflect the special importance of translational fine-tuning for the adaptability of gene expression to environmental changes. The results of this work establish the coupling between protein level selection and selection for translational optimization as a distinct and potentially important factor in microbial evolution.
IMPORTANCE Selection affects the evolution of microbial genomes at many levels, including both the structure of proteins and the regulation of their production. Here we demonstrate the coupling between the selection on protein sequences and the optimization of codon usage in a broad range of bacteria and archaea. The strength of this coupling varies over a wide range and strongly and positively correlates with the genomic GC content. The cause(s) of the evolution of high GC content is a long-standing open question, given the universal mutational bias toward AT. We propose that optimization of codon usage could be one of the key factors that determine the evolution of GC-rich genomes. This work establishes the coupling between selection at the level of protein sequence and at the level of codon choice optimization as a distinct aspect of genome evolution.


2018 ◽  
Vol 115 (21) ◽  
pp. E4940-E4949 ◽  
Author(s):  
Idan Frumkin ◽  
Marc J. Lajoie ◽  
Christopher J. Gregg ◽  
Gil Hornung ◽  
George M. Church ◽  
...  

Although the genetic code is redundant, synonymous codons for the same amino acid are not used with equal frequencies in genomes, a phenomenon termed “codon usage bias.” Previous studies have demonstrated that synonymous changes in a coding sequence can exert significant cis effects on the gene’s expression level. However, whether the codon composition of a gene can also affect the translation efficiency of other genes has not been thoroughly explored. To study how codon usage bias influences the cellular economy of translation, we massively converted abundant codons to their rare synonymous counterpart in several highly expressed genes in Escherichia coli. This perturbation reduces both the cellular fitness and the translation efficiency of genes that have high initiation rates and are naturally enriched with the manipulated codon, in agreement with theoretical predictions. Interestingly, we could alleviate the observed phenotypes by increasing the supply of the tRNA for the highly demanded codon, thus demonstrating that the codon usage of highly expressed genes was selected in evolution to maintain the efficiency of global protein translation.


Author(s):  
Boyun Yang ◽  
Huolin Luo ◽  
Yuan Tao ◽  
Wenjing Yu ◽  
Liping Luo

Cymbidium kanran is an important commercially grown member of the Chinese orchid family. However, little information regarding the molecular biology of this species is available. In this study, the C. kanran root, shoot, stem, leaf, and flower transcriptomes were sequenced with the Illumina HiSeq 4000 system, which resulted in 8.9 Gb of clean reads that were assembled into 74,620 unigenes, with an average length and N50 of 983 bp and 1,640 bp, respectively. The screening of seven databases (NR, NT, GO, KOG, KEGG, Swiss-Prot, and InterPro) for similar sequences resulted in the functional annotation of 49,813 unigenes. Additionally, 173 MADS-box genes, which help to control major aspects of plant development, were identified and their codon usage bias was analyzed. Only 26 genes had a low ENC (less than or equal to 35), suggesting the codon usage bias was weak. Base mutations were the major determinants of codon usage, although natural selection pressure also influenced codon usage bias. Moreover, 22 optimal codons were identified based on ΔRSCU, and 20 codons ended with A/U. The results of this study provide the foundation for the molecular breeding of new varieties.


Genomics ◽  
2020 ◽  
Vol 112 (6) ◽  
pp. 4657-4665
Author(s):  
Zhiyi Ge ◽  
Xuerui Li ◽  
Xiaoan Cao ◽  
Rui Wang ◽  
Wen Hu ◽  
...  
