Codon Usage Bias Covaries With Expression Breadth and the Rate of Synonymous Evolution in Humans, but This Is Not Evidence for Selection

Araxi O Urrutia; Laurence D Hurst

doi:10.1093/genetics/159.3.1191

Genetics ◽

10.1093/genetics/159.3.1191 ◽

2001 ◽

Vol 159 (3) ◽

pp. 1191-1199

Author(s):

Araxi O Urrutia ◽

Laurence D Hurst

Keyword(s):

Codon Usage ◽

Codon Bias ◽

Synonymous Codon ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Synonymous Substitutions ◽

Numerous Species ◽

Nucleotide Content ◽

Expression Breadth ◽

Human Genes

Abstract In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.

Effect of genome composition and codon bias on infectious bronchitis virus evolution and adaptation to target tissues

BMC Genomics ◽

10.1186/s12864-021-07559-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Giovanni Franzo ◽

Claudia Maria Tucciarone ◽

Matteo Legnardi ◽

Mattia Cecchinato

Keyword(s):

Codon Usage ◽

Codon Bias ◽

Synonymous Codon ◽

Synonymous Codon Usage ◽

Accessory Proteins ◽

Effective Number ◽

Genome Composition ◽

Infectious Bronchitis ◽

Selective Forces ◽

Effective Number Of Codons

Abstract Background Infectious bronchitis virus (IBV) is one of the most relevant viruses affecting the poultry industry, and several studies have investigated the factors involved in its biological cycle and evolution. However, very few of those studies focused on the effect of genome composition and the codon bias of different IBV proteins, despite the remarkable increase in available complete genomes. In the present study, all IBV complete genomes were downloaded (n = 383), and several statistics representative of genome composition and codon bias were calculated for each protein-coding sequence, including but not limited to, the nucleotide odds ratio, relative synonymous codon usage and effective number of codons. Additionally, viral codon usage was compared to host codon usage based on a collection of highly expressed genes in IBV target and nontarget tissues. Results The results obtained demonstrated a significant difference among structural, non-structural and accessory proteins, especially regarding dinucleotide composition, which appears under strong selective forces. In particular, some dinucleotide pairs, such as CpG, a probable target of the host innate immune response, are underrepresented in genes coding for pp1a, pp1ab, S and N. Although genome composition and dinucleotide bias appear to affect codon usage, additional selective forces may act directly on codon bias. Variability in relative synonymous codon usage and effective number of codons was found for different proteins, with structural proteins and polyproteins being more adapted to the codon bias of host target tissues. In contrast, accessory proteins had a more biased codon usage (i.e., lower number of preferred codons), which might contribute to the regulation of their expression level and timing throughout the cell cycle. Conclusions The present study confirms the existence of selective forces acting directly on the genome and not only indirectly through phenotype selection. This evidence might help in understanding IBV biology and in developing attenuated strains without affecting the protein phenotype and therefore immunogenicity.
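The abstract above relies on relative synonymous codon usage (RSCU) without defining it. As a rough illustration only, and not the authors' pipeline, the following Python sketch computes RSCU for a single coding sequence under the standard genetic code: each codon's count is divided by the count it would have if all synonymous codons for that amino acid were used equally. The function name `rscu` and the handling of incomplete or ambiguous codons (they are ignored) are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for one in-frame coding sequence (illustrative helper)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stops, single-codon amino acids (Met, Trp), and unobserved families.
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        for c in family:
            # Observed count relative to the count expected under equal usage.
            values[c] = counts[c] * len(family) / total
    return values
```

A value of 1 means a codon is used as often as expected under equal usage; values above 1 indicate over-representation within its synonymous family.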

Intragenomic variation in mutation biases causes underestimation of selection on synonymous codon usage

10.1101/2021.10.29.466462 ◽

2021 ◽

Author(s):

Alexander L Cope ◽

Premal Shah

Keyword(s):

Population Genetics ◽

Natural Selection ◽

Codon Usage ◽

Codon Bias ◽

Synonymous Codon ◽

Synonymous Codon Usage ◽

Mutation Bias ◽

Biased Gene Conversion ◽

Intragenomic Variation ◽

The Impact

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely-related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.

Host Adaptation of Codon Usage in SARS-CoV-2 From Mammals Indicate Natural Selection

10.21203/rs.3.rs-1125942/v1 ◽

2021 ◽

Author(s):

Yanan Fu ◽

Yanping Huang ◽

Jingjing Rao ◽

Feng Zeng ◽

Ruiping Yang ◽

...

Keyword(s):

Natural Selection ◽

Codon Usage ◽

Binding Affinity ◽

Codon Bias ◽

Synonymous Codon ◽

Host Adaptation ◽

Synonymous Codon Usage ◽

Mutation Pressure ◽

Usage Analysis ◽

Human Receptor

Abstract The outbreak of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections, spread across hosts from humans to animals, transmitting particularly effectively in mink. How SARS-CoV-2 evolves under selection in the host, and how this evolution differs among animals, is still unclear. This study analyzes the mutation and codon usage bias of SARS-CoV-2 in infected humans and animals. The SARS-CoV-2 sequence in mink (Mink-SARS2) and its binding energy with the receptor were calculated and compared with those in humans. The relative synonymous codon usage of the virally encoded genes was analyzed to characterize the differences and the evolutionary characteristics. A synonymous codon usage analysis showed that SARS-CoV-2 is optimized to adapt in the animals in which it is currently reported, and all of the animals showed decreased adaptability relative to that of humans, except for mink. The neutrality plot showed that the effect of natural selection on different SARS-CoV-2 sequences is stronger than mutation pressure. A binding affinity analysis indicated that the spike protein of the SARS-CoV-2 variant in mink showed a greater preference for binding with the mink receptor ACE2 than with the human receptor, especially as the mutations Y453F and N501T in Mink-SARS2 lead to improved binding affinity for the mink receptor. In summary, mutations Y453F and N501T in Mink-SARS2 lead to improved binding affinity with the mink receptor, indicating possible natural selection and current host adaptation. Monitoring the variation and codon bias of SARS-CoV-2 provides a theoretical basis for tracing the epidemic, evolution and cross-species spread of SARS-CoV-2.
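The abstract above cites a neutrality plot without describing it. In the usual convention, GC content at the first two codon positions (GC12) is regressed on GC content at the third position (GC3) across genes; a slope near 1 is read as mutation pressure dominating, and a slope near 0 as a stronger role for selection. The Python sketch below illustrates that calculation under simplifying assumptions: all codons are counted (some analyses exclude Met, Trp, and stop codons from GC3), and the function names `gc_by_position` and `neutrality_slope` are hypothetical, not taken from the paper.

```python
def gc_by_position(seq):
    """Return (GC3, GC12) for one in-frame coding sequence (illustrative helper)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence has no complete codon")
    gc = [0, 0, 0]  # G/C counts at codon positions 1, 2, 3
    for i in range(usable):
        if seq[i] in "GC":
            gc[i % 3] += 1
    gc12 = (gc[0] + gc[1]) / (2 * n_codons)
    gc3 = gc[2] / n_codons
    return gc3, gc12


def neutrality_slope(points):
    """Ordinary least-squares slope of GC12 on GC3; points are (gc3, gc12) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("GC3 does not vary across genes")
    return sxy / sxx
```

In use, one would call `gc_by_position` on each gene and pass the resulting list of pairs to `neutrality_slope`.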

Synonymous Codon Usage Analysis of Thirty Two Mycobacteriophage Genomes

Advances in Bioinformatics ◽

10.1155/2009/316936 ◽

2009 ◽

Vol 2009 ◽

pp. 1-11 ◽

Cited By ~ 15

Author(s):

Sameer Hassan ◽

Vasantha Mahalingam ◽

Vanaja Kumar

Keyword(s):

Codon Usage ◽

Synonymous Codon ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Compositional Bias ◽

Trna Genes ◽

Translation Efficiency ◽

Multivariate Statistical ◽

Strong Negative Correlation ◽

Highly Expressed Genes

Synonymous codon usage of protein coding genes of thirty two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage is identified to be compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C-ending codons are strongly preferred over G-ending codons. A strong negative correlation between effective number of codons (Nc) and GC3s content was also observed, showing that the codon usage was affected by gene nucleotide composition. Translational selection is also identified as playing a role in shaping codon usage, operating at the level of translational accuracy. A high level of heterogeneity is seen among and between the genomes. Length of genes is also identified to influence the codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper is identified to be the highly biased genome with better translation efficiency, comparing well with the host-specific tRNA genes.
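GC3s, used in the correlation above, is commonly defined as the G+C fraction at the third position of synonymous codons, that is, excluding Met (ATG), Trp (TGG), and the three stop codons. The Python sketch below follows that common definition for a single sequence; it is an illustration under that assumption, not the software used in the study, and the function name `gc3s` is chosen for this example.

```python
# Codons with no synonymous alternative (Met, Trp) plus the stop codons.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(seq):
    """G+C fraction at third positions of synonymous codons (illustrative helper)."""
    seq = seq.upper().replace("U", "T")
    usable_len = len(seq) - len(seq) % 3  # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable_len, 3)]
    # Keep unambiguous codons that have at least one synonymous alternative.
    usable = [c for c in codons
              if c not in NON_SYNONYMOUS and set(c) <= set("ACGT")]
    if not usable:
        return float("nan")
    return sum(c[2] in "GC" for c in usable) / len(usable)
```

Computing this value per gene alongside Nc is what allows the two to be correlated across a genome.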

Nucleotide composition and synonymous codon usage of open reading frames in Norovirus GII.4 variants

Journal of Biomolecular Structure and Dynamics ◽

10.1080/07391102.2019.1689171 ◽

2019 ◽

Vol 38 (16) ◽

pp. 4764-4773

Author(s):

Wei Dan ◽

Yan Jin ◽

Zizhong Tang ◽

Yongmin Li ◽

Huipeng Yao

Keyword(s):

Codon Usage ◽

Synonymous Codon ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Open Reading Frames ◽

Norovirus Gii ◽

Reading Frames

CUBAP: an interactive web portal for analyzing codon usage biases across populations

Nucleic Acids Research ◽

10.1093/nar/gkaa863 ◽

2020 ◽

Vol 48 (19) ◽

pp. 11030-11039

Author(s):

Matthew W Hodgman ◽

Justin B Miller ◽

Taylor E Meurs ◽

John S K Kauwe

Keyword(s):

Codon Usage ◽

Synonymous Codon ◽

Association Studies ◽

East Asian ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Genome Wide Association Studies ◽

Genome Wide ◽

African Populations ◽

Place Of Origin

Abstract Synonymous codon usage significantly impacts translational and transcriptional efficiency, gene expression, the secondary structure of both mRNA and proteins, and has been implicated in various diseases. However, population-specific differences in codon usage biases remain largely unexplored. Here, we present a web server, https://cubap.byu.edu, to facilitate analyses of codon usage biases across populations (CUBAP). Using the 1000 Genomes Project, we calculated and visually depict population-specific differences in codon frequencies, codon aversion, identical codon pairing, co-tRNA codon pairing, ramp sequences, and nucleotide composition in 17,634 genes. We found that codon pairing significantly differs between populations in 35.8% of genes, allowing us to successfully predict the place of origin for African and East Asian individuals with 98.8% and 100% accuracy, respectively. We also used CUBAP to identify a significant bias toward decreased CTG pairing in the immunity related GTPase M (IRGM) gene in East Asian and African populations, which may contribute to the decreased association of rs10065172 with Crohn's disease in those populations. CUBAP facilitates in-depth gene-specific and codon-specific visualization that will aid in analyzing candidate genes identified in genome-wide association studies, identifying functional implications of synonymous variants, predicting population-specific impacts of synonymous variants and categorizing genetic biases unique to certain populations.

The Effects of the Context-Dependent Codon Usage Bias on the Structure of the nsp1α of Porcine Reproductive and Respiratory Syndrome Virus

BioMed Research International ◽

10.1155/2014/765320 ◽

2014 ◽

Vol 2014 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Yao-zhong Ding ◽

Ya-nan You ◽

Dong-jie Sun ◽

Hao-tai Chen ◽

Yong-lu Wang ◽

...

Keyword(s):

Codon Usage ◽

Codon Bias ◽

Structural Information ◽

Synonymous Codon ◽

Data Bank ◽

Synonymous Codon Usage ◽

Respiratory Syndrome Virus ◽

Translation Speed ◽

Context Dependent ◽

Insight Into

The information about the crystal structure of porcine reproductive and respiratory syndrome virus (PRRSV) leader protease nsp1α is available to analyze the roles of tRNA abundance of pigs and codon usage of the nsp1α gene in the formation of this protease. The effects of tRNA abundance of the pigs and the synonymous codon usage and the context-dependent codon bias (CDCB) of the nsp1α on shaping the specific folding units (α-helix, β-strand, and the coil) in the nsp1α were analyzed based on the structural information about this protease from protein data bank (PDB: 3IFU) and the nsp1α of the 191 PRRSV strains. By mapping the overall tRNA abundance along the nsp1α, we found that there is no link between the fluctuation of the overall tRNA abundance and the specific folding units in the nsp1α, and the low translation speed of ribosome caused by the tRNA abundance exists in the nsp1α. The strong correlation between some synonymous codon usage and the specific folding units in the nsp1α was found, and the phenomenon of CDCB exists in the specific folding units of the nsp1α. These findings provide an insight into the roles of the synonymous codon usage and CDCB in the formation of PRRSV nsp1α structure.

Recombination, meiotic expression and human codon usage

eLife ◽

10.7554/elife.27344 ◽

2017 ◽

Vol 6 ◽

Cited By ~ 23

Author(s):

Fanny Pouyet ◽

Dominique Mouchiroud ◽

Laurent Duret ◽

Marie Sémon

Keyword(s):

Codon Usage ◽

Large Scale ◽

Synonymous Codon ◽

Gc Content ◽

Synonymous Codon Usage ◽

Translation Efficiency ◽

Functional Categories ◽

Human Genes ◽

Biased Gene Conversion ◽

Mammalian Genomes

Synonymous codon usage (SCU) varies widely among human genes. In particular, genes involved in different functional categories display a distinct codon usage, which was interpreted as evidence that SCU is adaptively constrained to optimize translation efficiency in distinct cellular states. We demonstrate here that SCU is not driven by constraints on tRNA abundance, but by large-scale variation in GC-content, caused by meiotic recombination, via the non-adaptive process of GC-biased gene conversion (gBGC). Expression in meiotic cells is associated with a strong decrease in recombination within genes. Differences in SCU among functional categories reflect differences in levels of meiotic transcription, which is linked to variation in recombination and therefore in gBGC. Overall, the gBGC model explains 70% of the variance in SCU among genes. We argue that the strong heterogeneity of SCU induced by gBGC in mammalian genomes precludes any optimization of the tRNA pool to the demand in codon usage.

Codon Usage Bias in Autophagy-Related Gene 13 in Eukaryotes: Uncovering the Genetic Divergence by the Interplay Between Nucleotides and Codon Usages

Frontiers in Cellular and Infection Microbiology ◽

10.3389/fcimb.2021.771010 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yicong Li ◽

Rui Wang ◽

Huihui Wang ◽

Feiyang Pu ◽

Xili Feng ◽

...

Keyword(s):

Amino Acid ◽

Codon Usage ◽

Codon Usage Bias ◽

Essential Gene ◽

Synonymous Codon ◽

Phylogenetic Analyses ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Related Gene ◽

Codon Positions

Synonymous codon usage bias is a universal characteristic of genomes across various organisms. Autophagy-related gene 13 (atg13) is one essential gene for autophagy initiation, yet the evolutionary trends of the atg13 gene at the usages of nucleotide and synonymous codon remain unexplored. According to phylogenetic analyses for the atg13 gene of 226 eukaryotic organisms at the nucleotide and amino acid levels, it is clear that their nucleotide usages exhibit more genetic information than their amino acid usages. Specifically, the overall nucleotide usage bias quantified by information entropy reflected that the usage biases at the first and second codon positions were stronger than those at the third position of the atg13 genes. Furthermore, the bias level of nucleotide ‘G’ usage is highest, while that of nucleotide ‘C’ usage is lowest in the atg13 genes. On top of that, genetic features represented by synonymous codon usage exhibit a species-specific pattern on the evolution of the atg13 genes to some extent. Interestingly, the codon usages of atg13 genes in the ancestor animals (Latimeria chalumnae, Petromyzon marinus, and Rhinatrema bivittatum) are strongly influenced by mutation pressure from nucleotide composition constraint. However, the distributions of nucleotide composition at different codon positions in the atg13 gene display that natural selection still dominates atg13 codon usages during organisms’ evolution.

Entropy and codon bias in HIV-1

10.1101/052274 ◽

2016 ◽

Author(s):

Aakash Pandey

Keyword(s):

Codon Usage ◽

Codon Usage Bias ◽

Codon Bias ◽

Nucleotide Composition ◽

Threshold Values ◽

Minimum Entropy ◽

Maximum Information ◽

Human Genes ◽

Efficient Expression ◽

Hiv 1

Abstract For the heterologous gene expression systems, the codon bias has to be optimized according to the host for efficient expression. Although DNA viruses show a correlation on codon bias with their hosts, HIV genes show low correlation for both nucleotide composition and codon usage bias with their human host, which limits the efficient expression of HIV genes. Despite this variation, HIV is efficient at infecting hosts and multiplying in large numbers. In this study, first, the degree of codon adaptation is calculated as codon adaptation index (CAI) and compared with the expected threshold value (eCAI) determined from the sequences with the same nucleotide composition as that of the HIV-1 genome. Then, information theoretic analysis of nine genes of HIV-1 based on codon statistics of the HIV-1 genome, individual genes and codon usage of human genes is done. Comparison of codon adaptation indices with their respective threshold values shows that the CAI lies very close to the threshold values. Despite not being well adapted to the codon usage bias of human hosts, it was found that the Shannon entropies of the nine genes based on overall codon statistics of HIV-1 genome are very similar to the entropies calculated from codon usage of human genes. Similarly, for the HIV-1 genome sequence analyzed, the codon statistics of the third reading frame has the highest bias representing minimum entropy and hence the maximum information.
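The abstract does not state how its entropies are computed. As one plausible reading, offered only as an assumption, the Python sketch below computes the plain Shannon entropy (in bits) of the empirical codon distribution of a sequence read in a chosen frame. Lower entropy corresponds to more uneven, that is more biased, codon usage. The function name `codon_entropy` and the `frame` parameter are introduced for this illustration and do not come from the paper.

```python
import math
from collections import Counter


def codon_entropy(seq, frame=0):
    """Shannon entropy (bits) of codon frequencies in one reading frame (illustrative)."""
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = -sum(p * log2(p)) over the observed codons.
    return -sum((k / total) * math.log2(k / total) for k in counts.values())
```

Calling the function with `frame` set to 0, 1, and 2 gives one entropy per reading frame, which is the kind of comparison the last sentence of the abstract describes.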
