Faculty Opinions recommendation of Mutation bias within oncogene families is related to proliferation-specific codon usage.

Author(s):  
Chava Kimchi-Sarfaty ◽  
Douglas Meyer ◽  
Upendra Katneni
2021 ◽  
Author(s):  
Alexander L Cope ◽  
Premal Shah

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.


2017 ◽  
Author(s):  
Prashant Mainali ◽  
Sobita Pathak

Codon usage bias is the preferential use of a subset of synonymous codons during translation. In this paper, normalized entropy and GC content are compared between the coding regions of Escherichia coli K-12 and the noncoding regions (ncRNA, rRNA) of various organisms to shed light on the origin of codon usage bias. The normalized entropy of the coding regions was found to be significantly higher than that of the noncoding regions, suggesting a role for the translation process in shaping codon usage bias. Further, when the position-specific GC content of both coding and noncoding regions was analyzed, the GC2 content in coding regions was lower than GC1 and GC3, while in noncoding regions the GC1, GC2, and GC3 contents were approximately equal. This discrepancy is explained by biased mutation coupled with the presence or absence of selection pressure. GC content accumulates in the sequences due to mutation bias in DNA repair and recombination processes. In noncoding regions, mutation is harmful and thus selected against, whereas in coding regions, owing to the degeneracy of codons, a mutation at GC3 is neutral and hence not selected against. Thus, GC content accumulates in coding regions, and codon usage bias arises.
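The position-specific GC comparison described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `gc_by_codon_position` and the toy sequence are assumptions introduced here, and the sketch assumes the sequence is already in reading frame.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the G+C fraction at each codon position.

    Assumes `cds` is in frame; a trailing partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases))
    return tuple(fractions)


# Hypothetical toy sequence: ATG GCC GCG TAA
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGCGTAA")
print(gc1, gc2, gc3)
```

For a real comparison, the same function would be applied to each coding sequence and the three values averaged per genome before being set against the noncoding sequences.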


Author(s):  
Brian R. Morton

Two competing proposals about the degree to which selection affects codon usage of angiosperm chloroplast genes are examined. The first, based on observations that codon usage does not match expectations under the naïve assumption that base composition will be identical at all neutral sites, is that selection plays a significant role. The second is that codon usage is determined almost solely by mutation bias and drift, with selection influencing only one or two highly expressed genes, in particular psbA. First it is shown that, as a result of an influence of neighboring base composition on mutation dynamics, compositional biases are expected to be widely divergent at different sites in the absence of selection. The observed mutation properties are then used to predict expected neutral codon usage biases and to show that observed deviations from the naïve expectations are in fact expected given the context-dependent mutational dynamics. It is also shown that there is a match between the observed and expected codon usage when context effects are taken into consideration, with psbA being a notable exception. Overall, the data support the model that selection is not a widespread factor affecting the codon usage of angiosperm chloroplast genes and highlight the need to have an accurate model of mutational dynamics.


DNA Research ◽  
2011 ◽  
Vol 18 (6) ◽  
pp. 499-512 ◽  
Author(s):  
Y. Rao ◽  
G. Wu ◽  
Z. Wang ◽  
X. Chai ◽  
Q. Nie ◽  
...  

2018 ◽  
Author(s):  
Alexander L. Cope ◽  
Robert L. Hettich ◽  
Michael A. Gilchrist

The Sec secretion pathway is found across all domains of life. A critical feature of Sec secreted proteins is the signal peptide, a short peptide with distinct physicochemical properties located at the N-terminus of the protein. Previous work indicates signal peptides are biased towards translationally inefficient codons, which is hypothesized to be an adaptation driven by selection to improve the efficacy and efficiency of the protein secretion mechanisms. We investigate codon usage in the signal peptides of E. coli using the Codon Adaptation Index (CAI), the tRNA Adaptation Index (tAI), and the ribosomal overhead cost formulation of the stochastic evolutionary model of protein production rates (ROC-SEMPPR). Comparisons between signal peptides and the 5’-ends of cytoplasmic proteins using CAI and tAI are consistent with a preference for inefficient codons in signal peptides. Simulations reveal these differences are due to amino acid usage and gene expression; we find these differences disappear when accounting for both factors. In contrast, ROC-SEMPPR, a mechanistic population genetics model capable of separating the effects of selection and mutation bias, shows codon usage bias (CUB) of the signal peptides is indistinguishable from the 5’-ends of cytoplasmic proteins. Additionally, we find CUB at the 5’-ends is weaker than in later segments of the gene. Results illustrate the value in using models grounded in population genetics to interpret genetic data. We show that failure to account for mutation bias and the effects of gene expression on the efficacy of selection against translation inefficiency can lead to a misinterpretation of codon usage patterns.
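For readers unfamiliar with CAI, the index can be sketched as the geometric mean of per-codon weights, where each weight is a codon's count in a reference gene set divided by the count of the most used synonymous codon. The sketch below is illustrative only and is not the pipeline used in the study: the helper names, the toy reference set, and the 0.5 pseudocount for unseen codons are assumptions made here.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split an in-frame sequence into codons, ignoring a trailing partial codon."""
    seq = cds.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds):
    """Per-codon weights: count / count of the most used synonymous codon."""
    counts = Counter(c for cds in reference_cds for c in codons(cds) if c in CODON_TABLE)
    weights = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "MW*":  # single-codon amino acids and stops carry no information
            continue
        best = max(counts[c] for c, a in CODON_TABLE.items() if a == aa)
        if best > 0:
            # Assumed pseudocount of 0.5 so unseen codons do not force CAI to zero.
            weights[codon] = max(counts[codon], 0.5) / best
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of the scored codons in `cds`."""
    logs = [math.log(weights[c]) for c in codons(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference set: alanine encoded by GCT three times and GCC once.
w = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", w), cai("GCTGCC", w))
```

Because the weights depend only on counts in the reference set, an index of this kind cannot by itself distinguish a codon that is common through mutation bias from one that is common through selection, which is the limitation the abstract points to.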


10.29007/87r9 ◽  
2020 ◽  
Author(s):  
Zhixiu Lu ◽  
Michael Gilchrist ◽  
Scott Emrich

Codon usage bias has been known to reflect the expression level of a protein-coding gene under the evolutionary theory that selection favors certain synonymous codons. Although measuring the effect of selection in simple organisms such as yeast and E. coli has proven to be effective and accurate, codon-based methods perform less well in plants and humans. In this paper, we extend a prior method that incorporates another evolutionary factor, namely mutation bias and its effect on codon usage. Our results indicate that prediction of gene expression is significantly improved under our framework, and suggest that quantification of mutation bias is essential for fully understanding synonymous codon usage. We also propose an improved method, namely MLE-Φ, with much greater computational efficiency and a wider range of applications. An implementation of this method is provided at https://github.com/luzhixiu1996/MLE-Phi.


2019 ◽  
Author(s):  
Cedric Landerer ◽  
Brian C. O’Meara ◽  
Russell Zaretzki ◽  
Michael A. Gilchrist

For decades, codon usage has been used as a measure of adaptation for translational efficiency and translation accuracy of a gene’s coding sequence. These patterns of codon usage reflect both the selective and mutational environment in which the coding sequences evolved. Over this same period, gene transfer between lineages has become widely recognized as an important biological phenomenon. Nevertheless, most studies of codon usage implicitly assume that all genes within a genome evolved under the same selective and mutational environment, an assumption violated when introgression occurs. In order to better understand the effects of introgression on codon usage patterns and vice versa, we examine the patterns of codon usage in Lachancea kluyveri, a yeast which has experienced a large introgression. We quantify the effects of mutation bias and selection for translation efficiency on the codon usage pattern of the endogenous and introgressed exogenous genes using a Bayesian mixture model, ROC SEMPPR, which is built on mechanistic assumptions about protein synthesis and grounded in population genetics. We find substantial differences in codon usage between the endogenous and exogenous genes, and show that these differences can be largely attributed to differences in mutation bias favoring A/T ending codons in the endogenous genes while favoring C/G ending codons in the exogenous genes. Recognizing the two different signatures of mutation bias and selection improves our ability to predict protein synthesis rate by 42% and allows us to accurately assess the decaying signal of endogenous codon mutation and preferences.
In addition, using our estimates of mutation bias and selection, we identify Eremothecium gossypii as the closest relative to the exogenous genes, providing an alternative hypothesis about the origin of the exogenous genes, estimate that the introgression occurred ∼ 6 × 10^8 generations ago, and estimate its historic and current selection against mismatched codon usage. Our work illustrates how mechanistic, population genetic models like ROC SEMPPR can separate the effects of mutation and selection on codon usage and provide quantitative estimates from sequence data.


2014 ◽  
Author(s):  
Michael Gilchrist ◽  
Wei-Chen Chen ◽  
Premal Shah ◽  
Cedric L. Landerer ◽  
Russell Zaretzki

Extracting biologically meaningful information from the continuing flood of genomic data is a major challenge in the life sciences. Codon usage bias (CUB) is a general feature of most genomes and is thought to reflect the effects of both natural selection for efficient translation and mutation bias. Here we present a mechanistically interpretable, Bayesian model (ROC SEMPPR) to extract biologically meaningful information from patterns of CUB within a genome. ROC SEMPPR is grounded in population genetics and allows us to separate the contributions of mutational biases and natural selection against translational inefficiency on a gene by gene and codon by codon basis. Until now, the primary disadvantage of similar approaches was the need for genome scale measurements of gene expression. Here we demonstrate that it is possible to extract accurate estimates of codon specific mutation biases and translational efficiencies while simultaneously generating accurate estimates of gene expression, rather than requiring such information. We demonstrate the utility of ROC SEMPPR using the Saccharomyces cerevisiae S288c genome. When we compare our model fits with previous approaches we observe an exceptionally high agreement between estimates of both codon specific parameters and gene expression levels (ρ > 0.99 in all cases). We also observe strong agreement between our parameter estimates and those derived from alternative datasets. For example, our estimates of mutation bias and those from mutational accumulation experiments are highly correlated (ρ = 0.95). Our estimates of codon specific translational inefficiencies and tRNA copy number based estimates of ribosome pausing time (ρ = 0.64), and mRNA and ribosome profiling footprint based estimates of gene expression (ρ = 0.53-0.74) are also highly correlated, thus supporting the hypothesis that selection against translational inefficiency is an important force driving the evolution of CUB.
Surprisingly, we find that for particular amino acids, codon usage in highly expressed genes can still be largely driven by mutation bias and that failing to take mutation bias into account can lead to the misidentification of an amino acid's 'optimal' codon. In conclusion, our method demonstrates that an enormous amount of biologically important information is encoded within genome scale patterns of codon usage; accessing this information requires not gene expression measurements but carefully formulated, biologically interpretable models.
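The separation of mutation bias from selection described here can be illustrated with a small sketch. It assumes the multinomial-logit form commonly given for models of this kind, in which the probability of synonymous codon i in a gene with expression level φ is proportional to exp(-ΔM_i - Δη_i φ), with ΔM_i a mutation bias term and Δη_i a selection term, both measured relative to a reference codon. The function name and the parameter values below are hypothetical and are not taken from the paper.

```python
import math


def codon_probabilities(delta_m, delta_eta, phi):
    """Expected synonymous codon frequencies for one amino acid.

    Assumed form: p_i proportional to exp(-delta_m[i] - delta_eta[i] * phi).
    delta_m   : mutation bias terms (reference codon = 0)
    delta_eta : selection terms (reference codon = 0)
    phi       : expression level (protein synthesis rate) of the gene
    """
    if len(delta_m) != len(delta_eta):
        raise ValueError("delta_m and delta_eta must have the same length")
    scores = [-(dm + de * phi) for dm, de in zip(delta_m, delta_eta)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-codon amino acid: codon 0 is the reference; codon 1 is
# favored by mutation (delta_m < 0) but disfavored by selection (delta_eta > 0).
delta_m = [0.0, -1.0]
delta_eta = [0.0, 0.5]

low = codon_probabilities(delta_m, delta_eta, phi=0.0)    # mutation dominates
high = codon_probabilities(delta_m, delta_eta, phi=10.0)  # selection dominates
print(low, high)
```

Under these assumed values the mutationally favored codon is the more common one in a lowly expressed gene and the rarer one in a highly expressed gene, which is why ignoring mutation bias can misidentify the 'optimal' codon.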

