codon model
Recently Published Documents



2021 ◽  
Vol 118 (20) ◽  
pp. e2023575118
Author(s):  
Shakibur Rahman ◽  
Sergei L. Kosakovsky Pond ◽  
Andrew Webb ◽  
Jody Hey

Synonymous codon substitutions are not always selectively neutral as revealed by several types of analyses, including studies of codon usage patterns among genes. We analyzed codon usage in 13 bacterial genomes sampled from across a large order of bacteria, Enterobacterales, and identified presumptively neutral and selected classes of synonymous substitutions. To estimate substitution rates, given a neutral/selected classification of synonymous substitutions, we developed a flexible dN/dS substitution model that allows multiple classes of synonymous substitutions. Under this multiclass synonymous substitution (MSS) model, the denominator of dN/dS includes only the strictly neutral class of synonymous substitutions. On average, the value of dN/dS under the MSS model was 80% of that under the standard codon model in which all synonymous substitutions are assumed to be neutral. The indication is that conventional dN/dS analyses overestimate these values and thus overestimate the frequency of positive diversifying selection and underestimate the strength of purifying selection. To quantify the strength of selection necessary to explain this reduction, we developed a model of selected compensatory codon substitutions. The reduction in synonymous substitution rate, and thus the contribution that selection makes to codon bias variation among genes, can be adequately explained by very weak selection, with a mean product of population size and selection coefficient, Ns=0.8.
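The effect of the denominator choice described above can be illustrated with a little arithmetic. The sketch below is not the authors' MSS implementation. It assumes, purely for illustration, that the pooled synonymous rate is a weighted average of a neutral class and a slower, selected class, and every number in it is hypothetical rather than taken from the paper.

```python
def dn_ds_standard(dn, ds_neutral, ds_selected, frac_selected):
    """dN/dS with all synonymous substitutions pooled in the denominator.

    Assumption (for illustration only): the pooled synonymous rate is a
    weighted average of the neutral and selected classes.
    """
    ds_pooled = (1.0 - frac_selected) * ds_neutral + frac_selected * ds_selected
    return dn / ds_pooled


def dn_ds_neutral_only(dn, ds_neutral):
    """dN/dS with only the strictly neutral synonymous class in the denominator."""
    return dn / ds_neutral


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
dn = 0.10            # nonsynonymous rate
ds_neutral = 1.00    # rate of the neutral synonymous class
ds_selected = 0.50   # rate of a selected (slower) synonymous class
frac_selected = 0.4  # share of synonymous changes in the selected class

standard = dn_ds_standard(dn, ds_neutral, ds_selected, frac_selected)
neutral_only = dn_ds_neutral_only(dn, ds_neutral)
ratio = neutral_only / standard
```

Whenever the selected class evolves more slowly than the neutral class, the pooled denominator is smaller than the neutral-only one, so the standard ratio comes out larger.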


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Jessica A. Rodrigues ◽  
Richard V. Espley ◽  
Andrew C. Allan

Abstract MYB transcription factors regulate diverse aspects of plant development and secondary metabolism, often by partnering in transcriptional regulatory complexes. Here, we harness genomic resources to identify novel MYBs, thereby producing an updated eudicot MYB phylogeny with revised relationships among subgroups as well as new information on sequence variation in the disordered C-terminus of anthocyanin-activating MYBs. BLAST® and hidden Markov model scans of gene annotations identified a total of 714 MYB transcription factors across the genomes of four crops that span the eudicots: apple, grape, kiwifruit and tomato. Codon model-based phylogenetic inference identified novel members of previously defined subgroups, and the function of specific anthocyanin-activating subgroup 6 members was assayed transiently in tobacco leaves. Sequence conservation within subgroup 6 highlighted one previously described and two novel short linear motifs in the disordered C-terminal region. The novel motifs have a mix of hydrophobic and acidic residues and are predicted to be relatively ordered compared with flanking protein sequences. Comparison of motifs with the Eukaryotic Linear Motif database suggests roles in protein–protein interaction. Engineering of motifs and their flanking regions from strong anthocyanin activators into weak activators, and vice versa, affected function. We conclude that, although the MYB C-terminal sequence diverges greatly even within MYB clades, variation within the C-terminus at and near relatively ordered regions offers opportunities for exploring MYB function and developing superior alleles for plant breeding.


2020 ◽  
Author(s):  
Keren Halabi ◽  
Eli Levy Karin ◽  
Laurent Guéguen ◽  
Itay Mayrose

Abstract Detecting the signature of selection in coding sequences and associating it with shifts in phenotypic states can unveil genes underlying complex traits. Of the various signatures of selection exhibited at the molecular level, changes in the pattern of selection at protein coding genes have been of main interest. To this end, phylogenetic branch-site codon models are routinely applied to detect changes in selective patterns along specific branches of the phylogeny. Many of these methods rely on a pre-specified partition of the phylogeny to branch categories, thus treating the course of trait evolution as fully resolved and assuming that phenotypic transitions have occurred only at speciation events. Here we present TraitRELAX, a new phylogenetic model that alleviates these strong assumptions by explicitly accounting for the uncertainty in the evolution of both trait and coding sequences. This joint statistical framework enables the detection of changes in selection intensity upon repeated trait transitions. We evaluated the performance of TraitRELAX using simulations and then applied it to two case studies. Using TraitRELAX, we found an intensification of selection in the primate SEMG2 gene in polygynandrous species compared to species of other mating forms, as well as changes in the intensity of purifying selection operating on sixteen bacterial genes upon transitioning from a free-living to an endosymbiotic lifestyle.


2020 ◽  
Vol 38 (1) ◽  
pp. 181-191
Author(s):  
Zhengting Zou ◽  
Jianzhi Zhang

Abstract It has been suggested that, due to the structure of the genetic code, nonsynonymous transitions are less likely than transversions to cause radical changes in amino acid physicochemical properties and so are on average less deleterious. This view was supported by some but not all mutagenesis experiments. Because laboratory measures of fitness effects have limited sensitivities and relative frequencies of different mutations in mutagenesis studies may not match those in nature, we here revisit this issue using comparative genomics. We extend the standard codon model of sequence evolution by adding the parameter η that quantifies the ratio of the fixation probability of transitional nonsynonymous mutations to that of transversional nonsynonymous mutations. We then estimate η from the concatenated alignment of all protein-coding DNA sequences of two closely related genomes. Surprisingly, η ranges from 0.13 to 2.0 across 90 species pairs sampled from the tree of life, with 51 incidences of η < 1 and 30 incidences of η > 1 that are statistically significant. Hence, whether nonsynonymous transversions are overall more deleterious than nonsynonymous transitions is species-dependent. Because the corresponding groups of amino acid replacements differ between nonsynonymous transitions and transversions, η is influenced by the relative exchangeabilities of amino acid pairs. Indeed, an extensive search reveals that the large variation in η is primarily explainable by the recently reported among-species disparity in amino acid exchangeabilities. These findings demonstrate that genome-wide nucleotide substitution patterns in coding sequences have species-specific features and are more variable among evolutionary lineages than are currently thought.
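The parameter η contrasts two kinds of codon change, so it may help to see how a single-nucleotide change is sorted into those categories. The following is a minimal sketch that assumes the standard genetic code. The names `CODON_TABLE` and `classify` are hypothetical and not taken from the paper, and the sketch only labels changes; it does not estimate η.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def classify(codon_a, codon_b):
    """Classify a single-nucleotide change between two sense codons.

    Returns (effect, kind), where effect is "synonymous" or "nonsynonymous"
    and kind is "transition" or "transversion". Returns None when the codons
    differ at zero or several positions, or when either is a stop codon.
    """
    diffs = [(x, y) for x, y in zip(codon_a, codon_b) if x != y]
    if len(diffs) != 1:
        return None
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    if "*" in (aa_a, aa_b):
        return None
    x, y = diffs[0]
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    kind = "transition" if (x in PURINES) == (y in PURINES) else "transversion"
    effect = "synonymous" if aa_a == aa_b else "nonsynonymous"
    return effect, kind
```

For instance, `classify("GAT", "GGT")` (Asp to Gly via A to G) falls in the nonsynonymous transition category, whereas `classify("AAA", "AAC")` (Lys to Asn via A to C) is a nonsynonymous transversion.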


2020 ◽  
Vol 70 (1) ◽  
pp. 21-32
Author(s):  
Claudia C Weber ◽  
Umberto Perron ◽  
Dearbhaile Casey ◽  
Ziheng Yang ◽  
Nick Goldman

Abstract How can we best learn the history of a protein’s evolution? Ideally, a model of sequence evolution should capture both the process that generates genetic variation and the functional constraints determining which changes are fixed. However, in practical terms the most suitable approach may simply be the one that combines the convenience of easily available input data with the ability to return useful parameter estimates. For example, we might be interested in a measure of the strength of selection (typically obtained using a codon model) or an ancestral structure (obtained using structural modeling based on inferred amino acid sequence and side chain configuration). But what if data in the relevant state-space are not readily available? We show that it is possible to obtain accurate estimates of the outputs of interest using an established method for handling missing data. Encoding observed characters in an alignment as ambiguous representations of characters in a larger state-space allows the application of models with the desired features to data that lack the resolution that is normally required. This strategy is viable because the evolutionary path taken through the observed space contains information about states that were likely visited in the “unseen” state-space. To illustrate this, we consider two examples with amino acid sequences as input. We show that ω, a parameter describing the relative strength of selection on nonsynonymous and synonymous changes, can be estimated in an unbiased manner using an adapted version of a standard 61-state codon model. Using simulated and empirical data, we find that ancestral amino acid side chain configuration can be inferred by applying a 55-state empirical model to 20-state amino acid data. Where feasible, combining inputs from both ambiguity-coded and fully resolved data improves accuracy. Adding structural information to as few as 12.5% of the sequences in an amino acid alignment results in remarkable ancestral reconstruction performance compared to a benchmark that considers the full rotamer state information. These examples show that our methods permit the recovery of evolutionary information from sequences where it has previously been inaccessible. [Ancestral reconstruction; natural selection; protein structure; state-spaces; substitution models.]
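The ambiguity-coding idea in this abstract can be sketched for the codon case as follows: an observed amino acid is treated as "any of the codons that encode it" in the 61-state space. This is an illustrative sketch assuming the standard genetic code, not the authors' software. The names `SENSE_CODONS` and `tip_vector` are hypothetical, and treating `X` or `-` as fully missing is an added assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The 61 sense codons form the state space of a standard codon model.
SENSE_CODONS = [c for c, aa in CODON_TO_AA.items() if aa != "*"]


def tip_vector(amino_acid):
    """Return a length-61 indicator vector for an observed amino acid.

    Entry k is 1.0 if sense codon k encodes the amino acid, else 0.0, which
    is the usual way to present an ambiguous observation at a tree tip.
    Assumption: "X" and "-" are treated as fully missing (all ones).
    """
    if amino_acid in ("X", "-"):
        return [1.0] * len(SENSE_CODONS)
    vec = [1.0 if CODON_TO_AA[c] == amino_acid else 0.0 for c in SENSE_CODONS]
    if not any(vec):
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return vec
```

Methionine is compatible with a single codon (`ATG`), while leucine is compatible with six, so the amount of information a tip contributes about the underlying codon depends on the amino acid observed.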


2020 ◽  
Author(s):  
Keren Halabi ◽  
Eli Levy Karin ◽  
Laurent Guéguen ◽  
Itay Mayrose

Abstract Changes in complex phenotypes, such as pathogenicity levels, trophic lifestyle, and habitat shifts, are brought on by multiple genomic changes: sub- and neofunctionalization, loss of function, and levels of gene expression. Thus, detecting the signature of selection in coding sequences and associating it with shifts in phenotypic state can unveil the genes underlying complex traits. Phylogenetic branch-site codon models are routinely applied to detect changes in selective patterns along specific branches of the phylogeny. These methods rely on a pre-specified partition of the phylogeny to branch categories, thus treating the course of trait evolution as fully resolved and assuming that transitions in phenotypic states have occurred only at speciation events. Here we present TraitRELAX, a new phylogenetic model that alleviates these strong assumptions by explicitly accounting for the uncertainty in the evolution of both trait and coding sequences. This joint statistical framework enables the detection of changes in selection intensity upon repeated trait transitions. We evaluated the performance of TraitRELAX using simulations and then applied it to two case studies. Using TraitRELAX, we found an intensification of selection in the SEMG2 gene in polygynandrous species of primates compared to species of other mating forms, as well as changes in the intensity of purifying selection operating on sixteen bacterial genes upon transitioning from free-living to an endosymbiotic lifestyle.


2019 ◽  
Author(s):  
Claudia C. Weber ◽  
Umberto Perron ◽  
Dearbhaile Casey ◽  
Ziheng Yang ◽  
Nick Goldman

How can we best learn the history of a protein’s evolution? Ideally, a model of sequence evolution should capture both the process that generates genetic variation and the functional constraints determining which changes are fixed. However, in practical terms the most suitable approach may simply be the one that combines the convenience of easily available input data with the ability to return useful parameter estimates. For example, we might be interested in a measure of the strength of selection (typically obtained using a codon model) or an ancestral structure (obtained using structural modelling based on inferred amino acid sequence and side chain configuration). But what if data in the relevant state-space are not readily available? We show that it is possible to obtain accurate estimates of the outputs of interest using an established method for handling missing data. Encoding observed characters in an alignment as ambiguous representations of characters in a larger state-space allows the application of models with the desired features to data that lack the resolution that is normally required. This strategy is viable because the evolutionary path taken through the observed space contains information about states that were likely visited in the “unseen” state-space. To illustrate this, we consider two examples with amino acid sequences as input. We show that ω, a parameter describing the relative strength of selection on non-synonymous and synonymous changes, can be estimated in an unbiased manner using an adapted version of a standard 61-state codon model. Using simulated and empirical data, we find that ancestral amino acid side chain configuration can be inferred by applying a 55-state empirical model to 20-state amino acid data. Where feasible, combining inputs from both ambiguity-coded and fully resolved data improves accuracy. Adding structural information to as few as 12.5% of the sequences in an amino acid alignment results in remarkable ancestral reconstruction performance compared to a benchmark that considers the full rotamer state information. These examples show that our methods permit the recovery of evolutionary information from sequences where it has previously been inaccessible.


2018 ◽  
Author(s):  
Claudia C. Weber ◽  
Simon Whelan

Abstract Substitutions between chemically distant amino acids are known to occur less frequently than those between more similar amino acids. This knowledge, however, is not reflected in most codon substitution models, which treat all non-synonymous changes as if they were equivalent in terms of impact on the protein. A variety of methods for integrating chemical distances into models have been proposed, with a common approach being to divide substitutions into radical or conservative categories. Nevertheless, it remains unclear whether the resulting models describe sequence evolution better than their simpler counterparts. We propose a parametric codon model that distinguishes between radical and conservative substitutions, allowing us to assess if radical substitutions are preferentially removed by selection. Applying our new model to a range of phylogenomic data, we find differentiating between radical and conservative substitutions provides significantly better fit for large populations, but see no equivalent improvement for smaller populations. Comparing codon- and amino acid models using these same data shows that alignments from large populations tend to select phylogenetic models containing information about amino acid exchangeabilities, whereas the structure of the genetic code is more important for smaller populations. Our results suggest selection against radical substitutions is, on average, more pronounced in large populations than smaller ones. The reduced observable effect of selection in smaller populations may be due to stronger genetic drift making it more challenging to detect preferences. Our results imply an important connection between the life history of a phylogenetic group and the model that best describes its evolution.
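The radical/conservative split mentioned in this abstract can be made concrete with a simple lookup. The abstract does not say which chemical classification the authors used, so the sketch below assumes a charge-based grouping (positive, negative, neutral) purely for illustration. The names `CHARGE_CLASS` and `is_radical` are hypothetical.

```python
# Assumed charge-based grouping of the 20 amino acids (illustrative only;
# other schemes based on polarity or volume would give different labels).
CHARGE_CLASS = {}
for aa in "RHK":
    CHARGE_CLASS[aa] = "positive"
for aa in "DE":
    CHARGE_CLASS[aa] = "negative"
for aa in "ACFGILMNPQSTVWY":
    CHARGE_CLASS[aa] = "neutral"


def is_radical(aa_from, aa_to):
    """Return True if the replacement changes charge class (radical),
    False if both amino acids share a class (conservative)."""
    return CHARGE_CLASS[aa_from] != CHARGE_CLASS[aa_to]
```

Under this grouping a lysine to glutamate replacement is radical, while lysine to arginine is conservative. A codon model could then assign the two categories separate rate parameters.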


Zootaxa ◽  
2017 ◽  
Vol 4320 (3) ◽  
pp. 523 ◽  
Author(s):  
XIN-RAN LI ◽  
MENG LI ◽  
ZONG-QING WANG

The beetle cockroach, or genus Diploptera Saussure, has been reviewed recently, with unresolved issues remaining. New materials facilitated a molecular phylogenetic study and further comparisons of male and female genitalia among known species. We performed phylogenetic estimates based on two mitochondrial DNA fragments: 657 bases of COI gene and 376 bases of 16S rRNA gene. We used codon model and doublet model (secondary structure) for COI and 16S respectively, and the predicted secondary structure of sequenced 16S fragment is illustrated. The phylogeny revealed that 1) D. bicolor Hanitsch is a junior synonym of D. maculata Hanitsch, and therefore D. pulchra Anisyutkin is also a new synonym of the latter because of its synonymy with D. bicolor; and 2) D. punctata (Eschscholtz) can be reliably determined only for specimens from Hawaii and continental Asia, and distributional records of this species require re-examination. The male phallic complex and female valvulae are generalized with diagrams, and interspecific differences are discussed. Genital structures of Diploptera are not significantly varied. We notice a superficial linkage between hook-like phallomere and pronotum: a protrusion on the inner margin of hook-apex sclerite is combined with an angular pronotum; whilst no protrusion, no pronotal angles. The differences in valvulae lie with the third valvulae and the anterior arch of second valvifer ring; these may have taxonomic implications. The uniformity in physical property of oothecae suggests that all Diploptera species, not only D. punctata, are viviparous. 

