Codon-Substitution Models for Heterogeneous Selection Pressure at Amino Acid Sites

Genetics ◽  
2000 ◽  
Vol 155 (1) ◽  
pp. 431-449 ◽  
Author(s):  
Ziheng Yang ◽  
Rasmus Nielsen ◽  
Nick Goldman ◽  
Anne-Mette Krabbe Pedersen

Abstract Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (ω = dN/dS) is an important indicator of selective pressure at the protein level, with ω = 1 meaning neutral mutations, ω < 1 purifying selection, and ω > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying ω ratios. We develop models that account for heterogeneous ω ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of ω among sites. In all data sets analyzed, the selective pressure indicated by the ω ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average ω ratio across sites is <1, but in which some sites are clearly under diversifying selection with ω > 1. Genes undergoing positive selection include the β-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for ω and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of ω among sites from real data sets.
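The reading of ω given in this abstract, and its point that a gene-wide average below 1 can hide sites with ω > 1, can be illustrated with a minimal Python sketch. The function names, the tolerance, and the two site classes with their proportions are hypothetical values chosen for illustration; they are not estimates from the paper, and real analyses rely on likelihood-based tests rather than point comparisons.

```python
import math


def omega(dn: float, ds: float) -> float:
    """Return the nonsynonymous/synonymous rate ratio dN/dS."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def selection_regime(w: float, tol: float = 1e-9) -> str:
    """Map an omega value to the regime named in the abstract."""
    if abs(w - 1.0) <= tol:
        return "neutral"
    return "purifying" if w < 1.0 else "diversifying"


def mean_omega(proportions: list[float], omegas: list[float]) -> float:
    """Average omega over discrete site classes weighted by class proportion."""
    if len(proportions) != len(omegas):
        raise ValueError("need one omega per site class")
    if not math.isclose(sum(proportions), 1.0):
        raise ValueError("class proportions must sum to 1")
    return sum(p * w for p, w in zip(proportions, omegas))


# Hypothetical gene: 90% of sites strongly conserved, 10% under diversifying selection.
proportions = [0.9, 0.1]
omegas = [0.1, 3.0]
average = mean_omega(proportions, omegas)

print(selection_regime(average))                # regime suggested by the average alone
print([selection_regime(w) for w in omegas])    # regime of each site class
```

In this made-up case the average is about 0.39, so a single-ratio summary would suggest purifying selection even though one site class has ω = 3.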

2005 ◽  
Vol 79 (11) ◽  
pp. 7014-7023 ◽  
Author(s):  
Zigui Chen ◽  
Masanori Terai ◽  
Leiping Fu ◽  
Rolando Herrero ◽  
Rob DeSalle ◽  
...  

ABSTRACT Human papillomavirus type 16 (HPV16) is the primary etiological agent of cervical cancer, the second most common cancer in women worldwide. Complete genomes of 12 isolates representing the major lineages of HPV16 were cloned and sequenced from cervicovaginal cells. The sequence variations within the open reading frames (ORFs) and noncoding regions were identified and compared with the HPV16R reference sequence (50). This whole-genome approach gives us unprecedented precision in detailing sequence-level changes that are under selection on a whole-viral-genome scale. Of 7,908 base pair nucleotide positions, 313 (4.0%) were variable. Within the 2,452 amino acids (aa) comprising 8 ORFs, 243 (9.9%) amino acid positions were variable. In order to investigate the molecular evolution of HPV16 variants, maximum likelihood models of codon substitution were used to identify lineages and amino acid sites under selective pressure. Five codon sites in the E5 (aa 48, 65) and E6 (aa 10, 14, 83) ORFs were demonstrated to be under diversifying selective pressure. The E5 ORF had the overall highest nonsynonymous/synonymous substitution rate (ω) ratio (M3 = 0.7965). The E2 gene had the next-highest ω ratio (M3 = 0.5611); however, no specific codons were under positive selection. These data indicate that the E6 and E5 ORFs are evolving under positive Darwinian selection and have done so in a relatively short time period. Whether response to selective pressure upon the E5 and E6 ORFs contributes to the biological success of HPV16, its specific biological niche, and/or its oncogenic potential remains to be established.
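The variability percentages quoted in this abstract follow directly from the counts it reports. The short Python check below recomputes them; the helper name is hypothetical and only the four counts come from the abstract.

```python
def percent_variable(variable: int, total: int) -> float:
    """Percentage of variable positions, rounded to one decimal place."""
    return round(100 * variable / total, 1)


# Counts as reported in the abstract.
nucleotide_pct = percent_variable(313, 7908)   # variable nucleotide positions
amino_acid_pct = percent_variable(243, 2452)   # variable amino acid positions

print(nucleotide_pct, amino_acid_pct)
```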


2020 ◽  
Vol 16 ◽  
pp. 117693432091014
Author(s):  
Rong Wang ◽  
Congfen He ◽  
Kun Dong ◽  
Xin Zhao ◽  
Yaxuan Li ◽  
...  

Trehalose-6-phosphate synthase (TPS) is a key enzyme in the biosynthesis of trehalose, with its direct product, trehalose-6-phosphate, playing important roles in regulating whole-plant carbohydrate allocation and utilization. Genes encoding TPS constitute a multigene family in which functional divergence appears to have occurred repeatedly. To identify the crucial evolutionary amino acid sites of TPS in higher plants, a series of bioinformatics tools were applied to investigate the phylogenetic relationships, functional divergence, positive selection, and co-evolution of TPS proteins. First, we identified 150 TPS genes from 13 higher plant species. Phylogenetic analysis placed these TPS proteins into 2 clades: clades A and B, of which clade B could be further divided into 4 subclades (B1-B4). This classification was supported by the intron-exon structures, with more introns present in clade A. Next, detection of the critical functionally divergent amino acid sites resulted in the isolation of a total of 286 sites reflecting nonredundant radical shifts in amino acid properties with a high posterior probability cutoff among subclades. In addition, positively selected sites were identified using a codon substitution model, from which 46 amino acid sites were isolated as exhibiting positive selection at a significant level. Moreover, 18 amino acid sites were highlighted both for functional divergence and positive selection; these may thus potentially represent crucial evolutionary sites in the TPS family. Further co-evolutionary analysis revealed 3 pairs of sites: 11S and 12H, 33S and 34N, and 109G and 110E as demonstrating co-evolution. Finally, the 18 crucial evolutionary amino acid sites were mapped in the 3-dimensional structure. 
A total of 77 sites harboring functionally and structurally important residues of TPS proteins were found by using the CLIPS-4D online tool; notably, no overlap was observed with the identified crucial evolutionary sites, providing positive evidence supporting their designation. A total of 18 sites were isolated as key amino acids by using multiple bioinformatics tools based on their concomitant functional divergence and positive selection. Almost all these key sites are located in 2 domains of this protein family where they exhibit no overlap with the structurally and functionally conserved sites. These results will provide an improved understanding of the complexity of the TPS gene family and of its function and evolution in higher plants. Moreover, this knowledge may facilitate the exploitation of these sites for protein engineering applications.


Author(s):  
Nicolas Rodrigue ◽  
Thibault Latrille ◽  
Nicolas Lartillot

Abstract In recent years, codon substitution models based on the mutation–selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes, across the entire gene, or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach on a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program.


BMC Biology ◽  
2019 ◽  
Vol 17 (1) ◽  
Author(s):  
Frida Belinky ◽  
Itamar Sela ◽  
Igor B. Rogozin ◽  
Eugene V. Koonin

Abstract Background Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. Results We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as “ancestral-intermediate-final” sequences (where “intermediate” refers to the first single substitution and “final” refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate–S final; (2) SN, S intermediate–N final; (3) NS, N intermediate–S final; and (4) NN, N intermediate–N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. Conclusions The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
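The four-class partition of within-codon double substitutions described in this abstract can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, the standard genetic code is assumed, stop codons are rejected for simplicity, and both the intermediate and the final codon are labeled relative to the ancestral amino acid, which is how the abstract's remark that in NS "the original amino acid is recovered" is interpreted here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def differences(a: str, b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_double_substitution(ancestral: str, intermediate: str, final: str) -> str:
    """Return 'SS', 'SN', 'NS' or 'NN' for an ancestral-intermediate-final path."""
    codons = (ancestral, intermediate, final)
    if any(CODON_TABLE.get(c, "*") == "*" for c in codons):
        raise ValueError("unknown or stop codon in path")
    # Each step must be a single substitution, and the two steps must hit different positions.
    if differences(ancestral, intermediate) != 1 or differences(intermediate, final) != 1:
        raise ValueError("each step must change exactly one position")
    if differences(ancestral, final) != 2:
        raise ValueError("ancestral and final codons must differ at two positions")

    def label(codon: str) -> str:
        # S if the amino acid equals the ancestral one, otherwise N.
        return "S" if CODON_TABLE[codon] == CODON_TABLE[ancestral] else "N"

    return label(intermediate) + label(final)


print(classify_double_substitution("TCT", "ACT", "AGT"))  # Ser -> Thr -> Ser
print(classify_double_substitution("TTA", "CTA", "CCA"))  # Leu -> Leu -> Pro
```

The first path passes through threonine and returns to serine, so it falls in the NS class; the second keeps leucine at the intermediate step and ends at proline, so it falls in the SN class.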


PLoS ONE ◽  
2010 ◽  
Vol 5 (1) ◽  
pp. e8885 ◽  
Author(s):  
Aristeidis Parmakelis ◽  
Marina Moustaka ◽  
Nikolaos Poulakakis ◽  
Christos Louis ◽  
Michel A. Slotman ◽  
...  

2012 ◽  
Vol 9 (3) ◽  
pp. 18-32 ◽  
Author(s):  
David Reboiro-Jato ◽  
Miguel Reboiro-Jato ◽  
Florentino Fdez-Riverola ◽  
Cristina P. Vieira ◽  
Nuno A. Fonseca ◽  
...  

Summary Maximum-likelihood methods based on models of codon substitution have been widely used to infer positively selected amino acid sites that are responsible for adaptive changes. Nevertheless, in order to use such an approach, software applications are required to align protein and DNA sequences, infer a phylogenetic tree and run the maximum-likelihood models. Therefore, a significant effort is made in order to prepare input files for the different software applications and in the analysis of the output of every analysis. In this paper we present the ADOPS (Automatic Detection Of Positively Selected Sites) software. It was developed with the goal of providing an automatic and flexible tool for detecting positively selected sites given a set of unaligned nucleotide sequence data. An example of the usefulness of such a pipeline is given by showing, under different conditions, positively selected amino acid sites in a set of 54 Coffea putative S-RNase sequences. ADOPS software is freely available and can be downloaded from http://sing.ei.uvigo.es/ADOPS.


2002 ◽  
Vol 19 (6) ◽  
pp. 950-958 ◽  
Author(s):  
Maria Anisimova ◽  
Joseph P. Bielawski ◽  
Ziheng Yang

eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Allison J Shultz ◽  
Timothy B Sackton

Consistent patterns of positive selection in functionally similar genes can suggest a common selective pressure across a group of species. We use alignments of orthologous protein-coding genes from 39 species of birds to estimate parameters related to positive selection for 11,000 genes conserved across birds. We show that functional pathways related to the immune system, recombination, lipid metabolism, and phototransduction are enriched for positively selected genes. By comparing our results with mammalian data, we find a significant enrichment for positively selected genes shared between taxa, and that these shared selected genes are enriched for viral immune pathways. Using pathogen-challenge transcriptome data, we show that genes up-regulated in response to pathogens are also enriched for positively selected genes. Together, our results suggest that pathogens, particularly viruses, consistently target the same genes across divergent clades, and that these genes are hotspots of host-pathogen conflict over deep evolutionary time.

