Protein evolution is structure dependent and non-homogeneous across the tree of life

2020 ◽  
Author(s):  
Akanksha Pandey ◽  
Edward L. Braun

Abstract Motivation Protein sequence evolution is a complex process that varies among sites within proteins and across the tree of life. Comparisons of evolutionary rate matrices for specific taxa (‘clade-specific models’) have the potential to reveal this variation and provide information about the underlying reasons for those changes. To study changes in patterns of protein sequence evolution we estimated and compared clade-specific models in a way that acknowledged variation within proteins due to structure. Results Clade-specific model fit correctly classified proteins from four specific groups (vertebrates, plants, oomycetes, and yeasts) more than 70% of the time. This was true whether we used mixture models that incorporate relative solvent accessibility or simple models that treat sites as homogeneous. Thus, protein evolution is non-homogeneous over the tree of life. However, a small number of dimensions could explain the differences among models (for mixture models, ~50% of the variance reflected relative solvent accessibility and ~25% reflected clade). Relaxed purifying selection in taxa with lower long-term effective population sizes appears to explain much of the among-clade variance. Relaxed selection on solvent-exposed sites was correlated with changes in amino acid side-chain volume; other differences among models were more complex. Beyond the information they reveal about protein evolution, our clade-specific models also represent tools for phylogenomic inference. Availability Model files are available from https://github.com/ebraun68/[email protected] Supplementary information Supplementary data are appended to this preprint.
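The classification idea in the Results can be sketched as follows: each protein alignment is scored under every clade-specific model and assigned to the clade whose model fits best. This is a minimal illustration, not the authors' pipeline. The function names and all log-likelihood values are hypothetical, and the sketch assumes the competing models have the same number of free parameters, so that raw log-likelihoods are directly comparable.

```python
from typing import Dict, List, Tuple


def classify_by_model_fit(log_likelihoods: Dict[str, float]) -> str:
    """Return the clade whose model gives the highest log-likelihood."""
    return max(log_likelihoods, key=log_likelihoods.get)


def classification_accuracy(proteins: List[Tuple[str, Dict[str, float]]]) -> float:
    """Fraction of proteins whose best-fitting model matches their true clade."""
    correct = sum(
        1 for true_clade, lnl in proteins
        if classify_by_model_fit(lnl) == true_clade
    )
    return correct / len(proteins)


# Hypothetical log-likelihoods (true clade, score under each clade model).
EXAMPLE = [
    ("vertebrates", {"vertebrates": -10234.0, "plants": -10251.5,
                     "oomycetes": -10260.0, "yeasts": -10258.5}),
    ("plants", {"vertebrates": -8420.5, "plants": -8401.0,
                "oomycetes": -8415.0, "yeasts": -8430.5}),
    ("oomycetes", {"vertebrates": -5120.0, "plants": -5101.5,
                   "oomycetes": -5110.0, "yeasts": -5125.0}),  # misclassified
    ("yeasts", {"vertebrates": -7350.0, "plants": -7342.5,
                "oomycetes": -7348.0, "yeasts": -7330.5}),
]

if __name__ == "__main__":
    print(classification_accuracy(EXAMPLE))
```

If the candidate models differed in parameter count, an information criterion such as AIC would be a more appropriate score than the raw log-likelihood.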

2015 ◽  
Vol 112 (25) ◽  
pp. E3226-E3235 ◽  
Author(s):  
Premal Shah ◽  
David M. McCandlish ◽  
Joshua B. Plotkin

The phenotypic effect of an allele at one genetic site may depend on alleles at other sites, a phenomenon known as epistasis. Epistasis can profoundly influence the process of evolution in populations and shape the patterns of protein divergence across species. Whereas epistasis between adaptive substitutions has been studied extensively, relatively little is known about epistasis under purifying selection. Here we use computational models of thermodynamic stability in a ligand-binding protein to explore the structure of epistasis in simulations of protein sequence evolution. Even though the predicted effects on stability of random mutations are almost completely additive, the mutations that fix under purifying selection are enriched for epistasis. In particular, the mutations that fix are contingent on previous substitutions: Although nearly neutral at their time of fixation, these mutations would be deleterious in the absence of preceding substitutions. Conversely, substitutions under purifying selection are subsequently entrenched by epistasis with later substitutions: They become increasingly deleterious to revert over time. Our results imply that, even under purifying selection, protein sequence evolution is often contingent on history and so it cannot be predicted by the phenotypic effects of mutations assayed in the ancestral background.
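Additivity and contingency, as used in this abstract, can be made concrete with a small calculation. The sketch below uses the common definition of pairwise epistasis as the deviation of a double mutant from the sum of its single-mutant effects. It is not the study's simulation code. The function names and stability values are hypothetical, and it assumes the convention that a positive stability change is destabilizing.

```python
def pairwise_epistasis(ddg_a: float, ddg_b: float, ddg_ab: float) -> float:
    """Deviation of the double mutant from the additive expectation.

    Zero means the two mutations combine additively.
    """
    return ddg_ab - (ddg_a + ddg_b)


def effect_in_background(ddg_ab: float, ddg_a: float) -> float:
    """Effect of mutation B when mutation A is already present."""
    return ddg_ab - ddg_a


# Hypothetical stability changes in kcal/mol (positive = destabilizing).
additive = pairwise_epistasis(ddg_a=0.5, ddg_b=1.0, ddg_ab=1.5)
epistatic = pairwise_epistasis(ddg_a=0.5, ddg_b=1.0, ddg_ab=0.25)

# Contingency: B is destabilizing on its own (1.0) but slightly
# stabilizing once A has fixed (-0.25).
b_given_a = effect_in_background(ddg_ab=0.25, ddg_a=0.5)

if __name__ == "__main__":
    print(additive, epistatic, b_given_a)
```

In the second case, mutation B would look deleterious if assayed in the ancestral background, which is the sense in which fixation is contingent on the preceding substitution.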


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
So-Wei Yeh ◽  
Tsun-Tsao Huang ◽  
Jen-Wei Liu ◽  
Sung-Huan Yu ◽  
Chien-Hua Shih ◽  
...  

Functional and biophysical constraints result in site-dependent patterns of protein sequence variability. It is commonly assumed that the key structural determinant of site-specific rates of evolution is the Relative Solvent Accessibility (RSA). However, a recent study found that amino acid substitution rates correlate better with two Local Packing Density (LPD) measures, the Weighted Contact Number (WCN) and the Contact Number (CN), than with RSA. This work aims at a more thorough assessment. To this end, in addition to substitution rates, we considered four other sequence variability scores, four measures of solvent accessibility (SA), and other CN measures. We compared all properties for each protein of a structurally and functionally diverse representative dataset of monomeric enzymes. We show that the best sequence variability measures take into account phylogenetic tree topology. More importantly, we show that both LPD measures (WCN and CN) correlate better than all of the SA measures, regardless of the sequence variability score used. Moreover, the independent contribution of the best LPD measure is approximately four times larger than that of the best SA measure. This study strongly supports the conclusion that a site’s packing density rather than its solvent accessibility is the main structural determinant of its rate of evolution.
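The two packing-density measures can be sketched under their usual definitions: CN counts the residues within a cutoff distance of a site, and WCN sums the inverse squared distances to all other residues. The sketch assumes one representative coordinate per residue (for example the C-alpha atom). The 13 Å default cutoff is an illustrative assumption, not a value taken from the abstract, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Coord = Sequence[float]


def contact_number(coords: List[Coord], cutoff: float = 13.0) -> List[int]:
    """CN: for each residue, count the other residues within `cutoff`."""
    return [
        sum(1 for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= cutoff)
        for i, a in enumerate(coords)
    ]


def weighted_contact_number(coords: List[Coord]) -> List[float]:
    """WCN: for each residue, sum 1 / r_ij^2 over all other residues."""
    return [
        sum(1.0 / math.dist(a, b) ** 2
            for j, b in enumerate(coords) if j != i)
        for i, a in enumerate(coords)
    ]


# Toy coordinates: three residues spaced 2 units apart on a line.
TOY = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

if __name__ == "__main__":
    print(contact_number(TOY, cutoff=3.0))
    print(weighted_contact_number(TOY))
```

Unlike CN, WCN needs no cutoff, because distant residues are down-weighted smoothly rather than excluded.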


2015 ◽  
Vol 11 ◽  
pp. EBO.S22911 ◽  
Author(s):  
Kuangyu Wang ◽  
Shuhui Yu ◽  
Xiang Ji ◽  
Clemens Lakner ◽  
Alexander Griffing ◽  
...  

2020 ◽  
Vol 36 (11) ◽  
pp. 3372-3378
Author(s):  
Alexander Gress ◽  
Olga V Kalinina

Abstract Motivation In proteins, solvent accessibility of individual residues is a factor contributing to their importance for protein function and stability. Hence one might wish to calculate solvent accessibility in order to predict the impact of mutations, their pathogenicity and for other biomedical applications. A direct computation of solvent accessibility is only possible if all atoms of a protein three-dimensional structure are reliably resolved. Results We present SphereCon, a new precise measure that can estimate residue relative solvent accessibility (RSA) from limited data. The measure is based on calculating the volume of intersection of a sphere with a cone cut out in the direction opposite of the residue with surrounding atoms. We propose a method for estimating the position and volume of residue atoms in cases when they are not known from the structure, or when the structural data are unreliable or missing. We show that in cases of reliable input structures, SphereCon correlates almost perfectly with the directly computed RSA, and outperforms other previously suggested indirect methods. Moreover, SphereCon is the only measure that yields accurate results when the identities of amino acids are unknown. A significant novel feature of SphereCon is that it can estimate RSA from inter-residue distance and contact matrices, without any information about the actual atom coordinates. Availability and implementation https://github.com/kalininalab/spherecon. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2011 ◽  
Vol 11 (1) ◽  
pp. 361 ◽  
Author(s):  
Johan A Grahnen ◽  
Priyanka Nandakumar ◽  
Jan Kubelka ◽  
David A Liberles

2009 ◽  
Vol 37 (4) ◽  
pp. 783-786 ◽  
Author(s):  
Romain A. Studer ◽  
Marc Robinson-Rechavi

The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.

