Evidence for an episodic model of protein sequence evolution

2009 ◽  
Vol 37 (4) ◽  
pp. 783-786 ◽  
Author(s):  
Romain A. Studer ◽  
Marc Robinson-Rechavi

The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.

2018 ◽  
Author(s):  
Jeffrey I. Boucher ◽  
Troy W. Whitfield ◽  
Ann Dauphin ◽  
Gily Nachum ◽  
Carl Hollins ◽  
...  

AbstractThe evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impact of these factors on protein sequence evolution are interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available dataset of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple base mutations in HIV-1 protease is strongly influenced by mutational sampling.


2021 ◽  
Vol 8 (6) ◽  
pp. 201852
Author(s):  
Yi Qian ◽  
Rui Zhang ◽  
Xinglu Jiang ◽  
Guoqiu Wu

Four nucleotides (A, U, C and G) constitute 64 codons at free combination but 64 codons are unequally assigned to 21 items (20 amino acids plus one stop). About 500 amino acids are known but only 20 are selected to make up the proteins. However, the relationships between amino acid and codon and between 20 amino acids have been unclear. In this paper, we studied the relationships between 20 amino acids in 33 species and found there were three constraints between 20 amino acids, such as the relatively stable mean carbon and hydrogen (C : H) ratios (0.50), similarity interactions between the constituent ratios of amino acids, and the frequency of amino acids according with Poisson distribution under certain conditions. We demonstrated that the unequal distribution of 64 codons and the choice of amino acids in molecular evolution would be constrained to remain stable C : H ratios. The constituent ratios and frequency of 20 amino acids in a species or a protein are two determinants of protein sequence evolution, so this finding showed the constraints between 20 amino acids played an important role in protein sequence evolution.


2020 ◽  
Author(s):  
Yi Qian ◽  
Rui Zhang ◽  
Xinglu Jiang ◽  
Guoqiu Wu

Abstract 4 nucleotides(A, U, C, G) constitute 64 codons at free combination but 64 codons are unequally assigned to 21 items (20 amino acids plus one stop). About 500 amino acids are known but only 20 ones are selected to make up the proteins. However, the relationships between amino acid and codon and between 20 amino acids have been unclear. In this paper, we studied on the relationships between 20 amino acids in 33 species and found there were three constraints between 20 amino acids, such as the relatively stable mean carbon and hydrogen(C:H) ratios(0.50), similarity interactions between the constituent ratios of amino acids, and the frequency of amino acids according with Poisson distribution under a certain conditions. We demonstrated that the unequal distribution of 64 codons and the choice of amino acids in molecular evolution would be constrained to remain stable C:H ratios. The constituent ratios and frequency of 20 amino acids in a species or a protein are two determinants of protein sequence evolution, so this findings showed the constraints between 20 amino acids played an important role in protein sequence evolution.


2019 ◽  
Vol 36 (4) ◽  
pp. 798-810 ◽  
Author(s):  
Jeffrey I Boucher ◽  
Troy W Whitfield ◽  
Ann Dauphin ◽  
Gily Nachum ◽  
Carl Hollins ◽  
...  

Abstract The evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impact of these factors on protein sequence evolution is interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available data set of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple-base mutations in HIV-1 protease is strongly influenced by mutational sampling.


2021 ◽  
Author(s):  
Stefano Pascarelli ◽  
Paola Laurino

Connecting protein sequence to function is becoming increasingly relevant since high-throughput sequencing studies accumulate large amounts of genomic data. Protein database annotation helps to bridge this gap; however, it is fundamental to understand the mechanisms underlying functional inheritance and divergence. If the homology relationship between proteins is known, can we determine whether the function diverged? In this work, we analyze different possibilities of protein sequence evolution after gene duplication and identify "residue inversions", i.e., sites where the relationship between the ancestry and the functional signal is decoupled. Residues in these sites play a role in functional divergence and could indicate a shift in protein function. We develop a method to recognize residue inversions in a phylogeny and test it on real and simulated datasets. In a dataset built from the Epidermal Growth Factor Receptor (EGFR) sequences found in 88 fish species, we identify 19 positions that went through inversion after gene duplication, mostly located at the ligand-binding extracellular domain.


2011 ◽  
Vol 11 (1) ◽  
pp. 361 ◽  
Author(s):  
Johan A Grahnen ◽  
Priyanka Nandakumar ◽  
Jan Kubelka ◽  
David A Liberles

Sign in / Sign up

Export Citation Format

Share Document