Fibrinopeptides A and B of Baboons (Papio anubis, Papio hamadryas, and Theropithecus gelada): Their Amino Acid Sequences and Evolutionary Rates and a Molecular Phylogeny for the Baboons

1983 ◽  
Vol 94 (6) ◽  
pp. 1973-1978 ◽  
Author(s):  
Shin NAKAMURA ◽  
Osamu TAKENAKA ◽  
Kenji TAKAHASHI
PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3391 ◽  
Author(s):  
Dariya K. Sydykova ◽  
Claus O. Wilke

Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN∕dS ratio. For amino-acid sequences, one widely-used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN∕dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN∕dS, using either dN∕dS models or mutation–selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN∕dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN∕dS, and the correlation strengths increase in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate very well with inferred (as opposed to true) dN∕dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN∕dS in a variety of empirical datasets. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences.


1999 ◽  
Vol 65 (4) ◽  
pp. 563-570 ◽  
Author(s):  
Keijiro Sezaki ◽  
Rowshan Ara Begum ◽  
Prachit Wongrat ◽  
Mansha Prasad Srivastava ◽  
Sachi SriKantha ◽  
...  

2017 ◽  
Author(s):  
Dariya K Sydykova ◽  
Claus O Wilke

Many applications require the calculation of site-specific evolutionary rates from alignments of amino-acid sequences. For example, catalytic residues in enzymes and interface regions in protein complexes can be inferred from observed relative rates. While numerous approaches exist to calculate amino-acid rates, it is not entirely clear what physical quantities the inferred rates represent and how these rates relate to the underlying fitness landscape of the evolving protein. Further, amino-acid rates can be calculated in the context of different amino-acid exchangeability matrices, such as JTT, LG, or WAG, and again it is not known how the choice of the matrix influences the physical interpretation of the inferred rates. Here, we develop a theory of measurement for site-specific evolutionary rates, by analytically solving the maximum-likelihood equations for rate inference performed on sequences evolved under a mutation–selection model. We demonstrate that the measurement process can only recover the true expected rates of the mutation–selection model if rates are measured relative to a naïve exchangeability matrix, in which all exchangeabilities are equal to one. Rate measurements using other matrices are quantitatively close but not mathematically correct. Our results demonstrate that insights obtained from phylogenetic-tree inference do not necessarily apply to rate inference, and best practices for the former may be deleterious for the latter.


2017 ◽  
Author(s):  
Dariya K. Sydykova ◽  
Claus O Wilke

Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN/dS ratio. For amino-acid sequences, one widely used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN/dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN/dS, using either dN/dS models or mutation–selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN/dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN/dS, and the correlation strengths increase in alignments with higher sequence divergence and a larger number of taxa. Moreover, Rate4Site scores correlate nearly perfectly with inferred dN/dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN/dS in a variety of natural sequence alignments. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield near-identical inferences.


2018 ◽  
Author(s):  
Dariya K. Sydykova ◽  
Claus O. Wilke

In the field of molecular evolution, we commonly calculate site-specific evolutionary rates from alignments of amino-acid sequences. For example, catalytic residues in enzymes and interface regions in protein complexes can be inferred from observed relative rates. While numerous approaches exist to calculate amino-acid rates, it is not entirely clear what physical quantities the inferred rates represent and how these rates relate to the underlying fitness landscape of the evolving proteins. Further, amino-acid rates can be calculated in the context of different amino-acid exchangeability matrices, such as JTT, LG, or WAG, and again it is not well understood how the choice of the matrix influences the physical interpretation of the inferred rates. Here, we develop a theory of measurement for site-specific evolutionary rates, by analytically solving the maximum-likelihood equations for rate inference performed on sequences evolved under a mutation–selection model. We demonstrate that for realistic analysis settings the measurement process will recover the true expected rates of the mutation–selection model if rates are measured relative to a naïve exchangeability matrix, in which all exchangeabilities are equal to 1/19. We also show that rate measurements using other matrices are quantitatively close but in general not mathematically equivalent. Our results demonstrate that insights obtained from phylogenetic-tree inference do not necessarily apply to rate inference, and best practices for the former may be deleterious for the latter.

Significance Statement
Maximum likelihood inference is widely used to infer model parameters from sequence data in an evolutionary context. One major challenge in such inference procedures is the problem of having to identify the appropriate model used for inference. Model parameters usually are meaningful only to the extent that the model is appropriately specified and matches the process that generated the data. However, in practice, we don’t know what process generated the data, and most models in actual use are misspecified. To circumvent this problem, we show here that we can employ maximum likelihood inference to make defined and meaningful measurements on arbitrary processes. Our approach uses misspecification as a deliberate strategy, and this strategy results in robust and meaningful parameter inference.
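
As an illustration of the naïve exchangeability matrix mentioned in the abstract above, the following minimal Python sketch builds a 20-state rate matrix in which every off-diagonal entry equals 1/19. It rests on assumptions not stated in the abstract: that the rate matrix is taken directly from the exchangeabilities without weighting by amino-acid frequencies, and that there are 20 states. The function names are hypothetical. Under these assumptions the total rate of leaving any amino acid is exactly 1, which is one way to read the choice of 1/19, and the transition probabilities have a simple closed form.

```python
import math

N_STATES = 20  # assumed number of amino acids


def naive_rate_matrix(n=N_STATES):
    """Equal-rates matrix: off-diagonal entries 1/(n-1), diagonal -1.

    Assumption for this sketch: rates equal the exchangeabilities
    directly, with no weighting by equilibrium frequencies.
    """
    off = 1.0 / (n - 1)
    return [[-1.0 if i == j else off for j in range(n)] for i in range(n)]


def transition_probability(t, same, n=N_STATES):
    """Closed-form entry of exp(Q t) for the equal-rates matrix above.

    same=True gives the probability of observing the starting state
    after time t; same=False gives the probability of one specific
    other state.
    """
    decay = math.exp(-n * t / (n - 1))
    if same:
        return 1.0 / n + (n - 1) / n * decay
    return 1.0 / n - decay / n


Q = naive_rate_matrix()
leaving_rate = -Q[0][0]   # total rate of leaving a state
row_sum = sum(Q[0])       # rows of a rate matrix sum to zero

p_same = transition_probability(0.5, same=True)
p_diff = transition_probability(0.5, same=False)
row_total = p_same + (N_STATES - 1) * p_diff  # probabilities sum to one
```

With this normalization, one unit of time corresponds to one expected substitution per site, so a site-specific rate multiplying `t` can be read as a rate relative to that baseline. This reading is an interpretation of the sketch, not a statement of the authors' derivation.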


1999 ◽  
Vol 73 (5) ◽  
pp. 4413-4426 ◽  
Author(s):  
Stephen E. Lindstrom ◽  
Yasuaki Hiromoto ◽  
Hidekazu Nishimura ◽  
Takehiko Saito ◽  
Reiko Nerome ◽  
...  

ABSTRACT Phylogenetic profiles of the genes coding for the hemagglutinin (HA) protein, nucleoprotein (NP), matrix (M) protein, and nonstructural (NS) proteins of influenza B viruses isolated from 1940 to 1998 were analyzed in a parallel manner in order to understand the evolutionary mechanisms of these viruses. Unlike human influenza A (H3N2) viruses, the evolutionary pathways of all four genes of recent influenza B viruses revealed similar patterns of genetic divergence into two major lineages. Although evolutionary rates of the HA, NP, M, and NS genes of influenza B viruses were estimated to be generally lower than those of human influenza A viruses, genes of influenza B viruses demonstrated complex phylogenetic patterns, indicating alternative mechanisms for generation of virus variability. Topologies of the evolutionary trees of each gene were determined to be quite distinct from one another, showing that these genes were evolving in an independent manner. Furthermore, variable topologies were apparently the result of frequent genetic exchange among cocirculating epidemic viruses. Evolutionary analysis done in the present study provided further evidence for cocirculation of multiple lineages as well as sequestering and reemergence of phylogenetic lineages of the internal genes. In addition, comparison of deduced amino acid sequences revealed a novel amino acid deletion in the HA1 domain of the HA protein of recent isolates from 1998 belonging to the B/Yamagata/16/88-like lineage. It thus became apparent that, despite lower evolutionary rates, influenza B viruses were able to generate genetic diversity among circulating viruses through a combination of evolutionary mechanisms involving cocirculating lineages and genetic reassortment by which new variants with distinct gene constellations emerged.

