Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes

PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0248337
Author(s):  
Alexander G. Lucaci ◽  
Sadie R. Wisotsky ◽  
Stephen D. Shank ◽  
Steven Weaver ◽  
Sergei L. Kosakovsky Pond

Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single-nucleotide changes have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42,000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and on the Datamonkey.org server.
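The 1H/2H/3H distinction in the abstract above can be illustrated with a minimal sketch that counts how many nucleotide positions differ between two codons. This is not the HyPhy implementation; the function names (`hits`, `hit_class`, `synonymous_multi_hits`) are hypothetical, and the sketch assumes the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hits(codon_a, codon_b):
    """Number of nucleotide positions (0 to 3) at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def hit_class(codon_a, codon_b):
    """Label a codon pair by the number of positions that must change."""
    return {0: "identical", 1: "1H", 2: "2H", 3: "3H"}[hits(codon_a, codon_b)]


def synonymous_multi_hits(amino_acid):
    """Synonymous codon pairs for one amino acid that differ at 2 or 3 positions."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    return [
        (a, b, hit_class(a, b))
        for i, a in enumerate(codons)
        for b in codons[i + 1:]
        if hits(a, b) > 1
    ]


if __name__ == "__main__":
    # Serine is encoded by two codon "islands": TCN and AGT/AGC.
    for a, b, cls in synonymous_multi_hits("S"):
        print(a, b, cls)
```

Under the standard code, every codon in serine's TCN block differs from AGT and AGC at two or three positions, so a model that allows only 1H changes cannot move between the two serine islands without passing through a non-serine codon.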

2020 ◽  
Author(s):  
Alexander G Lucaci ◽  
Sadie R Wisotsky ◽  
Stephen D. Shank ◽  
Steven Weaver ◽  
Sergei L. Kosakovsky Pond

Despite many attempts to introduce evolutionary models that permit substitutions that instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible (or are reflective of non-biological artifacts, such as alignment errors), and codon models continue to posit that only single-nucleotide changes have non-zero rates. We develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using 35,000 empirical alignments, we find widespread statistical support for multiple hits: 58% of alignments prefer models with 2H allowed, and 22% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misclassification or alignment errors. Further modeling revealed that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package.


2020 ◽  
Author(s):  
Paul Robert Connor ◽  
Ellen Riemke Katrien Evers

Payne, Vuletich, and Lundberg’s bias-of-crowds model proposes that a number of empirical puzzles can be resolved by conceptualizing implicit bias as a feature of situations rather than a feature of individuals. In the present article we argue against this model and propose that, given the existing evidence, implicit bias is best understood as an individual-level construct measured with substantial error. First, using real and simulated data, we show how each of Payne and colleagues’ proposed puzzles can be explained as being the result of measurement error and its reduction via aggregation. Second, we discuss why the authors’ counterarguments against this explanation have been unconvincing. Finally, we test a hypothesis derived from the bias-of-crowds model about the effect of an individually targeted “implicit-bias-based expulsion program” within universities and show the model to lack empirical support. We conclude by considering the implications of conceptualizing implicit bias as a noisily measured individual-level construct for ongoing implicit-bias research. All data and code are available at https://osf.io/tj8u6/.
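One standard way to see why aggregation reduces measurement error is the Spearman-Brown relation, which gives the reliability of an average of k parallel noisy measurements. The sketch below is an illustration only: it is not taken from the article, the function name is hypothetical, and the single-measurement reliability of 0.1 is an assumed value rather than a reported one.

```python
def spearman_brown(single_reliability, k):
    """Reliability of the mean of k parallel measurements.

    single_reliability: reliability of one measurement (between 0 and 1).
    k: number of measurements averaged together.
    """
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


if __name__ == "__main__":
    # Assumed, illustrative reliability of a single noisy score.
    r = 0.1
    for k in (1, 10, 100, 1000):
        print(k, round(spearman_brown(r, k), 3))
```

Even when a single score is very noisy, the reliability of the average rises quickly with k, which is the general mechanism by which an aggregate can look stable while individual scores do not.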


2018 ◽  
Author(s):  
Russell A. Ligon ◽  
Christopher D. Diaz ◽  
Janelle L. Morano ◽  
Jolyon Troscianko ◽  
Martin Stevens ◽  
...  

Ornaments used in courtship often vary wildly among species, reflecting the evolutionary interplay between mate preference functions and the constraints imposed by natural selection. Consequently, understanding the evolutionary dynamics responsible for ornament diversification has been a longstanding challenge in evolutionary biology. However, comparing radically different ornaments across species, as well as different classes of ornaments within species, is a profound challenge to understanding diversification of sexual signals. Using novel methods and a unique natural history dataset, we explore evolutionary patterns of ornament evolution in a group - the birds-of-paradise - exhibiting dramatic phenotypic diversification widely assumed to be driven by sexual selection. Rather than the tradeoff between ornament types originally envisioned by Darwin and Wallace, we found positive correlations among cross-modal (visual/acoustic) signals indicating functional integration of ornamental traits into a composite unit - the courtship phenotype. Furthermore, given the broad theoretical and empirical support for the idea that systemic robustness - functional overlap and interdependency - promotes evolutionary innovation, we posit that birds-of-paradise have radiated extensively through ornamental phenotype space as a consequence of the robustness in the courtship phenotype that we document at a phylogenetic scale. We suggest that the degree of robustness in courtship phenotypes among taxa can provide new insights into the relative influence of sexual and natural selection on phenotypic radiations.

Author Summary: Animals frequently vary widely in ornamentation, even among closely related species. Understanding the patterns that underlie this variation is a significant challenge, requiring comparisons among drastically different traits - like comparing apples to oranges. Here, we use novel analytical approaches to quantify variation in ornamental diversity and richness across the wildly divergent birds-of-paradise, a textbook example of how sexual selection can profoundly shape organismal phenotypes. We find that color and acoustic complexity, along with behavior and acoustic complexity, are positively correlated across evolutionary time-scales. Positive covariation among ornament classes suggests that selection is acting on correlated suites of traits - a composite courtship phenotype - and that this integration may be partially responsible for the extreme variation we see in birds-of-paradise.


2020 ◽  
Vol 37 (11) ◽  
pp. 3131-3148 ◽  
Author(s):  
Noor Youssef ◽  
Edward Susko ◽  
Joseph P Bielawski

Do interactions between residues in a protein (i.e., epistasis) significantly alter evolutionary dynamics? If so, what consequences might they have on inference from traditional codon substitution models which assume site-independence for the sake of computational tractability? To investigate the effects of epistasis on substitution rates, we employed a mechanistic mutation-selection model in conjunction with a fitness framework derived from protein stability. We refer to this as the stability-informed site-dependent (S-SD) model and developed a new stability-informed site-independent (S-SI) model that captures the average effect of stability constraints on individual sites of a protein. Comparison of S-SI and S-SD offers a novel and direct method for investigating the consequences of stability-induced epistasis on protein evolution. We developed S-SI and S-SD models for three natural proteins and showed that they generate sequences consistent with real alignments. Our analyses revealed that epistasis tends to increase substitution rates compared with the rates under site-independent evolution. We then assessed the epistatic sensitivity of individual sites and discovered a counterintuitive effect: Highly connected sites were less influenced by epistasis relative to exposed sites. Lastly, we show that, despite the unrealistic assumptions, traditional models perform comparably well in the presence and absence of epistasis and provide reasonable summaries of average selection intensities. We conclude that epistatic models are critical to understanding protein evolutionary dynamics, but epistasis might not be required for reasonable inference of selection pressure when averaging over time and sites.


2020 ◽  
Vol 36 (18) ◽  
pp. 4699-4705
Author(s):  
Hamid Bagheri ◽  
Andrew J Severin ◽  
Hridesh Rajan

Motivation: As the cost of sequencing decreases, the amount of data being deposited into public repositories is increasing rapidly. Public databases rely on the user to provide metadata for each submission that is prone to user error. Unfortunately, most public databases, such as non-redundant (NR), rely on user input and do not have methods for identifying errors in the provided metadata, leading to the potential for error propagation. Previous research on a small subset of the NR database analyzed misclassification based on sequence similarity. To the best of our knowledge, the amount of misclassification in the entire database has not been quantified. We propose a heuristic method to detect potentially misclassified taxonomic assignments in the NR database. We applied a curation technique and quality control to find the most probable taxonomic assignment. Our method incorporates provenance and frequency of each annotation from manually and computationally created databases and clustering information at 95% similarity.

Results: We found more than two million potentially taxonomically misclassified proteins in the NR database. Using simulated data, we show a high precision of 97% and a recall of 87% for detecting taxonomically misclassified proteins. The proposed approach and findings could also be applied to other databases.

Availability and implementation: Source code, dataset, documentation, Jupyter notebooks and Docker container are available at https://github.com/boalang/nr.

Supplementary information: Supplementary data are available at Bioinformatics online.
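For readers less familiar with the two metrics reported above, the following minimal sketch shows how precision and recall are computed from counts of true positives, false positives, and false negatives. The counts used here are hypothetical, chosen only so that they round to the reported 97% and 87%; they are not the study's data, and the function names are illustrative.

```python
def precision(true_pos, false_pos):
    """Fraction of flagged proteins that are truly misclassified."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Fraction of truly misclassified proteins that were flagged."""
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper).
    tp, fp, fn = 87, 3, 13
    print(f"precision = {precision(tp, fp):.2f}")
    print(f"recall    = {recall(tp, fn):.2f}")
```

High precision with somewhat lower recall means that flagged entries are rarely false alarms, while some misclassified entries go undetected.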


2020 ◽  
Vol 16 (11) ◽  
pp. 20200418
Author(s):  
Jorge R. Flores

Estimating how fast or slow morphology evolves through time (phenotypic change rate, PR) has become common in macroevolutionary studies and has been important for clarifying key evolutionary events. However, the inclusion of incompletely scored taxa (e.g. fossils) and variable lengths of discrete arbitrary time bins could affect PR estimates and potentially mask real PR patterns. Here, the impact of taxon incompleteness (unscored data) on PR estimates is assessed in simulated data. Three different time bin series were likewise evaluated: bins evenly spanning the tree length (i), a shorter middle bin and longer first and third bins (ii), and a longer middle bin and shorter first and third bins (iii). The results indicate that PR values decrease as taxon incompleteness increases. Statistically significant PR values, and the dispersion among PR values, depended on the time bins. These outcomes imply that taxon incompleteness can undermine our capacity to infer morphology evolutionary dynamics and that these estimates are also influenced by our choice of discrete time bins. More importantly, the present results stress the need for a better approach to deal with taxon incompleteness and arbitrary discrete time bins.


2020 ◽  
Vol 15 (6) ◽  
pp. 1329-1345 ◽  
Author(s):  
Paul Connor ◽  
Ellen R. K. Evers

Payne, Vuletich, and Lundberg’s bias-of-crowds model proposes that a number of empirical puzzles can be resolved by conceptualizing implicit bias as a feature of situations rather than a feature of individuals. In the present article we argue against this model and propose that, given the existing evidence, implicit bias is best understood as an individual-level construct measured with substantial error. First, using real and simulated data, we show how each of Payne and colleagues’ proposed puzzles can be explained as being the result of measurement error and its reduction via aggregation. Second, we discuss why the authors’ counterarguments against this explanation have been unconvincing. Finally, we test a hypothesis derived from the bias-of-crowds model about the effect of an individually targeted “implicit-bias-based expulsion program” within universities and show the model to lack empirical support. We conclude by considering the implications of conceptualizing implicit bias as a noisily measured individual-level construct for ongoing implicit-bias research. All data and code are available at https://osf.io/tj8u6/.


Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 2197-2197 ◽  
Author(s):  
Chava Kimchi-Sarfaty ◽  
Vijaya L Simhadri ◽  
David Kopelman ◽  
Adam Friedman ◽  
Nathan Edwards ◽  
...  

Abstract 2197. Hemophilia B is characterized by structural and functional defects in coagulation factor IX (FIX) caused by mutations in the F9 gene. Various mutations (nonsense, missense, etc.) are known to be associated with the disease, including a synonymous V107V mutation reported recently by Knobe and colleagues (Knobe et al., Hemophilia, 2008). However, the mechanism by which this synonymous mutation contributes to the disease has not yet been elucidated. Earlier we have shown that synonymous codon substitutions in the mRNA of the multidrug resistance protein (MDR1) may change the conformation of the protein and result in altered functionality (Kimchi-Sarfaty et al., Science, 2008). Here we have performed in silico analyses of the synonymous codon substitution (GTG→GTA) leading to the V107V polymorphism and found that it may change the mRNA structure, stability, codon usage, and 3D structure of the encoded protein. We hypothesize that changes in codon usage might affect the rhythm of protein translation and thus result in slightly altered FIX conformation. In vitro analyses of FIX mRNA and protein expression supported our in silico analyses. The GTG→GTA (V107V) synonymous mutation results in reduced expression levels as well as an encoded protein with a slightly different conformation compared to wild-type FIX. These results show that the V107V polymorphism is not silent and might cause mild hemophilia B. This work sheds further light on ways in which synonymous mutations impact disease. The findings and conclusions in this article have not been formally disseminated by the Food and Drug Administration and should not be construed to represent any Agency determination or policy.

Disclosures: No relevant conflicts of interest to declare.


2018 ◽  
Author(s):  
Hanna Schenk ◽  
Hinrich Schulenburg ◽  
Arne Traulsen

Background: Red Queen dynamics are defined as long-term co-evolutionary dynamics, often with oscillations of genotype abundances driven by fluctuating selection in host-parasite systems. Much of our current understanding of these dynamics is based on theoretical concepts explored in mathematical models that are mostly (i) deterministic, inferring an infinite population size and (ii) evolutionary, thus ecological interactions that change population sizes are excluded. Here, we recall the different mathematical approaches used in the current literature on Red Queen dynamics. We then compare models from game theory (evo) and classical theoretical ecology models (eco-evo), that are all derived from individual interactions and are thus intrinsically stochastic. We assess the influence of this stochasticity through the time to the first loss of a genotype within a host or parasite population.

Results: The time until the first genotype is lost (“extinction time”) is shorter when ecological dynamics, in the form of a changing population size, are considered. Furthermore, when individuals compete only locally with other individuals, extinction is even faster. On the other hand, evolutionary models with a fixed population size and competition on the scale of the whole population prolong extinction and therefore stabilise the oscillations. The stabilising properties of intraspecific competition become stronger when population size is increased and the deterministic part of the dynamics gains influence. In general, the loss of genotype diversity can be counteracted with mutations (or recombination), which then allow the populations to recurrently undergo negative frequency-dependent selection dynamics and selective sweeps.

Conclusion: Although the models we investigated are equal in their biological motivation and interpretation, they have diverging mathematical properties both in the derived deterministic dynamics and the derived stochastic dynamics. We find that models that do not consider intraspecific competition and that include ecological dynamics by letting the population size vary, lose genotypes – and thus Red Queen oscillations – faster than models with competition and a fixed population size.

