DNA sequence evolution
Recently Published Documents



2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i353-i361
Author(s):  
Dongjoon Lim ◽  
Mathieu Blanchette

Abstract. Motivation: Accurate probabilistic models of sequence evolution are essential for a wide variety of bioinformatics tasks, including sequence alignment and phylogenetic inference. The ability to realistically simulate sequence evolution is also at the core of many benchmarking strategies. Yet, mutational processes have complex context dependencies that remain poorly modeled and understood. Results: We introduce EvoLSTM, a recurrent neural network-based evolution simulator that captures mutational context dependencies. EvoLSTM uses a sequence-to-sequence long short-term memory model trained to predict mutation probabilities at each position of a given sequence, taking into consideration the 14 flanking nucleotides. EvoLSTM can realistically simulate mammalian and plant DNA sequence evolution and reveals unexpectedly strong long-range context dependencies in mutation probabilities. EvoLSTM brings modern machine-learning approaches to bear on sequence evolution. It will serve as a useful tool to study and simulate complex mutational processes. Availability and implementation: Code and dataset are available at https://github.com/DongjoonLim/EvoLSTM. Supplementary information: Supplementary data are available at Bioinformatics online.



Author(s):  
Qipian Chen ◽  
Ziwen He ◽  
Xiao Feng ◽  
Hao Yang ◽  
Suhua Shi ◽  
...  

Abstract. Evidence for biological adaptation is often obtained by studying DNA sequence evolution. Since the analyses are affected by both positive and negative selection, studies usually assume constant negative selection in the time span of interest. For this reason, hundreds of studies that conclude adaptive evolution might have reported false signals caused by relaxed negative selection. We test this suspicion in two ways. First, we analyze the fluctuation in population size, N, during evolution. For example, the evolutionary rate in the primate phylogeny could vary by as much as 2000-fold due to the variation in N alone. Second, we measure the variation in negative selection directly by analyzing the polymorphism data from four taxa (Drosophila, Arabidopsis, primates, and birds, with 64 species in total). The strength of negative selection, as measured by the ratio of nonsynonymous to synonymous polymorphisms, fluctuates strongly and at multiple time scales. The two approaches suggest that the variation in the strength of negative selection may be responsible for the bulk of the reported adaptive genome evolution in the last two decades. This study corroborates the recent report [1] on the inconsistent patterns of adaptive genome evolution. Finally, we discuss the path forward in detecting adaptive sequence evolution.



2019 ◽  
Author(s):  
Jing Peng ◽  
David Swofford ◽  
Laura Kubatko

Abstract. Motivation: The coalescent model is now widely accepted as a necessary component for phylogenetic inference from genome-scale data. However, because model-based analysis under the coalescent is computationally prohibitive, a variety of inferential frameworks and corresponding algorithms have been proposed for estimation of species-level phylogenies and the associated parameters, including the speciation times and effective population sizes. Results: We consider the problem of estimating the timing of speciation events along a phylogeny in a coalescent framework. We propose a pseudolikelihood method for estimation of these speciation times under a model of DNA sequence evolution for which exact site pattern probabilities can be computed. We demonstrate that the pseudolikelihood estimates are statistically consistent and asymptotically normally distributed, and we show how this result can be used to estimate their asymptotic variance. We also provide a more computationally efficient estimator of the asymptotic variance based on the nonparametric bootstrap. We evaluate the performance of our method using simulation and by application to an empirical dataset on gibbons.
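
The nonparametric bootstrap mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the estimator used here (`p_distance`, the fraction of differing alignment columns) is a hypothetical stand-in for their pseudolikelihood estimator of speciation times, and the function names and default parameters are assumptions chosen for illustration. The sketch only shows the general idea of resampling alignment sites with replacement and taking the variance of the re-estimated values.

```python
import random
import statistics


def p_distance(columns):
    """Stand-in estimator (assumption): fraction of columns where two sequences differ."""
    return sum(a != b for a, b in columns) / len(columns)


def bootstrap_variance(columns, estimator, n_reps=500, seed=0):
    """Estimate the variance of `estimator` by resampling columns with replacement."""
    rng = random.Random(seed)
    n = len(columns)
    estimates = []
    for _ in range(n_reps):
        # Draw n alignment columns with replacement from the original data.
        resample = [columns[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # Sample variance of the bootstrap replicates.
    return statistics.variance(estimates)


# Toy two-sequence alignment (made-up data), stored as a list of columns.
toy_columns = list(zip("ACGTACGTACGTACGT", "ACGTTCGTACGAACGT"))
toy_variance = bootstrap_variance(toy_columns, p_distance)
```

Any estimator that takes a list of site columns could be passed in place of `p_distance`; the resampling loop itself does not depend on the model.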



2019 ◽  
Author(s):  
Chase W. Nelson ◽  
Yunxin Fu ◽  
Wen-Hsiung Li

Abstract. Summary: Recent de novo mutation data allow the estimation of non-reversible mutation rates for trinucleotide sequence contexts. However, existing tools for simulating DNA sequence evolution are limited to time-reversible models or do not consider trinucleotide context-dependent rates. As this ability is critical to testing evolutionary scenarios under neutrality, we created Trevolver. Sequence evolution is simulated on a bifurcating tree using a 64 × 4 trinucleotide mutation model. Runtime is fast and results match theoretical expectation for CpG sites. Simulations with Trevolver will enable neutral hypotheses to be tested at within-species (polymorphism), between-species (divergence), within-host (e.g., viral evolution), and somatic (e.g., cancer) levels of evolutionary change. Availability and implementation: Trevolver is implemented in Perl and available on GitHub under GNU General Public License (GPL) version 3 at https://github.com/chasewnelson/[email protected]. Supplementary information: Further details and example data are available on GitHub.
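
To make the idea of a 64 × 4 trinucleotide mutation model concrete, here is a minimal Python sketch. It is not Trevolver's code (which is written in Perl) and does not simulate along a tree. The rate values, the CpG boost factor, and the function names are all assumptions for illustration. Each of the 64 trinucleotide contexts gets its own row of four rates for the center base, and because the rows need not be symmetric, the model can be non-reversible.

```python
import itertools
import random

BASES = "ACGT"


def build_rate_table(base_rate=1e-3, cpg_boost=10.0):
    """Hypothetical 64 x 4 table: table[trinucleotide][new_center_base] = rate."""
    table = {}
    for tri in map("".join, itertools.product(BASES, repeat=3)):
        center = tri[1]
        row = {}
        for new_base in BASES:
            if new_base == center:
                row[new_base] = 0.0  # no change is not counted as a mutation
                continue
            rate = base_rate
            # Assumed (illustrative) elevated C>T rate when the next base is G.
            if center == "C" and tri[2] == "G" and new_base == "T":
                rate *= cpg_boost
            row[new_base] = rate
        table[tri] = row
    return table


def mutate_once(seq, table, rng):
    """One generation: each interior site mutates given its context in the parent."""
    out = list(seq)
    for i in range(1, len(seq) - 1):
        tri = seq[i - 1:i + 2]  # context is read from the unmutated parent
        draw = rng.random()
        cumulative = 0.0
        for new_base, rate in table[tri].items():
            cumulative += rate
            if draw < cumulative:
                out[i] = new_base
                break
    return "".join(out)


rate_table = build_rate_table()
child = mutate_once("ACGTACGTACGT", rate_table, random.Random(42))
```

The first and last bases are left unchanged here because they lack a full trinucleotide context; a real simulator would need an explicit rule for sequence ends.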



Author(s):  
Asher D. Cutter

Chapter 4, “Neutral theories of molecular evolution,” outlines the logic and predictions of the neutral theory of molecular evolution and its derivatives as a simple conceptual framework for understanding DNA sequence evolution. It introduces the standard neutral model as a null model of evolutionary change in DNA sequences to describe patterns of polymorphism within species and divergence between species. An overview is provided for the molecular clock concept and for predictions about the amount of polymorphism and allele frequency distributions within populations. This chapter covers how population size and selection intersect to define nearly neutral fitness effects and their implications, as well as misinterpretations and misapplications of Neutral Theory. This overview provides a foundation for how theoretical predictions offer null models for tests of molecular evolution developed in later chapters.



2018 ◽  
Author(s):  
Ziwen He ◽  
Qipian Chen ◽  
Hao Yang ◽  
Qingjian Chen ◽  
Suhua Shi ◽  
...  

Abstract. A recent study suggests that the evidence of adaptive DNA sequence evolution accumulated in the last 20 years may be suspect [1]. The suspicion thus calls for a re-examination of the reported evidence. The two main lines of evidence are from the McDonald-Kreitman (MK) test, which compares divergence and polymorphism data, and the PAML test, which analyzes multi-species divergence data. Here, we apply these two tests concurrently on the genomic data of Drosophila and Arabidopsis. To our surprise, the >100 genes identified by the two tests do not overlap beyond random expectations. The results could mean (i) high false-positive rates in either test or (ii) high false-negative rates in both tests due to low power. To rule out the latter, we merge every 20-30 genes into a “supergene”. At the supergene level, the power of detection is high, with 8%-56% yielding adaptive signals. Nevertheless, the calls still do not overlap. Since it is unlikely that one test is largely correct and the other is mostly wrong (see Discussion), the total evidence of adaptive DNA sequence evolution should be deemed unreliable. As suggested by Chen et al. [1], the reported evidence for positive selection may in fact be signals of fluctuating negative selection, which are handled differently by the two tests. Possible paths forward on this central evolutionary issue are discussed.
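
For readers unfamiliar with the MK test named in this abstract, the following is a minimal sketch of its usual 2 × 2 form, not the pipeline used in the study. It takes counts of nonsynonymous and synonymous differences that are fixed between species (`dn`, `ds`) or polymorphic within a species (`pn`, `ps`), and computes the neutrality index, the commonly used summary alpha = 1 - (ds × pn) / (dn × ps), and a two-sided Fisher exact p-value. The function names and the example counts are assumptions for illustration and are not taken from any study listed here.

```python
from math import comb


def mk_summary(dn, ds, pn, ps):
    """Return (neutrality index, alpha) from the four MK counts."""
    if min(dn, ds, pn, ps) <= 0:
        raise ValueError("all four counts must be positive for these ratios")
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1.0 - neutrality_index  # equals 1 - (ds * pn) / (dn * ps)
    return neutrality_index, alpha


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Illustrative counts only: fixed (dn, ds) and polymorphic (pn, ps).
ni, alpha = mk_summary(dn=7, ds=17, pn=2, ps=42)
p_value = fisher_exact_two_sided(7, 17, 2, 42)
```

A neutrality index below 1 (alpha above 0) is conventionally read as an excess of nonsynonymous divergence, which is the kind of signal the abstract argues may also arise from fluctuating negative selection.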



PLoS Genetics ◽  
2016 ◽  
Vol 12 (7) ◽  
pp. e1006206 ◽  
Author(s):  
Justin T. Page ◽  
Zach S. Liechty ◽  
Rich H. Alexander ◽  
Kimberly Clemons ◽  
Amanda M. Hulse-Kemp ◽  
...  


PLoS Genetics ◽  
2016 ◽  
Vol 12 (5) ◽  
pp. e1006012 ◽  
Author(s):  
Justin T. Page ◽  
Zach S. Liechty ◽  
Rich H. Alexander ◽  
Kimberly Clemons ◽  
Amanda M. Hulse-Kemp ◽  
...  

