Evaluating the Impact of Purifying Selection on Species-level Molecular Dating

2019
Author(s):
Chong He
Dan Liang
Peng Zhang

Abstract The neutral theory of molecular evolution suggests that the constancy of the molecular clock relies on the neutral condition. Thus, purifying selection, the most common type of natural selection, could influence the constancy of the molecular clock, and the use of genes/sites under purifying selection may produce less reliable molecular dating results. However, in current practices of species-level molecular dating, some researchers prefer to select slowly evolving genes/sites to avoid the potential impact of substitution saturation. These genes/sites are generally under a strong influence of purifying selection. Here, from the data of 23 published mammal genomes, we constructed datasets under various selective constraints. We compared the differences in branch lengths and time estimates among these datasets to investigate the impact of purifying selection on species-level molecular dating. We found that as the selective constraint increases, terminal branches are extended, which introduces biases into the result of species-level molecular dating. This result suggests that in species-level molecular dating, the impact of purifying selection should be taken into consideration, and researchers should be more cautious with the use of genes/sites under purifying selection.

2020
Vol 36 (Supplement_2)
pp. i884-i894
Author(s):
Jose Barba-Montoya
Qiqing Tao
Sudhir Kumar

Abstract Motivation As the number and diversity of species and genes grow in contemporary datasets, two common assumptions made in all molecular dating methods, namely the time-reversibility and stationarity of the substitution process, become untenable. No software tools for molecular dating allow researchers to relax these two assumptions in their data analyses. Frequently the same General Time Reversible (GTR) model across lineages along with a gamma (+Γ) distributed rates across sites is used in relaxed clock analyses, which assumes time-reversibility and stationarity of the substitution process. Many reports have quantified the impact of violations of these underlying assumptions on molecular phylogeny, but none have systematically analyzed their impact on divergence time estimates. Results We quantified the bias on time estimates that resulted from using the GTR + Γ model for the analysis of computer-simulated nucleotide sequence alignments that were evolved with non-stationary (NS) and non-reversible (NR) substitution models. We tested Bayesian and RelTime approaches that do not require a molecular clock for estimating divergence times. Divergence times obtained using a GTR + Γ model differed only slightly (∼3% on average) from the expected times for NR datasets, but the difference was larger for NS datasets (∼10% on average). The use of only a few calibrations reduced these biases considerably (∼5%). Confidence and credibility intervals from GTR + Γ analysis usually contained correct times. Therefore, the bias introduced by the use of the GTR + Γ model to analyze datasets, in which the time-reversibility and stationarity assumptions are violated, is likely not large and can be reduced by applying multiple calibrations. Availability and implementation All datasets are deposited in Figshare: https://doi.org/10.6084/m9.figshare.12594638.


2019
Vol 10 (1)
Author(s):
Eugene J. Gardner
Elena Prigmore
Giuseppe Gallone
Petr Danecek
Kaitlin E. Samocha
...  

Abstract Mobile genetic Elements (MEs) are segments of DNA which can copy themselves and other transcribed sequences through the process of retrotransposition (RT). In humans several disorders have been attributed to RT, but the role of RT in severe developmental disorders (DD) has not yet been explored. Here we identify RT-derived events in 9738 exome sequenced trios with DD-affected probands. We ascertain 9 de novo MEs, 4 of which are likely causative of the patient’s symptoms (0.04%), as well as 2 de novo gene retroduplications. Beyond identifying likely diagnostic RT events, we estimate genome-wide germline ME mutation rate and selective constraint and demonstrate that coding RT events have signatures of purifying selection equivalent to those of truncating mutations. Overall, our analysis represents a comprehensive interrogation of the impact of retrotransposition on protein coding genes and a framework for future evolutionary and disease studies.


2017
Author(s):
Fabia U. Battistuzzi
Qiqing Tao
Lance Jones
Koichiro Tamura
Sudhir Kumar

Abstract The RelTime method estimates divergence times when evolutionary rates vary among lineages. Theoretical analyses show that RelTime relaxes the strict molecular clock throughout a molecular phylogeny, and it performs well in the analysis of empirical and computer simulated datasets in which evolutionary rates are variable. Lozano-Fernandez et al. (2017) found that the application of RelTime to one metazoan dataset (Erwin et al. 2011) produced equal rates for several ancient lineages, which led them to speculate that RelTime imposes a strict molecular clock for deep animal divergences. RelTime does not impose a strict molecular clock. The pattern observed by Lozano-Fernandez et al. (2017) was a result of the use of an option to assign the same rate to lineages in RelTime when the rates are not statistically significantly different. The median rate difference was 5% for many deep metazoan lineages for the Erwin et al. (2011) dataset, so the rate equality was not rejected. In fact, RelTime analysis with and without the option to test rate differences produced very similar time estimates. We found that the Bayesian time estimates vary widely depending on the root priors assigned, and that the use of less restrictive priors produces Bayesian divergence times that are concordant with those from RelTime for the Erwin et al. (2011) dataset. Therefore, it is prudent to discuss Bayesian estimates obtained under a range of priors in any discourse about molecular dating, including method comparisons.


2019
Author(s):
Vicente M Cabrera

Abstract Background The molecular clock is the most important genetic tool to estimate evolutionary timescales. However, the detection of a time dependency effect on the mutation rate estimates is complicating its application. It has been suggested that demographic processes could be the main cause of this confounding effect. In the present study I propose a new algorithm to estimate the coalescent age of phylogenetically related sequences, taking into account the observed time dependency effect on the molecular rate detected by others. Results Applying this method to real human mitochondrial DNA trees, with shallow and deep topologies, I have obtained significantly older molecular ages for the main events of human evolution than in previous estimates. These ages are in close agreement with the most recent archaeological and paleontological records that are in favor of an emergence of early anatomically modern humans in Africa at 315 ± 34 thousand years ago and the presence of recent modern humans out of Africa as early as 174 ± 48 thousand years ago. Furthermore, in the implementation process, we demonstrated that in a population with fluctuating sizes, the probability of fixation of a new neutral mutant depends on the effective population size which is more in accordance with the fact that, under the neutral theory of molecular evolution, the fate of a molecular mutation is mainly determined by random drift. Conclusions I suggest that the demographic history of populations has a more decisive effect than purifying selection and/or mutational saturation on the time dependence effect observed for the substitution rate.


2020
Author(s):
Tshifhiwa G. Matumba
Jody Oliver
Nigel P. Barker
Christopher D. McQuaid
Peter R. Teske

Abstract Mitochondrial DNA (mtDNA) has long been used to date the divergence between species, and to explore the time when species’ effective population sizes changed. The idea that mitochondrial DNA is useful for molecular dating rests on the premise that its evolution is neutral. This premise was questionable to begin with, and even though it has long been challenged, the evidence against clock-like evolution of mtDNA is usually ignored. Here, we present a particularly clear and simple example to illustrate the implications of violations of the assumption of selective neutrality. DNA sequences were generated for the mtDNA COI gene and the nuclear 28S rRNA of two closely related and widely distributed rocky shore snails whose geographical ranges are defined by different thermal preferences. To our knowledge, this is the first study to use nuclear rRNA sequence for studying species-level genealogies instead of phylogenetics, presumably because this marker is considered to be uninformative at this taxonomic level. Even though the COI gene evolves at least an order of magnitude faster, which was reflected in high inter-specific divergence, intraspecific genetic variation was similar for both markers. As a result, estimates of population expansion times based on mismatch distributions were completely different for the two markers. Assuming that 28S evolves effectively clock-like, these findings likely illustrate variation-reducing purifying selection in mtDNA at the species level, and an elevated divergence rate caused by divergent selection between the two species. Although these two selective forces together make mtDNA suitable as a DNA barcoding marker because they create a ‘barcoding gap’, estimates of demographic change can be expected to be highly unreliable. Our study contributes to the growing evidence that the utility of mtDNA beyond DNA barcoding is limited.


2020
Author(s):
Jose Barba-Montoya
Qiqing Tao
Sudhir Kumar

Abstract Motivation As the number and diversity of species and genes grow in contemporary datasets, two common assumptions made in all molecular dating methods, namely the time-reversibility and stationarity of the substitution process, become untenable. No software tools for molecular dating allow researchers to relax these two assumptions in their data analyses. Frequently the same General Time Reversible (GTR) model across lineages along with a gamma (+Γ) distributed rates across sites is used in relaxed clock analyses, which assumes time-reversibility and stationarity of the substitution process. Many reports have quantified the impact of violations of these underlying assumptions on molecular phylogeny, but none have systematically analyzed their impact on divergence time estimates. Results We quantified the bias on time estimates that resulted from using the GTR+Γ model for the analysis of computer-simulated nucleotide sequence alignments that were evolved with non-stationary (NS) and non-reversible (NR) substitution models. We tested Bayesian and RelTime approaches that do not require a molecular clock for estimating divergence times. Divergence times obtained using a GTR+Γ model differed only slightly (∼3% on average) from the expected times for NR datasets, but the difference was larger for NS datasets (∼10% on average). The use of only a few calibrations reduced these biases considerably (∼5%). Confidence and credibility intervals from GTR+Γ analysis usually contained correct times. Therefore, the bias introduced by the use of the GTR+Γ model to analyze datasets, in which the time-reversibility and stationarity assumptions are violated, is likely not large and can be reduced by applying multiple calibrations. Availability All datasets are deposited in Figshare: https://doi.org/10.6084/[email protected]


2019
Vol 1 (1)
Author(s):
D C Blackburn
G Giribet
D E Soltis
E L Stanley

Abstract Although our inventory of Earth’s biodiversity remains incomplete, we still require analyses using the Tree of Life to understand evolutionary and ecological patterns. Because incomplete sampling may bias our inferences, we must evaluate how future additions of newly discovered species might impact analyses performed today. We describe an approach that uses taxonomic history and phylogenetic trees to characterize the impact of past species discoveries on phylogenetic knowledge using patterns of branch-length variation, tree shape, and phylogenetic diversity. This provides a framework for assessing the relative completeness of taxonomic knowledge of lineages within a phylogeny. To demonstrate this approach, we use recent large phylogenies for amphibians, reptiles, flowering plants, and invertebrates. Well-known clades exhibit a decline in the mean and range of branch lengths that are added each year as new species are described. With increased taxonomic knowledge over time, deep lineages of well-known clades become known such that most recently described new species are added close to the tips of the tree, reflecting changing tree shape over the course of taxonomic history. The same analyses reveal other clades to be candidates for future discoveries that could dramatically impact our phylogenetic knowledge. Our work reveals that species are often added non-randomly to the phylogeny over multiyear time-scales in a predictable pattern of taxonomic maturation. Our results suggest that we can make informed predictions about how new species will be added across the phylogeny of a given clade, thus providing a framework for accommodating unsampled undescribed species in evolutionary analyses.


2011
Vol 77 (16)
pp. 5682-5687
Author(s):
Erin E. King
Rachel P. Smith
Benoit St-Pierre
André-Denis G. Wright

ABSTRACT In the dairy cattle industry, Holstein and Jersey are the breeds most commonly used for production. They differ in performance by various traits, such as body size, milk production, and milk composition. With increased concerns about the impact of agriculture on climate change, potential differences in other traits, such as methane emission, also need to be characterized further. Since methane is produced in the rumen by methanogenic archaea, we investigated whether the population structure of methanogen communities would differ between Holsteins and Jerseys. Breed-specific rumen methanogen 16S rRNA gene clone libraries were constructed from pooled PCR products obtained from lactating Holstein and Jersey cows, generating 180 and 185 clones, respectively. The combined 365 sequences were assigned to 55 species-level operational taxonomic units (OTUs). Twenty OTUs, representing 85% of the combined library sequences, were common to both breeds, while 23 OTUs (36 sequences) were found only in the Holstein library and 12 OTUs (18 sequences) were found only in the Jersey library, highlighting increased diversity in the Holstein library. Other differences included the observation that sequences with species-like sequence identity to Methanobrevibacter millerae were represented more highly in the Jersey breed, while Methanosphaera-related sequences and novel uncultured methanogen clones were more frequent in the Holstein library. In contrast, OTU sequences with species-level sequence identity to Methanobrevibacter ruminantium were represented similarly in both libraries. Since the sampled animals were from a single herd consisting of two breeds which were fed the same diet and maintained under the same environmental conditions, the differences we observed may be due to differences in host breed genetics.
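The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' analysis: it assumes that every clone sequence belongs to exactly one OTU, so that sequences outside the breed-specific OTUs fall in the shared ones. The function and variable names are hypothetical.

```python
# Counts taken from the abstract; names are illustrative only.
HOLSTEIN_CLONES = 180
JERSEY_CLONES = 185
SHARED_OTUS = 20
HOLSTEIN_ONLY = {"otus": 23, "sequences": 36}
JERSEY_ONLY = {"otus": 12, "sequences": 18}


def summarize_libraries():
    """Return (total OTUs, sequences in shared OTUs, shared fraction).

    Assumes each sequence is assigned to exactly one OTU, so the
    shared-OTU sequences are whatever the breed-specific OTUs leave over.
    """
    total_sequences = HOLSTEIN_CLONES + JERSEY_CLONES
    total_otus = SHARED_OTUS + HOLSTEIN_ONLY["otus"] + JERSEY_ONLY["otus"]
    shared_sequences = (
        total_sequences - HOLSTEIN_ONLY["sequences"] - JERSEY_ONLY["sequences"]
    )
    return total_otus, shared_sequences, shared_sequences / total_sequences


if __name__ == "__main__":
    otus, shared, fraction = summarize_libraries()
    print(f"{otus} OTUs; {shared} shared-OTU sequences ({fraction:.1%})")
```

Under that assumption the three OTU groups sum to the 55 OTUs stated, and the shared OTUs account for 311 of the 365 sequences, which matches the reported 85% after rounding.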


2008
Vol 62 (1)
pp. 1-9
Author(s):
Ashish Tripathi
Rabih E. Jabbour
Patrick J. Treado
Jason H. Neiss
Matthew P. Nelson
...  

Raman spectroscopy is being evaluated as a candidate technology for waterborne pathogen detection. We have investigated the impact of key experimental and background interference parameters on the bacterial species-level identification performance of Raman detection. These parameters include laser-induced photodamage threshold, composition of water matrix, and organism aging in water. The laser-induced photodamage may be minimized by operating a 532 nm continuous wave laser excitation at laser power densities below 2300 W/cm² for Gram-positive Bacillus atrophaeus (formerly Bacillus globigii, BG) vegetative cells, 2800 W/cm² for BG spores, and 3500 W/cm² for Gram-negative E. coli (EC) organisms. In general, Bacillus spore microorganism preparations may be irradiated with higher laser power densities than the equivalent Bacillus vegetative preparations. In order to evaluate the impact of background interference and organism aging, we selected a biomaterials set comprising Gram-positive (anthrax simulants) organisms, Gram-negative (plague simulant) organisms, and proteins (toxin simulants) and constructed a Raman signature classifier that identifies at the species level. Subsequently, we evaluated the impact of tap water and storage time in water (aging) on the classifier performance when characterizing B. thuringiensis spores, BG spores, and EC cell preparations. In general, the measured Raman signatures of biological organisms exhibited minimal spectral variability with respect to the age of a resting suspension and water matrix composition. The observed signature variability did not substantially degrade discrimination performance at the genus and species levels. In addition, Raman chemical imaging spectroscopy was used to distinguish a mixture of BG spores and EC cells at the single cell level.

