A simple island biodiversity model is robust to trait dependence in diversification and colonization rates

2022 ◽  
Author(s):  
Shu Xie ◽  
Luis Valente ◽  
Rampal Etienne

The application of state-dependent speciation and extinction (SSE) models to phylogenetic trees has revealed an important role for traits in diversification. However, this role remains comparatively unexplored on islands, which can include multiple independent clades resulting from different colonization events. Here, we perform a robustness study to identify how trait-dependence in rates of island colonization, extinction and speciation (CES rates) affects the estimation accuracy of a phylogenetic model that assumes no rate variation between trait states. We extend the DAISIE (Dynamic Assembly of Islands through Speciation, Immigration and Extinction) simulation model to include state-dependent rates, and evaluate the robustness of the DAISIE inference model using simulated data. Our results show that when the CES rate differences between trait states are moderate, DAISIE shows negligible error for a variety of island diversity metrics. However, for large differences in speciation rates, we find large errors when reconstructing clade size variation and non-endemic species diversity through time. We conclude that for many biologically realistic scenarios with trait-dependent speciation and colonization, island diversity dynamics can be accurately estimated without the need to explicitly model trait dynamics. Nonetheless, our new simulation model may provide a useful tool for studying patterns of trait variation.
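To make the idea of state-dependent CES rates concrete, the following is a minimal Gillespie-style sketch in Python. It is not the DAISIE simulation model: it tracks only species counts per trait state on a single island and ignores phylogeny, anagenesis, diversity dependence, trait transitions and re-colonization. The function name `simulate_island`, the rate dictionary layout and the mainland pool sizes are all illustrative assumptions.

```python
import random


def simulate_island(total_time, rates, mainland_pool, seed=0):
    """Simulate species counts per trait state on one island (toy model).

    rates: state -> {"colonization", "speciation", "extinction"} per-lineage rates.
    mainland_pool: state -> number of mainland species in that state (assumed fixed).
    """
    rng = random.Random(seed)
    counts = {state: 0 for state in rates}
    t = 0.0
    while True:
        # Build the list of possible events with their current total rates.
        events = []
        for state, r in rates.items():
            events.append((r["colonization"] * mainland_pool[state], state, +1))
            events.append((r["speciation"] * counts[state], state, +1))
            events.append((r["extinction"] * counts[state], state, -1))
        total = sum(weight for weight, _, _ in events)
        if total == 0:
            break  # nothing can happen any more
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > total_time:
            break
        # Pick one event with probability proportional to its rate.
        pick = rng.uniform(0, total)
        acc = 0.0
        for weight, state, delta in events:
            acc += weight
            if weight > 0 and pick <= acc:
                counts[state] += delta
                break
    return counts


if __name__ == "__main__":
    # Hypothetical rates: state "A" speciates five times faster than state "B".
    example_rates = {
        "A": {"colonization": 0.01, "speciation": 0.5, "extinction": 0.3},
        "B": {"colonization": 0.01, "speciation": 0.1, "extinction": 0.3},
    }
    print(simulate_island(4.0, example_rates, {"A": 500, "B": 500}, seed=42))
```

Fitting a single-rate model to output like this, and comparing the recovered diversity metrics with the simulated ones, is the general shape of a robustness study; the actual study uses the full DAISIE likelihood rather than this toy.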

Author(s):  
Vadim Puller ◽  
Pavel Sagulenko ◽  
Richard A. Neher

Abstract Natural selection imposes a complex filter on which variants persist in a population, resulting in evolutionary patterns that vary greatly along the genome. Some sites evolve close to neutrally, while others are highly conserved, allow only specific states or only change in concert with other sites. Most commonly used evolutionary models, however, ignore much of this complexity and at best account for variation in the rate at which different sites change. Here, we present an efficient algorithm to estimate more complex models that allow for site-specific preferences and explore the accuracy with which such models can be estimated from simulated data. We find that an iterative approximate maximum likelihood scheme uses information in the data efficiently and accurately estimates site-specific preferences from large data sets with moderately diverged sequences. Ignoring site-specific preferences during estimation of branch length of phylogenetic trees – an assumption of most phylogeny software – results in substantial underestimation comparable to the error incurred when ignoring rate variation. However, the joint estimation of branch lengths, site-specific rates, and site-specific preferences can suffer from identifiability problems and is typically unable to recover the correct branch lengths. Site-specific preferences estimated from large HIV pol alignments show qualitative concordance with intra-host estimates of fitness costs. Analysis of site-specific HIV substitution models suggests near saturation of divergence after a few hundred years. Such saturation can explain the inability to infer deep divergence times of HIV and SIVs using molecular clock approaches and time-dependent rate estimates.


2019 ◽  
Author(s):  
Benoit Morel ◽  
Alexey M. Kozlov ◽  
Alexandros Stamatakis ◽  
Gergely J. Szöllősi

Abstract Inferring phylogenetic trees for individual homologous gene families is difficult because alignments are often too short, and thus contain insufficient signal, while substitution models inevitably fail to capture the complexity of the evolutionary processes. To overcome these challenges, species tree-aware methods also leverage information from a putative species tree. However, only a few methods are available that implement a full likelihood framework or account for horizontal gene transfers. Furthermore, these methods often require expensive data pre-processing (e.g., computing bootstrap trees), and rely on approximations and heuristics that limit the degree of tree space exploration. Here we present GeneRax, the first maximum likelihood species tree-aware phylogenetic inference software. It simultaneously accounts for substitutions at the sequence level as well as gene level events, such as duplication, transfer, and loss, relying on established maximum likelihood optimization algorithms. GeneRax can infer rooted phylogenetic trees for multiple gene families, directly from the per-gene sequence alignments and a rooted, yet undated, species tree. We show that compared to competing tools, on simulated data GeneRax infers trees that are the closest to the true tree in 90% of the simulations in terms of relative Robinson-Foulds distance. On empirical datasets, GeneRax is the fastest among all tested methods when starting from aligned sequences, and it infers trees with the highest likelihood score, based on our model. GeneRax completed tree inferences and reconciliations for 1099 Cyanobacteria families in eight minutes on 512 CPU cores. Thus, its parallelization scheme enables large-scale analyses. GeneRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax.
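The relative Robinson-Foulds distance used as the accuracy metric above can be illustrated with a short Python sketch. It assumes trees are given as nested tuples of leaf-name strings (an illustrative encoding, not GeneRax's input format), treats them as unrooted, and normalizes by 2(n - 3), the maximum distance between two fully resolved unrooted trees on n leaves. The function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed anchor leaf,
    so the same bipartition always maps to the same frozenset.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if anchor in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def relative_rf(tree_a, tree_b):
    """Robinson-Foulds distance divided by its maximum, 2 * (n - 3)."""
    leaves = leaf_set(tree_a)
    if leaves != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    max_rf = 2 * (len(leaves) - 3)
    if max_rf <= 0:
        return 0.0  # fewer than four leaves: only one unrooted topology
    return len(splits(tree_a) ^ splits(tree_b)) / max_rf


if __name__ == "__main__":
    inferred = ((("A", "B"), "C"), ("D", "E"))
    true_tree = ((("A", "C"), "B"), ("D", "E"))
    print(relative_rf(inferred, true_tree))
```

A value of 0 means the two topologies are identical and 1 means they share no non-trivial split.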


2019 ◽  
Author(s):  
Qiqing Tao ◽  
Koichiro Tamura ◽  
Beatriz Mello ◽  
Sudhir Kumar

Abstract Confidence intervals (CIs) depict the statistical uncertainty surrounding evolutionary divergence time estimates. They capture variance contributed by the finite number of sequences and sites used in the alignment, deviations of evolutionary rates from a strict molecular clock in a phylogeny, and uncertainty associated with clock calibrations. Reliable tests of biological hypotheses demand reliable CIs. However, current non-Bayesian methods may produce unreliable CIs because they do not properly incorporate rate variation among lineages and interactions among clock calibrations. Here, we present a new analytical method to calculate CIs of divergence times estimated using the RelTime method, along with an approach to utilize multiple calibration uncertainty densities in these analyses. Empirical data analyses showed that the new methods produce CIs that overlap with Bayesian highest posterior density (HPD) intervals. In the analysis of computer-simulated data, we found that RelTime CIs show excellent average coverage probabilities, i.e., the true time is contained within the CIs with a 95% probability. These developments will encourage broader use of the computationally efficient RelTime approach in molecular dating analyses and biological hypothesis testing.
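Coverage probability, the criterion used above to judge the CIs, can be estimated by simulation: generate many data sets with a known true value, build an interval from each, and count how often the interval contains the truth. The Python sketch below does this for the mean of a normal sample with known standard deviation. It is a generic illustration and not the RelTime procedure; the function name and all default values are assumptions.

```python
import math
import random


def ci_coverage(true_value=10.0, sigma=2.0, n=25, replicates=2000, seed=0):
    """Fraction of simulated 95% CIs for a normal mean that contain the truth."""
    rng = random.Random(seed)
    # Half-width of a 95% interval for the mean when sigma is known.
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(replicates):
        sample_mean = sum(rng.gauss(true_value, sigma) for _ in range(n)) / n
        if abs(sample_mean - true_value) <= half_width:
            hits += 1
    return hits / replicates


if __name__ == "__main__":
    print(ci_coverage())
```

A well-calibrated 95% interval should give a value close to 0.95; values well below that indicate intervals that are too narrow.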


2019 ◽  
Vol 69 (3) ◽  
pp. 530-544 ◽  
Author(s):  
Michael R May ◽  
Brian R Moore

Abstract Understanding how and why rates of character evolution vary across the Tree of Life is central to many evolutionary questions; for example, does the trophic apparatus (a set of continuous characters) evolve at a higher rate in fish lineages that dwell in reef versus nonreef habitats (a discrete character)? Existing approaches for inferring the relationship between a discrete character and rates of continuous-character evolution rely on comparing a null model (in which rates of continuous-character evolution are constant across lineages) to an alternative model (in which rates of continuous-character evolution depend on the state of the discrete character under consideration). However, these approaches are susceptible to a “straw-man” effect: the influence of the discrete character is inflated because the null model is extremely unrealistic. Here, we describe MuSSCRat, a Bayesian approach for inferring the impact of a discrete trait on rates of continuous-character evolution in the presence of alternative sources of rate variation (“background-rate variation”). We demonstrate by simulation that our method is able to reliably infer the degree of state-dependent rate variation, and show that ignoring background-rate variation leads to biased inferences regarding the degree of state-dependent rate variation in grunts (the fish group Haemulidae). [Bayesian phylogenetic comparative methods; continuous-character evolution; data augmentation; discrete-character evolution.]


2020 ◽  
Vol 30 (1) ◽  
pp. 247-261
Author(s):  
Madli Jõks ◽  
Holger Kreft ◽  
Patrick Weigelt ◽  
Meelis Pärtel

2010 ◽  
Vol 277 (1700) ◽  
pp. 3587-3592 ◽  
Author(s):  
Soo Hyung Eo ◽  
J. Andrew DeWoody

Rates of biological diversification should ultimately correspond to rates of genome evolution. Recent studies have compared diversification rates with phylogenetic branch lengths, but incomplete phylogenies hamper such analyses for many taxa. Herein, we use pairwise comparisons of confamilial sauropsid (bird and reptile) mitochondrial DNA (mtDNA) genome sequences to estimate substitution rates. These molecular evolutionary rates are considered in light of the age and species richness of each taxonomic family, using a random-walk speciation–extinction process to estimate rates of diversification. We find the molecular clock ticks at disparate rates in different families and at different genes. For example, evolutionary rates are relatively fast in snakes and lizards, intermediate in crocodilians and slow in turtles and birds. There was also rate variation across genes, where non-synonymous substitution rates were fastest at ATP8 and slowest at CO3. Family-by-gene interactions were significant, indicating that local clocks vary substantially among sauropsids. Most importantly, we find evidence that mitochondrial genome evolutionary rates are positively correlated with speciation rates and with contemporary species richness. Nuclear sequences are poorly represented among reptiles, but the correlation between rates of molecular evolution and species diversification also extends to 18 avian nuclear genes we tested. Thus, the nuclear data buttress our mtDNA findings.


Author(s):  
Shengda Wang ◽  
Kourosh Danai

A method of helicopter track and balance is introduced that uses a forward-model to search for the appropriate blade modifications. This method uses an interval model to represent the ranges of effects of blade modifications on helicopter vibration, instead of exact values, in order to cope with the stochastic nature of aircraft vibration. The coefficients of the interval model are initially defined according to sensitivity coefficients between the blade modifications and helicopter vibration, but they are subsequently updated after each tuning iteration to improve the model’s estimation accuracy. The effectiveness of the proposed method is demonstrated through a simulation model that represents experimental vibration measurements of Black Hawk helicopters.


2009 ◽  
Vol 27 (7) ◽  
pp. 2799-2811 ◽  
Author(s):  
I. I. Virtanen ◽  
J. Vierinen ◽  
M. S. Lehtinen

Abstract. Both ionospheric and weather radar communities have already adopted the method of transmitting radar pulses in an aperiodic manner when measuring moderately overspread targets. Among the users of the ionospheric radars, this method is called Aperiodic Transmitter Coding (ATC), whereas the weather radar users have adopted the term Simultaneous Multiple Pulse-Repetition Frequency (SMPRF). When probing the ionosphere at the carrier frequencies of the EISCAT Incoherent Scatter Radar facilities, the range extent of the detectable target is typically of the order of one thousand kilometers – about seven milliseconds – whereas the characteristic correlation time of the scattered signal varies from a few milliseconds in the D-region to only tens of microseconds in the F-region. If one is interested in estimating the scattering autocorrelation function (ACF) at time lags shorter than the F-region correlation time, the D-region must be considered as a moderately overspread target, whereas the F-region is a severely overspread one. Given the technical restrictions of the radar hardware, a combination of ATC and phase-coded long pulses is advantageous for this kind of target. We evaluate such an experiment under infinitely low signal-to-noise ratio (SNR) conditions using lag profile inversion. In addition, a qualitative evaluation under high-SNR conditions is performed by analysing simulated data. The results show that an acceptable estimation accuracy and a very good lag resolution in the D-region can be achieved with a pulse length long enough for simultaneous E- and F-region measurements with a reasonable lag extent. The new experiment design is tested with the EISCAT Tromsø VHF (224 MHz) radar. An example of a full D/E/F-region ACF from the test run is shown at the end of the paper.


2020 ◽  
Vol 37 (9) ◽  
pp. 2763-2774 ◽  
Author(s):  
Benoit Morel ◽  
Alexey M Kozlov ◽  
Alexandros Stamatakis ◽  
Gergely J Szöllősi

Abstract Inferring phylogenetic trees for individual homologous gene families is difficult because alignments are often too short, and thus contain insufficient signal, while substitution models inevitably fail to capture the complexity of the evolutionary processes. To overcome these challenges, species-tree-aware methods also leverage information from a putative species tree. However, only a few methods are available that implement a full likelihood framework or account for horizontal gene transfers. Furthermore, these methods often require expensive data preprocessing (e.g., computing bootstrap trees) and rely on approximations and heuristics that limit the degree of tree space exploration. Here, we present GeneRax, the first maximum likelihood species-tree-aware phylogenetic inference software. It simultaneously accounts for substitutions at the sequence level as well as gene level events, such as duplication, transfer, and loss, relying on established maximum likelihood optimization algorithms. GeneRax can infer rooted phylogenetic trees for multiple gene families, directly from the per-gene sequence alignments and a rooted, yet undated, species tree. We show that compared with competing tools, on simulated data GeneRax infers trees that are the closest to the true tree in 90% of the simulations in terms of relative Robinson–Foulds distance. On empirical data sets, GeneRax is the fastest among all tested methods when starting from aligned sequences, and it infers trees with the highest likelihood score, based on our model. GeneRax completed tree inferences and reconciliations for 1,099 Cyanobacteria families in 8 min on 512 CPU cores. Thus, its parallelization scheme enables large-scale analyses. GeneRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax (last accessed June 17, 2020).


2011 ◽  
Vol 38 (3) ◽  
pp. 305-318 ◽  
Author(s):  
Mohamed El Esawey ◽  
Tarek Sayed

Travel time is a simple and robust network performance measure that is well understood by the public. However, travel time data collection can be costly especially if the analysis area is large. This research proposes a solution to the problem of limited network sensor coverage caused by insufficient sample size of probe vehicles or inadequate numbers of fixed sensors. Within a homogeneous road network, nearby links of similar character are exposed to comparable traffic conditions, and therefore, their travel times are likely to be positively correlated. This correlation can be useful in developing travel time relationships between nearby links so that if data becomes available on a subset of these links, travel times of their neighbours can be estimated. A methodology is proposed to estimate link travel times using available data from neighbouring links. To test the proposed methodology, a case study was undertaken using a VISSIM micro-simulation model of downtown Vancouver. The simulation model was calibrated and validated using field traffic volumes and travel time data. Neighbour links travel time estimation accuracy was assessed using different error measurements and the results were satisfactory. Overall, the results of this research demonstrate the feasibility of using neighbour links data as an additional source of information to estimate travel time, especially in case of limited coverage.

