Accounting for spatial sampling patterns in Bayesian phylogeography

2021 ◽  
Vol 118 (52) ◽  
pp. e2105273118
Author(s):  
Stéphane Guindon ◽  
Nicola De Maio

Statistical phylogeography provides useful tools to characterize and quantify the spread of organisms during the course of evolution. Analyzing georeferenced genetic data often relies on the assumption that samples are preferentially collected in densely populated areas of the habitat. Deviation from this assumption negatively impacts the inference of the spatial and demographic dynamics. This issue is pervasive in phylogeography. It affects analyses that approximate the habitat as a set of discrete demes as well as those that treat it as a continuum. The present study introduces a Bayesian modeling approach that explicitly accommodates spatial sampling strategies. An original inference technique, based on recent advances in statistical computing, is then described that is most suited to modeling data where sequences are preferentially collected at certain locations, independently of the outcome of the evolutionary process. The analysis of georeferenced genetic sequences from the West Nile virus in North America, along with simulated data, shows how assumptions about spatial sampling may impact our understanding of the forces shaping biodiversity across time and space.




2021 ◽  
Vol 19 (2) ◽  
pp. 1128-1153
Author(s):  
Junhua Ku ◽  
Shuijia Li ◽  
Wenyin Gong ◽  

The accuracy of unknown parameters determines the accuracy of photovoltaic (PV) models, which occupy an important position in PV power generation systems. Owing to the complexity of the equivalent equations of PV models, estimating their parameters remains an arduous task. To estimate the unknown parameters of PV models accurately and reliably, this paper proposes an enhanced Rao-1 (ERao-1) algorithm. The enhancements are threefold: i) a repaired evolution operator is presented; ii) a new evolution operator is developed to prevent the Rao-1 algorithm from falling into a local optimum; iii) a linear population size reduction strategy is employed so that the population size changes adaptively with the evolutionary process. To verify the validity of the ERao-1 algorithm, we conduct a study on parameter estimation for three different PV models. Experimental results show that the proposed ERao-1 algorithm outperforms existing parameter estimation algorithms in accuracy and reliability, especially for the double diode model (RMSE 9.8248E-04) and the three diode model (RMSE 9.8257E-04) of the R.T.C France silicon cell, and for the three diode model of the Photowatt-PWP201 cell (RMSE 2.4251E-03). In addition, the fitted curves of the simulated and measured data also demonstrate the accuracy of the estimated parameters.
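The two quantitative ingredients named in this abstract, a population size that shrinks linearly over the run and the RMSE used to score a fit, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of the fraction of consumed function evaluations as the progress measure, and the rounding rule are all assumptions made for illustration.

```python
import math


def linear_population_size(evals_used, max_evals, n_init, n_min):
    """Population size shrinking linearly from n_init to n_min.

    Assumption: progress is measured as the fraction of the evaluation
    budget already consumed; the paper may use a different schedule.
    """
    if max_evals <= 0:
        raise ValueError("max_evals must be positive")
    # Clamp progress to [0, 1] so the size never leaves [n_min, n_init].
    progress = min(max(evals_used / max_evals, 0.0), 1.0)
    return round(n_init - (n_init - n_min) * progress)


def rmse(measured, simulated):
    """Root mean square error between measured and simulated values."""
    if len(measured) != len(simulated) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))
```

In an optimizer loop, one would typically call `linear_population_size` once per generation and discard the worst individuals until the population matches the returned size; that truncation rule is likewise an assumption rather than something stated in the abstract.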



2012 ◽  
Vol 9 (73) ◽  
pp. 1797-1808 ◽  
Author(s):  
Eric de Silva ◽  
Neil M. Ferguson ◽  
Christophe Fraser

Using sequence data to infer population dynamics is playing an increasing role in the analysis of outbreaks. The most common methods, based on coalescent inference, have been widely used but not extensively tested against simulated epidemics. Here, we use simulated data to test the ability of both parametric and non-parametric methods for inference of effective population size (coded in the popular BEAST package) to reconstruct epidemic dynamics. We consider a range of simulations centred on scenarios considered plausible for pandemic influenza, but our conclusions are generic for any exponentially growing epidemic. We highlight systematic biases in non-parametric effective population size estimation. The most prominent such bias leads to the false inference of slowing of epidemic spread in the recent past even when the real epidemic is growing exponentially. We suggest some sampling strategies that could reduce (but not eliminate) some of the biases. Parametric methods can correct for these biases if the infected population size is large. We also explore how some poor sampling strategies (e.g. those that over-represent epidemiologically linked clusters of cases) could dramatically exacerbate bias in an uncontrolled manner. Finally, we present a simple diagnostic indicator, based on coalescent density, which can easily be applied to reconstructed phylogenies and identifies time periods for which effective population size estimates are less likely to be biased. We illustrate this with an application to the 2009 H1N1 pandemic.







2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i684-i691
Author(s):  
Sarah Christensen ◽  
Juho Kim ◽  
Nicholas Chia ◽  
Oluwasanmi Koyejo ◽  
Mohammed El-Kebir

Abstract
Motivation: While each cancer is the result of an isolated evolutionary process, there are repeated patterns in tumorigenesis defined by recurrent driver mutations and their temporal ordering. Such repeated evolutionary trajectories hold the potential to improve stratification of cancer patients into subtypes with distinct survival and therapy response profiles. However, current cancer phylogeny methods infer large solution spaces of plausible evolutionary histories from the same sequencing data, obfuscating repeated evolutionary patterns.
Results: To simultaneously resolve ambiguities in sequencing data and identify cancer subtypes, we propose to leverage common patterns of evolution found in patient cohorts. We first formulate the Multiple Choice Consensus Tree problem, which seeks to select a tumor tree for each patient and assign patients into clusters in such a way that maximizes consistency within each cluster of patient trees. We prove that this problem is NP-hard and develop a heuristic algorithm, Revealing Evolutionary Consensus Across Patients (RECAP), to solve this problem in practice. Finally, on simulated data, we show RECAP outperforms existing methods that do not account for patient subtypes. We then use RECAP to resolve ambiguities in patient trees and find repeated evolutionary trajectories in lung and breast cancer cohorts.
Availability and implementation: https://github.com/elkebir-group/RECAP
Supplementary information: Supplementary data are available at Bioinformatics online.



2014 ◽  
Vol 7 (7) ◽  
pp. 2313-2335 ◽  
Author(s):  
P. R. Colarco ◽  
R. A. Kahn ◽  
L. A. Remer ◽  
R. C. Levy

Abstract. We use the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite aerosol optical thickness (AOT) product to assess the impact of reduced swath width on global and regional AOT statistics and trends. Along-track and across-track sampling strategies are employed, in which the full MODIS data set is sub-sampled with various narrow-swath (~400–800 km) and single pixel width (~10 km) configurations. Although view-angle artifacts in the MODIS AOT retrieval confound direct comparisons between averages derived from different sub-samples, careful analysis shows that with many portions of the Earth essentially unobserved, spatial sampling introduces uncertainty in the derived seasonal–regional mean AOT. These AOT spatial sampling artifacts comprise up to 60% of the full-swath AOT value under moderate aerosol loading, and can be as large as 0.1 in some regions under high aerosol loading. Compared to full-swath observations, narrower swath and single pixel width sampling exhibits a reduced ability to detect AOT trends with statistical significance. On the other hand, estimates of the global, annual mean AOT do not vary significantly from the full-swath values as spatial sampling is reduced. Aggregation of the MODIS data at coarse grid scales (10°) shows consistency in the aerosol trends across sampling strategies, with increased statistical confidence, but quantitative errors in the derived trends are found even for the full-swath data when compared to high spatial resolution (0.5°) aggregations. Using results of a model-derived aerosol reanalysis, we find consistency in our conclusions about a seasonal–regional spatial sampling artifact in AOT. Furthermore, the model shows that reduced spatial sampling can amount to uncertainty in computed shortwave top-of-atmosphere aerosol radiative forcing of 2–3 W m⁻². These artifacts are lower bounds, as other sampling strategies not considered here might perform worse.
These results suggest that future aerosol satellite missions having significantly less than full-swath viewing are unlikely to sample the true AOT distribution well enough to obtain the statistics needed to reduce uncertainty in aerosol direct forcing of climate.



2018 ◽  
Vol 2 ◽  
Author(s):  
Jonas Bylemans ◽  
Dianne M. Gleeson ◽  
Mark Lintermans ◽  
Christopher M. Hardy ◽  
Matthew Beitzel ◽  
...  

Monitoring aquatic biodiversity through DNA extracted from environmental samples (eDNA) combined with high-throughput sequencing, commonly referred to as eDNA metabarcoding, is increasing in popularity within the scientific community. However, sampling strategies, laboratory protocols and analytical pipelines can influence the results of eDNA metabarcoding surveys. While the impact of laboratory protocols and analytical pipelines has been extensively studied, the importance of sampling strategies for eDNA metabarcoding surveys has not received the same attention. To avoid underestimating local biodiversity, adequate sampling strategies (i.e. sampling intensity and spatial sampling replication) need to be implemented. This study evaluated the impact of sampling strategies along an altitudinal and biodiversity gradient in the upper section of the Murrumbidgee River (Murray-Darling Basin, Australia). An eDNA metabarcoding survey was used to determine the local fish biodiversity and evaluate the influence of sampling intensity and spatial sampling replication on the biodiversity estimates. The results show that optimal eDNA sampling strategies varied between sites and indicate that river morphology, species richness and species abundance affect the optimal sampling intensity and spatial sampling replication needed to accurately assess the fish biodiversity. While the generality of the patterns will need to be confirmed through future studies, these findings provide a basis to guide future eDNA metabarcoding surveys in river systems.



2021 ◽  
Author(s):  
T. Latrille ◽  
V. Lanore ◽  
N. Lartillot

Abstract. Mutation-selection phylogenetic codon models are grounded in population genetics first principles and represent a principled approach for investigating the intricate interplay between mutation, selection and drift. In their current form, mutation-selection codon models are entirely characterized by the collection of site-specific amino-acid fitness profiles. However, thus far, they have relied on the assumption of a constant genetic drift, translating into a unique effective population size (Ne) across the phylogeny, clearly an unreasonable hypothesis. This assumption can be alleviated by introducing variation in Ne between lineages. In addition to Ne, the mutation rate (μ) is liable to vary between lineages, and both should co-vary with life-history traits (LHTs). This suggests that the model should more globally account for the joint evolutionary process followed by all of these lineage-specific variables (Ne, μ, and LHTs). In this direction, we introduce an extended mutation-selection model jointly reconstructing in a Bayesian Monte Carlo framework the fitness landscape across sites and long-term trends in Ne, μ and LHTs along the phylogeny, from an alignment of DNA coding sequences and a matrix of observed LHTs in extant species. The model was tested against simulated data and applied to empirical data in mammals, isopods and primates. The reconstructed history of Ne in these groups appears to correlate with LHTs or ecological variables in a way that suggests that the reconstruction is reasonable, at least in its global trends. On the other hand, the range of variation in Ne inferred across species is surprisingly narrow. This last point suggests that some of the assumptions of the model, in particular concerning the assumed absence of epistatic interactions between sites, are potentially problematic.


