Distinguishing gene flow between malaria parasite populations

2021 ◽  
Author(s):  
Tyler Steven Brown ◽  
Aimee R. Taylor ◽  
Olufunmilayo Arogbokun ◽  
Caroline O. Buckee ◽  
Hsiao-Han Chang

Measuring gene flow between malaria parasite populations in different geographic locations can provide strategic information for malaria control interventions. Multiple important questions pertaining to the design of such studies remain unanswered, limiting efforts to operationalize genomic surveillance tools for routine public health use. This report evaluates numerically the ability to distinguish different levels of gene flow between malaria populations, using different amounts of real and simulated data, where data are simulated using parameters that approximate different epidemiological conditions. Specifically, using Plasmodium falciparum whole genome sequence data and sequence data simulated for a metapopulation with different migration rates and effective population sizes, we compare two estimators of gene flow, explore the number of genetic markers and number of individuals required to reliably rank highly connected locations, and describe how these thresholds change given different effective population sizes and migration rates. Our results have implications for the design and implementation of malaria genomic surveillance efforts.

PLoS Genetics ◽  
2021 ◽  
Vol 17 (12) ◽  
pp. e1009335
Author(s):  
Tyler S. Brown ◽  
Olufunmilayo Arogbokun ◽  
Caroline O. Buckee ◽  
Hsiao-Han Chang

Measuring gene flow between malaria parasite populations in different geographic locations can provide strategic information for malaria control interventions. Multiple important questions pertaining to the design of such studies remain unanswered, limiting efforts to operationalize genomic surveillance tools for routine public health use. This report examines the use of population-level summaries of genetic divergence (FST) and relatedness (identity-by-descent) to distinguish levels of gene flow between malaria populations, focused on field-relevant questions about data size, sampling, and interpretability of observations from genomic surveillance studies. To do this, we use P. falciparum whole genome sequence data and simulated sequence data approximating malaria populations evolving under different current and historical epidemiological conditions. We employ mobile-phone associated mobility data to estimate parasite migration rates over different spatial scales and use this to inform our analysis. This analysis underscores the complementary nature of divergence- and relatedness-based metrics for distinguishing gene flow over different temporal and spatial scales and characterizes the data requirements for using these metrics in different contexts. Our results have implications for the design and implementation of malaria genomic surveillance studies.
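As a rough illustration of the divergence-based summary mentioned above, the following Python sketch computes a simple ratio-of-averages FST between two populations from per-locus allele frequencies at biallelic sites. The abstract does not say which FST estimator the authors used, so the function name `fst_two_populations`, the equal weighting of the two populations, and the absence of any sample-size correction are all assumptions made for illustration.

```python
def fst_two_populations(p1, p2):
    """Ratio-of-averages FST = (HT - HS) / HT for two equally weighted populations.

    p1, p2: per-locus frequencies of one allele at biallelic loci (hypothetical inputs).
    No correction for finite sample size is applied.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must cover the same loci")
    hs_sum = 0.0  # summed mean within-population expected heterozygosity
    ht_sum = 0.0  # summed expected heterozygosity of the pooled population
    for a, b in zip(p1, p2):
        hs_sum += (2 * a * (1 - a) + 2 * b * (1 - b)) / 2
        pbar = (a + b) / 2
        ht_sum += 2 * pbar * (1 - pbar)
    if ht_sum == 0:
        return 0.0  # no variation at any locus, so divergence is undefined; report 0
    return (ht_sum - hs_sum) / ht_sum


if __name__ == "__main__":
    print(fst_two_populations([0.2, 0.5], [0.8, 0.5]))
```

Summing the numerator and denominator across loci before dividing, rather than averaging per-locus ratios, keeps low-diversity loci from dominating the estimate.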


2019 ◽  
Author(s):  
Arun Sethuraman ◽  
Melissa Lynch

Abstract Unsampled or extinct ‘ghost’ populations leave signatures on the genomes of individuals from extant, sampled populations, especially if they have exchanged genes with them over evolutionary time. This gene flow from ‘ghost’ populations can introduce biases when estimating evolutionary history from genomic data, often leading to data misinterpretation and ambiguous results. Here we assess these biases while accounting, or not accounting, for gene flow from ‘ghost’ populations under the Isolation with Migration (IM) model. We perform extensive simulations under five scenarios, ranging from no gene flow (Scenario A) to extensive gene flow to and from an unsampled ‘ghost’ population (Scenarios B, C, D, and E). Estimates of evolutionary history across all scenarios A-E (effective population sizes, divergence times, and migration rates) indicate consistent (a) under-estimation of divergence times between sampled populations, (b) over-estimation of effective population sizes of sampled populations, and (c) under-estimation of migration rates between sampled populations, with increased gene flow from the unsampled ‘ghost’ population. Without accounting for an unsampled ‘ghost’, summary statistics like FST are under-estimated, and π is over-estimated with increased gene flow from the ‘ghost’. To show this persistent issue in empirical data, we use a 355 locus dataset from African Hunter-Gatherer populations and discuss similar biases in estimating evolutionary history while not accounting for unsampled ‘ghosts’. Considering the large effects of gene flow from these ‘ghosts’, we propose a multi-pronged approach to account for the presence of unsampled ‘ghost’ populations in population genomics studies to reduce erroneous inferences.


Author(s):  
Iago Maceda ◽  
Miguel Martín Álvarez ◽  
Georgios Athanasiadis ◽  
Raúl Tonda ◽  
Jordi Camps ◽  
...  

Abstract The area of the Spanish Pyrenees is particularly interesting for studying the demographic dynamics of European rural areas given its orography, the mainly traditional rural condition of its population and the reported higher patterns of consanguinity of the region. Previous genetic studies suggest a gradient of genetic continuity of the area along the west-to-east axis. However, it has been shown that micro-population substructure can be detected when considering high-quality NGS data and using spatially explicit methods. In this work, we have analyzed the genomes of 30 individuals sequenced at 40× from five different valleys in the Spanish Eastern Pyrenees (SEP) separated by less than 140 km along a west-to-east axis. Using haplotype-based methods and spatial analyses, we have been able to detect micro-population substructure within SEP not seen in previous studies. Linkage disequilibrium and autozygosity analyses suggest that the SEP populations show diverse demographic histories. In agreement with these results, demographic modeling by means of ABC-DL identifies heterogeneity in their effective population sizes despite their close geographic proximity, and suggests that the population substructure within SEP could have appeared around 2500 years ago. Overall, these results suggest that each rural population of the Pyrenees could represent a unique entity.


2019 ◽  
Vol 5 (2) ◽  
Author(s):  
Nicola F Müller ◽  
Gytis Dudas ◽  
Tanja Stadler

Abstract Population dynamics can be inferred from genetic sequence data by using phylodynamic methods. These methods typically quantify the dynamics in unstructured populations or assume migration rates and effective population sizes to be constant through time in structured populations. When considering rates to vary through time in structured populations, the number of parameters to infer increases rapidly and the available data might not be sufficient to inform these. Additionally, it is often of interest to know what predicts these parameters rather than knowing the parameters themselves. Here, we introduce a method to infer the predictors for time-varying migration rates and effective population sizes by using a generalized linear model (GLM) approach under the marginal approximation of the structured coalescent. Using simulations, we show that our approach is able to reliably infer the model parameters and their predictors from phylogenetic trees. Furthermore, when simulating trees under the structured coalescent, we show that our new approach outperforms the discrete trait GLM model. We then apply our framework to a previously described Ebola virus dataset, where we infer the parameters and their predictors from genome sequences while accounting for phylogenetic uncertainty. We infer weekly cases to be the strongest predictor for effective population size and geographic distance the strongest predictor for migration. This approach is implemented as part of the BEAST2 package MASCOT, which allows us to jointly infer population dynamics, i.e. the parameters and predictors, within structured populations, the phylogenetic tree, and evolutionary parameters.


Genetics ◽  
2001 ◽  
Vol 157 (2) ◽  
pp. 743-750
Author(s):  
Charles Taylor ◽  
Yeya T Touré ◽  
John Carnahan ◽  
Douglas E Norris ◽  
Guimogo Dolo ◽  
...  

Abstract The population structure of the Anopheles gambiae complex is unusual, with several sibling species often occupying a single area and, in one of these species, An. gambiae sensu stricto, as many as three “chromosomal forms” occurring together. The chromosomal forms are thought to be intermediate between populations and species, distinguishable by patterns of chromosome gene arrangements. The extent of reproductive isolation among these forms has been debated. To better characterize this structure we measured effective population size, Ne, and migration rates, m, or their product by both direct and indirect means. Gene flow among villages within each chromosomal form was found to be large (Nem > 40), was intermediate between chromosomal forms (Nem ≈ 3–30), and was low between species (Nem ≈ 0.17–1.3). A recently developed means for distinguishing among certain of the forms using PCR indicated rates of gene flow consistent with those observed using the other genetic markers.
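For orientation, the Nem values quoted above can be related to FST under Wright's infinite-island model for a diploid species, where FST ≈ 1/(4Nem + 1) at migration-drift equilibrium. The abstract does not state which indirect estimator the authors used, so the Python sketch below is only an assumed illustration of that textbook relationship; the function names `nem_from_fst` and `fst_from_nem` are hypothetical.

```python
def fst_from_nem(nem):
    """Equilibrium FST under Wright's island model (diploid): 1 / (4*Ne*m + 1)."""
    if nem < 0:
        raise ValueError("Ne*m must be non-negative")
    return 1.0 / (4.0 * nem + 1.0)


def nem_from_fst(fst):
    """Invert the island-model relationship: Ne*m = (1/FST - 1) / 4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Nem values taken from the ranges quoted in the abstract above.
    for nem in (40, 30, 3, 1.3, 0.17):
        print(f"Nem = {nem:>5}: FST ~ {fst_from_nem(nem):.3f}")
```

Under this approximation, Nem > 40 corresponds to FST below about 0.006, while Nem ≈ 0.17 corresponds to FST of roughly 0.6. The island model assumes equal-sized demes, symmetric migration, and equilibrium, which real mosquito populations may not satisfy.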


2018 ◽  
Author(s):  
Nicola F. Müller ◽  
Gytis Dudas ◽  
Tanja Stadler

Abstract Population dynamics can be inferred from genetic sequence data using phylodynamic methods. These methods typically quantify the dynamics in unstructured populations or assume the parameters describing the dynamics to be constant through time in structured populations. Inference methods that allow for structured populations and time-varying parameters involve many parameters that have to be inferred. Each of these parameters, however, might be only weakly informed by data. Here we introduce an approach that uses so-called predictors, such as geographic distance between locations, within a generalized linear model to inform the population dynamic parameters, namely the time-varying migration rates and effective population sizes under the marginal approximation of the structured coalescent. By using simulations, we show that we are able to reliably infer the parameters from phylogenetic trees. We then apply this framework to a previously described Ebola virus dataset. We infer incidence to be the strongest predictor for effective population size and geographic distance the strongest predictor for migration. This allows us to show not only on simulated data, but also on real data, that we are able to identify reasonable predictors. Overall, we provide a novel method that allows identifying predictors for migration rates and effective population sizes and using these predictors to quantify migration rates and effective population sizes. Its implementation as part of the BEAST2 software package MASCOT allows joint inference of population dynamics within structured populations, the phylogenetic tree, and evolutionary parameters.


Genetics ◽  
2003 ◽  
Vol 163 (1) ◽  
pp. 429-446 ◽  
Author(s):  
Jinliang Wang ◽  
Michael C Whitlock

Abstract In the past, moment and likelihood methods have been developed to estimate the effective population size (Ne) on the basis of the observed changes of marker allele frequencies over time, and these have been applied to a large variety of species and populations. Such methods invariably make the critical assumption of a single isolated population receiving no immigrants over the study interval. For most populations in the real world, however, migration is not negligible and can substantially bias estimates of Ne if it is not accounted for. Here we extend previous moment and maximum-likelihood methods to allow the joint estimation of Ne and migration rate (m) using genetic samples over space and time. It is shown that, compared to genetic drift acting alone, migration results in changes in allele frequency that are greater in the short term and smaller in the long term, leading to under- and overestimation of Ne, respectively, if it is ignored. Extensive simulations are run to evaluate the newly developed moment and likelihood methods, which yield generally satisfactory estimates of both Ne and m for populations with widely different effective sizes and migration rates and patterns, given a reasonably large sample size and number of markers.
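To make the starting point concrete, the Python sketch below implements the classical temporal moment estimator for a single isolated population, that is, the kind of method the abstract says it extends, not the joint Ne and m estimator itself. It assumes the Nei and Tajima standardized variance Fc and the common correction for sampling noise, Ne ≈ t / (2 [Fc − 1/(2 S0) − 1/(2 St)]), with S0 and St the numbers of diploid individuals sampled at the two time points. The function names and example frequencies are hypothetical.

```python
def temporal_fc(x, y):
    """Nei-Tajima standardized variance in allele frequency change.

    x, y: frequencies of the same alleles at the first and second time point.
    Alleles fixed or absent at both time points carry no information and are skipped.
    """
    terms = []
    for xi, yi in zip(x, y):
        denom = (xi + yi) / 2 - xi * yi
        if denom > 0:
            terms.append((xi - yi) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles")
    return sum(terms) / len(terms)


def ne_temporal(x, y, generations, s0, st):
    """Moment estimate of Ne for an isolated population (migration ignored)."""
    drift = temporal_fc(x, y) - 1 / (2 * s0) - 1 / (2 * st)
    if drift <= 0:
        # Observed change is no larger than expected from sampling alone.
        return float("inf")
    return generations / (2 * drift)


if __name__ == "__main__":
    # One biallelic locus, both alleles listed; frequencies are made up.
    print(ne_temporal([0.5, 0.5], [0.6, 0.4], generations=10, s0=100, st=100))
```

Because this estimator attributes all frequency change to drift, immigration that inflates short-term change would push the estimate downward, which is the direction of bias the abstract describes.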


2019 ◽  
Vol 110 (5) ◽  
pp. 587-600
Author(s):  
A Millie Burrell ◽  
Jeffrey H R Goddard ◽  
Paul J Greer ◽  
Ryan J Williams ◽  
Alan E Pepper

Abstract Globally, a small number of plants have adapted to terrestrial outcroppings of serpentine geology, which are characterized by soils with low levels of essential mineral nutrients (N, P, K, Ca, Mo) and toxic levels of heavy metals (Ni, Cr, Co). Paradoxically, many of these plants are restricted to this harsh environment. Caulanthus amplexicaulis var. barbarae (Brassicaceae) is a rare annual plant that is strictly endemic to a small set of isolated serpentine outcrops in the coastal mountains of central California. The goals of the work presented here were to 1) determine the patterns of genetic connectivity among all known populations of C. amplexicaulis var. barbarae, and 2) estimate contemporary effective population sizes (Ne), to inform ongoing genomic analyses of the evolutionary history of this taxon, and to provide a foundation upon which to model its future evolutionary potential and long-term viability in a changing environment. Eleven populations of this taxon were sampled, and population-genetic parameters were estimated using 11 nuclear microsatellite markers. Contemporary effective population sizes were estimated using multiple methods and found to be strikingly small (typically Ne < 10). Further, our data showed that a substantial component of genetic connectivity of this taxon is not at equilibrium, and instead showed sporadic gene flow. Several lines of evidence indicate that gene flow between isolated populations is maintained through long-distance seed dispersal (e.g., >1 km), possibly via zoochory.


1997 ◽  
Vol 69 (2) ◽  
pp. 111-116 ◽  
Author(s):  
ZIHENG YANG

The theory developed by Takahata and colleagues for estimating the effective population size of ancestral species using homologous sequences from closely related extant species was extended to take account of variation of evolutionary rates among loci. Nuclear sequence data related to the evolution of modern humans were reanalysed and computer simulations were performed to examine the effect of rate variation on estimation of ancestral population sizes. It is found that the among-locus rate variation does not have a significant effect on estimation of the current population size when sequences from multiple loci are sampled from the same species, but does have a significant effect on estimation of the ancestral population size using sequences from different species. The effects of ancestral population size, species divergence time and among-locus rate variation are found to be highly correlated, and to achieve reliable estimates of the ancestral population size, effects of the other two factors should be estimated independently.

