Blockwise Site Frequency Spectra for Inferring Complex Population Histories and Recombination

2016 ◽  
Author(s):  
Champak R. Beeravolu ◽  
Michael J. Hickerson ◽  
Laurent A.F. Frantz ◽  
Konrad Lohse

Abstract We introduce ABLE (Approximate Blockwise Likelihood Estimation), a novel composite likelihood framework based on a recently introduced summary of sequence variation: the blockwise site frequency spectrum (bSFS). This simulation-based framework uses the frequencies of bSFS configurations to jointly model demographic history and recombination and is explicitly designed to make inferences using multiple whole genomes or genome-wide multi-locus data (e.g. RADSeq), catering to the needs of researchers studying model or non-model organisms, respectively. The flexible nature of our method further allows for arbitrarily complex population histories using unphased and unpolarized whole genome sequences. In silico experiments demonstrate accurate parameter estimates across a range of divergence models with increasing complexity, and as a proof of principle, we infer the demographic history of the two species of orangutan from multiple genome sequences (over 160 Mbp in length) from each species. Our results indicate that the two orangutan species split approximately 650-950 thousand years ago but experienced a pulse of secondary contact much more recently, most likely during a period of low sea level in South East Asia (∼300,000 years ago). Unlike previous analyses, we can reject a history of continuous gene flow and co-estimate genome-wide recombination. ABLE is available for download at https://github.com/champost/ABLE.
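To make the bSFS summary in the abstract above concrete, here is a minimal sketch of how blockwise configurations could be tallied. It is not ABLE's implementation: the function name `blockwise_sfs`, the 0/1 haplotype-matrix input, the fixed block length in sites, and the use of a folded (minor-allele) spectrum for unpolarized data are all assumptions made for illustration.

```python
from collections import Counter


def blockwise_sfs(haplotypes, block_len):
    """Tally folded SFS configurations over non-overlapping blocks.

    haplotypes: list of equal-length lists of 0/1 alleles (one list per sampled
                sequence); a hypothetical input format chosen for this sketch.
    block_len:  number of sites per block (assumed fixed).
    Returns a Counter mapping each block configuration (a tuple of site counts
    per minor-allele class) to the number of blocks showing it.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    max_class = n // 2  # folded spectrum: minor-allele counts 1..n//2
    configs = Counter()
    # Any trailing partial block is ignored.
    for start in range(0, n_sites - block_len + 1, block_len):
        counts = [0] * max_class
        for site in range(start, start + block_len):
            alt = sum(h[site] for h in haplotypes)
            minor = min(alt, n - alt)  # unpolarized: fold the count
            if minor > 0:              # skip monomorphic sites
                counts[minor - 1] += 1
        configs[tuple(counts)] += 1
    return configs


# Toy data: 4 sequences, 6 sites, split into two blocks of 3 sites.
example = [
    [0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
]
configs = blockwise_sfs(example, block_len=3)
```

In this toy example the first block holds one singleton and one doubleton site, and the second block holds two singleton sites, so two distinct configurations are each observed once. The relative frequencies of such configurations across the genome are the quantities a composite likelihood of this kind would compare against simulations.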

2020 ◽  
Vol 37 (7) ◽  
pp. 2124-2136
Author(s):  
Paul D Blischak ◽  
Michael S Barker ◽  
Ryan N Gutenkunst

Abstract Demographic inference using the site frequency spectrum (SFS) is a common way to understand historical events affecting genetic variation. However, most methods for estimating demography from the SFS assume random mating within populations, precluding these types of analyses in inbred populations. To address this issue, we developed a model for the expected SFS that includes inbreeding by parameterizing individual genotypes using beta-binomial distributions. We then take the convolution of these genotype probabilities to calculate the expected frequency of biallelic variants in the population. Using simulations, we evaluated the model’s ability to coestimate demography and inbreeding using one- and two-population models across a range of inbreeding levels. We also applied our method to two empirical examples, American pumas (Puma concolor) and domesticated cabbage (Brassica oleracea var. capitata), inferring models both with and without inbreeding to compare parameter estimates and model fit. Our simulations showed that we are able to accurately coestimate demographic parameters and inbreeding even for highly inbred populations (F = 0.9). In contrast, failing to include inbreeding generally resulted in inaccurate parameter estimates in simulated data and led to poor model fit in our empirical analyses. These results show that inbreeding can have a strong effect on demographic inference, a pattern that was especially noticeable for parameters involving changes in population size. Given the importance of these estimates for informing practices in conservation, agriculture, and elsewhere, our method provides an important advancement for accurately estimating the demographic histories of these species.
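As an illustration of the genotype-convolution step described in the abstract above, the sketch below computes the distribution of the derived-allele count in a sample of diploid individuals, conditional on a single population allele frequency `p` and inbreeding coefficient `F`. It assumes the beta-binomial is parameterized with alpha = p(1-F)/F and beta = (1-p)(1-F)/F, under which the two-trial genotype probabilities reduce to the closed forms used here. The function names are hypothetical, and the full model would additionally integrate over the distribution of `p` implied by the demographic history.

```python
def genotype_probs(p, F):
    """Probabilities of carrying 0, 1 or 2 derived alleles with inbreeding F.

    Closed form of a two-trial beta-binomial under the assumed
    parameterization alpha = p(1-F)/F, beta = (1-p)(1-F)/F.
    """
    shared = F * p * (1.0 - p)
    hom_anc = (1.0 - p) ** 2 + shared
    het = 2.0 * p * (1.0 - p) * (1.0 - F)
    hom_der = p ** 2 + shared
    return [hom_anc, het, hom_der]


def convolve(a, b):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sample_count_distribution(p, F, n_individuals):
    """Distribution of the derived-allele count among n diploid individuals."""
    dist = [1.0]
    per_individual = genotype_probs(p, F)
    for _ in range(n_individuals):
        dist = convolve(dist, per_individual)
    return dist


# Without inbreeding the result is Binomial(2n, p); with F = 0.9 most of the
# probability mass shifts to even counts (homozygous individuals).
outbred = sample_count_distribution(0.5, 0.0, 2)
inbred = sample_count_distribution(0.5, 0.9, 2)
```

Setting `F = 0` recovers the usual random-mating expectation, which gives a simple sanity check on the convolution.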


2018 ◽  
Author(s):  
Arun Durvasula ◽  
Sriram Sankararaman

Abstract While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations. One sentence summary: Multiple present-day African populations inherited genes from an unknown archaic population that diverged before modern humans and Neanderthals split.


2017 ◽  
Vol 1 (Special Issue) ◽  
pp. 15-15
Author(s):  
Anubhab Khan ◽  
Rithvik Vinekar ◽  
Prachi Thatte ◽  
Uma Ramakrishnan

PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e1910 ◽  
Author(s):  
Quentin Rougemont ◽  
Camille Roux ◽  
Samuel Neuenschwander ◽  
Jerome Goudet ◽  
Sophie Launey ◽  
...  

Inferring the history of isolation and gene flow during species divergence is a central question in evolutionary biology. The European river lamprey (Lampetra fluviatilis) and brook lamprey (L. planeri) show low reproductive isolation but have highly distinct life histories, the former being parasitic-anadromous and the latter non-parasitic and freshwater resident. Here we used microsatellite data from six replicated population pairs to reconstruct their history of divergence using an approximate Bayesian computation framework combined with a random forest model. In most population pairs, scenarios of divergence with recent isolation were outcompeted by scenarios proposing ongoing gene flow, namely the Secondary Contact (SC) and Isolation with Migration (IM) models. The estimation of demographic parameters under the SC model indicated a time of secondary contact close to the time of speciation, explaining why SC and IM models could not be discriminated. In the case of an ancient secondary contact, the historical signal of divergence is lost and neutral markers converge to the same equilibrium as under the less parameterized model allowing ongoing gene flow. Our results imply that models of secondary contact should be systematically compared to models of divergence with gene flow; given the difficulty of discriminating among these models, we suggest that genome-wide data are needed to adequately reconstruct divergence history.


2021 ◽  
Vol 12 ◽  
Author(s):  
Xingfei Gong ◽  
Mingda Hu ◽  
Wei Chen ◽  
Haoyi Yang ◽  
Boqian Wang ◽  
...  

Influenza A virus (IAV) genomes are composed of eight single-stranded RNA segments. Genetic exchange through reassortment of the segmented genomes often endows IAVs with new genetic characteristics, which may affect transmissibility and pathogenicity of the viruses. However, a comprehensive understanding of the reassortment history of IAVs remains lacking. To this end, we assembled 40,296 whole-genome sequences of IAVs for analysis. Using a new clustering method based on Mean Pairwise Distances in the phylogenetic trees, we classified each segment of IAVs into clades. Correspondingly, reassortment events among IAVs were detected by checking the segment clade compositions of related genomes under specific environmental factors and time periods. We systematically identified 1,927 possible reassortment events of IAVs and constructed their reassortment network. Interestingly, the minimum spanning tree of the reassortment network reconfirmed that swine act as an intermediate host in the reassortment history of IAVs between avian species and humans. Moreover, reassortment patterns among related subtypes constructed in this study are consistent with previous studies. Taken together, our genome-wide reassortment analysis of all the IAVs offers an overview of the leaping evolution of the virus and a comprehensive network representing the relationships of IAVs.


2019 ◽  
Author(s):  
Andrew D. Foote ◽  
Michael D. Martin ◽  
Marie Louis ◽  
George Pacheco ◽  
Kelly M. Robertson ◽  
...  

Abstract Reconstruction of the demographic and evolutionary history of populations assuming a consensus tree-like relationship can mask more complex scenarios, which are prevalent in nature. An emerging genomic toolset, which has been most comprehensively harnessed in the reconstruction of human evolutionary history, enables molecular ecologists to elucidate complex population histories. Killer whales have limited extrinsic barriers to dispersal and have radiated globally, and are therefore a good candidate model for the application of such tools. Here, we analyse a global dataset of killer whale genomes in a rare attempt to elucidate global population structure in a non-human species. We identify a pattern of genetic homogenisation at lower latitudes and the greatest differentiation at high latitudes, even between currently sympatric lineages. The processes underlying the major axis of structure include high drift at the edge of species’ range, likely associated with founder effects and allelic surfing during post-glacial range expansion. Divergence between Antarctic and non-Antarctic lineages is further driven by ancestry segments with up to four-fold older coalescence time than the genome-wide average; relicts of a previous vicariance during an earlier glacial cycle. Our study further underpins that episodic gene flow is ubiquitous in natural populations, and can occur across great distances and after substantial periods of isolation between populations. Thus, understanding the evolutionary history of a species requires comprehensive geographic sampling and genome-wide data to sample the variation in ancestry within individuals.


Author(s):  
Emily Koot ◽  
Elise Arnst ◽  
Melissa Taane ◽  
Kelsey Goldsmith ◽  
Peri Tobias ◽  
...  

Leptospermum scoparium J. R. Forst et G. Forst, known as mānuka by Māori, the indigenous people of Aotearoa (New Zealand), is a culturally and economically significant shrub species, native to New Zealand and Australia. Chemical, morphological and phylogenetic studies have indicated geographical variation of mānuka across its range in New Zealand, and genetic differentiation between New Zealand and Australia. We used pooled whole genome re-sequencing of 76 L. scoparium and outgroup populations from New Zealand and Australia to compile a dataset totalling ~2.5 million SNPs. We explored the genetic structure and relatedness of L. scoparium across New Zealand, and between populations in New Zealand and Australia, as well as the complex demographic history of this species. Our population genomic investigation suggests there are five geographically distinct mānuka gene pools within New Zealand, with evidence of gene flow occurring between these pools. Demographic modelling suggests three of these gene pools have undergone expansion events, whilst the remaining two have undergone contractions. Furthermore, mānuka populations in New Zealand are genetically distinct from populations in Australia, with coalescent modelling suggesting these two clades diverged ~9–12 million years ago. We discuss the evolutionary history of this species and the benefits of using pool-seq for such studies. Our research will support the management and conservation of mānuka by landowners, particularly Māori, and the development of a provenance story for the branding of mānuka-based products.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Ekaterina Noskova ◽  
Vladimir Ulyantsev ◽  
Klaus-Peter Koepfli ◽  
Stephen J O’Brien ◽  
Pavel Dobrynin

Abstract Background: The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Results: Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint AFS data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). Conclusions: We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
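For readers unfamiliar with the joint AFS mentioned in the abstract above, the following minimal sketch tabulates one for two populations from per-site derived-allele counts. The function name `joint_afs` and the input layout are assumptions for illustration only and do not reflect the input formats of GADMA, ∂a∂i, or moments.

```python
def joint_afs(counts_pop1, counts_pop2, n1, n2):
    """Build a two-population joint allele frequency spectrum.

    counts_pop1, counts_pop2: derived-allele counts per site in each population
                              (hypothetical inputs of equal length).
    n1, n2: number of sampled chromosomes in each population.
    Returns an (n1 + 1) x (n2 + 1) table where entry [i][j] is the number of
    sites with i derived alleles in population 1 and j in population 2.
    """
    if len(counts_pop1) != len(counts_pop2):
        raise ValueError("both populations must cover the same sites")
    afs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for c1, c2 in zip(counts_pop1, counts_pop2):
        afs[c1][c2] += 1
    return afs


# Five sites, 4 chromosomes sampled in population 1 and 2 in population 2.
afs = joint_afs([1, 1, 0, 4, 2], [0, 0, 2, 2, 1], n1=4, n2=2)
```

Demographic inference methods of the kind described compare such an observed table with the table expected under a candidate model, typically through a likelihood.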


2012 ◽  
Vol 21 (4) ◽  
pp. 1005-1018 ◽  
Author(s):  
ANDREIA MIRALDO ◽  
GODFREY M. HEWITT ◽  
PAUL H. DEAR ◽  
OCTAVIO S. PAULO ◽  
BRENT C. EMERSON

2014 ◽  
Vol 31 (11) ◽  
pp. 2929-2940 ◽  
Author(s):  
Takehiro Sato ◽  
Shigeki Nakagome ◽  
Chiaki Watanabe ◽  
Kyoko Yamaguchi ◽  
Akira Kawaguchi ◽  
...  
