Versatile simulations of admixture and accurate local ancestry inference with mixnmatch and ancestryinfer

2019 ◽  
Author(s):  
Molly Schumer ◽  
Daniel L. Powell ◽  
Russ Corbett-Detig

Abstract: It is now clear that hybridization between species is much more common than previously recognized. As a result, we now know that the genomes of many modern species, including our own, are a patchwork of regions derived from past hybridization events. Increasingly, researchers are interested in disentangling which regions of the genome originated from each parental species using local ancestry inference methods. Due to the diverse effects of admixture, this interest is shared across disparate fields, from human genetics to research in ecology and evolutionary biology. However, local ancestry inference methods are sensitive to a range of biological and technical parameters that can affect accuracy. Here we present paired simulation and ancestry inference pipelines, mixnmatch and ancestryinfer, to help researchers plan and execute local ancestry inference studies. mixnmatch can simulate arbitrarily complex demographic histories in the parental and hybrid populations, selection on hybrids, and technical variables such as coverage and contamination. ancestryinfer takes as input sequencing reads from simulated or real individuals, and implements an efficient local ancestry inference pipeline. We perform a series of simulations with mixnmatch to pinpoint factors that influence accuracy in local ancestry inference and highlight useful features of the two pipelines. Together, mixnmatch and ancestryinfer are powerful tools for predicting the performance of local ancestry inference methods on real data.

2019 ◽  
Vol 10 (2) ◽  
pp. 569-579
Author(s):  
Aurélien Cottin ◽  
Benjamin Penaud ◽  
Jean-Christophe Glaszmann ◽  
Nabila Yahiaoui ◽  
Mathieu Gautier

Hybridizations between species and subspecies represented major steps in the history of many crop species. Such events generally lead to genomes with mosaic patterns of chromosomal segments of various origins that may be assessed by local ancestry inference methods. However, these methods have mainly been developed in the context of human population genetics with implicit assumptions that may not always fit plant models. The purpose of this study was to evaluate the suitability of three state-of-the-art inference methods (SABER, ELAI and WINPOP) for local ancestry inference under scenarios that can be encountered in plant species. For this, we developed an R package to simulate genotyping data under such scenarios. The tested inference methods performed similarly well as far as representatives of source populations were available. As expected, the higher the level of differentiation between ancestral source populations and the lower the number of generations since admixture, the more accurate were the results. Interestingly, the accuracy of the methods was only marginally affected by i) the number of ancestries (up to six tested); ii) the sample design (i.e., unbalanced representation of source populations); and iii) the reproduction mode (e.g., selfing, vegetative propagation). If a source population was not represented in the data set, no bias was observed in inference accuracy for regions originating from represented sources and regions from the missing source were assigned differently depending on the methods. Overall, the selected ancestry inference methods may be used for crop plant analysis if all ancestral sources are known.


2013 ◽  
Vol 93 (2) ◽  
pp. 278-288 ◽  
Author(s):  
Brian K. Maples ◽  
Simon Gravel ◽  
Eimear E. Kenny ◽  
Carlos D. Bustamante

BMC Genetics ◽  
2017 ◽  
Vol 18 (1) ◽  
Author(s):  
Daniel Hui ◽  
Zhou Fang ◽  
Jerome Lin ◽  
Qing Duan ◽  
Yun Li ◽  
...  

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Heming Wang ◽  
Tamar Sofer ◽  
Xiang Zhang ◽  
Robert C. Elston ◽  
Susan Redline ◽  
...  

2020 ◽  
Author(s):  
Ryan Schubert ◽  
Angela Andaleon ◽  
Heather E. Wheeler

Abstract Background: Local ancestry estimation infers the regional ancestral origin of chromosomal segments in admixed populations using reference populations and a variety of statistical models. Integrating local ancestry into complex trait genetics has the potential to increase detection of genetic associations and improve genetic prediction models in understudied admixed populations, including African Americans and Hispanics. Five methods for local ancestry estimation are LAMP-LD (2012), RFMix (2013), ELAI (2014), Loter (2018), and MOSAIC (2019), but direct comparisons of accuracy, runtime, and memory usage of all these software tools have not previously been reported across common patterns of human admixture. Results: We found that in cases of two-way admixture, RFMix and ELAI had the highest median accuracy depending on population structure, while in cases of three-way admixture, RFMix, MOSAIC, and LAMP-LD had the highest median accuracy. Additionally, we estimate how memory and runtime scale with sample size for each tool and find that most tools scale linearly in both. The only exception is RFMix, whose runtime scales quadratically and whose memory scales linearly. Conclusions: Effective local ancestry estimation tools are necessary to combat population disparities in human genetics studies. RFMix performs the best across methods; however, depending on the application, other methods perform similarly well with the benefit of shorter runtimes. Scripts used to format data, run software, and estimate accuracy can be found at https://github.com/WheelerLab/LAI_benchmarking.
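The scaling estimates mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the growth order is read off as the slope of a log-log fit of cost against sample size, which is one common approach and not necessarily the one the authors used. The function name `scaling_exponent` and all measurements are hypothetical, not taken from the paper.

```python
import math


def scaling_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size).

    A slope near 1 suggests linear growth, near 2 quadratic growth.
    Hypothetical helper for illustration; not from the benchmarking scripts.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up measurements (seconds) at four sample sizes, for illustration only.
sample_sizes = [100, 200, 400, 800]
linear_runtime = [2.0 * n for n in sample_sizes]
quadratic_runtime = [0.01 * n ** 2 for n in sample_sizes]

print(round(scaling_exponent(sample_sizes, linear_runtime), 3))
print(round(scaling_exponent(sample_sizes, quadratic_runtime), 3))
```

With real timings the slope will be noisy, so it is best read as a rough indicator of growth order and not as an exact complexity class.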


2019 ◽  
Author(s):  
Caitlin Uren ◽  
Eileen G. Hoal ◽  
Marlo Möller

Abstract: Global and local ancestry inference in admixed human populations can be performed using computational tools implementing distinct algorithms, such as RFMix and ADMIXTURE. The accuracy of these tools has been tested largely on populations with relatively straightforward admixture histories, but little is known about how well they perform in more complex admixture scenarios. Using simulations, we show that RFMix outperforms ADMIXTURE in determining global ancestry proportions in a complex 5-way admixed population. In addition, RFMix correctly assigns local ancestry with an accuracy of 89%. The increase in reported local ancestry inference accuracy in this population (as compared to previous studies) can largely be attributed to the recent availability of large-scale genotyping data for more representative reference populations. The ability of RFMix to determine global and local ancestry to a high degree of accuracy allows for more reliable population structure analysis, scans for natural selection, admixture mapping and case-control association studies. This study highlights the utility of the extension of computational tools to become more relevant to genetically structured populations, as seen with RFMix. This is particularly noteworthy as modern-day societies are becoming increasingly genetically complex and some genetic tools are therefore less appropriate. We therefore suggest that RFMix be used for both global and local ancestry estimation in complex admixture scenarios.
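As a rough illustration of how a local ancestry accuracy figure can be computed from simulated data, the sketch below assumes accuracy is the fraction of sites whose inferred ancestry label matches the simulated truth. The abstract does not define its metric, so this definition, the function name `local_ancestry_accuracy`, and the example labels are all assumptions.

```python
def local_ancestry_accuracy(true_calls, inferred_calls):
    """Fraction of sites where the inferred ancestry equals the true ancestry.

    Assumed per-site definition for illustration; published studies may
    instead score per haplotype, per segment, or by diploid dosage.
    """
    if len(true_calls) != len(inferred_calls):
        raise ValueError("true and inferred calls must have the same length")
    if not true_calls:
        raise ValueError("at least one site is required")
    matches = sum(t == i for t, i in zip(true_calls, inferred_calls))
    return matches / len(true_calls)


# Hypothetical ancestry labels along one simulated haplotype.
truth = ["A", "A", "B", "B", "C"]
inferred = ["A", "B", "B", "B", "C"]

print(local_ancestry_accuracy(truth, inferred))  # 4 of 5 sites match: 0.8
```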


2013 ◽  
Vol 93 (5) ◽  
pp. 891-899 ◽  
Author(s):  
Youna Hu ◽  
Cristen Willer ◽  
Xiaowei Zhan ◽  
Hyun Min Kang ◽  
Gonçalo R. Abecasis
