Biogeographical Ancestry Inference from Genotype: A Comparison of Ancestral Informative SNPs and Genome-wide SNPs

Author(s):  
Yue Qu ◽  
Dat Tran ◽  
Elisa Martinez-Marroquin
2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Jiarui Li ◽  
Tomás González Zarzar ◽  
Julie D. White ◽  
Karlijne Indencleef ◽  
Hanne Hoskens ◽  
...  

2012 ◽  
Vol 20 (11) ◽  
pp. 1148-1154 ◽  
Author(s):  
Andrew J Pakstis ◽  
Rixun Fang ◽  
Manohar R Furtado ◽  
Judith R Kidd ◽  
Kenneth K Kidd

2019 ◽  
Vol 21 (6) ◽  
pp. 806-812 ◽  
Author(s):  
Guijia Liu ◽  
Linsong Dong ◽  
Linlin Gu ◽  
Zhaofang Han ◽  
Wenjing Zhang ◽  
...  

Abstract Yellow drum (Nibea albiflora) is an important maricultural fish in China, and genetic improvement is necessary for this species. This research evaluated the application of genomic selection methods to predict the genetic values of seven economic traits for yellow drum. Using genome-wide single-nucleotide polymorphisms (SNPs), we estimated the genetic parameters for seven traits: body length (BL), swimming bladder index (SBI), swimming bladder weight (SBW), body thickness (BT), body height (BH), body length/body height ratio (LHR), and gonad weight index (GWI). The heritability estimates ranged from 0.309 to 0.843. We evaluated the prediction performance of various statistical methods, and no single method provided the highest predictive ability for all traits. We then compared genome-wide association study (GWAS)-informative SNPs with random SNPs for prediction and found that GWAS-informative SNPs clearly increased predictive ability. Only 5 and 100 informative SNPs were needed for LHR and BT, respectively, to achieve almost the same predictive abilities as genome-wide SNPs, whereas for BL, SBI, SBW, BH, and GWI, about 1000 to 3000 informative SNPs were needed to reach whole-genome-level predictive abilities. These results indicate that, for some traits, breeders can use fewer SNPs to reduce the cost of genomic selection.
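The comparison of informative and random SNP subsets described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the GWAS ranking was computed, so the squared single-marker Pearson correlation used here is an assumed stand-in for a GWAS statistic, and every function name and the toy data are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def rank_snps_by_association(genotypes, phenotype):
    """Rank SNP indices by squared correlation with the trait (assumed proxy for GWAS).

    genotypes: one list of 0/1/2 allele counts per individual.
    """
    n_snps = len(genotypes[0])
    scores = []
    for j in range(n_snps):
        column = [individual[j] for individual in genotypes]
        scores.append((pearson_r(column, phenotype) ** 2, j))
    # Strongest association first; ties broken by SNP index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores]


def select_informative(genotypes, phenotype, k):
    """Top-k SNPs by association strength."""
    return rank_snps_by_association(genotypes, phenotype)[:k]


def select_random(n_snps, k, seed=0):
    """Random baseline subset of k SNP indices."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_snps), k))


# Toy data (hypothetical): the trait roughly tracks SNP 0, and SNP 2 is monomorphic.
toy_genotypes = [
    [0, 2, 1, 0],
    [1, 0, 1, 2],
    [2, 1, 1, 0],
    [0, 1, 1, 1],
    [2, 2, 1, 2],
]
toy_trait = [1.0, 2.1, 2.9, 1.2, 3.1]

informative = select_informative(toy_genotypes, toy_trait, k=1)
baseline = select_random(n_snps=4, k=1)
```

In a real evaluation, each subset would feed a prediction model and the two would be compared by cross-validated predictive ability. To avoid optimistic bias, the ranking would have to be computed on the training folds only.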


2019 ◽  
Author(s):  
Jiarui Li ◽  
Tomas Gonzalez ◽  
Julie D. White ◽  
Karlijne Indencleef ◽  
Hanne Hoskens ◽  
...  

Abstract Accurate inference of genomic ancestry is critically important in human genetics, epidemiology, and related fields. Geneticists today have access to multiple heterogeneous population-based datasets from studies collected under different protocols. Joint analyses of these datasets therefore require robust and consistent inference of ancestry, for which a common strategy is to construct an ancestry space from a reference dataset. However, such a strategy is sensitive to batch artifacts introduced by different protocols. In this work, we propose a novel robust genome-wide ancestry inference method, referred to as SUGIBS, based on an unnormalized genomic (UG) relationship matrix whose spectral (S) decomposition is generalized by an Identity-by-State (IBS) similarity degree matrix. SUGIBS robustly constructs an ancestry space from a single reference dataset and provides a robust projection of new samples from different studies. In experiments and simulations, we show that SUGIBS is robust against individual outliers and batch artifacts introduced by different genotyping protocols. The performance of SUGIBS is equivalent to that of the widely used principal component analysis (PCA) on normalized genotype data in revealing the underlying structure of an admixed population and in adjusting for false positive findings in a case-control admixed GWAS. We applied SUGIBS to the 1000 Genomes Project, as a reference, in combination with a large heterogeneous dataset containing auxiliary 3D facial images, to predict population-stratified average faces, or ancestry faces. In addition, we projected eight ancient DNA profiles into the 1000 Genomes ancestry space and reconstructed their ancestry faces. Based on the visually strong and recognizable human facial phenotype, comprehensive facial illustrations of the populations embedded in the 1000 Genomes Project are provided. Furthermore, ancestry facial imaging has important applications in personalized and precision medicine along with forensic and archeological DNA phenotyping.
Author Summary Estimates of individual-level genomic ancestry are routinely used in human genetics, epidemiology, and related fields. The analysis of population structure and genomic ancestry can yield significant insights into modern and ancient population dynamics, allowing us to address questions regarding the timing of admixture events and the numbers and identities of the parental source populations. Unrecognized or cryptic population structure is also an important confounder to correct for in genome-wide association studies (GWAS). However, to date, it remains challenging to work with heterogeneous datasets from multiple studies collected by different laboratories with diverse genotyping and imputation protocols. This work presents a new approach and an accompanying open-source software toolbox that facilitates a robust integrative analysis of population structure and genomic ancestry estimates for heterogeneous datasets. Given that visually evident and easily recognizable patterns of human facial characteristics covary with genomic ancestry, we can generate predicted ancestry faces on both the population and individual levels, as we illustrate for the 26 populations of the 1000 Genomes Project and for eight eminent ancient-DNA profiles, respectively.
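The two ingredients the abstract names, an unnormalized genomic relationship matrix and an IBS similarity degree matrix, can be sketched as follows. This is one possible reading of the abstract and not the authors' implementation. It assumes 0/1/2 allele-count coding and IBS measured as the mean fraction of shared alleles. It also assumes that the degree includes self-similarity, and that "generalized" means symmetric scaling by the inverse square root of the degrees. All function names are hypothetical.

```python
def ibs_similarity(g1, g2):
    """Mean fraction of alleles shared identical-by-state across SNPs (0/1/2 coding)."""
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


def ibs_degrees(genotypes):
    """Degree of each individual: summed IBS similarity to all individuals.

    Assumption: the self-similarity term is included in the sum.
    """
    return [sum(ibs_similarity(gi, gj) for gj in genotypes) for gi in genotypes]


def unnormalized_relationship(genotypes):
    """Unnormalized genomic relationship matrix: raw inner products of genotype vectors."""
    return [
        [sum(a * b for a, b in zip(gi, gj)) for gj in genotypes]
        for gi in genotypes
    ]


def degree_scaled(matrix, degrees):
    """Scale entry (i, j) by 1 / sqrt(d_i * d_j) (assumed form of the generalization)."""
    n = len(matrix)
    return [
        [matrix[i][j] / (degrees[i] * degrees[j]) ** 0.5 for j in range(n)]
        for i in range(n)
    ]


# Toy reference panel (hypothetical): two identical individuals and one distinct.
reference = [
    [0, 1, 2],
    [0, 1, 2],
    [2, 1, 0],
]

degrees = ibs_degrees(reference)
ug = unnormalized_relationship(reference)
scaled = degree_scaled(ug, degrees)
```

Under this reading, the ancestry axes would come from the leading eigenvectors of the scaled matrix, computed with a numerical linear algebra library. The projection of new samples described in the abstract is not covered by this sketch.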


2016 ◽  
Vol 131 (4) ◽  
pp. 901-912 ◽  
Author(s):  
Elaine Y Y Cheung ◽  
Michelle Elizabeth Gahan ◽  
Dennis McNevin

Heredity ◽  
2015 ◽  
Vol 115 (3) ◽  
pp. 195-205 ◽  
Author(s):  
R Oliveira ◽  
E Randi ◽  
F Mattucci ◽  
J D Kurushima ◽  
L A Lyons ◽  
...  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jamal Momeni ◽  
Melanie Parejo ◽  
Rasmus O. Nielsen ◽  
Jorge Langa ◽  
Iratxe Montes ◽  
...  

Abstract Background: With numerous endemic subspecies representing four of its five evolutionary lineages, Europe holds a large fraction of Apis mellifera genetic diversity. This diversity and the natural distribution range have been altered by anthropogenic factors. The conservation of this natural heritage relies on the availability of accurate tools for subspecies diagnosis. Based on pool-sequence data from 2145 worker bees representing 22 populations sampled across Europe, we employed two highly discriminative approaches (PCA and FST) to select the most informative SNPs for ancestry inference. Results: Using a supervised machine learning (ML) approach and a set of 3896 genotyped individuals, we show that the 4094 selected single nucleotide polymorphisms (SNPs) provide accurate ancestry inference in European honey bees. The best ML model was the Linear Support Vector Classifier (Linear SVC), which correctly assigned most individuals to one of the 14 subspecies or different genetic origins with a mean accuracy of 96.2% ± 0.8 SD. A total of 3.8% of test individuals were misclassified, most probably due to limited differentiation between subspecies caused by close geographical proximity, human interference with the genetic integrity of reference subspecies, or a combination thereof. Conclusions: The diagnostic tool presented here will contribute to sustainable conservation and support breeding activities in order to preserve the genetic heritage of European honey bees.
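The FST-based step of SNP selection mentioned in this abstract can be illustrated from per-population allele frequencies, which is the kind of data pool sequencing yields. The abstract does not state which FST estimator was used, so this sketch assumes a simple equal-weight form, (H_T - H_S) / H_T, with no sample-size correction. The function names and the frequency table are hypothetical.

```python
def fst_per_snp(freqs):
    """Equal-weight FST for one biallelic SNP.

    freqs: frequency of the same allele in each population.
    H_T is the expected heterozygosity at the mean frequency,
    H_S is the mean within-population expected heterozygosity.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Monomorphic across all populations: carries no information.
        return 0.0
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    return (h_t - h_s) / h_t


def top_fst_snps(freq_table, n_top):
    """Return the n_top SNP names with the highest FST (ties broken by name)."""
    scored = sorted(
        ((fst_per_snp(freqs), name) for name, freqs in freq_table.items()),
        key=lambda t: (-t[0], t[1]),
    )
    return [name for _, name in scored[:n_top]]


# Toy allele frequencies in two populations (hypothetical values).
toy_freqs = {
    "snp_a": [0.0, 1.0],  # fixed difference between populations
    "snp_b": [0.5, 0.5],  # no differentiation
    "snp_c": [0.2, 0.8],  # partial differentiation
}

selected = top_fst_snps(toy_freqs, n_top=2)
```

A selected panel of this kind would then serve as the feature set for a supervised classifier such as the Linear SVC named in the abstract.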


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Udita Basu ◽  
Rishi Srivastava ◽  
Deepak Bajaj ◽  
Virevol Thakro ◽  
Anurag Daware ◽  
...  
