Inference of Ancestries and Heterozygosity Proportion and Genotype Imputation in West African Cattle Populations

2021 ◽  
Vol 12 ◽  
Author(s):  
Netsanet Z. Gebrehiwot ◽  
Hassan Aliloo ◽  
Eva M. Strucken ◽  
Karen Marshall ◽  
Mohammad Al Kalaldeh ◽  
...  

Several studies have evaluated computational methods that infer haplotypes from population genotype data in European cattle populations. However, little is known about how well they perform in African indigenous and crossbred populations. This study investigates: (1) global and local ancestry inference; (2) heterozygosity proportion estimation; and (3) genotype imputation in West African indigenous and crossbred cattle populations. Principal component analysis (PCA), ADMIXTURE, and LAMP-LD were used to analyse a medium-density single nucleotide polymorphism (SNP) dataset from Senegalese crossbred cattle. Reference SNP data of East and West African indigenous and crossbred cattle populations were used to investigate the accuracy of imputation from low- to medium-density and from medium- to high-density SNP datasets using Minimac v3. The first two principal components differentiated Bos indicus from European Bos taurus and African Bos taurus from other breeds. Irrespective of assuming two or three ancestral breeds for the Senegalese crossbreds, breed proportion estimates from ADMIXTURE and LAMP-LD showed a high correlation (r ≥ 0.981). The observed ancestral origin heterozygosity proportion in putative F1 crosses was close to the expected value of 1.0, and clearly differentiated F1 from all other crosses. The imputation accuracies (estimated as the correlation between imputed and real data) in crossbred animals ranged from 0.142 to 0.717 when imputing from low to medium density, and from 0.478 to 0.899 when imputing from medium to high density. The imputation accuracy was generally higher when the reference data came from the same geographical region as the target population, and when crossbred reference data were used to impute crossbred genotypes. The lowest imputation accuracies were observed for indigenous breed genotypes.
This study shows that ancestral origin heterozygosity can be estimated with high accuracy and will be far superior to the use of observed individual heterozygosity for estimating heterosis in African crossbred populations. It was not possible to achieve high imputation accuracy in West African crossbred or indigenous populations based on reference data sets from East Africa, and population-specific genotyping with high-density SNP assays is required to improve imputation.
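The expected value of 1.0 for F1 crosses can be illustrated with a minimal sketch. It assumes that local ancestry inference has already produced, for each animal, one ancestry label per locus on each of its two haplotypes; the function name, the 0/1 label coding, and the input layout are hypothetical and do not reproduce the LAMP-LD output format or the authors' exact calculation.

```python
import numpy as np

def ancestral_heterozygosity(anc_hap1, anc_hap2):
    """Proportion of loci whose two haplotypes carry different ancestry labels.

    anc_hap1, anc_hap2: per-locus ancestry labels for the two haplotypes of
    one animal (hypothetical coding, e.g. 0 = indigenous, 1 = exotic dairy).
    """
    a = np.asarray(anc_hap1)
    b = np.asarray(anc_hap2)
    if a.shape != b.shape:
        raise ValueError("both haplotypes must cover the same loci")
    return float(np.mean(a != b))

# An F1 animal inherits one full haplotype from each ancestral breed,
# so every locus is heterozygous for ancestral origin.
f1 = ancestral_heterozygosity([0, 0, 0, 0], [1, 1, 1, 1])

# A backcross animal carries some segments where both haplotypes
# trace back to the same ancestral breed.
backcross = ancestral_heterozygosity([0, 0, 1, 1], [0, 1, 1, 0])
```

Under these assumptions `f1` is 1.0 and `backcross` is 0.5, which is why this measure separates F1 animals from later crosses more cleanly than heterozygosity computed on the observed genotypes alone.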

2019 ◽  
Vol 10 (2) ◽  
pp. 581-590 ◽  
Author(s):  
Smaragda Tsairidou ◽  
Alastair Hamilton ◽  
Diego Robledo ◽  
James E. Bron ◽  
Ross D. Houston

Genomic selection enables cumulative genetic gains in key production traits such as disease resistance, playing an important role in the economic and environmental sustainability of aquaculture production. However, it requires genome-wide genetic marker data on large populations, which can be prohibitively expensive. Genotype imputation is a cost-effective method for obtaining high-density genotypes, but its value in aquaculture breeding programs, which are characterized by large full-sibling families, has yet to be fully assessed. The aim of this study was to optimize the use of low-density genotypes and evaluate genotype imputation strategies for cost-effective genomic prediction. Phenotypes and genotypes (78,362 SNPs) were obtained for 610 individuals from a Scottish Atlantic salmon breeding program population (Landcatch, UK) challenged with sea lice, Lepeophtheirus salmonis. Genomic prediction accuracy was calculated using GBLUP approaches and compared across SNP panels of varying densities and composition, with and without imputation. Imputation was tested when parents were genotyped for the optimal SNP panel, and offspring were genotyped for a range of lower-density imputation panels. Reducing SNP density had little impact on prediction accuracy down to 5,000 SNPs, below which accuracy dropped. Imputation accuracy increased with increasing imputation panel density. Genomic prediction accuracy when offspring were genotyped for just 200 SNPs, and parents for 5,000 SNPs, was 0.53. This accuracy was similar to that obtained with the full high-density and optimal-density datasets, and markedly higher than that obtained using 200 SNPs without imputation. These results suggest that imputation from very low to medium density can be a cost-effective tool for genomic selection in Atlantic salmon breeding programs.


2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Netsanet Z. Gebrehiwot ◽  
Eva M. Strucken ◽  
Karen Marshall ◽  
Hassan Aliloo ◽  
John P. Gibson

Background: Understanding the relationship between genetic admixture and phenotypic performance is crucial for the optimization of crossbreeding programs. The use of small sets of informative ancestry markers can be a cost-effective option for the estimation of breed composition and for parentage assignment in situations where pedigree recording is difficult. The objectives of this study were to develop small single nucleotide polymorphism (SNP) panels that can accurately estimate the total dairy proportion and assign parentage in both West and East African crossbred dairy cows. Methods: Medium- and high-density SNP genotype data (Illumina BovineSNP50 and BovineHD Beadchip) for 4231 animals sampled from African crossbreds, African Bos taurus, European Bos taurus, Bos indicus, and African indigenous populations were used. For estimating breed composition, the absolute differences in allele frequency were calculated between pure ancestral breeds to identify SNPs with the highest discriminating power, and different combinations of SNPs weighted by ancestral origin were tested against estimates based on all available SNPs. For parentage assignment, informative SNPs were selected based on the highest minor allele frequency (MAF) in African crossbred populations assuming two scenarios: (1) parents were selected among all the animals with known genotypes, and (2) parents were selected only among the animals known to be a parent of at least one progeny. Results: For the medium-density genotype data, SNPs selected for the largest differences in allele frequency between West African indigenous and European Bos taurus breeds performed best for most African crossbred populations and achieved a prediction accuracy (r²) for breed composition of 0.926 to 0.961 with 200 SNPs. For the high-density dataset, a panel with 70% of the SNPs selected on their largest difference in allele frequency between African and European Bos taurus performed best or very near best across all crossbred populations, with r² ranging from 0.978 to 0.984 with 200 SNPs. In all African crossbred populations, unambiguous parentage assignment was possible with ≥ 300 SNPs for the majority of the panels for Scenario 1 and ≥ 200 SNPs for Scenario 2. Conclusions: The identified low-cost SNP assays could overcome incomplete or inaccurate pedigree records in African smallholder systems and allow effective breeding decisions to produce progeny of desired breed composition.
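The selection step described in the Methods, ranking SNPs by the absolute difference in allele frequency between two pure ancestral breeds, can be sketched as follows. This is an illustration only: the 0/1/2 genotype coding, the function names, and the toy matrices are assumptions, and the weighting of SNPs by ancestral origin used in the study is not reproduced.

```python
import numpy as np

def allele_freq(genotypes):
    """Frequency of the counted allele per SNP.

    genotypes: animals x SNPs array coded 0/1/2 (assumed coding),
    with np.nan for missing calls.
    """
    g = np.asarray(genotypes, dtype=float)
    return np.nanmean(g, axis=0) / 2.0

def top_discriminating_snps(geno_breed_a, geno_breed_b, n_snps):
    """Indices of the n_snps SNPs with the largest |p_a - p_b|."""
    delta = np.abs(allele_freq(geno_breed_a) - allele_freq(geno_breed_b))
    order = np.argsort(-delta, kind="stable")  # largest difference first
    return order[:n_snps], delta

# Toy data: 2 animals per breed, 3 SNPs.
breed_a = [[0, 2, 1],
           [0, 2, 1]]
breed_b = [[2, 2, 1],
           [2, 2, 0]]

chosen, delta = top_discriminating_snps(breed_a, breed_b, n_snps=2)
```

In this toy case the first SNP is fixed for opposite alleles in the two breeds (difference of 1.0) and is ranked first, while the second SNP has the same frequency in both breeds and is never selected.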


Genome ◽  
2021 ◽  
Author(s):  
Alejandra Maria Toro Ospina ◽  
Ignacio Aguilar ◽  
Matheus Henrique Vargas de Oliveira ◽  
Luiz Eduardo Cruz dos Santos Correia ◽  
Anibal Eugenio Vercesi Filho ◽  
...  

The objective of this study was to evaluate the accuracy of imputation in a Gyr population using two medium-density panels (Bos taurus - Bos indicus) and to test whether the inclusion of the Nellore breed increases the imputation accuracy in the Gyr population. The database consisted of 289 Gyr females from Brazil genotyped with the GGP Bovine LDv4 chip containing 30,000 SNPs and 158 Gyr females from Colombia genotyped with the GGP indicus chip containing 35,000 SNPs. A customized chip containing 9,109 SNPs (9K) was created to test the imputation accuracy in Gyr populations; 604 Nellore animals with information on the LD SNPs tested in the scenarios were included in the reference population. Four scenarios were tested: LD9K_30KGIR, LD9K_35INDGIR, LD9K_30KGIR_NEL and LD9K_35INDGIR_NEL. Principal component analysis (PCA) was computed for the genomic matrix, and sample-specific imputation accuracies were calculated using Pearson’s correlation (CS) and the concordance rate (CR) for imputed genotypes. The results of PCA of the Colombian and Brazilian Gyr populations demonstrated the genomic relationship between the two populations. The CS and CR ranged from 0.88 to 0.94 and from 0.93 to 0.96, respectively. Among the scenarios tested, the highest CS (0.94) was observed for the LD9K_30KGIR scenario. However, the variation in SNPs may reduce the imputation accuracy even when the chip of the Bos indicus subspecies is used.
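The two sample-specific accuracy measures named above can be computed along these lines. The sketch assumes genotypes coded as 0/1/2 allele counts and compares only the SNPs that were masked and then imputed; the function name and the toy vectors are hypothetical, not taken from the study.

```python
import numpy as np

def sample_imputation_accuracy(true_geno, imputed_geno):
    """Return (Pearson correlation, concordance rate) for one animal.

    true_geno, imputed_geno: genotypes at the imputed SNPs,
    coded 0/1/2 (assumed coding).
    """
    t = np.asarray(true_geno, dtype=float)
    i = np.asarray(imputed_geno, dtype=float)
    if t.shape != i.shape:
        raise ValueError("genotype vectors must have the same length")
    # Concordance rate: share of SNPs imputed to exactly the true genotype.
    concordance = float(np.mean(t == i))
    # The correlation is undefined when either vector has no variation.
    if t.std() == 0 or i.std() == 0:
        correlation = float("nan")
    else:
        correlation = float(np.corrcoef(t, i)[0, 1])
    return correlation, concordance

true_g    = [0, 1, 2, 1, 0, 2]
imputed_g = [0, 1, 2, 1, 1, 2]   # one of six genotypes is wrong
cs, cr = sample_imputation_accuracy(true_g, imputed_g)
```

The two measures answer different questions: the concordance rate counts exact matches, whereas the correlation also reflects how far off the wrong calls are and is less inflated by SNPs where almost every animal has the same genotype.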


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Troy N. Rowan ◽  
Jesse L. Hoff ◽  
Tamar E. Crum ◽  
Jeremy F. Taylor ◽  
Robert D. Schnabel ◽  
...  

Background: During the last decade, the use of common-variant array-based single nucleotide polymorphism (SNP) genotyping in the beef and dairy industries has produced an astounding amount of medium-to-low density genomic data. Although low-density assays work well in the context of genomic prediction, they are less useful for detecting and mapping causal variants and the effects of rare variants are not captured. The objective of this project was to maximize the accuracies of genotype imputation from medium- and low-density assays to the marker set obtained by combining two high-density research assays (~850,000 SNPs), the Illumina BovineHD and the GGP-F250 assays, which contains a large proportion of rare and potentially functional variants and for which the assay design is described here. This 850 K SNP set is useful for both imputation to sequence-level genotypes and direct downstream analysis. Results: We found that a large multi-breed composite imputation reference panel that includes 36,131 samples with either BovineHD and/or GGP-F250 genotypes significantly increased imputation accuracy compared with a within-breed reference panel, particularly at variants with low minor allele frequencies. Individual animal imputation accuracies were maximized when more genetically similar animals were represented in the composite reference panel, particularly with complete 850 K genotypes. The addition of rare variants from the GGP-F250 assay to our composite reference panel significantly increased the imputation accuracy of rare variants that are exclusively present on the BovineHD assay. In addition, we show that an assay marker density of 50 K SNPs balances cost and accuracy for imputation to 850 K. Conclusions: Using high-density genotypes on all available individuals in a multi-breed reference panel maximized imputation accuracy for tested cattle populations. Admixed animals or those from breeds with a limited representation in the composite reference panel were still imputed at high accuracy, which is expected to further increase as the reference panel expands. We anticipate that the addition of rare variants from the GGP-F250 assay will increase the accuracy of imputation to sequence level.


BMC Genetics ◽  
2013 ◽  
Vol 14 (1) ◽  
pp. 38 ◽  
Author(s):  
Jose L Gualdrón Duarte ◽  
Ronald O Bates ◽  
Catherine W Ernst ◽  
Nancy E Raney ◽  
Rodolfo JC Cantet ◽  
...  

2018 ◽  
Author(s):  
Andrew Whalen ◽  
John M Hickey ◽  
Gregor Gorjanc

In this paper we evaluate the performance of family-specific low-density genotype arrays for increasing the accuracy of pedigree-based imputation. Genotype imputation is a widely used tool that decreases the costs of genotyping a population by genotyping the majority of individuals using a low-density array and using statistical regularities between the low-density and high-density individuals to fill in the missing genotypes. Previous work on population-based imputation has found that it is possible to increase the accuracy of imputation by maximizing the number of informative markers on an array. In the context of pedigree-based imputation, where the informativeness of a marker depends only on the genotypes of an individual's parents, it may be beneficial to select the markers on each low-density array on a family-by-family basis. In this paper we examined four family-specific low-density marker selection strategies, and evaluated their performance in the context of a real pig breeding dataset. We found that family-specific or sire-specific arrays could increase imputation accuracy by 0.11 at 1 marker per chromosome, by 0.027 at 25 markers per chromosome and by 0.007 at 100 markers per chromosome. These results suggest that there may be room to use family-specific genotyping for very-low-density arrays, particularly if a given sire or sire-dam pairing has a large number of offspring.
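The idea that a marker's informativeness depends only on the parents' genotypes can be illustrated with a simple rule: a marker can reveal which parental haplotype an offspring inherited only if at least one parent is heterozygous at it. The sketch below applies that rule to pick candidate markers for one family. It is an assumption-laden illustration, not one of the four strategies evaluated in the paper, and the function name and 0/1/2 coding are hypothetical.

```python
import numpy as np

def informative_markers(sire_geno, dam_geno):
    """Indices of markers where at least one parent is heterozygous.

    sire_geno, dam_geno: parental genotypes coded 0/1/2 (assumed coding),
    one value per marker, in the same marker order.
    """
    sire = np.asarray(sire_geno)
    dam = np.asarray(dam_geno)
    if sire.shape != dam.shape:
        raise ValueError("parents must be genotyped at the same markers")
    # A homozygous x homozygous mating gives every offspring the same
    # genotype, so such a marker carries no segregation information.
    return np.flatnonzero((sire == 1) | (dam == 1))

sire = [0, 1, 2, 1]
dam  = [0, 0, 1, 2]
family_markers = informative_markers(sire, dam)
```

Here the first marker is dropped because both parents are homozygous for the same allele; a family-specific array would spend its few marker slots on the remaining three.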


2018 ◽  
Vol 135 (6) ◽  
pp. 420-431 ◽  
Author(s):  
Marjorie Chassier ◽  
Eric Barrey ◽  
Céline Robert ◽  
Arnaud Duluard ◽  
Sophie Danvy ◽  
...  

Author(s):  
Simon F Lashmar ◽  
Donagh P Berry ◽  
Rian Pierneef ◽  
Farai C Muchadeyi ◽  
Carina Visser

Abstract A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen 1) at random, 2) with even genomic dispersion, 3) by maximizing the mean minor allele frequency (MAF), 4) using a combined score of MAF and linkage disequilibrium (LD), 5) using a partitioning-around-medoids (PAM) algorithm, and finally 6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy defined as the within-animal correlation between the imputed and actual alleles ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen versus a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) versus 0.982 (0.014) when the worst (i.e., random) as opposed to the best (i.e., combination of MAF and LD) SNP selection strategy was employed. A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01<MAF≤0.1) versus high MAF (0.4<MAF≤0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. 
The presented results suggest that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the South African Drakensberger. Based on the results, a genotyping panel of approximately 10,000 SNP selected on a combination of MAF and LD would suffice to achieve an imputation error rate below 3% for a breed characterized by genomic admixture, provided that these SNP are selected using breed-specific selection criteria.
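Of the six selection strategies listed, the MAF-based one is the simplest to sketch. The example below computes the minor allele frequency per SNP and keeps the highest-MAF SNPs; it assumes 0/1/2 genotype coding and hypothetical function names, and it does not include the LD component of the combined MAF and LD score that performed best in the study.

```python
import numpy as np

def minor_allele_freq(genotypes):
    """MAF per SNP from an animals x SNPs array coded 0/1/2 (assumed coding)."""
    g = np.asarray(genotypes, dtype=float)
    p = np.nanmean(g, axis=0) / 2.0      # frequency of the counted allele
    return np.minimum(p, 1.0 - p)        # fold to the minor allele

def select_highest_maf(genotypes, n_snps):
    """Indices (in genomic order) of the n_snps SNPs with the highest MAF."""
    maf = minor_allele_freq(genotypes)
    top = np.argsort(-maf, kind="stable")[:n_snps]
    return np.sort(top)

# Toy data: 4 animals, 4 SNPs.
geno = [[0, 1, 2, 0],
        [0, 1, 2, 1],
        [0, 1, 1, 2],
        [1, 1, 2, 2]]

maf = minor_allele_freq(geno)
panel = select_highest_maf(geno, n_snps=2)
```

Selecting purely on MAF tends to cluster markers in regions of high diversity, which is one reason a strategy that also accounts for LD or genomic spacing may impute low-MAF SNP and chromosome ends better.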


2017 ◽  
Vol 57 (10) ◽  
pp. 2096 ◽  
Author(s):  
T. J. Schatz

This study compares the performance of F1 Senepol × Brahman steers (F1 SEN) to Brahman (BRAH) steers in an Indonesian feedlot. The focus was to address concerns that crossbred cattle are discriminated against by live export cattle buyers due to a perception that they do not perform as well as Brahmans in Indonesian feedlots. F1 SEN (n = 54) and BRAH (n = 32) steers that had grazed together since weaning at Douglas Daly Research Farm (Northern Territory) were exported to Indonesia and fed for 121 days in a feedlot near Lampung (Sumatra, Indonesia). The average daily gain of the F1 SEN steers over the feeding period was 0.17 kg/day higher (P < 0.001) than that of the BRAH steers (1.71 vs 1.54 kg/day). As a result, the F1 SEN steers put on an average of 21.6 kg more over the 121-day feeding period, and they did not have a higher mortality rate. Consequently, F1 SEN steers performed better than BRAH in an Indonesian feedlot. These results should encourage live export cattle buyers to purchase this type of cattle (Brahman crossed with a tropically adapted Bos taurus breed) with confidence that they can perform at least as well as Brahmans in Indonesian feedlots. It should be noted, however, that growth rates are usually higher in F1 crosses than in subsequent generations.


2022 ◽  
Author(s):  
Lars Wienbrandt ◽  
David Ellinghaus

Background: Reference-based phasing and genotype imputation algorithms have been developed with sublinear theoretical runtime behaviour, but runtimes are still high in practice when large genome-wide reference datasets are used. Methods: We developed EagleImp, a software tool with algorithmic and technical improvements and new features for accurate and accelerated phasing and imputation in a single tool. Results: We compared the accuracy and runtime of EagleImp with Eagle2, PBWT and prominent imputation servers using whole-genome sequencing data from the 1000 Genomes Project, the Haplotype Reference Consortium and simulated data with more than 1 million reference genomes. EagleImp is 2 to 10 times faster (depending on the single or multiprocessor configuration selected) than Eagle2/PBWT, with the same or better phasing and imputation quality in all tested scenarios. For common variants investigated in typical GWAS studies, EagleImp provides the same or higher imputation accuracy than the Sanger Imputation Service, the Michigan Imputation Server and the newly developed TOPMed Imputation Server, despite their larger (not publicly available) reference panels. It has many new features, including automated chromosome splitting and memory management at runtime to avoid job aborts, fast reading and writing of large files, and various user-configurable algorithm and output options. Conclusions: Due to the technical optimisations, EagleImp can perform fast and accurate reference-based phasing and imputation for future very large reference panels with more than 1 million genomes. EagleImp is freely available for download from https://github.com/ikmb/eagleimp.

