Level-biases in estimated breeding values due to the use of different SNP panels over time in ssGBLUP

2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Øyvind Nordbø ◽  
Arne B. Gjuvsland ◽  
Leiv Sigbjørn Eikje ◽  
Theo Meuwissen

Abstract Background The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions. Results Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors. Conclusions Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.
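The role of the "small value added to the diagonal of the genomic relationship matrix" can be pictured with a toy example. The sketch below assumes a VanRaden-type matrix (centred genotypes scaled by 2Σp(1−p)); the abstract does not say which construction the authors used, and the function names, toy genotypes, and the value of `eps` are illustrative assumptions only.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP, from 0/1/2 genotype codes."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def grm(genotypes, eps=0.0):
    """Assumed VanRaden-type G = ZZ' / (2 * sum p(1-p)), plus eps on the diagonal."""
    p = allele_frequencies(genotypes)
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    # Centre each genotype by twice the allele frequency of its SNP.
    z = [[row[j] - 2.0 * p[j] for j in range(len(p))] for row in genotypes]
    n = len(z)
    g = [[sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
         for i in range(n)]
    for i in range(n):
        g[i][i] += eps  # the "small value added to the diagonal"
    return g


# Hypothetical toy data: 4 animals x 5 SNPs, coded as copies of one allele.
toy = [
    [0, 1, 2, 1, 0],
    [1, 1, 2, 0, 0],
    [2, 0, 1, 1, 1],
    [1, 2, 0, 2, 1],
]
g_plain = grm(toy)
g_ridge = grm(toy, eps=0.01)
```

Because the genotypes are centred with allele frequencies estimated from the same animals, every row of `g_plain` sums to zero, so that matrix is singular; adding `eps` makes it invertible. How large `eps` is chosen is a modelling decision, which is the factor the abstract reports as shifting the level of breeding values between groups.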

2020 ◽  
Author(s):  
Rafet Al-Tobasei ◽  
Ali R. Ali ◽  
Andre L. S. Garcia ◽  
Daniela Lourenco ◽  
Tim Leeds ◽  
...  

Abstract Background One of the most important goals for the rainbow trout aquaculture industry is to improve fillet yield and fillet quality. Previously, we showed that a 50K transcribed-SNP chip can be used to detect quantitative trait loci (QTL) associated with fillet yield and fillet firmness. In this study, data from 1,568 fish genotyped for the 50K transcribed-SNP chip and ~774 fish phenotyped for fillet yield and fillet firmness were used in a single-step genomic BLUP (ssGBLUP) model to compute the genomic estimated breeding values (GEBV). In addition, pedigree-based best linear unbiased prediction (PBLUP) was used to calculate traditional, family-based estimated breeding values (EBV). Results The genomic predictions outperformed the traditional EBV by 35% for fillet yield and 42% for fillet firmness. The predictive ability for fillet yield and fillet firmness was 0.19–0.20 with PBLUP, and 0.27 with ssGBLUP. Additionally, reducing SNP panel densities indicated that using 500–800 SNPs in genomic predictions still provides predictive abilities higher than PBLUP. Conclusion These results suggest that genomic evaluation is a feasible strategy to identify and select fish with superior genetic merit within rainbow trout families, even with low-density SNP panels.


2020 ◽  
Author(s):  
Rafet Al-Tobasei ◽  
Ali R. Ali ◽  
Andre L. S. Garcia ◽  
Daniela Lourenco ◽  
Tim Leeds ◽  
...  

Abstract Background One of the most important goals for the rainbow trout aquaculture industry is to improve muscle yield and fillet quality. Previously, we showed that a 50K transcribed-SNP chip can be used to detect quantitative trait loci (QTL) associated with muscle yield and fillet firmness. In this study, data from 1,568 fish genotyped for the 50K transcribed-SNP chip and ~774 fish phenotyped for muscle yield and fillet firmness were used in a single-step genomic BLUP (ssGBLUP) model to compute the genomic estimated breeding values (GEBV). In addition, pedigree-based best linear unbiased prediction (PBLUP) was used to calculate traditional, family-based estimated breeding values (EBV). Results The genomic predictions outperformed the traditional EBV by 35% for muscle yield and 42% for fillet firmness. The predictive ability for muscle yield and fillet firmness was 0.19 - 0.20 with PBLUP, and 0.27 with ssGBLUP. Additionally, reducing SNP panel densities indicated that using 500 – 800 SNPs in genomic predictions still provides predictive abilities higher than PBLUP. Conclusion These results suggest that genomic evaluation is a feasible strategy to identify and select fish with superior genetic merit within rainbow trout families, even with low-density SNP panels.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Rafet Al-Tobasei ◽  
Ali Ali ◽  
Andre L. S. Garcia ◽  
Daniela Lourenco ◽  
Tim Leeds ◽  
...  

Abstract Background One of the most important goals for the rainbow trout aquaculture industry is to improve fillet yield and fillet quality. Previously, we showed that a 50 K transcribed-SNP chip can be used to detect quantitative trait loci (QTL) associated with fillet yield and fillet firmness. In this study, data from 1568 fish genotyped for the 50 K transcribed-SNP chip and ~ 774 fish phenotyped for fillet yield and fillet firmness were used in a single-step genomic BLUP (ssGBLUP) model to compute the genomic estimated breeding values (GEBV). In addition, pedigree-based best linear unbiased prediction (PBLUP) was used to calculate traditional, family-based estimated breeding values (EBV). Results The genomic predictions outperformed the traditional EBV by 35% for fillet yield and 42% for fillet firmness. The predictive ability for fillet yield and fillet firmness was 0.19–0.20 with PBLUP, and 0.27 with ssGBLUP. Additionally, reducing SNP panel densities indicated that using 500–800 SNPs in genomic predictions still provides predictive abilities higher than PBLUP. Conclusion These results suggest that genomic evaluation is a feasible strategy to identify and select fish with superior genetic merit within rainbow trout families, even with low-density SNP panels.
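The reported gains of 35% and 42% can be checked against the reported predictive abilities. The short sketch below assumes the gain is the relative increase of the ssGBLUP value over the PBLUP value, and that 0.20 belongs to fillet yield and 0.19 to fillet firmness; the abstract gives only the range 0.19–0.20, so that pairing is an assumption chosen because it reproduces the stated percentages.

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


ssgblup = 0.27
# Assumed pairing of the reported PBLUP range (0.19-0.20) with the two traits.
pblup = {"fillet yield": 0.20, "fillet firmness": 0.19}

gains = {trait: relative_gain(ssgblup, value) for trait, value in pblup.items()}
for trait, gain in gains.items():
    print(f"{trait}: {gain:.0f}%")
```

Under these assumptions the rounded gains match the 35% and 42% given in the abstract.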


2020 ◽  
Vol 98 (12) ◽  
Author(s):  
Ignacy Misztal ◽  
Shogo Tsuruta ◽  
Ivan Pocrnic ◽  
Daniela Lourenco

Abstract Single-step genomic best linear unbiased prediction with the Algorithm for Proven and Young (APY) is a popular method for large-scale genomic evaluations. With the APY algorithm, animals are designated as core or noncore, and the computing resources to create the inverse of the genomic relationship matrix (GRM) are reduced by inverting only a portion of that matrix for core animals. However, using different core sets of the same size causes fluctuations in genomic estimated breeding values (GEBVs) of up to one additive standard deviation without affecting prediction accuracy. About 2% of the variation in the GRM is noise. In the recursion formula for APY, the error term modeling the noise is different for every set of core animals, creating changes in breeding values. While average changes are small and correlations between breeding values estimated with different core animals are close to 1.0, normal distribution theory implies that outliers can be several times larger than the average. Tests included commercial datasets from beef and dairy cattle and from pigs. Beyond a certain number of core animals, the prediction accuracy did not improve, but fluctuations decreased with more animals. Fluctuations were much smaller than the possible changes based on prediction error variance. GEBVs change over time even for animals with no new data because genomic relationships tie all the genotyped animals together, causing reranking of top animals. In contrast, changes in nongenomic models without new data are small. GEBVs can also change due to details in the model, such as redefinition of contemporary groups or unknown parent groups. In particular, increasing the fraction of blending of the GRM with a pedigree relationship matrix from 5% to 20% caused changes in GEBVs of up to 0.45 SD, with a correlation between GEBVs > 0.99.
Fluctuations in genomic predictions are part of genomic evaluation models and are also present without the APY algorithm when genomic evaluations are computed with updated data. The best approach to reduce the impact of fluctuations in genomic evaluations is to make selection decisions not on individual animals with limited individual accuracy but on groups of animals with high average accuracy.
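For orientation, the APY recursion and the resulting sparse inverse are commonly written as follows. This is the form usually given in the APY literature, not something spelled out in the abstract; the subscripts c (core) and n (noncore) and the symbol M are notation assumed here.

```latex
% Assumed notation: c = core animals, n = noncore animals.
\begin{aligned}
\mathbf{u}_n &= \mathbf{G}_{nc}\mathbf{G}_{cc}^{-1}\mathbf{u}_c + \boldsymbol{\varepsilon}_n,
\qquad \operatorname{Var}(\boldsymbol{\varepsilon}_n) = \sigma_u^2\,\mathbf{M}_{nn},
\qquad m_{ii} = g_{ii} - \mathbf{g}_{ic}\mathbf{G}_{cc}^{-1}\mathbf{g}_{ci}, \\
\mathbf{G}_{\mathrm{APY}}^{-1} &=
\begin{bmatrix} \mathbf{G}_{cc}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}
+
\begin{bmatrix} -\mathbf{G}_{cc}^{-1}\mathbf{G}_{cn} \\ \mathbf{I} \end{bmatrix}
\mathbf{M}_{nn}^{-1}
\begin{bmatrix} -\mathbf{G}_{nc}\mathbf{G}_{cc}^{-1} & \mathbf{I} \end{bmatrix}.
\end{aligned}
```

Only the core block is inverted directly, and M is diagonal, which is where the computational saving comes from. The error term depends on which animals are in the core, so a different core set of the same size gives a different approximate inverse; this is the mechanism the abstract describes as the source of the fluctuations.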


2018 ◽  
Vol 98 (3) ◽  
pp. 565-575 ◽  
Author(s):  
Mario L. Piccoli ◽  
Luiz F. Brito ◽  
José Braccini ◽  
Fernanda V. Brito ◽  
Fernando F. Cardoso ◽  
...  

The statistical methods used in genetic evaluations are a key component of the process and can be best compared by using simulated data. This is especially true in grazing beef cattle production systems, where the number of proven bulls with highly reliable estimated breeding values is too limited to allow for a trustworthy validation of genomic predictions. Therefore, we simulated data for 4980 beef cattle, aiming to compare single-step genomic best linear unbiased prediction (ssGBLUP), which simultaneously incorporates pedigree, phenotypic, and genomic data into genomic evaluations, with two-step GBLUP (tsGBLUP) procedures, and to compare methods for blending genomic estimated breeding values (GEBVs). The greatest increases in GEBV accuracies compared with the parents’ average estimated breeding values (EBVPA) were 0.364 and 0.341 for ssGBLUP and tsGBLUP, respectively. Direct genomic value and GEBV accuracies when using ssGBLUP and tsGBLUP procedures were similar, except for the GEBV accuracies using Hayes’ blending method in tsGBLUP. Genomic predictions from ssGBLUP or tsGBLUP (using VanRaden’s blending method) showed either no significant bias or only slight bias, indicating that these predictions are on the same scale as the true breeding values. Overall, genetic evaluations including genomic information resulted in gains in accuracy >100% compared with the EBVPA. In addition, there were no significant differences between the animals selected (10% of males and 50% of females) by using ssGBLUP or tsGBLUP.


2012 ◽  
Vol 52 (3) ◽  
pp. 126 ◽  
Author(s):  
Andrew A. Swan ◽  
David J. Johnston ◽  
Daniel J. Brown ◽  
Bruce Tier ◽  
Hans-U. Graser

Genomic information has the potential to change the way beef cattle and sheep are selected and to substantially increase genetic gains. Ideally, genomic data will be used in combination with pedigree and phenotypic data to increase the accuracy of estimated breeding values (EBVs) and selection indexes. The first example of this in Australia was the integration of four markers for tenderness into beef cattle breeding values. Subsequently, the availability of high-density single nucleotide polymorphism (SNP) panels has made selection using genomic information possible, while at the same time creating significant challenges for genetic evaluation with regard to both data management and statistical modelling. Reference populations have been established in both the beef cattle and sheep industries, in which an extensive range of phenotypes have been collected and animals genotyped mainly using 50K SNP panels. From this information, genomic predictions of breeding value have been developed, albeit with varying levels of accuracy. These predictions have been incorporated into routine genetic evaluations using three approaches and trial results are now available to breeders. In the first, genomic predictions have been included in genetic evaluation models as additional traits. The challenges with this method have been the construction of consistent genetic covariance matrices, and a significant increase in computing time. The second approach has been to use a selection index procedure to blend genomic predictions with existing EBVs. This method has been shown to produce very similar results, and has the advantage of being simple to implement and fast to operate, although consistent genetic covariance matrices are still required. Third, in sheep a single-step analysis combining a genomic relationship matrix with a standard pedigree-based relationship matrix has been used to estimate breeding values for carcass and eating-quality traits. 
It is likely that this procedure or one similar will be incorporated into routine evaluations in the near future. While significant progress has been made in implementing methods of integrating genomic information in both beef and sheep evaluations in Australia, the major challenges for the future will be to continue to collect the phenotypes needed to derive accurate genomic predictions, and in managing much larger volumes of genomic data as the number of animals genotyped and the density of markers increase.


2021 ◽  
Vol 12 ◽  
Author(s):  
Andre C. Araujo ◽  
Paulo L. S. Carneiro ◽  
Hinayah R. Oliveira ◽  
Flavio S. Schenkel ◽  
Renata Veroneze ◽  
...  

The level of genetic diversity in a population is inversely proportional to the linkage disequilibrium (LD) between individual single nucleotide polymorphisms (SNPs) and quantitative trait loci (QTLs), leading to lower predictive ability of genomic breeding values (GEBVs) in highly genetically diverse populations. Haplotype-based predictions could outperform individual SNP predictions by better capturing the LD between SNP and QTL. Therefore, we aimed to evaluate the accuracy and bias of individual-SNP- and haplotype-based genomic predictions under the single-step genomic best linear unbiased prediction (ssGBLUP) approach in genetically diverse populations. We simulated purebred and composite sheep populations using literature parameters for moderate and low heritability traits. The haplotypes were created based on LD thresholds of 0.1, 0.3, and 0.6. Pseudo-SNPs from unique haplotype alleles were used to create the genomic relationship matrix (G) in the ssGBLUP analyses. Alternative scenarios were compared in which the pseudo-SNPs were combined with non-LD clustered SNPs, only pseudo-SNPs, or haplotypes fitted in a second G (two relationship matrices). The GEBV accuracies for the moderate heritability-trait scenarios fitting individual SNPs ranged from 0.41 to 0.55 and with haplotypes from 0.17 to 0.54 in the more (Ne ≅ 450) and less (Ne < 200) genetically diverse populations, respectively, and the bias fitting individual SNPs or haplotypes ranged between −0.14 and −0.08 and from −0.62 to −0.08, respectively. For the low heritability-trait scenarios, the GEBV accuracies fitting individual SNPs ranged from 0.24 to 0.32, and for fitting haplotypes, it ranged from 0.11 to 0.32 in the more (Ne ≅ 250) and less (Ne ≅ 100) genetically diverse populations, respectively, and the bias ranged between −0.36 and −0.32 and from −0.78 to −0.33 fitting individual SNPs or haplotypes, respectively.
The lowest accuracies and largest biases were observed fitting only pseudo-SNPs from blocks constructed with an LD threshold of 0.3 (p < 0.05), whereas the best results were obtained using only SNPs or the combination of independent SNPs and pseudo-SNPs in one or two G matrices, in both heritability levels and all populations regardless of the level of genetic diversity. In summary, haplotype-based models did not improve the performance of genomic predictions in genetically diverse populations.
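One way to picture "pseudo-SNPs from unique haplotype alleles" is to count, for each animal, how many copies of each distinct haplotype it carries within a block. The sketch below assumes that 0/1/2 copy-count coding; the abstract does not describe the coding in detail, and the function name and toy haplotypes are hypothetical.

```python
def pseudo_snps(haplotype_pairs):
    """Code one haplotype block as pseudo-SNPs.

    haplotype_pairs: one (first, second) pair of phased haplotype strings
    per animal. Returns the sorted unique haplotype alleles and, per animal,
    the number of copies (0, 1 or 2) of each allele.
    """
    alleles = sorted({h for pair in haplotype_pairs for h in pair})
    counts = [[pair.count(a) for a in alleles] for pair in haplotype_pairs]
    return alleles, counts


# Hypothetical phased haplotypes for one LD block in four animals.
block = [("AAG", "AAG"), ("AAG", "CTG"), ("CTG", "ATG"), ("ATG", "ATG")]
alleles, counts = pseudo_snps(block)
```

Each row of `counts` sums to 2, since every animal carries two haplotypes per block. Columns of this kind could then be used in place of, or alongside, ordinary SNP columns when building G, which corresponds to the alternative scenarios compared in the abstract.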


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 254-254
Author(s):  
Matias Bermann ◽  
Daniela Lourenco ◽  
Vivian Breen ◽  
Rachel Hawken ◽  
Fernando Brito Lopes ◽  
...  

Abstract The objectives of this study were to model the inclusion of a group of external birds into a local broiler chicken population for the purpose of genomic evaluations and to evaluate the behavior of two accuracy estimators under different model specifications. The pedigree comprised 242,413 birds, and genotypes were available for 107,216 birds. A five-trait model that included one growth, two yield, and two efficiency traits was used for the analyses. The strategies to model the introduction of external birds were to include a fixed effect representing the origin of parents and to use unknown parent groups (UPG) or metafounders. Genomic estimated breeding values (GEBV) were obtained with single-step GBLUP (ssGBLUP) using the Algorithm for Proven and Young (APY). Bias, dispersion, and accuracy of GEBV for the validation birds, i.e., those from the most recent generation, were computed. The bias and dispersion were estimated with the LR-method, whereas accuracy was estimated by the LR-method and predictive ability. Models with fixed UPG and estimated inbreeding or random UPG resulted in similar GEBV. The inclusion of an extra fixed effect in the model made the GEBV unbiased and reduced the inflation, while models without such an effect were significantly biased. Genomic predictions with metafounders were slightly biased and inflated due to the unbalanced number of observations assigned to each metafounder. When combining local and external populations, the greatest accuracy and smallest bias can be obtained by adding an extra fixed effect to account for the origin of parents plus UPG with estimated inbreeding or random UPG. To estimate the accuracy, the LR-method is more consistent among models, whereas predictive ability greatly depends on the model specification, that is, on the fixed effects included in the model. When changing model specification, the largest variation for the LR-method was 20%, whereas for predictive ability it was 110%.
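The LR-method statistics mentioned above compare breeding values for the same validation animals estimated from a partial dataset and from the whole dataset. The sketch below follows the usual definitions of bias, dispersion, and the correlation between the two sets of estimates; it is an illustration under those assumed definitions, not the authors' code, and the toy values and function names are hypothetical.

```python
from math import sqrt


def mean(x):
    return sum(x) / len(x)


def cov(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def lr_statistics(ebv_partial, ebv_whole):
    """Bias, dispersion and correlation between partial- and whole-data EBV.

    bias:       mean(partial) - mean(whole), expected 0 if unbiased
    dispersion: slope of whole on partial, expected 1 (below 1 suggests inflation)
    ratio:      correlation between the two sets, read as a ratio of accuracies
    """
    bias = mean(ebv_partial) - mean(ebv_whole)
    dispersion = cov(ebv_whole, ebv_partial) / cov(ebv_partial, ebv_partial)
    ratio = cov(ebv_whole, ebv_partial) / sqrt(
        cov(ebv_whole, ebv_whole) * cov(ebv_partial, ebv_partial)
    )
    return bias, dispersion, ratio


# Hypothetical EBV for four validation birds; whole-data values are shifted by 0.1.
partial = [0.1, 0.4, -0.2, 0.3]
whole = [0.2, 0.5, -0.1, 0.4]
bias, dispersion, ratio = lr_statistics(partial, whole)
```

The LR-method accuracy itself additionally needs the additive genetic variance and the average inbreeding of the validation animals, which are not given in the abstract, so it is left out of this sketch.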

