Efficient approximation of reliabilities for single-step genomic BLUP models with the Algorithm for Proven and Young

Author(s):  
M Bermann ◽  
D Lourenco ◽  
I Misztal

Abstract The objectives of this study were to develop an efficient algorithm for calculating prediction error variances (PEV) for GBLUP models using the Algorithm for Proven and Young (APY), to extend it to single-step GBLUP (ssGBLUP), and to apply this algorithm for approximating the theoretical reliabilities for single- and multiple-trait models in ssGBLUP. The PEV with APY was calculated by block-sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. The large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analysis, respectively, in Data1 and 335,325 in Data2. Correlations between estimated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the largest dataset (Data2), the elapsed time for the reliability estimation was eleven minutes. The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model. This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
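
As a point of reference for the "exact" reliabilities mentioned above, the following is a minimal sketch of how reliabilities can be obtained by directly inverting the left-hand side of the mixed model equations for a toy single-trait GBLUP model. It is not the APY block-sparse algorithm from the study. The function name `exact_reliabilities`, the toy genotypes, the variance components, and the 0.01 added to the diagonal of G are all illustrative assumptions.

```python
import numpy as np

def exact_reliabilities(X, Z, G, sigma2_u, sigma2_e):
    """Reliabilities from direct inversion of the MME left-hand side.

    Hypothetical helper for illustration only; feasible for small data.
    """
    lam = sigma2_e / sigma2_u                 # variance ratio
    G_inv = np.linalg.inv(G)
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * G_inv],
    ])
    C = np.linalg.inv(lhs)                    # inverse of the left-hand side
    n_fixed = X.shape[1]
    pev = np.diag(C)[n_fixed:] * sigma2_e     # prediction error variances
    # reliability = 1 - PEV / prior genetic variance of each animal
    return 1.0 - pev / (np.diag(G) * sigma2_u)

# Toy data (assumed): 5 animals, 50 markers, one record per animal.
rng = np.random.default_rng(0)
n = 5
M = rng.integers(0, 3, size=(n, 50)).astype(float)
Zc = M - M.mean(axis=0)                       # column-centered genotypes
G = Zc @ Zc.T / Zc.shape[1] + 0.01 * np.eye(n)  # small diagonal term keeps G invertible
X = np.ones((n, 1))                           # overall mean as the only fixed effect
Z = np.eye(n)                                 # each animal has one record
rel = exact_reliabilities(X, Z, G, sigma2_u=0.3, sigma2_e=0.7)
print(rel)
```

The cost of the dense inverse grows cubically with the number of equations, which is why an approximation is needed for datasets of the size of Data2.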

2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Øyvind Nordbø ◽  
Arne B. Gjuvsland ◽  
Leiv Sigbjørn Eikje ◽  
Theo Meuwissen

Abstract Background: The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions. Results: Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors. Conclusions: Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.


2019 ◽  
Vol 97 (12) ◽  
pp. 4761-4769 ◽  
Author(s):  
Pâmela A Alexandre ◽  
Laercio R Porto-Neto ◽  
Emre Karaman ◽  
Sigrid A Lehnert ◽  
Antonio Reverter

Abstract The growing concern with the environment is making it important for livestock producers to focus on selection for efficiency-related traits, which is a challenge for commercial cattle herds due to the lack of pedigree information. To explore a cost-effective opportunity for genomic evaluations of commercial herds, this study compared the accuracy of bulls’ genomic estimated breeding values (GEBV) using different pooled genotype strategies. We used ten replicates of previously simulated genomic and phenotypic data for one low (t1) and one moderate (t2) heritability trait of 200 sires and 2,200 progeny. Sires’ GEBV were calculated using a univariate mixed model, with a hybrid genomic relationship matrix (h-GRM) relating sires to: 1) 1,100 pools of 2 animals; 2) 440 pools of 5 animals; 3) 220 pools of 10 animals; 4) 110 pools of 20 animals; 5) 88 pools of 25 animals; 6) 44 pools of 50 animals; and 7) 22 pools of 100 animals. Pooling criteria were: at random, grouped sorting by t1, grouped sorting by t2, and grouped sorting by a combination of t1 and t2. The same criteria were used to select 110, 220, 440, and 1,100 individual genotypes for GEBV calculation to compare GEBV accuracy using the same number of individual genotypes and pools. Although the best accuracy was achieved for a given trait when pools were grouped based on that same trait (t1: 0.50–0.56, t2: 0.66–0.77), pooling by one trait impacted negatively on the accuracy of GEBV for the other trait (t1: 0.25–0.46, t2: 0.29–0.71). Therefore, the combined measure may be a feasible alternative to use the same pools to calculate GEBVs for both traits (t1: 0.45–0.57, t2: 0.62–0.76). Pools of 10 individuals were identified as representing a good compromise between loss of accuracy (~10%–15%) and cost savings (~90%) from genotype assays. In addition, we demonstrated that in more than 90% of the simulations, pools present higher sires’ GEBV accuracy than individual genotypes when the number of genotype assays is limited (i.e., 110 or 220) and animals are assigned to pools based on phenotype. Pools assigned at random presented the poorest results (t1: 0.07–0.45, t2: 0.14–0.70). In conclusion, pooling by phenotype is the best approach to implementing genomic evaluation using commercial herd data, particularly when pools of 10 individuals are evaluated. While combining phenotypes seems a promising strategy to allow more flexibility to the estimates made using pools, more studies are necessary in this regard.


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-50
Author(s):  
Yvette Steyn ◽  
Daniela Lourenco ◽  
Ignacy Misztal

Abstract Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious IBS relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted to multi-breed evaluations where all SNP are shared, i.e., the same SNP, and to single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) BV. Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45K resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9K resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.


2021 ◽  
Author(s):  
Mitchell J. Feldmann ◽  
Hans-Peter Piepho ◽  
Steven J. Knapp

Many important traits in plants, animals, and microbes are polygenic and are therefore difficult to improve through traditional marker-assisted selection. Genomic prediction addresses this by enabling the inclusion of all genetic data in a mixed model framework. The main method for predicting breeding values is genomic best linear unbiased prediction (GBLUP), which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. The use of relationship matrices allows information to be shared for estimating the genetic values for observed entries and predicting genetic values for unobserved entries. One of the key parameters of such models is genomic heritability (h2g), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms. Here we discuss the relationship between several common methods for calculating the genomic relationship matrix and propose a new matrix based on the average semivariance that yields accurate estimates of genomic variance in the observed population regardless of the focal population quality as well as accurate breeding value predictions in unobserved samples. Notably, our proposed method is highly similar to the approach presented by Legarra (2016) despite different mathematical derivations and statistical perspectives and only deviates from the classic approach presented in VanRaden (2008) by a scaling factor. With current approaches, we found that the genomic heritability tends to be either over- or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population and that, unlike its predecessors, our newly proposed kinship matrix KASV yields accurate estimates of h2g in the observed population, generalizes to larger populations, and produces BLUPs equivalent to common methods in plants and animals.
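
To make the role of centering and scaling concrete, here is a minimal sketch of the classic VanRaden (2008) genomic relationship matrix that the abstract uses as its point of comparison. It does not implement the proposed average-semivariance matrix. The function name `vanraden_grm`, the simulated genotypes, and the use of observed allele frequencies for centering are assumptions made for illustration.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix in the form G = ZZ' / (2 * sum p(1-p)).

    M: (n_individuals, n_markers) allele counts coded 0/1/2.
    Allele frequencies are taken from the observed sample (an assumption).
    """
    p = M.mean(axis=0) / 2.0                  # observed allele frequencies
    Z = M - 2.0 * p                           # center each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulated genotypes (assumed): 20 individuals, 200 markers.
rng = np.random.default_rng(1)
freqs = rng.uniform(0.1, 0.9, size=200)
M = rng.binomial(2, freqs, size=(20, 200)).astype(float)

G = vanraden_grm(M)
mean_diag = np.trace(G) / G.shape[0]          # average diagonal element of G
print(mean_diag)
```

Changing the centering or the denominator rescales G and shifts its average diagonal, which is the quantity the abstract identifies as influencing the genomic heritability estimate.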


Author(s):  
Natalia S Forneris ◽  
Carolina A Garcia-Baccino ◽  
Rodolfo J C Cantet ◽  
Zulma G Vitezica

Abstract Inbreeding depression reduces mean phenotypic value of important traits in livestock populations. The goal of this work was to estimate the level of inbreeding and inbreeding depression for growth and reproductive traits in Argentinean Brangus cattle, in order to obtain a diagnosis and monitor breed management. Data comprised 359,257 animals (of which 1,990 were genotyped for 40,678 SNP) with phenotypic records for at least one of three growth traits: birth weight (BW), weaning weight (WW) and finishing weight (FW). For scrotal circumference (SC), 52,399 phenotypic records (of which 256 had genotypes) were available. There were 530,938 animals in the pedigree. Three methods to estimate inbreeding coefficients were used. Pedigree-based inbreeding coefficients were estimated accounting for missing parents. Inbreeding coefficients combining genotyped and nongenotyped animal information were also computed from matrix H of the single-step approach. Genomic inbreeding coefficients were estimated using homozygous segments obtained from a Hidden Markov model (HMM) approach. Inbreeding depression was estimated from the regression of the phenotype on inbreeding coefficients in a multiple-trait mixed model framework, either for the whole data set or the data set of genotyped animals. All traits were unfavorably affected by inbreeding depression. A 10% increase in pedigree-based or combined inbreeding would result in a reduction of 0.34 - 0.39 kg in BW, of 2.77 - 3.28 kg in WW and 0.23 cm in SC. For FW a 10% increase in pedigree-based, genomic or combined inbreeding would result in a decrease of 8.05 - 11.57 kg. Genomic inbreeding based on the HMM was able to capture inbreeding depression, even in such a compressed genotyped data set.


1985 ◽  
Vol 36 (3) ◽  
pp. 527 ◽  
Author(s):  
H-U Graser ◽  
K Hammond

A multiple-trait mixed model is defined for regular use in the Australian beef industry for the estimation of breeding values for continuous traits of sires used non-randomly across a number of herds and/or years. Maternal grandsires, the numerator relationship matrix, appropriate fixed effects, and the capacity to partition direct and maternal effects are incorporated in this parent model. The model was fitted to the National Beef Recording Scheme's data bank for three growth traits of the Australian Simmental breed, viz 200-, 365- and 550-day weights. Estimates are obtained for the effects of sex, dam age, grade of dam, age of calf and breed of base dam. The range in estimated breeding value is reported for each trait, with 200-day weight being partitioned into 'calves' and 'daughters' calves', for the Simmental sires commonly used in Australia. Estimates of the fixed effects were large, and dam age, grade of dam and breed of base dam had an important influence on growth to 365 days of age. The faster growth of higher percentage Simmental calves to 200 days continued to 550 days. Estimates of genetic variance for the traits were lower than reported for overseas populations of Simmental cattle, and the genetic covariance between direct and maternal effects for 200-day weight was slightly positive.


Genes ◽  
2020 ◽  
Vol 11 (7) ◽  
pp. 790 ◽  
Author(s):  
Daniela Lourenco ◽  
Andres Legarra ◽  
Shogo Tsuruta ◽  
Yutaka Masuda ◽  
Ignacio Aguilar ◽  
...  

Single-step genomic evaluation has become a standard procedure in livestock breeding, and the main reason is the ability to combine all pedigree, phenotypes, and genotypes available into one single evaluation, without the need for post-analysis processing. Therefore, the incorporation of data on genotyped and non-genotyped animals in this method is straightforward. Since 2009, two main implementations of single-step have been proposed. One is called single-step genomic best linear unbiased prediction (ssGBLUP) and uses single nucleotide polymorphisms (SNP) to construct the genomic relationship matrix; the other is the single-step Bayesian regression (ssBR), which is a marker effect model. Under the same assumptions, both models are equivalent. In this review, we focus solely on ssGBLUP. The implementation of ssGBLUP into the BLUPF90 software suite was done in 2009, and since then, several changes were made to make ssGBLUP flexible to any model, number of traits, number of phenotypes, and number of genotyped animals. Single-step GBLUP from the BLUPF90 software suite has been used for genomic evaluations worldwide. In this review, we will show theoretical developments and numerical examples of ssGBLUP using SNP data from regular chips to sequence data.
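
For readers unfamiliar with how ssGBLUP combines pedigree and genotypes, the inverse of the combined relationship matrix is commonly written in the form below. The notation is not taken from the abstract and is an assumption here: A is the pedigree relationship matrix, A_{22} is its block for genotyped animals, and G is the genomic relationship matrix.

```latex
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix}
```

Only the block for genotyped animals differs from the pedigree-based inverse, which is why non-genotyped animals enter the evaluation without any post-analysis step.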

