Efficient computation of the genomic relationship matrix and other matrices used in single-step evaluation

2011 ◽  
Vol 128 (6) ◽  
pp. 422-428 ◽  
Author(s):  
I. Aguilar ◽  
I. Misztal ◽  
A. Legarra ◽  
S. Tsuruta
2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-50
Author(s):  
Yvette Steyn ◽  
Daniela Lourenco ◽  
Ignacy Misztal

Abstract Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious identity-by-state (IBS) relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted to multi-breed evaluations where all SNP are shared, i.e., the same SNP, and to single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) breeding values (BV). Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45k resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9k resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.


2020 ◽  
Vol 60 (9) ◽  
pp. 1136
Author(s):  
M. A. Nilforooshan

Context In New Zealand, Romney is the predominant breed and is reared as a dual-purpose sheep. The number of genotypes is rapidly increasing in the sheep population, and making use of both genotypes and pedigree information is of importance for genetic evaluations. Single-step genomic best linear unbiased prediction (ssGBLUP) is a method for simultaneous prediction of genetic merits for genotyped and non-genotyped animals. The combination and the compatibility of the genomic relationship matrix (G) and the pedigree relationship matrix for genotyped animals (A22) are important for unbiased ssGBLUP. Aims The aim of the present study was to find an optimum genetic relationship matrix for ssGBLUP weaning-weight evaluation of Romney sheep in New Zealand. Methods Data consisted of adjusted weaning weights for 2422011 sheep, 50K single-nucleotide polymorphism genotypes for 13304 animals and 3028688 animals in the pedigree. Blending of G and A22 was tested with weights (k) ranging from 0.2 to 0.99 (kG + (1 – k)A22), followed by none or one of the three methods of tuning G to A22. Key results The averages of G and A22 were close to each other for overall, diagonal and off-diagonal elements. Therefore, differently tuned G performed similarly. However, elements of G showed larger variation than did the elements of A22 and, on average, genotyped animals were less related in G than in A22. Correlations between genomic estimated breeding values (GEBV) for the top 500 genotyped animals, as well as the rank correlations, were almost 1 among ssGBLUP evaluations using tuned G. The corresponding correlations with BLUP evaluations were increased by blending G with a larger proportion of A22, and were further increased by tuning G, indicating improved compatibility between G and A22. Blending and tuning G suppressed the inflation and bias of GEBV and moved the genetic trend closer to the genetic trend obtained from BLUP. Conclusions A combination of blending and tuning G to A22, with a blending rate of 0.5 at most, is recommended for weaning weight of Romney sheep in New Zealand. Failure to do so resulted in inflated GEBV that can reduce the accuracy of selection, especially for genotyped animals. Implications There is a growing interest in the single-step GBLUP method for simultaneous genetic evaluation of genotyped and non-genotyped animals, in which genomic and pedigree relationship matrices are admixed. Using data from New Zealand Romney sheep, we have shown that adjustment of the genomic relationship matrix on the basis of the pedigree relationship matrix is necessary to avoid inflated evaluations. Improving the compatibility between genomic and pedigree relationship matrices is important for obtaining accurate and unbiased single-step GBLUP evaluations.
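
The blending step in this abstract, kG + (1 – k)A22, can be illustrated with a minimal NumPy sketch. The function name `blend` and the 3-animal matrices are hypothetical and made up for illustration; the tuning methods are not specified in the abstract and are not shown.

```python
import numpy as np

def blend(G, A22, k):
    """Return k*G + (1 - k)*A22 for a blending weight k in [0, 1]."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if G.shape != A22.shape:
        raise ValueError("G and A22 must have the same shape")
    return k * G + (1.0 - k) * A22

# Toy matrices for three genotyped animals (illustrative values only).
G = np.array([[1.02, 0.48, 0.05],
              [0.48, 0.97, 0.22],
              [0.05, 0.22, 1.05]])
A22 = np.array([[1.00, 0.50, 0.00],
                [0.50, 1.00, 0.25],
                [0.00, 0.25, 1.00]])

# k = 0.5 is the largest weight on G that the abstract recommends.
G_blended = blend(G, A22, k=0.5)
```

With k = 0.5 every element of the result is the simple average of the corresponding elements of G and A22; smaller k pulls the matrix further toward the pedigree relationships.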


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 6-7
Author(s):  
Andre Garcia ◽  
Ignacio Aguilar ◽  
Andres Legarra ◽  
Stephen P Miller ◽  
Shogo Tsuruta ◽  
...  

Abstract With an ever-increasing number of genotyped animals, there is a question of whether to include all genotypes in single-step GBLUP (ssGBLUP) evaluations or to include only genotyped animals with phenotypes and use indirect predictions (IP) for the remaining young genotyped animals. Under ssGBLUP, SNP effects can be backsolved from GEBV, and IP can be calculated as the sum of SNP effects weighted by the gene content. To publish IP, a measure of accuracy that reflects the standard error of prediction, and that is comparable to GEBV accuracy, is needed. Our first objective was to test formulas to compute accuracy of IP by backsolving prediction error covariance (PEC) of GEBV into PEC of SNP effects. The second objective was to investigate the number of genotyped animals needed to obtain robust IP accuracy. Data were provided by the American Angus Association, with 38,000 post-weaning gain phenotypes and 60,000 genotyped animals. Correlations between GEBV and IP were ≥0.99. When all genotyped animals were used for PEC computations, accuracy correlations were also ≥0.99. Additionally, GEBV and IP accuracies were compatible, whether the inverse of the genomic relationship matrix (G) was obtained by direct inversion or with the algorithm for proven and young (APY). As the number of genotyped animals in PEC computations decreased to 15,000, accuracy correlations were still high (≥0.96), but IP accuracies were biased downwards. Indirect prediction accuracy can be successfully obtained from ssGBLUP without running an extra SNP-BLUP evaluation to compute SNP PEC. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be slightly underestimated. When the amount of genomic and phenotypic data is large, the polygenic part of GEBV becomes small and IP can be very accurate. Further research is needed to approximate SNP PEC with a large number of genotyped animals.
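
The backsolving of SNP effects from GEBV and the computation of IP as gene content weighted by SNP effects can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: G is taken as ZZ'/(2Σp(1−p)) with no blending or polygenic part, allele frequencies `p` are assumed known base frequencies, and the genotypes and GEBV are simulated placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_snp = 6, 200

# Assumed base allele frequencies and simulated gene content (0/1/2).
p = rng.uniform(0.1, 0.9, n_snp)
M = rng.binomial(2, p, size=(n_animals, n_snp)).astype(float)

# Centered genotypes and a VanRaden-type G (assumption of this sketch).
Z = M - 2.0 * p
scale = 2.0 * np.sum(p * (1.0 - p))
G = Z @ Z.T / scale

# Placeholder GEBV; in practice these come from the evaluation.
gebv = rng.normal(size=n_animals)

# Backsolve SNP effects: u_hat = Z' G^{-1} gebv / scale.
snp_effects = Z.T @ np.linalg.solve(G, gebv) / scale

# IP for the animals in G: centered gene content times SNP effects.
ip_reference = Z @ snp_effects

# IP for young genotyped animals that were not in the evaluation.
M_young = rng.binomial(2, p, size=(3, n_snp)).astype(float)
ip_young = (M_young - 2.0 * p) @ snp_effects
```

Under these assumptions the IP of the animals used to build G reproduce their GEBV exactly, because Z Z' G⁻¹ / scale is the identity. With blending or a polygenic component the match is only approximate, which is consistent with the abstract's remark about the polygenic part of GEBV.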


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-49
Author(s):  
Andre Garcia ◽  
Yutaka Masuda ◽  
Stephen P Miller ◽  
Ignacy Misztal ◽  
Daniela Lourenco

Abstract With the increasing number of genotyped animals, the algorithm for proven and young (APY) can be used to compute the inverse of the genomic relationship matrix (G⁻¹_APY) in genomic BLUP (GBLUP) and single-step GBLUP (ssGBLUP). This algorithm also allows the use of all genotyped animals to calculate SNP effects from genomic EBV (GEBV), which can then be used to obtain indirect predictions (IP) for interim evaluations, or as genomic predictions for animals not included in official evaluations. The objective of the study was to evaluate the quality of IP from GBLUP with an increasing number of genotyped animals. Birth weight, weaning weight and post-weaning gain phenotypes and genotypes were provided by the American Angus Association. Phenotypes and genotypes were divided into 3 scenarios based on birth year: genotyped animals born up to 2013 (114,937), 2014 (183,847) and 2015 (280,506). A 3-trait model was fit and GBLUP with APY was used to calculate GEBV and SNP effects. To calculate G⁻¹_APY, 19,021 core animals were randomly sampled from animals born up to 2013. Core animals remained the same, whereas the number of non-core animals increased as more genotyped animals were added. Additional analyses had updated core animals for each scenario. SNP effects were also calculated based on G⁻¹_APY and on G⁻¹ only for core animals (G⁻¹_core). IP were computed for all animals in each scenario by multiplying the centered genotypes by the SNP effects. To assess the quality of IP, the correlation between IP and GEBV was calculated. Correlations were greater than 0.99 for all traits in all scenarios. Despite the increase of non-core animals in APY, GEBV were successfully retrieved from SNP effects using IP. When SNP effects were calculated based on G⁻¹_core, updating the core animals as the number of genotyped animals increases seems to be the best choice.
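
The APY inverse mentioned in this abstract can be sketched in dense form with NumPy. The sketch uses the commonly stated APY block formula, in which only the core block of G is inverted and each non-core animal contributes one diagonal element; the function name `apy_inverse` and the toy data are hypothetical, and a production implementation would use sparse storage rather than a full matrix.

```python
import numpy as np

def apy_inverse(G, core):
    """Dense sketch of the APY inverse of G for a given set of core indices."""
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                      # Gcc^{-1} Gcn

    # Diagonal of Mnn: g_ii - g_ic Gcc^{-1} g_ci for each non-core animal i.
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    G_inv[np.ix_(core, noncore)] = -P / m
    G_inv[np.ix_(noncore, core)] = (-P / m).T
    G_inv[noncore, noncore] = 1.0 / m      # sets the non-core diagonal
    return G_inv

# Toy positive-definite G for five animals (simulated, illustrative only).
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 40))
G = Z @ Z.T / 40 + 0.01 * np.eye(5)

# Fixed toy core; the study sampled core animals at random.
G_inv_apy = apy_inverse(G, core=[0, 1, 2, 3])
```

With a single non-core animal the APY inverse coincides with the regular inverse, which makes a convenient sanity check. With several non-core animals it is an approximation, because relationships among non-core animals conditional on the core are ignored.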


2019 ◽  
Vol 97 (11) ◽  
pp. 4418-4427 ◽  
Author(s):  
Yvette Steyn ◽  
Daniela A L Lourenco ◽  
Ignacy Misztal

Abstract Combining breeds in a multibreed evaluation can have a negative impact on prediction accuracy, especially if single nucleotide polymorphism (SNP) effects differ among breeds. The aim of this study was to evaluate the use of a multibreed genomic relationship matrix (G), where SNP effects are considered to be unique to each breed, that is, nonshared. This multibreed G was created by treating SNP of different breeds as if they were on nonoverlapping positions on the chromosome, although, in reality, they were not. This simple setup may avoid spurious identity by state (IBS) relationships between breeds and automatically considers breed-specific allele frequencies. This scenario was contrasted to a regular multibreed evaluation where all SNPs were shared, that is, the same position, and to single-breed evaluations. Different SNP densities (9k and 45k) and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that quantitative trait locus (QTL) effects were the same over all breeds. For the recent population, generations 1–9 had approximately half of the animals genotyped, whereas all animals in generation 10 were genotyped. Generation 10 animals were set for validation; therefore, each breed had a validation group. Analyses were performed using single-step genomic best linear unbiased prediction. Prediction accuracy was calculated as the correlation between true (T) and genomic estimated breeding values (GEBV). Accuracies of GEBV were lower for the larger Ne and low SNP density. All three evaluation scenarios using 45k resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multibreed evaluation using 9k resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.12 for a larger Ne. This loss was mostly avoided when markers were treated as nonshared within the same G matrix. A G matrix with nonshared SNP enables multibreed evaluations without considerably changing accuracy, especially with limited information per breed.
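
One possible reading of the nonshared-SNP setup is sketched below: each breed's genotypes are placed in their own block of columns, centered with breed-specific allele frequencies, and the columns belonging to other breeds are left at zero. The zero fill for other breeds' columns, the per-breed scaling by 2Σp(1−p), and the function name `nonshared_G` are assumptions of this sketch; the abstract does not give these details.

```python
import numpy as np

def nonshared_G(M, breed):
    """Multibreed G with breed-specific (nonshared) SNP columns.

    M     : (n_animals, n_snp) gene content coded 0/1/2
    breed : length n_animals array of breed labels
    """
    breed = np.asarray(breed)
    labels = np.unique(breed)
    n, m = M.shape
    W = np.zeros((n, m * len(labels)))
    for b, label in enumerate(labels):
        rows = np.flatnonzero(breed == label)
        p = M[rows].mean(axis=0) / 2.0          # breed-specific frequencies
        scale = 2.0 * np.sum(p * (1.0 - p))     # assumed per-breed scaling
        # Each breed only fills its own column block; other blocks stay 0.
        W[rows, b * m:(b + 1) * m] = (M[rows] - 2.0 * p) / np.sqrt(scale)
    return W @ W.T

# Simulated toy data: two breeds, four animals each, 30 SNP.
rng = np.random.default_rng(2)
M = np.vstack([rng.binomial(2, 0.5, size=(4, 30)),
               rng.binomial(2, 0.3, size=(4, 30))]).astype(float)
breed = np.array(["A"] * 4 + ["B"] * 4)
G_mb = nonshared_G(M, breed)
```

Under these assumptions the genomic relationships between animals of different breeds are exactly zero, so no IBS sharing is counted across breeds, and each within-breed block equals the single-breed G built from that breed's own allele frequencies.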


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Pascal Duenk ◽  
Mario P. L. Calus ◽  
Yvonne C. J. Wientjes ◽  
Vivian P. Breen ◽  
John M. Henshall ◽  
...  

Following publication of original article [1], we noticed that there was an error: Eq. (3) on page 5 is the genomic relationship matrix that


2018 ◽  
Vol 53 (6) ◽  
pp. 717-726 ◽  
Author(s):  
Michel Marques Farah ◽  
Marina Rufino Salinas Fortes ◽  
Matthew Kelly ◽  
Laercio Ribeiro Porto-Neto ◽  
Camila Tangari Meira ◽  
...  

Abstract: The objective of this work was to evaluate the effects of genomic information on the genetic evaluation of hip height in Brahman cattle using different matrices built from genomic and pedigree data. Hip height measurements from 1,695 animals, genotyped with high-density SNP chip or imputed from 50 K high-density SNP chip, were used. The numerator relationship matrix (NRM) was compared with the H matrix, which incorporated the NRM and genomic relationship (G) matrix simultaneously. The genotypes were used to estimate three versions of G: observed allele frequency (HGOF), average minor allele frequency (HGMF), and frequency of 0.5 for all markers (HG50). For matrix comparisons, animal data were either used in full or divided into calibration (80% older animals) and validation (20% younger animals) datasets. The accuracy values for the NRM, HGOF, and HG50 were 0.776, 0.813, and 0.594, respectively. The NRM and HGOF showed similar minor variances for diagonal and off-diagonal elements, as well as for estimated breeding values. The use of genomic information resulted in relationship estimates similar to those obtained based on pedigree; however, HGOF is the best option for estimating the genomic relationship matrix and results in a higher prediction accuracy. The ranking of the top 20% animals was very similar for all matrices, but the ranking within them varies depending on the method used.
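
The three allele-frequency choices compared in this abstract can be illustrated with a short NumPy sketch. It assumes a VanRaden-type G = ZZ'/(2Σp(1−p)) with Z = M − 2p and differs only in the frequency p used for centering and scaling; the function name `vanraden_G`, the way the average minor allele frequency is applied to all markers, and the simulated genotypes are assumptions, and the construction of H from G and the NRM is not shown.

```python
import numpy as np

def vanraden_G(M, p):
    """G = ZZ' / (2 * sum p(1-p)) with Z = M - 2p; p is a scalar or per-SNP."""
    p = np.broadcast_to(np.asarray(p, dtype=float), (M.shape[1],))
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulated gene content (0/1/2) for 8 animals and 100 SNP.
rng = np.random.default_rng(3)
true_p = rng.uniform(0.05, 0.95, 100)
M = rng.binomial(2, true_p, size=(8, 100)).astype(float)

p_obs = M.mean(axis=0) / 2.0                 # observed allele frequencies
maf = np.minimum(p_obs, 1.0 - p_obs)         # minor allele frequency per SNP

G_obs = vanraden_G(M, p_obs)                 # observed frequencies
G_maf = vanraden_G(M, maf.mean())            # average MAF for all markers
G_50 = vanraden_G(M, 0.5)                    # frequency 0.5 for all markers
```

When the observed frequencies are computed from the same animals, every column of Z sums to zero, so the rows of `G_obs` sum to zero as well. The other two choices do not have this property, which is one reason the resulting relationship estimates can differ in level.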

