Improving accuracy of direct and maternal genetic effects in genomic evaluations using pooled boar semen: a simulation study

Abstract Pooling semen of multiple boars is commonly used in swine production systems. Compared with single-boar systems, this technique changes the family structure by creating maternal half-sib families. The aim of this simulation study was to investigate how pooling semen affects the accuracy of estimating direct and maternal effects for individual piglet birth weight in purebred pigs. Different scenarios of pooling semen were simulated by allowing the same female to mate with 1 to 6 boars per insemination, whereas litter size was kept constant (N = 12). In each pooled boar scenario, genomic information was used either to construct the genomic relationship matrix (G) or to reconstruct the pedigree in addition to G. Genotypes were generated for 60,000 SNPs evenly distributed across 18 autosomes. From the 5 simulated generations, only animals from generations 3 to 5 were genotyped (N = 36,000). Direct and maternal true breeding values (TBV) were computed as the sum of the effects of the 1,080 QTLs. Phenotypes were constructed as the sum of direct TBV, maternal TBV, an overall mean of 1.25 kg, and a residual effect. The simulated heritabilities for direct and maternal effects were 0.056 and 0.19, respectively, and the genetic correlation between both effects was −0.25. All simulations were replicated 5 times. Variance components and direct and maternal heritability were estimated using average information REML. Predictions were computed via pedigree-based BLUP and single-step genomic BLUP (ssGBLUP). Genotyped littermates in the last generation were used for validation. Prediction accuracies were calculated as correlations between EBV and TBV for direct (accdirect) and maternal (accmat) effects. When boars were known, accdirect was 0.21 (1 boar) and 0.26 (6 boars) for BLUP, whereas for ssGBLUP it was 0.38 (1 boar) and 0.43 (6 boars). When boars were unknown, accdirect was lower in BLUP but similar in ssGBLUP. For the scenario with known boars, accmat was 0.58 and 0.63 for 1 and 6 boars, respectively, under ssGBLUP. For unknown boars, accmat was 0.63 for 2 boars and 0.62 for 6 boars in ssGBLUP. In general, accdirect and accmat were lower in the single-boar scenario than in the pooled semen scenarios, indicating that a half-sib structure is more adequate to estimate direct and maternal effects. Using pooled semen from multiple boars can help improve the accuracy of predicting maternal and direct effects when maternal half-sib families are larger than 2.
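The validation metric used in this abstract, accuracy as the correlation between EBV and TBV, can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function name `prediction_accuracy` and the simulated arrays are hypothetical and are not taken from the study.

```python
import numpy as np

def prediction_accuracy(ebv, tbv):
    """Pearson correlation between estimated (EBV) and true (TBV) breeding values."""
    ebv = np.asarray(ebv, dtype=float)
    tbv = np.asarray(tbv, dtype=float)
    if ebv.shape != tbv.shape:
        raise ValueError("ebv and tbv must have the same shape")
    return float(np.corrcoef(ebv, tbv)[0, 1])

# Illustrative data only: noisy predictions of simulated true breeding values.
rng = np.random.default_rng(1)
tbv = rng.normal(size=1000)
ebv = 0.4 * tbv + rng.normal(scale=0.9, size=1000)
acc = prediction_accuracy(ebv, tbv)
```

In a validation like the one described, `ebv` and `tbv` would be restricted to the validation animals (here, genotyped littermates of the last generation), once for direct and once for maternal effects.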

PSI-B-21 Alternative SNP weighting for multi-step and single-step genomic BLUP in the presence of causative variants

Journal of Animal Science ◽ 10.1093/jas/skab235.417 ◽ 2021 ◽ Vol 99 (Supplement_3) ◽ pp. 228-229

Author(s): Bruna Santana ◽ Molly Riser ◽ Breno O Fragomeni

Keyword(s): Simulated Data ◽ Snp Markers ◽ Single Step ◽ Breeding Value ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ True Breeding ◽ Non Linear ◽ Causal Variants ◽ True Breeding Value

Abstract This study aimed to evaluate the accuracy of genomic prediction with simulated data, using SNP markers, causal quantitative trait nucleotides (QTN), and the combination of both. The methods used were genomic best linear unbiased prediction (GBLUP) and single-step GBLUP (ssGBLUP), with alternative SNP weights. Data were simulated using the package AlphaSimR. A trait heritability of 0.3 was assumed, and genetic variance was fully accounted for by 100 or 1000 QTNs. A population with an effective size of 200 was selected, and 20 generations were simulated. The genomic information mimicked the 29 bovine chromosomes and included 50k SNP markers evenly distributed across the genome. Approximately 16800 genotypes were available from selected sires and dams in generations 16–19, and 2000 animals in generation 20. Phenotypes for young animals were not included in the analysis, as they were used in the validation. For GBLUP, three pseudo-phenotypes were considered: the raw phenotype, the true breeding value, and the true breeding value with noise added. The genomic relationship matrix was weighted using quadratic weights, calculated based on the SNP variance, and non-linear A weights, following different equation parameters. The scenario with exclusively causal variants presented accuracies close to 1 for 100 QTL, and slightly lower accuracies for 1000 QTL. For the SNP + QTN scenario, quadratic weights promoted higher accuracy gains than the SNPs alone, especially in the 100 QTN trait. Accuracies converged at higher values for both quadratic and non-linear A weights in the 100 QTN scenario. For the 1000 QTN trait, quadratic weights diverged and reduced accuracy, while non-linear A maintained accuracies at their peaks, depending on the equation parameters. Parameters of non-linear A for highest accuracy were different in each scenario and type of analysis. Proportionally, gains in accuracy were more prominent with GBLUP than with ssGBLUP.
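A weighted genomic relationship matrix of the kind mentioned in this abstract can be sketched as follows. This is a hedged illustration assuming NumPy, not the authors' code. The abstract does not give the exact weight formulas or parameter values, so the forms below are assumptions based on commonly used definitions: quadratic weights proportional to u² · 2p(1 − p), and non-linear A weights of the form CT^(|u|/sd(u) − 2). The function names, `ct`, `cap`, and the rescaling of weights to mean 1 are all choices made for this sketch.

```python
import numpy as np

def weighted_grm(M, weights=None):
    """G = Z D Z' / (2 * sum p(1-p)) for an n x m genotype matrix M coded 0/1/2."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0              # allele frequencies from the data
    Z = M - 2.0 * p                       # centered genotypes
    d = np.ones(M.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    scale = 2.0 * np.sum(p * (1.0 - p))
    return (Z * d) @ Z.T / scale          # D applied as column-wise weights

def quadratic_weights(snp_effects, p):
    """Assumed form: d_j = u_j^2 * 2 p_j (1 - p_j), rescaled to mean 1."""
    u = np.asarray(snp_effects, dtype=float)
    d = u**2 * 2.0 * p * (1.0 - p)
    return d / d.mean()

def nonlinear_a_weights(snp_effects, ct=1.125, cap=5.0):
    """Assumed form: d_j = ct ** (min(|u_j| / sd(u), cap) - 2), rescaled to mean 1."""
    u = np.asarray(snp_effects, dtype=float)
    d = ct ** (np.minimum(np.abs(u) / np.std(u), cap) - 2.0)
    return d / d.mean()

# Illustrative data only: random genotypes and SNP effects.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 40))
u = rng.normal(size=40)
p = M.mean(axis=0) / 2.0
G_plain = weighted_grm(M)
G_quad = weighted_grm(M, quadratic_weights(u, p))
G_nla = weighted_grm(M, nonlinear_a_weights(u))
```

In practice the SNP effects `u` would come from a previous evaluation round, and the weights would typically be updated iteratively; that loop is omitted here.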

Level-biases in estimated breeding values due to the use of different SNP panels over time in ssGBLUP

Genetics Selection Evolution ◽ 10.1186/s12711-019-0517-z ◽ 2019 ◽ Vol 51 (1) ◽ Cited By ~ 1

Author(s): Øyvind Nordbø ◽ Arne B. Gjuvsland ◽ Leiv Sigbjørn Eikje ◽ Theo Meuwissen

Keyword(s): Value Added ◽ Single Step ◽ Fine Tuning ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Optimal Selection ◽ Breeding Values ◽ Estimated Breeding Values ◽ Snp Panels ◽ Genomic Predictions

Abstract Background The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions. Results Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors. Conclusions Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.

335 Genomic predictions with a multi-breed genomic relationship matrix

Journal of Animal Science ◽ 10.1093/jas/skz258.099 ◽ 2019 ◽ Vol 97 (Supplement_3) ◽ pp. 49-50

Author(s): Yvette Steyn ◽ Daniela Lourenco ◽ Ignacy Misztal

Keyword(s): Prediction Accuracy ◽ Negative Impact ◽ Reference Population ◽ Single Step ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Genomic Relationship ◽ Effective Population ◽ Specific Allele ◽ Missing Genotypes

Abstract Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious IBS relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted with multi-breed evaluations where all SNP are shared, i.e., the same SNP, and with single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) BV. Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45K resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9K resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.
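One way to picture the non-shared setup described above is to give each breed its own set of SNP columns and treat the other breeds' columns as missing, so that after centering they contribute zero. The sketch below assumes NumPy and is not the authors' implementation. Scaling each breed block by its own 2Σp(1 − p) is an assumption of this sketch, since the abstract does not state how the matrix was scaled. The function name `nonshared_multibreed_grm` is hypothetical.

```python
import numpy as np

def nonshared_multibreed_grm(M, breed):
    """Multi-breed G with breed-specific (non-shared) SNP columns.

    M     : n x m genotype matrix coded 0/1/2
    breed : length-n array of breed labels
    Centering uses breed-specific allele frequencies; genotypes of other breeds
    are treated as missing (zero after centering), so between-breed blocks are 0.
    """
    M = np.asarray(M, dtype=float)
    breed = np.asarray(breed)
    n = M.shape[0]
    G = np.zeros((n, n))
    for b in np.unique(breed):
        idx = np.flatnonzero(breed == b)
        p = M[idx].mean(axis=0) / 2.0                 # breed-specific frequencies
        Z = M[idx] - 2.0 * p
        # Assumption: each breed block is scaled by its own 2 * sum p(1-p).
        G[np.ix_(idx, idx)] = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
    return G

# Illustrative data only: two breeds with four animals each.
rng = np.random.default_rng(3)
M = rng.integers(0, 3, size=(8, 50))
breed = np.array(["A"] * 4 + ["B"] * 4)
G_mb = nonshared_multibreed_grm(M, breed)
```

Under these assumptions the result is block-diagonal by breed, which is how the sketch avoids spurious between-breed IBS relationships.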

Efficient computation of the genomic relationship matrix and other matrices used in single-step evaluation

Journal of Animal Breeding and Genetics ◽ 10.1111/j.1439-0388.2010.00912.x ◽ 2011 ◽ Vol 128 (6) ◽ pp. 422-428 ◽ Cited By ~ 89

Author(s): I. Aguilar ◽ I. Misztal ◽ A. Legarra ◽ S. Tsuruta

Keyword(s): Single Step ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Efficient Computation ◽ Genomic Relationship

A comprehensive comparison between single- and two-step GBLUP methods in a simulated beef cattle population

Canadian Journal of Animal Science ◽ 10.1139/cjas-2017-0176 ◽ 2018 ◽ Vol 98 (3) ◽ pp. 565-575 ◽ Cited By ~ 5

Author(s): Mario L. Piccoli ◽ Luiz F. Brito ◽ José Braccini ◽ Fernanda V. Brito ◽ Fernando F. Cardoso ◽ ...

Keyword(s): Beef Cattle ◽ Production Systems ◽ Simulated Data ◽ Single Step ◽ Breeding Values ◽ True Breeding ◽ Blending Method ◽ Estimated Breeding Values ◽ Genetic Evaluations ◽ Genomic Predictions

The statistical methods used in genetic evaluations are a key component of the process and can be best compared by using simulated data. This is especially true in grazing beef cattle production systems, where the number of proven bulls with highly reliable estimated breeding values is too limited to allow for a trustworthy validation of genomic predictions. Therefore, we simulated data for 4980 beef cattle to compare single-step genomic best linear unbiased prediction (ssGBLUP), which simultaneously incorporates pedigree, phenotypic, and genomic data into genomic evaluations, with two-step GBLUP (tsGBLUP) procedures and genomic estimated breeding value (GEBV) blending methods. The greatest increases in GEBV accuracies compared with the parents’ average estimated breeding values (EBVPA) were 0.364 and 0.341 for ssGBLUP and tsGBLUP, respectively. Direct genomic value and GEBV accuracies when using ssGBLUP and tsGBLUP procedures were similar, except for the GEBV accuracies using Hayes’ blending method in tsGBLUP. There was no significant or slight bias in genomic predictions from ssGBLUP or tsGBLUP (using VanRaden’s blending method), indicating that these predictions are on the same scale as the true breeding values. Overall, genetic evaluations including genomic information resulted in gains in accuracy >100% compared with the EBVPA. In addition, there were no significant differences between the animals selected (10% of males and 50% of females) by using ssGBLUP or tsGBLUP.

An efficient exact method to obtain GBLUP and single-step GBLUP when the genomic relationship matrix is singular

Genetics Selection Evolution ◽ 10.1186/s12711-016-0260-7 ◽ 2016 ◽ Vol 48 (1) ◽ Cited By ~ 9

Author(s): Rohan L. Fernando ◽ Hao Cheng ◽ Dorian J. Garrick

Keyword(s): Single Step ◽ Exact Method ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Genomic Relationship

Single-Step Genomic Evaluations from Theory to Practice: Using SNP Chips and Sequence Data in BLUPF90

Genes ◽ 10.3390/genes11070790 ◽ 2020 ◽ Vol 11 (7) ◽ pp. 790 ◽ Cited By ~ 3

Author(s): Daniela Lourenco ◽ Andres Legarra ◽ Shogo Tsuruta ◽ Yutaka Masuda ◽ Ignacio Aguilar ◽ ...

Keyword(s): Sequence Data ◽ Standard Procedure ◽ Single Step ◽ Bayesian Regression ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Software Suite ◽ Livestock Breeding ◽ Snp Data ◽ Effect Model

Single-step genomic evaluation became a standard procedure in livestock breeding, and the main reason is the ability to combine all pedigree, phenotypes, and genotypes available into one single evaluation, without the need of post-analysis processing. Therefore, the incorporation of data on genotyped and non-genotyped animals in this method is straightforward. Since 2009, two main implementations of single-step were proposed. One is called single-step genomic best linear unbiased prediction (ssGBLUP) and uses single nucleotide polymorphism (SNP) to construct the genomic relationship matrix; the other is the single-step Bayesian regression (ssBR), which is a marker effect model. Under the same assumptions, both models are equivalent. In this review, we focus solely on ssGBLUP. The implementation of ssGBLUP into the BLUPF90 software suite was done in 2009, and since then, several changes were made to make ssGBLUP flexible to any model, number of traits, number of phenotypes, and number of genotyped animals. Single-step GBLUP from the BLUPF90 software suite has been used for genomic evaluations worldwide. In this review, we will show theoretical developments and numerical examples of ssGBLUP using SNP data from regular chips to sequence data.
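For orientation, the way ssGBLUP combines pedigree and genomic relationships is commonly written through the inverse of a joint relationship matrix H. The notation below is the standard textbook form rather than something quoted from this abstract: A is the pedigree relationship matrix for all animals, A22 is its block for genotyped animals, G is the genomic relationship matrix, Z holds centered genotypes, and p_j is the allele frequency of SNP j.

```latex
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix},
\qquad
\mathbf{G} = \frac{\mathbf{Z}\mathbf{Z}'}{2\sum_{j} p_j (1 - p_j)}
```

Only the block for genotyped animals is modified, which is why non-genotyped animals can be included without any post-analysis processing.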

Application of single-step GBLUP in New Zealand Romney sheep

Animal Production Science ◽ 10.1071/an19315 ◽ 2020 ◽ Vol 60 (9) ◽ pp. 1136

Author(s): M. A. Nilforooshan

Keyword(s): New Zealand ◽ Single Step ◽ Pedigree Information ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Genomic Relationship ◽ Genetic Trend ◽ Weaning Weight ◽ Sheep Population ◽ Pedigree Relationship

Context In New Zealand, Romney is the most predominant breed and is reared as a dual-purpose sheep. The number of genotypes is rapidly increasing in the sheep population, and making use of both genotypes and pedigree information is of importance for genetic evaluations. Single-step genomic best linear unbiased prediction (ssGBLUP) is a method for simultaneous prediction of genetic merits for genotyped and non-genotyped animals. The combination and the compatibility of the genomic relationship matrix (G) and the pedigree relationship matrix for genotyped animals (A22) is important for unbiased ssGBLUP. Aims The aim of the present study was to find an optimum genetic relationship matrix for ssGBLUP weaning-weight evaluation of Romney sheep in New Zealand. Methods Data consisted of adjusted weaning weights for 2422011 sheep, 50K single-nucleotide polymorphism genotypes for 13304 animals and 3028688 animals in the pedigree. Blending of G and A22 was tested with weights (k) ranging from 0.2 to 0.99 (kG + (1 – k)A22), followed by none or one of the three methods of tuning G to A22. Key results The averages of G and A22 were close to each other for overall, diagonal and off-diagonal elements. Therefore, differently tuned G performed similarly. However, elements of G showed larger variation than did the elements of A22 and, on average, genotyped animals were less related in G than in A22. Correlations between genomic estimated breeding values (GEBV) for the top 500 genotyped animals, as well as the rank correlations, were almost 1 among ssGBLUP evaluations using tuned G. The corresponding correlations with BLUP evaluations were increased by blending G with a larger proportion of A22, and were further increased by tuning G, indicating improved compatibility between G and A22. Blending and tuning G suppressed the inflation of GEBV and bias and it moved the genetic trend closer to the genetic trend obtained from BLUP. 
Conclusions A combination of blending and tuning G to A22, with a blending rate of 0.5 at most, is recommended for weaning weight of Romney sheep in New Zealand. Failure to do that resulted in inflated GEBV that can reduce the accuracy of selection, especially for genotyped animals. Implications There is a growing interest in the single-step GBLUP method for simultaneous genetic evaluation of genotyped and non-genotyped animals, in which genomic and pedigree relationship matrices are admixed. Using data from New Zealand Romney sheep, we have shown that adjustment of the genomic relationship matrix on the basis of the pedigree relationship matrix is necessary to avoid inflated evaluations. Improving the compatibility between genomic and pedigree relationship matrices is important for obtaining accurate and unbiased single-step GBLUP evaluations.
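The blending step kG + (1 − k)A22 and one possible tuning step can be sketched as below, assuming NumPy. The abstract does not specify its three tuning methods, so the tuning shown here (a linear rescaling a + bG chosen so that the average diagonal and average off-diagonal of G match those of A22) is only an illustrative assumption. Function names and the small example matrices are hypothetical.

```python
import numpy as np

def blend(G, A22, k):
    """Blend genomic and pedigree relationships: k*G + (1 - k)*A22."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be in [0, 1]")
    return k * G + (1.0 - k) * A22

def tune_to_a22(G, A22):
    """Rescale G as a + b*G so mean diagonal and mean off-diagonal match A22.

    Assumed tuning method for illustration; not taken from the abstract.
    """
    n = G.shape[0]
    off = ~np.eye(n, dtype=bool)
    g_diag, g_off = np.diag(G).mean(), G[off].mean()
    a_diag, a_off = np.diag(A22).mean(), A22[off].mean()
    b = (a_diag - a_off) / (g_diag - g_off)
    a = a_diag - b * g_diag
    return a + b * G

# Illustrative 3-animal example (values are made up).
A22 = np.array([[1.00, 0.50, 0.25],
                [0.50, 1.00, 0.25],
                [0.25, 0.25, 1.00]])
G = np.array([[1.05, 0.42, 0.20],
              [0.42, 0.95, 0.31],
              [0.20, 0.31, 1.10]])
G_blended = blend(G, A22, k=0.5)          # blending rate of 0.5
G_tuned = tune_to_a22(G_blended, A22)     # blending followed by tuning
```

The order used here, blending first and tuning second, follows the sequence described in the Methods part of the abstract.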

334 Investigating core-dependent changes in predictions using the algorithm for proven and young in ssGBLUP

Journal of Animal Science ◽ 10.1093/jas/skz258.100 ◽ 2019 ◽ Vol 97 (Supplement_3) ◽ pp. 50-50

Author(s): Daniela Lourenco ◽ Shogo Tsuruta ◽ Ivan Pocrnic ◽ Ignacy Misztal

Keyword(s): Dairy Cattle ◽ Large Scale ◽ Growth Traits ◽ Optimal Number ◽ Single Step ◽ Eigenvalue Decomposition ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Direct Inversion ◽ Lower Accuracy

Abstract Large-scale single-step GBLUP (ssGBLUP) evaluations rely on techniques to approximate or avoid the inversion of the genomic relationship matrix (G). The algorithm for proven and young (APY) was developed to create the inverse of G without explicit inversion, and relies on the clustering of genotyped animals into two groups, namely core and non-core. Although the correlation between GEBV from regular ssGBLUP and APY ssGBLUP is greater than 0.99 when the appropriate number of core animals is used, reranking is still observed when different core groups are used. We investigated which animals are more prone to reranking and how the changes in GEBV can be minimized. Datasets from beef cattle, dairy cattle, and pigs were used. The beef cattle data comprised phenotypes on 3 growth traits for up to 6.8M animals, pedigree for 8.2M, and genotypes for 66k. The dairy cattle dataset had 9M phenotypes for udder depth, 10M animals in the pedigree, and 570K genotyped animals. The pig dataset had up to 770k phenotypes recorded on 4 traits, pedigree for 2.6M animals, and genotypes for 54k. Investigations included using several different core groups, increasing the number of core animals beyond the optimal number obtained by the eigenvalue decomposition, and comparisons with GEBV from ssGBLUP with direct inversion (except for dairy). Additionally, observed changes were compared with possible changes based on the SE of GEBV. In all datasets, larger changes in GEBV by using different core groups were observed for animals with lower accuracy. The observed changes relative to standard deviations of GEBV were, on average, 5% and ranged from 0 to 30%. Increasing the number of core animals beyond the optimal value helped to asymptotically reduce changes in GEBV. Although core-dependent changes in GEBV exist, they are small and can be reduced with larger core groups.
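The core/non-core idea behind APY can be illustrated with a small dense sketch, assuming NumPy. Only the core block of G is inverted directly; each non-core animal contributes one diagonal element, m_i = g_ii − g_ic G_cc⁻¹ g_ci. This is a toy version of the commonly published APY formula, not production code: large-scale implementations store these pieces sparsely. The function name `apy_inverse` and the simulated data are hypothetical.

```python
import numpy as np

def apy_inverse(G, core):
    """APY approximation of inv(G), inverting only the core block directly."""
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                                   # core x non-core
    # Diagonal of M: g_ii - g_ic Gcc^-1 g_ci for each non-core animal i
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    Ginv[np.ix_(core, noncore)] = -P * m_inv
    Ginv[np.ix_(noncore, core)] = (-P * m_inv).T
    Ginv[noncore, noncore] = m_inv                      # diagonal non-core block
    return Ginv

# Illustrative data only: 8 animals, 200 random SNPs, small value added to the diagonal.
rng = np.random.default_rng(7)
M = rng.integers(0, 3, size=(8, 200)).astype(float)
p = M.mean(axis=0) / 2.0
Z = M - 2.0 * p
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p))) + 0.01 * np.eye(8)
Ginv_apy = apy_inverse(G, core=np.arange(4))            # 4 core, 4 non-core animals
```

With a single non-core animal the diagonal element equals the exact Schur complement, so the result coincides with the direct inverse; with more non-core animals the result depends on which animals are chosen as core, which is the effect studied in the abstract.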

A Comprehensive Comparison of Haplotype-Based Single-Step Genomic Predictions in Livestock Populations With Different Genetic Diversity Levels: A Simulation Study

Frontiers in Genetics ◽ 10.3389/fgene.2021.729867 ◽ 2021 ◽ Vol 12

Author(s): Andre C. Araujo ◽ Paulo L. S. Carneiro ◽ Hinayah R. Oliveira ◽ Flavio S. Schenkel ◽ Renata Veroneze ◽ ...

Keyword(s): Genetic Diversity ◽ Predictive Ability ◽ Single Step ◽ Genomic Relationship Matrix ◽ Relationship Matrix ◽ Diverse Populations ◽ Nucleotide Polymorphisms ◽ Unique Haplotype ◽ Individual Snps ◽ Genomic Predictions

The level of genetic diversity in a population is inversely proportional to the linkage disequilibrium (LD) between individual single nucleotide polymorphisms (SNPs) and quantitative trait loci (QTLs), leading to lower predictive ability of genomic breeding values (GEBVs) in highly genetically diverse populations. Haplotype-based predictions could outperform individual SNP predictions by better capturing the LD between SNP and QTL. Therefore, we aimed to evaluate the accuracy and bias of individual-SNP- and haplotype-based genomic predictions under the single-step genomic best linear unbiased prediction (ssGBLUP) approach in genetically diverse populations. We simulated purebred and composite sheep populations using literature parameters for moderate and low heritability traits. The haplotypes were created based on LD thresholds of 0.1, 0.3, and 0.6. Pseudo-SNPs from unique haplotype alleles were used to create the genomic relationship matrix (G) in the ssGBLUP analyses. Alternative scenarios were compared in which the pseudo-SNPs were combined with non-LD clustered SNPs, only pseudo-SNPs, or haplotypes fitted in a second G (two relationship matrices). The GEBV accuracies for the moderate heritability-trait scenarios fitting individual SNPs ranged from 0.41 to 0.55 and with haplotypes from 0.17 to 0.54 in the more (Ne ≅ 450) and less (Ne < 200) genetically diverse populations, respectively, and the bias fitting individual SNPs or haplotypes ranged between −0.14 and −0.08 and from −0.62 to −0.08, respectively. For the low heritability-trait scenarios, the GEBV accuracies fitting individual SNPs ranged from 0.24 to 0.32, and for fitting haplotypes, they ranged from 0.11 to 0.32 in the more (Ne ≅ 250) and less (Ne ≅ 100) genetically diverse populations, respectively, and the bias ranged between −0.36 and −0.32 and from −0.78 to −0.33 fitting individual SNPs or haplotypes, respectively. The lowest accuracies and largest biases were observed fitting only pseudo-SNPs from blocks constructed with an LD threshold of 0.3 (p < 0.05), whereas the best results were obtained using only SNPs or the combination of independent SNPs and pseudo-SNPs in one or two G matrices, in both heritability levels and all populations regardless of the level of genetic diversity. In summary, haplotype-based models did not improve the performance of genomic predictions in genetically diverse populations.
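The conversion of unique haplotype alleles into pseudo-SNPs can be sketched as follows, assuming NumPy and phased data for a single haplotype block. How blocks are delimited (the LD thresholds in the abstract) is outside this sketch, and the function name `haplotype_pseudo_snps` and the toy data are hypothetical. Each unique haplotype allele in the block becomes one column whose entries count the copies (0, 1, or 2) carried by each individual.

```python
import numpy as np

def haplotype_pseudo_snps(hap1, hap2):
    """Code unique haplotype alleles of one block as pseudo-SNP columns.

    hap1, hap2 : n x s arrays of 0/1 alleles (the two phased haplotypes per animal)
    Returns (alleles, X), where X[i, k] is the number of copies of unique
    haplotype allele k carried by individual i.
    """
    hap1 = np.asarray(hap1)
    hap2 = np.asarray(hap2)
    n = hap1.shape[0]
    stacked = np.vstack([hap1, hap2])
    alleles, inverse = np.unique(stacked, axis=0, return_inverse=True)
    inverse = inverse.ravel()                      # index of each haplotype's allele
    X = np.zeros((n, alleles.shape[0]), dtype=int)
    np.add.at(X, (np.arange(n), inverse[:n]), 1)   # first haplotype of each animal
    np.add.at(X, (np.arange(n), inverse[n:]), 1)   # second haplotype of each animal
    return alleles, X

# Illustrative block of 2 SNPs for 3 animals.
hap1 = np.array([[0, 1], [1, 1], [0, 0]])
hap2 = np.array([[0, 1], [0, 0], [1, 1]])
alleles, X = haplotype_pseudo_snps(hap1, hap2)
```

The resulting matrix `X` can then be used in place of, or alongside, ordinary SNP genotypes when building G; every row sums to 2 because each animal carries exactly two haplotypes per block.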
