A Simple Method for Computing the Inverse of a Numerator Relationship Matrix Used in Prediction of Breeding Values

Biometrics ◽  
1976 ◽  
Vol 32 (1) ◽  
pp. 69 ◽  
Author(s):  
C. R. Henderson
2021 ◽  
Vol 12 ◽  
Author(s):  
Mohammad Ali Nilforooshan ◽  
Dorian Garrick

Reduced models are equivalent to the full model but lower the computational demand of solving the problem, here the mixed model equations for estimating breeding values of selection candidates. Since phenotyped animals provide data to the model, the aim of this study was to reduce animal models to those equations corresponding to phenotyped animals. Non-phenotyped ancestral animals have normally been included in analyses as they facilitate formation of the inverse numerator relationship matrix. However, a reduced model can exclude those animals and obtain identical solutions for the breeding values of the animals of interest. Solutions corresponding to non-phenotyped animals can be back-solved from the solutions of phenotyped animals and specific blocks of the inverted relationship matrix. This idea was extended to other forms of animal model, and the results from each reduced model (and back-solving) were identical to the results from the corresponding full model. Previous studies have mainly focused on reduced animal models that absorb equations corresponding to non-parents and solve equations only for parents of phenotyped animals. These two types of reduced animal model can be combined to formulate only equations corresponding to phenotyped parents of phenotyped progeny.
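
The back-solving step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple animal model in which non-phenotyped animals (n) contribute no records, so their rows of the mixed model equations give u_n = -(A^nn)^-1 A^np u_p, where A^nn and A^np are blocks of the inverse relationship matrix. The inverse is built with Henderson's rules ignoring inbreeding, which is exact for the non-inbred toy pedigree used here. The pedigree, the function names and the solutions for animals 3 and 4 are hypothetical.

```python
from fractions import Fraction


def henderson_inverse(pedigree):
    """Build A-inverse with Henderson's rules, ignoring inbreeding.

    pedigree: list of (animal, sire, dam), unknown parents given as None.
    """
    ids = [animal for animal, _, _ in pedigree]
    idx = {animal: k for k, animal in enumerate(ids)}
    n = len(ids)
    ainv = [[Fraction(0)] * n for _ in range(n)]
    for animal, sire, dam in pedigree:
        parents = [idx[p] for p in (sire, dam) if p is not None]
        # alpha is 1, 4/3 or 2 for zero, one or two known parents
        alpha = 1 / (1 - Fraction(len(parents), 4))
        i = idx[animal]
        ainv[i][i] += alpha
        for p in parents:
            ainv[i][p] -= alpha / 2
            ainv[p][i] -= alpha / 2
            for q in parents:
                ainv[p][q] += alpha / 4
    return ids, ainv


def solve(matrix, rhs):
    """Solve a small dense system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def back_solve(ids, ainv, u_phen):
    """Return solutions for animals absent from u_phen (non-phenotyped)."""
    non = [k for k, a in enumerate(ids) if a not in u_phen]
    phen = [k for k, a in enumerate(ids) if a in u_phen]
    a_nn = [[ainv[r][c] for c in non] for r in non]
    rhs = [-sum(ainv[r][c] * u_phen[ids[c]] for c in phen) for r in non]
    return {ids[k]: v for k, v in zip(non, solve(a_nn, rhs))}


# Hypothetical pedigree: 1 and 2 are non-phenotyped founders.
pedigree = [(1, None, None), (2, None, None), (3, 1, 2), (4, 1, None)]
ids, ainv = henderson_inverse(pedigree)
# Hypothetical breeding value solutions for the phenotyped animals 3 and 4.
solutions = back_solve(ids, ainv, {3: Fraction(2), 4: Fraction(1)})
```

Exact fractions are used only to keep the toy example free of rounding; a real evaluation would use sparse floating-point matrices.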


Genome ◽  
2010 ◽  
Vol 53 (11) ◽  
pp. 876-883 ◽  
Author(s):  
Ben Hayes ◽  
Mike Goddard

Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effects of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.
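
As an illustration of what a genomic relationship matrix looks like in practice, the sketch below builds one from SNP genotypes. The abstract does not say which construction the authors use, so the code assumes a commonly used form: genotypes coded 0/1/2, centered by twice the sample allele frequency, with G = ZZ'/(2 Σ p(1-p)). The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Return G = ZZ' / (2 * sum(p * (1 - p))) as a list of lists.

    genotypes: one row per animal, one column per SNP, coded 0, 1 or 2
    copies of the counted allele. Assumes at least one SNP segregates,
    otherwise the scaling factor is zero.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Allele frequencies estimated from the sample itself (an assumption).
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_snps)
    ]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Center each genotype by its expectation, 2p.
    centered = [
        [row[j] - 2 * freqs[j] for j in range(n_snps)] for row in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
        for zi in centered
    ]


# Hypothetical genotypes for three animals at four SNPs.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = genomic_relationship_matrix(toy_genotypes)
```

Because the frequencies come from the same sample, every row of G sums to zero here; with base-population frequencies that would not hold.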


1985 ◽  
Vol 36 (3) ◽  
pp. 527 ◽  
Author(s):  
H-U Graser ◽  
K Hammond

A multiple-trait mixed model is defined for regular use in the Australian beef industry for the estimation of breeding values for continuous traits of sires used non-randomly across a number of herds and/or years. Maternal grandsires, the numerator relationship matrix, appropriate fixed effects, and the capacity to partition direct and maternal effects are incorporated in this parent model. The model was fitted to the National Beef Recording Scheme's data bank for three growth traits of the Australian Simmental breed, viz. 200-, 365- and 550-day weights. Estimates are obtained for the effects of sex, dam age, grade of dam, age of calf and breed of base dam. The range in estimated breeding value is reported for each trait, with 200-day weight being partitioned into 'calves' and 'daughters' calves', for the Simmental sires commonly used in Australia. Estimates of the fixed effects were large, and dam age, grade of dam and breed of base dam had an important influence on growth to 365 days of age. The faster growth of higher percentage Simmental calves to 200 days continued to 550 days. Estimates of genetic variance for the traits were lower than reported for overseas populations of Simmental cattle, and the genetic covariance between direct and maternal effects for 200-day weight was slightly positive.


2018 ◽  
Author(s):  
G. R. Gowane ◽  
Sang Hong Lee ◽  
Sam Clark ◽  
Nasir Moghaddar ◽  
Hawlader A Al-Mamun ◽  
...  

Abstract Reference populations for genomic selection (GS) usually involve highly selected individuals, which may result in biased prediction of estimated genomic breeding values (GEBV). In the present study, bias and accuracy of GEBV were explored for various genetic models and prediction methods when using selected individuals for a reference. Data were simulated for an animal breeding program to compare Best Linear Unbiased Prediction of breeding values using pedigree-based relationships (PBLUP), genomic relationships for genotyped animals only (GBLUP) and a Single Step approach (SSGBLUP), where information on genotyped individuals was used to infer a matrix H with relationships among all available genotyped and non-genotyped individuals that were linked through pedigree. In SSGBLUP, various weights (α=0.95, 0.80, 0.50) for the genomic relationship matrix (G) relative to the numerator relationship matrix (A) were applied to construct H, and in another version (SSGBLUP_F), inbreeding was accounted for while computing A⁻¹. With GBLUP, accuracy of GEBV prediction increased linearly with an increase in the number of animals selected in reference. For the scenario with no selection and random mating (RR), prediction was unbiased. For GBLUP, lower accuracy and bias were observed in the scenarios with selection and random mating (SR) or selection and positive assortative mating (SA), in which prediction bias increased when a smaller and highly selected proportion was genotyped. Bias disappeared when all individuals were genotyped. SSGBLUP_F showed higher accuracy compared to GBLUP, and bias of prediction was negligible even with selective genotyping. However, PBLUP and SSGBLUP showed bias in SA owing to not fully accounting for allele frequency changes because of selection of quantitative trait loci (QTL) with larger effects and also due to high inbreeding rate. In genetic models with fewer QTL but each with larger effect, predictions were less accurate and more biased for selection scenarios. Results suggest that prediction accuracy and bias are affected by the genetic architecture of the trait. Selective genotyping led to significant bias in GEBV prediction. SSGBLUP with appropriate scaling of A and G matrices can provide accurate and less biased prediction, but scaling requires careful consideration in populations under selection and with high levels of inbreeding.
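
The abstract does not spell out how the weight α enters H. One commonly used single-step formulation, given here only as an assumed illustration, blends G with the pedigree block for genotyped animals and then modifies the pedigree inverse. The symbols G_w and A_22 (the block of A for genotyped animals) are notation introduced for this sketch and do not appear in the abstract.

```latex
G_w = \alpha\, G + (1 - \alpha)\, A_{22},
\qquad
H^{-1} = A^{-1} +
\begin{bmatrix}
0 & 0 \\
0 & G_w^{-1} - A_{22}^{-1}
\end{bmatrix}
```

Under this form, α = 0.95 leaves G almost unchanged, while α = 0.50 pulls it halfway towards the pedigree relationships.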


2019 ◽  
Vol 97 (Supplement_2) ◽  
pp. 34-35
Author(s):  
Johnna Baller ◽  
Jeremy T Howard ◽  
Stephen Kachman ◽  
Matthew L Spangler

Abstract The objective of the study was to evaluate the impact of clustering methods for cross-validation on the accuracy of prediction of molecular breeding values (MBV) in Red Angus cattle (n = 9,763) and in simulation. Individuals were clustered using seven methods [k-means, k-medoids, principal component analysis on the numerator relationship matrix (A) and identical-by-state genomic matrix (G) as data and covariance matrices, and random] and two response variables [deregressed Estimated Breeding Values (DEBV) and adjusted phenotypes]. Genotypes were imputed to a 50K reference panel. Using cross-validation and a Bayes C model, MBV were estimated for traits including birth weight (BWT), marbling (MARB), rib-eye area (REA), and yearling weight (YWT) for DEBV and BWT, YWT, and ultrasonically measured intramuscular fat percentage and rib-eye area for adjusted phenotypes. A bivariate animal model was used to estimate prediction accuracies calculated using the genetic correlation between estimated MBV and the associated response variable. To quantify the difference between true and estimated accuracies, a simulation mimicking a cattle population was replicated five times. The same clustering methods were used as with the Red Angus data with the addition of forward validation and two genotyping methods (random selection and selection of the top 25% of animals). Predicted accuracies were estimated similarly and true accuracies were estimated using the residual correlation of a bivariate model using MBV and true breeding values (TBV). The Rand index was used to quantify the similarity between clustering methods, showing relationship-based clusters were clearly different from random clusters. In simulation, random genotyping led to higher estimated accuracies than selection of top individuals; however, estimated accuracies overpredicted true accuracies with random genotyping but underpredicted true accuracies with the selection of top individuals. When forward validation was evaluated within simulation, results suggested DEBV led to less biased estimates of MBV accuracy.
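
The Rand index mentioned above is the share of animal pairs that two clusterings treat the same way (together in both, or apart in both). A minimal sketch follows. The function name and the toy cluster labels are hypothetical, and this is the unadjusted index, since the abstract does not say whether an adjusted version was used.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index between two clusterings of the same items.

    labels_a, labels_b: sequences of cluster labels, one per item.
    Needs at least two items, otherwise there are no pairs to compare.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    # A pair agrees when both clusterings put it together or both split it.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Same partition with different label names scores 1.0.
same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A crossed partition agrees on only 2 of the 6 pairs.
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```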


1987 ◽  
Vol 67 (1) ◽  
pp. 201-204
Author(s):  
R. A. KEMP ◽  
J. W. WILTON

A numerator relationship matrix (Ac) due to sires and dams was compared with a numerator relationship matrix (Ai) due to sires and maternal grandsires in a multiple-trait-reduced animal model (MT-RAM). Best linear unbiased predictors of estimated breeding values (EBV) for 200-d weight (WW) and postweaning gain (PG) (gain from 200 to 365 d of age) were estimated from data simulating a beef cattle population. As expected, mean EBV and bias (EBV-BV) for both traits were not significantly affected by different relationship matrices. The mean variances of EBV with Ac were larger than those with Ai for both traits. The mean EBV variances were closer to mean BV variances with Ac compared to Ai, which is consistent with increased precision of EBV. Product-moment correlations of EBV and BV (accuracy of prediction) were not equal (P < 0.01) for Ac compared to Ai with WW or PG. The EBV using Ac were more accurate than EBV using Ai. The increased precision and accuracy of EBV from a MT-RAM with Ac would result in greater genetic progress in the population. Key words: Relationship matrices, estimated breeding values, MT-RAM


Forests ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 1169
Author(s):  
Gary R. Hodge ◽  
Juan Jose Acosta

Research Highlights: An algorithm is presented that allows for the analysis of full-sib genetic datasets using generalized mixed-model software programs. The algorithm produces variance component estimates, genetic parameter estimates, and Best Linear Unbiased Prediction (BLUP) solutions for genetic values that are, for all practical purposes, identical to those produced by dedicated genetic software packages. Background and Objectives: The objective of this manuscript is to demonstrate an approach with a simulated full-sib dataset representing a typical forest tree breeding population (40 parents, 80 full-sib crosses, 4 tests, and 6000 trees) using two widely available mixed-model packages. Materials and Methods: The algorithm involves artificially doubling the dataset, so that each observation is in the dataset twice, once with the original female and male parent identification, and once with the female and male parent identities switched. Five linear models were examined: two models using a dedicated genetic software program (ASREML) with the capacity to specify A or other pedigree-related functions, and three models with the doubled dataset and a parent (or sire) linear model (ASREML, SAS Proc Mixed, and R lme4). Results: The variance components, genetic parameters, and BLUPs of the parental breeding values, progeny breeding values, and full-sib family-specific combining abilities were compared. Genetic parameter estimates were essentially the same across all the analyses (e.g., the heritability ranged from h² = 0.220 to 0.223, and the proportion of dominance variance ranged from d² = 0.057 to 0.058). The correlations between the BLUPs from the baseline analysis (ASREML with an individual tree model) and the doubled-dataset/parent models using SAS Proc Mixed or R lme4 were never lower than R = 0.99997. Conclusions: The algorithm can be useful for analysts who need to analyze full-sib genetic datasets and who are familiar with general-purpose statistical packages, but less familiar with or lacking access to other software.
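
The data-doubling step described under Materials and Methods can be sketched as follows. This covers only the duplication of records, not the model fitting or any rescaling of the resulting estimates, and the field names ("tree", "female", "male", "height") and the two example records are hypothetical.

```python
def double_full_sib_records(records):
    """Return each record twice: once as given, once with parents switched.

    records: list of dicts that each have "female" and "male" keys.
    The input list and its dicts are left unmodified.
    """
    doubled = []
    for record in records:
        doubled.append(dict(record))  # original parent identification
        switched = dict(record)
        switched["female"], switched["male"] = record["male"], record["female"]
        doubled.append(switched)      # parent identities switched
    return doubled


# Hypothetical full-sib records from two crosses.
trees = [
    {"tree": 1, "female": "P01", "male": "P02", "height": 12.4},
    {"tree": 2, "female": "P03", "male": "P01", "height": 11.8},
]
doubled_trees = double_full_sib_records(trees)
```

After doubling, every parent appears in the "female" column for each cross it took part in, which is what lets a single parent (or sire) term in a general-purpose mixed-model package pick up both parental contributions.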


2021 ◽  
Vol 99 (2) ◽  
Author(s):  
Yutaka Masuda ◽  
Shogo Tsuruta ◽  
Matias Bermann ◽  
Heather L Bradford ◽  
Ignacy Misztal

Abstract Pedigree information is often missing for some animals in a breeding program. Unknown-parent groups (UPGs) are assigned to the missing parents to avoid biased genetic evaluations. Although the use of UPGs is well established for the pedigree model, it is unclear how UPGs are integrated into the inverse of the unified relationship matrix (H-inverse) required for single-step genomic best linear unbiased prediction. A generalization of the UPG model is the metafounder (MF) model. The objectives of this study were to derive 3 H-inverses and to compare genetic trends among models with UPG and MF H-inverses using a simulated purebred population. All inverses were derived using the joint density function of the random breeding values and genetic groups. The breeding values of genotyped animals (u2) were assumed to be adjusted for UPG effects (g) using matrix Q2 as u2∗=u2+Q2g before incorporating genomic information. The Quaas–Pollak-transformed (QP) H-inverse was derived using a joint density function of u2∗ and g updated with genomic information and assuming nonzero cov(u2∗,g′). The modified QP (altered) H-inverse also assumes that the genomic information updates u2∗ and g, but cov(u2∗,g′)=0. The UPG-encapsulated (EUPG) H-inverse assumed genomic information updates the distribution of u2∗. The EUPG H-inverse had the same structure as the MF H-inverse. Fifty percent of the genotyped females in the simulation had a missing dam, and missing parents were replaced with UPGs by generation. The simulation study indicated that u2∗ and g in models using the QP and altered H-inverses may be inseparable, leading to potential biases in genetic trends. Models using the EUPG and MF H-inverses showed no genetic trend biases. These 2 H-inverses yielded the same genomic EBV (GEBV). The predictive ability and inflation of GEBVs from young genotyped animals were nearly identical among models using the QP, altered, EUPG, and MF H-inverses. Although the choice of H-inverse in real applications with enough data may not result in biased genetic trends, the EUPG and MF H-inverses are to be preferred because of their theoretical justification and the possibility of reducing biases.


2020 ◽  
Vol 98 (Supplement_3) ◽  
pp. 41-42
Author(s):  
B Victor Oribamise ◽  
Lauren L Hulsman Hanna

Abstract Without appropriate relationships present in a given population, identifying dominance effects in the expression of desirable traits is challenging. Including non-additive effects is desirable to increase accuracy of breeding values. There is currently no user-friendly software package to investigate genetic relatedness in large pedigrees. The objective was to develop and implement efficient algorithms in R to calculate and visualize measures of relatedness (e.g., sibling and family structure, numerator relationship matrices) for large pedigrees. Comparisons to current R packages (Table 1) are also made. Functions to assign animals to families, summarize sibling counts, calculate the numerator relationship matrix (NRM), and summarize the NRM by groups were created, providing a comprehensive toolkit (Sibs package) not found in other packages. Pedigrees of various sizes (n = 20, 4,035, 120,000 and 132,833) were used to test functionality and compare to current packages. All runs were conducted on a Windows-based computer with 8 GB of RAM and a 2.5 GHz Intel Core i7 processor. Other packages had no significant difference in runtime when constructing the NRM for small pedigrees (n = 20) compared to Sibs (0 to 0.05 s difference). However, packages such as ggroups, AGHmatrix, and pedigree were 10 to 15 min slower than Sibs for a 4,035-individual pedigree. Packages nadiv and pedigreemm competed with Sibs (0.30 to 60 s slower than Sibs), but no package besides Sibs was able to complete the 132,833-individual pedigree due to memory allocation issues in R. The nadiv package was closest with a pedigree of 120,000 individuals, but took 37 min to complete (13 min slower than Sibs). The Sibs package also provides easier input of pedigrees and is more encompassing of such relatedness measures than other packages (Table 1). Furthermore, it can provide an option to utilize other packages such as GCA for connectedness calculations when using large pedigrees.
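
For readers unfamiliar with how a numerator relationship matrix is built, the sketch below uses the standard tabular (recursive) method on a small pedigree. It is not the Sibs implementation, which the abstract does not describe; the function name and the toy pedigree are hypothetical, and the dense list-of-lists storage would not scale to the pedigree sizes quoted above.

```python
def numerator_relationship_matrix(pedigree):
    """Tabular method for the NRM.

    pedigree: list of (animal, sire, dam) with unknown parents as None,
    ordered so that every parent appears before its offspring.
    Returns a dense symmetric matrix in pedigree order.
    """
    index = {}
    n = len(pedigree)
    nrm = [[0.0] * n for _ in range(n)]
    for i, (animal, sire, dam) in enumerate(pedigree):
        for parent in (sire, dam):
            if parent is not None and parent not in index:
                raise ValueError(f"parent {parent!r} must precede {animal!r}")
        s = index.get(sire)
        d = index.get(dam)
        # Relationship with each earlier animal: average over known parents.
        for j in range(i):
            value = 0.0
            if s is not None:
                value += 0.5 * nrm[j][s]
            if d is not None:
                value += 0.5 * nrm[j][d]
            nrm[i][j] = nrm[j][i] = value
        # Diagonal is 1 + inbreeding, i.e. half the parents' relationship.
        inbreeding = 0.5 * nrm[s][d] if s is not None and d is not None else 0.0
        nrm[i][i] = 1.0 + inbreeding
        index[animal] = i
    return nrm


# Hypothetical pedigree: 3 and 4 are full sibs, 5 is their inbred offspring.
toy_pedigree = [
    (1, None, None),
    (2, None, None),
    (3, 1, 2),
    (4, 1, 2),
    (5, 3, 4),
]
A = numerator_relationship_matrix(toy_pedigree)
```

In this example the full sibs 3 and 4 have a relationship of 0.5, so their offspring 5 has an inbreeding coefficient of 0.25 and a diagonal element of 1.25.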


2011 ◽  
Vol 93 (3) ◽  
pp. 203-219 ◽  
Author(s):  
KATHRYN E. KEMPER ◽  
DAVID L. EMERY ◽  
STEPHEN C. BISHOP ◽  
HUTTON ODDY ◽  
BENJAMIN J. HAYES ◽  
...  

Summary Genetic resistance to gastrointestinal worms is a complex trait of great importance in both livestock and humans. In order to gain insights into the genetic architecture of this trait, a mixed breed population of sheep was artificially infected with Trichostrongylus colubriformis (n=3326) and then Haemonchus contortus (n=2669) to measure faecal worm egg count (WEC). The population was genotyped with the Illumina OvineSNP50 BeadChip and 48 640 single nucleotide polymorphism (SNP) markers passed the quality controls. An independent population of 316 sires of mixed breeds with accurate estimated breeding values for WEC were genotyped for the same SNP to assess the results obtained from the first population. We used principal components from the genomic relationship matrix among genotyped individuals to account for population stratification, and a novel approach to directly account for the sampling error associated with each SNP marker regression. The largest marker effects were estimated to explain an average of 0·48% (T. colubriformis) or 0·08% (H. contortus) of the phenotypic variance in WEC. These effects are small but consistent with results from other complex traits. We also demonstrated that methods which use all markers simultaneously can successfully predict genetic merit for resistance to worms, despite the small effects of individual markers. Correlations of genomic predictions with breeding values of the industry sires reached a maximum of 0·32. We estimate that effective across-breed predictions of genetic merit with multi-breed populations will require an average marker spacing of approximately 10 kbp.

