Genomic Relationship Matrix for Correcting Pedigree Errors in Breeding Populations: Impact on Genetic Parameters and Genomic Selection Accuracy

A dramatic increase in the density of marker panels was expected to increase the accuracy of genomic selection (GS); unfortunately, little to no improvement has been observed. Including all variants in the association model dramatically increases the dimensionality of the problem, which can reduce statistical power. Using all single nucleotide polymorphisms (SNPs) to compute the genomic relationship matrix (G) does not necessarily increase accuracy, as the additive relationships can be accurately estimated using a much smaller number of markers. Due to these limitations, variant prioritization has become a necessity to improve accuracy. The fixation index (FST), a measure of population differentiation, has been used to identify genome segments and variants under selection pressure. Using prioritized variants has increased the accuracy of GS. Additionally, FST can be used to weight the relative contribution of prioritized SNPs in computing G. In this study, relative weights based on FST scores were developed and incorporated into the calculation of G, and their impact on the estimation of variance components and on accuracy was assessed. The results showed that prioritizing SNPs based on their FST scores increased the genetic similarity between training and validation animals and improved the accuracy of GS by more than 5%.
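
As an illustration of how a per-SNP FST score might be obtained, the sketch below uses the textbook heterozygosity form, FST = (H_T - H_S) / H_T, for two groups of animals. The abstract does not say which FST estimator or which grouping of animals was used, so the function name `fst_per_snp`, the two-group setup, and the 0/1/2 genotype coding are all assumptions made for the example.

```python
import numpy as np

def fst_per_snp(geno_a, geno_b):
    """Per-SNP FST between two groups, computed as (H_T - H_S) / H_T.

    geno_a, geno_b: (animals x SNPs) arrays of allele counts coded 0/1/2.
    This is an illustrative estimator, not necessarily the one used in the study.
    """
    n_a, n_b = geno_a.shape[0], geno_b.shape[0]
    p_a = geno_a.mean(axis=0) / 2.0            # allele frequency in group A
    p_b = geno_b.mean(axis=0) / 2.0            # allele frequency in group B
    p_bar = (n_a * p_a + n_b * p_b) / (n_a + n_b)

    # Expected heterozygosity in the pooled population and within groups
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    h_s = (n_a * 2.0 * p_a * (1.0 - p_a) + n_b * 2.0 * p_b * (1.0 - p_b)) / (n_a + n_b)

    # Monomorphic SNPs (H_T == 0) carry no differentiation signal; report 0
    fst = np.zeros_like(h_t)
    polymorphic = h_t > 0
    fst[polymorphic] = (h_t[polymorphic] - h_s[polymorphic]) / h_t[polymorphic]
    return fst
```

SNPs could then be ranked by the returned scores to select a prioritized subset.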


Accuracy of genomic BLUP when considering a genomic relationship matrix based on the number of the largest eigenvalues: a simulation study

Genetics Selection Evolution ◽

10.1186/s12711-019-0516-0 ◽

2019 ◽

Vol 51 (1) ◽

Cited By ~ 3

Author(s):

Ivan Pocrnic ◽

Daniela A. L. Lourenco ◽

Yutaka Masuda ◽

Ignacy Misztal

Keyword(s):

Genomic Selection ◽

Large Fraction ◽

Genomic Variation ◽

Eigenvalue Decomposition ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Genomic Information ◽

Phenotypic Information ◽

Largest Eigenvalues

Abstract Background: The dimensionality of genomic information is limited by the number of independent chromosome segments (Me), which is a function of the effective population size. This dimensionality can be determined approximately by singular value decomposition of the gene content matrix, by eigenvalue decomposition of the genomic relationship matrix (GRM), or by the number of core animals in the algorithm for proven and young (APY) that maximizes the accuracy of genomic prediction. In the latter, core animals act as proxies to linear combinations of Me. Field studies indicate that a moderate accuracy of genomic selection is achieved with a small dataset, but that further improvement of the accuracy requires much more data. When only one quarter of the optimal number of core animals are used in the APY algorithm, the accuracy of genomic selection is only slightly below the optimal value. This suggests that genomic selection works on clusters of Me. Results: The simulation included datasets with different population sizes and amounts of phenotypic information. Computations were done by genomic best linear unbiased prediction (GBLUP) with selected eigenvalues and corresponding eigenvectors of the GRM set to zero. About four eigenvalues in the GRM explained 10% of the genomic variation, and less than 2% of the total eigenvalues explained 50% of the genomic variation. With limited phenotypic information, the accuracy of GBLUP was close to the peak where most of the smallest eigenvalues were set to zero. With a large amount of phenotypic information, accuracy increased as smaller eigenvalues were added. Conclusions: A small amount of phenotypic data is sufficient to estimate only the effects of the largest eigenvalues and the associated eigenvectors that contain a large fraction of the genomic information, and a very large amount of data is required to estimate the remaining eigenvalues that account for a limited amount of genomic information. Core animals in the APY algorithm act as proxies of almost the same number of eigenvalues. By using an eigenvalues-based approach, it was possible to explain why the moderate accuracy of genomic selection based on small datasets only increases slowly as more data are added.
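
The eigenvalue-based view described in this abstract can be sketched roughly as follows. The code counts how many of the largest eigenvalues are needed to reach a given fraction of the total, and rebuilds a reduced-rank GRM with the remaining eigenvalues set to zero. The function names are hypothetical, the GRM is assumed to be symmetric (for example, built in the VanRaden style from centered genotypes), and this is not the authors' GBLUP implementation.

```python
import numpy as np

def n_eigs_for_fraction(grm, fraction):
    """Number of largest eigenvalues whose sum reaches `fraction` of the total."""
    vals = np.clip(np.linalg.eigvalsh(grm)[::-1], 0.0, None)  # descending, non-negative
    cum = np.cumsum(vals) / vals.sum()
    return min(int(np.searchsorted(cum, fraction)) + 1, len(vals))

def truncate_grm(grm, k):
    """Rebuild the GRM from its k largest eigenvalues; the rest are set to zero."""
    vals, vecs = np.linalg.eigh(grm)          # eigh returns ascending eigenvalues
    keep = np.argsort(vals)[::-1][:k]         # indices of the k largest
    return (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T
```

For example, `truncate_grm(grm, n_eigs_for_fraction(grm, 0.5))` would keep only the leading eigenvalues that together account for half of the total.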


Simulation studies to optimize genomic selection in honey bees

Genetics Selection Evolution ◽

10.1186/s12711-021-00654-x ◽

2021 ◽

Vol 53 (1) ◽

Author(s):

Richard Bernstein ◽

Manuel Du ◽

Andreas Hoppe ◽

Kaspar Bienefeld

Keyword(s):

Genomic Selection ◽

Genetic Gain ◽

Honey Bees ◽

Reference Population ◽

Breeding Program ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Genomic Breeding ◽

Breeding Values

Abstract Background: With the completion of a single nucleotide polymorphism (SNP) chip for honey bees, the technical basis of genomic selection is laid. However, for its application in practice, methods to estimate genomic breeding values need to be adapted to the specificities of the genetics and breeding infrastructure of this species. Drone-producing queens (DPQ) are used for mating control, and usually, they head non-phenotyped colonies that will be placed on mating stations. Breeding queens (BQ) head colonies that are intended to be phenotyped and used to produce new queens. Our aim was to evaluate different breeding program designs for the initiation of genomic selection in honey bees. Methods: Stochastic simulations were conducted to evaluate the quality of the estimated breeding values. We developed a variation of the genomic relationship matrix to include genotypes of DPQ and tested different sizes of the reference population. The results were used to estimate genetic gain in the initial selection cycle of a genomic breeding program. This program was run over six years, and different numbers of genotyped queens per year were considered. Resources could be allocated to increase the reference population, or to perform genomic preselection of BQ and/or DPQ. Results: Including the genotypes of 5000 phenotyped BQ increased the accuracy of predictions of breeding values by up to 173%, depending on the size of the reference population and the trait considered. To initiate a breeding program, genotyping a minimum number of 1000 queens per year is required. In this case, genetic gain was highest when genomic preselection of DPQ was coupled with the genotyping of 10–20% of the phenotyped BQ. For maximum genetic gain per used genotype, more than 2500 genotyped queens per year and preselection of all BQ and DPQ are required. Conclusions: This study shows that the first priority in a breeding program is to genotype phenotyped BQ to obtain a sufficiently large reference population, which allows successful genomic preselection of queens. To maximize genetic gain, DPQ should be preselected, and their genotypes included in the genomic relationship matrix. We suggest that the developed methods for genomic prediction are suitable for implementation in genomic honey bee breeding programs.


Genomic selection using a realized genomic relationship matrix in a Pinus taeda L. cloned population

BMC Proceedings ◽

10.1186/1753-6561-5-s7-p60 ◽

2011 ◽

Vol 5 (Suppl 7) ◽

pp. P60 ◽

Cited By ~ 1

Author(s):

Jaime Zapata-Valenzuela ◽

Fikret Isik ◽

Christian Maltecca ◽

Jill Wegrzyn ◽

David Neale ◽

...

Keyword(s):

Genomic Selection ◽

Pinus Taeda ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Pinus Taeda L


PSVIII-27 A weighted genomic relationship matrix based on FST prioritized SNPs for genomic selection

Journal of Animal Science ◽

10.1093/jas/skz258.533 ◽

2019 ◽

Vol 97 (Supplement_3) ◽

pp. 262-262

Author(s):

Ling-Yun Chang ◽

Sajjad Toghiani ◽

El Hamidi Hay ◽

Samuel E Aggrey ◽

Romdhane Rekaya

Keyword(s):

Genomic Selection ◽

Mixed Model ◽

Relative Weight ◽

Snp Markers ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Improve Accuracy ◽

Increase In Accuracy ◽

Generation Sequencing

Abstract: Using low to moderate density SNP marker panels, a substantial increase in accuracy was achieved. The dramatic increase in the number of identified variants due to advances in next generation sequencing was expected to significantly increase the accuracy of genomic selection (GS). Unfortunately, little to no improvement was observed. For mixed model-based approaches, using all SNPs in the panel to compute the observed relationship matrix (G) will not increase accuracy, as the additive relationships between individuals can be accurately estimated using a much smaller number of markers. Due to these limitations, variant prioritization has become a necessity to improve accuracy. Further, it has been shown that weighting SNPs when calculating G could be effective in improving the accuracy of GS. FST, a measure of population differentiation, has been successfully used to identify genome segments under selection pressure. Consequently, FST could be used both to prioritize SNPs and to derive their relative weights in the calculation of the genomic relationship matrix. A population of 15,000 animals genotyped for 400K SNP markers uniformly distributed along 10 chromosomes was simulated. A trait with a heritability of 0.3, genetically controlled by two hundred QTL, was generated. The top 20K SNPs based on their FST scores were used either alone or with the remaining 380K SNPs to compute G, with or without weighting. When only the top 20K SNPs were used to compute G, two scenarios were considered: 1) equal weights for all SNPs or 2) weights proportional to the SNP FST scores. When all 400K SNP markers were used, different weighting scenarios were evaluated. The results clearly showed that prioritizing SNP markers based on their FST scores, and using those scores to compute relative weights, increased the genetic similarity between training and validation animals and resulted in more than 5% improvement in the accuracy of GS.
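
The two scenarios for the prioritized SNPs (equal weights versus weights proportional to FST) could be sketched as below. The abstract does not give the exact formula for the weighted G, so this example assumes a common weighted form, G = Z D Z' / (2 Σ d_j p_j (1 - p_j)), where Z holds centered allele counts and D is a diagonal matrix of SNP weights. The names `weighted_grm` and `top_snps_by_score` are hypothetical.

```python
import numpy as np

def top_snps_by_score(scores, k):
    """Indices of the k SNPs with the highest scores (e.g. FST), best first."""
    return np.argsort(scores)[::-1][:k]

def weighted_grm(geno, weights=None):
    """Weighted genomic relationship matrix from 0/1/2 allele counts.

    geno:    (animals x SNPs) array.
    weights: per-SNP weights; None gives equal weights (the unweighted G).
    Assumed form: Z D Z' / (2 * sum(d_j * p_j * (1 - p_j))).
    """
    n_snp = geno.shape[1]
    w = np.ones(n_snp) if weights is None else np.asarray(weights, dtype=float)
    p = geno.mean(axis=0) / 2.0               # observed allele frequencies
    z = geno - 2.0 * p                        # centered genotypes
    scale = 2.0 * np.sum(w * p * (1.0 - p))   # keeps G on the usual scale
    return (z * w) @ z.T / scale
```

Under these assumptions, the FST-weighted scenario would look like `idx = top_snps_by_score(fst, 20000)` followed by `weighted_grm(geno[:, idx], fst[idx])`, and the equal-weight scenario like `weighted_grm(geno[:, idx])`. Because the weights also enter the scaling term, multiplying all weights by a constant leaves G unchanged.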


A recursive algorithm for decomposition and creation of the inverse of the genomic relationship matrix

Journal of Dairy Science ◽

10.3168/jds.2011-5249 ◽

2012 ◽

Vol 95 (10) ◽

pp. 6093-6102 ◽

Cited By ~ 6

Author(s):

P. Faux ◽

N. Gengler ◽

I. Misztal

Keyword(s):

Recursive Algorithm ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship


Correction to: Validation of genomic predictions for body weight in broilers using crossbred information and considering breed-of-origin of alleles

Genetics Selection Evolution ◽

10.1186/s12711-019-0507-1 ◽

2019 ◽

Vol 51 (1) ◽

Author(s):

Pascal Duenk ◽

Mario P. L. Calus ◽

Yvonne C. J. Wientjes ◽

Vivian P. Breen ◽

John M. Henshall ◽

...

Keyword(s):

Body Weight ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Genomic Predictions

Following publication of original article [1], we noticed that there was an error: Eq. (3) on page 5 is the genomic relationship matrix that


330 A hybrid model for genomic selection using prioritized SNPs based on FST scores in the presence of non-genotyped animals

Journal of Animal Science ◽

10.1093/jas/skz258.102 ◽

2019 ◽

Vol 97 (Supplement_3) ◽

pp. 51-51

Author(s):

Sajjad Toghiani ◽

Ling-Yun Chang ◽

El H Hay ◽

Andrew J Roberts ◽

Samuel E Aggrey ◽

...

Keyword(s):

Genomic Selection ◽

Hybrid Approach ◽

Computational Cost ◽

Simulated Data ◽

Snp Markers ◽

Genomic Relationship Matrix ◽

Polygenic Effect ◽

Relationship Matrix ◽

Continuous Increase ◽

Missing Genotypes

Abstract: The dramatic advancement in genotyping technology has greatly reduced the complexity and cost of genotyping. The continuous increase in the density of marker panels is resulting in little to no improvement in the accuracy of genomic selection. Direct inversion of the genomic relationship matrix is infeasible for some livestock populations due to the excessive computational cost. In addition, most animals in genetic evaluation programs are non-genotyped. Including these animals in a genomic evaluation requires the imputation of the missing genotypes when using regression methods. To overcome these challenges, a hybrid approach is proposed. This approach fits a subset of SNP markers selected based on FST scores together with a classical polygenic effect. The method was first tested using only genotyped animals and then extended to accommodate non-genotyped animals. The proposed approach was evaluated using simulated data for a trait with a heritability of 0.1 or 0.4, and using weaning weight in a crossbred beef cattle population. When all animals were genotyped, the hybrid approach using only 2.5% of prioritized SNPs exceeded the prediction accuracies of BayesB, BayesC, and GBLUP by more than 7%. When non-genotyped animals were incorporated, the proposed approach significantly outperformed the ss-GBLUP method in terms of prediction accuracy under both simulated heritability scenarios. Although the results seem to depend on the genetic complexity of the trait, the proposed approach resulted in higher prediction accuracies than current methods. Furthermore, its computational costs in terms of CPU time and peak memory are substantially lower than those of the current methods.


Accuracy of genomic selection predictions for hip height in Brahman cattle using different relationship matrices

Pesquisa Agropecuária Brasileira ◽

10.1590/s0100-204x2018000600008 ◽

2018 ◽

Vol 53 (6) ◽

pp. 717-726 ◽

Cited By ~ 1

Author(s):

Michel Marques Farah ◽

Marina Rufino Salinas Fortes ◽

Matthew Kelly ◽

Laercio Ribeiro Porto-Neto ◽

Camila Tangari Meira ◽

...

Keyword(s):

Allele Frequency ◽

High Density ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Genomic Information ◽

Pedigree Data ◽

Snp Chip ◽

Numerator Relationship Matrix ◽

Brahman Cattle

Abstract: The objective of this work was to evaluate the effects of genomic information on the genetic evaluation of hip height in Brahman cattle using different matrices built from genomic and pedigree data. Hip height measurements from 1,695 animals, genotyped with a high-density SNP chip or imputed from a 50 K high-density SNP chip, were used. The numerator relationship matrix (NRM) was compared with the H matrix, which incorporated the NRM and the genomic relationship (G) matrix simultaneously. The genotypes were used to estimate three versions of G: observed allele frequency (HGOF), average minor allele frequency (HGMF), and frequency of 0.5 for all markers (HG50). For matrix comparisons, animal data were either used in full or divided into calibration (80% older animals) and validation (20% younger animals) datasets. The accuracy values for the NRM, HGOF, and HG50 were 0.776, 0.813, and 0.594, respectively. The NRM and HGOF showed similar minor variances for diagonal and off-diagonal elements, as well as for estimated breeding values. The use of genomic information resulted in relationship estimates similar to those obtained based on pedigree; however, HGOF was the best option for estimating the genomic relationship matrix and resulted in a higher prediction accuracy. The ranking of the top 20% animals was very similar for all matrices, but the ranking within them varied depending on the method used.
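
The three allele-frequency choices for G can be illustrated with a short sketch. It assumes a VanRaden-style G, Z Z' / (2 Σ p_j (1 - p_j)), with genotypes coded 0/1/2, and it reads "average minor allele frequency" as a single frequency (the mean MAF) applied to every marker; both are assumptions, since the abstract does not give the formulas. The function name `grm_with_freq` is hypothetical, and the combination of G with the NRM into the H matrix is not shown.

```python
import numpy as np

def grm_with_freq(geno, p):
    """G built from 0/1/2 allele counts using the supplied allele frequencies.

    p may be a per-SNP vector or a single value used for every marker.
    """
    p = np.broadcast_to(np.asarray(p, dtype=float), (geno.shape[1],))
    z = geno - 2.0 * p                                  # center by assumed frequencies
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def three_versions(geno):
    """Return G under observed, average-minor, and 0.5 allele frequencies."""
    p_obs = geno.mean(axis=0) / 2.0
    maf_mean = np.minimum(p_obs, 1.0 - p_obs).mean()    # assumed reading of "average MAF"
    return (
        grm_with_freq(geno, p_obs),      # observed frequencies
        grm_with_freq(geno, maf_mean),   # one average minor allele frequency
        grm_with_freq(geno, 0.5),        # 0.5 for all markers
    )
```

With observed frequencies the columns of Z are centered, so each row of G sums to zero; the other two choices shift the diagonal and off-diagonal elements, which is one way the versions can differ in practice.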
