Genomic Heritability: A Ragged Diagonal Between Bias and Variance

2021 ◽  
Author(s):  
Mitchell J. Feldmann ◽  
Hans-Peter Piepho ◽  
Steven J. Knapp

Many important traits in plants, animals, and microbes are polygenic and are therefore difficult to improve through traditional marker-assisted selection. Genomic prediction addresses this by enabling the inclusion of all genetic data in a mixed model framework. The main method for predicting breeding values is genomic best linear unbiased prediction (GBLUP), which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. The use of relationship matrices allows information to be shared for estimating the genetic values for observed entries and predicting genetic values for unobserved entries. One of the key parameters of such models is genomic heritability (h2g), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms. Here we discuss the relationship between several common methods for calculating the genomic relationship matrix and propose a new matrix based on the average semivariance that yields accurate estimates of genomic variance in the observed population regardless of the focal population quality as well as accurate breeding value predictions in unobserved samples. Notably, our proposed method is highly similar to the approach presented by Legarra (2016) despite different mathematical derivations and statistical perspectives and only deviates from the classic approach presented in VanRaden (2008) by a scaling factor. With current approaches, we found that the genomic heritability tends to be either over- or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population. Unlike its predecessors, our newly proposed kinship matrix KASV yields accurate estimates of h2g in the observed population, generalizes to larger populations, and produces BLUPs equivalent to common methods in plants and animals.
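As a rough illustration of the classic construction the abstract refers to, the sketch below builds a genomic relationship matrix in the style of VanRaden (2008): genotypes coded 0/1/2 are centered by twice the allele frequency, and the cross-product is divided by 2Σp(1 − p). The function name `vanraden_grm`, the toy genotypes, and the use of NumPy are assumptions made for illustration. The KASV scaling proposed by the authors is not reproduced here, since the abstract does not state it.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix in the style of VanRaden (2008).

    M is an (n individuals x m markers) genotype matrix coded 0/1/2.
    Allele frequencies are estimated from the observed sample (an assumption).
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0             # per-marker allele frequency
    Z = M - 2.0 * p                      # center each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))  # scaling by expected heterozygosity
    return Z @ Z.T / denom

# Toy data: 6 individuals, 50 markers (purely hypothetical genotypes).
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 50))
K = vanraden_grm(M)

# The average diagonal element of K is one of the quantities the abstract
# links to over- or underestimation of genomic heritability.
print("average diagonal of K:", np.trace(K) / K.shape[0])
```

Because the columns of Z are centered on the observed sample, every row of K sums to zero, which is a quick sanity check on the centering step.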

2020 ◽  
Vol 10 (6) ◽  
pp. 2069-2078 ◽  
Author(s):  
Christos Palaiokostas ◽  
Shannon M. Clarke ◽  
Henrik Jeuthe ◽  
Rudiger Brauning ◽  
Timothy P. Bilton ◽  
...  

Arctic charr (Salvelinus alpinus) is a species of high economic value for the aquaculture industry, and of high ecological value due to its Holarctic distribution in both marine and freshwater environments. Novel genome sequencing approaches enable the study of population and quantitative genetic parameters even on species with limited or no prior genomic resources. Low coverage genotyping by sequencing (GBS) was applied in a selected strain of Arctic charr in Sweden originating from a landlocked freshwater population. For the needs of the current study, animals from year classes 2013 (171 animals, parental population) and 2017 (759 animals; 13 full sib families) were used as a template for identifying genome wide single nucleotide polymorphisms (SNPs). GBS libraries were constructed using the PstI and MspI restriction enzymes. Approximately 14.5K SNPs passed quality control and were used for estimating a genomic relationship matrix. Thereafter a wide range of analyses were conducted in order to gain insights regarding genetic diversity and investigate the efficiency of the genomic information for parentage assignment and breeding value estimation. Heterozygosity estimates for both year classes suggested a slight excess of heterozygotes. Furthermore, FST estimates among the families of year class 2017 ranged between 0.009 – 0.066. Principal components analysis (PCA) and discriminant analysis of principal components (DAPC) were applied aiming to identify the existence of genetic clusters among the studied population. Results obtained were in accordance with pedigree records allowing the identification of individual families. Additionally, DNA parentage verification was performed, with results in accordance with the pedigree records with the exception of a putative dam where full sib genotypes suggested a potential recording error. 
Breeding value estimation for juvenile growth using the estimated genomic relationship matrix clearly outperformed the pedigree equivalent in terms of prediction accuracy (0.51 as opposed to 0.31). Overall, low coverage GBS has proven to be a cost-effective genotyping platform that is expected to boost the selection efficiency of the Arctic charr breeding program.


2021 ◽  
Author(s):  
Adam R Festa ◽  
Ross Whetten

Computer simulations of breeding strategies are an essential resource for tree breeders because they allow exploratory analyses into potential long-term impacts on genetic gain and inbreeding consequences without bearing the cost, time, or resource requirements of field experiments. Previous work has modeled the potential long-term implications on inbreeding and genetic gain using random mating and phenotypic selection. Reduction in sequencing costs has enabled the use of DNA marker-based relationship matrices in addition to or in place of pedigree-based allele sharing estimates; this has been shown to provide a significant increase in the accuracy of progeny breeding value prediction. A potential pitfall of genomic selection using genetic relationship matrices is increased coancestry among selections, leading to the accumulation of deleterious alleles and inbreeding depression. We used simulation to compare the relative genetic gain and risk of inbreeding depression within a breeding program similar to loblolly pine, utilizing pedigree-based or marker-based relationships over ten generations. We saw a faster rate of purging deleterious alleles when using a genomic relationship matrix based on markers that track identity-by-descent of segments of the genome. Additionally, we observed an increase in the rate of genetic gain when using a genomic relationship matrix instead of a pedigree-based relationship matrix. While the genetic variance of populations decreased more rapidly when using genomic-based relationship matrices as opposed to pedigree-based, there appeared to be no long-term consequences on the accumulation of deleterious alleles within the simulated breeding strategy.


2013 ◽  
Vol 55 (3) ◽  
pp. 165-171
Author(s):  
Joon-Ho Lee ◽  
Kwang-Hyun Cho ◽  
Chung-Il Cho ◽  
Kyung-Do Park ◽  
Deuk Hwan Lee

2018 ◽  
Vol 98 (4) ◽  
pp. 750-759 ◽  
Author(s):  
Z. Karimi ◽  
M. Sargolzaei ◽  
J.A.B. Robinson ◽  
F.S. Schenkel

A single-nucleotide polymorphism-based genomic relationship matrix (GSNP) discriminates less well between identity-by-state and identity-by-descent (IBD) alleles than a multi-locus haplotype-based relationship matrix (GHAP), which can better capture IBD alleles and recent relationships. We aimed to compare the prediction reliability and prediction bias of genomic best linear unbiased prediction (GBLUP) using either GSNP or GHAP in Holstein cattle. Therefore, a total of 57 traits with a wide range of heritability values were analyzed. Classical validation tests were done using a validation dataset comprising 50k genotype records of 561–669 proven bulls born in 2010–2011 with an official estimated breeding value (EBV) in 2016 and a training set of 5314–19 678 bulls born before 2010, depending on the trait. The method for building the genomic relationship matrix (G) had a significant but small effect on observed reliability (r2GEBV) (p < 0.0001) and bias (p < 0.0001). A significant interaction between G and the level of trait heritability on r2GEBV and bias was also observed (p < 0.0001). The small gains in r2GEBV and small reductions in the bias by using GHAPBLUP were larger when predicting moderate- to high-heritability traits than low-heritability traits.


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 262-262
Author(s):  
Ling-Yun Chang ◽  
Sajjad Toghiani ◽  
E L Hamidi Hay ◽  
Samuel E Aggrey ◽  
Romdhane Rekaya

Abstract Using low to moderate density SNP marker panels, a substantial increase in accuracy was achieved. The dramatic increase in the number of identified variants due to advances in next generation sequencing was expected to significantly increase the accuracy of genomic selection (GS). Unfortunately, little to no improvement was observed. For mixed model-based approaches, using all SNPs in the panel to compute the observed relationship matrix (G) will not increase accuracy as the additive relationships between individuals can be accurately estimated using a much smaller number of markers. Due to these limitations, variant prioritization has become a necessity to improve accuracy. Further, it has been shown that weighting SNPs when calculating G could be effective in improving the accuracy of GS. FST, as a measure of population differentiation, has been successfully used to identify genome segments under selection pressure. Consequently, FST could be used both to prioritize SNPs and to derive their relative weight in the calculation of the genomic relationship matrix. A population of 15,000 animals genotyped for 400K SNP markers uniformly distributed along 10 chromosomes was simulated. A trait with heritability 0.3 genetically controlled by two hundred QTL was generated. The top 20K SNPs based on their FST scores were used either alone or with the remaining 380K SNPs to compute G with or without weighting. When only the top 20K SNPs were used to compute G, two scenarios were considered: 1) equal weights for all SNPs or 2) weights proportional to the SNP FST scores. When all 400K SNP markers were used, different weighting scenarios were evaluated. The results clearly showed that prioritizing SNP markers based on their FST score and using the latter to compute relative weights increased the genetic similarity between training and validation animals and resulted in more than 5% improvement in the accuracy of GS.
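The idea of weighting SNPs when computing G can be sketched as follows. The abstract does not give the weighting formula, so this uses one commonly seen form, G = Z D Zᵀ / (2Σ wⱼpⱼ(1 − pⱼ)) with D = diag(w), as an assumption. The function name `weighted_grm` is hypothetical, and the random weights merely stand in for FST scores, which are not computed here.

```python
import numpy as np

def weighted_grm(M, w):
    """Weighted genomic relationship matrix (assumed form, not the authors').

    M is an (n x m) genotype matrix coded 0/1/2; w holds one non-negative
    weight per marker, e.g. proportional to a per-SNP FST score.
    """
    M = np.asarray(M, dtype=float)
    w = np.asarray(w, dtype=float)
    p = M.mean(axis=0) / 2.0                 # per-marker allele frequency
    Z = M - 2.0 * p                          # center each marker column
    denom = 2.0 * np.sum(w * p * (1.0 - p))  # weighted heterozygosity scaling
    return (Z * w) @ Z.T / denom             # equals Z @ diag(w) @ Z.T / denom

# Toy data: 8 individuals, 40 markers, with placeholder weights.
rng = np.random.default_rng(1)
M = rng.integers(0, 3, size=(8, 40))
w = rng.uniform(0.0, 1.0, size=40)           # hypothetical stand-in for FST scores

G_equal = weighted_grm(M, np.ones(40))       # scenario 1: equal weights
G_weighted = weighted_grm(M, w)              # scenario 2: score-proportional weights
```

With all weights equal, the expression reduces to the unweighted cross-product scaled by 2Σp(1 − p), so the two scenarios in the abstract differ only in the choice of `w`.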


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Pascal Duenk ◽  
Mario P. L. Calus ◽  
Yvonne C. J. Wientjes ◽  
Vivian P. Breen ◽  
John M. Henshall ◽  
...  

Following publication of original article [1], we noticed that there was an error: Eq. (3) on page 5 is the genomic relationship matrix that


2014 ◽  
Vol 54 (5) ◽  
pp. 544 ◽  
Author(s):  
N. Moghaddar ◽  
A. A. Swan ◽  
J. H. J. van der Werf

The objective of this study was to predict the accuracy of genomic prediction for 26 traits, including weight, muscle, fat, and wool quantity and quality traits, in Australian sheep based on a large, multi-breed reference population. The reference population consisted of two research flocks, with the main breeds being Merino, Border Leicester (BL), Poll Dorset (PD), and White Suffolk (WS). The genomic estimated breeding value (GEBV) was based on GBLUP (genomic best linear unbiased prediction), applying a genomic relationship matrix calculated from the 50K Ovine SNP chip marker genotypes. The accuracy of GEBV was evaluated as the Pearson correlation coefficient between GEBV and accurate estimated breeding value based on progeny records in a set of genotyped industry animals. The accuracies of weight traits were relatively low to moderate in PD and WS breeds (0.11–0.27) and moderate to relatively high in BL and Merino (0.25–0.63). The accuracy of muscle and fat traits was moderate to relatively high across all breeds (between 0.21 and 0.55). The accuracy of GEBV of yearling and adult wool traits in Merino was, on average, high (0.33–0.75). The results showed the accuracy of genomic prediction depends on trait heritability and the effective size of the reference population, whereas the observed GEBV accuracies were more related to the breed proportions in the multi-breed reference population. No extra gain in within-breed GEBV accuracy was observed based on across breed information. More investigations are required to determine the precise effect of across-breed information on within-breed genomic prediction.

