A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices

2020 ◽  
Vol 52 (1) ◽  
Author(s):  
Biaty Raymond ◽  
Yvonne C. J. Wientjes ◽  
Aniek C. Bouwman ◽  
Chris Schrooten ◽  
Roel F. Veerkamp
BMC Genetics ◽  
2014 ◽  
Vol 15 (1) ◽  
pp. 53 ◽  
Author(s):  
Liuhong Chen ◽  
Changxi Li ◽  
Stephen Miller ◽  
Flavio Schenkel

2021 ◽  
Author(s):  
Charlotte Brault ◽  
Vincent Segura ◽  
Patrice This ◽  
Loïc Le Cunff ◽  
Timothée Flutre ◽  
...  

Crop breeding involves two selection steps: choosing progenitors and selecting offspring within progenies. Genomic prediction, based on genome-wide marker estimation of genetic values, could facilitate these steps. However, its potential usefulness in grapevine (Vitis vinifera L.) has only been evaluated in non-breeding contexts, mainly through cross-validation within a single population. We tested across-population genomic prediction in a more realistic breeding configuration, from a diversity panel to ten bi-parental crosses connected within a half-diallel mating design. Prediction quality was evaluated over 15 traits of interest (related to yield, berry composition, phenology and vigour), for both the average genetic value of each cross (cross mean) and the genetic values of individuals within each cross (individual values). Genomic prediction in these conditions was found to be useful: for cross mean, average per-trait predictive ability was 0.6, while per-cross predictive ability was halved on average, but reached a maximum of 0.7. Mean predictive ability for individual values within crosses was 0.26, about half the within-half-diallel value taken as a reference. For some traits and/or crosses, these across-population predictive ability values are promising for implementing genomic selection in grapevine breeding. This study also provided key insights into variables affecting predictive ability. Per-cross predictive ability was well predicted by genetic distance between parents, and when this predictive ability was below 0.6, it was improved by training set optimization. For individual values, predictive ability mostly depended on trait-related variables (magnitude of the cross effect and heritability). These results will greatly help in designing grapevine breeding programs assisted by genomic prediction.


2021 ◽  
Vol 12 ◽  
Author(s):  
Malachy T. Campbell ◽  
Haixiao Hu ◽  
Trevor H. Yeats ◽  
Lauren J. Brzozowski ◽  
Melanie Caffe-Treml ◽  
...  

The observable phenotype is the manifestation of information that is passed along different organization levels (transcriptional, translational, and metabolic) of a biological system. The widespread use of various omic technologies (RNA-sequencing, metabolomics, etc.) has provided plant geneticists and breeders with a wealth of information on pertinent intermediate molecular processes that may help explain variation in conventional traits such as yield, seed quality, and fitness, among others. A major challenge is effectively using these data to help predict the genetic merit of new, unobserved individuals for conventional agronomic traits. Trait-specific genomic relationship matrices (TGRMs) model the relationships between individuals using genome-wide markers (SNPs) and place greater emphasis on markers that are most relevant to the trait compared to conventional genomic relationship matrices. Given that these approaches define relationships based on putative causal loci, it is expected that these approaches should improve predictions for related traits. In this study we evaluated the use of TGRMs to accommodate information on intermediate molecular phenotypes (referred to as endophenotypes) and to predict an agronomic trait, total lipid content, in oat seed. Nine fatty acids were quantified in a panel of 336 oat lines. Marker effects were estimated for each endophenotype, and were used to construct TGRMs. A multikernel TGRM model (MK-TGRM-BLUP) was used to predict total seed lipid content in an independent panel of 210 oat lines. The MK-TGRM-BLUP approach significantly improved predictions for total lipid content when compared to a conventional genomic BLUP (gBLUP) approach. Given that the MK-TGRM-BLUP approach leverages information on the nine fatty acids to predict genetic values for total lipid content in unobserved individuals, we compared the MK-TGRM-BLUP approach to a multi-trait gBLUP (MT-gBLUP) approach that jointly fits phenotypes for fatty acids and total lipid content. The MK-TGRM-BLUP approach significantly outperformed MT-gBLUP. Collectively, these results highlight the utility of using TGRMs to accommodate information on endophenotypes and improve genomic prediction for a conventional agronomic trait.
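The idea of a trait-specific relationship matrix can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the choice below (weighting each marker by its squared estimated effect) is an assumption, as are the function name `weighted_grm`, the simulated genotypes, and the simulated marker effects. It is not the authors' implementation.

```python
import numpy as np

def weighted_grm(M, weights=None):
    """Genomic relationship matrix from a 0/1/2 genotype matrix (rows = lines).

    With weights=None every marker counts equally (conventional GRM).
    With per-marker weights, markers with larger weights contribute more
    (an assumed form of a trait-specific GRM).
    """
    p = M.mean(axis=0) / 2.0                 # observed allele frequencies
    Z = M - 2.0 * p                          # centered genotypes
    if weights is None:
        weights = np.ones(M.shape[1])
    scale = np.sum(weights * 2.0 * p * (1.0 - p))
    return (Z * weights) @ Z.T / scale       # Z diag(w) Z' / scale

rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(20, 200)).astype(float)   # toy genotypes
marker_effects = rng.normal(size=200)                   # hypothetical estimates

G = weighted_grm(M)                                     # conventional GRM
T = weighted_grm(M, weights=marker_effects ** 2)        # assumed TGRM
```

In a multikernel model, one such matrix per endophenotype would enter the model as a separate random effect; that step is not shown here.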


BMC Genetics ◽  
2015 ◽  
Vol 16 (1) ◽  
Author(s):  
S. van den Berg ◽  
M. P. L. Calus ◽  
T. H. E. Meuwissen ◽  
Y. C. J. Wientjes

2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-49
Author(s):  
Andre Garcia ◽  
Yutaka Masuda ◽  
Stephen P Miller ◽  
Ignacy Misztal ◽  
Daniela Lourenco

Abstract With the increasing number of genotyped animals, the algorithm for proven and young (APY) can be used to compute the inverse of the genomic relationship matrix (G⁻¹_APY) in genomic BLUP (GBLUP) and single-step GBLUP (ssGBLUP). This algorithm also allows the use of all genotyped animals to calculate SNP effects from genomic EBV (GEBV), which can then be used to obtain indirect predictions (IP) for interim evaluations, or as genomic predictions for animals not included in official evaluations. The objective of the study was to evaluate the quality of IP from GBLUP with an increasing number of genotyped animals. Birth weight, weaning weight and post-weaning gain phenotypes and genotypes were provided by the American Angus Association. Phenotypes and genotypes were divided into 3 scenarios based on birth year: genotyped animals born up to 2013 (114,937), 2014 (183,847) and 2015 (280,506). A 3-trait model was fit and GBLUP with APY was used to calculate GEBV and SNP effects. To calculate G⁻¹_APY, 19,021 core animals were randomly sampled from animals born up to 2013. Core animals remained the same, whereas the number of non-core animals increased as more genotyped animals were added. Additional analyses had updated core animals for each scenario. SNP effects were also calculated based on G⁻¹_APY and on G⁻¹ only for core animals (G⁻¹_core). IP were computed for all animals in each scenario by multiplying the centered genotypes by the SNP effects. To assess the quality of IP, the correlation between IP and GEBV was calculated. Correlations were greater than 0.99 for all traits in all scenarios. Despite the increase of non-core animals in APY, GEBV were successfully retrieved from SNP effects using IP. When SNP effects were calculated based on G⁻¹_core, updating the core animals as the number of genotyped animals increases seems to be the best choice.
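The indirect-prediction step described above (centered genotypes multiplied by SNP effects, then correlated with GEBV) can be sketched on toy data. This sketch makes several assumptions: it uses a plain dense inverse of G rather than APY, it backsolves SNP effects as Z'G⁻¹g / k with G = ZZ'/k, and the genotypes, allele frequencies, and GEBV are simulated placeholders rather than the Angus data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_snps = 50, 500

# Toy genotypes (0/1/2) drawn from assumed base allele frequencies.
p = rng.uniform(0.1, 0.9, size=n_snps)
M = rng.binomial(2, p, size=(n_animals, n_snps)).astype(float)

Z = M - 2.0 * p                       # centered genotypes
k = 2.0 * np.sum(p * (1.0 - p))       # usual GRM scaling factor
G = Z @ Z.T / k                       # dense G (no APY in this sketch)

gebv = rng.normal(size=n_animals)     # placeholder GEBV from an evaluation

# Backsolve SNP effects from GEBV: a = Z' G^{-1} g / k.
snp_effects = Z.T @ np.linalg.solve(G, gebv) / k

# Indirect predictions: centered genotypes times SNP effects.
ip = Z @ snp_effects
corr = np.corrcoef(ip, gebv)[0, 1]

# The same SNP effects give predictions for animals outside the evaluation.
M_new = rng.binomial(2, p, size=(5, n_snps)).astype(float)
ip_new = (M_new - 2.0 * p) @ snp_effects
```

In this toy setting, IP reproduce the GEBV up to rounding because Z a = (ZZ'/k) G⁻¹ g = g. With APY, the inverse is a sparse approximation, which is why the study checks the correlation empirically.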


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Hailiang Song ◽  
Shaopan Ye ◽  
Yifan Jiang ◽  
Zhe Zhang ◽  
Qin Zhang ◽  
...  

Abstract Background: For genomic selection in populations with a small reference population, combining populations of the same breed or populations of related breeds is an effective way to increase the size of the reference population. However, genomic predictions based on single nucleotide polymorphism (SNP)-chip genotype data using combined populations with different genetic backgrounds or from different breeds have not shown a clear advantage over using within-population or within-breed predictions. The increasing availability of whole-genome sequencing (WGS) data provides new opportunities for combined population genomic prediction. Our objective was to investigate the accuracy of genomic prediction using imputation-based WGS data from combined populations in pigs. Using 80K SNP panel genotypes, WGS genotypes, or genotypes on WGS variants that were pruned based on linkage disequilibrium (LD), three methods [genomic best linear unbiased prediction (GBLUP), single-step (ss)GBLUP, and genomic feature (GF)BLUP] were implemented with different prior information to identify the best method to improve the accuracy of genomic prediction for combined populations in pigs. Results: In total, 2089 and 2043 individuals with production and reproduction phenotypes, respectively, from three Yorkshire populations with different genetic backgrounds were genotyped with the PorcineSNP80 panel. Imputation accuracy from 80K to WGS variants reached 92%. The results showed that use of the WGS data compared to the 80K SNP panel did not increase the accuracy of genomic prediction in a single population, but using WGS data with LD pruning and GFBLUP with prior information did yield higher accuracy than the 80K SNP panel. For the 80K SNP panel genotypes, using the combined population resulted in a slight improvement, no change, or even a slight decrease in accuracy in comparison with the single population for GBLUP and ssGBLUP, while accuracy increased by 1 to 2.4% when using WGS data. Notably, the GFBLUP method did not perform well for either the combined population or the single populations. Conclusions: The use of WGS data was beneficial for combined population genomic prediction. Simply increasing the number of SNPs to the WGS level did not increase accuracy for a single population, while using pruned WGS data based on LD and GFBLUP with prior information could yield higher accuracy than the 80K SNP panel.


2015 ◽  
Vol 47 (1) ◽  
pp. 5 ◽  
Author(s):  
Yvonne Wientjes ◽  
Roel F Veerkamp ◽  
Piter Bijma ◽  
Henk Bovenhuis ◽  
Chris Schrooten ◽  
...  

2017 ◽  
Author(s):  
Yvonne C.J. Wientjes ◽  
Piter Bijma ◽  
Jérémie Vandenplas ◽  
Mario P.L. Calus

ABSTRACT Different methods are available to calculate multi-population genomic relationship matrices. Since those matrices differ in base population, it is anticipated that the method used to calculate the genomic relationship matrix affects the estimates of genetic variances, covariances and correlations. The aim of this paper is to define a multi-population genomic relationship matrix to estimate current genetic variances within and genetic correlations between populations. The genomic relationship matrix containing two populations consists of four blocks: one block for population 1, one block for population 2, and two blocks for relationships between the populations. It is known from the literature that current genetic variances are estimated when the current population is used as the base population of the relationship matrix. In this paper, we theoretically derived the properties of the genomic relationship matrix to estimate genetic correlations and validated them using simulations. When the scaling factors of the genomic relationship matrix fulfill the property , the genetic correlation is estimated even though estimated variance components are not necessarily related to the current population. When this property is not met, the correlation based on estimated variance components should be multiplied by to rescale the genetic correlation. In this study we present a genomic relationship matrix which directly results in current genetic variances as well as genetic correlations between populations.
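The four-block structure described above can be sketched as follows. This is an illustration under stated assumptions, not the matrix defined in the paper: genotypes are centered with allele frequencies observed within each population, and the between-population block is scaled by the square root of the product of the two within-population scaling factors. That cross-block scaling is one common choice and is not taken from the abstract, whose formulas did not survive extraction. The helper `centered` and all data are hypothetical.

```python
import numpy as np

def centered(M):
    """Center a 0/1/2 genotype matrix with its own allele frequencies.

    Returns the centered matrix and the scaling factor 2 * sum(p * (1 - p)).
    """
    p = M.mean(axis=0) / 2.0
    return M - 2.0 * p, 2.0 * np.sum(p * (1.0 - p))

rng = np.random.default_rng(2)
n1, n2, n_snps = 15, 10, 300

# Toy genotypes for two populations with different allele frequencies.
M1 = rng.binomial(2, rng.uniform(0.1, 0.9, n_snps), size=(n1, n_snps)).astype(float)
M2 = rng.binomial(2, rng.uniform(0.1, 0.9, n_snps), size=(n2, n_snps)).astype(float)

Z1, k1 = centered(M1)
Z2, k2 = centered(M2)

G11 = Z1 @ Z1.T / k1                   # block for population 1
G22 = Z2 @ Z2.T / k2                   # block for population 2
G12 = Z1 @ Z2.T / np.sqrt(k1 * k2)     # between-population block (assumed scaling)

# Assemble the four blocks into one multi-population matrix.
G = np.block([[G11, G12],
              [G12.T, G22]])
```

With this scaling the assembled matrix equals WW' for W formed by stacking Z1/√k1 on Z2/√k2, so it is symmetric and positive semi-definite by construction.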

