Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations

2017
Vol 130 (11)
pp. 2231-2247
Author(s):
R. Rincent
A. Charcosset
L. Moreau

BMC Genomics
2021
Vol 22 (1)
Author(s):
Katie M. O’Connor
Ben J. Hayes
Craig M. Hardner
Mobashwer Alam
Robert J. Henry
...  

Abstract

Background: Improving yield prediction and selection efficiency is critical for tree breeding. This is vital for macadamia, where the time from crossing to the release of new cultivars is almost a quarter of a century. Genomic selection (GS) is a useful tool in plant breeding, particularly for perennial trees, as it can increase the rate of genetic gain and shorten the breeding cycle. We investigated the potential of GS methods to increase genetic gain and accelerate selection efficiency in the Australian macadamia breeding program, in comparison with traditional breeding methods. This study evaluated the prediction accuracy of GS in a macadamia breeding population of 295 full-sib progeny from 32 families (29 parents, reciprocals combined), along with a subset of parents. Historical yield data for tree ages 5 to 8 years were used, along with a set of 4113 SNP markers. The traits of focus were average nut yield from tree ages 5 to 8 years and yield stability, measured as the standard deviation of yield over these 4 years. GBLUP GS models were used to obtain genomic estimated breeding values for each genotype, with five-fold cross-validation and two techniques: prediction across related populations and prediction across unrelated populations.

Results: Narrow-sense heritability of yield and yield stability was low (h² = 0.30 and 0.04, respectively). Prediction accuracy for yield was 0.57 for predictions across related populations and 0.14 for predictions across unrelated populations. Accuracy of prediction of yield stability was high (r = 0.79) for predictions across related populations. Predicted genetic gain for yield using GS in related populations was 474 g/year, more than double that of traditional breeding methods (226 g/year), due to the halving of generation length from 8 to 4 years.

Conclusions: The results of this study indicate that incorporating GS for yield into the Australian macadamia breeding program may accelerate genetic gain through a reduction in generation length, though the cost of genotyping appears to be a constraint at present.
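The link between generation length and annual genetic gain in the abstract above can be illustrated with the standard breeder's equation, gain per year = (selection intensity × accuracy × additive genetic standard deviation) / generation length. The sketch below is only an illustration: the function name `gain_per_year` and all input values are hypothetical and are not taken from the study.

```python
def gain_per_year(intensity, accuracy, sigma_a, generation_years):
    """Expected genetic gain per year from the breeder's equation."""
    if generation_years <= 0:
        raise ValueError("generation_years must be positive")
    return intensity * accuracy * sigma_a / generation_years


# Hypothetical inputs (not from the study): same intensity, accuracy and
# additive standard deviation, with generation length of 8 versus 4 years.
conventional = gain_per_year(1.4, 0.57, 1000.0, 8)
genomic = gain_per_year(1.4, 0.57, 1000.0, 4)

# With every other term held fixed, halving the generation length doubles
# the gain per year.
print(genomic / conventional)
```

In practice accuracy and selection intensity usually differ between phenotypic and genomic selection, so the realised ratio need not be exactly two.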


Genes
2021
Vol 12 (2)
pp. 210
Author(s):
Sang V. Vu
Cedric Gondro
Ngoc T. H. Nguyen
Arthur R. Gilmour
Rick Tearle
...  

Genomic selection has been widely used in terrestrial animals but has had limited application in aquaculture due to relatively high genotyping costs. Genomic information has an important role in improving the prediction accuracy of breeding values, especially for traits that are difficult or expensive to measure. The purposes of this study were to (i) further evaluate the use of genomic information to improve prediction accuracies of breeding values from, (ii) compare different prediction methods (BayesA, BayesCπ and GBLUP) on prediction accuracies in our field data, and (iii) investigate the effects of different SNP marker densities on prediction accuracies of traits in the Portuguese oyster (Crassostrea angulata). The traits studied are all of economic importance and included morphometric traits (shell length, shell width, shell depth, shell weight), edibility traits (tenderness, taste, moisture content), and disease traits (Polydora sp. and Marteilioides chungmuensis). A total of 18,849 single nucleotide polymorphisms were obtained from genotyping by sequencing and used to estimate genetic parameters (heritability and genetic correlation) and the prediction accuracy of genomic selection for these traits. Multi-locus mixed model analysis indicated high estimates of heritability for edibility traits; 0.44 for moisture content, 0.59 for taste, and 0.72 for tenderness. The morphometric traits, shell length, shell width, shell depth and shell weight had estimated genomic heritabilities ranging from 0.28 to 0.55. The genomic heritabilities were relatively low for the disease related traits: Polydora sp. prevalence (0.11) and M. chungmuensis (0.10). Genomic correlations between whole weight and other morphometric traits were from moderate to high and positive (0.58–0.90). However, unfavourably positive genomic correlations were observed between whole weight and the disease traits (0.35–0.37). 
The genomic best linear unbiased prediction method (GBLUP) showed slightly higher accuracy for the traits studied (0.240–0.794) than both the BayesA and BayesCπ methods, but these differences were not significant. In addition, low-density panels of 3000 SNP markers show large potential for genomic selection in this population. There is therefore a prospect of improving morphometric, edibility and disease-related traits in this species using genomic information.


Cells
2021
Vol 10 (12)
pp. 3372
Author(s):
Cesar A. Medina
Harpreet Kaur
Ian Ray
Long-Xi Yu

Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that low to moderate prediction accuracy values (<70%) were obtained for complex traits, such as yield and abiotic stress resistance. There is a need to increase prediction accuracy in order to employ GS in breeding programs. In this paper we reviewed different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data on alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and genome-wide association study (GWAS) marker-trait association scores based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed 13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
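One way to picture a weighted GBLUP analysis is through the genomic relationship matrix, where each marker contributes in proportion to a weight such as a variable-importance value or a GWAS score. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `weighted_grm`, the column-mean centring, and the scaling to a mean diagonal of 1 are all choices made here for the example. Marker dosages can be any numeric coding (for instance 0 to 4 in an autotetraploid).

```python
import numpy as np


def weighted_grm(M, weights):
    """Weighted genomic relationship matrix G = Z diag(w) Z' / scale.

    M       : (n individuals x m markers) array of allele dosages.
    weights : m non-negative marker weights (hypothetical input, e.g.
              importance values or GWAS scores).
    """
    M = np.asarray(M, dtype=float)
    w = np.asarray(weights, dtype=float)
    if w.shape != (M.shape[1],):
        raise ValueError("need exactly one weight per marker")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w * (w.size / w.sum())        # rescale so the mean weight is 1
    Z = M - M.mean(axis=0)            # centre each marker column
    scale = np.sum(w * Z.var(axis=0)) # makes the mean diagonal of G equal 1
    if scale == 0:
        raise ValueError("no weighted marker shows any variation")
    return (Z * w) @ Z.T / scale
```

With equal weights the function reduces to an unweighted relationship matrix, which makes it easy to compare weighted and unweighted predictions on the same data.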


Crop Science
2017
Vol 57 (3)
pp. 1325-1337
Author(s):
Alexandra Duhnen
Amandine Gras
Simon Teyssèdre
Michel Romestant
Bruno Claustres
...  

2018
Vol 132 (1)
pp. 81-96
Author(s):
Simon Rio
Tristan Mary-Huard
Laurence Moreau
Alain Charcosset

Author(s):  
Sikiru Adeniyi Atanda
Michael Olsen
Juan Burgueño
Jose Crossa
Daniel Dzidzienyo
...  

Key message: Historical data from breeding programs can be efficiently used to improve genomic selection accuracy, especially when the training set is optimized to subset individuals most informative of the target testing set.

Abstract: The current strategy for large-scale implementation of genomic selection (GS) at the International Maize and Wheat Improvement Center (CIMMYT) global maize breeding program has been to train models using information from full-sibs in a “test-half-predict-half approach.” Although effective, this approach has limitations, as it requires large full-sib populations and limits the ability to shorten variety testing and breeding cycle times. The primary objective of this study was to identify optimal experimental and training set designs to maximize prediction accuracy of GS in CIMMYT’s maize breeding programs. Training set (TS) design strategies were evaluated to determine the most efficient use of phenotypic data collected on relatives for genomic prediction (GP) using datasets containing 849 (DS1) and 1389 (DS2) DH-lines evaluated as testcrosses in 2017 and 2018, respectively. Our results show there is merit in the use of multiple bi-parental populations as TS when selected using algorithms to maximize relatedness between the training and prediction sets. In a breeding program where relevant past breeding information is not readily available, the phenotyping expenditure can be spread across connected bi-parental populations by phenotyping only a small number of lines from each population. This significantly improves prediction accuracy compared to within-population prediction, especially when the TS for within full-sib prediction is small. Finally, we demonstrate that prediction accuracy in either sparse testing or “test-half-predict-half” can further be improved by optimizing which lines are planted for phenotyping and which lines are to be only genotyped for advancement based on GP.
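The idea of choosing a training set that maximizes relatedness to the prediction set can be sketched with a simple ranking heuristic. The abstract does not say which algorithms were used, so the code below is only an assumed stand-in: it ranks candidate lines by their mean genomic relationship to the target lines and keeps the top ones. The function name `select_training_set` and the toy relationship matrix are hypothetical.

```python
import numpy as np


def select_training_set(G, candidates, targets, size):
    """Pick the `size` candidates with the highest mean relationship to targets.

    G          : (n x n) genomic relationship matrix.
    candidates : indices of lines that could be phenotyped.
    targets    : indices of lines to be predicted.
    """
    G = np.asarray(G, dtype=float)
    candidates = np.asarray(candidates)
    targets = np.asarray(targets)
    # Mean relationship of each candidate to the whole target set.
    mean_rel = G[np.ix_(candidates, targets)].mean(axis=1)
    # Stable sort on the negated values ranks from most to least related.
    order = np.argsort(-mean_rel, kind="stable")
    return candidates[order[:size]]


# Toy relationship matrix for four lines; line 3 is the prediction target.
G_toy = np.array([
    [1.0, 0.1, 0.8, 0.0],
    [0.1, 1.0, 0.2, 0.6],
    [0.8, 0.2, 1.0, 0.1],
    [0.0, 0.6, 0.1, 1.0],
])
chosen = select_training_set(G_toy, candidates=[0, 1, 2], targets=[3], size=2)
print(chosen)  # lines 1 and 2 are the most related to line 3
```

A ranking like this ignores redundancy among the selected lines; criteria that also account for relationships within the training set are a natural refinement.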


Plants
2020
Vol 9 (6)
pp. 719
Author(s):
Mulusew Fikere
Denise M. Barbulescu
M. Michelle Malmberg
Pankaj Maharjan
Phillip A. Salisbury
...  

Genomic selection accelerates genetic progress in crop breeding through the prediction of future phenotypes of selection candidates based only on their genomic information. Here we report genetic correlations and genomic prediction accuracies for 22 agronomic, disease, and seed quality traits measured across multiple years (2015–2017) in replicated trials under rain-fed and irrigated conditions in Victoria, Australia. Two hundred and two spring canola lines were genotyped for 62,082 single nucleotide polymorphisms (SNPs) using transcriptomic genotype-by-sequencing (GBSt). Traits were evaluated in single-trait and bivariate genomic best linear unbiased prediction (GBLUP) models with cross-validation. The GBLUP models were also expanded to include genotype-by-environment (G × E) interactions. Genomic heritability varied from 0.31 to 0.66. Genetic correlations were highly positive within traits across locations and years. Oil content was positively correlated with most agronomic traits. Strong, not previously documented, negative correlations were observed between average internal infection (a measure of blackleg disease) and arachidic and stearic acids. The genetic correlations between fatty acid traits followed the expected patterns based on oil biosynthesis pathways. Genomic prediction accuracy ranged from 0.29 for emergence count to 0.69 for seed yield. Incorporating G × E improved prediction accuracy by up to 6%. The genomic prediction accuracies achieved indicate that genomic selection is ready for application in canola breeding.
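A single-trait GBLUP with k-fold cross-validation, as referred to in the abstract above, can be sketched in a few lines. This is a minimal sketch under several assumptions that do not come from the paper: markers are coded 0/1/2 for a diploid, the relationship matrix follows VanRaden's first method, the heritability `h2` is supplied rather than estimated, and accuracy is taken as the correlation between predictions and observed phenotypes in each held-out fold. The function names are hypothetical.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix from 0/1/2 genotypes (VanRaden method 1)."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                      # allele frequencies
    Z = M - 2.0 * p                               # centre by twice the frequency
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_cv_accuracy(M, y, h2=0.3, k=5, seed=0):
    """Mean correlation between GBLUP predictions and held-out phenotypes."""
    y = np.asarray(y, dtype=float)
    G = vanraden_grm(M)
    lam = (1.0 - h2) / h2                         # residual / genetic variance ratio
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(y.size), k)
    accuracies = []
    for test in folds:
        train = np.setdiff1d(np.arange(y.size), test)
        mu = y[train].mean()                      # intercept from training data only
        A = G[np.ix_(train, train)] + lam * np.eye(train.size)
        alpha = np.linalg.solve(A, y[train] - mu)
        pred = mu + G[np.ix_(test, train)] @ alpha
        accuracies.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accuracies))
```

A call such as `gblup_cv_accuracy(M, y, h2=0.4, k=5)` returns the mean accuracy across folds. In a real analysis the variance components would normally be estimated (for example by REML) inside each training fold rather than fixed in advance.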


2021
Vol 53 (1)
Author(s):
Cheng Bian
Dzianis Prakapenka
Cheng Tan
Ruifei Yang
Di Zhu
...  

Abstract

Background: Genomic selection using single nucleotide polymorphism (SNP) markers has been widely used for genetic improvement of livestock, but most current methods of genomic selection are based on SNP models. In this study, we investigated the prediction accuracies of haplotype models based on fixed chromosome distances and gene boundaries compared to those of SNP models for genomic prediction of phenotypic values. We also examined the reasons for the successes and failures of haplotype genomic prediction.

Methods: We analyzed a swine population of 3195 Duroc boars with records on eight traits: body judging score (BJS), teat number (TN), age (AGW), loin muscle area (LMA), loin muscle depth (LMD) and back fat thickness (BF) at 100 kg live weight, and average daily gain (ADG) and feed conversion rate (FCR) from 30 to 100 kg live weight. Ten-fold validation was used to evaluate the prediction accuracy of each SNP model and each multi-allelic haplotype model based on 488,124 autosomal SNPs from low-coverage sequencing. Haplotype blocks were defined using fixed chromosome distances or gene boundaries.

Results: Compared to the best SNP model, the accuracy of predicting phenotypic values using a haplotype model was greater by 7.4% for BJS, 7.1% for AGW, 6.6% for ADG, 4.9% for FCR, 2.7% for LMA, 1.9% for LMD, 1.4% for BF, and 0.3% for TN. The use of gene-based haplotype blocks resulted in the best prediction accuracy for LMA, LMD, and TN. Compared to estimates of SNP additive heritability, estimates of haplotype epistasis heritability were strongly correlated with the increase in prediction accuracy by haplotype models. The increase in prediction accuracy was largest for BJS, AGW, ADG, and FCR, which also had the largest estimates of haplotype epistasis heritability: 24.4% for BJS, 14.3% for AGW, 14.5% for ADG, and 17.7% for FCR. SNP and haplotype heritability profiles across the genome identified several genes with large genetic contributions to phenotypes: NUDT3 for LMA, LMD and BF, VRTN for TN, COL5A2 for BJS, BSND for ADG, and CARTPT for FCR.

Conclusions: Haplotype prediction models improved the accuracy of genomic prediction of phenotypes in Duroc pigs. For some traits, the best prediction accuracy was obtained with haplotypes defined using gene regions, which provides evidence that functional genomic information can improve the accuracy of haplotype genomic prediction for certain traits.
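To make the notion of a multi-allelic haplotype model with fixed-distance blocks more concrete, the sketch below turns phased SNP haplotypes into a design matrix that counts, for each individual, the copies (0, 1 or 2) of every distinct haplotype allele within each block. It is an illustrative assumption rather than the authors' pipeline: the function name `haplotype_design`, the input layout (rows 2i and 2i+1 hold the two haplotypes of individual i), and the use of integer division of base-pair positions to define blocks are all choices made here.

```python
import numpy as np


def haplotype_design(haps, positions, block_bp):
    """Count haplotype-allele copies per individual for fixed-width blocks.

    haps      : (2n x m) array of phased 0/1 alleles; rows 2i and 2i+1
                belong to individual i (assumed layout).
    positions : m base-pair positions on one chromosome.
    block_bp  : block width in base pairs.
    """
    haps = np.asarray(haps)
    positions = np.asarray(positions)
    n_ind = haps.shape[0] // 2
    block_ids = positions // block_bp             # SNPs sharing an id form a block
    columns = []
    for b in np.unique(block_ids):
        block = haps[:, block_ids == b]
        # Each distinct row pattern within the block is one haplotype allele.
        alleles, codes = np.unique(block, axis=0, return_inverse=True)
        codes = codes.reshape(-1)
        counts = np.zeros((n_ind, alleles.shape[0]))
        for h, code in enumerate(codes):
            counts[h // 2, code] += 1             # add one copy to the carrier
        columns.append(counts)
    return np.hstack(columns)


# Two individuals, three SNPs; the first two SNPs fall in one 100 bp block.
haps_toy = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [0, 0, 1],
                     [0, 0, 1]])
X = haplotype_design(haps_toy, positions=[10, 20, 150], block_bp=100)
print(X)
```

Each row of the resulting matrix sums to twice the number of blocks, since every individual carries exactly two haplotype alleles per block. Gene-based blocks could be handled the same way by replacing `block_ids` with gene membership.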

