Improvement of genomic prediction in advanced wheat breeding lines by including additive-by-additive epistasis

Theoretical and Applied Genetics ◽

10.1007/s00122-021-04009-4 ◽

2022 ◽

Author(s):

Miguel Angel Raffo ◽

Pernille Sarup ◽

Xiangyu Guo ◽

Huiming Liu ◽

Jeppe Reitan Andersen ◽

...

Keyword(s):

Grain Yield ◽

Genomic Prediction ◽

Cross Validation ◽

Predictive Ability ◽

Wheat Breeding ◽

Breeding Cycle ◽

Wheat Grain ◽

Additive Variance ◽

Genetic Merit ◽

Genetic Variances

Abstract Key message Including additive and additive-by-additive epistasis in a NOIA parametrization did not yield orthogonal partitioning of genetic variances; nevertheless, it improved predictive ability in a leave-one-out cross-validation for wheat grain yield. Abstract Additive-by-additive epistasis is the principal non-additive genetic effect in inbred wheat lines and is potentially useful for developing cultivars based on total genetic merit; nevertheless, its practical benefits have been highly debated. In this article, we aimed to (i) evaluate the performance of models including additive and additive-by-additive epistatic effects for variance components (VC) estimation of grain yield in a wheat-breeding population, and (ii) investigate whether including additive-by-additive epistasis in genomic prediction enhances wheat grain yield predictive ability (PA). In total, 2060 sixth-generation (F6) lines from the Nordic Seed A/S breeding company were phenotyped in 21 year-location combinations in Denmark, and genotyped using a 15K Illumina BeadChip. Three models were used to estimate VC and heritability at plot level: (i) “I-model” (baseline), (ii) “I + GA-model”, extending the I-model with an additive genomic effect, and (iii) “I + GA + GAA-model”, extending the I + GA-model with an additive-by-additive genomic effect. The I + GA-model and I + GA + GAA-model were based on the Natural and Orthogonal Interactions Approach (NOIA) parametrization. The I + GA + GAA-model failed to achieve an orthogonal partition of genetic variances, as revealed by a change in the additive variance estimated with the I + GA-model when epistasis was included in the I + GA + GAA-model. The PA was studied using leave-one-line-out and leave-one-breeding-cycle-out cross-validations. The I + GA + GAA-model increased PA significantly (16.5%) compared to the I + GA-model in leave-one-line-out cross-validation. However, the improvement due to including epistasis was not observed in leave-one-breeding-cycle-out cross-validation. We conclude that epistatic models can be useful to enhance predictions of total genetic merit. However, even though we used the NOIA parametrization, the variance partition into orthogonal genetic effects was not possible.
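The two validation schemes named in this abstract differ only in how lines are grouped into test sets. The sketch below is a minimal illustration, not the authors' pipeline: it assumes a precomputed relationship matrix `K`, a fixed shrinkage value `lam` in place of REML-estimated variance components, and predictive ability taken as the Pearson correlation between predictions and observed phenotypes. The function names are hypothetical.

```python
import numpy as np

def gblup_predict(K, y, train, test, lam=1.0):
    """Predict test lines from training lines with a kernel ridge solve.

    lam is an assumed fixed ratio of residual to genetic variance;
    the paper estimates variance components instead.
    """
    K = np.asarray(K, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y[train].mean()
    A = K[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(A, y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

def leave_one_group_out(groups):
    """Yield (train, test) index arrays, holding out one group at a time."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield np.flatnonzero(groups != g), np.flatnonzero(groups == g)

def predictive_ability(K, y, groups, lam=1.0):
    """Correlation between held-out predictions and observed values."""
    y = np.asarray(y, dtype=float)
    pred = np.empty_like(y)
    for train, test in leave_one_group_out(groups):
        pred[test] = gblup_predict(K, y, train, test, lam)
    return np.corrcoef(pred, y)[0, 1]
```

Passing `groups=np.arange(n)` gives leave-one-line-out, while passing a vector of breeding-cycle labels gives leave-one-breeding-cycle-out. In the second scheme no line from the held-out cycle remains in the training set, so it is the harder scenario of the two.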


Improvement of Genomic Prediction in Advanced Wheat Breeding Lines by Including Additive × Additive Epistasis

10.21203/rs.3.rs-424490/v1 ◽

2021 ◽

Author(s):

Miguel Angel Raffo ◽

Pernille Sarup ◽

Xiangyu Guo ◽

Huiming Liu ◽

Jeppe Reitan Andersen ◽

...

Keyword(s):

Genomic Prediction ◽

Cross Validation ◽

Best Linear Unbiased Prediction ◽

Wheat Breeding ◽

Breeding Cycle ◽

Linear Unbiased Prediction ◽

Breeding Lines ◽

Genetic Merit ◽

Best Linear Unbiased ◽

Unbiased Prediction

Abstract Epistasis is the principal non-additive genetic effect in inbred wheat lines and can be used to develop cultivars based on total genetic merit. Correct models for variance components (VCs) estimation are needed to disentangle the genetic architecture of complex traits in wheat. We aimed to i) evaluate the performance of extended genomic best linear unbiased prediction (EG-BLUP) and the natural and orthogonal interactions approach (NOIA) for VCs estimation in a commercial wheat-breeding population, and ii) investigate whether including epistasis in genomic prediction enhances predictive ability (PA) for wheat breeding lines. In total, 2,060 sixth-generation (F6) lines from the Nordic Seed A/S breeding company were phenotyped for grain yield over 21 year × location combinations in Denmark, and genotyped using a 15K Illumina BeadChip. Four models were used to estimate VCs and heritability at plot level: i) Baseline, ii) Genomic best linear unbiased prediction (G-BLUP), iii) EG-BLUP, and iv) NOIA. Narrow- and broad-sense heritabilities estimated with G-BLUP were 0.15 and 0.31, respectively. EG-BLUP and NOIA failed to achieve an orthogonal partition of genetic variances. Even though NOIA removed the Hardy-Weinberg equilibrium assumption, both models yielded very similar estimates, indicating that linkage disequilibrium causes the lack of orthogonality. The PA was studied using leave-one-line-out and leave-one-breeding-cycle-out cross-validations. Both EG-BLUP and NOIA increased PA significantly (16.5%) compared to G-BLUP in leave-one-line-out cross-validation. However, the improvement from including epistasis was not observed in the leave-one-breeding-cycle-out cross-validation. We conclude that although the variance partition into orthogonal genetic effects was not possible, epistatic models can be useful to enhance predictions of total genetic merit.
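As a rough illustration of how G-BLUP and EG-BLUP differ in their covariance structures, the sketch below builds an additive genomic relationship matrix and derives an additive-by-additive matrix from it as a Hadamard (element-wise) product. It assumes 0/1/2 marker codes, a VanRaden-style scaling that relies on Hardy-Weinberg proportions, and a mean-diagonal normalization of the epistatic matrix. These are common conventions, not details taken from the abstract, and the NOIA parametrization is not reproduced here.

```python
import numpy as np

def additive_grm(markers):
    """Additive relationship matrix from a lines x markers array of 0/1/2 codes.

    Assumes VanRaden-style centering and scaling by 2 * sum(p * (1 - p)).
    """
    M = np.asarray(markers, dtype=float)
    p = M.mean(axis=0) / 2.0           # allele frequency per marker
    Z = M - 2.0 * p                    # centered genotype codes
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def epistatic_grm(G):
    """Additive-by-additive relationship as the Hadamard product G # G.

    Rescaled so the mean diagonal equals 1 (an assumed normalization).
    """
    H = G * G                          # element-wise product, not matrix product
    return H / np.mean(np.diag(H))
```

Because the epistatic matrix is a deterministic function of the additive one, the two covariance structures are not independent, which is in line with the abstract's observation that the additive and epistatic variance estimates could not be separated orthogonally.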


Genomic Selection for End-Use Quality and Processing Traits in Soft White Winter Wheat Breeding Program with Machine and Deep Learning Models

10.1101/2021.05.24.445513 ◽

2021 ◽

Author(s):

Karansher S Sandhu ◽

Meriem Aoun ◽

Craig Morris ◽

Arron H Carter

Keyword(s):

Deep Learning ◽

Grain Yield ◽

Cross Validation ◽

Wheat Breeding ◽

Breeding Program ◽

Quality Traits ◽

Learning Models ◽

Breeding Programs ◽

Wheat Breeding Program ◽

End Use

Breeding for grain yield, biotic and abiotic stress resistance, and end-use quality are important goals of wheat breeding programs. Screening for end-use quality traits is usually secondary to grain yield due to high labor needs, cost of testing, and large seed requirements for phenotyping. Hence, testing is delayed until later stages in the breeding program. Delayed phenotyping results in advancement of inferior end-use quality lines into the program. Genomic selection provides an alternative to predict performance using genome-wide markers. Due to large datasets in breeding programs, we explored the potential of machine and deep learning models to predict fourteen end-use quality traits in a winter wheat breeding program. The population used consisted of 666 wheat genotypes screened for five years (2015-19) at two locations (Pullman and Lind, WA, USA). Nine different models, including two machine learning (random forest and support vector machine) and two deep learning models (convolutional neural network and multilayer perceptron), were explored for cross-validation, forward, and across-location predictions. The prediction accuracies for different traits varied from 0.45-0.81, 0.29-0.55, and 0.27-0.50 under cross-validation, forward, and across-location predictions, respectively. In general, forward prediction accuracies kept increasing over time as the training data grew, and this increase was more evident for machine and deep learning models. Deep learning models outperformed the traditional ridge regression best linear unbiased prediction (RRBLUP) and Bayesian models under all prediction scenarios. The high accuracy observed for end-use quality traits in this study supports predicting them in early generations, leading to the advancement of superior genotypes to more extensive grain yield trialing. Furthermore, the superior performance of machine and deep learning models strengthens the case for including them in large-scale breeding programs for predicting complex traits.


CV-α: designing validations sets to increase the precision and enable multiple comparison tests in genomic prediction

10.1101/2020.11.11.376343 ◽

2020 ◽

Author(s):

Rafael Massahiro Yassue ◽

José Felipe Gonzaga Sabadin ◽

Giovanni Galli ◽

Filipe Couto Alves ◽

Roberto Fritsche-Neto

Keyword(s):

Genomic Prediction ◽

Cross Validation ◽

Prediction Models ◽

Mean Squared Error ◽

Predictive Ability ◽

Proof Of Concept ◽

Squared Error ◽

High Effect ◽

The Mean ◽

Fold Cross Validation

Abstract Usually, the comparison among genomic prediction models is based on validation schemes such as Repeated Random Subsampling (RRS) or K-fold cross-validation. Nevertheless, the design of training and validation sets strongly affects how, and how subjectively, models are compared. The procedures cited above overlap across replicates, which might cause overestimation and a lack of independence among residuals due to resampling, and hence less accurate results. Furthermore, post hoc tests, such as ANOVA, are not recommended because the assumption of residual independence is not fulfilled. Thus, we propose a new way to sample observations to build training and validation sets based on a cross-validation alpha-based design (CV-α). The CV-α was meant to create several scenarios of validation (replicates × folds), regardless of the number of treatments. Using CV-α, the number of genotypes in the same fold across replicates was much lower than with K-fold, indicating higher residual independence. Therefore, based on the CV-α results, as proof of concept, via ANOVA, we could compare the proposed methodology to RRS and K-fold, applying four genomic prediction models with a simulated and a real dataset. Concerning the predictive ability and bias, all validation methods showed similar performance. However, regarding the mean squared error and coefficient of variation, the CV-α method presented the best performance under the evaluated scenarios. Moreover, as it has no additional cost or complexity, it is more reliable and allows the use of non-subjective methods to compare models and factors. Therefore, CV-α can be considered a more precise validation methodology for model selection.
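The overlap that this abstract attributes to repeated K-fold can be made concrete by counting how often two genotypes fall in the same fold across replicates. The sketch below does only that counting for ordinary repeated K-fold. It does not implement the alpha-based design itself, and the function names and seed handling are illustrative assumptions.

```python
import numpy as np

def repeated_kfold_labels(n, k, reps, seed=0):
    """Fold label of every genotype in each replicate of a balanced K-fold."""
    rng = np.random.default_rng(seed)
    base = np.arange(n) % k            # balanced fold labels 0..k-1
    return np.stack([rng.permutation(base) for _ in range(reps)])

def co_occurrence(labels):
    """Count, per pair of genotypes, the replicates in which they share a fold."""
    labels = np.asarray(labels)
    same = labels[:, :, None] == labels[:, None, :]
    counts = same.sum(axis=0)
    np.fill_diagonal(counts, 0)        # a genotype always shares a fold with itself
    return counts
```

Large entries in the returned matrix mark pairs of genotypes that are repeatedly validated together. A design that keeps these counts low across replicates is the property the abstract reports for CV-α relative to K-fold.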


Multitrait, Random Regression, or Simple Repeatability Model in High‐Throughput Phenotyping Data Improve Genomic Prediction for Wheat Grain Yield

The Plant Genome ◽

10.3835/plantgenome2016.11.0111 ◽

2017 ◽

Vol 10 (2) ◽

Cited By ~ 55

Author(s):

Jin Sun ◽

Jessica E. Rutkoski ◽

Jesse A. Poland ◽

José Crossa ◽

Jean‐Luc Jannink ◽

...

Keyword(s):

Grain Yield ◽

High Throughput ◽

Genomic Prediction ◽

Random Regression ◽

Wheat Grain ◽

High Throughput Phenotyping


Multi-Trait Multi-Environment Genomic Prediction of Agronomic Traits in Advanced Breeding Lines of Winter Wheat

Frontiers in Plant Science ◽

10.3389/fpls.2021.709545 ◽

2021 ◽

Vol 12 ◽

Author(s):

Harsimardeep S. Gill ◽

Jyotirmoy Halder ◽

Jinfeng Zhang ◽

Navreet K. Brar ◽

Teerath S. Rai ◽

...

Keyword(s):

Winter Wheat ◽

Grain Yield ◽

Genomic Prediction ◽

Complex Traits ◽

Agronomic Traits ◽

Wheat Breeding ◽

Successful Implementation ◽

Limited Information ◽

Multiple Traits ◽

Breeding Lines

Genomic prediction is a promising approach for accelerating the genetic gain of complex traits in wheat breeding. However, increasing the prediction accuracy (PA) of genomic prediction (GP) models remains a challenge in the successful implementation of this approach. Multivariate models have shown promise when evaluated using diverse panels of unrelated accessions; however, limited information is available on their performance in advanced breeding trials. Here, we used multivariate GP models to predict multiple agronomic traits using 314 advanced and elite breeding lines of winter wheat evaluated in 10 site-year environments. We evaluated a multi-trait (MT) model with two cross-validation schemes representing different breeding scenarios (CV1, prediction of completely unphenotyped lines; and CV2, prediction of partially phenotyped lines for correlated traits). Moreover, extensive data from multi-environment trials (METs) were used to cross-validate a Bayesian multi-trait multi-environment (MTME) model that integrates the analysis of multiple traits, such as G × E interaction. The MT-CV2 model outperformed all the other models for predicting grain yield, with a significant improvement in PA over the single-trait (ST-CV1) model. The MTME model performed better for all traits, with average improvement over the ST-CV1 reaching up to 19, 71, 17, 48, and 51% for grain yield, grain protein content, test weight, plant height, and days to heading, respectively. Overall, the empirical analyses elucidate the potential of both the MT-CV2 and MTME models when advanced breeding lines are used as a training population to predict related preliminary breeding lines. Further, we evaluated the practical application of the MTME model in the breeding program to reduce phenotyping cost using a sparse testing design. This showed that complementing METs with GP can substantially enhance resource efficiency. Our results demonstrate that multivariate GS models have great potential for implementing GS in breeding programs.
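The difference between the CV1 and CV2 schemes described in this abstract comes down to which phenotype records are hidden from the model. A minimal sketch, assuming a lines × traits phenotype matrix and a hypothetical helper name, could look like this:

```python
import numpy as np

def mask_phenotypes(Y, test_lines, target_trait, scheme):
    """Return a copy of Y with the validation records set to NaN.

    CV1: test lines are completely unphenotyped (all traits hidden).
    CV2: test lines keep their correlated traits; only the target is hidden.
    """
    Y = np.array(Y, dtype=float)       # copy so the caller's data is untouched
    if scheme == "CV1":
        Y[test_lines, :] = np.nan
    elif scheme == "CV2":
        Y[test_lines, target_trait] = np.nan
    else:
        raise ValueError("scheme must be 'CV1' or 'CV2'")
    return Y
```

Under CV2 a multi-trait model can borrow information from the traits that remain observed on the test lines, which is one plausible reason a multi-trait model would do better under CV2 than a single-trait model under CV1.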


Genomic prediction of arsenic tolerance and grain yield in rice. Contribution of trait-specific markers and multi environment models

10.1101/2020.09.28.316356 ◽

2020 ◽

Author(s):

Nourollah Ahmadi ◽

Tuong-Vi Cao ◽

Julien Frouin ◽

Gareth J. Norton ◽

Adam H. Price

Keyword(s):

Grain Yield ◽

Genomic Prediction ◽

Complex Traits ◽

Computing Time ◽

Predictive Ability ◽

Arsenic Content ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Base Line ◽

Rice Varieties

Abstract Many rice-growing areas are affected by high concentrations of arsenic (As). Rice varieties that prevent As uptake and/or accumulation can mitigate As threats to human health. Genomic selection is known to facilitate rapid selection of superior genotypes for complex traits. We explored the predictive ability (PA) of genomic prediction with single-environment models, accounting or not for trait-specific markers, multi-environment models, and multi-trait and multi-environment models, using the genotypic (1600 K SNP) and phenotypic (grain arsenic content, grain yield and days to flowering, observed under two irrigation systems over two years) data of the Bengal and Assam Aus Panel (BAAP). Under the base-line single-environment model, PA of up to 0.707 and 0.654 was obtained for grain yield and grain As, respectively; the three prediction methods considered (BL, GBLUP and RKHS) performed similarly; and marker selection based on linkage disequilibrium allowed the number of SNPs to be reduced to 17 K without a negative effect on the PA of genomic predictions. Single-environment models giving distinct weight to trait-specific markers in the genomic relationship matrix outperformed the base-line models by up to 32%. Multi-environment models, accounting for G × E interactions, and multi-trait and multi-environment models outperformed the base-line models by up to 47% and 61%, respectively. Among the multi-trait and multi-environment models, the Bayesian multi-output regressor stacking function obtained the highest PA (0.831 for grain As) with much higher efficiency for computing time. These findings pave the way for breeding for As-tolerance in the progenies of biparental crosses involving members of the BAAP. They also apply to breeding for other complex traits evaluated under multiple environments.


Genomic prediction for malting quality traits in practical barley breeding programs

10.1101/2020.07.30.228007 ◽

2020 ◽

Cited By ~ 1

Author(s):

Pernille Sarup ◽

Vahid Edriss ◽

Nanna Hellum Kristensen ◽

Jens Due Jensen ◽

Jihad Orabi ◽

...

Keyword(s):

Genomic Prediction ◽

Prediction Accuracy ◽

Cross Validation ◽

Spring Barley ◽

Malting Quality ◽

Breeding Cycle ◽

Quality Traits ◽

Training Population ◽

Barley Breeding ◽

Breeding Cycles

Abstract Genomic prediction can be advantageous in barley breeding for traits such as yield and malting quality to increase selection accuracy and minimize expensive phenotyping. In this paper, we investigate the possibilities of genomic selection for malting quality traits using a limited training population. The size of the training population is an important factor in determining the prediction accuracy of a trait. We investigated the potential for genomic prediction of malting quality within breeding cycles with leave-one-out (LOO) cross-validation, and across breeding cycles with leave-set-out (LSO) cross-validation. In addition, we investigated the effect of training population size on prediction accuracy by random two-, four-, and ten-fold cross-validation. The material used in this study was a population of 1329 spring barley lines from four breeding cycles. We found medium to high narrow-sense heritabilities of the malting traits (0.31 to 0.65). Accuracies of predicting breeding values from LOO tests ranged from 0.6 to 0.9, making it worth the effort to use genomic prediction within breeding cycles. Accuracies from LSO tests ranged from 0.39 to 0.70, showing that genomic prediction across breeding cycles was possible as well. Accuracy of prediction increased when the size of the training population increased. Therefore, prediction accuracy might be increased both within and across breeding cycles by increasing the size of the training population.
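Varying the number of folds is an indirect way of varying training population size, because each fold count fixes the share of lines left for training. The short sketch below works this out for the 1329 lines mentioned in the abstract. The even split via `np.array_split` is an assumption about how folds would be balanced, not a detail from the study.

```python
import numpy as np

def training_sizes(n_lines, k):
    """Training-set size for each of the k folds of an evenly split K-fold."""
    fold_sizes = [len(f) for f in np.array_split(np.arange(n_lines), k)]
    return [n_lines - s for s in fold_sizes]

for k in (2, 4, 10):
    sizes = training_sizes(1329, k)
    # smallest and largest training set seen across the k folds
    print(k, min(sizes), max(sizes))
```

Under that assumption, two-, four-, and ten-fold cross-validation train on roughly 50%, 75%, and 90% of the lines (about 664, 997, and 1196 lines), so comparing accuracies across the three settings traces how accuracy responds to training population size.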


The effect of tree (and cambium) age on genomic prediction for solid wood properties in Norway spruce

10.21203/rs.2.22694/v1 ◽

2020 ◽

Author(s):

Linghua Zhou ◽

Zhiqiang Chen ◽

Lars Olsson ◽

Thomas Grahn ◽

Bo Karlsson ◽

...

Keyword(s):

Norway Spruce ◽

Genomic Prediction ◽

Microfibril Angle ◽

Predictive Ability ◽

Tree Breeding ◽

Wood Properties ◽

Solid Wood ◽

Breeding Cycle ◽

Old Trees ◽

The Impact

Abstract Genomic selection (GS), or genomic prediction, is considered a promising approach to accelerate tree breeding and increase genetic gain by shortening the breeding cycle. We investigated the predictive ability (PA) of GS based on 484 progeny trees from 62 half-sib families in Norway spruce (Picea abies (L.) Karst.) for wood density, modulus of elasticity (MOE) and microfibril angle (MFA) measured with SilviScan, as well as for measurements on standing trees by Pilodyn and Hitman instruments. GS predictive abilities were comparable with those of pedigree-based selection. The highest PAs were reached with at least 80-90% of the dataset used as the training set. Use of different statistical methods had no significant impact on the estimated PAs. We also compared the abilities to predict density, MFA and MOE of 19-year-old trees using models trained on data from coring at different ages and to different depths into the stem. The comparison indicated that close to the maximal PAs can be reached at age 10-12 by drilling only halfway (ringwise) towards the pith, thereby reducing the impact on the tree.


Impact of early genomic prediction for recurrent selection in an upland rice synthetic population

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab320 ◽

2021 ◽

Author(s):

Cédric Baertschi ◽

Tuong-Vi Cao ◽

Jérôme Bartholomé ◽

Yolima Ospina ◽

Constanza Quintero ◽

...

Keyword(s):

Grain Yield ◽

Plant Height ◽

Genomic Prediction ◽

Recurrent Selection ◽

Zinc Concentration ◽

Upland Rice ◽

Predictive Ability ◽

Single Site ◽

Progeny Testing ◽

Grain Zinc

Abstract Population breeding through recurrent selection is based on the repetition of evaluation and recombination among best-selected individuals. In this type of breeding strategy, early evaluation of selection candidates combined with genomic prediction could substantially shorten the breeding cycle length, thus increasing the rate of genetic gain. The objective of the present study was to optimize early genomic prediction in an upland rice (Oryza sativa L.) synthetic population improved through recurrent selection via shuttle breeding in two sites. To this end, we used genomic prediction on 334 S0 genotypes evaluated with early generation progeny testing (S0:2 and S0:3) across two sites. Four traits were measured (plant height, days to flowering, grain yield and grain zinc concentration) and the predictive ability was assessed for the target site. For days to flowering and plant height, which correlate well among sites (0.51–0.62), an increase of up to 0.4 in predictive ability was observed when the model was trained using the two sites. For grain zinc concentration, adding the phenotype of the predicted lines in the non-target site to the model improved the predictive ability (0.51 with two-site and 0.31 with single-site model), while for grain yield the gain was less (0.42 with two-site and 0.35 with single-site calibration). Through these results, we found a good opportunity to optimize the genomic recurrent selection scheme and maximize the use of resources by performing early progeny testing in two sites for traits with best expression and/or relevance in each specific environment.


The performance of phenomic selection depends on the genetic architecture of the target trait

Theoretical and Applied Genetics ◽

10.1007/s00122-021-03997-7 ◽

2021 ◽

Author(s):

Xintian Zhu ◽

Hans Peter Maurer ◽

Mario Jenz ◽

Volker Hahn ◽

Arno Ruckelshausen ◽

...

Keyword(s):

Molecular Markers ◽

Grain Yield ◽

Spectral Data ◽

Genomic Prediction ◽

Complex Traits ◽

Genetic Architecture ◽

Genetic Relatedness ◽

Yellow Rust ◽

Predictive Ability ◽

NIRS Data

Abstract Key message The phenomic predictive ability depends on the genetic architecture of the target trait, being high for complex traits and low for traits with major QTL. Abstract Genomic selection is a powerful tool to assist breeding of complex traits, but a limitation is the cost of genotyping. Recently, phenomic selection has been suggested, which uses spectral data instead of molecular markers as predictors. It was shown to be competitive with genomic prediction, as it achieved predictive abilities as high as or even higher than its genomic counterpart. The objective of this study was to evaluate the performance of phenomic prediction for triticale and the dependency of the predictive ability on the genetic architecture of the target trait. We found that for traits with a complex genetic architecture, like grain yield, phenomic prediction with NIRS data as predictors achieved high predictive abilities and performed better than genomic prediction. By contrast, for mono- or oligogenic traits, for example, yellow rust, marker-based approaches achieved high predictive abilities, while those of phenomic prediction were very low. Compared with molecular markers, the predictive ability obtained using NIRS data was more robust to varying degrees of genetic relatedness between the training and prediction set. Moreover, for grain yield, smaller training sets were required to achieve a similar predictive ability for phenomic prediction than for genomic prediction. In addition, our results illustrate the potential of using field-based spectral data for phenomic prediction. Overall, our results confirmed phenomic prediction as an efficient approach to improve the selection gain for complex traits in plant breeding.
