Predictive ability of genomic selection models in a multi-population perennial ryegrass training set using genotyping-by-sequencing

Marty J. Faville; Siva Ganesh; Mingshu Cao; M. Z. Zulfi Jahufer; Timothy P. Bilton; H. Sydney Easton; Douglas L. Ryan; Jason A. K. Trethewey; M. Philip Rolston; Andrew G. Griffiths; Roger Moraga; Casey Flay; Jana Schmidt; Rachel Tan; Brent A. Barrett

doi:10.1007/s00122-017-3030-1

Genomic Predictive Ability for Foliar Nutritive Traits in Perennial Ryegrass

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400880 ◽

2019 ◽

Vol 10 (2) ◽

pp. 695-708 ◽

Cited By ~ 6

Author(s):

Sai Krishna Arojju ◽

Mingshu Cao ◽

M. Z. Zulfi Jahufer ◽

Brent A. Barrett ◽

Marty J. Faville

Keyword(s):

Genomic Selection ◽

Perennial Ryegrass ◽

Nutritive Value ◽

Prediction Models ◽

Predictive Ability ◽

Genotyping By Sequencing ◽

Water Soluble ◽

Soluble Carbohydrate ◽

Training Set ◽

Sib Families

Forage nutritive value impacts animal nutrition, which underpins livestock productivity, reproduction and health. Genetic improvement for nutritive traits in perennial ryegrass has been limited, as they are typically expensive and time-consuming to measure through conventional methods. Genomic selection is appropriate for such complex and expensive traits, enabling cost-effective prediction of breeding values using genome-wide markers. The aims of the present study were to assess the potential of genomic selection for a range of nutritive traits in a multi-population training set, and to quantify contributions of family, location and family-by-location variance components to trait variation and heritability for nutritive traits. The training set consisted of a total of 517 half-sibling (half-sib) families, from five advanced breeding populations, evaluated in two distinct New Zealand grazing environments. Autumn-harvested samples were analyzed for 18 nutritive traits and maternal parents of the half-sib families were genotyped using genotyping-by-sequencing. Significant (P < 0.05) family variance was detected for all nutritive traits and genomic heritability (h2g) was moderate to high (0.20 to 0.74). Family-by-location interactions were significant and particularly large for water soluble carbohydrate (WSC), crude fat, phosphorus (P) and crude protein. GBLUP, KGD-GBLUP and BayesCπ genomic prediction models displayed similar predictive ability, estimated by 10-fold cross validation, for all nutritive traits with values ranging from r = 0.16 to 0.45 using phenotypes from across two locations. High predictive ability was observed for the mineral traits sulfur (0.44), sodium (0.45) and magnesium (0.45) and the lowest values were observed for P (0.16), digestibility (0.22) and high molecular weight WSC (0.23). Predictive ability estimates for most nutritive traits were retained when marker number was reduced from one million to as few as 50,000. The moderate to high predictive abilities observed suggest that implementation of genomic selection is feasible for most of the nutritive traits examined.
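The 10-fold cross-validation estimate of predictive ability mentioned in this abstract can be sketched as follows. This is a minimal illustration on simulated genotypes, not the study's pipeline. It assumes a VanRaden-style genomic relationship matrix, a fixed assumed heritability (`h2`) to set the shrinkage, and a pooled Pearson correlation between held-out predictions and observed phenotypes. All function names are hypothetical, and the KGD relatedness estimates and BayesCπ model are not reproduced.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from a 0/1/2 marker matrix (individuals x markers)."""
    p = M.mean(axis=0) / 2.0              # allele frequency per marker
    Z = M - 2.0 * p                       # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, h2):
    """GBLUP prediction of test individuals from training phenotypes.
    h2 is an ASSUMED heritability; shrinkage is lam = (1 - h2) / h2."""
    lam = (1.0 - h2) / h2
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_predictive_ability(G, y, k=10, h2=0.5, seed=0):
    """Pooled correlation between k-fold held-out predictions and phenotypes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred[test] = gblup_predict(G, y, train, test, h2)
    return float(np.corrcoef(pred, y)[0, 1])

if __name__ == "__main__":
    # Simulated data only: 300 individuals, 100 markers, true h2 of about 0.5.
    rng = np.random.default_rng(42)
    freqs = rng.uniform(0.1, 0.9, size=100)
    M = rng.binomial(2, freqs, size=(300, 100)).astype(float)
    g = M @ rng.normal(size=100)
    g = (g - g.mean()) / g.std()
    y = g + rng.normal(size=300)
    print("predictive ability r =", round(cv_predictive_ability(vanraden_grm(M), y), 3))
```

Predictive ability here is the correlation with observed phenotypes, so it is bounded above by the square root of the heritability even for a perfect predictor of genetic value.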

Genotyping-by-sequencing for genomic selection of perennial ryegrass (Lolium perenne)

Acta Horticulturae ◽

10.17660/actahortic.2018.1203.2 ◽

2018 ◽

pp. 9-16

Author(s):

J. Jacobs ◽

M. Faville ◽

A. Griffiths ◽

M. Cao ◽

R. Tan ◽

...

Keyword(s):

Genomic Selection ◽

Lolium Perenne ◽

Perennial Ryegrass ◽

Genotyping By Sequencing ◽

Selection Of

Combining genetic resources and elite material populations to improve the accuracy of genomic prediction in apple

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab420 ◽

2021 ◽

Author(s):

Xabi Cazenave ◽

Bernard Petit ◽

Marc Lateur ◽

Hilde Nybom ◽

Jiri Sedlak ◽

...

Keyword(s):

Genetic Resources ◽

Genomic Selection ◽

Predictive Ability ◽

Practical Implementation ◽

Specific Marker ◽

Training Set ◽

High Genetic Diversity ◽

Breeding Programs ◽

Breeding Cycles ◽

Two Populations

Genomic selection is an attractive strategy for apple breeding that could reduce the length of breeding cycles. A possible limitation to the practical implementation of this approach lies in the creation of a training set large and diverse enough to ensure accurate predictions. In this study, we investigated the potential of combining two available populations, i.e. genetic resources and elite material, in order to obtain a large training set with a high genetic diversity. We compared the predictive ability of genomic predictions within-population, across-population or when combining both populations, and tested a model accounting for population-specific marker effects in this last case. The obtained predictive abilities were moderate to high according to the studied trait and small increases in predictive ability could be obtained for some traits when the two populations were combined into a unique training set. We also investigated the potential of such a training set to predict hybrids resulting from crosses between the two populations, with a focus on the method to design the training set and the best proportion of each population to optimize predictions. The measured predictive abilities were very similar for all the proportions, except for the extreme cases where only one of the two populations was used in the training set, in which case predictive abilities could be lower than when using both populations. Using an optimization algorithm to choose the genotypes in the training set also led to higher predictive abilities than when the genotypes were chosen at random. Our results provide guidelines to initiate breeding programs that use genomic selection when the implementation of the training set is a limitation.

Combining genetic resources and elite material populations to improve the accuracy of genomic prediction in apple

10.1101/2021.08.27.457920 ◽

2021 ◽

Author(s):

Xabi Cazenave ◽

Bernard Petit ◽

Francois Laurens ◽

Charles-Eric Durel ◽

Helene Muranty

Keyword(s):

Genetic Resources ◽

Genomic Selection ◽

Predictive Ability ◽

Practical Implementation ◽

Specific Marker ◽

Training Set ◽

High Genetic Diversity ◽

Breeding Programs ◽

Breeding Cycles ◽

Two Populations

Genomic selection is an attractive strategy for apple breeding that could reduce the length of breeding cycles. A possible limitation to the practical implementation of this approach lies in the creation of a training set large and diverse enough to ensure accurate predictions. In this study, we investigated the potential of combining two available populations, i.e. genetic resources and elite material, in order to obtain a large training set with a high genetic diversity. We compared the predictive ability of genomic predictions within-population, across-population or when combining both populations, and tested a model accounting for population-specific marker effects in this last case. The obtained predictive abilities were moderate to high according to the studied trait and were always highest when the two populations were combined into a unique training set. We also investigated the potential of such a training set to predict hybrids resulting from crosses between the two populations, with a focus on the method to design the training set and the best proportion of each population to optimize predictions. The measured predictive abilities were very similar for all the proportions, except for the extreme cases where only one of the two populations was used in the training set, in which case predictive abilities could be lower than when using both populations. Using an optimization algorithm to choose the genotypes in the training set also led to higher predictive abilities than when the genotypes were chosen at random. Our results provide guidelines to initiate breeding programs that use genomic selection when the implementation of the training set is a limitation.

Genomic predictive ability for foliar nutritive traits in perennial ryegrass

10.1101/727958 ◽

2019 ◽

Author(s):

Sai Krishna Arojju ◽

Mingshu Cao ◽

M. Z. Zulfi Jahufer ◽

Brent A Barrett ◽

Marty J Faville

Keyword(s):

Genomic Selection ◽

Genomic Prediction ◽

Nutritive Value ◽

Prediction Models ◽

Genotypic Variation ◽

Genetic Correlations ◽

Predictive Ability ◽

Water Soluble ◽

Training Set ◽

Sib Families

Forage nutritive value impacts animal nutrition, which underpins livestock productivity, reproduction and health. Genetic improvement for nutritive traits has been limited, as they are typically expensive and time-consuming to measure through conventional methods. Genomic selection is appropriate for such complex and expensive traits, enabling cost-effective prediction of breeding values using genome-wide markers. The aims of the present study were to assess the potential of genomic selection for a range of nutritive traits in a multi-population training set, and to quantify contributions of genotypic, environmental and genotype-by-environment (G × E) variance components to trait variation and heritability for nutritive traits. The training set consisted of a total of 517 half-sibling (half-sib) families, from five advanced breeding populations, evaluated in two distinct New Zealand grazing environments. Autumn-harvested samples were analyzed for 18 nutritive traits and maternal parents of the half-sib families were genotyped using genotyping-by-sequencing. Significant (P < 0.05) genotypic variation was detected for all nutritive traits and genomic heritability (h2g) was moderate to high (0.20 to 0.74). G × E interactions were significant and particularly large for water soluble carbohydrate (WSC), crude fat, phosphorus (P) and crude protein. GBLUP, KGD-GBLUP and BayesCπ genomic prediction models displayed similar predictive ability, estimated by 10-fold cross validation, for all nutritive traits with values ranging from r = 0.16 to 0.45 using phenotypes from across two environments. High predictive ability was observed for the mineral traits sulphur (0.44), sodium (0.45) and magnesium (0.45) and the lowest values were observed for P (0.16), digestibility (0.22) and high molecular weight WSC (0.23). Predictive ability estimates for most nutritive traits were retained when marker number was reduced from 1 million to as few as 50,000. The moderate to high predictive abilities observed suggest that implementation of genomic selection is feasible for most of the nutritive traits examined. For traits with lower predictive ability, multi-trait genomic prediction approaches that exploit the strong genetic correlations observed amongst some nutritive traits may be useful. This appears to be particularly important for WSC, considered one of the primary constituents of nutritive value for forages.

Predictive ability of genomic selection models for breeding value estimation on growth traits of Pacific white shrimp Litopenaeus vannamei

Chinese Journal of Oceanology and Limnology ◽

10.1007/s00343-017-6038-0 ◽

2016 ◽

Vol 35 (5) ◽

pp. 1221-1229 ◽

Cited By ~ 12

Author(s):

Quanchao Wang ◽

Yang Yu ◽

Fuhua Li ◽

Xiaojun Zhang ◽

Jianhai Xiang

Keyword(s):

Genomic Selection ◽

Litopenaeus Vannamei ◽

Growth Traits ◽

Predictive Ability ◽

Pacific White Shrimp ◽

Breeding Value ◽

White Shrimp ◽

Selection Models ◽

Value Estimation

Divergent Genomic Selection for Herbage Accumulation and Days-To-Heading in Perennial Ryegrass

Agronomy ◽

10.3390/agronomy10030340 ◽

2020 ◽

Vol 10 (3) ◽

pp. 340

Author(s):

Marty Faville ◽

Mingshu Cao ◽

Jana Schmidt ◽

Douglas Ryan ◽

Siva Ganesh ◽

...

Keyword(s):

Genomic Selection ◽

Perennial Ryegrass ◽

Genetic Gain ◽

Genomic Prediction ◽

Prediction Models ◽

Selection Response ◽

Training Set ◽

Days To Heading ◽

Selection For ◽

Target Environment

Increasing the rate of genetic gain for dry matter (DM) yield in perennial ryegrass (Lolium perenne L.), which is a key source of nutrition for ruminants in temperate environments, is an important goal for breeders. Genomic selection (GS) is a strategy used to improve genetic gain by using molecular marker information to predict breeding values in selection candidates. An empirical assessment of GS for herbage accumulation (HA; proxy for DM yield) and days-to-heading (DTH) was completed by using existing genomic prediction models to conduct one cycle of divergent GS in four selection populations (Pop I G1 and G3; Pop III G1 and G3), for each trait. G1 populations were the offspring of the training set and G3 populations were two generations further on from that. The HA of the High GEBV selection group (SG) progenies, averaged across all four populations, was 28% higher (p < 0.05) than Low GEBV SGs when assessed in the target environment, while it did not differ significantly in a second environment. Divergence was greater in Pop I (43%–65%) than Pop III (10%–16%) and the selection response was higher in G1 than in G3. Divergent GS for DTH also produced significant (p < 0.05) differences between High and Low GEBV SGs in G1 populations (+6.3 to 9.1 days; 31%–61%) and smaller, non-significant (p > 0.05) responses in G3. This study shows that genomic prediction models, trained from a small, composite reference set, can be used to improve traits with contrasting genetic architectures in perennial ryegrass. The results highlight the importance of target environment selection for training models, as well as the influence of relatedness between the training set and selection populations.

BWGS: a R package for genomic selection and its application to a wheat breeding programme

10.1101/763037 ◽

2019 ◽

Author(s):

Gilles Charmet ◽

Louis Gautier Tran ◽

Jérôme Auzanneau ◽

Renaud Rincent ◽

Sophie Bouchet

Keyword(s):

Missing Data ◽

Genomic Selection ◽

Prediction Models ◽

Predictive Accuracy ◽

Predictive Ability ◽

Breeding Programme ◽

Training Set ◽

Desktop Computer ◽

Marker Selection ◽

Breeding Programmes

We developed an integrated R library called BWGS to enable easy computation of Genomic Estimates of Breeding values (GEBV) for genomic selection. BWGS relies on existing R libraries, all freely available from CRAN servers. The two main functions run 1) replicated random cross validations within a training set of genotyped and phenotyped lines and 2) GEBV prediction for a set of genotyped-only lines. Options are available for 1) missing data imputation, 2) marker and training set selection and 3) genomic prediction with 15 different methods, either parametric or semi-parametric. The usefulness and efficiency of BWGS are illustrated using a population of wheat lines from a real breeding programme. Adjusted yield data from historical trials (highly unbalanced design) were used for testing the options of BWGS. In total, 760 candidate lines with adjusted phenotypes and genotypes for 47,839 robust SNPs were used. With a simple desktop computer, we obtained results comparable to previously published results on wheat genomic selection. As predicted by theory, the factors that most influence predictive ability, for a given trait of moderate heritability, are the size of the training population and a minimum number of markers for capturing every QTL information. Missing data up to 40%, if randomly distributed, do not degrade predictive ability once imputed, and up to 80% randomly distributed missing data are still acceptable once imputed with the Expectation-Maximization method of the rrBLUP package. It is worth noting that selecting the markers most associated with the trait does improve predictive ability, compared with the whole set of markers, but only when marker selection is made on the whole population. When marker selection is made only on the sampled training set, this advantage nearly disappears, since it was clearly due to overfitting. Few differences are observed between the 15 prediction models with this dataset. Although non-parametric methods that are supposed to capture non-additive effects have slightly better predictive accuracy, differences remain small. Finally, the GEBV from the 15 prediction models are all highly correlated with each other. These results are encouraging for an efficient use of genomic selection in applied breeding programmes and BWGS is a simple and powerful toolbox to apply in breeding programmes or training activities.
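The overfitting effect of marker selection described in this abstract can be illustrated with a small simulation. The sketch below is not BWGS code and does not use its API. It is a hypothetical Python example, with invented function names, that compares selecting the top markers on the whole population against selecting them inside each training fold. The phenotype is pure noise, so any apparent predictive ability is an artefact of the selection step.

```python
import numpy as np

def top_markers(X, y, n_keep):
    """Indices of the n_keep markers with the largest |correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.argsort(-np.abs(r))[:n_keep]

def ridge_fit_predict(X_tr, y_tr, X_te, lam=1.0):
    """Ridge regression on centred training data, applied to test rows."""
    mu_x, mu_y = X_tr.mean(axis=0), y_tr.mean()
    A = (X_tr - mu_x).T @ (X_tr - mu_x) + lam * np.eye(X_tr.shape[1])
    beta = np.linalg.solve(A, (X_tr - mu_x).T @ (y_tr - mu_y))
    return mu_y + (X_te - mu_x) @ beta

def cv_ability(X, y, n_keep, k, select_inside, seed=0):
    """k-fold predictive ability with marker selection done either inside each
    training fold (select_inside=True) or once on all data (False, leaky)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    global_idx = top_markers(X, y, n_keep)   # has seen the test phenotypes
    pred = np.empty(len(y))
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        idx = top_markers(X[train], y[train], n_keep) if select_inside else global_idx
        pred[test] = ridge_fit_predict(X[train][:, idx], y[train], X[test][:, idx])
    return float(np.corrcoef(pred, y)[0, 1])

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    X = rng.binomial(2, 0.5, size=(100, 2000)).astype(float)  # simulated genotypes
    y = rng.normal(size=100)                                  # phenotype with no signal
    print("selection on whole population:", cv_ability(X, y, 20, 5, False))
    print("selection inside training fold:", cv_ability(X, y, 20, 5, True))
```

Because the whole-population selection has already seen the held-out phenotypes, its cross-validated correlation is inflated even though no marker carries real signal.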

Development of PCR-based markers and whole-genome selection model for anthracnose resistance in white lupin (Lupinus albus L.)

Journal of Applied Genetics ◽

10.1007/s13353-020-00585-1 ◽

2020 ◽

Vol 61 (4) ◽

pp. 531-545

Author(s):

Sandra Rychel-Bielska ◽

Nelson Nazzicari ◽

Piotr Plewiński ◽

Wojciech Bielski ◽

Paolo Annicchiarico ◽

...

Keyword(s):

Genomic Selection ◽

Lupinus Albus ◽

Predictive Ability ◽

Genotyping By Sequencing ◽

High Specificity ◽

White Lupin ◽

Grain Legume ◽

Anthracnose Resistance ◽

Legume Crop ◽

Molecular Tracking

White lupin (Lupinus albus L.) is a high-protein grain legume crop, grown since ancient Greece and Rome. Despite its long domestication history, its cultivation remains limited, partly because of susceptibility to anthracnose. Only some late-flowering, bitter, low-yielding landraces from Ethiopian mountains displayed resistance to this devastating disease. The resistance is controlled by various genes, thereby complicating the breeding efforts. The objective of this study was to develop tools for molecular tracking of Ethiopian resistance genes based on genotyping-by-sequencing (GBS) data, envisaging linkage mapping and genomic selection approaches. Twenty GBS markers from two major quantitative trait loci (QTLs), antr04_1/antr05_1 and antr04_2/antr05_2, were converted to PCR-based markers using assigned transcriptome sequences. Newly developed markers improved mapping resolution around both anthracnose resistance loci, providing more precise QTL estimation. PCR-based screening of diversified domesticated and primitive germplasm revealed the high specificity of two markers for the antr04_1/antr05_1 locus (TP222136 and TP47110) and one for the antr04_2/antr05_2 locus (TP338761), highlighted by simple matching coefficients of 0.96 and 0.89, respectively. Moreover, a genomic selection approach based on GBS data of a recombinant inbred line mapping population was assessed, providing an average predictive ability of 0.56. These tools can be used for preselection of candidate white lupin germplasm for anthracnose resistance assays.
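The simple matching coefficient used above to express marker specificity is the fraction of accessions in which the marker call agrees with the phenotype class. A minimal sketch follows, assuming binary coding (1 = resistance allele or resistant phenotype, 0 = otherwise) and that accessions with missing data (`None`) are skipped. The function name and the example data are hypothetical, not taken from the study.

```python
def simple_matching_coefficient(marker_calls, phenotype_classes):
    """Fraction of scored accessions where the marker call equals the phenotype class.

    Both inputs are equal-length sequences of 0/1 values; None marks missing data
    and such accessions are excluded from the calculation.
    """
    if len(marker_calls) != len(phenotype_classes):
        raise ValueError("inputs must have the same length")
    pairs = [(m, p) for m, p in zip(marker_calls, phenotype_classes)
             if m is not None and p is not None]
    if not pairs:
        raise ValueError("no accessions with both marker and phenotype data")
    matches = sum(1 for m, p in pairs if m == p)
    return matches / len(pairs)

if __name__ == "__main__":
    # Made-up example: 8 accessions, one with a missing marker call.
    marker = [1, 1, 0, 0, 0, None, 0, 1]
    phenotype = [1, 1, 0, 0, 1, 0, 0, 1]
    print(simple_matching_coefficient(marker, phenotype))
```

Both shared presences and shared absences count as matches, which is what distinguishes this coefficient from the Jaccard index.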

Response to Early Generation Genomic Selection for Yield in Wheat

Frontiers in Plant Science ◽

10.3389/fpls.2021.718611 ◽

2022 ◽

Vol 12 ◽

Author(s):

David Bonnett ◽

Yongle Li ◽

Jose Crossa ◽

Susanne Dreisigacker ◽

Bhoja Basnet ◽

...

Keyword(s):

Grain Yield ◽

Genomic Selection ◽

Reproducing Kernel ◽

Reproducing Kernel Hilbert Space ◽

Predictive Ability ◽

Field Testing ◽

Wheat Breeding ◽

Gaussian Kernel ◽

Training Set ◽

Early Generation

We investigated increasing genetic gain for grain yield using early generation genomic selection (GS). A training set of 1,334 elite wheat breeding lines tested over three field seasons was used to generate Genomic Estimated Breeding Values (GEBVs) for grain yield under irrigated conditions applying markers and three different prediction methods: (1) Genomic Best Linear Unbiased Predictor (GBLUP), (2) GBLUP with the imputation of missing genotypic data by Ridge Regression BLUP (rrGBLUP_imp), and (3) Reproducing Kernel Hilbert Space (RKHS) a.k.a. Gaussian Kernel (GK). F2 GEBVs were generated for 1,924 individuals from 38 biparental cross populations between 21 parents selected from the training set. Results showed that F2 GEBVs from the different methods were not correlated. Experiment 1 consisted of selecting F2s with the highest average GEBVs and advancing them to form genomically selected bulks and make intercross populations aiming to combine favorable alleles for yield. F4:6 lines were derived from genomically selected bulks, intercrosses, and conventional breeding methods with similar numbers from each. Results of field-testing for Experiment 1 did not find any difference in yield with genomic compared to conventional selection. Experiment 2 compared the predictive ability of the different GEBV calculation methods in F2 using a set of single plant-derived F2:4 lines from randomly selected F2 plants. Grain yield results from Experiment 2 showed a significant positive correlation between observed yields of F2:4 lines and predicted yield GEBVs of F2 single plants from GK (predictive ability of 0.248, P < 0.001) and GBLUP (0.195, P < 0.01) but no correlation with rrGBLUP_imp. Results demonstrate the potential for the application of GS in early generations of wheat breeding and the importance of using the appropriate statistical model for GEBV calculation, which may not be the same as the best model for inbreds.
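The contrast between GBLUP and the Gaussian kernel (GK) method named above comes down to the kernel used to describe similarity between lines. The sketch below shows one common way to build each kernel from a marker matrix. It is an assumption-laden illustration, not the study's implementation: the bandwidth scaling by the median squared distance, the function names, and the toy data are all hypothetical.

```python
import numpy as np

def linear_kernel(M):
    """Linear (GBLUP-style) kernel: scaled cross-product of centred markers."""
    Z = M - M.mean(axis=0)
    return Z @ Z.T / M.shape[1]

def gaussian_kernel(M, bandwidth=1.0):
    """Gaussian kernel K_ij = exp(-bandwidth * d_ij^2 / median(d^2)).

    d_ij^2 is the squared Euclidean distance between marker profiles; dividing by
    the median off-diagonal distance is an ASSUMED scaling choice. Requires at
    least two distinct marker profiles so that the median is non-zero.
    """
    sq = np.sum(M ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (M @ M.T), 0.0)
    np.fill_diagonal(d2, 0.0)
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-bandwidth * d2 / scale)

def kernel_predict(K, y, train, test, lam=1.0):
    """Kernel ridge prediction; works with either kernel above."""
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.binomial(2, 0.5, size=(50, 200)).astype(float)  # simulated genotypes
    y = rng.normal(size=50)                                 # placeholder phenotypes
    train, test = np.arange(40), np.arange(40, 50)
    for name, K in [("linear", linear_kernel(M)), ("gaussian", gaussian_kernel(M))]:
        print(name, kernel_predict(K, y, train, test)[:3])
```

The linear kernel models additive marker effects only, whereas the Gaussian kernel decays non-linearly with genetic distance, which is one reason the two methods can rank the same candidates differently.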
