Response to Early Generation Genomic Selection for Yield in Wheat

2022 ◽  
Vol 12 ◽  
Author(s):  
David Bonnett ◽  
Yongle Li ◽  
Jose Crossa ◽  
Susanne Dreisigacker ◽  
Bhoja Basnet ◽  
...  

We investigated increasing genetic gain for grain yield using early generation genomic selection (GS). A training set of 1,334 elite wheat breeding lines tested over three field seasons was used to generate Genomic Estimated Breeding Values (GEBVs) for grain yield under irrigated conditions, using markers and three different prediction methods: (1) Genomic Best Linear Unbiased Predictor (GBLUP), (2) GBLUP with the imputation of missing genotypic data by Ridge Regression BLUP (rrGBLUP_imp), and (3) Reproducing Kernel Hilbert Space (RKHS), also known as Gaussian Kernel (GK). F2 GEBVs were generated for 1,924 individuals from 38 biparental cross populations between 21 parents selected from the training set. Results showed that F2 GEBVs from the different methods were not correlated. Experiment 1 consisted of selecting F2s with the highest average GEBVs and advancing them to form genomically selected bulks and to make intercross populations aiming to combine favorable alleles for yield. F4:6 lines were derived from genomically selected bulks, intercrosses, and conventional breeding methods, with similar numbers from each. Field testing in Experiment 1 found no difference in yield between genomic and conventional selection. Experiment 2 compared the predictive ability of the different GEBV calculation methods in F2 using a set of single plant-derived F2:4 lines from randomly selected F2 plants. Grain yield results from Experiment 2 showed a significant positive correlation between observed yields of F2:4 lines and predicted yield GEBVs of F2 single plants from GK (predictive ability of 0.248, P < 0.001) and GBLUP (0.195, P < 0.01) but no correlation with rrGBLUP_imp. The results demonstrate the potential for applying GS in early generations of wheat breeding and the importance of using the appropriate statistical model for GEBV calculation, which may not be the same as the best model for inbreds.
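The predictive abilities quoted in this abstract are correlations between predicted GEBVs and observed yields. The sketch below shows that calculation alongside one common form of a Gaussian kernel built from marker genotypes. The function names, the 0/1/2 marker coding, and the scaling of squared distances by the number of markers are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math


def pearson(predicted, observed):
    """Pearson correlation, used here as the measure of predictive ability."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    return s_po / math.sqrt(s_pp * s_oo)


def gaussian_kernel(markers, bandwidth=1.0):
    """Gaussian kernel between individuals from a marker matrix.

    markers: list of rows, one per individual, coded 0/1/2 (assumed coding).
    K[i][j] = exp(-bandwidth * d_ij^2 / n_markers), where d_ij is the
    Euclidean distance between marker rows (assumed scaling).
    """
    n_markers = len(markers[0])
    kernel = []
    for row_i in markers:
        kernel_row = []
        for row_j in markers:
            sq_dist = sum((a - b) ** 2 for a, b in zip(row_i, row_j))
            kernel_row.append(math.exp(-bandwidth * sq_dist / n_markers))
        kernel.append(kernel_row)
    return kernel


# Toy data: three individuals, three markers (hypothetical values).
toy_markers = [[0, 1, 2], [0, 1, 2], [2, 1, 0]]
K = gaussian_kernel(toy_markers)
ability = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Identical genotypes give a kernel value of 1, and the value decays toward 0 as genotypes diverge, which is what lets a kernel model capture non-additive similarity that a linear GBLUP relationship matrix does not.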

2016 ◽  
Vol 14 (03) ◽  
pp. 449-477 ◽  
Author(s):  
Andreas Christmann ◽  
Ding-Xuan Zhou

Additive models play an important role in semiparametric statistics. This paper gives learning rates for regularized kernel-based methods for additive models. These learning rates compare favorably in particular in high dimensions to recent results on optimal learning rates for purely nonparametric regularized kernel-based quantile regression using the Gaussian radial basis function kernel, provided the assumption of an additive model is valid. Additionally, a concrete example is presented to show that a Gaussian function depending only on one variable lies in a reproducing kernel Hilbert space generated by an additive Gaussian kernel, but does not belong to the reproducing kernel Hilbert space generated by the multivariate Gaussian kernel of the same variance.
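The two kernels contrasted in this abstract can be written down in a few lines. The sketch below assumes the parameterization exp(-(u - v)^2 / gamma^2) for the one-dimensional Gaussian kernel and the same width for every coordinate; the function names are hypothetical and the paper's exact normalization may differ.

```python
import math


def gauss_1d(u, v, gamma=1.0):
    """One-dimensional Gaussian RBF kernel (assumed parameterization)."""
    return math.exp(-((u - v) ** 2) / gamma ** 2)


def additive_gaussian(x, y, gamma=1.0):
    """Additive kernel: the sum of one-dimensional kernels over coordinates."""
    return sum(gauss_1d(u, v, gamma) for u, v in zip(x, y))


def multivariate_gaussian(x, y, gamma=1.0):
    """Multivariate Gaussian kernel: the product of the one-dimensional kernels."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(x, y))
    return math.exp(-sq_dist / gamma ** 2)


x = [0.0, 1.0, 2.0]
y = [0.5, 1.0, 3.0]
k_add = additive_gaussian(x, y)
k_mult = multivariate_gaussian(x, y)
```

The additive kernel sums per-coordinate similarities while the multivariate kernel multiplies them, so the two kernels generate different function spaces even with the same width.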


Author(s):  
Xabi Cazenave ◽  
Bernard Petit ◽  
Marc Lateur ◽  
Hilde Nybom ◽  
Jiri Sedlak ◽  
...  

Genomic selection is an attractive strategy for apple breeding that could reduce the length of breeding cycles. A possible limitation to the practical implementation of this approach lies in the creation of a training set large and diverse enough to ensure accurate predictions. In this study, we investigated the potential of combining two available populations, i.e. genetic resources and elite material, in order to obtain a large training set with a high genetic diversity. We compared the predictive ability of genomic predictions within-population, across-population or when combining both populations, and tested a model accounting for population-specific marker effects in this last case. The obtained predictive abilities were moderate to high according to the studied trait and small increases in predictive ability could be obtained for some traits when the two populations were combined into a unique training set. We also investigated the potential of such a training set to predict hybrids resulting from crosses between the two populations, with a focus on the method to design the training set and the best proportion of each population to optimize predictions. The measured predictive abilities were very similar for all the proportions, except for the extreme cases where only one of the two populations was used in the training set, in which case predictive abilities could be lower than when using both populations. Using an optimization algorithm to choose the genotypes in the training set also led to higher predictive abilities than when the genotypes were chosen at random. Our results provide guidelines to initiate breeding programs that use genomic selection when the implementation of the training set is a limitation.


2021 ◽  
Author(s):  
Xabi Cazenave ◽  
Bernard Petit ◽  
Francois Laurens ◽  
Charles-Eric Durel ◽  
Helene Muranty

Genomic selection is an attractive strategy for apple breeding that could reduce the length of breeding cycles. A possible limitation to the practical implementation of this approach lies in the creation of a training set large and diverse enough to ensure accurate predictions. In this study, we investigated the potential of combining two available populations, i.e. genetic resources and elite material, in order to obtain a large training set with a high genetic diversity. We compared the predictive ability of genomic predictions within-population, across-population or when combining both populations, and tested a model accounting for population-specific marker effects in this last case. The obtained predictive abilities were moderate to high according to the studied trait and were always highest when the two populations were combined into a unique training set. We also investigated the potential of such a training set to predict hybrids resulting from crosses between the two populations, with a focus on the method to design the training set and the best proportion of each population to optimize predictions. The measured predictive abilities were very similar for all the proportions, except for the extreme cases where only one of the two populations was used in the training set, in which case predictive abilities could be lower than when using both populations. Using an optimization algorithm to choose the genotypes in the training set also led to higher predictive abilities than when the genotypes were chosen at random. Our results provide guidelines to initiate breeding programs that use genomic selection when the implementation of the training set is a limitation.


Genes ◽  
2020 ◽  
Vol 11 (7) ◽  
pp. 779
Author(s):  
Dennis N. Lozada ◽  
Arron H. Carter

Achieving optimal predictive ability is key to increasing the relevance of implementing genomic selection (GS) approaches in plant breeding programs. The potential of an item-based collaborative filtering (IBCF) recommender system in the context of multi-trait, multi-environment GS has been explored. Different GS scenarios for IBCF were evaluated for a diverse population of winter wheat lines adapted to the Pacific Northwest region of the US. Predictions across years through cross-validations resulted in improved predictive ability when there was a high correlation between environments. Using multiple spectral traits collected from high-throughput phenotyping resulted in better GS accuracies for grain yield (GY) compared to using only single traits for predictions. Trait adjustments through various Bayesian regression models using genomic information from SNP markers were the most effective in achieving improved accuracies for GY, heading date, and plant height among the GS scenarios evaluated. Bayesian LASSO had the highest predictive ability compared to other models for phenotypic trait adjustments. IBCF gave competitive accuracies compared to a genomic best linear unbiased predictor (GBLUP) model for predicting different traits. Overall, an IBCF approach could be used as an alternative to traditional prediction models for important target traits in wheat breeding programs.
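As a rough illustration of the item-based collaborative filtering idea mentioned in this abstract, the sketch below treats each trait-by-environment combination as an "item" and each wheat line as a "user". It is a generic textbook form of IBCF with cosine similarity, not the pipeline used in the study; the function names, the use of standardized phenotypes, and the toy values are all assumptions.

```python
import math


def cosine(col_a, col_b):
    """Cosine similarity between two item columns, using only rows observed in both."""
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a is not None and b is not None]
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0


def predict_missing(table, row, item):
    """Predict table[row][item] as a similarity-weighted average of the
    other items observed for that row.

    table: rows are lines, columns are items (trait-environment combinations),
    values are standardized phenotypes (assumed) or None when missing.
    """
    n_items = len(table[0])
    columns = [[r[j] for r in table] for j in range(n_items)]
    num = 0.0
    den = 0.0
    for j in range(n_items):
        if j == item or table[row][j] is None:
            continue
        sim = cosine(columns[item], columns[j])
        num += sim * table[row][j]
        den += abs(sim)
    return num / den if den else None


# Hypothetical standardized phenotypes: 3 lines x 3 items, one value missing.
toy_table = [
    [1.0, 1.0, None],
    [-1.0, -1.0, -1.0],
    [0.5, 0.5, 0.5],
]
predicted = predict_missing(toy_table, row=0, item=2)
```

Because the prediction borrows from correlated items, this kind of scheme would be expected to work best when environments or traits are strongly correlated, which is consistent with the pattern the abstract reports.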


2021 ◽  
Vol 12 ◽  
Author(s):  
Yoseph Beyene ◽  
Manje Gowda ◽  
Paulino Pérez-Rodríguez ◽  
Michael Olsen ◽  
Kelly R. Robbins ◽  
...  

In maize, the doubled haploid (DH) line production capacity of large breeding programs often exceeds the capacity to phenotypically evaluate the complete set of testcross candidates in multi-location trials. The ability to partially select DH lines based on genotypic data while maintaining or improving genetic gains for key traits using phenotypic selection can result in significant resource savings. The present study aimed to evaluate genomic selection (GS) prediction scenarios for grain yield and agronomic traits of one of the tropical maize breeding pipelines of CIMMYT in eastern Africa, based on multi-year empirical data for designing a GS-based strategy at the early stages of the pipeline. We used field data from 3,068 tropical maize DH lines genotyped using rAmpSeq markers and evaluated as test crosses in well-watered (WW) and water-stress (WS) environments in Kenya from 2017 to 2019. Three prediction schemes were compared: (1) 1 year of performance data to predict a second year; (2) 2 years of pooled data to predict performance in the third year, and (3) using individual or pooled data plus converting a certain proportion of individuals from the testing set (TST) to the training set (TRN) to predict the next year's data. Employing five-fold cross-validation, the mean prediction accuracies for grain yield (GY) varied from 0.19 to 0.29 under WW and 0.22 to 0.31 under WS when the 1-year datasets were used as a training set to predict a second year's data as a testing set. The mean prediction accuracies increased to 0.32 under WW and 0.31 under WS when the 2-year datasets were used as a training set to predict the third-year data set. In a forward prediction scenario, good predictive abilities (0.53 to 0.71) were found when the training set consisted of the previous year's breeding data plus 30% of the next year's data converted from the testing set to the training set.
The prediction accuracies for anthesis date and plant height across WW and WS environments obtained using 1-year data and integrating 10, 30, 50, 70, and 90% of the TST set into the TRN set were much higher than those from models trained on individual years. We demonstrate that by increasing the TRN set to include genotypic and phenotypic data from the previous year and combining only 10–30% of the lines from the year of testing, the prediction accuracy can be increased, which in turn could be used to partially replace the first stage of field-based screening, thus saving significant costs associated with testcross formation and multi-location testcross evaluation.
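The third prediction scheme, in which a proportion of the testing set is converted to the training set, amounts to a simple data split. The sketch below shows one way to express it; the function name, the random choice of which lines to move, and the integer line identifiers are assumptions for illustration, since the abstract does not state how the converted lines were chosen.

```python
import random


def augment_training(prev_year_lines, next_year_lines, fraction, seed=0):
    """Move a fraction of next year's lines (TST) into the training set (TRN).

    Returns (training_set, testing_set). Lines to move are drawn at random
    with a fixed seed so the split is reproducible (assumed design choice).
    """
    rng = random.Random(seed)
    shuffled = list(next_year_lines)
    rng.shuffle(shuffled)
    n_move = round(fraction * len(shuffled))
    moved, remaining = shuffled[:n_move], shuffled[n_move:]
    return list(prev_year_lines) + moved, remaining


# Hypothetical line identifiers: 100 lines per year, 30% converted to TRN.
trn, tst = augment_training(range(100), range(100, 200), fraction=0.3)
```

With 100 lines per year and a fraction of 0.3, the training set holds 130 lines and the testing set the remaining 70, with no line appearing in both.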


2019 ◽  
Author(s):  
Sai Krishna Arojju ◽  
Mingshu Cao ◽  
M. Z. Zulfi Jahufer ◽  
Brent A Barrett ◽  
Marty J Faville

Forage nutritive value impacts animal nutrition, which underpins livestock productivity, reproduction and health. Genetic improvement for nutritive traits has been limited, as they are typically expensive and time-consuming to measure through conventional methods. Genomic selection is appropriate for such complex and expensive traits, enabling cost-effective prediction of breeding values using genome-wide markers. The aims of the present study were to assess the potential of genomic selection for a range of nutritive traits in a multi-population training set, and to quantify contributions of genotypic, environmental and genotype-by-environment (G × E) variance components to trait variation and heritability for nutritive traits. The training set consisted of a total of 517 half-sibling (half-sib) families, from five advanced breeding populations, evaluated in two distinct New Zealand grazing environments. Autumn-harvested samples were analyzed for 18 nutritive traits and maternal parents of the half-sib families were genotyped using genotyping-by-sequencing. Significant (P<0.05) genotypic variation was detected for all nutritive traits and genomic heritability (h2g) was moderate to high (0.20 to 0.74). G × E interactions were significant and particularly large for water soluble carbohydrate (WSC), crude fat, phosphorus (P) and crude protein. GBLUP, KGD-GBLUP and BayesC genomic prediction models displayed similar predictive ability, estimated by 10-fold cross validation, for all nutritive traits with values ranging from r = 0.16 to 0.45 using phenotypes from across two environments. High predictive ability was observed for the mineral traits sulphur (0.44), sodium (0.45) and magnesium (0.45) and the lowest values were observed for P (0.16), digestibility (0.22) and high molecular weight WSC (0.23). Predictive ability estimates for most nutritive traits were retained when marker number was reduced from 1 million to as few as 50,000.
The moderate to high predictive abilities observed suggest that implementation of genomic selection is feasible for most of the nutritive traits examined. For traits with lower predictive ability, multi-trait genomic prediction approaches that exploit the strong genetic correlations observed amongst some nutritive traits may be useful. This appears to be particularly important for WSC, considered one of the primary constituents of nutritive value for forages.


2016 ◽  
Vol 26 (03) ◽  
pp. 1650011 ◽  
Author(s):  
Shasha Yuan ◽  
Weidong Zhou ◽  
Qi Wu ◽  
Yanli Zhang

Epileptic seizure detection plays an important role in the diagnosis of epilepsy and in reducing the massive workload of reviewing electroencephalography (EEG) recordings. In this work, a novel algorithm is developed to detect seizures employing log-Euclidean Gaussian kernel-based sparse representation (SR) in long-term EEG recordings. Unlike the traditional SR for vector data in Euclidean space, the log-Euclidean Gaussian kernel-based SR framework is proposed for seizure detection in the space of the symmetric positive definite (SPD) matrices, which form a Riemannian manifold. Since the Riemannian manifold is nonlinear, the log-Euclidean Gaussian kernel function is applied to embed it into a reproducing kernel Hilbert space (RKHS) for performing SR. The EEG signals of all channels are divided into epochs and the SPD matrices representing EEG epochs are generated by covariance descriptors. Then, the testing samples are sparsely coded over the dictionary composed of training samples utilizing log-Euclidean Gaussian kernel-based SR. The classification of testing samples is achieved by computing the minimal reconstructed residuals. The proposed method is evaluated on the Freiburg EEG dataset of 21 patients and shows notable performance on both epoch-based and event-based assessments. Moreover, this method handles multiple channels of EEG recordings synchronously, which is faster and more efficient than traditional seizure detection methods.


Author(s):  
Jianzhong Wang

Let [Formula: see text] be a data set in [Formula: see text], where [Formula: see text] is the training set and [Formula: see text] is the test one. Many unsupervised learning algorithms based on kernel methods have been developed to provide dimensionality reduction (DR) embedding for a given training set [Formula: see text] ([Formula: see text]) that maps the high-dimensional data [Formula: see text] to its low-dimensional feature representation [Formula: see text]. However, these algorithms do not straightforwardly produce DR of the test set [Formula: see text]. An out-of-sample extension method provides DR of [Formula: see text] using an extension of the existing embedding [Formula: see text], instead of re-computing the DR embedding for the whole set [Formula: see text]. Among various out-of-sample DR extension methods, those based on Nyström approximation are very attractive. Many papers have developed such out-of-sample extension algorithms and shown their validity by numerical experiments. However, the mathematical theory for the DR extension still needs further consideration. Utilizing the reproducing kernel Hilbert space (RKHS) theory, this paper develops a preliminary mathematical analysis of the out-of-sample DR extension operators. It treats an out-of-sample DR extension operator as an extension of the identity on the RKHS defined on [Formula: see text]. Then the Nyström-type DR extension turns out to be an orthogonal projection. In the paper, we also present the conditions for the exact DR extension and give the estimate for the error of the extension.


2017 ◽  
Vol 131 (3) ◽  
pp. 703-720 ◽  
Author(s):  
Marty J. Faville ◽  
Siva Ganesh ◽  
Mingshu Cao ◽  
M. Z. Zulfi Jahufer ◽  
Timothy P. Bilton ◽  
...  

2020 ◽  
Vol 11 ◽  
Author(s):  
Philomin Juliana ◽  
Ravi Prakash Singh ◽  
Hans-Joachim Braun ◽  
Julio Huerta-Espino ◽  
Leonardo Crespo-Herrera ◽  
...  
