Multi-Trait Machine and Deep Learning Models for Genomic Selection using Spectral Information in a Wheat Breeding Program

Author(s):  
Karansher S. Sandhu ◽  
Shruti S. Patil ◽  
Michael O. Pumphrey ◽  
Arron H. Carter

Abstract

Prediction of breeding values and phenotypes is central to plant breeding and has been revolutionized by the adoption of genomic selection (GS). Machine and deep learning algorithms applied to complex traits in plants can improve prediction accuracies in the context of GS. Spectral reflectance indices further provide information about various physiological parameters previously undetectable in plants. This research explores the potential of multi-trait (MT) machine and deep learning models for predicting grain yield and grain protein content in wheat using spectral information in GS models. This study compares the performance of four machine and deep learning-based uni-trait (UT) and MT models with traditional GBLUP and Bayesian models. The dataset consisted of 650 recombinant inbred lines from a spring wheat breeding program, grown for three years (2014-2016), and spectral data were collected at heading and grain filling stages. The MT-GS models were 0 to 28.5% superior to the UT-GS models for predicting grain yield and −0.04 to 15% for predicting grain protein content. Random forest and multilayer perceptron were the best performing machine and deep learning models for predicting both traits. These two models performed similarly under UT and MT-GS models. The four Bayesian models explored gave similar accuracies, which were lower than those of the machine and deep learning-based models, and required more computational time. Green normalized difference vegetation index best predicted grain protein content in seven out of the nine MT-GS models. Overall, this study concluded that machine and deep learning-based MT-GS models increased prediction accuracy and should be employed in large-scale breeding programs.

Core Ideas

Potential for combining high throughput phenotyping, machine and deep learning in breeding.
Multi-trait models exploit information from secondary correlated traits efficiently.
Spectral information improves genomic selection models.
Deep learning can aid plant breeders owing to increased data generated in breeding programs.
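For reference, the green normalized difference vegetation index mentioned above is conventionally computed from near-infrared (NIR) and green reflectance as (NIR − Green) / (NIR + Green). The following is a minimal Python sketch of that formula only. The function name, the plot labels, and the reflectance values are hypothetical and are not taken from the study, which does not describe its spectral processing in this abstract.

```python
def gndvi(nir: float, green: float) -> float:
    """Green normalized difference vegetation index from two reflectances."""
    total = nir + green
    if total == 0:
        raise ValueError("NIR + green reflectance must be non-zero")
    return (nir - green) / total


# Hypothetical canopy reflectances as (NIR, green) fractions; not study data.
plots = {"plot_a": (0.50, 0.10), "plot_b": (0.40, 0.20)}

# One index value per plot; such values could then enter a GS model
# as a secondary trait.
indices = {name: gndvi(nir, green) for name, (nir, green) in plots.items()}
```

Because the index is a ratio of a difference to a sum, it is bounded between −1 and 1 for non-negative reflectances.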

2021 ◽  
Author(s):  
Karansher S Sandhu ◽  
Meriem Aoun ◽  
Craig Morris ◽  
Arron H Carter

Grain yield, biotic and abiotic stress resistance, and end-use quality are important goals of wheat breeding programs. Screening for end-use quality traits is usually secondary to grain yield because of the high labor needs, cost of testing, and large seed requirements for phenotyping. Hence, testing is delayed until later stages of the breeding program, and this delay allows lines with inferior end-use quality to advance in the program. Genomic selection provides an alternative by predicting performance from genome-wide markers. Given the large datasets available in breeding programs, we explored the potential of machine and deep learning models to predict fourteen end-use quality traits in a winter wheat breeding program. The population consisted of 666 wheat genotypes screened for five years (2015-19) at two locations (Pullman and Lind, WA, USA). Nine different models, including two machine learning models (random forest and support vector machine) and two deep learning models (convolutional neural network and multilayer perceptron), were explored for cross-validation, forward, and across-location predictions. Prediction accuracies for the different traits ranged from 0.45 to 0.81, 0.29 to 0.55, and 0.27 to 0.50 under cross-validation, forward, and across-location predictions, respectively. In general, forward prediction accuracies increased over time as the training data grew, and this trend was more evident for the machine and deep learning models. Deep learning models outperformed the traditional ridge regression best linear unbiased prediction (RRBLUP) and Bayesian models under all prediction scenarios. The high accuracies observed for end-use quality traits in this study support predicting them in early generations, leading to the advancement of superior genotypes to more extensive grain yield trialing. Furthermore, the superior performance of machine and deep learning models strengthens the case for including them in large-scale breeding programs for predicting complex traits.
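The forward prediction scenario described above can be sketched as follows: every earlier year serves as training data and the next year is predicted, so the training set grows with each step. This is only an illustrative Python sketch. The helper names `forward_splits` and `pearson` are hypothetical, and scoring accuracy as the Pearson correlation between observed and predicted values is an assumption, since the abstract does not define its accuracy measure.

```python
from math import sqrt


def pearson(observed, predicted):
    """Pearson correlation between observed and predicted trait values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sd_o = sqrt(sum((o - mean_o) ** 2 for o in observed))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (sd_o * sd_p)


def forward_splits(years):
    """Yield (training_years, test_year): all earlier years train, the next is predicted."""
    ordered = sorted(set(years))
    for i in range(1, len(ordered)):
        yield ordered[:i], ordered[i]


# The five screening years reported in the abstract.
splits = list(forward_splits([2015, 2016, 2017, 2018, 2019]))
for train_years, test_year in splits:
    print(f"train on {train_years}, predict {test_year}")
```

In a real pipeline, a model fitted on the training years would produce the predictions passed to `pearson` for each test year.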


2021 ◽  
Author(s):  
Karansher S. Sandhu ◽  
Paul D. Mihalyov ◽  
Megan J. Lewien ◽  
Michael O. Pumphrey ◽  
Arron H Carter

Grain protein content (GPC) is controlled by complex genetic systems and their interactions, and is an important quality determinant for hard spring wheat as it has a positive effect on bread and pasta quality. GPC is variable among genotypes and strongly influenced by environment. Thus, understanding the genetic control of wheat GPC and identifying genotypes with improved stability is an important breeding goal. The objectives of this research were to identify genetic backgrounds with less variation for GPC across environments and identify quantitative trait loci (QTLs) controlling the stability of GPC. A spring wheat nested association mapping (NAM) population of 650 recombinant inbred lines (RIL) derived from 26 diverse founder parents crossed to one common parent, 'Berkut', was phenotyped over three years of field trials (2014-2016). Genomic selection models were developed and compared based on prediction of GPC and GPC stability. After observing variable genetic control of GPC within the NAM population, seven RIL families displaying reduced marker-by-environment interaction were selected based on a stability index derived from Finlay-Wilkinson regression. A genome-wide association study identified seven significant QTLs for GPC stability with a Bonferroni-adjusted P value <0.05. This study also demonstrated that genome-wide prediction of GPC with ridge regression best linear unbiased estimates reached up to r = 0.69. Genomic selection can be used to apply selection pressure for GPC and improve genetic gain for GPC.


Agronomy ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 2528
Author(s):  
Karansher S. Sandhu ◽  
Paul D. Mihalyov ◽  
Megan J. Lewien ◽  
Michael O. Pumphrey ◽  
Arron H. Carter

Grain protein content (GPC) is controlled by complex genetic systems and their interactions and is an important quality determinant for hard spring wheat as it has a positive effect on bread and pasta quality. GPC is variable among genotypes and strongly influenced by the environment. Thus, understanding the genetic control of wheat GPC and identifying genotypes with improved stability is an important breeding goal. The objectives of this research were to identify genetic backgrounds with less variation for GPC across environments and identify quantitative trait loci (QTLs) controlling the stability of GPC. A spring wheat nested association mapping (NAM) population of 650 recombinant inbred lines (RIL) derived from 26 diverse founder parents crossed to one common parent, ‘Berkut’, was phenotyped over three years of field trials (2014–2016). Genomic selection models were developed and compared based on predictions of GPC and GPC stability. After observing variable genetic control of GPC within the NAM population, seven RIL families displaying reduced marker-by-environment interaction were selected based on a stability index derived from a Finlay–Wilkinson regression. A genome-wide association study using four different models identified eighteen significant QTLs for GPC stability at a Bonferroni-adjusted p-value < 0.05; eight of these QTLs were detected by two or more of the GWAS models. This study also demonstrated that genome-wide prediction of GPC with ridge regression best linear unbiased estimates reached up to r = 0.69. Genomic selection can be used to apply selection pressure for GPC and improve genetic gain for GPC.
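A Finlay–Wilkinson regression, as mentioned above, regresses each genotype's trait values on an environmental index, usually the mean of all genotypes in each environment. The slope then indicates how strongly a genotype responds to better or poorer environments. The Python sketch below illustrates that idea only. The function name, genotype labels, and GPC values are hypothetical, and the exact stability index used in the study is not given in this abstract.

```python
from statistics import mean


def finlay_wilkinson_slopes(trait_by_genotype):
    """Return the regression slope of each genotype on the environmental index.

    trait_by_genotype maps a genotype name to a list of trait values,
    one per environment, with environments in the same order for all genotypes.
    """
    n_env = len(next(iter(trait_by_genotype.values())))
    # Environmental index: mean over genotypes, centered on the grand mean.
    env_means = [mean(v[j] for v in trait_by_genotype.values()) for j in range(n_env)]
    grand_mean = mean(env_means)
    index = [m - grand_mean for m in env_means]
    ss_index = sum(e * e for e in index)
    slopes = {}
    for genotype, values in trait_by_genotype.items():
        g_mean = mean(values)
        # Ordinary least squares slope of the genotype's values on the index.
        slopes[genotype] = sum(e * (v - g_mean) for e, v in zip(index, values)) / ss_index
    return slopes


# Hypothetical GPC values (%) in three environments; not study data.
gpc = {
    "stable": [12.0, 12.5, 13.0],
    "average": [11.0, 12.0, 13.0],
    "responsive": [10.0, 12.5, 15.0],
}
slopes = finlay_wilkinson_slopes(gpc)
```

By construction the slopes average to 1 across genotypes, so a slope well below 1 marks a genotype whose GPC changes less than average across environments.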


2000 ◽  
Vol 51 (6) ◽  
pp. 665 ◽  
Author(s):  
M Koç ◽  
C. Barutçular ◽  
N. Zencirci

High grain protein in durum wheat [Triticum turgidum ssp. turgidum L. conv. Durum (Desf.)] is one of the main goals of breeding programs. Landraces may be very useful germplasm for achieving this goal. To examine their potential as a source of high grain protein content, 11 genotypes, including 7 landraces, were evaluated in 8 environments. Environment, genotype, and the interaction of the two (G × E) significantly influenced the variation in grain yield, grain protein content, and grain protein yield. The environmental effect was the strongest, mostly due to differences in water supply. Grain yields of the modern genotypes were higher than those of landraces. Yields of the modern genotypes tended to respond more strongly to the higher yielding environments, but they varied more than the yields of landraces. With the exception of VK.85.18, the grain protein content of the high-yielding genotypes was almost as high as that of the best landraces. Moreover, grain protein content of these bred genotypes tended to respond more strongly to the higher protein environments. Differences in grain protein yield were closely related to the differences in grain yield. The results indicate that it is possible to improve grain protein content without grain yield being adversely affected. The results also indicate that potential gene sources should be compared over a number of environments before they can be used as breeding material or as crop varieties producing high grain protein yields.


2018 ◽  
Vol 50 (4) ◽  
pp. 279-298 ◽  
Author(s):  
A.I. Rybalka ◽  
B.V. Morgun ◽  
S.S. Polyshchuk ◽  
...  

2020 ◽  
Vol 11 ◽  
Author(s):  
Biructawit Bekele Tessema ◽  
Huiming Liu ◽  
Anders Christian Sørensen ◽  
Jeppe Reitan Andersen ◽  
Just Jensen

Conventional wheat-breeding programs involve crossing parental lines and subsequent selfing of the offspring for several generations to obtain inbred lines. Such a breeding program takes more than 8 years to develop a variety. Although wheat-breeding programs have been running for many years, genetic gain has been limited. However, the use of genomic information as a selection criterion can increase selection accuracy, which would contribute to increased genetic gain. The main objective of this study was to quantify the increase in genetic gain from implementing genomic selection in traditional wheat-breeding programs. In addition, we investigated the effect of genetic correlation between different traits on genetic gain. A stochastic simulation was used to evaluate wheat-breeding programs that run simultaneously for 25 years with phenotypic or genomic selection. Genetic gain and genetic variance of the breeding program based on phenotypic selection were compared with those of the program using genomic selection. Genetic gain from the wheat-breeding program based on genomic estimated breeding values (GEBVs) tripled compared with phenotypic selection. Genomic selection is a promising strategy for improving genetic gain in wheat-breeding programs.


2021 ◽  
Vol 12 ◽  
Author(s):  
Karansher S. Sandhu ◽  
Paul D. Mihalyov ◽  
Megan J. Lewien ◽  
Michael O. Pumphrey ◽  
Arron H. Carter

Genomics and high throughput phenomics have the potential to revolutionize the field of wheat (Triticum aestivum L.) breeding. Genomic selection (GS) has been used for predicting various quantitative traits in wheat, especially grain yield. However, there are few GS studies for grain protein content (GPC), which is a crucial quality determinant. Incorporation of secondary correlated traits in GS models has been demonstrated to improve accuracy. The objectives of this research were to compare performance of single and multi-trait GS models for predicting GPC and grain yield in wheat and to identify optimal growth stages for collecting secondary traits. We used 650 recombinant inbred lines from a spring wheat nested association mapping (NAM) population. The population was phenotyped over 3 years (2014–2016), and spectral information was collected at heading and grain filling stages. The ability to predict GPC and grain yield was assessed using secondary traits, univariate, covariate, and multivariate GS models for within and across cycle predictions. Our results indicate that GS accuracy increased by an average of 12% for GPC and 20% for grain yield by including secondary traits in the models. Spectral information collected at heading was superior for predicting GPC, whereas grain yield was more accurately predicted during the grain filling stage. Green normalized difference vegetation index had the largest effect on the prediction of GPC either used individually or with multiple indices in the GS models. An increased prediction ability for GPC and grain yield with the inclusion of secondary traits demonstrates the potential to improve the genetic gain per unit time and cost in wheat breeding.

