Multi-trait regressor stacking increased genomic prediction accuracy of sorghum grain composition

Author(s):  
Sirjan Sapkota ◽  
Jon Lucas Boatwright ◽  
Kathleen Jordan ◽  
Richard Boyles ◽  
Stephen Kresovich

Abstract Cereal grains, primarily composed of starch, protein, and fat, are a major staple for human and animal nutrition. Sorghum, a cereal crop, serves as a dietary staple for over half a billion people in the semi-arid tropics of Africa and South Asia. Genomic prediction has enabled plant breeders to estimate breeding values of unobserved genotypes and environments. The use of genomic prediction will therefore be extremely valuable for compositional traits, for which the most accurate phenotyping is labor-intensive and destructive. We studied the potential of the Bayesian multi-output regressor stacking (BMORS) model to improve prediction performance over single-trait single-environment (STSE) models using a grain sorghum diversity panel (GSDP) and a biparental recombinant inbred line (RIL) population. Five highly correlated grain composition traits (amylose, fat, gross energy, protein, and starch), with genomic heritability ranging from 0.24 to 0.59 in the GSDP and 0.69 to 0.83 in the RILs, were studied. Average prediction accuracies from the STSE model ranged from 0.4 to 0.6 for all traits across both populations, except amylose (0.25) in the GSDP. Prediction accuracy for BMORS increased by 41% and 32% on average over STSE in the GSDP and RILs, respectively. Predicting whole environments by training BMORS on the remaining environments yielded higher average prediction accuracy than the STSE model. Our results show that regression stacking methods such as BMORS have the potential to accurately predict unobserved individuals and environments, and that implementing such models can accelerate genetic gain.

Agronomy ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 1221 ◽  
Author(s):  
Sirjan Sapkota ◽  
J. Lucas Boatwright ◽  
Kathleen Jordan ◽  
Richard Boyles ◽  
Stephen Kresovich

Genomic prediction has enabled plant breeders to estimate breeding values of unobserved genotypes and environments. The use of genomic prediction will be extremely valuable for compositional traits, for which the most accurate phenotyping is labor-intensive and destructive. We studied the potential of the Bayesian multi-output regressor stacking (BMORS) model to improve prediction performance over single-trait single-environment (STSE) models using a grain sorghum diversity panel (GSDP) and a biparental recombinant inbred line (RIL) population. Five highly correlated grain composition traits (amylose, fat, gross energy, protein, and starch), with genomic heritability ranging from 0.24 to 0.59 in the GSDP and 0.69 to 0.83 in the RILs, were studied. Average prediction accuracies from the STSE model ranged from 0.4 to 0.6 for all traits across both populations, except amylose (0.25) in the GSDP. Prediction accuracy for BMORS increased by 41% and 32% on average over STSE in the GSDP and RILs, respectively. Predicting whole environments by training BMORS on the remaining environments resulted in moderate to high prediction accuracy. Our results show that regression stacking methods such as BMORS have the potential to accurately predict unobserved individuals and environments, and that implementing such models can accelerate genetic gain.
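The two-stage idea behind multi-output regressor stacking can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: plain ridge regression stands in for the Bayesian models used in BMORS, the marker matrix and phenotypes are simulated, and the function names and penalty values (`alpha1`, `alpha2`) are hypothetical. Stage one fits one model per trait from markers; stage two regresses each trait on the stage-one predictions of all traits.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data; returns (intercept, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return y_mean - x_mean @ coef, coef

def ridge_predict(model, X):
    intercept, coef = model
    return intercept + X @ coef

def fit_stacked(X, Y, alpha1=10.0, alpha2=1.0):
    """Stage 1: one single-trait model per column of Y.
    Stage 2: each trait regressed on the stage-1 predictions of all traits."""
    n_traits = Y.shape[1]
    stage1 = [ridge_fit(X, Y[:, t], alpha1) for t in range(n_traits)]
    # In-sample stage-1 predictions are used here for brevity (an assumption);
    # out-of-fold predictions are a common alternative.
    Z = np.column_stack([ridge_predict(m, X) for m in stage1])
    stage2 = [ridge_fit(Z, Y[:, t], alpha2) for t in range(n_traits)]
    return stage1, stage2

def predict_stacked(models, X):
    stage1, stage2 = models
    Z = np.column_stack([ridge_predict(m, X) for m in stage1])
    return np.column_stack([ridge_predict(m, Z) for m in stage2])

# Simulated example: 120 lines, 200 markers coded 0/1/2, 3 traits.
rng = np.random.default_rng(0)
n, p, t = 120, 200, 3
X = rng.choice([0.0, 1.0, 2.0], size=(n, p))
Y = X @ rng.normal(size=(p, t)) + rng.normal(scale=5.0, size=(n, t))

train, test = np.arange(90), np.arange(90, n)
models = fit_stacked(X[train], Y[train])
Y_hat = predict_stacked(models, X[test])  # one column of predictions per trait
```

Whether the second stage helps depends on how strongly the traits are correlated, so the gain should be checked against the single-trait predictions on held-out data.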


Genetics ◽  
2021 ◽  
Author(s):  
Marco Lopez-Cruz ◽  
Gustavo de los Campos

Abstract Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and in linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies have tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimal for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset of the training data (i.e., a set of support points) from which predictions are derived. The methodology that we propose is a Sparse Selection Index (SSI) that integrates Selection Index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); the G-BLUP (the prediction method most commonly used in plant and animal breeding) appears as the special case λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in ten different environments) that the SSI can achieve significant gains in prediction accuracy (between 5% and 10%) relative to the G-BLUP.
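The role of λ can be illustrated with a small sketch. It assumes one common way of writing an L1-penalized selection index: for each target individual, minimize ½ β'(G + θI)β − g'β + λ‖β‖₁, where G holds the genomic relationships among training individuals, g the relationships between the target and the training individuals, and θ the ratio of residual to genetic variance. This formulation, the coordinate-descent solver, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def sparse_index(A, g, lam, n_iter=500, tol=1e-10):
    """Coordinate descent for 0.5*b'Ab - g'b + lam*sum(|b|), with A = G + theta*I."""
    n = len(g)
    beta = [0.0] * n
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(n):
            # Partial residual for coordinate j, excluding its own contribution.
            partial = g[j] - sum(A[j][k] * beta[k] for k in range(n) if k != j)
            new = soft_threshold(partial, lam) / A[j][j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta

# Toy example: 4 training individuals, theta = 1 (hypothetical values).
G = [[1.0, 0.5, 0.1, 0.0],
     [0.5, 1.0, 0.2, 0.1],
     [0.1, 0.2, 1.0, 0.3],
     [0.0, 0.1, 0.3, 1.0]]
theta = 1.0
A = [[G[i][j] + (theta if i == j else 0.0) for j in range(4)] for i in range(4)]
g = [0.6, 0.4, 0.05, 0.0]           # target-to-training relationships
y_centered = [1.2, -0.4, 0.3, -1.1]  # centered training phenotypes

beta_gblup = sparse_index(A, g, lam=0.0)   # no penalty: solves (G + theta*I) b = g
beta_sparse = sparse_index(A, g, lam=0.3)  # penalty drops weakly related individuals
u_hat = sum(b * y for b, y in zip(beta_sparse, y_centered))  # predicted genetic value
```

With λ = 0 every training individual receives a weight, which is the G-BLUP solution under this formulation; as λ grows, individuals weakly related to the target drop out of the index and only the support points remain.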


Animals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 1199
Author(s):  
Reinhard Puntigam ◽  
Julia Slama ◽  
Daniel Brugger ◽  
Karin Leitner ◽  
Karl Schedle ◽  
...  

This study investigated the effects of sorghum ensiled as whole grains with different dry matter concentrations on the apparent total tract digestibility (ATTD) of energy, crude nutrients and minerals in growing pigs. Whole grain sorghum batches with dry matter (DM) concentrations of 701 (S1), 738 (S2) and 809 g kg−1 (S3), resulting from different harvest dates on the same arable plot, were stored in air-tight kegs (6 L) for 6 months to ensure complete fermentation. Subsequently, 9 crossbred barrows (34.6 ± 1.8 kg; (Duroc × Landrace) × Piétrain) were used in a 3 × 3 Latin square feeding experiment. Diets were based on the respective sorghum grain silage and were supplemented with additional amino acids, minerals and vitamins to meet or exceed published feeding recommendations for growing pigs. The ATTD of gross energy, dry matter, organic matter, nitrogen-free extracts, and crude ash was higher in the S1 than in the S3 treatment (p ≤ 0.05), while S2 was intermediate. Pigs fed S1 showed significantly higher ATTD of phosphorus (P) than all other groups, while ATTD of calcium was unaffected by the feeding regime. In conclusion, growing pigs used whole grain sorghum fermented at a DM concentration of 701 g kg−1 (S1) most efficiently. In particular, the addition of inorganic P could have been reduced by 0.39 g kg−1 DM when using this silage compared to the variant with the highest DM value (809 g kg−1).


PLoS ONE ◽  
2017 ◽  
Vol 12 (12) ◽  
pp. e0189775 ◽  
Author(s):  
S. Hong Lee ◽  
Sam Clark ◽  
Julius H. J. van der Werf

2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 245-246
Author(s):  
Cláudio U Magnabosco ◽  
Fernando Lopes ◽  
Valentina Magnabosco ◽  
Raysildo Lobo ◽  
Leticia Pereira ◽  
...  

Abstract The aim of the study was to evaluate prediction methods, validation approaches and pseudo-phenotypes for the prediction of the genomic breeding values of feed efficiency related traits in Nellore cattle. It used the phenotypic and genotypic information of 4,329 and 3,594 animals, respectively, which were tested for residual feed intake (RFI), dry matter intake (DMI), feed efficiency (FE), feed conversion ratio (FCR), residual body weight gain (RG), and residual intake and body weight gain (RIG). Six prediction methods were used: ssGBLUP, BayesA, BayesB, BayesCπ, BLASSO, and BayesR. Three validation approaches were used: 1) random: the data were randomly divided into ten subsets and validation was done on one subset at a time; 2) age: the division into training (2010 to 2016) and validation (2017) populations was based on year of birth; 3) estimated breeding value (EBV) accuracy: the training population comprised animals with accuracy above 0.45 and the validation population those below 0.45. We checked the accuracy and bias of the genomic estimated breeding values (GEBV). The results showed that GEBV accuracy was highest when predictions were obtained with ssGBLUP (0.05 to 0.31) (Figure 1). The low heritability obtained, mainly for FE (0.07 ± 0.03) and FCR (0.09 ± 0.03), limited the GEBV accuracy, which ranged from low to moderate. The regression coefficient estimates were close to 1 and similar across prediction methods, validation approaches, and pseudo-phenotypes. The cross-validation presented the most accurate predictions ranging from 0.07 to 0.037. The prediction accuracy was higher for phenotype adjusted for fixed effects than for EBV and deregressed EBV (30.0 and 34.3%, respectively). Genomic prediction can provide a reliable estimate of genomic breeding values for RFI, DMI, RG and RIG, and these traits may achieve higher genetic gain than FE and FCR.
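The three validation approaches described above amount to three ways of partitioning animals into training and validation sets. The sketch below shows one way to express them. The record fields (`id`, `birth_year`, `ebv_accuracy`), the function names, and the toy records are hypothetical, and assigning animals with an accuracy of exactly 0.45 to the validation set is an assumption, since the abstract does not say how they were handled.

```python
import random

def random_folds(ids, k=10, seed=1):
    """Random approach: shuffle the ids and deal them into k validation subsets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def split_by_birth_year(animals, last_training_year=2016):
    """Age approach: older animals train the model, the youngest validate it."""
    train = [a["id"] for a in animals if a["birth_year"] <= last_training_year]
    valid = [a["id"] for a in animals if a["birth_year"] > last_training_year]
    return train, valid

def split_by_ebv_accuracy(animals, threshold=0.45):
    """EBV-accuracy approach: high-accuracy animals train, the rest validate.
    Animals exactly at the threshold go to validation (an assumption)."""
    train = [a["id"] for a in animals if a["ebv_accuracy"] > threshold]
    valid = [a["id"] for a in animals if a["ebv_accuracy"] <= threshold]
    return train, valid

# Hypothetical records for illustration only.
animals = [
    {"id": "A1", "birth_year": 2012, "ebv_accuracy": 0.62},
    {"id": "A2", "birth_year": 2015, "ebv_accuracy": 0.40},
    {"id": "A3", "birth_year": 2017, "ebv_accuracy": 0.51},
    {"id": "A4", "birth_year": 2017, "ebv_accuracy": 0.30},
]

folds = random_folds([a["id"] for a in animals], k=2)
age_train, age_valid = split_by_birth_year(animals)
acc_train, acc_valid = split_by_ebv_accuracy(animals)
```

In the random approach each fold serves once as the validation set while the remaining folds train the model; the other two approaches produce a single training/validation split.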


2017 ◽  
Vol 130 (12) ◽  
pp. 2543-2555 ◽  
Author(s):  
Adam Norman ◽  
Julian Taylor ◽  
Emi Tanaka ◽  
Paul Telfer ◽  
James Edwards ◽  
...  

Author(s):  
Stefan McKinnon Edwards ◽  
Jaap B. Buntjer ◽  
Robert Jackson ◽  
Alison R. Bentley ◽  
Jacob Lage ◽  
...  

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3019 ◽  
Author(s):  
Emma N. Bermingham ◽  
Paul Maclean ◽  
David G. Thomas ◽  
Nicholas J. Cave ◽  
Wayne Young

Background: Much of the recent research in companion animal nutrition has focussed on understanding the role of diet on faecal microbiota composition. To date, diet-induced changes in faecal microbiota observed in humans and rodents have been extrapolated to pets in spite of their very different dietary and metabolic requirements. This lack of direct evidence means that the mechanisms by which microbiota influences health in dogs are poorly understood. We hypothesised that changes in faecal microbiota correlate with physiological parameters including apparent macronutrient digestibility. Methods: Fifteen adult dogs were assigned to two diet groups, exclusively fed either a premium kibbled diet (kibble; K; n = 8) or a raw red meat diet (meat; M; n = 7) for nine weeks. Apparent digestibility of macronutrients (protein, fat, gross energy and dry matter), faecal weight, faecal health scores, faecal VFA concentrations and faecal microbial composition were determined. Datasets were integrated using mixOmics in R. Results: Faecal weight and VFA levels were lower and the apparent digestibility of protein and energy was higher in dogs on the meat diet. Diet significantly affected 27 microbial families and 53 genera in the faeces. In particular, the abundances of Bacteroides, Prevotella, Peptostreptococcus and Faecalibacterium were lower in dogs fed the meat diet, whereas Fusobacterium, Lactobacillus and Clostridium were all more abundant. Discussion: Our results show clear associations of specific microbial taxa with diet composition. For example, Clostridiaceae, Erysipelotrichaceae and Bacteroidaceae were highly correlated to parameters such as protein and fat digestibility in the dog. By understanding the relationship between faecal microbiota and physiological parameters we will gain better insights into the effects of diet on the nutrition of our pets.


2019 ◽  
Author(s):  
Daniel Runcie ◽  
Hao Cheng

Abstract Incorporating measurements on correlated traits into genomic prediction models can increase prediction accuracy and selection gain. However, multi-trait genomic prediction models are complex and prone to overfitting, which may result in a loss of prediction accuracy relative to single-trait genomic prediction. Cross-validation is considered the gold standard method for selecting and tuning models for genomic prediction in both plant and animal breeding. When used appropriately, cross-validation gives an accurate estimate of the prediction accuracy of a genomic prediction model, and can effectively choose among disparate models based on their expected performance in real data. However, we show that a naive cross-validation strategy applied to the multi-trait prediction problem can be severely biased and lead to sub-optimal choices between single and multi-trait models when secondary traits are used to aid in the prediction of focal traits and these secondary traits are measured on the individuals to be tested. We use simulations to demonstrate the extent of the problem and propose three partial solutions: 1) a parametric solution from selection index theory, 2) a semi-parametric method for correcting the cross-validation estimates of prediction accuracy, and 3) a fully non-parametric method which we call CV2*: validating model predictions against focal trait measurements from genetically related individuals. The current excitement over high-throughput phenotyping suggests that more comprehensive phenotype measurements will be useful for accelerating breeding programs. Using an appropriate cross-validation strategy should more reliably determine if and when combining information across multiple traits is useful.


2020 ◽  
Vol 71 (20) ◽  
pp. 6670-6683
Author(s):  
Xiongwei Zhao ◽  
Gang Nie ◽  
Yanyu Yao ◽  
Zhongjie Ji ◽  
Jianhua Gao ◽  
...  

Abstract Genomic prediction of nitrogen-use efficiency (NUE) has not previously been studied in perennial grass species exposed to low-N stress. Here, we conducted a genomic prediction of physiological traits and NUE in 184 global accessions of perennial ryegrass (Lolium perenne) in response to a normal (7.5 mM) and low (0.75 mM) supply of N. After 21 d of treatment under greenhouse conditions, significant variations in plant height increment (ΔHT), leaf fresh weight (LFW), leaf dry weight (LDW), chlorophyll index (Chl), chlorophyll fluorescence, leaf N and carbon (C) contents, C/N ratio, and NUE were observed among accessions, but to a greater extent under low-N stress. Six genomic prediction models were applied to the data, namely the Bayesian method Bayes C, Bayesian LASSO, Bayesian Ridge Regression, Ridge Regression-Best Linear Unbiased Prediction, Reproducing Kernel Hilbert Spaces, and randomForest. These models produced similar prediction accuracy of traits within the normal or low-N treatments, but the accuracy differed between the two treatments. ΔHT, LFW, LDW, and C were predicted slightly better under normal N, with a mean Pearson r-value of 0.26 compared with r = 0.22 under low N, while the prediction accuracies for Chl, N, C/N, and NUE were significantly improved under low-N stress, with a mean r = 0.45 compared with r = 0.26 under normal N. The population panel contained three population structures, which generally had no effect on prediction accuracy. The moderate prediction accuracies obtained for N, C, and NUE under low-N stress are promising, and suggest a feasible means by which germplasm might be initially assessed for further detailed studies in breeding programs.

