scholarly journals Utility of Climatic Information via Combining Ability Models to Improve Genomic Prediction for Yield Within the Genomes to Fields Maize Project

2021 ◽  
Vol 11 ◽  
Author(s):  
Diego Jarquin ◽  
Natalia de Leon ◽  
Cinta Romay ◽  
Martin Bohn ◽  
Edward S. Buckler ◽  
...  

Genomic prediction provides an efficient alternative to conventional phenotypic selection for developing improved cultivars with desirable characteristics. New and improved methods to genomic prediction are continually being developed that attempt to deal with the integration of data types beyond genomic information. Modern automated weather systems offer the opportunity to capture continuous data on a range of environmental parameters at specific field locations. In principle, this information could characterize training and target environments and enhance predictive ability by incorporating weather characteristics as part of the genotype-by-environment (G×E) interaction component in prediction models. We assessed the usefulness of including weather data variables in genomic prediction models using a naïve environmental kinship model across 30 environments comprising the Genomes to Fields (G2F) initiative in 2014 and 2015. Specifically four different prediction scenarios were evaluated (i) tested genotypes in observed environments; (ii) untested genotypes in observed environments; (iii) tested genotypes in unobserved environments; and (iv) untested genotypes in unobserved environments. A set of 1,481 unique hybrids were evaluated for grain yield. Evaluations were conducted using five different models including main effect of environments; general combining ability (GCA) effects of the maternal and paternal parents modeled using the genomic relationship matrix; specific combining ability (SCA) effects between maternal and paternal parents; interactions between genetic (GCA and SCA) effects and environmental effects; and finally interactions between the genetics effects and environmental covariates. Incorporation of the genotype-by-environment interaction term improved predictive ability across all scenarios. However, predictive ability was not improved through inclusion of naive environmental covariates in G×E models. More research should be conducted to link the observed weather conditions with important physiological aspects in plant development to improve predictive ability through the inclusion of weather data.

2021 ◽  
Author(s):  
Raysa Gevartosky ◽  
Humberto Fanelli Carvalho ◽  
Germano Costa-Neto ◽  
Osval A. Montesinos-Lopez ◽  
Jose Crossa ◽  
...  

Genomic prediction (GP) success is directly dependent on establishing a training population, where incorporating envirotyping data and correlated traits may increase the GP accuracy. Therefore, we aimed to design optimized training sets for multi-trait for multi-environment trials (MTMET). For that, we evaluated the predictive ability of five GP models using the genomic best linear unbiased predictor model (GBLUP) with additive + dominance effects (M1) as the baseline and then adding genotype by environment interaction (G × E) (M2), enviromic data (W) (M3), W+G × E (M4), and finally W+G × W (M5), where G × W denotes the genotype by enviromic interaction. Moreover, we considered single-trait multi-environment trials (STMET) and MTMET for three traits: grain yield (GY), plant height (PH), and ear height (EH), with two datasets and two cross-validation schemes. Afterward, we built two kernels for genotype by environment by trait interaction (GET) and genotype by enviromic by trait interaction (GWT) to apply genetic algorithms to select genotype:environment:trait combinations that represent 98% of the variation of the whole dataset and composed the optimized training set (OTS). Using OTS based on enviromic data, it was possible to increase the response to selection per amount invested by 142%. Consequently, our results suggested that genetic algorithms of optimization associated with genomic and enviromic data efficiently design optimized training sets for genomic prediction and improve the genetic gains per dollar invested.


2020 ◽  
Author(s):  
Nourollah Ahmadi ◽  
Tuong-Vi Cao ◽  
Julien Frouin ◽  
Gareth J. Norton ◽  
Adam H. Price

AbstractMany rice-growing areas are affected by high concentrations of arsenic (As). Rice varieties that prevent As uptake and/or accumulation can mitigate As threats to human health. Genomic selection is known to facilitate rapid selection of superior genotypes for complex traits. We explored the predictive ability (PA) of genomic prediction with single-environment models, accounting or not for trait-specific markers, multi-environment models, and multi-trait and multi-environment models, using the genotypic (1600 K SNP) and phenotypic (grain arsenic content, grain yield and days to flowering, observed under two irrigation systems over two years) data of the Bengal and Assam Aus Panel (BAAP). Under the base-line single environment model, PA of up to 0.707 and 0.654 was obtained for grain yield and grain As respectively, the three prediction methods (BL, GBLUP and RKHS) considered performed similarly, and marker selection based on linkage disequilibrium allowed to reduce the number of SNP to 17 K, without negative effect on PA of genomic predictions. Single environment models giving distinct weight to trait-specific markers in the genomic relationship matrix outperformed the base-line models up to 32%. Multi-environment models, accounting for G × E interactions, and multi-trait and multi-environment models outperformed the base-line models by up to 47% and 61%, respectively. Among the multi-trait and multi-environment models, the Bayesian multi-output regressor stacking function obtained the highest PA (0.831 for grain As) with much higher efficiency for computing time. These findings pave the way for breeding for As-tolerance in the progenies of biparental crosses involving members of the BAAP. It also applies to breeding for other complex traits evaluated under multiple environments.


2018 ◽  
Author(s):  
M Ben Hassen ◽  
J Bartholomé ◽  
G Valè ◽  
TV Cao ◽  
N Ahmadi

AbstractDeveloping rice varieties adapted to alternate wetting and drying water management is crucial for the sustainability of irrigated rice cropping systems. Here we report the first study exploring the feasibility of breeding rice for adaptation to alternate wetting and drying using genomic prediction methods that account for genotype by environment interactions. Two breeding populations (a reference panel of 284 accessions and a progeny population of 97 advanced lines) were evaluated under alternate wetting and drying and continuous flooding management systems. The accuracy of genomic prediction for response variables (index of relative performance and the slope of the joint regression) and for multi-environment genomic prediction models were compared. For the three traits considered (days to flowering, panicle weight and nitrogen-balance index), significant genotype by environment interactions were observed in both populations. In cross validation, prediction accuracy for the index was on average lower (0.31) than that of the slope of the joint regression (0.64) whatever the trait considered. Similar results were found for across population validation (progeny validation). Both cross-validation and progeny validation experiments showed that the performance of multi-environment models predicting unobserved phenotypes of untested entrees was similar to the performance of single environment models with differences in accuracy ranging from - 6% to 4% depending on the trait and on the statistical model concerned. The accuracy of multi-environment models predicting unobserved phenotypes of entrees evaluated under both water management systems outperformed single environment models by an average of 30%. Practical implications for breeding rice for adaptation to AWD are discussed.


2021 ◽  
Vol 12 ◽  
Author(s):  
Damiano Puglisi ◽  
Stefano Delbono ◽  
Andrea Visioni ◽  
Hakan Ozkan ◽  
İbrahim Kara ◽  
...  

Multi-parent Advanced Generation Inter-crosses (MAGIC) lines have mosaic genomes that are generated shuffling the genetic material of the founder parents following pre-defined crossing schemes. In cereal crops, these experimental populations have been extensively used to investigate the genetic bases of several traits and dissect the genetic bases of epistasis. In plants, genomic prediction models are usually fitted using either diverse panels of mostly unrelated accessions or individuals of biparental families and several empirical analyses have been conducted to evaluate the predictive ability of models fitted to these populations using different traits. In this paper, we constructed, genotyped and evaluated a barley MAGIC population of 352 individuals developed with a diverse set of eight founder parents showing contrasting phenotypes for grain yield. We combined phenotypic and genotypic information of this MAGIC population to fit several genomic prediction models which were cross-validated to conduct empirical analyses aimed at examining the predictive ability of these models varying the sizes of training populations. Moreover, several methods to optimize the composition of the training population were also applied to this MAGIC population and cross-validated to estimate the resulting predictive ability. Finally, extensive phenotypic data generated in field trials organized across an ample range of water regimes and climatic conditions in the Mediterranean were used to fit and cross-validate multi-environment genomic prediction models including G×E interaction, using both genomic best linear unbiased prediction and reproducing kernel Hilbert space along with a non-linear Gaussian Kernel. Overall, our empirical analyses showed that genomic prediction models trained with a limited number of MAGIC lines can be used to predict grain yield with values of predictive ability that vary from 0.25 to 0.60 and that beyond QTL mapping and analysis of epistatic effects, MAGIC population might be used to successfully fit genomic prediction models. We concluded that for grain yield, the single-environment genomic prediction models examined in this study are equivalent in terms of predictive ability while, in general, multi-environment models that explicitly split marker effects in main and environmental-specific effects outperform simpler multi-environment models.


Agriculture ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 932
Author(s):  
Reyna Persa ◽  
Martin Grondona ◽  
Diego Jarquin

The global growing population is experiencing challenges to satisfy the food chain supply in a world that faces rapid changes in environmental conditions complicating the development of stable cultivars. Emergent methodologies aided by molecular marker information such as marker assisted selection (MAS) and genomic selection (GS) have been widely adopted to assist the development of improved genotypes. In general, the implementation of GS is not straightforward, and it usually requires cross-validation studies to find the optimum set of factors (training set sizes, number of markers, quality controls, etc.) to use in real breeding applications. In most cases, these different scenarios (combination of several factors) vary just in the levels of a single factor keeping fixed the other levels of the other factors allowing the use of previously developed routines (code reuse). In this study we present a set of structured modules than are easily to assemble for constructing complex genomic prediction pipelines from scratch. Also, we proposed a novel method for selecting training-testing sets of similar sample sizes across different cross-validation schemes (CV2, predicting tested genotypes in observed environments; CV1, predicting untested genotypes in observed environments; CV0, predicting tested genotypes in novel environments; and CV00, predicting untested genotypes in novel environments). To show how our implementation works, we considered two real data sets. These correspond to selected samples of the USDA soybean collection (D1: 324 genotypes observed in 6 environments scored for 9 traits) and of the Soybean Nested Association Mapping (SoyNAM) experiment (D2: 324 genotypes observed in 6 environments scored for 6 traits). In addition, three prediction models which consider the effect of environments and lines (M1: E + L), environments, lines and main effect of markers (M2: E + L + G), and also the inclusion of the interaction between makers and environments (M3: E + L + G + G×E) were considered. The results confirm that under CV2 and CV1 schemes, moderate improvements in predictive ability can be obtained with the inclusion of the interaction component, while for CV0 mixed results were observed, and for CV00 no improvements were shown. However, for this last scenario the inclusion of weather and soil data potentially could enhance the results of the interaction model.


2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Theo Meuwissen ◽  
Irene van den Berg ◽  
Mike Goddard

Abstract Background Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method, that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions and assess its quantitative trait locus (QTL) mapping precision. Methods The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs, which are, a priori, most likely to be included into the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. Results The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Conclusions Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.


Author(s):  
Simon Rio ◽  
Deniz Akdemir ◽  
Tiago Carvalho ◽  
Julio Isidro y Sánchez

Abstract Key message New forms of the coefficient of determination can help to forecast the accuracy of genomic prediction and optimize experimental designs in multi-environment trials with genotype-by-environment interactions. Abstract In multi-environment trials, the relative performance of genotypes may vary depending on the environmental conditions, and this phenomenon is commonly referred to as genotype-by-environment interaction (G$$\times$$ × E). With genomic prediction, G$$\times$$ × E can be accounted for by modeling the genetic covariance between trials, even when the overall experimental design is highly unbalanced between trials, thanks to the genomic relationship between genotypes. In this study, we propose new forms of the coefficient of determination (CD, i.e., the expected model-based square correlation between a genetic value and its corresponding prediction) that can be used to forecast the genomic prediction reliability of genotypes, both for their trial-specific performance and their mean performance. As the expected prediction reliability based on these new CD criteria is generally a good approximation of the observed reliability, we demonstrate that they can be used to optimize multi-environment trials in the presence of G$$\times$$ × E. In addition, this reliability may be highly variable between genotypes, especially in unbalanced designs with complex pedigree relationships between genotypes. Therefore, it can be useful for breeders to assess it before selecting genotypes based on their predicted genetic values. Using a wheat population evaluated both for simulated and phenology traits, and two maize populations evaluated for grain yield, we illustrate this approach and confirm the value of our new CD criteria.


2020 ◽  
Vol 10 (8) ◽  
pp. 2629-2639
Author(s):  
Edna K. Mageto ◽  
Jose Crossa ◽  
Paulino Pérez-Rodríguez ◽  
Thanda Dhliwayo ◽  
Natalia Palacios-Rojas ◽  
...  

Zinc (Zn) deficiency is a major risk factor for human health, affecting about 30% of the world’s population. To study the potential of genomic selection (GS) for maize with increased Zn concentration, an association panel and two doubled haploid (DH) populations were evaluated in three environments. Three genomic prediction models, M (M1: Environment + Line, M2: Environment + Line + Genomic, and M3: Environment + Line + Genomic + Genomic x Environment) incorporating main effects (lines and genomic) and the interaction between genomic and environment (G x E) were assessed to estimate the prediction ability (rMP) for each model. Two distinct cross-validation (CV) schemes simulating two genomic prediction breeding scenarios were used. CV1 predicts the performance of newly developed lines, whereas CV2 predicts the performance of lines tested in sparse multi-location trials. Predictions for Zn in CV1 ranged from -0.01 to 0.56 for DH1, 0.04 to 0.50 for DH2 and -0.001 to 0.47 for the association panel. For CV2, rMP values ranged from 0.67 to 0.71 for DH1, 0.40 to 0.56 for DH2 and 0.64 to 0.72 for the association panel. The genomic prediction model which included G x E had the highest average rMP for both CV1 (0.39 and 0.44) and CV2 (0.71 and 0.51) for the association panel and DH2 population, respectively. These results suggest that GS has potential to accelerate breeding for enhanced kernel Zn concentration by facilitating selection of superior genotypes.


Sign in / Sign up

Export Citation Format

Share Document