Training Population Optimization for Genomic Selection in Miscanthus

2020 ◽  
Vol 10 (7) ◽  
pp. 2465-2476
Author(s):  
Marcus O. Olatoye ◽  
Lindsay V. Clark ◽  
Nicholas R. Labonte ◽  
Hongxu Dong ◽  
Maria S. Dwiyanti ◽  
...  

Miscanthus is a perennial grass with potential for lignocellulosic ethanol production. To ensure its utility for this purpose, breeding efforts should focus on increasing genetic diversity of the nothospecies Miscanthus × giganteus (M×g) beyond the single clone used in many programs. Germplasm from the corresponding parental species M. sinensis (Msi) and M. sacchariflorus (Msa) could theoretically be used as training sets for genomic prediction of M×g clones with optimal genomic estimated breeding values for biofuel traits. To this end, we first showed that subpopulation structure makes a substantial contribution to the genomic selection (GS) prediction accuracies within a 538-member diversity panel of predominantly Msi individuals and a 598-member diversity panel of Msa individuals. We then assessed the ability of these two diversity panels to train GS models that predict breeding values in an interspecific diploid 216-member M×g F2 panel. Low and negative prediction accuracies were observed when various subsets of the two diversity panels were used to train these GS models. To overcome the drawback of having only one interspecific M×g F2 panel available, we also evaluated prediction accuracies for traits simulated in 50 simulated interspecific M×g F2 panels derived from different sets of Msi and diploid Msa parents. The results revealed that genetic architectures with common causal mutations across Msi and Msa yielded the highest prediction accuracies. Ultimately, these results suggest that the ideal training set should contain the same causal mutations segregating within interspecific M×g populations, and thus efforts should be undertaken to ensure that individuals in the training and validation sets are as closely related as possible.

2020 ◽  
Vol 44 (5) ◽  
pp. 994-1002
Author(s):  
Samet Hasan ABACI ◽  
Hasan ÖNDER

This study aims to compare the accuracy of pedigree-based and genomic-based breeding value prediction for different training population sizes. Bayes (A, B, C, Cpi) and GBLUP methods were used for genomic selection, and the BLUP method was used for pedigree-based selection. Genomic and pedigree-based breeding values were estimated for partial milk yield (158 days) of Holstein cows (400 individuals) from a private enterprise in the USA. To this end, populations were created for indirect breeding value estimates as training (322–360) and test (78–40) populations. In animals genotyped with a 54k SNP chip, the marker file was encoded as –10, 0, and 10 for AA, AB, and BB marker genotypes, respectively. Bayes and GBLUP methods were performed using GenSel 4.55 software. A total of 50,000 iterations were used, with the first 5000 excluded as the burn-in. Pedigree-based breeding values were estimated by REML using MTDFREML software employing an animal model. Correlations between partial milk yield and estimated breeding values were used to assess the predictive ability of the methods. The Bayes B method gave the highest accuracy for the indirect estimate of breeding value.
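The marker coding, burn-in handling, and predictive-ability measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, NumPy stands in for the GenSel and MTDFREML software actually used, and averaging the post-burn-in samples is assumed as the way a posterior mean would be obtained.

```python
import numpy as np

# Coding stated in the abstract: AA -> -10, AB -> 0, BB -> 10.
GENOTYPE_CODES = {"AA": -10, "AB": 0, "BB": 10}

def encode_markers(genotypes):
    """Convert rows of genotype calls (e.g. "AA") into a numeric marker matrix."""
    return np.array([[GENOTYPE_CODES[g] for g in row] for row in genotypes], dtype=float)

def posterior_mean_after_burn_in(samples, burn_in):
    """Discard the first `burn_in` MCMC samples and average the remainder (assumed summary)."""
    return np.asarray(samples, dtype=float)[burn_in:].mean(axis=0)

def predictive_ability(phenotypes, ebv):
    """Pearson correlation between observed phenotypes and estimated breeding values."""
    return float(np.corrcoef(phenotypes, ebv)[0, 1])
```

With the settings reported in the abstract, `burn_in` would be 5000 out of 50,000 iterations.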


2017 ◽  
Author(s):  
Uche Godfrey Okeke ◽  
Deniz Akdemir ◽  
Ismail Rabbi ◽  
Peter Kulakow ◽  
Jean-Luc Jannink

List of abbreviations
GS: Genomic Selection
BLUP: Best Linear Unbiased Prediction
EBVs: Estimated Breeding Values
EGVs: Estimated Genetic Values
GEBVs: Genomic Estimated Breeding Values
SNPs: Single Nucleotide Polymorphisms
GxE: Genotype-by-environment interactions
GxG: Gene-by-gene interactions
GxGxE: Gene-by-gene-by-environment interactions
uT: Univariate single environment one-step model
uE: Univariate multi environment one-step model
MT: Multi-trait single environment one-step model
ME: Multivariate single trait multi environment model

Abstract

Background: Genomic selection (GS) promises to accelerate genetic gain in plant breeding programs, especially for long cycle crops like cassava. To practically implement GS in cassava breeding, it is useful to evaluate different GS models and to develop suitable models for an optimized breeding pipeline.

Methods: We compared prediction accuracies from a single-trait (uT) and a multi-trait (MT) mixed model for single environment genetic evaluation (Scenario 1), while for multi-environment evaluation accounting for genotype-by-environment interaction (Scenario 2) we compared accuracies from a univariate (uE) and a multivariate (ME) multi-environment mixed model. We used sixteen years of data for six target cassava traits for these analyses. All models for Scenario 1 and Scenario 2 were based on the one-step approach. A 5-fold cross validation scheme with 10-repeat cycles was used to assess model prediction accuracies.

Results: In Scenario 1, the MT models had higher prediction accuracies than the uT models for most traits and locations analyzed, amounting to 32 percent better prediction accuracy on average. For Scenario 2, we observed that the ME model had on average (across all locations and traits) 12 percent better predictive power than the uE model.

Conclusion: We recommend the use of multivariate mixed models (MT and ME) for cassava genetic evaluation. These models may be useful for other plant species.
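The 5-fold cross validation with 10 repeat cycles mentioned in the methods can be sketched as an index generator. This is a minimal illustration, not the authors' code: the function name, the random seed, and the use of simple random (unstratified) folds are all assumptions.

```python
import numpy as np

def repeated_kfold_indices(n, k=5, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold cross validation."""
    rng = np.random.default_rng(seed)
    for r in range(repeats):
        # Reshuffle the individuals at the start of every repeat cycle.
        folds = np.array_split(rng.permutation(n), k)
        for f in range(k):
            test_idx = folds[f]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != f])
            yield r, f, train_idx, test_idx
```

Each of the 50 train/test splits would be used to fit a model on `train_idx` and correlate predictions with observations on `test_idx`; the reported accuracy would then be an average over splits.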


2020 ◽  
Vol 10 (6) ◽  
pp. 2087-2101
Author(s):  
J. Jesus Cerón-Rojas ◽  
Jose Crossa

A combined multistage linear genomic selection index (CMLGSI) is a linear combination of phenotypic and genomic estimated breeding values useful for predicting the individual net genetic merit, which in turn is a linear combination of the true unobservable breeding values of the traits weighted by their respective economic values. The CMLGSI is a cost-saving strategy for improving multiple traits because the breeder does not need to measure all traits at each stage. The optimum (OCMLGSI) and decorrelated (DCMLGSI) indices are the main CMLGSIs. Whereas the OCMLGSI takes into consideration the index correlation values among stages, the DCMLGSI imposes the restriction that the index correlation values among stages be zero. Using real and simulated datasets, we compared the efficiency of both indices in a two-stage context. The criteria we applied to compare the efficiency of both indices were that the total selection response of each index must be lower than or equal to the single-stage combined linear genomic selection index (CLGSI) response and that the correlation of each index with the net genetic merit should be maximum. Using four different total proportions for the real dataset, the estimated total OCMLGSI and DCMLGSI responses explained 97.5% and 90%, respectively, of the estimated single-stage CLGSI selection response. In addition, at stage two, the estimated correlations of the OCMLGSI and the DCMLGSI with the net genetic merit were 0.84 and 0.63, respectively. We found similar results for the simulated datasets. Thus, we recommend using the OCMLGSI when performing multistage selection.
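The two linear combinations described above can be written compactly. The symbols are assumptions introduced here for illustration and do not come from the abstract: $\mathbf{a}$ is the vector of true breeding values, $\mathbf{w}$ the economic values, $\mathbf{y}$ the phenotypic values, $\boldsymbol{\gamma}$ the genomic estimated breeding values, and $\boldsymbol{\beta}_{y}$, $\boldsymbol{\beta}_{\gamma}$ the index coefficients.

```latex
H = \mathbf{w}^{\top}\mathbf{a}
\qquad\text{(net genetic merit)}

I = \boldsymbol{\beta}_{y}^{\top}\mathbf{y} + \boldsymbol{\beta}_{\gamma}^{\top}\boldsymbol{\gamma}
\qquad\text{(combined linear genomic selection index)}

\rho_{I_{1} I_{2}} = 0
\qquad\text{(restriction imposed by the DCMLGSI on the stage-one and stage-two indices)}
```

In this notation, the OCMLGSI leaves $\rho_{I_{1} I_{2}}$ unrestricted and accounts for it when computing the stage-two response.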


2021 ◽  
Vol 12 ◽  
Author(s):  
Julio Isidro y Sánchez ◽  
Deniz Akdemir

Genomic selection (GS) is becoming an essential tool in breeding programs due to its role in increasing genetic gain per unit time. The design of the training set (TRS) in GS is one of the key steps in the implementation of GS in plant and animal breeding programs mainly because (i) TRS optimization is critical for the efficiency and effectiveness of GS, (ii) breeders test genotypes in multi-year and multi-location trials to select the best-performing ones. In this framework, TRS optimization can help to decrease the number of genotypes to be tested and, therefore, reduce phenotyping cost and time, and (iii) we can obtain better prediction accuracies from optimally selected TRS than an arbitrary TRS. Here, we concentrate the efforts on reviewing the lessons learned from TRS optimization studies and their impact on crop breeding and discuss important features for the success of TRS optimization under different scenarios. In this article, we review the lessons learned from training population optimization in plants and the major challenges associated with the optimization of GS including population size, the relationship between training and test set (TS), update of TRS, and the use of different packages and algorithms for TRS implementation in GS. Finally, we describe general guidelines to improving the rate of genetic improvement by maximizing the use of the TRS optimization in the GS framework.


Author(s):  
Christian R. Werner ◽  
R. Chris Gaynor ◽  
Daniel J. Sargent ◽  
Alessandra Lillo ◽  
Gregor Gorjanc ◽  
...  

Abstract: For genomic selection in clonal breeding programs to be effective, crossing parents should be selected based on genomic predicted cross performance unless dominance is negligible. Genomic prediction of cross performance enables a balanced exploitation of the additive and dominance value simultaneously. Here, we compared different strategies for the implementation of genomic selection in clonal plant breeding programs. We used stochastic simulations to evaluate six combinations of three breeding programs and two parent selection methods. The three breeding programs included i) a breeding program that introduced genomic selection in the first clonal testing stage, and ii) two variations of a two-part breeding program with one and three crossing cycles per year, respectively. The two parent selection methods were i) selection of parents based on genomic estimated breeding values, and ii) selection of parents based on genomic predicted cross performance. Selection of parents based on genomic predicted cross performance produced faster genetic gain than selection of parents based on genomic estimated breeding values because it substantially reduced inbreeding when the dominance degree increased. The two-part breeding programs with one and three crossing cycles per year using genomic prediction of cross performance always produced the most genetic gain unless dominance was negligible. We conclude that i) in clonal breeding programs with genomic selection, parents should be selected based on genomic predicted cross performance, and ii) a two-part breeding program with parent selection based on genomic predicted cross performance to rapidly drive population improvement has great potential to improve breeding clonally propagated crops.


2018 ◽  
Vol 108 (3) ◽  
pp. 392-401 ◽  
Author(s):  
Debora Liabeuf ◽  
Sung-Chur Sim ◽  
David M. Francis

Bacterial spot affects tomato crops (Solanum lycopersicum) grown under humid conditions. Major genes and quantitative trait loci (QTL) for resistance have been described, and multiple loci from diverse sources need to be combined to improve disease control. We investigated genomic selection (GS) prediction models for resistance to Xanthomonas euvesicatoria and experimentally evaluated the accuracy of these models. The training population consisted of 109 families combining resistance from four sources and directionally selected from a population of 1,100 individuals. The families were evaluated on a plot basis in replicated inoculated trials and genotyped with single nucleotide polymorphisms (SNP). We compared the prediction ability of models developed with 14 to 387 SNP. Genomic estimated breeding values (GEBV) were derived using Bayesian least absolute shrinkage and selection operator regression (BL) and ridge regression (RR). Evaluations were based on leave-one-out cross validation and on empirical observations in replicated field trials using the next generation of inbred progeny and a hybrid population resulting from selections in the training population. Prediction ability was evaluated based on correlations between GEBV and phenotypes (rg), percentage of coselection between genomic and phenotypic selection, and relative efficiency of selection (rg/rp). Results were similar with BL and RR models. Models using only markers previously identified as significantly associated with resistance but weighted based on GEBV and mixed models with markers associated with resistance treated as fixed effects and markers distributed in the genome treated as random effects offered greater accuracy and a high percentage of coselection. The accuracy of these models to predict the performance of progeny and hybrids exceeded the accuracy of phenotypic selection.
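The evaluation scheme described above (ridge regression GEBV, leave-one-out cross validation, correlation with phenotypes, and coselection) can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function names and the penalty `lam` are hypothetical, coselection is assumed to mean the overlap between the top fraction ranked by GEBV and by phenotype, and higher values are assumed to be the ones selected.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta

def loocv_gebv(X, y, lam=1.0):
    """Predict each individual from a model trained on all the others."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, intercept = ridge_fit(X[keep], y[keep], lam)
        preds[i] = X[i] @ beta + intercept
    return preds

def prediction_ability(gebv, phenotypes):
    """Pearson correlation between GEBV and observed phenotypes."""
    return float(np.corrcoef(gebv, phenotypes)[0, 1])

def coselection_percentage(gebv, phenotypes, fraction=0.2):
    """Percent overlap between the top `fraction` selected by GEBV and by phenotype."""
    k = max(1, int(round(fraction * len(gebv))))
    top_g = set(np.argsort(gebv)[-k:].tolist())
    top_p = set(np.argsort(phenotypes)[-k:].tolist())
    return 100.0 * len(top_g & top_p) / k
```

For a disease score where lower is better, the ranking in `coselection_percentage` would need to be reversed.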


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Evert W. Brascamp ◽  
Piter Bijma

Abstract

Background: In honey bees, observations are usually made on colonies. The phenotype of a colony is affected by the average breeding value for the worker effect of the thousands of workers in the colony (the worker group) and by the breeding value for the queen effect of the queen of the colony. Because the worker group consists of multiple individuals, interpretation of the variance components and heritabilities of phenotypes observed on the colony and of the accuracy of selection is not straightforward. The additive genetic variance among worker groups depends on the additive genetic relationship between the drone-producing queens (DPQ) that produce the drones that mate with the queen.

Results: Here, we clarify how the relatedness between DPQ affects phenotypic variance, heritability and accuracy of the estimated breeding values of replacement queens. Second, we use simulation to investigate the effect of assumptions about the relatedness between DPQ in the base population on estimates of genetic parameters. Relatedness between DPQ in the base generation may differ considerably between populations because of their history.

Conclusions: Our results show that estimates of (co)variance components and derived genetic parameters were seriously biased (25% too high or too low) when assumptions on the relationship between DPQ in the statistical analysis did not agree with reality.


2012 ◽  
Vol 52 (3) ◽  
pp. 115 ◽  
Author(s):  
D. Boichard ◽  
F. Guillaume ◽  
A. Baur ◽  
P. Croiseau ◽  
M. N. Rossignol ◽  
...  

Genomic selection is implemented in French Holstein, Montbéliarde, and Normande breeds (70%, 16% and 12% of French dairy cows). A characteristic of the model for genomic evaluation is the use of haplotypes instead of single-nucleotide polymorphisms (SNPs), so as to maximise linkage disequilibrium between markers and quantitative trait loci (QTLs). For each trait, a QTL-BLUP model (i.e. a best linear unbiased prediction model including QTL random effects) includes 300–700 trait-dependent chromosomal regions selected either by linkage disequilibrium and linkage analysis or by elastic net. This model requires a substantial effort to phase genotypes, detect QTLs, and select SNPs, but was found to be the most efficient among those tested. QTLs are defined within breed and many of them were found to be breed specific. Reference populations include 1800 and 1400 bulls in Montbéliarde and Normande breeds. In Holstein, the very large reference population of 18 300 bulls originates from the EuroGenomics consortium. Since 2008, ~65 000 animals have been genotyped for selection by Labogena with the 50k chip. Genomic estimated breeding values (GEBVs) for bulls were made official in June 2009. In 2010, the market share of the young bulls reached 30% and is expected to increase rapidly. Advertising actions have been undertaken to recommend a time-restricted use of young bulls with a limited number of doses. In January 2011, genomic selection was opened to all farmers for females. Current developments focus on the extension of the method to a multi-breed context, to use all reference populations simultaneously in genomic evaluation.

