Training Population Optimization for Genomic Selection in Miscanthus

Miscanthus is a perennial grass with potential for lignocellulosic ethanol production. To ensure its utility for this purpose, breeding efforts should focus on increasing genetic diversity of the nothospecies Miscanthus × giganteus (M×g) beyond the single clone used in many programs. Germplasm from the corresponding parental species M. sinensis (Msi) and M. sacchariflorus (Msa) could theoretically be used as training sets for genomic prediction of M×g clones with optimal genomic estimated breeding values for biofuel traits. To this end, we first showed that subpopulation structure makes a substantial contribution to the genomic selection (GS) prediction accuracies within a 538-member diversity panel of predominantly Msi individuals and a 598-member diversity panel of Msa individuals. We then assessed the ability of these two diversity panels to train GS models that predict breeding values in an interspecific diploid 216-member M×g F2 panel. Low and negative prediction accuracies were observed when various subsets of the two diversity panels were used to train these GS models. To overcome the drawback of having only one interspecific M×g F2 panel available, we also evaluated prediction accuracies for traits simulated in 50 simulated interspecific M×g F2 panels derived from different sets of Msi and diploid Msa parents. The results revealed that genetic architectures with common causal mutations across Msi and Msa yielded the highest prediction accuracies. Ultimately, these results suggest that the ideal training set should contain the same causal mutations segregating within interspecific M×g populations, and thus efforts should be undertaken to ensure that individuals in the training and validation sets are as closely related as possible.

Comparison of BLUP and Bayesian methods for different sizes of training population in genomic selection

TURKISH JOURNAL OF VETERINARY AND ANIMAL SCIENCES ◽

10.3906/vet-2001-52 ◽

2020 ◽

Vol 44 (5) ◽

pp. 994-1002

Author(s):

Samet Hasan ABACI ◽

Hasan ÖNDER

Keyword(s):

Genomic Selection ◽

Milk Yield ◽

Predictive Ability ◽

Breeding Value ◽

Training Population ◽

Breeding Values ◽

Value Prediction ◽

The Usa ◽

Population Sizes ◽

Estimated Breeding Values

This study compares the accuracy of pedigree-based and genomic breeding value prediction for different training population sizes. Bayes (A, B, C, Cπ) and GBLUP methods were used for genomic selection, and the BLUP method was used for pedigree-based selection. Genomic and pedigree-based breeding values were estimated for partial milk yield (158 days) of 400 Holstein cows from a private enterprise in the USA. For indirect breeding value estimation, the animals were divided into training (322–360) and test (78–40) populations. The animals were genotyped with a 54k SNP panel, and the marker file was encoded as –10, 0, and 10 for AA, AB, and BB marker genotypes, respectively. Bayes and GBLUP methods were run in GenSel 4.55 software with a total of 50,000 iterations, the first 5000 of which were discarded as burn-in. Pedigree-based breeding values were estimated by REML with an animal model in MTDFREML software. Correlations between partial milk yield and estimated breeding values were used to assess the predictive ability of the methods. The Bayes B method gave the highest accuracy for the indirect estimate of breeding value.
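The data handling described in this abstract can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the random split is an assumption (the abstract does not say how animals were assigned to the training and test populations), and the actual models were fitted in GenSel and MTDFREML, not in this code.

```python
import numpy as np

# Genotype coding stated in the abstract: AA -> -10, AB -> 0, BB -> 10.
GENOTYPE_CODES = {"AA": -10, "AB": 0, "BB": 10}


def encode_genotypes(calls):
    """Map a 2-D array-like of genotype strings to the -10/0/10 coding."""
    return np.array([[GENOTYPE_CODES[g] for g in row] for row in calls], dtype=float)


def split_train_test(n_animals, n_test, rng):
    """Randomly assign animal indices to training and test sets (assumed scheme)."""
    order = rng.permutation(n_animals)
    return order[n_test:], order[:n_test]


def predictive_ability(phenotypes, predictions):
    """Pearson correlation between observed phenotypes and predicted values."""
    return float(np.corrcoef(phenotypes, predictions)[0, 1])
```

For example, `split_train_test(400, 40, np.random.default_rng(0))` yields the 360/40 split at one end of the range reported above.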

Accuracies of univariate and multivariate genomic prediction models in African Cassava

10.1101/116301 ◽

2017 ◽

Author(s):

Uche Godfrey Okeke ◽

Deniz Akdemir ◽

Ismail Rabbi ◽

Peter Kulakow ◽

Jean-Luc Jannink

Keyword(s):

Genomic Selection ◽

Mixed Model ◽

Genetic Evaluation ◽

Environment Interaction ◽

Genotype By Environment Interactions ◽

Breeding Values ◽

Genotype By Environment ◽

Step Model ◽

One Step ◽

Estimated Breeding Values

List of abbreviations: GS, Genomic Selection; BLUP, Best Linear Unbiased Prediction; EBVs, Estimated Breeding Values; EGVs, Estimated Genetic Values; GEBVs, Genomic Estimated Breeding Values; SNPs, Single Nucleotide Polymorphisms; GxE, Genotype-by-environment interactions; GxG, Gene-by-gene interactions; GxGxE, Gene-by-gene-by-environment interactions; uT, Univariate single environment one-step model; uE, Univariate multi environment one-step model; MT, Multi-trait single environment one-step model; ME, Multivariate single trait multi environment model.

Background: Genomic selection (GS) promises to accelerate genetic gain in plant breeding programs, especially for long cycle crops like cassava. To practically implement GS in cassava breeding, it is useful to evaluate different GS models and to develop suitable models for an optimized breeding pipeline. Methods: We compared prediction accuracies from a single-trait (uT) and a multi-trait (MT) mixed model for single environment genetic evaluation (Scenario 1), while for multi-environment evaluation accounting for genotype-by-environment interaction (Scenario 2) we compared accuracies from a univariate (uE) and a multivariate (ME) multi-environment mixed model. We used sixteen years of data for six target cassava traits for these analyses. All models for Scenario 1 and Scenario 2 were based on the one-step approach. A 5-fold cross validation scheme with 10 repeat cycles was used to assess model prediction accuracies. Results: In Scenario 1, the MT models had higher prediction accuracies than the uT models for most traits and locations analyzed, amounting to 32 percent better prediction accuracy on average. In Scenario 2, the ME model had on average (across all locations and traits) 12 percent better predictive power than the uE model. Conclusion: We recommend the use of multivariate mixed models (MT and ME) for cassava genetic evaluation. These models may be useful for other plant species.
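The 5-fold, 10-repeat cross-validation scheme mentioned in this abstract can be sketched as follows. The closed-form ridge regression is an assumed stand-in, chosen only to keep the example self-contained; it is not the one-step mixed models (uT, MT, uE, ME) compared in the study. Function names and the penalty `lam` are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Closed-form ridge regression on centered data (illustrative stand-in model)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    n_markers = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta


def repeated_kfold_accuracy(X, y, k=5, repeats=10, lam=1.0, seed=0):
    """Return one prediction accuracy (Pearson r) per fold and repeat."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accuracies = []
    for _ in range(repeats):
        # Reshuffle the individuals for every repeat, then cut into k folds.
        folds = np.array_split(rng.permutation(n), k)
        for test_idx in folds:
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            preds = ridge_fit_predict(X[train_mask], y[train_mask], X[test_idx], lam)
            accuracies.append(np.corrcoef(y[test_idx], preds)[0, 1])
    return np.array(accuracies)
```

With the defaults this returns 50 accuracies (5 folds × 10 repeats), whose mean can serve as a summary of model performance. Whether the study averaged per-fold accuracies or pooled predictions within a repeat is not stated in the abstract.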

Shrinkage estimation of the genomic relationship matrix can improve genomic estimated breeding values in the training set

Theoretical and Applied Genetics ◽

10.1007/s00122-015-2464-6 ◽

2015 ◽

Vol 128 (4) ◽

pp. 693-703 ◽

Cited By ~ 14

Author(s):

Dominik Müller ◽

Frank Technow ◽

Albrecht E. Melchinger

Keyword(s):

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Shrinkage Estimation ◽

Genomic Relationship ◽

Training Set ◽

Breeding Values ◽

Genomic Estimated Breeding Values ◽

Estimated Breeding Values

Combined Multistage Linear Genomic Selection Indices To Predict the Net Genetic Merit in Plant Breeding

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401171 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2087-2101

Author(s):

J. Jesus Cerón-Rojas ◽

Jose Crossa

Keyword(s):

Genomic Selection ◽

Linear Combination ◽

Selection Index ◽

Selection Response ◽

Single Stage ◽

Multiple Traits ◽

Breeding Values ◽

Genetic Merit ◽

Estimated Breeding Values ◽

The Individual

A combined multistage linear genomic selection index (CMLGSI) is a linear combination of phenotypic and genomic estimated breeding values useful for predicting the individual net genetic merit, which in turn is a linear combination of the true unobservable breeding values of the traits weighted by their respective economic values. The CMLGSI is a cost-saving strategy for improving multiple traits because the breeder does not need to measure all traits at each stage. The optimum (OCMLGSI) and decorrelated (DCMLGSI) indices are the main CMLGSIs. Whereas the OCMLGSI takes into consideration the index correlation values among stages, the DCMLGSI imposes the restriction that the index correlation values among stages be zero. Using real and simulated datasets, we compared the efficiency of both indices in a two-stage context. The criteria we applied to compare the efficiency of both indices were that the total selection response of each index must be lower than or equal to the single-stage combined linear genomic selection index (CLGSI) response and that the correlation of each index with the net genetic merit should be maximum. Using four different total proportions for the real dataset, the estimated total OCMLGSI and DCMLGSI responses explained 97.5% and 90%, respectively, of the estimated single-stage CLGSI selection response. In addition, at stage two, the estimated correlations of the OCMLGSI and the DCMLGSI with the net genetic merit were 0.84 and 0.63, respectively. We found similar results for the simulated datasets. Thus, we recommend using the OCMLGSI when performing multistage selection.
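The two quantities this abstract builds on, the net genetic merit and the correlation of an index with it, can be illustrated with a short sketch. The code assumes a generic single-stage linear phenotypic index with residuals independent of breeding values, so that the covariance between index and merit is b'Gw; it does not implement the CMLGSI, OCMLGSI or DCMLGSI. All names (`P`, `G`, `w`, `b`) are hypothetical notation for the phenotypic covariance matrix, the genetic covariance matrix, the economic weights and the index coefficients.

```python
import numpy as np


def net_genetic_merit(breeding_values, economic_weights):
    """Net genetic merit: breeding values weighted by their economic values."""
    return np.asarray(breeding_values, dtype=float) @ np.asarray(economic_weights, dtype=float)


def index_merit_correlation(b, w, P, G):
    """Correlation between the index b'y and the merit w'g.

    Assumes y = g + e with e independent of g, so cov(b'y, w'g) = b'Gw,
    var(b'y) = b'Pb and var(w'g) = w'Gw.
    """
    b, w = np.asarray(b, dtype=float), np.asarray(w, dtype=float)
    return float(b @ G @ w / np.sqrt((w @ G @ w) * (b @ P @ b)))


def classical_index_coefficients(w, P, G):
    """Coefficients b = P^{-1} G w of the classical (Smith-Hazel) phenotypic index."""
    return np.linalg.solve(P, G @ np.asarray(w, dtype=float))
```

Under these assumptions the classical coefficients maximize the correlation with the net genetic merit, which is the same criterion the abstract uses to compare the multistage indices.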

Training Set Optimization for Sparse Phenotyping in Genomic Selection: A Conceptual Overview

Frontiers in Plant Science ◽

10.3389/fpls.2021.715910 ◽

2021 ◽

Vol 12 ◽

Author(s):

Julio Isidro y Sánchez ◽

Deniz Akdemir

Keyword(s):

Genomic Selection ◽

Lessons Learned ◽

Crop Breeding ◽

Training Population ◽

Training Set ◽

Breeding Programs ◽

Efficiency And Effectiveness ◽

Key Steps ◽

The Relationship ◽

Conceptual Overview

Genomic selection (GS) is becoming an essential tool in breeding programs due to its role in increasing genetic gain per unit time. The design of the training set (TRS) in GS is one of the key steps in the implementation of GS in plant and animal breeding programs, mainly because (i) TRS optimization is critical for the efficiency and effectiveness of GS, (ii) breeders test genotypes in multi-year and multi-location trials to select the best-performing ones, so TRS optimization can help to decrease the number of genotypes to be tested and, therefore, reduce phenotyping cost and time, and (iii) better prediction accuracies can be obtained from an optimally selected TRS than from an arbitrary TRS. In this article, we review the lessons learned from TRS optimization studies in plants and their impact on crop breeding, and we discuss important features for the success of TRS optimization under different scenarios. We also review the major challenges associated with the optimization of GS, including population size, the relationship between training and test set (TS), update of the TRS, and the use of different packages and algorithms for TRS implementation in GS. Finally, we describe general guidelines for improving the rate of genetic improvement by maximizing the use of TRS optimization in the GS framework.

Genomic selection strategies for clonally propagated crops

10.1101/2020.06.15.152017 ◽

2020 ◽

Cited By ~ 1

Author(s):

Christian R. Werner ◽

R. Chris Gaynor ◽

Daniel J. Sargent ◽

Alessandra Lillo ◽

Gregor Gorjanc ◽

...

Keyword(s):

Genomic Selection ◽

Genetic Gain ◽

Genomic Prediction ◽

Breeding Program ◽

Selection Methods ◽

Breeding Programs ◽

Breeding Values ◽

Parent Selection ◽

Estimated Breeding Values ◽

Selection Of

For genomic selection in clonal breeding programs to be effective, crossing parents should be selected based on genomic predicted cross performance unless dominance is negligible. Genomic prediction of cross performance enables a balanced exploitation of the additive and dominance value simultaneously. Here, we compared different strategies for the implementation of genomic selection in clonal plant breeding programs. We used stochastic simulations to evaluate six combinations of three breeding programs and two parent selection methods. The three breeding programs included i) a breeding program that introduced genomic selection in the first clonal testing stage, and ii) two variations of a two-part breeding program with one and three crossing cycles per year, respectively. The two parent selection methods were i) selection of parents based on genomic estimated breeding values, and ii) selection of parents based on genomic predicted cross performance. Selection of parents based on genomic predicted cross performance produced faster genetic gain than selection of parents based on genomic estimated breeding values because it substantially reduced inbreeding when the dominance degree increased. The two-part breeding programs with one and three crossing cycles per year using genomic prediction of cross performance always produced the most genetic gain unless dominance was negligible. We conclude that i) in clonal breeding programs with genomic selection, parents should be selected based on genomic predicted cross performance, and ii) a two-part breeding program with parent selection based on genomic predicted cross performance to rapidly drive population improvement has great potential to improve breeding clonally propagated crops.

Comparison of Marker-Based Genomic Estimated Breeding Values and Phenotypic Evaluation for Selection of Bacterial Spot Resistance in Tomato

Phytopathology ◽

10.1094/phyto-12-16-0431-r ◽

2018 ◽

Vol 108 (3) ◽

pp. 392-401 ◽

Cited By ~ 9

Author(s):

Debora Liabeuf ◽

Sung-Chur Sim ◽

David M. Francis

Keyword(s):

Prediction Models ◽

Phenotypic Selection ◽

Hybrid Population ◽

Bacterial Spot ◽

Training Population ◽

Prediction Ability ◽

Breeding Values ◽

Genomic Estimated Breeding Values ◽

Estimated Breeding Values ◽

Efficiency Of Selection

Bacterial spot affects tomato crops (Solanum lycopersicum) grown under humid conditions. Major genes and quantitative trait loci (QTL) for resistance have been described, and multiple loci from diverse sources need to be combined to improve disease control. We investigated genomic selection (GS) prediction models for resistance to Xanthomonas euvesicatoria and experimentally evaluated the accuracy of these models. The training population consisted of 109 families combining resistance from four sources and directionally selected from a population of 1,100 individuals. The families were evaluated on a plot basis in replicated inoculated trials and genotyped with single nucleotide polymorphisms (SNP). We compared the prediction ability of models developed with 14 to 387 SNP. Genomic estimated breeding values (GEBV) were derived using Bayesian least absolute shrinkage and selection operator regression (BL) and ridge regression (RR). Evaluations were based on leave-one-out cross validation and on empirical observations in replicated field trials using the next generation of inbred progeny and a hybrid population resulting from selections in the training population. Prediction ability was evaluated based on correlations between GEBV and phenotypes (rg), percentage of coselection between genomic and phenotypic selection, and relative efficiency of selection (rg/rp). Results were similar with BL and RR models. Models using only markers previously identified as significantly associated with resistance but weighted based on GEBV and mixed models with markers associated with resistance treated as fixed effects and markers distributed in the genome treated as random effects offered greater accuracy and a high percentage of coselection. The accuracy of these models to predict the performance of progeny and hybrids exceeded the accuracy of phenotypic selection.
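Two of the evaluation steps named in this abstract, leave-one-out cross validation of ridge regression GEBV and the percentage of coselection, can be sketched as below. Several details are assumptions made for illustration: the fixed penalty `lam`, the selected fraction, the convention that lower disease scores are better, and the definition of coselection as the overlap between the top sets chosen by GEBV and by phenotype. The Bayesian LASSO model from the study is not reproduced.

```python
import numpy as np


def loo_ridge_gebv(X, y, lam=1.0):
    """Leave-one-out GEBV from closed-form ridge regression (assumed penalty lam)."""
    n, n_markers = X.shape
    gebv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # train on every family except family i
        x_mean = X[keep].mean(axis=0)
        y_mean = y[keep].mean()
        Xc = X[keep] - x_mean
        beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ (y[keep] - y_mean))
        gebv[i] = y_mean + (X[i] - x_mean) @ beta
    return gebv


def coselection_percentage(gebv, phenotypes, fraction=0.2, lower_is_better=True):
    """Percent overlap between the sets selected on GEBV and on phenotype."""
    n_selected = max(1, int(round(fraction * len(gebv))))
    sign = 1.0 if lower_is_better else -1.0
    by_gebv = set(np.argsort(sign * np.asarray(gebv))[:n_selected].tolist())
    by_phenotype = set(np.argsort(sign * np.asarray(phenotypes))[:n_selected].tolist())
    return 100.0 * len(by_gebv & by_phenotype) / n_selected
```

The prediction ability described in the abstract would then be the correlation between the leave-one-out GEBV and the observed phenotypes, for instance `np.corrcoef(y, loo_ridge_gebv(X, y))[0, 1]`.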

P3027 Bioactivity of colostrum and milk exosomes containing microRNA from cows genetically selected as high, average and low immune responders based on their estimated breeding values

Journal of Animal Science ◽

10.2527/jas2016.94supplement465x ◽

2016 ◽

Vol 94 (suppl_4) ◽

pp. 65-66 ◽

Cited By ~ 1

Author(s):

M. Ross ◽

H. Atalla ◽

B. Mallard

Keyword(s):

Breeding Values ◽

Estimated Breeding Values

A note on genetic parameters and accuracy of estimated breeding values in honey bees

Genetics Selection Evolution ◽

10.1186/s12711-019-0510-6 ◽

2019 ◽

Vol 51 (1) ◽

Cited By ~ 6

Author(s):

Evert W. Brascamp ◽

Piter Bijma

Keyword(s):

Honey Bees ◽

Variance Components ◽

Genetic Parameters ◽

Additive Genetic Variance ◽

Breeding Value ◽

Phenotypic Variance ◽

Breeding Values ◽

Estimated Breeding Values ◽

Additive Genetic Relationship ◽

The Relationship

Background: In honey bees, observations are usually made on colonies. The phenotype of a colony is affected by the average breeding value for the worker effect of the thousands of workers in the colony (the worker group) and by the breeding value for the queen effect of the queen of the colony. Because the worker group consists of multiple individuals, interpretation of the variance components and heritabilities of phenotypes observed on the colony and of the accuracy of selection is not straightforward. The additive genetic variance among worker groups depends on the additive genetic relationship between the drone-producing queens (DPQ) that produce the drones that mate with the queen. Results: Here, we clarify how the relatedness between DPQ affects phenotypic variance, heritability and accuracy of the estimated breeding values of replacement queens. Second, we use simulation to investigate the effect of assumptions about the relatedness between DPQ in the base population on estimates of genetic parameters. Relatedness between DPQ in the base generation may differ considerably between populations because of their history. Conclusions: Our results show that estimates of (co)variance components and derived genetic parameters were seriously biased (25% too high or too low) when assumptions on the relationship between DPQ in the statistical analysis did not agree with reality.

Genomic selection in French dairy cattle

Animal Production Science ◽

10.1071/an11119 ◽

2012 ◽

Vol 52 (3) ◽

pp. 115 ◽

Cited By ~ 63

Author(s):

D. Boichard ◽

F. Guillaume ◽

A. Baur ◽

P. Croiseau ◽

M. N. Rossignol ◽

...

Keyword(s):

Linkage Disequilibrium ◽

Genomic Selection ◽

Reference Population ◽

Specific Reference ◽

Nucleotide Polymorphisms ◽

Genomic Evaluation ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased ◽

Estimated Breeding Values ◽

Restricted Use

Genomic selection is implemented in the French Holstein, Montbéliarde, and Normande breeds (70%, 16% and 12% of French dairy cows, respectively). A characteristic of the model for genomic evaluation is the use of haplotypes instead of single-nucleotide polymorphisms (SNPs), so as to maximise linkage disequilibrium between markers and quantitative trait loci (QTLs). For each trait, a QTL-BLUP model (i.e. a best linear unbiased prediction model including QTL random effects) includes 300–700 trait-dependent chromosomal regions selected either by linkage disequilibrium and linkage analysis or by elastic net. This model requires substantial effort to phase genotypes, detect QTLs, and select SNPs, but it was found to be the most efficient of all the models tested. QTLs are defined within breed and many of them were found to be breed specific. Reference populations include 1800 and 1400 bulls in the Montbéliarde and Normande breeds, respectively. In Holstein, the very large reference population of 18 300 bulls originates from the EuroGenomics consortium. Since 2008, ~65 000 animals have been genotyped for selection by Labogena with the 50k chip. Bull genomic estimated breeding values (GEBVs) were made official in June 2009. In 2010, the market share of the young bulls reached 30% and is expected to increase rapidly. Advertising actions have been undertaken to recommend a time-restricted use of young bulls with a limited number of doses. In January 2011, genomic selection was opened to all farmers for females. Current developments focus on the extension of the method to a multi-breed context, to use all reference populations simultaneously in genomic evaluation.
