Genomic prediction with allele dosage information in highly polyploid species

Author(s):  
Lorena G. Batista ◽  
Victor H. Mello ◽  
Anete P. Souza ◽  
Gabriel R. A. Margarido
2019 ◽  
pp. g3.400059.2019 ◽  
Author(s):  
Ivone de Bem Oliveira ◽  
Marcio F. R. Resende ◽  
Luis Felipe V. Ferrão ◽  
Rodrigo R. Amadeu ◽  
Jeffrey B. Endelman ◽  
...  

2021 ◽  
Author(s):  
Luís Felipe V. Ferrão ◽  
Rodrigo R. Amadeu ◽  
Juliana Benevenuto ◽  
Ivone de Bem Oliveira ◽  
Patricio R. Munoz

Abstract: Blueberry (Vaccinium corymbosum and hybrids) is a specialty crop, with expanding production and consumption worldwide. The blueberry breeding program at the University of Florida (UF) has greatly contributed to the expansion of production areas by developing low-chilling cultivars better adapted to subtropical and Mediterranean climates of the globe. The breeding program has historically focused on phenotypic recurrent selection. As an autopolyploid, outcrossing, perennial, long juvenile phase crop, blueberry’s breeding cycles are costly and time-consuming, which results in low genetic gains per unit of time. Motivated by the application of molecular markers for a more accurate selection in early stages of breeding, we performed pioneering genomic prediction studies and optimization for implementation in the blueberry breeding program. We have also addressed some complexities of sequence-based genotyping and model parametrization for an autopolyploid crop, providing empirical contributions that can be extended to other polyploid species. We herein revisited some of our previous genomic prediction studies and described the current achievements in the crop. In this paper, our contribution for genomic prediction in an autotetraploid crop is three-fold: i) summarize previous results on the relevance of model parametrizations, such as diploid or polyploid methods, and inclusion of dominance effects; ii) assess the importance of sequence depth of coverage and genotype dosage calling steps; iii) demonstrate the real impact of genomic selection on leveraging breeding decisions by using an independent validation set. Altogether, we propose a strategy for the use of genomic selection in blueberry, with potential to be applied to other polyploid species of a similar background.


2019 ◽  
Vol 39 (7) ◽  
Author(s):  
Filipe Inácio Matias ◽  
Filipe Couto Alves ◽  
Karem Guimarães Xavier Meireles ◽  
Sanzio Carvalho Lima Barrios ◽  
Cacilda Borges do Valle ◽  
...  

2021 ◽  
Author(s):  
Lorena Batista ◽  
Victor H Mello ◽  
Anete Pereira de Souza ◽  
Gabriel RA Margarido

Several studies have shown how to leverage allele dosage information to improve the accuracy of genomic selection models in autotetraploids. In this study we expanded the methodology used for genomic selection in autotetraploids to higher (and mixed) ploidy levels. We adapted the models to build covariance matrices of both additive and digenic dominance effects that are subsequently used in genomic selection models. We applied these models using estimates of ploidy and allele dosage to sugarcane and sweet potato datasets and validated our results by also applying the models to simulated data. For the simulated datasets, including allele dosage information led to up to 140% higher mean predictive abilities in comparison to using diploidized markers. Including dominance effects was highly advantageous when using diploidized markers, leading to mean predictive abilities which were up to 115% higher in comparison to only including additive effects. When the frequency of heterozygous genotypes in the population was low, such as in the sugarcane and sweet potato datasets, there was little advantage in including allele dosage information in the models. Overall, we show that including allele dosage can improve genomic selection in highly polyploid species under higher frequency of different heterozygous genotypic classes and high dominance degree levels.
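As a rough illustration of the contrast drawn in this abstract between full allele dosage and diploidized markers, the sketch below builds an additive relationship matrix from dosage counts. It assumes a VanRaden-style construction extended to a single, uniform ploidy level; this is a common textbook form and is not claimed to be the authors' exact model, which also covers mixed ploidy and digenic dominance. The function names `diploidize` and `additive_relationship` are hypothetical.

```python
def diploidize(dosage, ploidy):
    """Collapse a dosage (0..ploidy) into diploid-like coding:
    0 = nulliplex, 2 = fully homozygous, 1 = any heterozygous class."""
    if dosage == 0:
        return 0
    if dosage == ploidy:
        return 2
    return 1


def additive_relationship(dosages, ploidy):
    """VanRaden-style additive relationship matrix (assumed form).

    dosages: list of rows (individuals), each a list of allele counts
             in 0..ploidy, one per marker, with no missing values.
    Assumes at least one marker is polymorphic, otherwise the
    scaling factor would be zero.
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency per marker, estimated from the sample itself.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n) for j in range(m)]
    # Center each dosage by its expected value, ploidy * p.
    centered = [[row[j] - ploidy * freqs[j] for j in range(m)] for row in dosages]
    # Scale by the summed expected variance of the dosage across markers.
    scale = ploidy * sum(p * (1 - p) for p in freqs)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Toy hexaploid example: three individuals, two markers.
hexaploid = [[0, 6], [3, 3], [6, 0]]
G = additive_relationship(hexaploid, ploidy=6)

# The same data after diploidization, analysed as if diploid.
collapsed = [[diploidize(d, 6) for d in row] for row in hexaploid]
G_diploidized = additive_relationship(collapsed, ploidy=2)
```

In this toy case every heterozygous class (dosages 1 to 5) would map to the same diploidized code, which is the information loss the abstract refers to; with real data the two matrices generally differ.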


2018 ◽  
Author(s):  
Ivone de Bem Oliveira ◽  
Marcio F. R. Resende ◽  
Luis Felipe V. Ferrão ◽  
Rodrigo R. Amadeu ◽  
Jeffrey B. Endelman ◽  
...  

Abstract: Estimation of allele dosage in autopolyploids is challenging and current methods often result in the misclassification of genotypes. Here we propose and compare the use of next generation sequencing read depth as continuous parameterization for autotetraploid genomic prediction of breeding values, using blueberry (Vaccinium corymbosum spp.) as a model. Additionally, we investigated the influence of different sources of information to build relationship matrices in phenotype prediction: no relationship, pedigree, and genomic information, considering either diploid or tetraploid parameterizations. A real breeding population composed of 1,847 individuals was phenotyped for eight yield and fruit quality traits over two years. Analyses were based on extensive pedigree (since 1908) and high-density marker data (86K markers). Our results show that marker-based matrices can yield significantly better prediction than pedigree for most of the traits, based on model fitting and expected genetic gain. Continuous genotypic based models performed as well as the current best models and presented a significantly better goodness-of-fit for all traits analyzed. This approach also reduces the computational time required for marker calling and avoids problems associated with misclassification of genotypic classes when assigning dosage in polyploid species. Accuracies are encouraging for application of genomic selection (GS) for blueberry breeding. Conservatively, GS could reduce the time for cultivar release by three years. GS could increase the genetic gain per cycle by 86% on average when compared to phenotypic selection, and 32% when compared with pedigree-based selection.


2019 ◽  
Vol 15 ◽  
pp. 117693431983130 ◽  
Author(s):  
Diego Jarquín ◽  
Reka Howard ◽  
George Graef ◽  
Aaron Lorenz

An important and broadly used tool for selection purposes and to increase yield and genetic gain in plant breeding programs is genomic prediction (GP). Genomic prediction is a technique where molecular marker information and phenotypic data are used to predict the phenotype (eg, yield) of individuals for which only marker data are available. Higher prediction accuracy can be achieved not only by using efficient models but also by using quality molecular marker and phenotypic data. The steps of a typical quality control (QC) of marker data include the elimination of markers with a certain level of minor allele frequency (MAF) and missing marker values and the imputation of missing marker values. In this article, we evaluated how the prediction accuracy is influenced by the combination of 12 MAF values, 27 different percentages of missing marker values, and 2 imputation techniques (IT; naïve and Random Forest (RF)). We constructed a response surface of prediction accuracy values for the two ITs as a function of MAF and percentage of missing marker values using soybean data from the University of Nebraska–Lincoln Soybean Breeding Program. We found that both the genetic architecture of the trait and the IT affect the prediction accuracy, implying that we have to be careful how we perform QC on the marker data. For the corresponding combinations of MAF and percentage of missing values, we observed that implementing the RF imputation increased the number of markers by 2 to 5 times relative to the simple naïve imputation method, which is based on the mean allele dosage of the non-missing values at each locus. We conclude that there is not a unique strategy (combination of the QCs and imputation method) that outperforms the results of the others for all traits.
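The QC steps named in this abstract (a missing-value filter, a MAF filter, and naïve imputation by the mean allele dosage of the non-missing values at each locus) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `qc_and_impute`, the default thresholds, and the order of the filters are all hypothetical, and missing genotypes are assumed to be coded as `None`.

```python
def qc_and_impute(genotypes, ploidy=2, min_maf=0.05, max_missing=0.2):
    """Filter markers by missing rate and MAF, then mean-impute.

    genotypes: list of rows (individuals); each entry is an allele
               dosage in 0..ploidy, or None when missing.
    Returns (imputed_matrix, kept_marker_indices).
    Thresholds are illustrative defaults, not values from the study.
    """
    n_ind = len(genotypes)
    n_markers = len(genotypes[0])
    kept, columns = [], []
    for j in range(n_markers):
        col = [row[j] for row in genotypes]
        observed = [g for g in col if g is not None]
        missing_rate = 1 - len(observed) / n_ind
        # Drop markers with too many missing calls (or none observed).
        if not observed or missing_rate > max_missing:
            continue
        # Allele frequency from observed dosages, folded to the minor allele.
        freq = sum(observed) / (ploidy * len(observed))
        maf = min(freq, 1 - freq)
        if maf < min_maf:
            continue
        # Naive imputation: replace missing calls by the marker's mean dosage.
        mean_dosage = sum(observed) / len(observed)
        columns.append([mean_dosage if g is None else g for g in col])
        kept.append(j)
    # Transpose the retained columns back to individuals x markers.
    imputed = [[c[i] for c in columns] for i in range(n_ind)]
    return imputed, kept


# Toy diploid data: 4 individuals x 4 markers, None marks a missing call.
toy = [
    [0, 2, None, 1],
    [1, 2, 0, None],
    [2, 2, 1, None],
    [1, 2, None, None],
]
imputed, kept = qc_and_impute(toy, ploidy=2, min_maf=0.05, max_missing=0.5)
```

Here marker 1 is monomorphic and fails the MAF filter, and marker 3 exceeds the missing-value threshold, so only markers 0 and 2 survive. A Random Forest imputation would replace the mean-dosage step with a model-based prediction and is not shown.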


2021 ◽  
Vol 245 ◽  
pp. 104421
Author(s):  
Rosiane P. Silva ◽  
Rafael Espigolan ◽  
Mariana P. Berton ◽  
Raysildo B. Lôbo ◽  
Cláudio U. Magnabosco ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fatemeh Amini ◽  
Felipe Restrepo Franco ◽  
Guiping Hu ◽  
Lizhi Wang

Abstract: Recent advances in genomic selection (GS) have demonstrated the importance of not only the accuracy of genomic prediction but also the intelligence of selection strategies. The look ahead selection algorithm, for example, has been found to significantly outperform the widely used truncation selection approach in terms of genetic gain, thanks to its strategy of selecting breeding parents that may not necessarily be elite themselves but have the best chance of producing elite progeny in the future. This paper presents the look ahead trace back algorithm as a new variant of the look ahead approach, which introduces several improvements to further accelerate genetic gain, especially under imperfect genomic prediction. Perhaps an even more significant contribution of this paper is the design of opaque simulators for evaluating the performance of GS algorithms. These simulators are partially observable, explicitly capture both additive and non-additive genetic effects, and simulate uncertain recombination events more realistically. In contrast, most existing GS simulation settings are transparent, either explicitly or implicitly allowing the GS algorithm to exploit certain critical information that may not be available in actual breeding programs. Comprehensive computational experiments were carried out using a maize data set to compare a variety of GS algorithms under four simulators with different levels of opacity. These results reveal how differently the same GS algorithm would interact with different simulators, suggesting the need for continued research in the design of more realistic simulators. As long as GS algorithms continue to be trained in silico rather than in planta, the best way to avoid disappointing discrepancy between their simulated and actual performances may be to make the simulator as akin to the complex and opaque nature as possible.


2021 ◽  
Vol 41 (2) ◽  
Author(s):  
Eduardo Beche ◽  
Jason D. Gillman ◽  
Qijian Song ◽  
Randall Nelson ◽  
Tim Beissinger ◽  
...  
