Incorporating selfing to purge deleterious alleles in a cassava genomic selection program

2020 ◽  
Author(s):  
Mohamed Somo ◽  
Jean-Luc Jannink

Abstract Cassava has been found to carry high levels of recessive deleterious mutations and is known to suffer from inbreeding depression. Breeders therefore consider specific approaches to decrease cassava’s genetic load. Using self-fertilization to unmask deleterious recessive alleles, and thereby accelerate their purging, is one possibility. Before implementing this approach, we sought to better understand its consequences through simulation. Founder populations with high directional dominance were simulated using a natural selection forward simulator. The founder population was then subjected to five generations of genomic selection in schemes that did or did not include a generation of phenotypic selection on selfed progeny. We found that genomic selection was less effective under the directional dominance model than under the additive models that have commonly been used in simulations. While selection did increase favorable allele frequencies, increased inbreeding during selection caused decreased gain in genotypic values under directional dominance. While purging selection on selfed individuals was effective in the first breeding cycle, it was not effective in later cycles, an effect we attributed to the fact that the generation of selfing decreased the relatedness of the genomic prediction training population to the selection candidates. That decreased relatedness caused genomic prediction accuracy to be lower in schemes incorporating selfing. We found that selection on individuals partially inbred by one generation of selfing did increase the mean genetic value of the partially inbred population, but that this gain was accompanied by a relatively small increase in favorable allele frequencies, such that improvement in the outbred population was lower than might have been intuited.
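The interplay between inbreeding and directional dominance described in this abstract can be illustrated with a single-locus calculation. The following is a minimal sketch, not the authors' simulator: it assumes one biallelic locus with the textbook genotypic values +a, d, and -a, and all function names and parameter values are hypothetical. It shows that raising the inbreeding coefficient F (0.5 after one generation of selfing from an outbred population) lowers the population mean when d > 0, even though the favorable allele frequency p is unchanged.

```python
def genotype_freqs(p, F):
    """Frequencies of (AA, Aa, aa) for favorable allele frequency p
    and inbreeding coefficient F (F = 0 is random mating)."""
    q = 1.0 - p
    return (p * p + F * p * q, 2.0 * p * q * (1.0 - F), q * q + F * p * q)


def mean_genotypic_value(p, F, a, d):
    """Population mean with genotypic values AA = +a, Aa = d, aa = -a."""
    f_AA, f_Aa, f_aa = genotype_freqs(p, F)
    return a * f_AA + d * f_Aa - a * f_aa


# Hypothetical locus: complete dominance of the favorable allele (d = a).
p, a, d = 0.5, 1.0, 1.0
outbred = mean_genotypic_value(p, F=0.0, a=a, d=d)
selfed = mean_genotypic_value(p, F=0.5, a=a, d=d)  # one generation of selfing
print("outbred mean:", outbred)
print("selfed mean: ", selfed)
print("inbreeding depression:", outbred - selfed)
```

With d = 0 (a purely additive locus) the two means coincide, which is one way to see why additive simulations do not show the loss of gain that the abstract reports under directional dominance.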

2021 ◽  
Author(s):  
Ao Zhang ◽  
Shan Chen ◽  
Zhenhai Cui ◽  
Yubo Liu ◽  
Yuan Guan ◽  
...  

Abstract Drought tolerance in maize is a complex and polygenic trait, especially at the seedling stage. In plant breeding, such traits can be improved by genomic selection (GS), which has become a practical and effective tool. In the present study, a natural maize population named the Northeast China core population (NCCP), consisting of 379 inbred lines, was genotyped with the diversity arrays technology (DArT) and genotyping-by-sequencing (GBS) platforms. The target traits of seedling emergence rate (ER), seedling plant height (SPH), and grain yield (GY) were evaluated under two natural drought environments in northeast China. Although adequate genetic variants were found for genomic selection, they were not stable enough between the two years. Similarly, the heritability of the three traits was not stable enough: the heritabilities in 2019 (0.88, 0.82, 0.85 for ER, SPH, GY) were higher than those in 2020 (0.65, 0.53, 0.33) and across the two years (0.32, 0.26, 0.33). The study obtained two kinds of marker sets: SilicoDArT markers from DArT-seq, and SNPs from GBS and DArT-seq. In total, 11,865 SilicoDArT markers, 7,837 DArT SNPs, and 91,003 GBS SNPs were used for analysis after quality control. The phylogenetic trees showed that the population was rich in consanguinity. The average prediction accuracies estimated using the DArT SNP dataset under the 2-fold cross-validation scheme were 0.27, 0.19, and 0.33 for ER, SPH, and GY, respectively. The SilicoDArT results were close to those of the DArT-seq SNPs, at 0.26, 0.22, and 0.33. For SPH, the prediction accuracy using SilicoDArT was higher than that using DArT SNPs; in some cases, alignment to the reference genome results in a loss of prediction accuracy. For the trait with lower heritability, prediction accuracy can be improved by filtering markers on linkage disequilibrium. 
For the same trait, the prediction accuracy estimated with the two types of DArT markers was consistently higher than that estimated with the GBS SNPs at the same genotyping cost. Our results show that prediction accuracy improved in some cases when population structure and marker quality were controlled, even when marker density was reduced. In the initial maize breeding cycle, SilicoDArT markers can achieve higher prediction accuracy at a lower cost. However, higher marker density platforms, i.e., GBS, may play a role in subsequent breeding cycles over the long term. The natural drought experimental station can reduce the difficulty of phenotypic identification in a water-scarce environment. The accumulation of more yearly data will help to stabilize the heritability and improve predictive accuracy in maize breeding. The experimental design and model for drought resistance also need to be further developed.
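The prediction accuracies quoted in this abstract come from k-fold cross-validation, in which accuracy is the correlation between predicted and observed phenotypes of the held-out lines. The following is a minimal sketch of that procedure under stated assumptions: the abstract does not name the prediction model, so a simple per-marker regression is used here as a hypothetical stand-in, and the toy data, function names, and seed are illustrative only.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def marginal_effects(X, y):
    """Stand-in model: least-squares slope of each marker fitted alone."""
    n = len(y)
    my = sum(y) / n
    effects = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mj = sum(col) / n
        sxx = sum((c - mj) ** 2 for c in col)
        sxy = sum((c - mj) * (v - my) for c, v in zip(col, y))
        effects.append(sxy / sxx if sxx > 0 else 0.0)
    return effects


def k_fold_accuracy(X, y, k=2, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        b = marginal_effects([X[i] for i in train], [y[i] for i in train])
        pred = [sum(bj * xj for bj, xj in zip(b, X[i])) for i in test]
        accuracies.append(pearson(pred, [y[i] for i in test]))
    return sum(accuracies) / k


# Toy data: one marker coded 0/1/2 that fully determines the phenotype.
X = [[i % 3] for i in range(12)]
y = [2.0 * row[0] + 1.0 for row in X]
print("2-fold accuracy on noise-free toy data:", k_fold_accuracy(X, y, k=2))
```

On real data the phenotype is noisy and the training set is halved under 2-fold cross-validation, which is one reason accuracies such as those reported here sit well below 1.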


2018 ◽  
Author(s):  
Zhi-Qiang Chen ◽  
John Baison ◽  
Jin Pan ◽  
Bo Karlsson ◽  
Bengt Andersson Gull ◽  
...  

Abstract Background Genomic selection (GS) can increase genetic gain by reducing the length of the breeding cycle in forest trees. Here we genotyped 1370 control-pollinated progeny trees from 128 full-sib families in Norway spruce (Picea abies (L.) Karst.), using exome capture as a genotyping platform. We used 116,765 high-quality SNPs to develop genomic prediction models for tree height and wood quality traits. We assessed the impact of different genomic prediction methods, genotype-by-environment interaction (G×E), genetic composition, size of the training and validation sets, relatedness, and the number of SNPs on the accuracy and predictive ability (PA) of GS. Results Using the G matrix slightly altered heritability estimates relative to the pedigree-based method. GS accuracies were about 11–14% lower than those of pedigree-based selection. The efficiency of GS per year varied from 1.71 to 1.78 relative to that of the pedigree-based model if breeding cycle length was halved using GS. Height GS accuracy decreased by more than 30% when one site was used as training to predict the second site, indicating that G×E for tree height should be accommodated in model fitting. Using a half-sib family structure instead of full-sib led to a significant reduction in GS accuracy and PA. The full-sib family structure needed only 750 markers to reach similar accuracy and PA as the 100,000 markers required for the half-sib family structure, indicating that maintaining high relatedness in the model improves accuracy and PA. Using 4000–8000 markers in the full-sib family structure was sufficient to obtain GS model accuracy and PA for tree height and wood quality traits almost equivalent to those obtained with all markers. Conclusions The study indicates that GS would be efficient in reducing the generation time of a breeding cycle in conifer tree breeding programs that require long-term progeny testing. 
A sufficient number of trees within family (16 for growth and 12 for wood quality traits) and number of SNPs (8000) are required for GS with full-sib family relationships. GS methods had little impact on GS efficiency for growth and wood quality traits. GS models should incorporate the G×E effect when a strong G×E is detected.
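One way to read the per-year efficiency figures in this abstract is as the ratio of accuracies divided by the ratio of cycle lengths, assuming selection intensity and genetic variance are the same for both methods. The sketch below works through that arithmetic. The function name is hypothetical, and the accuracy ratios 0.855 and 0.89 are illustrative inputs chosen because they fall roughly within the "about 11–14% lower" band; they are not values taken from the study.

```python
def relative_efficiency_per_year(accuracy_ratio, cycle_length_ratio):
    """Gain per year of GS relative to pedigree-based selection.

    accuracy_ratio     : GS accuracy / pedigree accuracy
    cycle_length_ratio : GS cycle length / pedigree cycle length
    Assumes equal selection intensity and genetic variance for both methods.
    """
    if cycle_length_ratio <= 0:
        raise ValueError("cycle_length_ratio must be positive")
    return accuracy_ratio / cycle_length_ratio


# Halving the breeding cycle (ratio 0.5) with illustrative accuracy ratios.
for ratio in (0.855, 0.89):
    print(ratio, "->", round(relative_efficiency_per_year(ratio, 0.5), 2))
```

Under these assumptions a modest loss in accuracy is more than offset by halving the cycle, which matches the direction of the reported 1.71 to 1.78 range.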


Author(s):  
Pascal Duenk ◽  
Piter Bijma ◽  
Yvonne C J Wientjes ◽  
Mario P L Calus

Abstract Breeding programs aiming to improve the performance of crossbreds may benefit from genomic prediction of crossbred (CB) performance for purebred (PB) selection candidates. In this review, we compared genomic prediction strategies that differed in (1) the genomic prediction model used, or (2) the data used in the reference population. We found 27 unique studies, two of which used deterministic simulation, 11 used stochastic simulation, and 14 real data. Differences in accuracy and response to selection between strategies depended on i) the value of the purebred crossbred genetic correlation (rpc), ii) the genetic distance between the parental lines, iii) the size of PB and CB reference populations, and iv) the relatedness of these reference populations to the selection candidates. In studies where a PB reference population was used, the use of a dominance model yielded accuracies that were equal to or higher than those of additive models. When rpc was lower than ~0.8, and was caused mainly by GxE, it was beneficial to create a reference population of PB animals that are tested in a CB environment. In general, the benefit of collecting CB information increased with decreasing rpc. For a given rpc, the benefit of collecting CB information increased with increasing size of the reference populations. Collecting CB information was not beneficial when rpc was higher than ~0.9, especially when the reference populations were small. Collecting only phenotypes of CB animals may slightly improve accuracy and response to selection, but requires that the pedigree is known. It is therefore advisable to genotype these CB animals as well. Finally, considering the breed-origin of alleles allows for modelling breed-specific effects in the CB, but this did not always lead to higher accuracies. Our review shows that the differences in accuracy and response to selection between strategies depend on several factors. 
One of the most important factors is rpc, and we therefore recommend obtaining accurate estimates of rpc for all breeding goal traits. Furthermore, knowledge about the importance of the components of rpc (i.e., dominance, epistasis, and GxE) can help breeders decide which model to use and whether to collect data on animals in a CB environment. Future research should focus on the development of a tool that predicts accuracy and response to selection from scenario-specific parameters.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fatemeh Amini ◽  
Felipe Restrepo Franco ◽  
Guiping Hu ◽  
Lizhi Wang

Abstract Recent advances in genomic selection (GS) have demonstrated the importance of not only the accuracy of genomic prediction but also the intelligence of selection strategies. The look ahead selection algorithm, for example, has been found to significantly outperform the widely used truncation selection approach in terms of genetic gain, thanks to its strategy of selecting breeding parents that may not necessarily be elite themselves but have the best chance of producing elite progeny in the future. This paper presents the look ahead trace back algorithm as a new variant of the look ahead approach, which introduces several improvements to further accelerate genetic gain, especially under imperfect genomic prediction. Perhaps an even more significant contribution of this paper is the design of opaque simulators for evaluating the performance of GS algorithms. These simulators are partially observable, explicitly capture both additive and non-additive genetic effects, and simulate uncertain recombination events more realistically. In contrast, most existing GS simulation settings are transparent, either explicitly or implicitly allowing the GS algorithm to exploit certain critical information that may not be available in actual breeding programs. Comprehensive computational experiments were carried out using a maize data set to compare a variety of GS algorithms under four simulators with different levels of opacity. These results reveal how differently the same GS algorithm would interact with different simulators, suggesting the need for continued research in the design of more realistic simulators. As long as GS algorithms continue to be trained in silico rather than in planta, the best way to avoid disappointing discrepancies between their simulated and actual performance may be to make the simulator as akin to the complex and opaque nature as possible.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Osval Antonio Montesinos-López ◽  
Abelardo Montesinos-López ◽  
Paulino Pérez-Rodríguez ◽  
José Alberto Barrón-López ◽  
Johannes W. R. Martini ◽  
...  

Abstract Background Several conventional genomic Bayesian (or non-Bayesian) prediction methods have been proposed, including the standard additive genetic effect model, for which the variance components are estimated with mixed model equations. In recent years, deep learning (DL) methods have been considered in the context of genomic prediction. DL methods are nonparametric models that provide the flexibility to adapt to complicated associations between data and output and to very complex patterns. Main body We review the applications of DL methods in genomic selection (GS) to obtain a meta-picture of GS performance and highlight how these tools can help solve challenging plant breeding problems. We also provide general guidance for the effective use of DL methods, including the fundamentals of DL and the requirements for its appropriate use. We discuss the pros and cons of this technique compared to traditional genomic prediction approaches, as well as the current trends in DL applications. Conclusions The main requirement for using DL is high-quality and sufficiently large training data. Based on the current literature on GS in plant and animal breeding, we did not find clear superiority of DL in terms of prediction power compared to conventional genome-based prediction models. Nevertheless, there is clear evidence that DL algorithms capture nonlinear patterns more efficiently than conventional genome-based models. DL algorithms are able to integrate data from different sources, as is usually needed in GS-assisted breeding, and they show the ability to improve prediction accuracy for large plant breeding data sets. It is important to apply DL to large training-testing data sets.


2021 ◽  
Vol 12 ◽  
Author(s):  
Aline Fugeray-Scarbel ◽  
Catherine Bastien ◽  
Mathilde Dupont-Nivet ◽  
Stéphane Lemarié

The present study is a transversal analysis of the interest in genomic selection for plant and animal species. It focuses on the arguments that may convince breeders to switch to genomic selection. The arguments are classified into three different “bricks.” The first brick considers the addition of genotyping to improve the accuracy of the prediction of breeding values. The second consists of saving costs and/or shortening the breeding cycle by replacing all or a portion of the phenotyping effort with genotyping. The third concerns population management to improve the choice of parents to either optimize crossbreeding or maintain genetic diversity. We analyse the relevance of these different bricks for a wide range of animal and plant species and seek to explain the differences between species according to their biological specificities and the organization of breeding programs.


2021 ◽  
Vol 12 ◽  
Author(s):  
Marlee R. Labroo ◽  
Jauhar Ali ◽  
M. Umair Aslam ◽  
Erik Jon de Asis ◽  
Madonna A. dela Paz ◽  
...  

Hybrid rice varieties can outyield the best inbred varieties by 15 – 30% with appropriate management. However, hybrid rice requires more inputs and management than inbred rice to realize a yield advantage in high-yielding environments. The development of stress-tolerant hybrid rice with lowered input requirements could increase hybrid rice yield relative to production costs. We used genomic prediction to evaluate the combining abilities of 564 stress-tolerant lines used to develop Green Super Rice with 13 male sterile lines of the International Rice Research Institute for yield-related traits. We also evaluated the performance of their F1 hybrids. We identified male sterile lines with good combining ability as well as F1 hybrids with potential further use in product development. For yield per plant, accuracies of genomic predictions of hybrid genetic values ranged from 0.490 to 0.822 in cross-validation if neither parent or up to both parents were included in the training set, and both general and specific combining abilities were modeled. The accuracy of phenotypic selection for hybrid yield per plant was 0.682. The accuracy of genomic predictions of male GCA for yield per plant was 0.241, while the accuracy of phenotypic selection was 0.562. At the observed accuracies, genomic prediction of hybrid genetic value could allow improved identification of high-performing single crosses. In a reciprocal recurrent genomic selection program with an accelerated breeding cycle, observed male GCA genomic prediction accuracies would lead to similar rates of genetic gain as phenotypic selection. It is likely that prediction accuracies of male GCA could be improved further by targeted expansion of the training set. Additionally, we tested the correlation of parental genetic distance with mid-parent heterosis in the phenotyped hybrids. We found the average mid-parent heterosis for yield per plant to be consistent with existing literature values at 32.0%. 
In the overall population of study, parental genetic distance was significantly negatively correlated with mid-parent heterosis for yield per plant (r = −0.131) and potential yield (r = −0.092), but within female families the correlations were non-significant and near zero. As such, positive parental genetic distance was not reliably associated with positive mid-parent heterosis.
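Mid-parent heterosis, as used in this abstract, is conventionally the percentage by which the F1 hybrid exceeds the mean of its two parents. The sketch below shows that calculation. The function name and the yield values are hypothetical, chosen only so that the result equals the 32.0% average the abstract reports; they are not data from the study.

```python
def mid_parent_heterosis(f1, parent_a, parent_b):
    """Mid-parent heterosis in percent: 100 * (F1 - MP) / MP,
    where MP is the mean of the two parental values."""
    mid_parent = (parent_a + parent_b) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value must be non-zero")
    return 100.0 * (f1 - mid_parent) / mid_parent


# Hypothetical yields per plant (arbitrary units): parents 9 and 11, F1 13.2.
print(round(mid_parent_heterosis(13.2, 9.0, 11.0), 1))
```

Because the measure depends only on phenotypic means, it can be positive regardless of how genetically distant the parents are, consistent with the weak distance-heterosis correlations reported in the abstract.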


2019 ◽  
Vol 59 (8) ◽  
pp. 1428
Author(s):  
T. Granleese ◽  
S. A. Clark ◽  
N. Duijvesteijn ◽  
P. E. Bradley ◽  
J. H. J. van der Werf

The present study assessed the effectiveness and cost–benefit of several genotyping strategies for breeding poll Merino sheep in a closed nucleus with different initial allele frequencies and assuming a single-gene responsible for the horn or poll phenotype. We assumed that selection was based on phenotypes or genotypes for a single gene conferring polledness via a complete-dominance model. Under such a model, a complete fixation of the ‘polled allele’ (P) requires genotyping of the ewe-selection candidates. Testing a higher proportion of female candidates resulted in a faster fixation of the P-allele. Fixation ranged from 1 year of selection with a high starting P-allele frequency of 0.9, to 7 years for low starting P-allele frequencies of 0.3. When premiums of AU$50 or AU$100 were paid for rams with a PP genotype, breeding for PP genotypes was not profitable when the starting P-allele frequency was below 0.7. If the starting allele frequency was above 0.7, net profitability was positive over 10 years when premiums of AU$200 were paid for known PP-genotype rams. While fixing the P-allele, genetic gain for production traits was slowed down in the first 5 years of selection by up to 23% and 3% for initial P allele-frequencies of 0.3 and 0.9 respectively. Lost genetic gain due to fixing the P-allele, which can never be recovered in a closed nucleus, incurred 200–800% higher costs than the DNA testing costs. Rates of genetic gain recovered to pre-P-allele selection level rates of genetic gain once the P-allele was fixed. Testing a maximum of 25% ewe-selection candidates was the least expensive strategy across all starting allele frequencies and premiums. To avoid large losses of genetic gain in a closed nucleus with low P-allele starting frequencies, opening the nucleus should be considered to increase starting P-allele frequencies and also to potentially increase rates of genetic gain to offset the economic loss caused by P-selection.
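The claim that fixing the P-allele requires genotyping follows from complete dominance: PP and Pp animals are both polled, so phenotype alone cannot separate them. The sketch below quantifies this under Hardy-Weinberg proportions, which is an assumption made here for illustration and not something stated in the abstract; the function name is hypothetical.

```python
def polled_breakdown(p):
    """Genotype shares at P-allele frequency p (0 < p <= 1), assuming
    Hardy-Weinberg proportions and complete dominance of P (polled)."""
    q = 1.0 - p
    pp, hetero, horned = p * p, 2.0 * p * q, q * q
    polled = pp + hetero  # PP and Pp look identical
    return {
        "polled": polled,
        "horned": horned,
        "PP_share_of_polled": pp / polled,
    }


# The two starting frequencies mentioned in the abstract.
for p in (0.3, 0.9):
    result = polled_breakdown(p)
    print(p, {key: round(value, 3) for key, value in result.items()})
```

At a starting frequency of 0.3, fewer than one in five polled animals would be PP under these assumptions, so selecting on phenotype alone keeps reintroducing the horned allele through carriers.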


Author(s):  
Sikiru Adeniyi Atanda ◽  
Michael Olsen ◽  
Juan Burgueño ◽  
Jose Crossa ◽  
Daniel Dzidzienyo ◽  
...  

Abstract Key message Historical data from breeding programs can be efficiently used to improve genomic selection accuracy, especially when the training set is optimized to subset individuals most informative of the target testing set. Abstract The current strategy for large-scale implementation of genomic selection (GS) at the International Maize and Wheat Improvement Center (CIMMYT) global maize breeding program has been to train models using information from full-sibs in a “test-half-predict-half approach.” Although effective, this approach has limitations, as it requires large full-sib populations and limits the ability to shorten variety testing and breeding cycle times. The primary objective of this study was to identify optimal experimental and training set designs to maximize prediction accuracy of GS in CIMMYT’s maize breeding programs. Training set (TS) design strategies were evaluated to determine the most efficient use of phenotypic data collected on relatives for genomic prediction (GP) using datasets containing 849 (DS1) and 1389 (DS2) DH-lines evaluated as testcrosses in 2017 and 2018, respectively. Our results show there is merit in the use of multiple bi-parental populations as TS when selected using algorithms to maximize relatedness between the training and prediction sets. In a breeding program where relevant past breeding information is not readily available, the phenotyping expenditure can be spread across connected bi-parental populations by phenotyping only a small number of lines from each population. This significantly improves prediction accuracy compared to within-population prediction, especially when the TS for within full-sib prediction is small. Finally, we demonstrate that prediction accuracy in either sparse testing or “test-half-predict-half” can further be improved by optimizing which lines are planted for phenotyping and which lines are to be only genotyped for advancement based on GP.


Agronomy ◽  
2020 ◽  
Vol 10 (4) ◽  
pp. 585 ◽  
Author(s):  
Seema Yadav ◽  
Phillip Jackson ◽  
Xianming Wei ◽  
Elizabeth M. Ross ◽  
Karen Aitken ◽  
...  

Sugarcane is a major industrial crop cultivated in tropical and subtropical regions of the world. It is the primary source of sugar worldwide, accounting for more than 70% of world sugar consumption. Additionally, sugarcane is emerging as a source of sustainable bioenergy. However, the increase in productivity from sugarcane has been small compared to other major crops, and the rate of genetic gains from current breeding programs tends to be plateauing. In this review, some of the main contributors for the relatively slow rates of genetic gain are discussed, including (i) breeding cycle length and (ii) low narrow-sense heritability for major commercial traits, possibly reflecting strong non-additive genetic effects involved in quantitative trait expression. A general overview of genomic selection (GS), a modern breeding tool that has been very successfully applied in animal and plant breeding, is given. This review discusses key elements of GS and its potential to significantly increase the rate of genetic gain in sugarcane, mainly by (i) reducing the breeding cycle length, (ii) increasing the prediction accuracy for clonal performance, and (iii) increasing the accuracy of breeding values for parent selection. GS approaches that can accurately capture non-additive genetic effects and potentially improve the accuracy of genomic estimated breeding values are particularly promising for the adoption of GS in sugarcane breeding. Finally, different strategies for the efficient incorporation of GS in a practical sugarcane breeding context are presented. These proposed strategies hold the potential to substantially increase the rate of genetic gain in future sugarcane breeding.
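The levers this review lists (cycle length and prediction accuracy) are the terms of the standard breeder's equation for genetic gain per year, gain = i × r × σ_A / L. The sketch below evaluates it; the function name and all parameter values are hypothetical and serve only to show how shortening the cycle or raising accuracy changes the rate of gain.

```python
def genetic_gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Breeder's equation: i * r * sigma_A / L.

    intensity   : selection intensity i
    accuracy    : accuracy of selection r
    sigma_a     : additive genetic standard deviation
    cycle_years : breeding cycle length L in years
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical baseline versus a scheme with a halved cycle length.
baseline = genetic_gain_per_year(1.4, 0.5, 2.0, 10.0)
shorter_cycle = genetic_gain_per_year(1.4, 0.5, 2.0, 5.0)
print("baseline gain per year:     ", baseline)
print("halved-cycle gain per year: ", shorter_cycle)
```

Halving the cycle doubles the rate of gain only if accuracy is maintained, which is why the review treats cycle length and accuracy together.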

