SNP and Haplotype-Based Genomic Selection of Quantitative Traits in Eucalyptus globulus

Eucalyptus globulus (Labill.) is one of the most important cultivated eucalypts in temperate and subtropical regions and has been successfully subjected to intensive breeding. In this study, Bayesian genomic models that include the effects of haplotype and single nucleotide polymorphisms (SNP) were assessed to predict quantitative traits related to wood quality and tree growth in a 6-year-old breeding population. To this end, the following markers were considered: (a) ~14 K SNP markers (SNP), (b) ~3 K haplotypes (HAP), and (c) haplotypes and SNPs that were not assigned to a haplotype (HAP-SNP). Predictive ability values (PA) were dependent on the genomic prediction models and markers. On average, Bayesian ridge regression (BRR) and Bayes C had the highest PA for the majority of traits. Notably, genomic models that included the haplotype effect (either HAP or HAP-SNP) significantly increased the PA of low-heritability traits. For instance, BRR based on HAP had the highest PA (0.58) for stem straightness. Consistently, the heritability estimates from genomic models were higher than the pedigree-based estimates for these traits. The results provide additional perspectives for the implementation of genomic selection in Eucalyptus breeding programs, which could be especially beneficial for improving traits with low heritability.
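
The abstract does not state how PA was computed. As a minimal sketch, assuming the common convention that PA is the Pearson correlation between predicted and observed phenotypes within each validation fold, averaged over folds, it could be calculated as follows. The function names and the fold layout are illustrative and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predictive_ability(observed, predicted, folds):
    """Mean within-fold correlation between observed and predicted values.

    `folds` is a list of index lists, one per validation fold (an assumed
    layout; any cross-validation scheme that yields such indices would do).
    """
    per_fold = [
        pearson([observed[i] for i in fold], [predicted[i] for i in fold])
        for fold in folds
    ]
    return sum(per_fold) / len(per_fold)
```

Averaging within-fold correlations is only one option; pooling all validation predictions before computing a single correlation is another, and the two can differ when fold means differ.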

Genomic Selection in Winter Wheat Breeding Using a Recommender Approach

Genes, 2020, Vol 11 (7), pp. 779. DOI: 10.3390/genes11070779

Author(s): Dennis N. Lozada, Arron H. Carter

Keyword(s): Winter Wheat, Genomic Selection, Prediction Models, Heading Date, Predictive Ability, Wheat Breeding, SNP Markers, Bayesian Regression, Phenotypic Trait, Breeding Programs

Achieving optimal predictive ability is key to increasing the relevance of implementing genomic selection (GS) approaches in plant breeding programs. The potential of an item-based collaborative filtering (IBCF) recommender system in the context of multi-trait, multi-environment GS was explored. Different GS scenarios for IBCF were evaluated for a diverse population of winter wheat lines adapted to the Pacific Northwest region of the US. Predictions across years through cross-validation resulted in improved predictive ability when there was a high correlation between environments. Using multiple spectral traits collected from high-throughput phenotyping resulted in better GS accuracies for grain yield (GY) than using only single traits for predictions. Among the GS scenarios evaluated, trait adjustment through various Bayesian regression models using genomic information from SNP markers was the most effective in improving accuracies for GY, heading date, and plant height. Bayesian LASSO had the highest predictive ability among the models used for phenotypic trait adjustment. IBCF gave competitive accuracies compared to a genomic best linear unbiased predictor (GBLUP) model for predicting different traits. Overall, an IBCF approach could be used as an alternative to traditional prediction models for important target traits in wheat breeding programs.
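
The abstract names item-based collaborative filtering but does not describe its implementation. The following is a minimal sketch of the general technique, under the assumption that "items" are trait-by-environment columns, "users" are lines, each column is standardized, item similarity is the cosine over co-observed lines, and a missing value is predicted as a similarity-weighted average of the line's other standardized values. All function names are hypothetical.

```python
import math


def column_stats(matrix, j):
    """Mean and sample standard deviation of the observed values in column j."""
    vals = [row[j] for row in matrix if row[j] is not None]
    mean = sum(vals) / len(vals)
    sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))
    return mean, sd


def standardize(matrix):
    """Z-score every column, leaving missing entries (None) untouched."""
    stats = [column_stats(matrix, j) for j in range(len(matrix[0]))]
    z = [
        [None if v is None else (v - stats[j][0]) / stats[j][1]
         for j, v in enumerate(row)]
        for row in matrix
    ]
    return z, stats


def cosine(z, j, k):
    """Cosine similarity between items j and k over lines observed for both."""
    pairs = [(r[j], r[k]) for r in z if r[j] is not None and r[k] is not None]
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(
        sum(b * b for _, b in pairs))
    return num / den if den > 0 else 0.0


def ibcf_predict(matrix, line, item):
    """Predict matrix[line][item] from the line's other observed items."""
    z, stats = standardize(matrix)
    num = den = 0.0
    for k, value in enumerate(z[line]):
        if k == item or value is None:
            continue
        sim = cosine(z, item, k)
        num += sim * value
        den += abs(sim)
    z_hat = num / den if den > 0 else 0.0  # fall back to the column mean
    mean, sd = stats[item]
    return mean + sd * z_hat  # back-transform to the original scale
```

The sketch assumes every column has at least two observed values with non-zero variance; a production version would need to guard against constant columns.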

Haplotype- and SNP-Based GWAS for Growth and Wood Quality Traits in Eucalyptus cladocalyx Trees under Arid Conditions

Plants, 2021, Vol 10 (1), pp. 148. DOI: 10.3390/plants10010148

Author(s): Camilo E. Valenzuela, Paulina Ballesta, Sunny Ahmar, Sajid Fiaz, Parviz Heidari, ...

Keyword(s): Genome Wide Association Study, Wood Quality, Tree Height, Mediterranean Ecosystems, Primary Metabolism, Arid Environments, Nucleotide Polymorphisms, Haplotype Blocks, A Genome, Stem Straightness

The agricultural and forestry productivity of Mediterranean ecosystems is strongly threatened by the adverse effects of climate change, including an increase in severe droughts and changes in rainfall distribution. In the present study, we performed a genome-wide association study (GWAS) to identify single-nucleotide polymorphisms (SNPs) and haplotype blocks associated with the growth and wood quality of Eucalyptus cladocalyx, a tree species suitable for low-rainfall sites. The study was conducted in a progeny-provenance trial established in an arid site with Mediterranean patterns located in the southern Atacama Desert, Chile. A total of 87 SNPs and 3 haplotype blocks were significantly associated with the 6 traits under study (tree height, diameter at breast height, slenderness coefficient, first bifurcation height, stem straightness, and pilodyn penetration). In addition, 11 loci were identified as pleiotropic through Bayesian multivariate regression and were mainly associated with wood hardness, height, and diameter. In general, the GWAS revealed associations with genes related to primary metabolism and biosynthesis of cell wall components. Additionally, associations coinciding with stress response genes, such as GEM-related 5 and prohibitin-3, were detected. The findings of this study provide valuable information regarding genetic control of morphological traits related to adaptation to arid environments.

Harnessing Genetic Diversity in the USDA Pea Germplasm Collection Through Genomic Prediction

Frontiers in Genetics, 2021, Vol 12. DOI: 10.3389/fgene.2021.707754

Author(s): Md. Abdullah Al Bari, Ping Zheng, Indalecio Viera, Hannah Worral, Stephen Szwiec, ...

Keyword(s): Genetic Diversity, Seed Yield, Genomic Prediction, Complex Traits, Prediction Models, Germplasm Collection, Predictive Ability, SNP Markers, Breeding Values, Germplasm Collections

Phenotypic evaluation and efficient utilization of germplasm collections can be time-intensive, laborious, and expensive. However, with the plummeting costs of next-generation sequencing and the addition of genomic selection to the plant breeder’s toolbox, we can now tap the genetic diversity within large germplasm collections more efficiently. In this study, we evaluated the potential of genomic prediction for enhancing selection of accessions from the USDA Pea Germplasm Collection, using a set of 482 pea (Pisum sativum L.) accessions genotyped with 30,600 single nucleotide polymorphism (SNP) markers and phenotyped for seed yield and yield-related components. Genomic prediction models and several factors affecting predictive ability were evaluated in a series of cross-validation schemes across complex traits. Different genomic prediction models gave similar results, with predictive ability across traits ranging from 0.23 to 0.60 and no model working best across all traits. Increasing the training population size improved the predictive ability for most traits, including seed yield. Predictive abilities increased and then reached a plateau as the number of markers increased, presumably owing to extensive linkage disequilibrium in the pea genome. Accounting for population structure effects did not significantly boost predictive ability, although we observed a slight improvement for seed yield. By applying the best genomic prediction model (e.g., RR-BLUP), we then examined the distribution of genotyped but nonphenotyped accessions and the reliability of genomic estimated breeding values (GEBV). The distribution of GEBV suggested that none of the nonphenotyped accessions were expected to perform outside the range of the phenotyped accessions. Desirable breeding values with higher reliability can be used to identify and screen favorable germplasm accessions. Expanding the training set and incorporating additional orthogonal information (e.g., transcriptomics, metabolomics, physiological traits, etc.) into the genomic prediction framework can enhance prediction accuracy.

Mapping Quantitative Trait Loci onto Chromosome-Scale Pseudomolecules in Flax

Methods and Protocols, 2020, Vol 3 (2), pp. 28. DOI: 10.3390/mps3020028. Cited By ~ 1

Author(s): Frank M. You, Sylvie Cloutier

Keyword(s): Quantitative Trait Loci, Candidate Genes, Quantitative Trait, Quantitative Traits, SNP Markers, Genetic Maps, Nucleotide Polymorphisms, Genome Wide, Trait Loci, Genomic Regions

Quantitative trait loci (QTL) are genomic regions associated with phenotype variation of quantitative traits. To date, a total of 313 QTL for 31 quantitative traits have been reported in 14 studies on flax. Of these, 200 QTL from 12 studies were identified based on genetic maps, the scaffold sequences, or the pre-released chromosome-scale pseudomolecules. Molecular markers for QTL identification differed across studies but the most used ones were simple sequence repeats (SSRs) or single nucleotide polymorphisms (SNPs). To uniquely map the SSR and SNP markers from different references onto the recently released chromosome-scale pseudomolecules, methods with several scripts and database files were developed to locate PCR- and SNP-based markers onto the same reference, co-locate QTL, and scan genome-wide candidate genes. Using these methods, 195 out of 200 QTL were successfully sorted onto the 15 flax chromosomes and grouped into 133 co-located QTL clusters; the candidate genes that co-located with these QTL clusters were also predicted. The methods and tools presented in this article facilitate marker re-mapping to a new reference, genome-wide QTL analysis, candidate gene scanning, and breeding applications in flax and other crops.

Genomic predictive ability for foliar nutritive traits in perennial ryegrass

2019. DOI: 10.1101/727958

Author(s): Sai Krishna Arojju, Mingshu Cao, M. Z. Zulfi Jahufer, Brent A Barrett, Marty J Faville

Keyword(s): Genomic Selection, Genomic Prediction, Nutritive Value, Prediction Models, Genotypic Variation, Genetic Correlations, Predictive Ability, Water Soluble, Training Set, Sib Families

Forage nutritive value impacts animal nutrition, which underpins livestock productivity, reproduction and health. Genetic improvement for nutritive traits has been limited, as they are typically expensive and time-consuming to measure through conventional methods. Genomic selection is appropriate for such complex and expensive traits, enabling cost-effective prediction of breeding values using genome-wide markers. The aims of the present study were to assess the potential of genomic selection for a range of nutritive traits in a multi-population training set, and to quantify the contributions of genotypic, environmental and genotype-by-environment (G × E) variance components to trait variation and heritability for nutritive traits. The training set consisted of a total of 517 half-sibling (half-sib) families, from five advanced breeding populations, evaluated in two distinct New Zealand grazing environments. Autumn-harvested samples were analyzed for 18 nutritive traits, and maternal parents of the half-sib families were genotyped using genotyping-by-sequencing. Significant (P < 0.05) genotypic variation was detected for all nutritive traits, and genomic heritability (h²g) was moderate to high (0.20 to 0.74). G × E interactions were significant and particularly large for water soluble carbohydrate (WSC), crude fat, phosphorus (P) and crude protein. GBLUP, KGD-GBLUP and BayesC genomic prediction models displayed similar predictive ability, estimated by 10-fold cross-validation, for all nutritive traits, with values ranging from r = 0.16 to 0.45 using phenotypes from across the two environments. High predictive ability was observed for the mineral traits sulphur (0.44), sodium (0.45) and magnesium (0.45), and the lowest values were observed for P (0.16), digestibility (0.22) and high molecular weight WSC (0.23). Predictive ability estimates for most nutritive traits were retained when marker number was reduced from 1 million to as few as 50,000. The moderate to high predictive abilities observed suggest that implementation of genomic selection is feasible for most of the nutritive traits examined. For traits with lower predictive ability, multi-trait genomic prediction approaches that exploit the strong genetic correlations observed amongst some nutritive traits may be useful. This appears to be particularly important for WSC, considered one of the primary constituents of nutritive value for forages.

Sequencing depth and genotype quality: Accuracy and breeding operation considerations for genomic selection applications in autopolyploid crops

2020. DOI: 10.1101/2020.02.23.961383. Cited By ~ 1

Author(s): Dorcus C Gemenet, Hannele Lindqvist-Kreuze, Bode A Olukolu, Bert De Boeck, Guilherme da Silva Pereira, ...

Keyword(s): Genomic Selection, Service Providers, Predictive Ability, Genetic Effects, Additive Models, Sequencing Depth, Nucleotide Polymorphisms, Breeding Populations, Wide Range, Additive Genetic Effects

The autopolyploid nature of potato and sweetpotato ensures a wide range of meiotic configurations and linkage phases, which leads to complex gene action and poses problems for genotype data quality and genomic selection analyses. We used a 315-progeny biparental population of hexaploid sweetpotato and a diversity panel of 380 tetraploid potato, genotyped using different platforms, to answer the following questions: i) do polyploid crop breeders need to invest more for additional sequencing depth? ii) how many markers are required to make selection decisions? iii) does considering non-additive genetic effects improve predictive ability (PA)? iv) does considering dosage or quantitative trait loci (QTL) offer significant improvement to PA? Our results show that only a small number of highly informative single nucleotide polymorphisms (SNPs; ≤ 1000) are adequate for prediction; hence, it is possible to obtain this number at the current sequencing depth from most service providers. We also show that considering dosage information and using additive-effects-only models gave the best PA for most traits, while the comparative advantage of considering non-additive genetic effects and including known QTL in the predictive model depended on trait architecture. We conclude that genomic selection can help accelerate the rate of genetic gains in potato and sweetpotato. However, application of genomic selection should be considered as part of optimizing the entire breeding program. Additionally, since the predictions in the current study are based on single populations, the effects of haplotype structure and inheritance on PA should be studied further in actual multi-generation breeding populations. Key message: Polyploid crop breeders do not need to invest more in sequencing depth; dosage information and fewer, highly informative SNPs are recommended; the advantages of non-additive models and QTL for prediction depend on trait architecture.

Selection of trait-specific markers and multi-environment models improve genomic predictive ability in rice

2018. DOI: 10.1101/482109

Author(s): Aditi Bhandari, Jérôme Bartholomé, Tuong-Vi Cao, Nilima Kumari, Julien Frouin, ...

Keyword(s): Drought Stress, Genomic Prediction, Complex Traits, Prediction Models, Predictive Ability, Reference Population, SNP Markers, Selection Strategy, Specific Marker, Marker Selection

Developing high-yielding rice varieties that are tolerant to drought stress is crucial for the sustainable livelihood of rice farmers in rainfed rice cropping ecosystems. Genomic selection (GS) promises to be an effective breeding option for these complex traits. We evaluated the effectiveness of two rather new options in the implementation of GS: trait- and environment-specific marker selection and the use of multi-environment prediction models. A reference population of 280 rainfed lowland accessions with 215k SNP marker data was phenotyped under one favorable and two managed drought environments. Trait-specific SNP subsets (28k) were selected for each trait under each environment, using results of GWAS performed with the complete genotype dataset. Performances of single-environment and multi-environment genomic prediction models were compared using kernel regression based methods (GBLUP and RKHS) under two cross-validation scenarios: availability (CV2) or not (CV1) of phenotypic data for the validation set in one of the environments. With the most realistic trait-specific marker selection strategy, the predictive ability (PA) of genomic prediction was up to 22% higher than with markers selected on the basis of neutral linkage disequilibrium (LD). Tolerance to drought stress was up to 32% better predicted by multi-environment models (especially RKHS-based models) under the CV2 strategy. Under the less favorable CV1 strategy, the multi-environment models achieved PA similar to that of the single-environment predictions. We also showed that reasonable PA could be obtained with as few as 3,000 SNP markers, even in a population of low LD extent, provided marker selection is based on pairwise LD. The implications of these findings for breeding for drought tolerance are discussed. The most resource-sparing option would be accurate phenotyping of the reference population in a favorable environment and under a managed drought, while the candidate population would be phenotyped under only one of those environments.
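
The difference between the two cross-validation scenarios can be made concrete with a short sketch. It assumes the usual reading of the abstract: under CV1 the validation lines have no phenotypes in any environment, whereas under CV2 they are hidden only in the target environment and remain observed elsewhere. The function name, the environment labels and the fold construction are hypothetical.

```python
import random


def cv_partitions(lines, environments, target_env, n_folds, seed=0):
    """Yield (fold, cv1_hidden, cv2_hidden) for each validation fold.

    cv1_hidden: (line, environment) records masked under CV1, i.e. the
                validation lines are unphenotyped everywhere.
    cv2_hidden: records masked under CV2, i.e. the validation lines are
                unphenotyped only in `target_env`.
    """
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    # Deal the shuffled lines into n_folds roughly equal folds.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for fold in folds:
        cv1_hidden = {(line, env) for line in fold for env in environments}
        cv2_hidden = {(line, target_env) for line in fold}
        yield fold, cv1_hidden, cv2_hidden
```

Because the CV2 mask is always a subset of the CV1 mask, a multi-environment model trained under CV2 can borrow information from the validation lines' records in the other environments, which is consistent with the larger gains the abstract reports for CV2.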

Linkage disequilibrium and haplotype block patterns in popcorn populations

2019. DOI: 10.1101/688960

Author(s): Andréa Carla Bastos Andrade, José Marcelo Soriano Viana, Helcio Duarte Pereira, Vitor Batista Pinto, Fabyano Fonseca e Silva

Keyword(s): Linkage Disequilibrium, Genomic Selection, Association Studies, Breeding Population, Human Populations, Genome Wide Association Studies, Nucleotide Polymorphisms, Haplotype Blocks, Genome Wide, Evolutionary Aspects

Linkage disequilibrium (LD) analysis provides information on evolutionary aspects of populations and allows populations and single nucleotide polymorphisms (SNPs) to be selected for association studies. Recently, haplotype blocks have been used to increase the power of quantitative trait loci detection in genome-wide association studies and the prediction accuracy of genomic selection. The objectives of this study were to compare the degree of LD, the LD decay, the LD decay extent, and the number and length of haplotype blocks in the populations, and to build the first LD map for maize in order to determine whether maize chromosomes also show a pattern of interspaced regions of high and low rates of recombination. We used a biparental temperate population, a tropical synthetic, and a tropical breeding population, genotyped for approximately 75,000 SNPs. The level of LD expressed by the r² values is surprisingly low (0.02, 0.04, and 0.04), but comparable to that of some non-isolated human populations. The general evidence is that the synthetic is the population with higher LD. No significant advantage is expected from haplotype-based association studies or from genomic selection across generations, owing to the small number of SNPs in the haplotype blocks (2 to 3). The results concerning LD decay (rapid decay after 5-10 kb) and LD decay extent (up to 300 kb) are in the range observed with maize inbred line panels. Our most important result is that maize chromosomes had a pattern of regions of extensive LD interspaced with regions of low LD. However, our simple simulated LD map provides evidence that this pattern can reflect regions with differences in allele frequencies and LD level (expressed by D’) rather than regions with high and low rates of recombination.
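
The two LD measures mentioned in the abstract, r² and D’, can be illustrated with the standard two-locus definitions. The sketch below assumes phased haplotypes at two biallelic SNPs with alleles coded 0/1, which is a simplification; the abstract does not say how LD was estimated from the genotype data, and the function name is hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D, r2, |D'|) for two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs with alleles coded 0/1; both
    SNPs are assumed to be polymorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at SNP B
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by its maximum attainable value given allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, r2, abs(d) / d_max


# Unequal allele frequencies: D' reaches its maximum while r2 stays low.
d, r2, d_prime = ld_stats([(1, 1), (0, 1), (0, 1), (0, 0)])
```

In the toy example the two SNPs have allele frequencies of 0.25 and 0.75, so D’ is at its maximum even though r² is small. This shows how a pattern in D’ can reflect allele-frequency differences, in line with the caution in the abstract's final sentence.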

Genomic prediction accuracies and abilities for growth and wood quality traits of Scots pine, using genotyping-by-sequencing (GBS) data

2019. DOI: 10.1101/607648. Cited By ~ 1

Author(s): Ainhoa Calleja-Rodriguez, Jin Pan, Tomas Funda, Zhi-Qiang Chen, John Baison, ...

Keyword(s): Scots Pine, Genomic Prediction, Prediction Models, Wood Quality, Genotyping By Sequencing, Tree Breeding, SNP Markers, Progeny Testing, Quality Traits, The Impact

Higher genetic gains can be achieved through genomic selection (GS) by shortening the time of progeny testing in tree breeding programs. Genotyping-by-sequencing (GBS), combined with two imputation methods, allowed us to perform the current genomic prediction study in Scots pine (Pinus sylvestris L.). A total of 694 individuals representing 183 full-sib families were genotyped and phenotyped for growth and wood quality traits, and 8719 SNPs were used to compare different genomic prediction models. In addition, the impact of different ratios of training to validation set sizes, as well as of different subsets of SNP markers, on the predictive ability (PA) and prediction accuracy of genomic breeding value estimates was evaluated. Genomic Best Linear Unbiased Prediction (GBLUP) and Bayesian Ridge Regression (BRR), combined with the expectation maximization (EM) imputation algorithm, showed higher PAs and prediction accuracies than Bayesian LASSO (BL). A subset of approximately 4000 markers was sufficient to provide the same PAs and accuracies as the full set of 8719 markers. Furthermore, PAs were similar for pedigree- and genomic-based estimations, whereas accuracies and heritabilities were slightly higher for pedigree-based estimations. However, prediction accuracies of genomic models were sufficient to achieve a higher selection efficiency per year, ranging from 50% to 87% higher than with traditional pedigree-based selection.
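
The abstract distinguishes predictive ability from prediction accuracy without defining either. One common convention, assumed here and not confirmed by the source, takes accuracy as PA divided by the square root of the trait's narrow-sense heritability. The input values below are hypothetical and serve only to show the call.

```python
import math


def prediction_accuracy(predictive_ability, heritability):
    """Scale predictive ability by the square root of heritability.

    Assumed convention: accuracy = PA / sqrt(h2), where PA is the
    correlation between predictions and phenotypes.
    """
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return predictive_ability / math.sqrt(heritability)


# Hypothetical inputs, not values from the study.
accuracy = prediction_accuracy(predictive_ability=0.30, heritability=0.25)
```

Under this convention, a low-heritability trait can show a modest PA and still have a comparatively high accuracy, because phenotypes are themselves noisy proxies for breeding values.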

Genomic selection for non-key traits in radiata pine when the documented pedigree is corrected using DNA marker information

BMC Genomics, 2019, Vol 20 (1). DOI: 10.1186/s12864-019-6420-8. Cited By ~ 4

Author(s): Yongjun Li, Jaroslav Klápště, Emily Telfer, Phillip Wilcox, Natalie Graham, ...

Keyword(s): Genomic Selection, Genetic Gain, Additive Genetic Variance, Predictive Ability, Radiata Pine, Exome Capture, Forward Selection, Generation Interval, Stem Straightness, Cluster Frequency

Background: Non-key traits (NKTs) in radiata pine (Pinus radiata D. Don) are traits other than growth, wood density and stiffness that are still of interest to breeders. Branch-cluster frequency, stem straightness, external resin bleeding and internal checking are examples of such traits and are targeted for improvement in radiata pine research programmes. Genomic selection can be conducted before the performance of selection candidates is available, so that generation intervals can be reduced. Radiata pine is a species with a long generation interval, which, if reduced, could significantly increase genetic gain per unit of time. The aim of this study was to evaluate the accuracy and predictive ability of genomic selection and its efficiency over traditional forward selection in radiata pine for the following NKTs: branch-cluster frequency, stem straightness, internal checking, and external resin bleeding. Results: Nine hundred and eighty-eight individuals were genotyped using exome capture genotyping by sequencing (GBS), and 67,168 single nucleotide polymorphisms (SNPs) were used to develop genomic estimated breeding values (GEBVs) with genomic best linear unbiased prediction (GBLUP). The documented pedigree was corrected using a subset of 704 SNPs. The percentage of trio parentage confirmed was about 49%, and about 50% of parents were re-assigned. The accuracy of GEBVs was 0.55–0.75 when using the documented pedigree and 0.61–0.80 when using the SNP-corrected pedigree. A higher percentage of additive genetic variance was explained and a higher predictive ability was observed when using the SNP-corrected pedigree than when using the documented pedigree. With the documented pedigree, genomic selection was similar to traditional forward selection when assuming a generation interval of 17 years, but worse than traditional forward selection when assuming a generation interval of 14 years. After the pedigree was corrected, genomic selection led to 37–115% and 13–77% additional genetic gain over traditional forward selection when generation intervals of 17 years and 14 years were assumed, respectively. Conclusion: Genomic selection with a pedigree corrected by SNP information was an efficient way of improving non-key traits in radiata pine breeding.
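
The dependence of genetic gain per unit of time on accuracy and generation interval follows from the breeder's equation, in which gain per year is selection intensity times accuracy times additive genetic standard deviation, divided by the generation interval. The sketch below illustrates that general relationship only; it is not the calculation used in the study, and the function names are hypothetical.

```python
def gain_per_year(intensity, accuracy, sigma_a, generation_interval):
    """Expected genetic gain per year from the breeder's equation."""
    return intensity * accuracy * sigma_a / generation_interval


def relative_efficiency(acc_gs, interval_gs, acc_fs, interval_fs):
    """Gain per year of genomic selection relative to forward selection.

    Assumes equal selection intensity and additive genetic standard
    deviation in both schemes, so those terms cancel.
    """
    return (acc_gs / interval_gs) / (acc_fs / interval_fs)
```

With equal accuracies, halving the generation interval doubles the gain per year. Conversely, a shorter interval assumed for forward selection narrows the advantage of genomic selection, which is consistent with the smaller additional gain the abstract reports for the 14-year interval.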
