GWAS findings improved genomic prediction accuracy of lipid profile traits: Tehran Cardiometabolic Genetic Study

Abstract In recent decades, GWAS findings have opened new clinical applications, whole-genome risk prediction in particular. Here, we propose a method that integrates the traditional genomic best linear unbiased prediction (gBLUP) approach with GWAS information to improve genetic prediction accuracy and gene-based heritability estimation. The study was conducted within the Tehran Cardiometabolic Genetic Study (TCGS), comprising 14,827 individuals and 649,932 SNP markers. Five SNP subsets were selected based on GWAS results: the top 1%, 5%, 10%, and 50% most significant SNPs, and SNPs reported as associated in previous studies. For comparison, we also drew random SNP subsets of the same size as each of the five subsets. Prediction accuracy for lipid profile traits was assessed with gBLUP using tenfold cross-validation repeated 10 times. Genetic prediction based on SNP subsets selected from the dataset outperformed prediction based on previously reported SNPs. The selected subsets predicted more accurately than the full SNP set and much more accurately than randomly selected SNPs. In addition, SNPs common to the selected sets that captured the most prediction accuracy also captured the highest gene-based heritability. Notably, a small number of SNPs obtained from GWAS results can capture a large proportion of the variance and prediction accuracy.
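The workflow in this abstract (a genomic relationship matrix, gBLUP prediction, repeated tenfold cross-validation, and a top-fraction SNP subset) can be sketched as follows. This is a minimal illustration on simulated genotypes, not the authors' pipeline: the function names, the fixed heritability `h2`, the VanRaden-style relationship matrix, and the marginal-correlation ranking that stands in for GWAS significance are all assumptions.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from an (n x m) matrix of 0/1/2 allele counts."""
    p = M.mean(axis=0) / 2.0                      # allele frequency per SNP
    Z = M - 2.0 * p                               # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, h2=0.5):
    """Predict genetic values of `test` individuals from `train` phenotypes."""
    lam = (1.0 - h2) / h2                         # residual / genetic variance ratio
    mu = y[train].mean()
    K = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(K, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_accuracy(M, y, k=10, repeats=10, h2=0.5, seed=0):
    """Mean correlation of predicted and observed phenotypes over repeated k-fold CV."""
    rng = np.random.default_rng(seed)
    G = vanraden_grm(M)
    n = len(y)
    cors = []
    for _ in range(repeats):
        for test in np.array_split(rng.permutation(n), k):
            train = np.setdiff1d(np.arange(n), test)
            pred = gblup_predict(G, y, train, test, h2)
            cors.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(cors))

def top_fraction(M, y, frac):
    """Indices of the SNPs with the largest absolute marginal correlation with y."""
    Mc = M - M.mean(axis=0)
    yc = y - y.mean()
    score = np.abs(Mc.T @ yc) / (np.linalg.norm(Mc, axis=0) * np.linalg.norm(yc))
    k = max(1, int(round(frac * M.shape[1])))
    return np.sort(np.argsort(score)[-k:])

# Simulated toy data (hypothetical; not TCGS data)
rng = np.random.default_rng(1)
n, m = 300, 200
freq = rng.uniform(0.1, 0.9, m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)
g = M @ rng.normal(size=m)                        # every SNP has a small effect
y = g + rng.normal(scale=np.std(g) / 2.0, size=n) # heritability of about 0.8

acc_all = cv_accuracy(M, y, k=10, repeats=2, h2=0.8)
sel = top_fraction(M, y, 0.05)                    # top 5% of SNPs
acc_top = cv_accuracy(M[:, sel], y, k=10, repeats=2, h2=0.8)
print(f"all SNPs: {acc_all:.2f}, top 5%: {acc_top:.2f}")
```

One caveat of this sketch: ranking SNPs on the full data before cross-validation lets test phenotypes influence the selection, which inflates the accuracy of the selected subset. A stricter version would repeat the ranking inside each training fold.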


The Role of Different Linkage Disequilibrium Patterns in Genomic Prediction: The gBLUP Based Exploratory Method in Tehran Cardiometabolic Genetic Study

10.21203/rs.3.rs-127117/v1 ◽

2020 ◽

Author(s):

Mahdei Akbarzadeh ◽

Saeid Dehkordi ◽

Mahmoud Roudbar ◽

Parisa Riahi ◽

Mehdi Sargolzaei ◽

...

Keyword(s):

Linkage Disequilibrium ◽

Genetic Study ◽

Genomic Prediction ◽

Prediction Accuracy ◽

Public Health Care ◽

Genetic Prediction ◽

Heritability Estimation ◽

Dramatic Rise ◽

The Difference ◽

Family Based

Abstract Background: In recent decades, GWAS discoveries have led to new clinical applications such as whole-genome risk estimation. Genetic prediction of traits has substantial impacts on public health care and disease prevention. This study aimed to investigate the effects of different linkage disequilibrium (LD) patterns on genomic prediction accuracy and SNP-based heritability estimation for four lipid profile traits. Results: This family-based study included 11,798 individuals aged 3 to 80 years from the Tehran Cardiometabolic Genetic Study (TCGS). Subsets of SNPs were created at different LD thresholds (0.01, 0.03, 0.05, 0.07, 0.09, 0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, and 0.9). We compared the prediction accuracy and SNP-based heritability of the SNPs selected at each threshold with those of randomly selected SNP subsets of equal size. Subsets selected by LD pattern had higher prediction accuracy than randomly selected subsets, and the difference tended to zero as the LD threshold increased. The results were consistent across all traits when the prediction accuracy of each subset was adjusted for its number of SNPs. For all traits, after adjusting for the number of SNPs, both prediction accuracy and SNP-based heritability rose sharply between LD thresholds 0.01 and 0.2, peaked at a threshold between 0.2 and 0.3, and then declined steadily. Conclusions: This research indicated that SNP subsets selected by LD threshold always outperformed randomly selected SNPs for prediction. However, choosing a specific LD threshold for prediction may be controversial, because the threshold that gives the highest prediction accuracy depends on whether the number of SNPs is adjusted for (in our case, 0.3 when the SNP number was adjusted and 0.9 when it was not). Finally, we concluded that the LD threshold should be used with great care as a tool to boost genetic prediction accuracy.
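To make the idea of an LD threshold concrete, the following sketch keeps a SNP only if its squared correlation (r²) with every SNP already kept is at or below the threshold. It is an assumption-laden toy: the abstract does not state the pruning procedure used, and real tools usually work within sliding windows along a chromosome, whereas this hypothetical `ld_prune` compares all pairs.

```python
import numpy as np

def ld_prune(M, threshold):
    """Greedy pruning: keep SNP j only if r^2 with every kept SNP is <= threshold."""
    R2 = np.corrcoef(M, rowvar=False) ** 2        # pairwise r^2 between SNP columns
    kept = []
    for j in range(M.shape[1]):
        if all(R2[j, i] <= threshold for i in kept):
            kept.append(j)
    return np.array(kept)

# Toy genotypes (hypothetical): SNP 1 duplicates SNP 0; SNP 2 has r^2 = 0.25 with SNP 0
a = np.array([0, 1, 2, 0, 1, 2], dtype=float)
b = np.array([0, 0, 1, 1, 2, 2], dtype=float)
M_toy = np.column_stack([a, a, b])

kept_05 = ld_prune(M_toy, 0.5)   # drops the duplicate, keeps SNP 2
kept_02 = ld_prune(M_toy, 0.2)   # also drops SNP 2, because 0.25 > 0.2
print(kept_05, kept_02)
```

A lower threshold removes more SNPs, so subsets built at different thresholds differ in size as well as in LD structure. That is why the abstract reports results both with and without adjustment for the number of SNPs.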


Accounting for Group-Specific Allele Effects and Admixture in Genomic Predictions: Theory and Experimental Evaluation in Maize

Genetics ◽

10.1534/genetics.120.303278 ◽

2020 ◽

Vol 216 (1) ◽

pp. 27-41

Author(s):

Simon Rio ◽

Laurence Moreau ◽

Alain Charcosset ◽

Tristan Mary-Huard

Keyword(s):

Genomic Prediction ◽

Prediction Accuracy ◽

Prediction Models ◽

Best Linear Unbiased Prediction ◽

Linear Unbiased Prediction ◽

Modeling Group ◽

A Genome ◽

Specific Allele ◽

Best Linear Unbiased ◽

Unbiased Prediction

Populations structured into genetic groups may display group-specific linkage disequilibrium, mutations, and/or interactions between quantitative trait loci and the genetic background. These factors lead to heterogeneous marker effects affecting the efficiency of genomic prediction, especially for admixed individuals. Such individuals have a genome that is a mosaic of chromosome blocks from different origins, and may be of interest to combine favorable group-specific characteristics. We developed two genomic prediction models adapted to the prediction of admixed individuals in presence of heterogeneous marker effects: multigroup admixed genomic best linear unbiased prediction random individual (MAGBLUP-RI), modeling the ancestry of alleles; and multigroup admixed genomic best linear unbiased prediction random allele effect (MAGBLUP-RAE), modeling group-specific distributions of allele effects. MAGBLUP-RI can estimate the segregation variance generated by admixture while MAGBLUP-RAE can disentangle the variability that is due to main allele effects from the variability that is due to group-specific deviation allele effects. Both models were evaluated for their genomic prediction accuracy using a maize panel including lines from the Dent and Flint groups, along with admixed individuals. Based on simulated traits, both models proved their efficiency to improve genomic prediction accuracy compared to standard GBLUP models. For real traits, a clear gain was observed at low marker densities whereas it became limited at high marker densities. The interest of including admixed individuals in multigroup training sets was confirmed using simulated traits, but was variable using real traits. Both MAGBLUP models and admixed individuals are of interest whenever group-specific SNP allele effects exist.


Genomic studies with pre-selected markers reveal dominance effects influencing growth traits in Eucalyptus nitens

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab363 ◽

2021 ◽

Author(s):

Bala R Thumma ◽

Kelsey R Joyce ◽

Andrew Jacobs

Keyword(s):

Inbreeding Depression ◽

Prediction Accuracy ◽

Basic Density ◽

Predictive Ability ◽

Best Linear Unbiased Prediction ◽

The Other ◽

Eucalyptus Nitens ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased ◽

Unbiased Prediction

Abstract Genomic selection (GS) is being increasingly adopted by the tree breeding community. Most of the GS studies in trees are focused on estimating additive genetic effects. Exploiting the dominance effects offers additional opportunities to improve genetic gain. To detect dominance effects, trait-relevant markers may be more important than non-selected markers. Here we used pre-selected markers to study the dominance effects in a Eucalyptus nitens (E. nitens) breeding population consisting of open-pollinated (OP) and controlled-pollinated (CP) families. We used 8221 trees from six progeny trials in this study. Of these, 868 progeny and 255 parents were genotyped with the E. nitens marker panel. Three traits, diameter at breast height (DBH), wood basic density (DEN) and kraft pulp yield (KPY), were analysed. Two types of genomic relationship matrices based on identity-by-state (IBS) and identity-by-descent (IBD) were tested. Performance of the genomic best linear unbiased prediction (GBLUP) models with IBS and IBD matrices was compared with pedigree-based additive best linear unbiased prediction (ABLUP) models with and without the pedigree reconstruction. Similarly, the performance of the single-step GBLUP (ssGBLUP) with IBS and IBD matrices was compared with ABLUP models using all 8221 trees. Significant dominance effects were observed with the GBLUP-AD model for DBH. The predictive ability of DBH is higher with the GBLUP-AD model than with other models. Similarly, the prediction accuracy of genotypic values is higher with GBLUP-AD than with the GBLUP-A model. Between the two GBLUP models (IBS and IBD), no differences were observed in predictive abilities and prediction accuracies. While the estimates of predictive ability with additive effects were similar among all four models, prediction accuracies of ABLUP were lower than those of the GBLUP models. The prediction accuracy of ssGBLUP-IBD is higher than the other three models while the theoretical accuracy of ssGBLUP-IBS is consistently higher than the other three models across all three groups tested (parents, genotyped, non-genotyped). Significant inbreeding depression was observed for DBH and KPY. While there is a linear relationship between inbreeding and DBH, the relationship between inbreeding and KPY is non-linear and quadratic. These results indicate that the inbreeding depression of DBH is mainly due to directional dominance while in KPY it may be due to epistasis. Inbreeding depression may be the main source of the observed dominance effects in DBH. The significant dominance effect observed for DBH may be used to select complementary parents to improve the genetic merit of the progeny in E. nitens.


Core-dependent changes in genomic predictions using the Algorithm for Proven and Young in single-step genomic best linear unbiased prediction

Journal of Animal Science ◽

10.1093/jas/skaa374 ◽

2020 ◽

Vol 98 (12) ◽

Author(s):

Ignacy Misztal ◽

Shogo Tsuruta ◽

Ivan Pocrnic ◽

Daniela Lourenco

Keyword(s):

Prediction Accuracy ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Relationship Matrix ◽

Linear Unbiased Prediction ◽

Breeding Values ◽

Best Linear Unbiased ◽

The Impact ◽

Genomic Predictions ◽

Unbiased Prediction

Abstract Single-step genomic best linear unbiased prediction with the Algorithm for Proven and Young (APY) is a popular method for large-scale genomic evaluations. With the APY algorithm, animals are designated as core or noncore, and the computing resources to create the inverse of the genomic relationship matrix (GRM) are reduced by inverting only a portion of that matrix for core animals. However, using different core sets of the same size causes fluctuations in genomic estimated breeding values (GEBVs) up to one additive standard deviation without affecting prediction accuracy. About 2% of the variation in the GRM is noise. In the recursion formula for APY, the error term modeling the noise is different for every set of core animals, creating changes in breeding values. While average changes are small, and correlations between breeding values estimated with different core animals are close to 1.0, based on the normal distribution theory, outliers can be several times bigger than the average. Tests included commercial datasets from beef and dairy cattle and from pigs. Beyond a certain number of core animals, the prediction accuracy did not improve, but fluctuations decreased with more animals. Fluctuations were much smaller than the possible changes based on prediction error variance. GEBVs change over time even for animals with no new data because genomic relationships tie all the genotyped animals together, causing reranking of top animals. In contrast, changes in nongenomic models without new data are small. Also, GEBVs can change due to details in the model, such as redefinition of contemporary groups or unknown parent groups. In particular, increasing the fraction of blending of the GRM with a pedigree relationship matrix from 5% to 20% caused changes in GEBV up to 0.45 SD, with a correlation of GEBV > 0.99. Fluctuations in genomic predictions are part of genomic evaluation models and are also present without the APY algorithm when genomic evaluations are computed with updated data. The best approach to reduce the impact of fluctuations in genomic evaluations is to make selection decisions not on individual animals with limited individual accuracy but on groups of animals with high average accuracy.
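The core/noncore idea can be illustrated with a small sketch of the APY inverse in its commonly published block form: only the core block of the GRM is inverted directly, and each noncore animal contributes a single diagonal term obtained by regressing it on the core animals. The function name `apy_inverse` and the random toy GRM are hypothetical, and this dense NumPy version ignores the sparse storage that makes APY practical at scale.

```python
import numpy as np

def apy_inverse(G, core):
    """APY approximation of inv(G), inverting only the core-by-core block."""
    n = G.shape[0]
    core = np.asarray(core)
    non = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, non)]
    P = Gcc_inv @ Gcn                              # regression of noncore on core
    # Diagonal of the conditional variance of noncore animals given core animals
    m = np.diag(G)[non] - np.einsum("ij,ij->j", Gcn, P)
    Minv = 1.0 / m
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P * Minv) @ P.T
    Ginv[np.ix_(core, non)] = -P * Minv
    Ginv[np.ix_(non, core)] = (-P * Minv).T
    Ginv[non, non] = Minv                          # diagonal noncore block
    return Ginv

# Toy positive-definite "GRM" (hypothetical values)
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 10))
G = W @ W.T / 10.0 + 0.01 * np.eye(6)

exact = np.linalg.inv(G)
one_noncore = apy_inverse(G, core=np.arange(5))    # a single noncore animal
core_a = apy_inverse(G, core=np.array([0, 1, 2]))
core_b = apy_inverse(G, core=np.array([3, 4, 5]))
print(np.allclose(one_noncore, exact), np.allclose(core_a, core_b))
```

With a single noncore animal the diagonal term is the full conditional variance, so the result matches the exact inverse. With several noncore animals the off-diagonal conditional covariances are dropped, and two core sets of the same size give different approximations, which is the source of the fluctuations the abstract describes.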


Evaluation of Genome-Enabled Prediction for Carcass Primal Cut Yields Using Single-Step Genomic Best Linear Unbiased Prediction in Hanwoo Cattle

Genes ◽

10.3390/genes12121886 ◽

2021 ◽

Vol 12 (12) ◽

pp. 1886

Author(s):

Masoumeh Naserkheil ◽

Hossein Mehrban ◽

Deukmin Lee ◽

Mi Na Park

Keyword(s):

Prediction Accuracy ◽

Single Step ◽

Linear Regression Method ◽

Nucleotide Polymorphisms ◽

Linear Unbiased Prediction ◽

Cattle Industry ◽

A Value ◽

Starting Point ◽

Best Linear Unbiased ◽

Hanwoo Cattle

There is a growing interest worldwide in genetically selecting high-value cut carcass weights, which allows for increased profitability in the beef cattle industry. Primal cut yields have been proposed as a potential indicator of cutability and overall carcass merit, and it is worthwhile to assess the prediction accuracies of genomic selection for these traits. This study was performed to compare the prediction accuracy obtained from a conventional pedigree-based BLUP (PBLUP) and a single-step genomic BLUP (ssGBLUP) method for 10 primal cut traits—bottom round, brisket, chuck, flank, rib, shank, sirloin, striploin, tenderloin, and top round—in Hanwoo cattle with the estimators of the linear regression method. The dataset comprised 3467 phenotypic observations for the studied traits and 3745 genotyped individuals with 43,987 single-nucleotide polymorphisms. In the partial dataset, the accuracies ranged from 0.22 to 0.30 and from 0.37 to 0.54 as evaluated using the PBLUP and ssGBLUP models, respectively. The accuracies of PBLUP and ssGBLUP with the whole dataset varied from 0.45 to 0.75 (average 0.62) and from 0.52 to 0.83 (average 0.71), respectively. The results demonstrate that ssGBLUP performed better than PBLUP averaged over the 10 traits, in terms of prediction accuracy, regardless of considering a partial or whole dataset. Moreover, ssGBLUP generally showed less biased prediction and a value of dispersion closer to 1 than PBLUP across the studied traits. Thus, the ssGBLUP seems to be more suitable for improving the accuracy of predictions for primal cut yields, which can be considered a starting point in future genomic evaluation for these traits in Hanwoo breeding practice.
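The bias and dispersion statistics mentioned above compare breeding values estimated from a partial dataset with those from the whole dataset. The sketch below assumes the commonly used definitions of the linear regression (LR) method: bias is the difference in means (expected value 0), dispersion is the slope of whole-data on partial-data estimates (expected value 1), and the correlation between the two sets of estimates reflects their consistency. The function name and the toy values are hypothetical, and the accuracy estimator, which also needs the genetic variance and inbreeding coefficients, is left out.

```python
import numpy as np

def lr_statistics(u_partial, u_whole):
    """Bias, dispersion and correlation between partial- and whole-data estimates."""
    u_p = np.asarray(u_partial, dtype=float)
    u_w = np.asarray(u_whole, dtype=float)
    cov_wp = np.cov(u_w, u_p)[0, 1]                # sample covariance (ddof=1)
    return {
        "bias": u_p.mean() - u_w.mean(),           # expected 0 if unbiased
        "dispersion": cov_wp / np.var(u_p, ddof=1),  # slope of u_w on u_p, expected 1
        "correlation": np.corrcoef(u_p, u_w)[0, 1],
    }

# Toy estimates (hypothetical): whole-data values are an exact linear function
u_p = np.array([0.1, -0.4, 0.8, 0.3, -0.6])
u_w = 2.0 * u_p + 1.0
stats = lr_statistics(u_p, u_w)
print(stats)
```

In this toy case the slope of 2 means the partial-data estimates are under-dispersed: they vary only half as much as the whole-data estimates.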


Application of Genomic Data for Reliability Improvement of Pig Breeding Value Estimates

Animals ◽

10.3390/ani11061557 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1557

Author(s):

Ekaterina Melnikova ◽

Artem Kabanov ◽

Sergey Nikitin ◽

Maria Somova ◽

Sergey Kharitonov ◽

...

Keyword(s):

Growth Traits ◽

Genomic Data ◽

Snp Markers ◽

Single Step ◽

Backfat Thickness ◽

Breeding Value ◽

Genomic Evaluation ◽

Linear Unbiased Prediction ◽

Reproduction Traits ◽

Best Linear Unbiased

Genomic prediction for replacement pigs was tested for reproduction traits (total number of piglets and number of piglets born alive in the first parity) and for meat, fatness and growth traits (muscle depth, days to 100 kg and backfat thickness over 6–7 rib) using the single-step genomic best linear unbiased prediction (ssGBLUP) methodology. These traits were selected as the most economically significant and different in terms of heritability. The heritability for meat, fatness and growth traits varied from 0.17 to 0.39 and for reproduction traits from 0.12 to 0.14. We confirm from our data that ssGBLUP is the most appropriate method of genomic evaluation. The validation of genomic predictions was performed by calculating the correlation between preliminary GEBV (based on pedigree and genomic data only) and highly reliable conventional estimates (EBV) (based on pedigree, own phenotype and offspring records) of validating animals. Validation datasets include 151 and 110 individuals for reproduction, meat and fattening traits, respectively. The level of correlation (r) between EBV and GEBV scores varied from +0.44 to +0.55 for meat and fatness traits, and from +0.75 to +0.77 for reproduction traits. The average breeding value (EBV) of the group selected on the basis of genomic evaluation exceeded that of the group selected on parental average estimates by 22, 24 and 66% for muscle depth, days to 100 kg and backfat thickness over 6–7 rib, respectively. Prediction based on SNP marker data and parental estimates showed a significant increase in the reliability of low-heritability reproduction traits (about 40%), which is equivalent to including information about 10 additional descendants for sows and 20 additional descendants for boars in the evaluation dataset.


Genome prediction accuracy of common bean via Bayesian models

Ciência Rural ◽

10.1590/0103-8478cr20170497 ◽

2018 ◽

Vol 48 (8) ◽

Author(s):

Leiri Daiane Barili ◽

Naine Martins do Vale ◽

Fabyano Fonseca e Silva ◽

José Eustáquio de Souza Carneiro ◽

Hinayah Rojas de Oliveira ◽

...

Keyword(s):

Common Bean ◽

Prediction Accuracy ◽

Bayesian Models ◽

Snp Markers ◽

Genomic Information ◽

Stay Green ◽

Single Nucleotide ◽

Genome Prediction ◽

Heritability Estimation ◽

Selection Of

ABSTRACT: We aimed to apply genomic information based on SNP (single nucleotide polymorphism) markers for the genetic evaluation of the traits “stay-green” (SG), plant architecture (PA), grain aspect (GA) and grain yield (GY) in common bean through Bayesian models. These models were compared in terms of prediction accuracy and ability for heritability estimation for each of the mentioned traits. A total of 80 cultivars were genotyped for 377 SNP markers, whose effects were estimated by five different Bayesian models: Bayes A (BA), B (BB), C (BC), LASSO (BL) and Ridge regression (BRR). Although the prediction accuracies calculated by cross-validation were similar within each trait, the BB model stood out for the trait SG, whereas the BRR was indicated for the remaining traits. The heritability estimates for the traits SG, PA, GA and GY were 0.61, 0.28, 0.32 and 0.29, respectively. In summary, the Bayesian methods applied here were effective and easy to implement. The SNP markers used can help in the early selection of promising genotypes, since incorporating genomic information increases the prediction accuracy of the estimated genetic merit.


A Bayesian Genomic Multi-output Regressor Stacking Model for Predicting Multi-trait Multi-environment Plant Breeding Data

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400336 ◽

2019 ◽

Vol 9 (10) ◽

pp. 3381-3393 ◽

Cited By ~ 4

Author(s):

Osval A. Montesinos-López ◽

Abelardo Montesinos-López ◽

José Crossa ◽

Jaime Cuevas ◽

José C. Montesinos-López ◽

...

Keyword(s):

Ridge Regression ◽

Prediction Accuracy ◽

Best Linear Unbiased Prediction ◽

Environment Interaction ◽

Linear Unbiased Prediction ◽

Breeding Programs ◽

Second Stage ◽

Genotype Environment Interaction ◽

Best Linear Unbiased ◽

Two Stages

In this paper we propose a Bayesian multi-output regressor stacking (BMORS) model that is a generalization of the multi-trait regressor stacking method. The proposed BMORS model consists of two stages: in the first stage, a univariate genomic best linear unbiased prediction (GBLUP including genotype × environment interaction GE) model is implemented for each of the L traits under study; then the predictions of all traits are included as covariates in the second stage, by implementing a Ridge regression model. The main objectives of this research were to study alternative models to the existing multi-trait multi-environment (BMTME) model with respect to (1) genomic-enabled prediction accuracy, and (2) potential advantages in terms of computing resources and implementation. We compared the predictions of the BMORS model to those of the univariate GBLUP model using 7 maize and wheat datasets. We found that the proposed BMORS produced similar predictions to the univariate GBLUP model and to the BMTME model in terms of prediction accuracy; however, the best predictions were obtained under the BMTME model. In terms of computing resources, we found that the BMORS is at least 9 times faster than the BMTME method. Based on our empirical findings, the proposed BMORS model is an alternative for predicting multi-trait and multi-environment data, which are very common in genomic-enabled prediction in plant and animal breeding programs.
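The two-stage structure described above can be sketched in a few lines. This is only an outline of the stacking idea under simplifying assumptions: a closed-form ridge regression on markers stands in for the univariate GBLUP with genotype × environment interaction in the first stage, a non-Bayesian ridge is used in the second stage, and all names, penalties and simulated data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients with an unpenalised intercept."""
    xm, ym = X.mean(axis=0), y.mean()
    Xc = X - xm
    b = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - ym))
    return b, ym - xm @ b

def ridge_predict(X, coef):
    b, b0 = coef
    return X @ b + b0

def stacked_predict(X_tr, Y_tr, X_te, lam1=1.0, lam2=1.0):
    """Stage 1: one model per trait. Stage 2: ridge on the stage-1 predictions of all traits."""
    L = Y_tr.shape[1]
    stage1 = [ridge_fit(X_tr, Y_tr[:, t], lam1) for t in range(L)]
    P_tr = np.column_stack([ridge_predict(X_tr, c) for c in stage1])
    P_te = np.column_stack([ridge_predict(X_te, c) for c in stage1])
    stage2 = [ridge_fit(P_tr, Y_tr[:, t], lam2) for t in range(L)]
    return np.column_stack([ridge_predict(P_te, c) for c in stage2])

# Simulated markers and three correlated traits (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 30))
Y = X @ rng.normal(size=(30, 3)) + 0.1 * rng.normal(size=(60, 3))
Y_hat = stacked_predict(X[:40], Y[:40], X[40:])
acc = [np.corrcoef(Y_hat[:, t], Y[40:, t])[0, 1] for t in range(3)]
print(np.round(acc, 2))
```

Because each stage fits only univariate models, the cost grows roughly linearly with the number of traits, which is consistent with the speed advantage over the fully multivariate model reported in the abstract.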


Improving the accuracy of genomic evaluation for linear body measurement traits using single-step genomic best linear unbiased prediction in Hanwoo beef cattle

BMC Genetics ◽

10.1186/s12863-020-00928-1 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Masoumeh Naserkheil ◽

Deuk Hwan Lee ◽

Hossein Mehrban

Keyword(s):

Prediction Accuracy ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Farm Animals ◽

Body Measurement ◽

Data Set ◽

Linear Unbiased Prediction ◽

Body Measurement Traits ◽

Best Linear Unbiased ◽

Unbiased Prediction

Abstract Background: Recently, there has been a growing interest in the genetic improvement of body measurement traits in farm animals. They are widely used as predictors of performance, longevity, and production traits, and it is worthwhile to investigate the prediction accuracies of genomic selection for these traits. In genomic prediction, the single-step genomic best linear unbiased prediction (ssGBLUP) method allows the inclusion of information from genotyped and non-genotyped relatives in the analysis. Hence, we aimed to compare the prediction accuracy obtained from a pedigree-based BLUP only on genotyped animals (PBLUP-G), a traditional pedigree-based BLUP (PBLUP), a genomic BLUP (GBLUP), and a single-step genomic BLUP (ssGBLUP) method for the following 10 body measurement traits at yearling age of Hanwoo cattle: body height (BH), body length (BL), chest depth (CD), chest girth (CG), chest width (CW), hip height (HH), hip width (HW), rump length (RL), rump width (RW), and thurl width (TW). The data set comprised 13,067 phenotypic records for body measurement traits and 1523 genotyped animals with 34,460 single-nucleotide polymorphisms. The accuracy for each trait and model was estimated only for genotyped animals using five-fold cross-validation. Results: The accuracies ranged from 0.02 to 0.19, 0.22 to 0.42, 0.21 to 0.44, and 0.36 to 0.55 as assessed using the PBLUP-G, PBLUP, GBLUP, and ssGBLUP methods, respectively. The average predictive accuracies across traits were 0.13 for PBLUP-G, 0.34 for PBLUP, 0.33 for GBLUP, and 0.45 for ssGBLUP. Our results demonstrated that, averaged across all traits, ssGBLUP outperformed PBLUP and GBLUP by 33 and 43%, respectively, in terms of prediction accuracy. Moreover, the lowest root mean square error was obtained with the ssGBLUP method. Conclusions: Our findings suggest that considering the ssGBLUP model may be a promising way to ensure acceptable accuracy of predictions for body measurement traits, especially for improving the prediction accuracy of selection candidates in ongoing Hanwoo breeding programs.
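How ssGBLUP combines genotyped and non-genotyped relatives can be shown with the standard form of the inverse of the combined relationship matrix H: the inverse of the pedigree matrix A, plus the difference between the inverse genomic matrix and the inverse pedigree matrix among genotyped animals. The sketch below uses a hypothetical five-animal pedigree and a made-up genomic matrix. It does not reproduce the model of the study above, and the blending and scaling of G that are usual in practice are omitted.

```python
import numpy as np

def pedigree_A(sire, dam):
    """Numerator relationship matrix (tabular method); parents precede offspring, -1 = unknown."""
    n = len(sire)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sire[i], dam[i]
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

def h_inverse(A, G, genotyped):
    """inv(H) = inv(A) + (inv(G) - inv(A22)) added to the genotyped block."""
    Hinv = np.linalg.inv(A)
    idx = np.ix_(genotyped, genotyped)
    Hinv[idx] += np.linalg.inv(G) - np.linalg.inv(A[idx])
    return Hinv

# Hypothetical pedigree: animals 0 and 1 are founders, 2 and 3 are full sibs, 4 is their offspring
sire = [-1, -1, 0, 0, 2]
dam = [-1, -1, 1, 1, 3]
A = pedigree_A(sire, dam)

genotyped = [2, 3, 4]
A22 = A[np.ix_(genotyped, genotyped)]
G = A22 + 0.05 * np.eye(3)          # made-up genomic matrix for illustration only
Hinv = h_inverse(A, G, genotyped)
H = np.linalg.inv(Hinv)
print(np.round(H, 3))
```

In the resulting H, the block for genotyped animals equals G, while relationships involving non-genotyped animals are adjusted through the pedigree. This is how phenotypes of non-genotyped relatives contribute to the predictions of genotyped selection candidates.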


Validation of the Prediction Accuracy for 13 Traits in Chinese Simmental Beef Cattle Using a Preselected Low-Density SNP Panel

Animals ◽

10.3390/ani11071890 ◽

2021 ◽

Vol 11 (7) ◽

pp. 1890

Author(s):

Ling Xu ◽

Qunhao Niu ◽

Yan Chen ◽

Zezhao Wang ◽

Lei Xu ◽

...

Keyword(s):

Beef Cattle ◽

Prediction Accuracy ◽

Imputation Accuracy ◽

Predictive Performance ◽

Cost Effective ◽

Low Density ◽

Beef Industry ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased ◽

Snp Panel

Chinese Simmental beef cattle play a key role in the Chinese beef industry due to their great adaptability and marketability. To achieve efficient genetic gain at a low breeding cost, it is crucial to develop a customized cost-effective low-density SNP panel for this cattle population. Thirteen growth, carcass, and meat quality traits and a BovineHD Beadchip genotyping of 1346 individuals were used to select trait-associated variants and variants contributing to great genetic variance. In addition, highly informative SNPs with high MAF in each 500 kb sliding window and in each genic region were also included separately. A low-density SNP panel consisting of 30,684 SNPs was developed, with an imputation accuracy of 97.4% when imputed to the 770 K level. Among 13 traits, the average prediction accuracy levels evaluated by genomic best linear unbiased prediction (GBLUP) and BayesA/B/Cπ were 0.22–0.47 and 0.18–0.60 for the ~30 K array and BovineHD Beadchip, respectively. Generally, the predictive performance of the ~30 K array was trait-dependent, with reduced prediction accuracies for seven traits. While differences in terms of prediction accuracy were observed among the 13 traits, the low-density SNP panel achieved moderate to high accuracies for most of the traits and even improved the accuracies for some traits.
