Improving accuracy of genomic prediction by genetic architecture based priors in a Bayesian model

Abstract: Transcriptome-wide association studies (TWAS), which test for association between the study trait and gene expression levels imputed from cis-acting expression quantitative trait loci (cis-eQTL) genotypes, have successfully enhanced the discovery of genetic risk loci for complex traits. By using gene expression imputation models fitted on reference datasets that contain both genetic and transcriptomic data, TWAS facilitates gene-based tests with GWAS data while accounting for the reference transcriptomic data. Existing TWAS tools such as PrediXcan and FUSION use parametric imputation models, which are limited in their ability to model the complex genetic architecture of transcriptomic data. Therefore, we propose an improved Bayesian method that assumes a data-driven nonparametric prior to impute gene expression. Our method is general and flexible, and includes the parametric imputation models used by PrediXcan and FUSION as special cases. Our simulation studies showed that the nonparametric Bayesian model improved both the imputation R² for transcriptomic data and the TWAS power over PrediXcan. In real applications, our nonparametric Bayesian method fitted transcriptomic imputation models for twice as many genes as PrediXcan, with 1.7 times the average regression R², thus improving the power of follow-up TWAS. Hence, the nonparametric Bayesian model is preferred for modeling the complex genetic architecture of transcriptomes and is expected to enhance transcriptome-integrated genetic association studies. We implement our Bayesian approach in a convenient software tool, “TIGAR” (Transcriptome-Integrated Genetic Association Resource), which imputes transcriptomic data and performs subsequent TWAS using individual-level or summary-level GWAS data.
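
The two-stage TWAS workflow described in this abstract can be illustrated with a minimal sketch. It is not TIGAR's implementation: a plain ridge regression stands in for the imputation model (neither the nonparametric Bayesian prior nor PrediXcan's model is reproduced here), the data are simulated, and all function names, effect sizes, and sample sizes are assumptions made for illustration only.

```python
import numpy as np

def fit_ridge_weights(genotypes, expression, penalty=1.0):
    """Stage 1: estimate cis-eQTL weights in the reference panel.

    Ridge regression is a stand-in here (assumption), not the
    nonparametric Bayesian model proposed in the abstract.
    """
    X = genotypes - genotypes.mean(axis=0)
    y = expression - expression.mean()
    n_snps = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_snps), X.T @ y)

def twas_z_score(gwas_genotypes, trait, weights):
    """Stage 2: impute expression in the GWAS cohort, then test it
    against the trait with a simple linear regression."""
    imputed = (gwas_genotypes - gwas_genotypes.mean(axis=0)) @ weights
    g = imputed - imputed.mean()
    y = trait - trait.mean()
    beta = (g @ y) / (g @ g)
    residuals = y - beta * g
    se = np.sqrt((residuals @ residuals) / (len(y) - 2) / (g @ g))
    return beta / se

# Simulated data (hypothetical effect sizes and sample sizes).
rng = np.random.default_rng(0)
true_w = np.array([0.8, 0.0, -0.5, 0.0, 0.3])

# Reference panel: genotypes plus measured expression for one gene.
ref_geno = rng.binomial(2, 0.3, size=(500, 5)).astype(float)
ref_expr = ref_geno @ true_w + rng.normal(0, 0.5, 500)

# GWAS cohort: genotypes plus trait, with no expression measured.
gwas_geno = rng.binomial(2, 0.3, size=(2000, 5)).astype(float)
gwas_trait = 0.4 * (gwas_geno @ true_w) + rng.normal(0, 1.0, 2000)

weights = fit_ridge_weights(ref_geno, ref_expr)
z = twas_z_score(gwas_geno, gwas_trait, weights)
print(f"TWAS z-score for the simulated gene: {z:.2f}")
```

In this sketch the reference panel supplies the weights and the GWAS cohort supplies only genotypes and the trait, which mirrors the division of roles the abstract describes.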

Genetic architecture of maize chlorotic mottle virus and maize lethal necrosis through GWAS, linkage analysis and genomic prediction in tropical maize germplasm

Theoretical and Applied Genetics ◽

10.1007/s00122-019-03360-x ◽

2019 ◽

Vol 132 (8) ◽

pp. 2381-2399 ◽

Cited By ~ 14

Author(s):

Chelang’at Sitonik ◽

L. M. Suresh ◽

Yoseph Beyene ◽

Michael S. Olsen ◽

Dan Makumbi ◽

...

Keyword(s):

Linkage Analysis ◽

Genomic Prediction ◽

Genetic Architecture ◽

Mottle Virus ◽

Tropical Maize ◽

Maize Germplasm ◽

Chlorotic Mottle Virus ◽

Maize Chlorotic Mottle Virus ◽

Chlorotic Mottle

Quantitative Genomic Dissection of Soybean Yield Components

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400896 ◽

2019 ◽

Vol 10 (2) ◽

pp. 665-675

Author(s):

Alencar Xavier ◽

Katy M. Rainey

Keyword(s):

Genomic Prediction ◽

Yield Components ◽

Genetic Architecture ◽

Breeding Strategy ◽

Genotype By Environment ◽

Field Crops ◽

Common Parent ◽

Data Collections ◽

Enhanced Yield ◽

Selection Of

Soybean is a crop of major economic importance with low rates of genetic gain for grain yield compared to other field crops. A deeper understanding of the genetic architecture of yield components may enable better ways to tackle the breeding challenges. Key yield components include the total number of pods, the total number of nodes, and the ratio of pods per node. We evaluated the SoyNAM population, containing approximately 5600 lines from 40 biparental families that share a common parent, in 6 environments distributed across 3 years. The study indicates that the yield components under evaluation have low heritability, a reasonable amount of epistatic control, and a partially oligogenic architecture: 18 quantitative trait loci were identified across the three yield components using multi-approach signal detection. The genetic correlation between yield and yield components was highly variable from family to family, ranging from -0.2 to 0.5. The genotype-by-environment correlation of yield components ranged from -0.1 to 0.4 within families. The number of pods can be utilized for indirect selection of yield. The selection of soybean for enhanced yield components can be successfully performed via genomic prediction, but the challenging data collections necessary to recalibrate models over time make the introgression of QTL a potentially more feasible breeding strategy. The genomic prediction of yield components was relatively accurate across families, but less accurate predictions were obtained within families and when predicting families not included in the calibration set.

Accounting for genetic architecture in single- and multipopulation genomic prediction using weights from genomewide association studies in pigs

Journal of Animal Breeding and Genetics ◽

10.1111/jbg.12202 ◽

2016 ◽

Vol 133 (3) ◽

pp. 187-196 ◽

Cited By ~ 4

Author(s):

R. Veroneze ◽

P.S. Lopes ◽

M.S. Lopes ◽

A.M. Hidalgo ◽

S.E.F. Guimarães ◽

...

Keyword(s):

Genomic Prediction ◽

Genetic Architecture ◽

Association Studies ◽

Genomewide Association

Genetic architecture and genomic prediction accuracy of apple quantitative traits across environments

10.1101/2021.11.29.470309 ◽

2021 ◽

Author(s):

Michaela Jung ◽

Beat Keller ◽

Morgane Roth ◽

Maria Jose Aranzana ◽

Annemarie Auwerkerken ◽

...

Keyword(s):

Genomic Prediction ◽

Prediction Accuracy ◽

Genetic Architecture ◽

Quantitative Traits ◽

Prediction Models ◽

Phenotypic Variability ◽

Reference Population ◽

Genomic Study ◽

Genomic Tools ◽

Breeding Efficiency

Implementation of genomic tools is desirable to increase the efficiency of apple breeding. The apple reference population (apple REFPOP) proved useful for rediscovering loci, estimating genomic prediction accuracy, and studying genotype by environment interactions (GxE). Here we show contrasting genetic architecture and genomic prediction accuracies for 30 quantitative traits across up to six European locations using the apple REFPOP. A total of 59 stable and 277 location-specific associations were found using GWAS, 69.2% of which are novel when compared with 41 reviewed publications. Average genomic prediction accuracies of 0.18-0.88 were estimated using single-environment univariate, single-environment multivariate, multi-environment univariate, and multi-environment multivariate models. The GxE accounted for up to 24% of the phenotypic variability. This most comprehensive genomic study in apple in terms of trait-environment combinations provided knowledge of trait biology and prediction models that can be readily applied for marker-assisted or genomic selection, thus facilitating increased breeding efficiency.

Improving Accuracy of Genomic Prediction in Holstein Friesians

10.31274/ans_air-180814-717 ◽

2013 ◽

Author(s):

Melanie K. Hayr ◽

Mahdi Saatchi ◽

Dave Johnson ◽

Dorian J. Garrick

Keyword(s):

Genomic Prediction ◽

Improving Accuracy

Evaluation of a Computational Diagnostic for Epistasis in Plant Breeding Populations

10.1101/044453 ◽

2016 ◽

Cited By ~ 1

Author(s):

Reka Howard ◽

William D Beavis ◽

Alicia Carriquiry

Keyword(s):

Plant Breeding ◽

Genomic Prediction ◽

Genetic Architecture ◽

Systematic Approach ◽

Linear Models ◽

Prediction Methods ◽

Response Surface Methods ◽

Breeding Populations ◽

The Difference ◽

Non Parametric

Abstract: Previously we reported the inability of genomic prediction methods based on linear models to accurately predict trait values arising from an epistatic genetic architecture. We also reported that non-parametric genomic prediction methods applied to the same data produced reasonably accurate predictions. The difference led us to propose that analyzing the same data with paired parametric and non-parametric methods could be used as a diagnostic for epistatic genetic architectures in typical plant breeding populations. The suggested computational diagnostic was based on an evaluation of 14 genomic prediction methods applied to eight sets of simulated conditions consisting of three factors, each with two levels. Because the potential set of factors that might affect the accuracy of genomic predictions is unknown, there is a need for a systematic approach to identify combinations of factors that impact estimates of accuracy. Herein we propose the application of response surface methods to systematically identify conditions that maximize the difference between the estimated accuracies of genomic prediction methods. The results indicate that genetic architecture and repeatability, at their upper boundaries of complete epistasis and maximum repeatability, have the greatest influence on the differences between parametric and non-parametric estimated prediction accuracies. Further, the surface is very steep in the vicinity of the boundary conditions, indicating that the proposed diagnostic is of limited value for the discovery of epistatic genetic architectures.
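
The "eight sets of simulated conditions consisting of three factors, each with two levels" form a 2 × 2 × 2 full factorial design. A small sketch of how such a grid can be enumerated follows. The abstract names genetic architecture and repeatability as factors but does not name the third factor or any of the levels, so the third factor and every level label below are hypothetical placeholders.

```python
from itertools import product

# Two factors are named in the abstract; the third factor and all
# level labels are hypothetical placeholders (assumptions).
factors = {
    "genetic_architecture": ("additive", "epistatic"),
    "repeatability": ("low", "high"),
    "third_factor": ("level_1", "level_2"),
}

# Full factorial design: every combination of one level per factor.
conditions = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

for index, condition in enumerate(conditions, start=1):
    print(index, condition)
```

With three two-level factors the grid has 2³ = 8 cells, which matches the eight simulated conditions. A response surface approach, as proposed in the abstract, would instead explore intermediate factor values rather than only these corner points.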
