Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations

Abstract

Background: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has increased only slightly when such information was used in dairy cattle studies, in which one or a few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction when using imputed WGS genotypes in a multi-breed Australian sheep population in which target individuals are less closely related.

Methods: Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits, and all had imputed WGS genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) based on a genome-wide association study (GWAS), and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, with both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep.

Results: A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants.

Conclusions: Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50k genotypes.
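The 2GBLUP idea in the abstract (50k genotypes and selected sequence variants fitted as two separate variance components) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vanraden_grm` and `two_gblup_predict` are hypothetical, the data are simulated, and the variance components are simply assumed known, whereas in practice they would be estimated (for example by REML).

```python
import numpy as np


def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (animals x SNPs)."""
    p = genotypes.mean(axis=0) / 2.0           # allele frequency per SNP
    z = genotypes - 2.0 * p                    # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def two_gblup_predict(y_train, g_array, g_selected, train, valid,
                      var_array, var_selected, var_resid):
    """Predict genetic values of `valid` animals from phenotypes of `train` animals.

    Model: y = mu + g_array_effect + g_selected_effect + e, with the variance
    components passed in as known values (an assumption of this sketch).
    """
    # Combined genetic covariance from the two relationship matrices
    k = var_array * g_array + var_selected * g_selected
    # Phenotypic covariance among training animals
    v = k[np.ix_(train, train)] + var_resid * np.eye(len(train))
    ones = np.ones(len(train))
    # Generalised least squares estimate of the overall mean
    mu = (ones @ np.linalg.solve(v, y_train)) / (ones @ np.linalg.solve(v, ones))
    # BLUP of validation animals: Cov(valid, train) V^-1 (y - mu)
    return k[np.ix_(valid, train)] @ np.linalg.solve(v, y_train - mu)


# Toy data, simulated for illustration only (not from the study)
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(60, 500)).astype(float)
array_snps, selected_snps = genotypes[:, :400], genotypes[:, 400:]
effects = rng.normal(0.0, 0.3, size=selected_snps.shape[1])
y = selected_snps @ effects + rng.normal(0.0, 1.0, size=60)

g_array = vanraden_grm(array_snps)
g_selected = vanraden_grm(selected_snps)
train, valid = np.arange(40), np.arange(40, 60)
pred = two_gblup_predict(y[train], g_array, g_selected, train, valid,
                         var_array=0.5, var_selected=0.5, var_resid=1.0)

# Correlation between predictions and phenotypes in the validation animals
accuracy = np.corrcoef(pred, y[valid])[0, 1]
print(pred.shape, round(accuracy, 3))
```

Giving the selected variants their own relationship matrix lets them explain a different share of the genetic variance per SNP than the array markers, which is the contrast the abstract draws with treating both sets in a single category.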


Using imputed whole-genome sequence data to improve the accuracy of genomic prediction for parasite resistance in Australian sheep

Genetics Selection Evolution, Vol 51 (1), 2019. DOI: 10.1186/s12711-019-0476-4. Cited By ~ 6

Author(s): Mohammad Al Kalaldeh, John Gibson, Naomi Duijvesteijn, Hans D. Daetwyler, Iona MacLeod, ...

Keyword(s): Genome Sequence, Genomic Prediction, Sequence Data, Whole Genome Sequence, Whole Genome, Parasite Resistance, Genome Sequence Data, Australian Sheep


Utility of whole-genome sequence data for across-breed genomic prediction

Genetics Selection Evolution, Vol 50 (1), 2018. DOI: 10.1186/s12711-018-0396-8. Cited By ~ 21

Author(s): Biaty Raymond, Aniek C. Bouwman, Chris Schrooten, Jeanine Houwing-Duistermaat, Roel F. Veerkamp

Keyword(s): Genome Sequence, Genomic Prediction, Sequence Data, Whole Genome Sequence, Whole Genome, Genome Sequence Data


Accuracy of genomic prediction using imputed whole-genome sequence data in white layers

Journal of Animal Breeding and Genetics, Vol 133 (3), pp. 167-179, 2016. DOI: 10.1111/jbg.12199. Cited By ~ 24

Author(s): M. Heidaritabar, M.P.L. Calus, H-J. Megens, A. Vereijken, M.A.M. Groenen, ...

Keyword(s): Genome Sequence, Genomic Prediction, Sequence Data, Whole Genome Sequence, Whole Genome, Genome Sequence Data


A multi-trait meta-analysis with imputed sequence variants reveals twelve QTL for mammary gland morphology in Fleckvieh cattle

2015. DOI: 10.1101/030981

Author(s): Hubert Pausch, Reiner Emmerling, Hermann Schwarzenbacher, Ruedi Fries

Keyword(s): Mammary Gland, Association Analysis, Genome Sequence, Sequence Data, Association Studies, Whole Genome Sequence, Sequence Variants, Whole Genome, Genome Sequence Data, Mammary Gland Morphology

Background: The availability of whole-genome sequence data from key ancestors provides an exhaustive catalogue of polymorphic sites segregating within and across cattle breeds. Sequence variants from key ancestors can be imputed in animals that have been genotyped using medium- and high-density genotyping arrays. Association analysis with imputed sequences, particularly if applied to multiple traits simultaneously, is a very powerful approach to revealing candidate causal variants underlying complex phenotypes.

Results: We used whole-genome sequence data from 157 key ancestors of the German Fleckvieh population to impute 20 561 798 sequence variants in 10 363 animals that had (partly imputed) array-derived genotypes at 634 109 SNP. The imputed sequence data were enriched for rare variants. Association studies with imputed sequence variants were performed using seven correlated udder conformation traits as response variables. The calculation of an approximate multi-trait test statistic enabled us to detect twelve major QTL (P < 2.97 × 10⁻⁹) controlling different aspects of mammary gland morphology. Imputed sequence variants were the most significantly associated at eleven QTL, whereas the top association signal at a QTL on BTA14 resulted from an array-derived variant. Seven QTL were associated with multiple phenotypes. Most QTL were located in non-coding regions of the genome, although close to plausible candidate genes for mammary gland morphology (SP5, GC, NPFFR2, CRIM1, RXFP2, TBX5, RBM19, ADAM12).

Conclusions: Association analysis with imputed sequence variants allows QTL characterization at maximum resolution. Multi-trait approaches can reveal QTL that are not detected in single-trait association studies. Most QTL for udder conformation traits were located in non-coding elements of the genome, suggesting that regulatory mutations are the major determinants of variation in mammary gland morphology in cattle.
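The abstract does not spell out its approximate multi-trait test statistic. As a hedged illustration only, the sketch below shows one widely used approximation: for each variant, the signed single-trait t-values are combined in a quadratic form weighted by the inverse of their correlation matrix across traits, and the result is compared against a chi-square distribution with as many degrees of freedom as traits. Whether this matches the authors' statistic is an assumption; the function name `multi_trait_statistic` is hypothetical and the t-values are simulated.

```python
import numpy as np


def multi_trait_statistic(t_values):
    """Combine single-trait t-values into one statistic per variant.

    t_values: (n_variants, n_traits) array of signed single-trait test statistics.
    Returns an (n_variants,) array of t_i' V^-1 t_i, where V is the correlation
    matrix of the t-values across traits, estimated over all variants.
    """
    v = np.corrcoef(t_values, rowvar=False)    # trait-by-trait correlation matrix
    v_inv = np.linalg.inv(v)
    # Quadratic form t_i' V^-1 t_i evaluated for every variant i
    return np.einsum("ij,jk,ik->i", t_values, v_inv, t_values)


# Simulated t-values for 1000 variants and seven traits (illustration only)
rng = np.random.default_rng(1)
t_values = rng.normal(size=(1000, 7))
stat = multi_trait_statistic(t_values)
print(stat.shape)
```

Because the statistic pools evidence across correlated traits, a variant with moderate effects on several traits can reach significance even when no single-trait test does, which is consistent with the abstract's remark that multi-trait approaches reveal QTL missed by single-trait studies.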
