41 Using Sequence Data to Increase Accuracy of Genomic Predictions in Livestock: Are We There Yet?
Abstract As sequence data become available for many livestock species, the question arises whether this information can boost the accuracy of genomic predictions beyond what has already been achieved with SNP chips. Our group has conducted several studies using simulated and real livestock populations that included 1,000 to 100,000 animals with full or imputed sequence information. For the real datasets, potential causative variants were identified by genome-wide association (GWA) analysis and added to the current SNP chips. Additional scenarios included the use of only causative variants and the use of all sequence SNP. Genomic predictions were obtained with single-step GBLUP (ssGBLUP) and, in some cases, Bayesian regressions. Overall, in real datasets, we observed no significant increase in accuracy from using all sequence SNP, causative variants alone, or causative variants combined with the SNP currently used for genomic prediction. However, an increase in accuracy of almost 100% was observed in simulated datasets when the causative variants were added to a 60k SNP panel and their simulated variances were accounted for by the prediction model. Our results show that if true causative variants are identified, together with their positions and the variance they explain, a boost in accuracy can be observed. This raises a question about the effectiveness of the methods and the size of the datasets used to select causative variants in real livestock populations. We observed that distinct GWA methods perform differently depending on the data structure and the number of genotyped animals with phenotypes. Combining large-scale sequence data with other layers of omics data (e.g., functional data) can help to identify some of the true causative variants, which could in turn increase the accuracy of genomic predictions in real populations.
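One common way for a GBLUP-type model to account for the variance explained by individual variants is to weight SNP when building the genomic relationship matrix. The minimal sketch below illustrates that idea only; it is not the procedure used in the studies summarized above. The function name `weighted_grm`, the toy genotypes, and the weight of 5 on the third SNP are all hypothetical, and the scaling by the weighted sum of 2pq is one of several conventions in use.

```python
def weighted_grm(genotypes, weights):
    """Weighted genomic relationship matrix, G = Z D Z' / sum(2 p q d).

    genotypes: list of per-animal lists of allele counts (0, 1, or 2).
    weights:   one non-negative weight per SNP (all 1.0 gives the unweighted G).
    Assumption: allele frequencies are estimated from the genotyped animals.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_animals)
             for j in range(n_snps)]
    # Center genotypes by twice the allele frequency: Z = M - 2p.
    z = [[row[j] - 2.0 * freqs[j] for j in range(n_snps)] for row in genotypes]
    # Scale so that the weighted expected diagonal stays near 1.
    scale = sum(2.0 * p * (1.0 - p) * w for p, w in zip(freqs, weights))
    return [[sum(z[a][j] * weights[j] * z[b][j] for j in range(n_snps)) / scale
             for b in range(n_animals)]
            for a in range(n_animals)]


# Toy data (hypothetical): 4 animals, 4 SNP; the third SNP is treated as a
# presumed causative variant and given a larger weight.
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
g_plain = weighted_grm(genotypes, [1.0, 1.0, 1.0, 1.0])
g_weighted = weighted_grm(genotypes, [1.0, 1.0, 5.0, 1.0])
```

With all weights equal to 1, the function reduces to the usual unweighted relationship matrix. Upweighting a variant makes animals that share or differ at that locus look more or less related, which is how an accurate estimate of a causative variant's variance could change predictions, and why poorly chosen weights may not help.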