Impact of Marker Pruning Strategies Based on Different Measurements of Marker Distance on Genomic Prediction in Dairy Cattle

Animals ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1992
Author(s):  
Duanyang Ren ◽  
Jinyan Teng ◽  
Shuqi Diao ◽  
Qing Lin ◽  
Jiaqi Li ◽  
...  

With the availability of high-density single-nucleotide polymorphism (SNP) data and the development of genotype imputation methods, high-density panel-based genomic prediction (GP) has become possible in livestock breeding. It is generally assumed that the accuracy of genomic estimated breeding values (GEBVs) increases with marker density, yet studies have shown that GEBV accuracy does not increase, or even decreases, when high-density panels are used. Therefore, in addition to the SNP number, other measurements of ‘marker density’ seem to affect GEBV accuracy, and exploring the relationship between GEBV accuracy and these measurements based on high-density SNP or whole-genome sequence data is important for the field of GP. In this study, we constructed SNP panels of fixed SNP number (e.g., 1 k) by pruning on the physical distance (PhyD), genetic distance (GenD) or random distance (RanD) between SNPs, using the high-density SNP data of a German Holstein dairy cattle population. There are thus three different panels at each SNP number level. These panels were used to construct GP models to predict fat percentage, milk yield and somatic cell score. The mean ($\bar{d}$) and variance ($\sigma_d^2$) of the physical distance between SNPs and the mean ($\overline{r^2}$) and variance ($\sigma_{r^2}^2$) of the genetic distance between SNPs in each panel were used as marker density-related measurements, and their influence on GEBV accuracy was investigated. At the same SNP number level, $\bar{d}$ is essentially the same for all panels, but $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ differ. Therefore, we only investigated the effects of $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ on GEBV accuracy. The results showed that, at a given SNP number level, GEBV accuracy was negatively correlated with $\sigma_d^2$, but not with $\overline{r^2}$ or $\sigma_{r^2}^2$. Panels constructed by PhyD have a smaller $\sigma_d^2$ than those constructed by GenD or RanD. The low and moderate-density panels (< 50 k) constructed by RanD or GenD have a large $\sigma_d^2$, which is not conducive to genomic prediction. The GEBV accuracy of the low and moderate-density panels constructed by PhyD is 3.8–34.8% higher than that of the low and moderate-density panels constructed by RanD and GenD. Panels with 20–30 k SNPs constructed by PhyD can achieve the same or slightly higher GEBV accuracy than high-density SNP panels for all three traits. In summary, the smaller the variation in physical distance between adjacent SNPs, the higher the GEBV accuracy. Low and moderate-density panels constructed by physical distance are beneficial to genomic prediction, while pruning high-density SNP data based on genetic distance is detrimental to it. The results provide suggestions for the development of SNP panels and for research on genomic prediction based on whole-genome sequence data.
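The contrast between physical-distance and random pruning can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: the abstract does not describe how the PhyD panels were built, so the function `prune_by_physical_distance` (which keeps the SNP nearest to each point of an evenly spaced grid) is a hypothetical stand-in, and the SNP positions are simulated. It computes the mean and variance of the distance between adjacent SNPs for a 1 k panel drawn each way.

```python
import numpy as np

def prune_by_physical_distance(positions, n_keep):
    """Hypothetical PhyD-style pruning: keep the SNP nearest to each point
    of an evenly spaced grid. May return fewer than n_keep SNPs if two grid
    points share the same nearest SNP."""
    positions = np.sort(np.asarray(positions))
    targets = np.linspace(positions[0], positions[-1], n_keep)
    right = np.clip(np.searchsorted(positions, targets), 0, len(positions) - 1)
    left = np.clip(right - 1, 0, len(positions) - 1)
    # Pick whichever neighbour (left or right) is closer to the grid point.
    use_left = np.abs(positions[left] - targets) < np.abs(positions[right] - targets)
    chosen = np.where(use_left, left, right)
    return positions[np.unique(chosen)]

def gap_stats(positions):
    """Mean and variance of the distance between adjacent SNPs."""
    gaps = np.diff(np.sort(np.asarray(positions)))
    return gaps.mean(), gaps.var()

# Simulated high-density panel: about 50,000 SNPs on a 100 Mb chromosome (assumed values).
rng = np.random.default_rng(0)
hd_positions = np.unique(rng.integers(0, 100_000_000, size=50_000))

phys_panel = prune_by_physical_distance(hd_positions, 1000)
rand_panel = rng.choice(hd_positions, size=1000, replace=False)

phys_mean, phys_var = gap_stats(phys_panel)
rand_mean, rand_var = gap_stats(rand_panel)
print(f"PhyD panel: mean gap {phys_mean:.0f} bp, variance {phys_var:.3e}")
print(f"RanD panel: mean gap {rand_mean:.0f} bp, variance {rand_var:.3e}")
```

Under these assumptions the two panels have nearly the same mean gap, because the SNP number and chromosome length are fixed, while the variance of the gaps is much larger for the random panel. This mirrors the abstract's observation that the mean distance is essentially constant at a given SNP number and only the variance differs between pruning strategies.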

2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Nasir Moghaddar ◽  
Majid Khansefid ◽  
Julius H. J. van der Werf ◽  
Sunduimijid Bolormaa ◽  
Naomi Duijvesteijn ◽  
...  

Abstract. Background: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or a few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population in which target individuals were relatively distantly related, when using information on imputed WGS genotypes. Methods: Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits, and all had WGS imputed genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) based on a genome-wide association study (GWAS), and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, with both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep. Results: A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants. Conclusions: Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50k genotypes.
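Fitting selected sequence variants "as a separate variance component" implies two genomic relationship matrices, one per SNP set. The sketch below builds both from simulated genotypes. It assumes VanRaden's first method with observed allele frequencies, which the abstract does not specify; the function name `vanraden_grm`, the sample sizes and the allele frequencies are all illustrative, and the mixed-model fit itself is not shown.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Genomic relationship matrix, G = ZZ' / (2 * sum(p * (1 - p))).

    genotypes: (n_animals, n_snps) array coded 0/1/2.
    Assumes VanRaden's first method with allele frequencies taken
    from the data; this is an assumption, not stated in the abstract.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0          # observed allele frequency per SNP
    Z = M - 2.0 * p                   # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulated genotypes (illustrative sizes only).
rng = np.random.default_rng(1)
n_animals = 20
array_geno = rng.binomial(2, 0.3, size=(n_animals, 500))     # stand-in for the 50k array
selected_geno = rng.binomial(2, 0.2, size=(n_animals, 40))   # stand-in for GWAS-selected variants

G_array = vanraden_grm(array_geno)
G_selected = vanraden_grm(selected_geno)
print(G_array.shape, G_selected.shape)
```

In a two-component model of this kind, each matrix would receive its own variance parameter, so the selected variants are not forced to share the per-SNP variance of the array markers.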


2018 ◽  
Vol 50 (1) ◽  
Author(s):  
Biaty Raymond ◽  
Aniek C. Bouwman ◽  
Chris Schrooten ◽  
Jeanine Houwing-Duistermaat ◽  
Roel F. Veerkamp

2016 ◽  
Vol 133 (3) ◽  
pp. 167-179 ◽  
Author(s):  
M. Heidaritabar ◽  
M.P.L. Calus ◽  
H-J. Megens ◽  
A. Vereijken ◽  
M.A.M. Groenen ◽  
...  

2019 ◽  
Vol 136 (6) ◽  
pp. 418-429 ◽  
Author(s):  
Sanne den Berg ◽  
Jérémie Vandenplas ◽  
Fred A. Eeuwijk ◽  
Marcos S. Lopes ◽  
Roel F. Veerkamp

Author(s):  
Jana Wold ◽  
Stephanie Galla ◽  
David Eccles ◽  
Carolyn J. Hogg ◽  
Klaus-Peter Koepfli ◽  
...  

Increased capability in the conservation genomics community, combined with decreased sequencing costs, is providing new opportunities for the application of whole-genome sequence data to enhance species recovery. Indeed, assessments of genome-wide diversity based on SNP data are already informing the conservation management of threatened species around the world. However, SNP data alone may not sufficiently capture all of the information necessary for the effective conservation management of critically endangered species that suffer from severe inbreeding depression. For threatened species that require significant intervention, it is critical that we as conservation genomicists expand our repertoire to include assessments of functional diversity. Structural variants are a likely source of functional diversity, as their frequency and genomic context affect the dosage and regulation of gene expression through mechanisms that alter genome organization and impact fitness. In this future-focused Opinion, we leverage the existing literature, predominantly focused on model and agricultural species, to identify pan-genomic and chromosomic approaches for readily characterizing structural variants and to consider how integrating these into the conservation genomics toolbox will transform the way we manage some of the world’s most threatened species.


2020 ◽  
Vol 10 (8) ◽  
pp. 2777-2785 ◽  
Author(s):  
Carolina Peñaloza ◽  
Diego Robledo ◽  
Agustin Barría ◽  
Trọng Quốc Trịnh ◽  
Mahirah Mahmuddin ◽  
...  

Tilapia are among the most important farmed fish species worldwide, and are fundamental for the food security of many developing countries. Several genetically improved Nile tilapia (Oreochromis niloticus) strains exist, such as the iconic Genetically Improved Farmed Tilapia (GIFT), and breeding programs typically follow classical pedigree-based selection. The use of genome-wide single-nucleotide polymorphism (SNP) data can enable an understanding of the genetic architecture of economically important traits and the acceleration of genetic gain via genomic selection. Due to the global importance and diversity of Nile tilapia, an open access SNP array would be beneficial for aquaculture research and production. In the current study, a ∼65K SNP array was designed based on SNPs discovered from whole-genome sequence data from a GIFT breeding nucleus population and the overlap with SNP datasets from wild fish populations and several other farmed Nile tilapia strains. The SNP array was applied to clearly distinguish between different tilapia populations across Asia and Africa, with at least ∼30,000 SNPs segregating in each of the diverse population samples tested. It is anticipated that this SNP array will be an enabling tool for population genetics and tilapia breeding research, facilitating consistency and comparison of results across studies.

