Assessing single nucleotide polymorphism selection methods for the development of a low-density panel optimized for imputation in South African Drakensberger beef cattle

Author(s):  
Simon F Lashmar ◽  
Donagh P Berry ◽  
Rian Pierneef ◽  
Farai C Muchadeyi ◽  
Carina Visser

Abstract A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen 1) at random, 2) with even genomic dispersion, 3) by maximizing the mean minor allele frequency (MAF), 4) using a combined score of MAF and linkage disequilibrium (LD), 5) using a partitioning-around-medoids (PAM) algorithm, and finally 6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy defined as the within-animal correlation between the imputed and actual alleles ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen versus a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) versus 0.982 (0.014) when the worst (i.e., random) as opposed to the best (i.e., combination of MAF and LD) SNP selection strategy was employed. A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01<MAF≤0.1) versus high MAF (0.4<MAF≤0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. 
The presented results suggested that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the South African Drakensberger. Based on the results, a genotyping panel consisting of approximately 10,000 SNP selected based on a combination of MAF and LD would suffice in achieving a less than 3% imputation error rate for a breed characterized by genomic admixture on the condition that these SNP are selected based on breed-specific selection criteria.
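The accuracy measures named in this abstract (animal-wise correlation between imputed and actual alleles, allele concordance rate, and MAF) can be illustrated with a minimal sketch. It assumes genotypes are coded as 0/1/2 copies of one allele and that concordance counts matching alleles as 2 minus the absolute genotype difference; the function names and toy genotypes are hypothetical and are not taken from the study.

```python
from math import sqrt


def minor_allele_frequency(genotypes):
    """MAF of one SNP from genotypes coded as 0/1/2 copies of an allele."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)


def animal_wise_correlation(true_gt, imputed_gt):
    """Pearson correlation between true and imputed genotypes of one animal.

    Raises ZeroDivisionError if either vector has no variation.
    """
    n = len(true_gt)
    mean_t = sum(true_gt) / n
    mean_i = sum(imputed_gt) / n
    s_ti = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    s_tt = sum((t - mean_t) ** 2 for t in true_gt)
    s_ii = sum((i - mean_i) ** 2 for i in imputed_gt)
    return s_ti / sqrt(s_tt * s_ii)


def allele_concordance(true_gt, imputed_gt):
    """Share of alleles imputed correctly (assumed definition).

    Each genotype carries two alleles; 2 - |true - imputed| of them match.
    """
    matched = sum(2 - abs(t - i) for t, i in zip(true_gt, imputed_gt))
    return matched / (2 * len(true_gt))


# Toy data for one animal at eight masked SNP (illustrative values only).
true_gt = [0, 1, 2, 1, 0, 2, 1, 2]
imputed_gt = [0, 1, 2, 2, 0, 2, 1, 1]
print(animal_wise_correlation(true_gt, imputed_gt))
print(allele_concordance(true_gt, imputed_gt))
```

Correlation is sensitive to allele frequency while concordance is not, which is one reason studies of this kind tend to report both.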

2021 ◽  
Author(s):  
Zhi Ming Xu ◽  
Sina Rüeger ◽  
Michaela Zwyer ◽  
Daniela Brites ◽  
Hellen Hiza ◽  
...  

Abstract Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genome of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on SNPs, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on SNPs to the base H3Africa array.


2022 ◽  
Vol 18 (1) ◽  
pp. e1009628
Author(s):  
Zhi Ming Xu ◽  
Sina Rüeger ◽  
Michaela Zwyer ◽  
Daniela Brites ◽  
Hellen Hiza ◽  
...  

Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genomes of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on tag SNPs and to generate an internal population-specific imputation reference panel, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on tags to the base H3Africa array.


Animals ◽  
2020 ◽  
Vol 10 (3) ◽  
pp. 498 ◽  
Author(s):  
Antonio Boccardo ◽  
Stefano Paolo Marelli ◽  
Davide Pravettoni ◽  
Alessandro Bagnato ◽  
Giuseppe Achille Busca ◽  
...  

The German Shorthaired Pointer (GSHP) is a breed known worldwide for its hunting versatility. Dogs of this breed are appreciated as valuable companions, effective trackers, field trialers and obedience athletes. The aim of the present work is to describe the genomic architecture of the GSHP breed and to analyze inbreeding levels from a genomic and a genealogical perspective. A total of 34 samples were collected (24 Italian, 10 USA), and the genomic and pedigree coefficients of inbreeding were calculated. A total of 3183 runs of homozygosity (ROH) were identified across all 34 dogs. The minimum and maximum numbers of single nucleotide polymorphisms (SNPs) defining a ROH were 40 and 3060. The mean number of ROH per dog was 93.6. ROH were found on all chromosomes. A total of 854 SNPs (TOP_SNPs) defined 11 ROH island regions (TOP_ROH), in which genes already associated with behavioral and morphological canine traits were annotated. The average proportion of observed homozygous genotypes, estimated over the total number of SNPs, was 0.70. The genomic inbreeding coefficient based on ROH was 0.17. The mean inbreeding based on genealogical information was 0.023. The results describe a population with low inbreeding and a fairly good level of genetic variability.
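The abstract does not state how the ROH-based inbreeding coefficient was computed. A common definition, assumed here, is the summed length of an animal's ROH divided by the length of the genome covered by SNPs. The sketch below uses hypothetical segment coordinates and a hypothetical genome length, not values from the study.

```python
def f_roh(roh_segments, genome_length_bp):
    """ROH-based inbreeding coefficient for one animal (assumed definition).

    roh_segments: iterable of (chromosome, start_bp, end_bp) tuples.
    genome_length_bp: length of the SNP-covered autosomal genome in bp.
    """
    total_roh = sum(end - start for _chrom, start, end in roh_segments)
    return total_roh / genome_length_bp


# Toy example: two ROH totalling 250 Mb on a 2,000 Mb genome.
segments = [
    (1, 10_000_000, 160_000_000),  # 150 Mb
    (5, 20_000_000, 120_000_000),  # 100 Mb
]
print(f_roh(segments, 2_000_000_000))
```

A population-level figure would then be the mean of this per-animal value across all sampled dogs.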


Viruses ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 658 ◽  
Author(s):  
Boitumelo Motsoeneng ◽  
Michael D. Jukes ◽  
Caroline M. Knox ◽  
Martin P. Hill ◽  
Sean D. Moore

The complete genome of an endemic South African Cydia pomonella granulovirus isolate was sequenced and analyzed. Several missing or truncated open reading frames (ORFs) were identified, including a 24 bp deletion in the pe38 gene which is reported to be associated with type I resistance-breaking potential. Comparison of single nucleotide polymorphisms (SNPs) with five other fully sequenced CpGV isolates identified 67 unique events, 47 of which occurred within ORFs, leading to several amino acid changes. Further analysis of single nucleotide variations (SNVs) within CpGV-SA revealed that this isolate consists of mixed genotypes. Phylogenetic analysis using complete genome sequences placed CpGV-SA basal to M, I12 and E2 and distal to S and I07 but with no distinct classification into any of the previously defined CpGV genogroups. These results suggest that CpGV-SA is a novel and genetically distinct isolate with significant potential as a biopesticide for management of codling moth (CM), not only in South Africa, but potentially in other pome fruit producing countries, particularly where CM resistance to CpGV has been reported.


2019 ◽  
Vol 48 (D1) ◽  
pp. D659-D667 ◽  
Author(s):  
Wenqian Yang ◽  
Yanbo Yang ◽  
Cecheng Zhao ◽  
Kun Yang ◽  
Dongyang Wang ◽  
...  

Abstract Animal-ImputeDB (http://gong_lab.hzau.edu.cn/Animal_ImputeDB/) is a public database with genomic reference panels of 13 animal species for online genotype imputation, genetic variant search, and free download. Genotype imputation is a process of estimating missing genotypes in terms of the haplotypes and genotypes in a reference panel. It can effectively increase the density of single nucleotide polymorphisms (SNPs) and thus can be widely used in large-scale genome-wide association studies (GWASs) using relatively inexpensive and low-density SNP arrays. However, most animals except humans lack high-quality reference panels, which greatly limits the application of genotype imputation in animals. To overcome this limitation, we developed Animal-ImputeDB, which is dedicated to collecting genotype data and whole-genome resequencing data of nonhuman animals from various studies and databases. A computational pipeline was developed to process different types of raw data to construct reference panels. Finally, 13 high-quality reference panels including ∼400 million SNPs from 2265 samples were constructed. In Animal-ImputeDB, an easy-to-use online tool consisting of two popular imputation tools was designed for the purpose of genotype imputation. Collectively, Animal-ImputeDB serves as an important resource for animal genotype imputation and will greatly facilitate research on animal genomic selection and genetic improvement.


2018 ◽  
Vol 28 (9) ◽  
pp. 1123-1128 ◽  
Author(s):  
Feifei Si ◽  
Yao Wu ◽  
Xianmin Wang ◽  
Fang Gao ◽  
Dan Yang ◽  
...  

Abstract Background: Kawasaki disease is the leading cause of acquired heart disease in children from developed countries. The Interleukin-6/Interleukin-12 cytokine family has many members, including the paradoxical anti- and pro-inflammatory Interleukin-27. Recent studies have demonstrated that Interleukin-27 plays a role in immune diseases. Given this, we sought to evaluate the association between Interleukin-27 genetic polymorphisms and Kawasaki disease in Chinese children. Methods and results: Interleukin-27 was genotyped in 100 Kawasaki disease children and 98 healthy children (controls), resulting in the direct sequencing of eight single-nucleotide polymorphisms (rs17855750, rs40837, rs26528, rs428253, rs4740, rs4905, rs153109, and rs181206). There were no significant differences in Interleukin-27 genotypes between Kawasaki disease and control groups. Of the eight single-nucleotide polymorphisms, there was a significant increase in the risk of Kawasaki disease with coronary arterial lesions in children with the rs17855750 (T>G), rs40837 (A>G), rs4740 (G>A), rs4905 (A>G), rs153109 (T>C), and rs26528 (A>G) single-nucleotide polymorphisms. This was particularly true for rs17855750 (T>G), which had a greater frequency in Kawasaki disease children with coronary arterial aneurysm. Conclusion: These findings may be used as risk factors when assessing a child’s likelihood of developing Kawasaki disease, as well as for the development of future therapeutic treatments for Kawasaki disease.


2016 ◽  
Vol 43 (6) ◽  
pp. 1045-1049 ◽  
Author(s):  
Kwangwoo Kim ◽  
So-Young Bang ◽  
Young Bin Joo ◽  
Taehyeung Kim ◽  
Hye-Soon Lee ◽  
...  

Objective. Cyclophosphamide (CYC) is an immunosuppressant drug widely used to treat various diseases including lupus nephritis, but its efficacy varies widely from individual to individual. This pharmacogenomics association study searched for genetic variations associated with CYC efficacy. Methods. A genome-wide association scan was performed for 109 Korean patients with systemic lupus erythematosus with lupus nephritis (classes III–V) who received intravenous CYC induction therapy. Genetic differences between responders and nonresponders were examined using Cochran–Armitage trend tests, and genotype imputation was used for defining the association locus. Results. Genetic polymorphisms in the Fcγ receptor gene (FCGR) cluster at human chromosome 1q23, previously associated with lupus nephritis susceptibility, were associated with the response to CYC treatment for lupus nephritis. Significant response association was found for 3 perfectly correlated (r² = 1) single-nucleotide polymorphisms (SNP): rs6697139, rs10917686, and rs10917688, located between the FCGR2B and FCRLA genes (p = 3.4 × 10⁻⁸). Carriage of the minor alleles in these SNP was found only in nonresponders (31%) and in no responders (0%). Conclusion. This first genome-wide association approach for CYC response yielded a robust profile of genetic associations including large-effect SNP in the FCGR2B-FCRLA locus, which may provide better insights into CYC metabolism and efficacy.
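The Cochran–Armitage trend test mentioned in the Methods can be sketched for a single SNP as follows. This is a minimal illustration assuming additive weights (0, 1, 2) across the three genotype classes and a 1-degree-of-freedom chi-square reference distribution; the function name and the counts are hypothetical and are not data from the study.

```python
from math import erfc, sqrt


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x k genotype table.

    cases, controls: counts per genotype class (e.g. AA, Aa, aa).
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    n = [r + s for r, s in zip(cases, controls)]  # column totals
    N = sum(n)
    R = sum(cases)        # total in first group
    S = N - R             # total in second group
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * c for w, c in zip(weights, n))
    sum_w2n = sum(w * w * c for w, c in zip(weights, n))
    t = N * sum_wr - R * sum_wn                  # trend statistic
    var_scaled = R * S * (N * sum_w2n - sum_wn ** 2)
    chi2 = N * t * t / var_scaled
    p_value = erfc(sqrt(chi2 / 2))               # chi-square, 1 df
    return chi2, p_value


# Toy counts: minor-allele dosage rises in one group and falls in the other.
chi2, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(chi2, p)
```

With small samples or rare genotypes, an exact or permutation version of the test is often preferred over this asymptotic form.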


2017 ◽  
Vol 41 (S1) ◽  
pp. S103-S103
Author(s):  
A. Zdanowicz ◽  
A. Sakowicz ◽  
E. Kusidel ◽  
P. Wierzbinski

Introduction: TAAR1 is a G protein-coupled receptor expressed broadly throughout the brain. Recently, TAAR1 has been demonstrated to be an important modulator of dopaminergic, serotonergic and glutamatergic activity. Aims: Assessment of the relation between two single-nucleotide polymorphisms of the TAAR1 gene, suicide attempts and alcohol abuse. Methods: A total of 150 Polish patients were included, 59 subjects after a suicide attempt vs. 91 controls. The chosen SNPs (rs759733834 and rs9402439) were studied using RFLP-PCR methods. Hardy-Weinberg equilibrium was tested in the control group. Statistical tests: The chi-square test, with or without Yates' correction, was used. Results: The mean age of study subjects and controls was 38 ± 12.3 and 42 ± 12.8, respectively; 49% of study subjects vs. 54% of controls were male. We observed no association with carriage of the GG, GA or AA genotypes of the rs759733834 polymorphism in either group. The distribution of genotypes of the rs9402439 polymorphism (CC, CG, GG) also did not differ significantly. Among patients with alcohol dependence, the frequency of the G allele of the rs9402439 polymorphism was lower than among non-addicted patients (27% vs. 47%; P < 0.01). Conclusions: TAAR1 polymorphisms rs759733834 and rs9402439 are not related to suicide attempts. Carriage of the G allele of the rs9402439 polymorphism is related to a lower risk of alcohol addiction (OR 0.40, 95% CI 0.20–0.81). To our knowledge, this is the first study on the TAAR1 receptor and the risk of suicide, and it might offer a new insight into the genetic etiology of the TAAR1 receptor. Disclosure of interest: The authors have not supplied their declaration of competing interest.
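An odds ratio with a 95% confidence interval, as reported in the Conclusions, is commonly obtained from a 2 × 2 table using the Wald interval on the log scale. The sketch below assumes that method (the abstract does not say which was used); the function name and the counts are hypothetical and do not come from the study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval).
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Toy counts of allele carriers and non-carriers (illustrative only).
or_, lo, hi = odds_ratio_ci(a=10, b=20, c=30, d=40)
print(or_, lo, hi)
```

An interval that lies entirely below 1, like the one reported in the abstract, indicates a protective association at the chosen confidence level.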


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Afifah Azam ◽  
Mohammad Arif Shahar ◽  
Siti Liyana Saud Gany ◽  
Norlela Sukor ◽  
Nor Azmi Kamaruddin ◽  
...  

Abstract Primary aldosteronism (PA), also known as Conn’s syndrome, is a common curable cause of hypertension. Family studies of essential hypertensive patients suggest that heritable genetic factors play a role in blood pressure regulation [1]. Interestingly, single nucleotide polymorphisms (SNPs) in genes encoding enzymes involved in adrenal steroidogenesis, CYP11B2, CYP11B1 and CYP17A1, are associated with increased risk of hypertension [2]. Therefore, we analysed whether selected SNPs in these genes are associated with PA. We performed an association study using genotype imputation for selected SNPs of the steroidogenic enzyme genes CYP11B2 (rs4546, rs1799998, rs13268025), CYP11B1 (rs6410, rs149845727), and CYP17A1 (rs1004467, rs138009835, rs2150927) from a pilot genome-wide association study of Malaysian PA patients and healthy controls, which was merged with the Singapore Genome Variation Project (SGVP) population dataset [3]. Genotype imputation for minor and major alleles was validated using PCR sequencing (n>10 for each genotype). Further, one SNP from each steroidogenic enzyme gene (CYP11B2:rs1799998, CYP11B1:rs6410 and CYP17A1:rs1004467) was validated using commercial TaqMan genotyping assays on the ABI 7000 Sequence Detection System in 149 PA patients and 78 non-hypertensive healthy individuals. Case-control genetic association analysis was performed at http://www.oege.org/software/orcalc.html and the association between genotypes and phenotypes was tested using the independent-samples Kruskal-Wallis test in SPSS (version 25). The minor allele frequencies (MAFs) for rs1004467, rs6410 and rs1799998 were similar to those of East Asian populations but differed significantly from those of European, African, American and South Asian populations (rs1004467 MAF: C=0.258/298, rs6410 MAF: A=0.265/298, rs1799998 MAF: C=0.225/298).
In Chinese patients matched by gender, heterozygotes for rs6410 had a significantly increased risk of PA compared to common homozygotes (OR: 3.15, 95% CI: 1.01–9.8, p=0.04). Across patients of different ethnicity, the distribution of aldosterone levels was significantly different (p=0.039). In summary, only SNP rs6410 in Chinese patients matched by gender showed an association with PA in our South East Asian cohort. More functional experiments are needed to determine whether this SNP is causal for PA or whether it is in linkage disequilibrium with the actual functional causative SNPs. Once the functional SNP is known, identification of these germline variants in asymptomatic family members would allow early screening for PA to be offered and could potentially provide novel drug targets to treat the disease. References: [1] Timberlake et al., Curr Opin Nephrol Hypertens. 2001 Jan;10(1):71-9. [2] MacKenzie et al., Int J Mol Sci. 2017 Mar 7;18(3). pii: E579. [3] Teo et al., Genome Res. 2009 Nov;19(11):2154-62.

