Assessing single nucleotide polymorphism selection methods for the development of a low-density panel optimized for imputation in South African Drakensberger beef cattle

Abstract A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity, thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen 1) at random, 2) with even genomic dispersion, 3) by maximizing the mean minor allele frequency (MAF), 4) using a combined score of MAF and linkage disequilibrium (LD), 5) using a partitioning-around-medoids (PAM) algorithm, and finally 6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy, defined as the within-animal correlation between the imputed and actual alleles, ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen versus a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) for the worst SNP selection strategy (i.e., random) versus 0.982 (0.014) for the best (i.e., combination of MAF and LD). A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01 < MAF ≤ 0.1) versus high MAF (0.4 < MAF ≤ 0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. The presented results suggested that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the South African Drakensberger. Based on the results, a genotyping panel consisting of approximately 10,000 SNP selected based on a combination of MAF and LD would suffice in achieving an imputation error rate of less than 3% for a breed characterized by genomic admixture, on the condition that these SNP are selected based on breed-specific selection criteria.
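
The two animal-wise measures quoted in this abstract (correlation-based accuracy and allele concordance) can be illustrated with a minimal sketch. The abstract does not give formulas, so the following are assumptions: genotypes are coded as allele dosages 0, 1 or 2; accuracy is the Pearson correlation between true and imputed dosages within one animal; and concordance counts 2 - |true - imputed| matching alleles per SNP. The function name and the sample data are hypothetical.

```python
from math import sqrt


def animal_wise_accuracy(true_dosages, imputed_dosages):
    """Return (correlation, allele concordance) for a single animal.

    Assumption: both inputs are equal-length sequences of allele
    dosages coded 0, 1 or 2 for the SNP that were imputed.
    """
    n = len(true_dosages)
    if n == 0 or n != len(imputed_dosages):
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(true_dosages, imputed_dosages))
    mean_t = sum(true_dosages) / n
    mean_i = sum(imputed_dosages) / n

    # Pearson correlation between true and imputed dosages.
    cov = sum((t - mean_t) * (i - mean_i) for t, i in pairs)
    var_t = sum((t - mean_t) ** 2 for t, _ in pairs)
    var_i = sum((i - mean_i) ** 2 for _, i in pairs)
    if var_t > 0 and var_i > 0:
        corr = cov / sqrt(var_t * var_i)
    else:
        corr = float("nan")  # undefined when either vector is constant

    # Each genotype carries two alleles; a dosage difference of d
    # is counted as d mismatching alleles (an assumed convention).
    matching = sum(2 - abs(t - i) for t, i in pairs)
    concordance = matching / (2 * n)
    return corr, concordance


if __name__ == "__main__":
    # Hypothetical data: six SNP, one heterozygote/homozygote error.
    truth = [0, 1, 2, 2, 0, 1]
    imputed = [0, 1, 2, 1, 0, 1]
    print(animal_wise_accuracy(truth, imputed))
```

Averaging these per-animal values over all validation animals would give summary figures of the kind reported above; the error rate is one minus the concordance.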

Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations

10.1101/2021.02.03.429542 ◽ 2021

Author(s): Zhi Ming Xu ◽ Sina Rüeger ◽ Michaela Zwyer ◽ Daniela Brites ◽ Hellen Hiza ◽ ...

Keyword(s): Association Studies ◽ Imputation Accuracy ◽ Genotype Imputation ◽ Small Subset ◽ Study Cohort ◽ Genome Wide Association Studies ◽ Nucleotide Polymorphisms ◽ Single Nucleotide ◽ Genome Wide ◽ Selection Of

Abstract Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genome of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on SNPs, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on SNPs to the base H3Africa array.

Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations

PLoS Computational Biology ◽ 10.1371/journal.pcbi.1009628 ◽ 2022 ◽ Vol 18 (1) ◽ pp. e1009628

Author(s): Zhi Ming Xu ◽ Sina Rüeger ◽ Michaela Zwyer ◽ Daniela Brites ◽ Hellen Hiza ◽ ...

Keyword(s): Association Studies ◽ Imputation Accuracy ◽ Genotype Imputation ◽ Small Subset ◽ Study Cohort ◽ Genome Wide Association Studies ◽ Nucleotide Polymorphisms ◽ Single Nucleotide ◽ Genome Wide ◽ Selection Of

Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genomes of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on tag SNPs and to generate an internal population-specific imputation reference panel, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on tags to the base H3Africa array.

The German Shorthair Pointer Dog Breed (Canis lupus familiaris): Genomic Inbreeding and Variability

Animals ◽ 10.3390/ani10030498 ◽ 2020 ◽ Vol 10 (3) ◽ pp. 498 ◽ Cited By ~ 1

Author(s): Antonio Boccardo ◽ Stefano Paolo Marelli ◽ Davide Pravettoni ◽ Alessandro Bagnato ◽ Giuseppe Achille Busca ◽ ...

Keyword(s): Canis Lupus ◽ Inbreeding Coefficient ◽ Nucleotide Polymorphisms ◽ Canis Lupus Familiaris ◽ Runs Of Homozygosity ◽ Single Nucleotide ◽ The Mean ◽ Dog Breed ◽ Genomic Inbreeding ◽ Genealogical Information

The German Shorthaired Pointer (GSHP) is a breed known worldwide for its hunting versatility. Dogs of this breed are appreciated as valuable companions, effective trackers, field trailers and obedience athletes. The aim of the present work is to describe the genomic architecture of the GSHP breed and to analyze inbreeding levels from genomic and genealogical perspectives. A total of 34 samples were collected (24 Italian, 10 USA), and the genomic and pedigree coefficients of inbreeding were calculated. A total of 3183 runs of homozygosity (ROH) were identified across all 34 dogs. The minimum and maximum numbers of Single Nucleotide Polymorphisms (SNPs) defining a ROH were 40 and 3060. The mean number of ROH for the sample was 93.6. ROH were found on all chromosomes. A total of 854 SNPs (TOP_SNPs) defined 11 ROH island regions (TOP_ROH), in which some genes already associated with behavioral and morphological canine traits were annotated. The average proportion of observed homozygotes, estimated over the total number of SNPs, was 0.70. The genomic inbreeding coefficient based on ROH was 0.17. The mean inbreeding based on genealogical information was 0.023. The results describe a population with low inbreeding and quite a good level of genetic variability.
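
The abstract reports a ROH-based genomic inbreeding coefficient without stating how it was computed. A minimal sketch, assuming the widely used definition (total length of an individual's ROH divided by the length of the autosomal genome covered by SNPs), is given below. The function name, segment coordinates and genome length are hypothetical and were chosen only so that the toy result equals 0.17.

```python
def f_roh(roh_segments_bp, covered_genome_bp):
    """ROH-based inbreeding coefficient for one individual.

    roh_segments_bp: iterable of (start, end) positions in base pairs,
        one pair per run of homozygosity (assumed non-overlapping).
    covered_genome_bp: autosomal genome length covered by SNPs.
    """
    if covered_genome_bp <= 0:
        raise ValueError("covered genome length must be positive")
    total_roh = 0
    for start, end in roh_segments_bp:
        if end < start:
            raise ValueError("segment end precedes start")
        total_roh += end - start
    return total_roh / covered_genome_bp


if __name__ == "__main__":
    # Hypothetical toy genome of 1,000 bp with two ROH (100 bp and 70 bp).
    print(f_roh([(0, 100), (200, 270)], 1000))
```

A breed-level value would then be the mean of the per-dog coefficients.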

Genome Analysis of A Novel South African Cydia pomonella granulovirus (CpGV-SA) with Resistance-Breaking Potential

Viruses ◽ 10.3390/v11070658 ◽ 2019 ◽ Vol 11 (7) ◽ pp. 658 ◽ Cited By ~ 1

Author(s): Boitumelo Motsoeneng ◽ Michael D. Jukes ◽ Caroline M. Knox ◽ Martin P. Hill ◽ Sean D. Moore

Keyword(s): South African ◽ Complete Genome ◽ Cydia Pomonella ◽ Codling Moth ◽ Type I ◽ Nucleotide Polymorphisms ◽ Pome Fruit ◽ Single Nucleotide ◽ Resistance Breaking ◽ Cydia Pomonella Granulovirus

The complete genome of an endemic South African Cydia pomonella granulovirus isolate was sequenced and analyzed. Several missing or truncated open reading frames (ORFs) were identified, including a 24 bp deletion in the pe38 gene which is reported to be associated with type I resistance-breaking potential. Comparison of single nucleotide polymorphisms (SNPs) with five other fully sequenced CpGV isolates identified 67 unique events, 47 of which occurred within ORFs, leading to several amino acid changes. Further analysis of single nucleotide variations (SNVs) within CpGV-SA revealed that this isolate consists of mixed genotypes. Phylogenetic analysis using complete genome sequences placed CpGV-SA basal to M, I12 and E2 and distal to S and I07 but with no distinct classification into any of the previously defined CpGV genogroups. These results suggest that CpGV-SA is a novel and genetically distinct isolate with significant potential as a biopesticide for management of codling moth (CM), not only in South Africa, but potentially in other pome fruit producing countries, particularly where CM resistance to CpGV has been reported.

Common and rare single nucleotide polymorphisms in the LDLR gene are present in a black South African population and associate with low-density lipoprotein cholesterol levels

Journal of Human Genetics ◽ 10.1038/jhg.2013.123 ◽ 2013 ◽ Vol 59 (2) ◽ pp. 88-94 ◽ Cited By ~ 6

Author(s): Tertia van Zyl ◽ Johann C Jerling ◽ Karin R Conradie ◽ Edith JM Feskens

Keyword(s): South African ◽ Low Density Lipoprotein ◽ Density Lipoprotein ◽ Nucleotide Polymorphisms ◽ Ldlr Gene ◽ Low Density Lipoprotein Cholesterol ◽ South African Population ◽ Single Nucleotide ◽ Black South African ◽ Cholesterol Levels

Animal-ImputeDB: a comprehensive database with multiple animal reference panels for genotype imputation

Nucleic Acids Research ◽ 10.1093/nar/gkz854 ◽ 2019 ◽ Vol 48 (D1) ◽ pp. D659-D667 ◽ Cited By ~ 2

Author(s): Wenqian Yang ◽ Yanbo Yang ◽ Cecheng Zhao ◽ Kun Yang ◽ Dongyang Wang ◽ ...

Keyword(s): Large Scale ◽ Association Studies ◽ Genotype Imputation ◽ Genome Wide Association Studies ◽ Nucleotide Polymorphisms ◽ High Quality ◽ Single Nucleotide ◽ Genome Wide ◽ Whole Genome Resequencing ◽ Missing Genotypes

Abstract Animal-ImputeDB (http://gong_lab.hzau.edu.cn/Animal_ImputeDB/) is a public database with genomic reference panels of 13 animal species for online genotype imputation, genetic variant search, and free download. Genotype imputation is a process of estimating missing genotypes in terms of the haplotypes and genotypes in a reference panel. It can effectively increase the density of single nucleotide polymorphisms (SNPs) and thus can be widely used in large-scale genome-wide association studies (GWASs) using relatively inexpensive and low-density SNP arrays. However, most animals except humans lack high-quality reference panels, which greatly limits the application of genotype imputation in animals. To overcome this limitation, we developed Animal-ImputeDB, which is dedicated to collecting genotype data and whole-genome resequencing data of nonhuman animals from various studies and databases. A computational pipeline was developed to process different types of raw data to construct reference panels. Finally, 13 high-quality reference panels including ∼400 million SNPs from 2265 samples were constructed. In Animal-ImputeDB, an easy-to-use online tool consisting of two popular imputation tools was designed for the purpose of genotype imputation. Collectively, Animal-ImputeDB serves as an important resource for animal genotype imputation and will greatly facilitate research on animal genomic selection and genetic improvement.

The relationship between Interleukin-27 gene polymorphisms and Kawasaki disease in a population of Chinese children

Cardiology in the Young ◽ 10.1017/s1047951118000914 ◽ 2018 ◽ Vol 28 (9) ◽ pp. 1123-1128 ◽ Cited By ~ 1

Author(s): Feifei Si ◽ Yao Wu ◽ Xianmin Wang ◽ Fang Gao ◽ Dan Yang ◽ ...

Keyword(s): Kawasaki Disease ◽ Single Nucleotide Polymorphisms ◽ Developed Countries ◽ Chinese Children ◽ Interleukin 12 ◽ Healthy Children ◽ Arterial Aneurysm ◽ Nucleotide Polymorphisms ◽ Single Nucleotide ◽ Interleukin 27

Abstract Background: Kawasaki disease is the leading cause of acquired heart disease in children from developed countries. The Interleukin-6/Interleukin-12 cytokine family has many members, including the paradoxical anti- and pro-inflammatory Interleukin-27. Recent studies have demonstrated that Interleukin-27 plays a role in immune diseases. Given this, we sought to evaluate the association between Interleukin-27 genetic polymorphisms and Kawasaki disease in Chinese children. Methods and results: Interleukin-27 was genotyped in 100 Kawasaki disease children and 98 healthy children (controls), resulting in the direct sequencing of eight single-nucleotide polymorphisms: rs17855750, rs40837, rs26528, rs428253, rs4740, rs4905, rs153109, and rs181206. There were no significant differences in Interleukin-27 genotypes between Kawasaki disease and control groups. Of the eight single-nucleotide polymorphisms, there was a significant increase in the risk of Kawasaki disease with coronary arterial lesions in children with the rs17855750 (T>G), rs40837 (A>G), rs4740 (G>A), rs4905 (A>G), rs153109 (T>C), and rs26528 (A>G) single-nucleotide polymorphisms. This was particularly true for rs17855750 (T>G), which had a greater frequency in Kawasaki disease children with coronary arterial aneurysm. Conclusion: These findings may be used as risk factors when assessing a child’s likelihood of developing Kawasaki disease, as well as for the development of future therapeutic treatments for Kawasaki disease.

Response to Intravenous Cyclophosphamide Treatment for Lupus Nephritis Associated with Polymorphisms in the FCGR2B-FCRLA Locus

The Journal of Rheumatology ◽ 10.3899/jrheum.150665 ◽ 2016 ◽ Vol 43 (6) ◽ pp. 1045-1049 ◽ Cited By ~ 8

Author(s): Kwangwoo Kim ◽ So-Young Bang ◽ Young Bin Joo ◽ Taehyeung Kim ◽ Hye-Soon Lee ◽ ...

Keyword(s): Lupus Nephritis ◽ Lupus Erythematosus ◽ Receptor Gene ◽ Genotype Imputation ◽ Genome Wide Association ◽ Nucleotide Polymorphisms ◽ Genetic Associations ◽ Single Nucleotide ◽ Genome Wide ◽ Cyclophosphamide Treatment

Objective. Cyclophosphamide (CYC) is an immunosuppressant drug widely used to treat various diseases including lupus nephritis, but its efficacy varies greatly from individual to individual. This pharmacogenomics association study searched for genetic variations associated with CYC efficacy. Methods. A genome-wide association scan was performed for 109 Korean patients with systemic lupus erythematosus with lupus nephritis (classes III–V) who received intravenous CYC induction therapy. Genetic differences between responders and nonresponders were examined using Cochran–Armitage trend tests, and genotype imputation was used for defining the association locus. Results. Genetic polymorphisms in the Fcγ receptor gene (FCGR) cluster at human chromosome 1q23, previously associated with lupus nephritis susceptibility, were associated with the response to CYC treatment for lupus nephritis. Significant response association was found for 3 perfectly correlated (r² = 1) single-nucleotide polymorphisms (SNP): rs6697139, rs10917686, and rs10917688, located between the FCGR2B and FCRLA genes (p = 3.4 × 10⁻⁸). Carriage of the minor alleles in these SNP was found only in nonresponders (31%) and in no responders (0%). Conclusion. This first genome-wide association approach for CYC response yielded a robust profile of genetic associations including large-effect SNP in the FCGR2B-FCRLA locus, which may provide better insights into CYC metabolism and efficacy.

Association Between Two Single-nucleotide Polymorphism of TAAR1 Gene and Suicide Attempts

European Psychiatry ◽ 10.1016/j.eurpsy.2017.01.318 ◽ 2017 ◽ Vol 41 (S1) ◽ pp. S103-S103

Author(s): A. Zdanowicz ◽ A. Sakowicz ◽ E. Kusidel ◽ P. Wierzbinski

Keyword(s): Suicide Attempts ◽ Statistical Tests ◽ Control Group ◽ Nucleotide Polymorphisms ◽ Single Nucleotide ◽ Hardy Weinberg Equilibrium ◽ The Mean ◽ Competing Interest ◽ G Protein Coupled ◽ The Brain

Introduction: TAAR1 is a G protein-coupled receptor expressed broadly throughout the brain. Recently, TAAR1 has been demonstrated to be an important modulator of dopaminergic, serotonergic and glutamatergic activity. Aims: Assessment of the relation between two single-nucleotide polymorphisms of the TAAR1 gene, suicide attempts and alcohol abuse. Methods: A total of 150 Polish patients were included, 59 subjects after a suicide attempt vs. 91 controls. The chosen SNPs (rs759733834 and rs9402439) were studied using RFLP-PCR methods. Hardy-Weinberg equilibrium was tested in the control group. Statistical tests: The χ² test or Yates-corrected χ² test was used. Results: The mean ages of study subjects and controls were 38 ± 12.3 and 42 ± 12.8 years, respectively; 49% of study subjects vs. 54% of controls were male. We did not observe an association with carriage of the GG, GA or AA genotypes of the rs759733834 polymorphism in either of the groups. The difference in the distribution of rs9402439 genotypes (CC, CG, GG) was likewise not significant. Among patients with alcohol dependence, the frequency of the G allele of the rs9402439 polymorphism was lower than among non-addicted patients (27% vs. 47%; P < 0.01). Conclusions: TAAR1 polymorphisms rs759733834 and rs9402439 are not related to suicide attempts. Carriage of allele G of the rs9402439 polymorphism is related to a lower risk of alcohol addiction (OR 0.40, 95% CI 0.20–0.81). To our knowledge, this is the first study on the TAAR1 receptor and the risk of suicide, and it might offer a new insight into genetic etiology of TAAR1 receptor. Disclosure of interest: The authors have not supplied their declaration of competing interest.
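
The abstract reports an odds ratio with a 95% confidence interval but not the underlying counts or the interval method. As a minimal sketch, the code below computes an odds ratio from a 2×2 table with a Woolf (log-scale) interval, which is one common choice and an assumption here. The function name and the counts are hypothetical; they were picked so that the point estimate is 0.40 and are not taken from the study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: G-allele carriers vs. non-carriers
    # among addicted (cases) and non-addicted (controls) patients.
    print(odds_ratio_ci(10, 20, 30, 24))
```

An interval that excludes 1, as the reported 0.20–0.81 does, corresponds to a significant association at the matching level.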

SUN-711 Association of Single Nucleotide Polymorphisms of CYP11B2, CYP11B1 and CYP17A1 with Primary Aldosteronism in a Multi-Ethnic Malaysian Cohort

Journal of the Endocrine Society ◽ 10.1210/jendso/bvaa046.1679 ◽ 2020 ◽ Vol 4 (Supplement_1)

Author(s): Afifah Azam ◽ Mohammad Arif Shahar ◽ Siti Liyana Saud Gany ◽ Norlela Sukor ◽ Nor Azmi Kamaruddin ◽ ...

Keyword(s): Single Nucleotide Polymorphisms ◽ Primary Aldosteronism ◽ East Asian ◽ Genotype Imputation ◽ Chinese Patients ◽ Nucleotide Polymorphisms ◽ Steroidogenic Enzyme ◽ Single Nucleotide ◽ Asian Populations ◽ Increased Risk

Abstract Primary aldosteronism (PA), also known as Conn’s syndrome, is a common curable cause of hypertension. Family studies of essential hypertensive patients suggest that heritable genetic factors play a role in blood pressure regulation [1]. Interestingly, single nucleotide polymorphisms (SNP) in genes encoding enzymes involved with adrenal steroidogenesis, CYP11B2, CYP11B1 and CYP17A1, associate with increased risk of hypertension [2]. Therefore, we analysed whether selected SNPs in these genes are associated with PA. We performed an association study using genotype imputation for selected SNPs of the steroidogenic enzyme genes CYP11B2 (rs4546, rs1799998, rs13268025), CYP11B1 (rs6410, rs149845727), and CYP17A1 (rs1004467, rs138009835, rs2150927) from a pilot genome wide association study of Malaysian PA patients and healthy controls which was merged with the Singapore Genome Variation Project (SGVP) population dataset [3]. Genotype imputation for minor and major alleles was validated using PCR sequencing (n>10 for each genotype). Further, one SNP from each steroidogenic enzyme (CYP11B2:rs1799998, CYP11B1:rs6410 and CYP17A1:rs1004467) was validated using commercial TaqMan genotyping assays on the ABI 7000 Sequence Detection System, performed on 149 PA patients and 78 non-hypertensive healthy individuals. Case-control genetic association analysis was performed at http://www.oege.org/software/orcalc.html and the association between genotypes and phenotypes was assessed using the independent-samples Kruskal-Wallis test in SPSS (version 25). The Minor Allele Frequencies (MAFs) for rs1004467, rs6410 and rs1799998 were similar to East Asian populations but differed significantly from European, African, American and South Asian populations (rs1004467 MAF: C=0.258/298, rs6410 MAF: A=0.265/298, rs1799998 MAF: C=0.225/298). In Chinese patients matched by gender, heterozygotes for rs6410 had significantly increased risk of PA compared to common homozygotes (OR: 3.15, 95% CI: 1.01–9.8, p=0.04). Across patients of different ethnicity, the distribution of aldosterone levels was significantly different (p=0.039). In summary, only SNP rs6410 in Chinese patients matched by gender showed association with PA in our South East Asian cohort. More functional experiments need to be done to find out whether this is causal for PA or whether the SNP is in linkage disequilibrium with the actual functional causative SNPs. Once the functional SNP is known, identification of these germline variants in asymptomatic family members would allow early screening of PA to be offered and potentially provide novel drug targets to treat the disease. References: [1] Timberlake et al., Curr Opin Nephrol Hypertens. 2001 Jan;10(1):71-9. [2] MacKenzie et al., Int J Mol Sci. 2017 Mar 7;18(3). pii: E579. [3] Teo et al., Genome Res. 2009 Nov;19(11):2154-62.
