Genomic Tools for the Identification of Loci Associated with Facial Eczema in New Zealand Sheep

Facial eczema (FE) is a significant metabolic disease that affects New Zealand ruminants. Ingestion of the mycotoxin sporidesmin leads to liver and bile duct damage, which can result in photosensitisation, reduced productivity and death. Strategies used to manage the incidence and severity of the disease include breeding. In sheep, there is considerable genetic variation in the response to FE. A commercial testing program is available for ram breeders who aim to increase tolerance, determined by the concentration of the serum enzyme, γ-glutamyltransferase 21 days after a measured sporidesmin challenge (GGT21). Genome-wide association studies were carried out to determine regions of the genome associated with GGT21. Two regions on chromosomes 15 and 24 are reported, which explain 5% and 1% of the phenotypic variance in the response to FE, respectively. The region on chromosome 15 contains the β-globin locus. Of the significant SNPs in the region, one is a missense variant within the haemoglobin subunit β (HBB) gene. Mass spectrometry of haemoglobin from animals with differing genotypes at this locus indicated that genotypes are associated with different forms of adult β-globin. Haemoglobin haplotypes have previously been associated with variation in several health-related traits in sheep and warrant further investigation regarding their role in tolerance to FE in sheep. We show a strategic approach to the identification of regions of importance for commercial breeding programs with a combination of discovery, statistical and biological validation. This study highlights the power of using increased density genotyping for the identification of influential genomic regions, combined with subsequent inclusion on lower density genotyping platforms.
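The shares of phenotypic variance quoted above (5% and 1%) can be related to per-SNP quantities through the standard additive approximation: under Hardy-Weinberg equilibrium, a biallelic SNP with allele frequency p and per-allele effect β contributes 2p(1−p)β² to the trait variance. The Python sketch below illustrates that arithmetic only. The function name and every input value are hypothetical, and the abstract does not say how the study computed its figures.

```python
def variance_explained(allele_freq, beta, phenotypic_variance):
    """Fraction of phenotypic variance attributed to one additive biallelic SNP.

    Assumes Hardy-Weinberg equilibrium and a purely additive effect.
    allele_freq: frequency of the effect allele (0..1)
    beta: per-allele effect on the trait, in trait units
    phenotypic_variance: total variance of the trait, in squared trait units
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie between 0 and 1")
    if phenotypic_variance <= 0.0:
        raise ValueError("phenotypic_variance must be positive")
    genetic_variance = 2.0 * allele_freq * (1.0 - allele_freq) * beta ** 2
    return genetic_variance / phenotypic_variance


# Hypothetical inputs chosen only to illustrate the calculation.
share = variance_explained(allele_freq=0.5, beta=1.0, phenotypic_variance=10.0)
print(f"share of phenotypic variance: {share:.3f}")
```

The same effect size explains less variance when the allele is rare, which is one reason a locus can be statistically significant yet account for only a small share of the trait variance.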

GWAS of serum ALT and AST reveals an association of SLC30A10 Thr95Ile with hypermanganesemia symptoms

Nature Communications ◽

10.1038/s41467-021-24563-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Lucas D. Ward ◽

Ho-Chou Tu ◽

Chelsea B. Quenneville ◽

Shira Tsour ◽

Alexander O. Flynn-Carroll ◽

...

Keyword(s):

Bile Duct ◽

Extrahepatic Bile Duct ◽

Association Studies ◽

Missense Variant ◽

Deficiency Anemia ◽

Genome Wide Association Studies ◽

Loss Of Function ◽

Extrahepatic Bile Duct Cancer ◽

Hepatocellular Damage ◽

Increased Risk

Abstract Understanding mechanisms of hepatocellular damage may lead to new treatments for liver disease, and genome-wide association studies (GWAS) of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) serum activities have proven useful for investigating liver biology. Here we report 100 loci associating with both enzymes, using GWAS across 411,048 subjects in the UK Biobank. The rare missense variant SLC30A10 Thr95Ile (rs188273166) associates with the largest elevation of both enzymes, and this association replicates in the DiscovEHR study. SLC30A10 excretes manganese from the liver to the bile duct, and rare homozygous loss of function causes the syndrome hypermanganesemia with dystonia-1 (HMNDYT1) which involves cirrhosis. Consistent with hematological symptoms of hypermanganesemia, SLC30A10 Thr95Ile carriers have increased hematocrit and risk of iron deficiency anemia. Carriers also have increased risk of extrahepatic bile duct cancer. These results suggest that genetic variation in SLC30A10 adversely affects more individuals than patients with diagnosed HMNDYT1.

Genome-wide association studies identified loci contribute to phenotypic variance of gastric cancer

Gut ◽

10.1136/gutjnl-2017-315230 ◽

2017 ◽

Vol 67 (7) ◽

pp. 1366-1368 ◽

Cited By ~ 1

Author(s):

Caiwang Yan ◽

Meng Zhu ◽

Tongtong Huang ◽

Fei Yu ◽

Guangfu Jin

Keyword(s):

Gastric Cancer ◽

Association Studies ◽

Genome Wide Association ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Genome Wide

Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics

10.1101/032474 ◽

2015 ◽

Author(s):

Dominic Holland ◽

Yunpeng Wang ◽

Wesley K Thompson ◽

Andrew Schork ◽

Chi-Hua Chen ◽

...

Keyword(s):

Association Studies ◽

Significant Snps ◽

Effect Sizes ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Sample Sizes ◽

Genetic Components ◽

Complex Phenotypes ◽

Genome Wide ◽

Z Scores

Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large numbers of SNPs. The complexity of the datasets, however, poses a significant challenge to maximizing their utility. This is reflected in a need for a better understanding of the landscape of z-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities that does not require raw genotype data, relying only on summary statistics from GWAS substudies, and a scheme allowing for direct empirical validation. We show that modeling z-scores as a mixture of Gaussians is conceptually appropriate, in particular taking into account ubiquitous non-null effects that are likely in the datasets due to weak linkage disequilibrium with causal SNPs. The four-parameter model allows for estimating the degree of polygenicity of the phenotype -- the proportion of SNPs (after uniform pruning, so that large LD blocks are not over-represented) likely to be in strong LD with causal/mechanistically associated SNPs -- and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N=82,315) and additionally, for purposes of illustration, putamen volume (N=12,596), with approximately 9.3 million SNP z-scores in both cases. We show that, over a broad range of z-scores and sample sizes, the model accurately predicts expectation estimates of true effect sizes and replication probabilities in multistage GWAS designs. We estimate the degree to which effect sizes are over-estimated when based on linear regression association coefficients. We estimate the polygenicity of schizophrenia to be 0.037 and that of putamen volume to be 0.001, while the respective sample sizes required to approach fully explaining the chip heritability are 10^6 and 10^5. The model can be extended to incorporate prior knowledge such as pleiotropy and SNP annotation. The current findings suggest that the model is applicable to a broad array of complex phenotypes and will enhance understanding of their genetic architectures.
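The idea of treating z-scores as a mixture of Gaussians can be illustrated with a deliberately simplified two-component version: a null component with unit variance and a non-null component whose variance is inflated by the spread of true effects. The Python sketch below is not the four-parameter model described in the abstract. The function names and the `sd_effect` value are hypothetical, and the mixing weight 0.037 is borrowed from the reported schizophrenia polygenicity purely as an illustrative number.

```python
import math


def normal_pdf(z, sd):
    """Density of a zero-mean Gaussian with standard deviation sd at z."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_non_null(z, pi1, sd_null=1.0, sd_effect=3.0):
    """Posterior probability that z was drawn from the non-null component.

    pi1: prior proportion of non-null SNPs (mixing weight)
    sd_null: standard deviation of pure sampling noise in the z-score
    sd_effect: standard deviation of true effects on the z scale (hypothetical)
    """
    # A non-null z-score is true effect plus noise, so the variances add.
    sd_non_null = math.hypot(sd_null, sd_effect)
    null_part = (1.0 - pi1) * normal_pdf(z, sd_null)
    effect_part = pi1 * normal_pdf(z, sd_non_null)
    return effect_part / (null_part + effect_part)


# Illustrative values only.
for z in (0.0, 2.0, 4.0, 6.0):
    print(z, round(posterior_non_null(z, pi1=0.037), 4))
```

In this toy version a z-score near zero is less likely to be non-null than the prior weight suggests, while a large |z| pushes the posterior towards one. That is the qualitative behaviour a replication-probability estimate relies on.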

Data safe havens to combine health and genomic data: benefits and challenges

International Journal for Population Data Science ◽

10.23889/ijpds.v1i1.348 ◽

2017 ◽

Vol 1 (1) ◽

Author(s):

Kerina H Jones ◽

Arron S Lacey ◽

Brian L Perkins ◽

Mark I Rees

Keyword(s):

Association Studies ◽

Genomic Data ◽

Population Level ◽

Data Availability ◽

Genome Wide Association Studies ◽

Related Data ◽

Research Areas ◽

Individual Privacy ◽

Access Controls ◽

Health Related

Abstract Objectives Data safe havens can bring together and combine a rich array of anonymised person-based data for research and policy evaluation within a secure setting. To date, the majority of available datasets have been structured micro-data derived from routine health-related records. Possibilities are opening up for the greater reuse of genomic data such as Genome Wide Association studies (GWAS) and Whole Exome/Genome Sequencing (WES or WGS). However, there are considerable challenges to be addressed if the benefits of using these data in combination with health-related data are to be realized safely. Approach We explore the benefits and challenges of using genomic datasets with health-related data and, using the Secure Anonymised Information Linkage (SAIL) system as a case study, the implications and way forward for Data Safe Havens seeking to incorporate genomic data for use with health-related data. Results The benefits of using GWAS, WES and WGS data in conjunction with health-related data include the potential to explore genetics at a population level and open up novel research areas. These include the ability to increasingly stratify and personalize how medical indications are detected and treated through precision medicine by understanding rare conditions and adding socioeconomic and environmental context to genomic data. Among the challenges are: data availability, computing capacity, technical solutions, legal and regulatory frameworks, public perceptions, individual privacy and organizational risk. Many of the challenges within these areas are common to person-based data in general, and often Data Safe Havens have been designed to address these. But there are also aspects of these challenges, and other challenges, specific to genomic data. These include issues due to the unknown clinical significance of genomic information now or in the future, with corresponding risks for privacy and impact on individuals. Conclusion Genomic data sets contain vast amounts of valuable information, some of which is currently undefined, but which may have direct bearing on individual health at some point. The use of these data in combination with health-related data has the potential to bring great benefits, better clinical trial stratification, epidemiology project design and clinical improvements. It is, therefore, essential that such data are surrounded by a properly-designed, robust governance framework including technical and procedural access controls that enable the data to be used safely.

297 GWAS for complex models accounting for populations structure with GBLUP and ssGBLUP

Journal of Animal Science ◽

10.1093/jas/skaa278.057 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 32-32

Author(s):

Juan P Steibel ◽

Ignacio Aguilar

Keyword(s):

Hypothesis Testing ◽

Large Scale ◽

Mixed Model ◽

Prediction Models ◽

Association Studies ◽

Least Square ◽

Type I ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Formal Hypothesis Testing

Abstract Genomic Best Linear Unbiased Prediction (GBLUP) is the method of choice for incorporating genomic information into the genetic evaluation of livestock species. Furthermore, single step GBLUP (ssGBLUP) is adopted by many breeders’ associations and private entities managing large scale breeding programs. While prediction of breeding values remains the primary use of genomic markers in animal breeding, a secondary interest focuses on performing genome-wide association studies (GWAS). The goal of GWAS is to uncover genomic regions that harbor variants that explain a large proportion of the phenotypic variance, and thus become candidates for discovering and studying causative variants. Several methods have been proposed and successfully applied for embedding GWAS into genomic prediction models. Most methods avoid formal hypothesis testing and resort to estimation of SNP effects, relying on visual inspection of graphical outputs to determine candidate regions. However, with the advent of high throughput phenomics and transcriptomics, a more formal testing approach with automatic discovery thresholds is more appealing. In this work we present the methodological details of a method for performing formal hypothesis testing for GWAS in GBLUP models. First, we present the method and its equivalencies and differences with other GWAS methods. Moreover, we demonstrate through simulation analyses that the proposed method controls the type I error rate at the nominal level. Second, we demonstrate two possible computational implementations, one based on mixed model equations for ssGBLUP and one based on the generalized least square equations (GLS). We show that ssGBLUP can deal with datasets with extremely large numbers of animals and markers and with multiple traits. GLS implementations are well suited for dealing with smaller numbers of animals with tens of thousands of phenotypes. Third, we show several useful extensions, such as testing multiple markers at once, testing pleiotropic effects and testing association of social genetic effects.

Genome-wide association study reveals candidate genes for flowering time in cowpea (Vigna unguiculata [L.] Walp)

10.1101/2021.04.01.438123 ◽

2021 ◽

Author(s):

Dev Paudel ◽

Rocheteau Dareus ◽

Julia Rosenwald ◽

Maria Munoz-Amatriain ◽

Esteban Rios

Keyword(s):

Flowering Time ◽

Candidate Genes ◽

Vigna Unguiculata ◽

Association Studies ◽

Snp Markers ◽

Genome Wide Association ◽

Human Consumption ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Genome Wide

Cowpea (Vigna unguiculata [L.] Walp., diploid, 2n = 22) is a major crop used as a protein source for human consumption as well as a quality feed for livestock. It is drought and heat tolerant and has been bred to develop varieties that are resilient to changing climates. Plant adaptation to new climates and their yield are strongly affected by flowering time. Therefore, understanding the genetic basis of flowering time is critical to advance cowpea breeding. The aim of this study was to perform genome-wide association studies (GWAS) to identify marker trait associations for flowering time in cowpea using single nucleotide polymorphism (SNP) markers. A total of 367 accessions from a cowpea mini-core collection were evaluated in Ft. Collins, CO in 2019 and 2020, and 292 accessions were evaluated in Citra, FL in 2018. These accessions were genotyped using the Cowpea iSelect Consortium Array that contained 51,128 SNPs. GWAS revealed seven reliable SNPs for flowering time that explained 8-12% of the phenotypic variance. Candidate genes including FT, GI, CRY2, LSH3, UGT87A2, LIF2, and HTA9 that are associated with flowering time were identified for the significant SNP markers. Further efforts to validate these loci will help to understand their role in flowering time in cowpea, and it could facilitate the transfer of some of this knowledge to other closely related legume species.

Heritability jointly explained by host genotype and microbiome: will improve traits prediction?

Briefings in Bioinformatics ◽

10.1093/bib/bbaa175 ◽

2020 ◽

Author(s):

Denis Awany ◽

Emile R Chimusa

Keyword(s):

Genetic Variants ◽

Association Studies ◽

Heritability Estimate ◽

Substantial Part ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Host Genotype ◽

Genome Wide ◽

Heritability Estimation

Abstract As we observe the 70th anniversary of the publication by Robertson that formalized the notion of ‘heritability’, geneticists remain puzzled by the problem of missing/hidden heritability, where heritability estimates from genome-wide association studies (GWASs) fall short of those from twin-based studies. Many possible explanations have been offered for this discrepancy, including the existence of genetic variants poorly captured by existing arrays, dominance, epistasis and unaccounted-for environmental factors, although these remain controversial. We believe a substantial part of this problem could be solved or better understood by incorporating the host’s microbiota information in the GWAS model for heritability estimation, and that doing so may also improve the prediction of human traits for clinical utility. This is because, despite empirical observations such as (i) the intimate role of the microbiome in many complex human phenotypes, (ii) the overlap between genetic variants associated with both microbiome attributes and complex diseases and (iii) the existence of heritable bacterial taxa, current GWAS models for heritability estimation do not take into account the contributory role of the microbiome. Furthermore, heritability estimates from twin-based studies do not discern the microbiome component of the observed total phenotypic variance. Here, we summarize the concept of heritability in GWAS and microbiome-wide association studies, focusing on its estimation, from a statistical genetics perspective. We then discuss a possible statistical method to incorporate the microbiome in the estimation of heritability in host GWAS.

Exome resequencing and GWAS for growth, ecophysiology, and chemical and metabolomic composition of wood of Populus trichocarpa

BMC Genomics ◽

10.1186/s12864-019-6160-9 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Fernando P. Guerra ◽

Haktan Suren ◽

Jason Holliday ◽

James H. Richards ◽

Oliver Fiehn ◽

...

Keyword(s):

Biomass Production ◽

Complex Traits ◽

Association Studies ◽

Populus Trichocarpa ◽

Significant Snps ◽

Snp Markers ◽

Exome Capture ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Improvement Programs

Abstract Background Populus trichocarpa is an important forest tree species for the generation of lignocellulosic ethanol. Understanding the genomic basis of biomass production and chemical composition of wood is fundamental in supporting genetic improvement programs. Considerable variation has been observed in this species for complex traits related to growth, phenology, ecophysiology and wood chemistry. Those traits are influenced by both polygenic control and environmental effects, and their genome architecture and regulation are only partially understood. Genome wide association studies (GWAS) represent an approach to advance that aim using thousands of single nucleotide polymorphisms (SNPs). Genotyping using exome capture methodologies represents an efficient approach to identify specific functional regions of genomes underlying phenotypic variation. Results We identified 813 K SNPs, which were utilized for genotyping 461 P. trichocarpa clones, representing 101 provenances collected from Oregon and Washington, and established in California. A GWAS performed on 20 traits, considering single SNP-marker tests, identified a variable number of significant SNPs (p-value < 6.1479E-8) in association with diameter, height, leaf carbon and nitrogen contents, and δ15N. The number of significant SNPs ranged from 2 to 220 per trait. Additionally, multiple-marker analyses by sliding-window tests detected between 6 and 192 significant windows for the analyzed traits. The significant SNPs resided within genes that encode proteins belonging to different functional classes such as protein synthesis, energy/metabolism and DNA/RNA metabolism, among others. Conclusions SNP-markers within genes associated with traits of importance for biomass production were detected. They contribute to characterizing the genomic architecture of P. trichocarpa biomass required to support the development and application of marker breeding technologies.
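The significance threshold quoted above (p-value < 6.1479E-8) is numerically consistent with a Bonferroni correction at a family-wise error rate of 0.05 over roughly 813,000 tests, which matches the reported SNP count. The abstract does not name the correction, so that reading is an inference. The Python sketch below only reproduces the arithmetic, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_test_count(alpha, threshold):
    """Number of tests implied by a Bonferroni cutoff at a given alpha."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# alpha = 0.05 is an assumption; 813,000 rounds the "813 K SNPs" in the abstract.
print(bonferroni_threshold(0.05, 813_000))
print(round(implied_test_count(0.05, 6.1479e-8)))
```

Bonferroni treats all tests as independent, so with SNPs in linkage disequilibrium it is conservative: the real family-wise error rate is below the nominal 0.05.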

Genome-Wide Association Studies for the Concentration of Albumin in Colostrum and Serum in Chinese Holstein

Animals ◽

10.3390/ani10122211 ◽

2020 ◽

Vol 10 (12) ◽

pp. 2211

Author(s):

Shan Lin ◽

Zihui Wan ◽

Junnan Zhang ◽

Lingna Xu ◽

Bo Han ◽

...

Keyword(s):

Association Studies ◽

Significant Snps ◽

Albumin Concentration ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Chinese Holstein ◽

Genome Wide ◽

Chinese Holstein Cows ◽

Newborn Calves

Albumin can be of particular benefit in fighting infections for newborn calves due to its anti-inflammatory and anti-oxidative stress properties. To identify the candidate genes related to the concentration of albumin in colostrum and serum, we collected the colostrum and blood samples from 572 Chinese Holstein cows within 24 h after calving and measured the concentration of albumin in the colostrum and serum using the ELISA methods. The cows were genotyped with GeneSeek 150 K chips (containing 140,668 single nucleotide polymorphisms; SNPs). After quality control, we performed GWASs via GCTA software with 91,620 SNPs and 563 cows. Consequently, 9 and 7 genome-wide significant SNPs (false discovery rate (FDR) at 1%) were identified. Correspondingly, 42 and 206 functional genes that contained or were close to (±1 Mbp) the significant SNPs were acquired. Integrating the biological process of these genes and the reported QTLs for immune and inflammation traits in cattle, 3 and 12 genes were identified as candidates for the concentration of colostrum and serum albumin, respectively; these are RUNX1, CBR1, OTULIN, CDK6, SHARPIN, CYC1, EXOSC4, PARP10, NRBP2, GFUS, PYCR3, EEF1D, GSDMD, PYCR2 and CXCL12. Our findings provide important information for revealing the genetic mechanism behind albumin concentration and for molecular breeding of disease-resistance traits in dairy cattle.
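Declaring SNPs significant at a false discovery rate of 1% is commonly done with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR method was applied, so the Python sketch below assumes Benjamini-Hochberg. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans marking which tests pass BH at the given FDR.

    Step-up rule: find the largest rank k (1-based, p-values sorted ascending)
    with p_(k) <= k / m * fdr, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that satisfies the rule
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values for five SNPs.
p = [1e-6, 0.5, 0.003, 0.9, 0.02]
print(benjamini_hochberg(p, fdr=0.01))
```

Unlike a Bonferroni cutoff, the per-test threshold here rises with the number of small p-values observed, so the procedure keeps more power when many markers carry signal.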

Single-plant GWAS coupled with bulk segregant analysis allows rapid identification and corroboration of plant-height candidate SNPs

BMC Plant Biology ◽

10.1186/s12870-019-2000-y ◽

2019 ◽

Vol 19 (1) ◽

Cited By ~ 4

Author(s):

Abiskar Gyawali ◽

Vivek Shrestha ◽

Katherine E. Guill ◽

Sherry Flint-Garcia ◽

Timothy M. Beissinger

Keyword(s):

Plant Height ◽

Bulk Segregant Analysis ◽

Association Studies ◽

Significant Snps ◽

Rapid Identification ◽

Candidate Snps ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Single Plant ◽

Crop Species

Abstract Background Genome wide association studies (GWAS) are a powerful tool for identifying quantitative trait loci (QTL) and causal single nucleotide polymorphisms (SNPs)/genes associated with various important traits in crop species. Typically, GWAS in crops are performed using a panel of inbred lines, where multiple replicates of the same inbred are measured and the average phenotype is taken as the response variable. Here we describe and evaluate single plant GWAS (sp-GWAS) for performing a GWAS on individual plants, which does not require an association panel of inbreds. Instead sp-GWAS relies on the phenotypes and genotypes from individual plants sampled from a randomly mating population. Importantly, we demonstrate how sp-GWAS can be efficiently combined with a bulk segregant analysis (BSA) experiment to rapidly corroborate evidence for significant SNPs. Results In this study we used the Shoepeg maize landrace, collected as an open pollinating variety from a farm in Southern Missouri in the 1960s, to evaluate whether sp-GWAS coupled with BSA can be used efficiently and powerfully to detect significant association of SNPs for plant height (PH). Plants were grown in 8 locations across two years and in total 768 individuals were genotyped and phenotyped for sp-GWAS. A total of 306 k polymorphic markers in 768 individuals evaluated via association analysis detected 25 significant SNPs (P ≤ 0.00001) for PH. The results from our single-plant GWAS were further validated by bulk segregant analysis (BSA) for PH. BSA sequencing was performed on the same population by selecting tall and short plants as separate bulks. This approach identified 37 genomic regions for plant height. Of the 25 significant SNPs from GWAS, the three most significant SNPs co-localize with regions identified by BSA. Conclusion Overall, this study demonstrates that sp-GWAS coupled with BSA can be a useful tool for detecting significant SNPs and identifying candidate genes. This result is particularly useful for species/populations where association panels are not readily available.
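Checking whether GWAS hits co-localize with BSA regions comes down to an interval lookup: a SNP co-localizes if its position falls inside a region on the same chromosome. The Python sketch below shows one way to express that check. The function name, the tuple layout and all coordinates are hypothetical and are not taken from the study.

```python
def colocalized(snps, regions):
    """Return the SNPs that fall inside at least one region.

    snps: iterable of (chromosome, position) tuples
    regions: iterable of (chromosome, start, end) tuples, with inclusive bounds
    """
    region_list = list(regions)
    hits = []
    for chrom, pos in snps:
        if any(c == chrom and start <= pos <= end for c, start, end in region_list):
            hits.append((chrom, pos))
    return hits


# Hypothetical coordinates for illustration only.
gwas_snps = [("chr1", 1_500_000), ("chr2", 40_000), ("chr5", 9_000_000)]
bsa_regions = [("chr1", 1_000_000, 2_000_000), ("chr5", 100_000, 200_000)]
print(colocalized(gwas_snps, bsa_regions))
```

For 25 SNPs against 37 regions a linear scan like this is adequate. An interval tree or sorted-bounds search would only matter at much larger scales.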
