VAMPIRE: A model blood cell trait focused annotation tool for interpretation and prioritization of complex trait genome-wide association study results

2021 ◽  
Author(s):  
Cheynna Crowley ◽  
Quan Sun ◽  
Le Huang ◽  
Erik L. Bao ◽  
Paul Auer ◽  
...  

Abstract Thousands of genetic loci have been identified as associated with hematological indices (red blood cell, white blood cell, and platelet related traits), as well as other complex traits and disease. However, most loci identified are noncoding and not clearly linked to target genes, and tools are needed to prioritize the most likely functional variants for experimental follow-up. We here describe VAMPIRE: Variant Annotation Method Pointing to Interesting Regulatory Effects, an interactive web application implemented in R Shiny (http://shiny.bios.unc.edu/vampire/) for blood cell trait associated loci from recent large multi-ethnic genome-wide association studies (GWAS). This tool efficiently displays information from blood cell relevant tissues on epigenomic signatures, functional and conservation summary scores, variant impact on protein and gene expression, chromatin conformation information from Hi-C and similar technologies, as well as publicly available GWAS and phenome-wide association study (PheWAS) results. Variants are classified into multiple prioritization categories according to these functional signatures. Leveraging data generated from independent functional validation experiments, we demonstrate that our prioritized variants are enriched within experimentally validated variant sets. VAMPIRE allows rapid prioritization and interpretation of blood cell trait GWAS variants and could be easily adapted for use with other complex trait GWAS results and extended to new annotation sources.

Author Summary Many large genome-wide association studies (GWAS) have recently been performed for blood cell traits, with thousands of associations identified. However, most of the associated variants are in noncoding regions and are often hard to interpret, link to genes, and prioritize for functional follow-up. Similar challenges exist for genetic studies of many other traits and diseases. To help translate GWAS-significant variants into target genes and biological insights, we here describe VAMPIRE: Variant Annotation Method Pointing to Interesting Regulatory Effects, an interactive web application implemented in R Shiny (http://shiny.bios.unc.edu/vampire/) for blood cell trait associated loci from recent large multi-ethnic GWAS. This tool displays a variety of information including epigenomic signatures, variant impact on protein and gene expression, chromatin conformation information, and publicly available GWAS and phenome-wide association study (PheWAS) results for other traits. We classified variants into annotation categories using this information, and show that variants in the highest priority categories are enriched in likely causal variant sets from previous functional experiments. We anticipate that this tool will help researchers studying blood cell traits choose which variants to prioritize for experimental validation, and that it will provide an easily adaptable model for the creation of similar annotation tools for other complex traits and diseases.
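The enrichment claim above can be illustrated with a small sketch. The abstract does not say which statistical test the authors used, so the one-sided hypergeometric test below is an assumption chosen for illustration, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_validated, n_prioritized, n_overlap):
    """One-sided P(X >= n_overlap) under the hypergeometric null.

    Null model (an assumption, not taken from the paper): the n_prioritized
    variants are drawn at random, without replacement, from n_total variants,
    of which n_validated are experimentally validated.
    """
    upper = min(n_validated, n_prioritized)
    denom = comb(n_total, n_prioritized)
    tail = sum(
        comb(n_validated, k) * comb(n_total - n_validated, n_prioritized - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 1000 variants, 50 validated, 100 prioritized, 15 in both.
p_value = hypergeom_enrichment_p(1000, 50, 100, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would mean the prioritized set overlaps the validated set more than random selection would typically produce.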

2021 ◽  
Author(s):  
Gui-Juan Feng ◽  
Qian Xu ◽  
Jing-Jing Ni ◽  
Shan-Shan Yang ◽  
Bai-Xue Han ◽  
...  

Abstract Age at menarche (AAM) is a marker of puberty in females. It is a heritable trait associated with various adult diseases. However, the genetic mechanism that determines AAM and links it to disease risk is poorly understood. Aiming to uncover the genetic basis of AAM, we conducted a joint association study in up to 438,089 participants from 3 genome-wide association studies of European and East Asian ancestries. Twenty-one novel genomic loci were identified at the genome-wide significance level. In addition, we observed significant genetic correlations between AAM and 67 complex traits; the strongest genetic correlation was between AAM and body mass index (rg = -0.19, P = 6.11×10⁻³¹). Latent causal variable analyses demonstrate that there is a genetically causal effect of AAM on high blood pressure (GCP = 0.47, P = 0.02), forced vital capacity (GCP = 0.63, P = 0.02), age at first live birth (GCP = 0.51, P = 0.03), impedance of right arm (GCP = 0.41, P < 1×10⁻⁷) and right leg fat percentage (GCP = -0.10, P = 0.02), among others. Enrichment analysis identified 5 enriched tissues and 51 enriched gene sets. Four of the five enriched tissues were related to the nervous system, including the hypothalamus middle, hypothalamo hypophyseal system, neurosecretory systems and hypothalamus. The fifth tissue was the retina in the sensory organ. The most significant gene set was the ‘decreased circulating luteinizing hormone level’ (P = 2.45×10⁻⁶). Our findings may provide useful insights that elucidate the mechanisms determining AAM and the genetic interplay between AAM and some traits of women.


2018 ◽  
Author(s):  
Urmo Võsa ◽  
Annique Claringbould ◽  
Harm-Jan Westra ◽  
Marc Jan Bonder ◽  
Patrick Deelen ◽  
...  

Summary While many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear. To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium. We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci. In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated with the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology. We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.
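As a rough illustration of the polygenic-score analysis mentioned above, the sketch below computes a score as a weighted sum of allele dosages and correlates it with a gene's expression level. It shows the general idea only, not the eQTLGen pipeline; the function names, effect weights, dosages, and expression values are all made up.

```python
from math import sqrt


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    return sum(d * w for d, w in zip(dosages, weights))


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-variant effect weights (e.g. GWAS effect sizes).
weights = [0.5, -0.2, 0.1]
# Hypothetical dosages: one row per individual, one column per variant.
genotypes = [[0, 1, 2], [1, 0, 0], [2, 2, 1], [0, 0, 0]]
# Hypothetical expression level of one gene in the same four individuals.
expression = [1.1, 1.9, 2.6, 0.8]

scores = [polygenic_score(row, weights) for row in genotypes]
r = pearson(scores, expression)
print(f"correlation between polygenic score and expression: {r:.2f}")
```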


2019 ◽  
Vol 20 (1) ◽  
pp. 461-493 ◽  
Author(s):  
Guy Sella ◽  
Nicholas H. Barton

Many traits of interest are highly heritable and genetically complex, meaning that much of the variation they exhibit arises from differences at numerous loci in the genome. Complex traits and their evolution have been studied for more than a century, but only in the last decade have genome-wide association studies (GWASs) in humans begun to reveal their genetic basis. Here, we bring these threads of research together to ask how findings from GWASs can further our understanding of the processes that give rise to heritable variation in complex traits and of the genetic basis of complex trait evolution in response to changing selection pressures (i.e., of polygenic adaptation). Conversely, we ask how evolutionary thinking helps us to interpret findings from GWASs and informs related efforts of practical importance.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yao Hu ◽  
Stephanie A. Bien ◽  
Katherine K. Nishimura ◽  
Jeffrey Haessler ◽  
Chani J. Hodonsky ◽  
...  

Abstract Background: Circulating white blood cell and platelet traits are clinically linked to various disease outcomes and differ across individuals and ancestry groups. Genetic factors play an important role in determining these traits and many loci have been identified. However, most of these findings were identified in populations of European ancestry (EA), with African Americans (AA), Hispanics/Latinos (HL), and other races/ethnicities being severely underrepresented. Results: We performed ancestry-combined and ancestry-specific genome-wide association studies (GWAS) for white blood cell and platelet traits in the ancestrally diverse Population Architecture using Genomics and Epidemiology (PAGE) Study, including 16,201 AA, 21,347 HL, and 27,236 EA participants. We identified six novel findings at suggestive significance (P < 5E-8), which need confirmation, and independent signals at six previously established regions at genome-wide significance (P < 2E-9). We confirmed multiple previously reported genome-wide significant variants in the single variant association analysis and multiple genes using PrediXcan. Evaluation of loci reported from a Euro-centric GWAS indicated attenuation of effect estimates in AA and HL compared to EA populations. Conclusions: Our results highlighted the potential to identify ancestry-specific and ancestry-agnostic variants in participants with diverse backgrounds and advocate for continued efforts in improving inclusion of racially/ethnically diverse populations in genetic association studies for complex traits.


Author(s):  
Jianhua Wang ◽  
Dandan Huang ◽  
Yao Zhou ◽  
Hongcheng Yao ◽  
Huanhuan Liu ◽  
...  

Abstract Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.
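To make the idea of a credible set concrete, here is a minimal sketch of how one is commonly built from per-variant posterior probabilities: rank variants by probability and keep adding them until the cumulative probability reaches the target coverage. It assumes a single causal variant per locus with probabilities summing to 1, and it does not reproduce any of the fine-mapping tools used by CAUSALdb. The variant identifiers and probabilities are hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Smallest top-ranked set of variants whose posteriors sum to >= coverage.

    posteriors: dict mapping variant ID -> posterior probability of causality.
    If the probabilities never reach the coverage, all variants are returned.
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for four variants at one locus.
locus = {"rsA": 0.60, "rsB": 0.30, "rsC": 0.06, "rsD": 0.04}
print(credible_set(locus))
```

With these toy values the 95% set keeps the three highest-ranked variants and drops `rsD`; lowering `coverage` shrinks the set.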


2021 ◽  
Vol 42 (1) ◽  
Author(s):  
Dinesh K. Saini ◽  
Yuvraj Chopra ◽  
Jagmohan Singh ◽  
Karansher S. Sandhu ◽  
Anand Kumar ◽  
...  

Author(s):  
Nasa Sinnott-Armstrong ◽  
Sahin Naqvi ◽  
Manuel Rivas ◽  
Jonathan K Pritchard

Summary Genome-wide association studies (GWAS) have been used to study the genetic basis of a wide variety of complex diseases and other traits. However, for most traits it remains difficult to interpret what genes and biological processes are impacted by the top hits. Here, as a contrast, we describe UK Biobank GWAS results for three molecular traits (urate, IGF-1, and testosterone) that are biologically simpler than most diseases, and for which we know a great deal in advance about the core genes and pathways. Unlike most GWAS of complex traits, for all three traits we find that most top hits are readily interpretable. We observe huge enrichment of significant signals near genes involved in the relevant biosynthesis, transport, or signaling pathways. We show how GWAS data illuminate the biology of variation in each trait, including insights into differences in testosterone regulation between females and males. Meanwhile, in other respects the results are reminiscent of GWAS for more-complex traits. In particular, even these molecular traits are highly polygenic, with most of the variance coming not from core genes, but from thousands to tens of thousands of variants spread across most of the genome. Given that diseases are often impacted by many distinct biological processes, including these three, our results help to illustrate why so many variants can affect risk for any given disease.


2019 ◽  
Author(s):  
Jan A. Freudenthal ◽  
Markus J. Ankenbrand ◽  
Dominik G. Grimm ◽  
Arthur Korte

Abstract Motivation: Genome-wide association studies (GWAS) are one of the most commonly used methods to detect associations between complex traits and genomic polymorphisms. As both genotyping and phenotyping of large populations have become easier, typical modern GWAS have to cope with massive amounts of data. Thus, the computational demand for these analyses has grown remarkably over the last decades. This is especially true if one wants to implement permutation-based significance thresholds instead of using the naïve Bonferroni threshold. Permutation-based methods have the advantage of providing an adjusted multiple hypothesis correction threshold that takes the underlying phenotypic distribution into account and will thus remove the need to find the correct transformation for non-Gaussian phenotypes. To enable efficient analyses of large datasets and the possibility to compute permutation-based significance thresholds, we used the machine learning framework TensorFlow to develop a linear mixed model (GWAS-Flow) that can make use of the available CPU or GPU infrastructure to decrease the time of the analyses especially for large datasets. Results: We were able to show that our application GWAS-Flow outperforms custom GWAS scripts in terms of speed without losing accuracy. Apart from p-values, GWAS-Flow also computes summary statistics, such as the effect size and its standard error for each individual marker. The CPU-based version is the default choice for small data, while the GPU-based version of GWAS-Flow is especially suited for the analyses of big data. Availability: GWAS-Flow is freely available on GitHub (https://github.com/Joyvalley/GWAS_Flow) and is released under the terms of the MIT-License.
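The permutation-based threshold described above can be sketched in a few lines. This is a simplified illustration, not GWAS-Flow's implementation: it replaces the linear mixed model with a plain per-marker correlation, and the function names and toy data are hypothetical. The phenotype is shuffled many times, the strongest marker statistic is recorded for each shuffle, and the threshold is the (1 - alpha) quantile of those maxima.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return abs(sxy) / sqrt(sxx * syy)


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the per-permutation maximum |correlation|.

    genotypes: list of markers, each a list of dosages across individuals.
    phenotype: list of trait values for the same individuals.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        max_stats.append(max(abs_corr(marker, shuffled) for marker in genotypes))
    max_stats.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return max_stats[index]


# Toy data: 20 markers and 30 individuals, all randomly generated.
data_rng = random.Random(1)
toy_genotypes = [[data_rng.choice([0, 1, 2]) for _ in range(30)] for _ in range(20)]
toy_phenotype = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
threshold = permutation_threshold(toy_genotypes, toy_phenotype)
print(f"permutation-based threshold on |r|: {threshold:.3f}")
```

A marker whose statistic on the unshuffled phenotype exceeds this threshold would be called significant at the chosen family-wise error rate. Because the null distribution comes from the observed phenotype values themselves, no Gaussian assumption about the trait is needed.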

