Causal SNPs
Recently Published Documents


Total documents: 14 (five years: 5)
H-index: 4 (five years: 0)

2021 ◽  
Vol 17 (10) ◽  
pp. e1009483
Author(s):  
Ruth Johnson ◽  
Kathryn S. Burch ◽  
Kangcheng Hou ◽  
Mario Paciuc ◽  
Bogdan Pasaniuc ◽  
...  

The number of variants that have a non-zero effect on a trait (i.e. polygenicity) is a fundamental parameter in the study of the genetic architecture of a complex trait. Although many previous studies have investigated polygenicity at a genome-wide scale, a detailed understanding of how polygenicity varies across genomic regions is currently lacking. In this work, we propose an accurate and scalable statistical framework to estimate regional polygenicity for a complex trait. We show that our approach yields approximately unbiased estimates of regional polygenicity in simulations across a wide range of genetic architectures. We then partition the polygenicity of anthropometric and blood pressure traits across 6-Mb genomic regions (N = 290K, UK Biobank) and observe that all analyzed traits are highly polygenic: over one-third of regions harbor at least one causal variant for each of the traits analyzed. Additionally, we observe wide variation in regional polygenicity: on average across all traits, 48.9% of regions contain at least 5 causal SNPs and 5.44% of regions contain at least 50 causal SNPs. Finally, we find that heritability is proportional to polygenicity at the regional level, which is consistent with the hypothesis that heritability enrichments are largely driven by the variation in the number of causal SNPs.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kerry E. Poppenberg ◽  
Vincent M. Tutino ◽  
Evan Tarbell ◽  
James N. Jarvis

Background: Genetic variants in the human leukocyte antigen (HLA) locus contribute to the risk for developing scleroderma/systemic sclerosis (SSc). However, there are other replicated loci that also contribute to genetic risk for SSc, and it is unknown whether genetic risk in these non-HLA loci acts primarily on the vasculature, immune system, fibroblasts, or other relevant cell types. We used the Cistrome database to investigate the epigenetic landscapes surrounding 11 replicated SSc-associated loci to determine whether SNPs in these loci may affect regulatory elements and whether they are likely to impact a specific cell type.
Methods: We mapped 11 replicated SNPs to haplotypes and sought to determine whether there was significant enrichment for H3K27ac and H3K4me1 marks, epigenetic signatures of enhancer function, on these haplotypes. We queried pathologically relevant cell types: B cells, endothelial cells, fibroblasts, monocytes, and T cells. We then identified the topologically associated domains (TADs) that encompass the SSc risk haplotypes in primary T cells to identify the full range of genes that may be influenced by SSc causal SNPs. We used gene ontology analyses of the genes within the TADs to gain insight into immunologic functions that might be affected by SSc causal SNPs.
Results: The SSc-associated haplotypes were enriched (p value < 0.01) for H3K4me1/H3K27ac marks in monocytes. Enrichment of one of the two histone marks was found in B cells, fibroblasts, and T cells. No enrichment was identified in endothelial cells. Ontological analyses of genes within the TADs encompassing the risk haplotypes showed enrichment for regulation of transcription, protein binding, activation of T lymphocytes, and proliferation of immune cells.
Conclusions: The 11 non-HLA SSc risk haplotypes queried are highly enriched for H3K4me1/H3K27ac-marked regulatory elements in a broad range of immune cells and fibroblasts. Furthermore, in immune cells, the risk haplotypes belong to larger chromatin structures encompassing genes that regulate a wide array of immune processes associated with SSc pathogenesis. Though the importance of the vasculature in the pathobiology of SSc is widely accepted, we were unable to find evidence for genetic influences on endothelial cell function in these regions.


Author(s):  
Kimberly E Taylor ◽  
K Mark Ansel ◽  
Alexander Marson ◽  
Lindsey A Criswell ◽  
Kyle Kai-How Farh

Abstract: The Probabilistic Identification of Causal SNPs (PICS) algorithm and web application were developed as a fine-mapping tool to determine the likelihood that each single nucleotide polymorphism (SNP) in LD with a reported index SNP is a true causal polymorphism. PICS is notable for its ability to identify candidate causal SNPs within a locus using only index SNPs, which are widely available from published GWAS, whereas other methods require full summary statistics or full genotype data. However, the original PICS web application operates on a single SNP at a time, with slow performance, severely limiting its usability. We have developed a next-generation PICS tool, PICS2, which enables PICS analyses of large batches of index SNPs with much faster performance. Additional updates and extensions include use of LD reference data generated from 1000 Genomes phase 3; annotation of variant consequences; annotation of GTEx eQTL genes and downloadable PICS SNPs from GTEx eQTLs; the option of generating PICS probabilities from experimental summary statistics; and generation of PICS SNPs from all SNPs of the GWAS catalog, automatically updated weekly. These free and easy-to-use resources will enable efficient determination of candidate loci for biological studies to investigate the true causal variants underlying disease processes. Availability: PICS2 is available at https://pics2.ucsf.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Huwenbo Shi ◽  
Kathryn S. Burch ◽  
Ruth Johnson ◽  
Malika K. Freund ◽  
Gleb Kichaev ◽  
...  

Abstract: Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and a slight upward bias with out-of-sample LD. We analyze 9 complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8x enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWAS due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.


2019 ◽  
Author(s):  
Xiao Zhang ◽  
Kenneth C. Ehrlich ◽  
Fangtang Yu ◽  
Xiaojun Hu ◽  
Hong-Wen Deng ◽  
...  

Abstract: A major challenge in translating findings from genome-wide association studies (GWAS) to biological mechanisms is pinpointing functional variants because only a very small percentage of variants associated with a given trait actually impact the trait. We used an extensive epigenetics, transcriptomics, and genetics analysis of the TBX15/WARS2 neighborhood to prioritize this region’s best-candidate causal variants for the genetic risk of osteoporosis (estimated bone density, eBMD) and obesity (waist-hip ratio or waist circumference adjusted for body mass index). TBX15 encodes a transcription factor that is important in bone development and adipose biology. Manual curation of 692 GWAS-derived variants gave eight strong candidates for causal SNPs that modulate TBX15 transcription in subcutaneous adipose tissue (SAT) or osteoblasts, which highly and specifically express this gene. None of these SNPs were prioritized by Bayesian fine-mapping. The eight regulatory causal SNPs were in enhancer or promoter chromatin seen preferentially in SAT or osteoblasts at TBX15 intron-1 or upstream. They overlap strongly predicted, allele-specific transcription factor binding sites. Our analysis suggests that these SNPs act independently of two missense SNPs in TBX15. Remarkably, five of the regulatory SNPs were associated with eBMD and obesity and had the same trait-increasing allele for both. We found that WARS2 obesity-related SNPs can be ascribed to high linkage disequilibrium with TBX15 intron-1 SNPs. Our findings from GWAS index, proxy, and imputed SNPs suggest that a few SNPs, including three in a 0.7-kb cluster, act as causal regulatory variants to fine-tune TBX15 expression and, thereby, affect both obesity and osteoporosis risk.


2018 ◽  
Author(s):  
Dominic Holland ◽  
Oleksandr Frei ◽  
Rahul Desikan ◽  
Chun-Chieh Fan ◽  
Alexey A. Shadrin ◽  
...  

Abstract: Of signal interest in the genetics of human traits is estimating their polygenicity (the proportion of causally associated single nucleotide polymorphisms (SNPs)) and the discoverability (or effect size variance) of the causal SNPs. Narrow-sense heritability is proportional to the product of these quantities. We present a basic model, using detailed linkage disequilibrium structure from an extensive reference panel, to estimate these quantities from genome-wide association studies (GWAS) summary statistics for SNPs with minor allele frequency >1%. We apply the model to diverse phenotypes and validate the implementation with simulations. We find model polygenicities ranging from ≃ 2 × 10⁻⁵ to ≃ 4 × 10⁻³, with discoverabilities similarly ranging over two orders of magnitude. A power analysis allows us to estimate the proportions of phenotypic variance explained additively by causal SNPs at current sample sizes, and map out sample sizes required to explain larger portions of additive SNP heritability. The model also allows for estimating residual inflation.
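The proportionality between heritability and the product of polygenicity and discoverability can be written out under a simple two-component mixture prior. The following is a sketch only, not the authors' stated formula: the symbols $\pi_1$ (polygenicity), $\sigma_\beta^2$ (discoverability), $n_{\mathrm{snp}}$ (number of SNPs in the reference panel), $p_i$ (allele frequency of SNP $i$) and $\bar{H}$ (mean heterozygosity) are notation introduced here. It assumes a unit-variance phenotype, effect sizes independent of allele frequency, and additive, approximately uncorrelated contributions from causal SNPs.

```latex
% Per-SNP effect: causal with probability \pi_1, exactly zero otherwise
\beta_i \sim \pi_1 \,\mathcal{N}(0, \sigma_\beta^2) + (1 - \pi_1)\,\delta_0

% Variance explained by SNP i (allele-count coding): 2 p_i (1 - p_i) \beta_i^2.
% Taking expectations and summing over SNPs:
h^2 \approx \sum_{i=1}^{n_{\mathrm{snp}}} 2 p_i (1 - p_i)\, \pi_1 \sigma_\beta^2
     = n_{\mathrm{snp}} \,\bar{H}\, \pi_1 \sigma_\beta^2,
\qquad
\bar{H} = \frac{1}{n_{\mathrm{snp}}} \sum_{i=1}^{n_{\mathrm{snp}}} 2 p_i (1 - p_i)
```

For a fixed panel, $n_{\mathrm{snp}}$ and $\bar{H}$ are constants, so under these assumptions $h^2$ scales with the product $\pi_1 \sigma_\beta^2$.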


2018 ◽  
Author(s):  
Bingxin Zhao ◽  
Fei Zou

The polygenic risk score (PRS) is the state-of-the-art prediction method for complex traits using summary-level data from discovery genome-wide association studies (GWAS). The PRS, as its name suggests, is designed for polygenic traits by aggregating small genetic effects from a large number of causal SNPs and thus is viewed as a powerful method for predicting complex polygenic traits by the genetics community. However, one concern is that the prediction accuracy of PRS in practice remains low with little clinical utility, even for highly heritable traits. Another practical concern is whether genome-wide SNPs should be used in constructing PRS or not. To address the two concerns, we investigate PRS both empirically and theoretically. We show how the performance of PRS is influenced by the triplet (n, p, m), where n, p, m are the sample size, the number of SNPs studied, and the number of true causal SNPs, respectively. For a given heritability, we find that i) when PRS is constructed with all p SNPs (referred to as GWAS-PRS), its prediction accuracy is controlled by the p/n ratio; while ii) when PRS is built with a set of top-ranked SNPs that pass a pre-specified threshold (referred to as threshold-PRS), its accuracy varies depending on how sparse the true genetic signals are. Only when m is magnitudes smaller than n, or genetic signals are sparse, can threshold-PRS perform well and outperform GWAS-PRS. Our results demystify the low performance of PRS in predicting highly polygenic traits, which will greatly increase researchers’ awareness of the power and limitations of PRS, and clear up some confusion on the clinical application of PRS.
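The contrast between GWAS-PRS and threshold-PRS described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' code: it assumes independent, standardized SNPs (no LD), Gaussian effects on m causal SNPs, and a hypothetical |z| > 3.29 cutoff (roughly two-sided p < 0.001). All function names and default values are illustrative.

```python
import numpy as np


def simulate(n, p, m, h2, rng):
    """Simulate n individuals, p independent standardized SNPs, m of them causal."""
    X = rng.standard_normal((n, p))           # assumption: no LD, unit variance
    beta = np.zeros(p)
    causal = rng.choice(p, size=m, replace=False)
    beta[causal] = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return X, y, beta


def marginal_gwas(X, y):
    """Per-SNP marginal effect estimates and approximate z-scores."""
    n = X.shape[0]
    yc = y - y.mean()
    beta_hat = X.T @ yc / n
    z = beta_hat * np.sqrt(n) / yc.std()
    return beta_hat, z


def prs_accuracy(X_test, y_test, beta_hat, keep):
    """Correlation between the PRS built from the kept SNPs and the phenotype."""
    score = X_test[:, keep] @ beta_hat[keep]
    return float(np.corrcoef(score, y_test)[0, 1])


def compare_prs(n_train=1000, n_test=1000, p=2000, m=10, h2=0.5,
                z_cut=3.29, seed=0):
    """Return (GWAS-PRS accuracy, threshold-PRS accuracy, number of SNPs kept)."""
    rng = np.random.default_rng(seed)
    X, y, _ = simulate(n_train + n_test, p, m, h2, rng)
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]
    beta_hat, z = marginal_gwas(X_tr, y_tr)
    all_snps = np.ones(p, dtype=bool)         # GWAS-PRS: use every SNP
    top_snps = np.abs(z) > z_cut              # threshold-PRS: top-ranked SNPs only
    acc_all = prs_accuracy(X_te, y_te, beta_hat, all_snps)
    acc_thr = prs_accuracy(X_te, y_te, beta_hat, top_snps)
    return acc_all, acc_thr, int(top_snps.sum())


acc_all, acc_thr, n_kept = compare_prs()
print(f"GWAS-PRS r = {acc_all:.3f}; threshold-PRS r = {acc_thr:.3f} ({n_kept} SNPs kept)")
```

Raising `m` toward `n_train` while holding `h2` fixed makes the signal denser and weakens each per-SNP effect, which is the regime in which the abstract reports that thresholding stops helping.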


2017 ◽  
Author(s):  
Dominic Holland ◽  
Oleksandr Frei ◽  
Rahul Desikan ◽  
Chun-Chieh Fan ◽  
Alexey A. Shadrin ◽  
...  

Abstract: Estimating the polygenicity (proportion of causally associated single nucleotide polymorphisms (SNPs)) and discoverability (effect size variance) of causal SNPs for human traits is currently of considerable interest. SNP-heritability is proportional to the product of these quantities. We present a basic model, using detailed linkage disequilibrium structure from an extensive reference panel, to estimate these quantities from genome-wide association studies (GWAS) summary statistics. We apply the model to diverse phenotypes and validate the implementation with simulations. We find model polygenicities ranging from ≃ 2 × 10⁻⁵ to ≃ 4 × 10⁻³, with discoverabilities similarly ranging over two orders of magnitude. A power analysis allows us to estimate the proportions of phenotypic variance explained additively by causal SNPs reaching genome-wide significance at current sample sizes, and map out sample sizes required to explain larger portions of additive SNP heritability. The model also allows for estimating residual inflation (or deflation from over-correcting of z-scores), and assessing compatibility of replication and discovery GWAS summary statistics.
Author Summary: There are ~10 million common variants in the genome of humans with European ancestry. For any particular phenotype a number of these variants will have some causal effect. It is of great interest to be able to quantify the number of these causal variants and the strength of their effect on the phenotype.
Genome-wide association studies (GWAS) produce very noisy summary statistics for the association between subsets of common variants and phenotypes. For any phenotype, these statistics collectively are difficult to interpret, but buried within them is the true landscape of causal effects. In this work, we posit a probability distribution for the causal effects, and assess its validity using simulations. Using a detailed reference panel of ~11 million common variants – among which only a small fraction are likely to be causal, but allowing for non-causal variants to show an association with the phenotype due to correlation with causal variants – we implement an exact procedure for estimating the number of causal variants and their mean strength of association with the phenotype. We find that, across different phenotypes, both these quantities – whose product allows for lower bound estimates of heritability – vary by orders of magnitude.


2016 ◽  
Vol 14 (05) ◽  
pp. 1644003 ◽  
Author(s):  
Kento Kodama ◽  
Hiroto Saigo

Despite the accumulation of quantitative trait loci (QTL) data in many complex human diseases, most current approaches that have attempted to relate genotype to phenotype have achieved limited success, and the genetic factors of many common diseases remain to be elucidated. One of the reasons this problem is complex is the existence of single nucleotide polymorphism (SNP) interaction, or epistasis. Due to the excessive amount of computation required to search the combinatorial space, existing approaches cannot fully incorporate high-order SNP interactions into their models, but limit themselves to detecting only lower-order SNP interactions. We present an empirical approach based on ridge regression with polynomial kernels and a model selection technique for determining the true degree of epistasis among SNPs. Computer experiments on simulated data show the ability of the proposed method to correctly predict the number of interacting SNPs provided that the number of samples is large enough relative to the number of SNPs. For cases in which the number of available samples is limited, we propose a sliding-window approach to ensure a sufficiently large sample/SNP ratio in each window. In computational experiments using heterogeneous stock mice data, our approach successfully detected subregions that harbor known causal SNPs. Our analysis further suggests the existence of additional candidate causal SNPs interacting with each other in the neighborhood of the known causal gene. Software is available from https://github.com/HirotoSaigo/KDSNP.
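The idea of choosing the degree of a polynomial kernel to match the order of SNP interaction can be sketched as follows. This is an illustrative sketch under stated assumptions, not the KDSNP implementation: it uses the inhomogeneous kernel (x·z + 1)^d, plain hold-out validation as the model selection step (the abstract does not say which technique the authors use), and markers coded -1/+1 as a simplification of 0/1/2 genotype coding. All function names are hypothetical.

```python
import numpy as np


def poly_kernel(A, B, degree, coef0=1.0):
    """Inhomogeneous polynomial kernel (a . b + coef0) ** degree."""
    return (A @ B.T + coef0) ** degree


def fit_kernel_ridge(X, y, degree, lam=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    K = poly_kernel(X, X, degree)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)


def predict(X_train, alpha, X_new, degree):
    """Predict phenotypes for new samples from the dual coefficients."""
    return poly_kernel(X_new, X_train, degree) @ alpha


def validation_mse_by_degree(X_tr, y_tr, X_va, y_va, degrees=(1, 2, 3), lam=1.0):
    """Hold-out mean squared error for each candidate kernel degree."""
    mse = {}
    for d in degrees:
        alpha = fit_kernel_ridge(X_tr, y_tr, d, lam)
        resid = y_va - predict(X_tr, alpha, X_va, d)
        mse[d] = float(np.mean(resid ** 2))
    return mse


def demo(seed=0):
    """Phenotype driven by a pure two-way interaction between markers 0 and 1."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=(400, 5))   # assumption: -1/+1 coding
    y = X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(400)
    mse = validation_mse_by_degree(X[:300], y[:300], X[300:], y[300:])
    best = min(mse, key=mse.get)                 # degree with lowest hold-out error
    return mse, best


mse, best = demo()
print("validation MSE by degree:", mse, "| selected degree:", best)
```

A degree-1 kernel cannot represent the product term, so its hold-out error stays near the phenotypic variance, while degrees of 2 or more can fit it. A sliding-window variant would apply the same selection to consecutive blocks of SNPs.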

