A Pleiotropy-Informed Bayesian False Discovery Rate adapted to a Shared Control Design Finds New Disease Associations From GWAS Summary Statistics

2015 ◽  
Author(s):  
James Liley ◽  
Chris Wallace

Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
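To make the cFDR idea concrete, here is a minimal sketch of a commonly used empirical estimator: the principal p value divided by the empirical conditional distribution of principal p values among SNPs whose conditional p value is at or below the threshold. The function name `empirical_cfdr` and the toy p values are assumptions for illustration. The sketch covers only the basic case and does not reproduce the shared-control adjustment or the bound over unions of SNP sets described in the abstract.

```python
def empirical_cfdr(p_principal, p_conditional):
    """Estimate cFDR(p_i | p_j) for each SNP from two lists of p values.

    Illustrative only: assumes independent studies (no shared controls)
    and uses a simple O(n^2) count rather than an optimized algorithm.
    """
    if len(p_principal) != len(p_conditional):
        raise ValueError("both p-value lists must have one entry per SNP")
    estimates = []
    for pi, pj in zip(p_principal, p_conditional):
        # Number of SNPs whose conditional-trait p value is at most pj.
        n_cond = sum(1 for q in p_conditional if q <= pj)
        # Number of SNPs at or below both thresholds (always >= 1,
        # because the SNP under consideration counts itself).
        n_both = sum(
            1 for p, q in zip(p_principal, p_conditional) if p <= pi and q <= pj
        )
        estimates.append(min(1.0, pi * n_cond / n_both))
    return estimates


# Toy p values for four hypothetical SNPs in two studies.
principal = [0.001, 0.5, 0.02, 0.8]
conditional = [0.01, 0.9, 0.03, 0.6]
cfdr = empirical_cfdr(principal, conditional)
```

A SNP whose principal p value is modest can still receive a low cFDR estimate if small principal p values are over-represented among SNPs that are also associated with the conditional trait.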

2017 ◽  
Vol 27 (9) ◽  
pp. 2795-2808 ◽  
Author(s):  
Wei Jiang ◽  
Weichuan Yu

In genome-wide association studies, we normally discover associations between genetic variants and diseases/traits in primary studies, and validate the findings in replication studies. We consider the associations identified in both primary and replication studies as true findings. An important question under this two-stage setting is how to determine significance levels in both studies. In traditional methods, significance levels of the primary and replication studies are determined separately. We argue that the separate determination strategy reduces the power in the overall two-stage study. Therefore, we propose a novel method to determine significance levels jointly. Our method is a reanalysis method that needs summary statistics from both studies. We find the most powerful significance levels when controlling the false discovery rate in the two-stage study. To enjoy the power improvement from the joint determination method, we need to select single nucleotide polymorphisms for replication at a less stringent significance level. This is a common practice in studies designed for discovery purposes. We suggest this practice is also suitable in studies designed for validation purposes in order to identify more true findings. Simulation experiments show that our method can provide more power than traditional methods and that the false discovery rate is well-controlled. Empirical experiments on datasets of five diseases/traits demonstrate that our method can help identify more associations. The R-package is available at: http://bioinformatics.ust.hk/RFdr.html.


Author(s):  
Ismaïl Ahmed ◽  
Anna-Liisa Hartikainen ◽  
Marjo-Riitta Järvelin ◽  
Sylvia Richardson

Stability Selection, which combines penalized regression with subsampling, is a promising algorithm to perform variable selection in ultra high dimension. This work is motivated by its evaluation in the context of genome-wide association studies (GWAS). One critical aspect for its use lies in the choice of a decision rule that accounts for the massive number of comparisons realised. The current decision rule relies on the control of the Family Wise Error Rate (FWER) by means of an upper bound derived theoretically. Alternatively, we propose to set the detection threshold according to the more liberal false discovery rate (FDR) criterion. The procedure we propose for its estimation relies on permutations. This procedure is evaluated by simulations according to several scenarios mimicking various correlation structures of genetic data and is compared to the original FWER upper bound. The proposed procedure is shown to be less conservative, and able to pick up more true signals than the FWER upper bound. Finally, the proposed methodology is illustrated on a GWAS analysis of a lipid phenotype (high-density lipoproteins, HDL) in the Northern Finland Birth Cohort.


2020 ◽  
Author(s):  
Matteo Sesia ◽  
Stephen Bates ◽  
Emmanuel Candès ◽  
Jonathan Marchini ◽  
Chiara Sabatti

This paper proposes a novel statistical method to address population structure in genome-wide association studies while controlling the false discovery rate, which overcomes some limitations of existing approaches. Our solution accounts for linkage disequilibrium and diverse ancestries by combining conditional testing via knockoffs with hidden Markov models from state-of-the-art phasing methods. Furthermore, we account for familial relatedness by describing the joint distribution of haplotypes sharing long identical-by-descent segments with a generalized hidden Markov model. Extensive simulations affirm the validity of this method, while applications to UK Biobank phenotypes yield many more discoveries compared to BOLT-LMM, most of which are confirmed by the Japan Biobank and FinnGen data.


2017 ◽  
Author(s):  
Rong W. Zablocki ◽  
Richard A. Levine ◽  
Andrew J. Schork ◽  
Shujing Xu ◽  
Yunpeng Wang ◽  
...  

While genome-wide association studies (GWAS) have discovered thousands of risk loci for heritable disorders, so far even very large meta-analyses have recovered only a fraction of the heritability of most complex traits. Recent work utilizing variance components models has demonstrated that a larger fraction of the heritability of complex phenotypes is captured by the additive effects of SNPs than is evident only in loci surpassing genome-wide significance thresholds, typically set at a Bonferroni-inspired p ≤ 5 × 10⁻⁸. Procedures that control false discovery rate can be more powerful, yet these are still under-powered to detect the majority of non-null effects from GWAS. The current work proposes a novel Bayesian semi-parametric two-group mixture model and develops a Markov Chain Monte Carlo (MCMC) algorithm for a covariate-modulated local false discovery rate (cmfdr). The probability of being non-null depends on a set of covariates via a logistic function, and the non-null distribution is approximated as a linear combination of B-spline densities, where the weight of each B-spline density depends on a multinomial function of the covariates. The proposed methods were motivated by work on a large meta-analysis of schizophrenia GWAS performed by the Psychiatric Genetics Consortium (PGC). We show that the new cmfdr model fits the PGC schizophrenia GWAS test statistics well, performing better than our previously proposed parametric gamma model for estimating the non-null density and substantially improving power over usual fdr. Using loci declared significant at cmfdr ≤ 0.20, we perform follow-up pathway analyses using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Homo sapiens pathways database. We demonstrate that the increased yield from the cmfdr model results in an improved ability to test for pathways associated with schizophrenia compared to using those SNPs selected according to usual fdr.
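The two-group structure behind a covariate-modulated local fdr can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' model: the logistic link for the non-null probability follows the abstract, but the non-null density here is a single wide normal standing in for the B-spline mixture, its weights do not depend on covariates, and all parameters are fixed by hand rather than estimated by MCMC. The names `prob_non_null`, `local_fdr`, and `non_null_sd` are hypothetical.

```python
import math


def normal_pdf(z, sd=1.0):
    """Density of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prob_non_null(covariates, intercept, weights):
    """Prior probability that a SNP is non-null, via a logistic function."""
    eta = intercept + sum(w * x for w, x in zip(weights, covariates))
    return 1.0 / (1.0 + math.exp(-eta))


def local_fdr(z, covariates, intercept, weights, non_null_sd=3.0):
    """Posterior probability that a test statistic z comes from the null.

    Assumption: the non-null density is N(0, non_null_sd^2), a placeholder
    for the semi-parametric B-spline density described in the abstract.
    """
    pi1 = prob_non_null(covariates, intercept, weights)
    f0 = normal_pdf(z)                  # theoretical null N(0, 1)
    f1 = normal_pdf(z, sd=non_null_sd)  # stand-in non-null density
    return (1.0 - pi1) * f0 / ((1.0 - pi1) * f0 + pi1 * f1)


# Same test statistic, two different values of a hypothetical enrichment covariate.
fdr_low_cov = local_fdr(2.0, [0.0], intercept=0.0, weights=[1.0])
fdr_high_cov = local_fdr(2.0, [1.0], intercept=0.0, weights=[1.0])
```

With a positive covariate weight, a SNP with a larger covariate value gets a higher prior probability of being non-null and therefore a lower local fdr for the same test statistic, which is the mechanism by which covariates can increase yield at a fixed threshold.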


2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 986-986
Author(s):  
Yury Loika ◽  
Elena Loiko ◽  
Irina Culminskaya ◽  
Alexander Kulminski

Epidemiological studies report beneficial associations of higher educational attainment (EDU) with Alzheimer’s disease (AD). Prior genome-wide association studies (GWAS) also reported variants associated with AD and EDU separately. The analysis of pleiotropic predisposition to these phenotypes may shed light on EDU-related protection against AD. We examined pleiotropic predisposition to AD and EDU using Fisher’s method and omnibus test applied to summary statistics for single nucleotide polymorphisms (SNPs) associated with AD and EDU in large-scale univariate GWAS at suggestive-effect (5×10⁻⁸


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Haojie Lu ◽  
Jiahao Qiao ◽  
Zhonghe Shao ◽  
Ting Wang ◽  
Shuiping Huang ◽  
...  

Background: Recent genome-wide association studies (GWASs) have revealed the polygenic nature of psychiatric disorders and discovered a few single-nucleotide polymorphisms (SNPs) associated with multiple psychiatric disorders. However, the extent and pattern of pleiotropy among distinct psychiatric disorders remain incompletely understood.
Methods: We analyzed 14 psychiatric disorders using summary statistics from the largest GWASs available to date. We first applied cross-trait linkage disequilibrium score regression (LDSC) to estimate the genetic correlation between disorders. Then, we performed a gene-based pleiotropy analysis by first aggregating a set of SNP-level associations into a single gene-level association signal using MAGMA. From a methodological perspective, we viewed the identification of pleiotropic associations across the entire genome as a high-dimensional problem of composite null hypothesis testing and used a novel method called PLACO for pleiotropy mapping. Finally, we performed functional analysis of the identified pleiotropic genes and used Mendelian randomization to detect causal associations between these disorders.
Results: We confirmed extensive genetic correlation among psychiatric disorders, based on which these disorders can be grouped into three distinct categories. We detected a large number of pleiotropic genes, comprising 5884 associations and 2424 unique genes. Differentially expressed pleiotropic genes were significantly enriched in pancreas, liver, heart, and brain, and the biological processes of these genes were markedly enriched in the regulation of neurodevelopment, neurogenesis, and neuron differentiation, offering substantial evidence supporting the validity of the identified pleiotropic loci. We further demonstrated that, among all the identified pleiotropic genes, 342 unique genes were linked with 6353 drugs through drug-gene interactions, which can be classified into distinct types including inhibitor, agonist, blocker, antagonist, and modulator. We also revealed causal associations among psychiatric disorders, indicating that genetic overlap and causality commonly drove the observed co-existence of these disorders.
Conclusions: Our study is among the first large-scale efforts to characterize gene-level pleiotropy among a greatly expanded set of psychiatric disorders and provides important insight into the shared genetic etiology underlying these disorders. The findings would inform psychiatric nosology, identify potential neurobiological mechanisms predisposing to specific clinical presentations, and pave the way to effective drug targets for clinical treatment.

