Semi-parametric empirical Bayes factor for genome-wide association studies

Abstract: Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of a statistic on disease association under the alternative hypothesis of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79–86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure in incorporating the probability under the alternative hypothesis. In this paper, we propose a semi-parametric, empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs.
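To make the normal-prior ABF mentioned in the abstract concrete, the following is a minimal sketch of the closed form commonly attributed to Wakefield (2009), written here as the ratio of the likelihood under the null to the likelihood under the alternative. The function names, argument names, and example numbers are assumptions chosen for illustration and do not come from the paper. The SP-EBF itself is not sketched, because the abstract does not describe how the nonparametric effect-size distribution is estimated.

```python
import math


def wakefield_abf(beta_hat, se, prior_sd):
    """Approximate Bayes factor for H0 versus H1 (values below 1 favour association).

    Assumes the effect estimate is normally distributed with variance V = se**2
    and that, under H1, the true effect follows a N(0, W) prior with W = prior_sd**2.
    """
    v = se ** 2                # V: sampling variance of the estimate
    w = prior_sd ** 2          # W: prior variance of the effect size
    z = beta_hat / se          # Wald z statistic
    shrinkage = w / (v + w)
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z * z * shrinkage)


def posterior_prob_h1(abf, prior_h1):
    """Posterior probability of association given the ABF and a prior Pr(H1)."""
    prior_odds_h0 = (1.0 - prior_h1) / prior_h1
    posterior_odds_h0 = abf * prior_odds_h0
    return 1.0 / (1.0 + posterior_odds_h0)


if __name__ == "__main__":
    # Hypothetical SNP: log odds ratio 0.05 with standard error 0.01,
    # and an assumed prior standard deviation of 0.2.
    abf = wakefield_abf(beta_hat=0.05, se=0.01, prior_sd=0.2)
    print("ABF (H0 vs H1):", abf)
    print("Pr(H1 | data):", posterior_prob_h1(abf, prior_h1=1e-4))
```

Because the prior variance W is fixed in advance, a poorly chosen W changes the Bayes factor for every SNP, which is the misspecification concern the abstract raises.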


Estimation of effect size distribution from genome-wide association studies and implications for future discoveries

Nature Genetics ◽

10.1038/ng.610 ◽

2010 ◽

Vol 42 (7) ◽

pp. 570-575 ◽

Cited By ~ 427

Author(s):

Ju-Hyun Park ◽

Sholom Wacholder ◽

Mitchell H Gail ◽

Ulrike Peters ◽

Kevin B Jacobs ◽

...

Keyword(s):

Size Distribution ◽

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Effect Size Distribution


T21 ESTIMATING POLYGENICITY AND EFFECT-SIZE DISTRIBUTION IN FUNCTIONAL CATEGORIES OF THE GENOME USING SUMMARY STATISTICS DATA FROM GENOME-WIDE ASSOCIATION STUDIES

European Neuropsychopharmacology ◽

10.1016/j.euroneuro.2019.08.220 ◽

2019 ◽

Vol 29 ◽

pp. S229

Author(s):

Alexey Shadrin ◽

Oleksandr Frei ◽

Olav Smeland ◽

Francesco Bettella ◽

Kevin O'Connell ◽

...

Keyword(s):

Size Distribution ◽

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Functional Categories ◽

Genome Wide ◽

Effect Size Distribution


An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies

PLoS Genetics ◽

10.1371/journal.pgen.1005717 ◽

2015 ◽

Vol 11 (12) ◽

pp. e1005717 ◽

Cited By ~ 18

Author(s):

Wesley K. Thompson ◽

Yunpeng Wang ◽

Andrew J. Schork ◽

Aree Witoelar ◽

Verena Zuber ◽

...

Keyword(s):

Mixture Model ◽

Effect Size ◽

Empirical Bayes ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Size Distributions ◽

Genome Wide


Faculty Opinions recommendation of Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.733803377.793550136 ◽

2018 ◽

Author(s):

Mohan Liu

Keyword(s):

Effect Size ◽

Complex Traits ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Size Distributions ◽

Complex Effect ◽

Genome Wide ◽

Level Statistics


Reproducibility in the UK Biobank of Genome-Wide Significant Signals Discovered in Earlier Genome-wide Association Studies

10.1101/2020.06.24.20139576 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jack W. O’Sullivan ◽

John P. A. Ioannidis

Keyword(s):

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

P Value ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Single Nucleotide ◽

Genome Wide ◽

The UK ◽

Open Question

Abstract: With the establishment of large biobanks, the discovery of single nucleotide polymorphisms (SNPs) that are associated with various phenotypes has accelerated. An open question is whether SNPs identified with genome-wide significance in earlier genome-wide association studies (GWAS) are also replicated in later GWAS conducted in biobanks. To address this question, the authors examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier "discovery" GWAS and a later replication GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNPs (of which 6,289 had reached p<5e-8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, and it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNP effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNP effect size, phenotype trait (binary or quantitative), and discovery p-value, we built and validated a model that predicted SNP replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may often reflect lack of power rather than genuine false-positive findings, these results provide insights about which discovered associations are likely to be seen again across subsequent GWAS.
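The area under the ROC curve reported above can be read as the probability that a randomly chosen replicated SNP receives a higher predicted score than a randomly chosen non-replicated SNP. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are hypothetical and are not taken from the study, whose model and data are not reproduced here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons (ties count as half).

    labels: 1 if the SNP replicated, 0 otherwise.
    scores: predicted replication scores, where higher means more likely to replicate.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one replicated and one non-replicated SNP")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data for illustration only.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(toy_labels, toy_scores))
```

The pairwise form costs time proportional to the number of positive-negative pairs, so a rank-based implementation would be preferable for data on the scale described in the abstract.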
