Genetic polymorphisms associated with sleep-related phenotypes; relationships with individual nocturnal symptoms of insomnia in the HUNT study

Abstract Background: In recent years, several genome-wide association studies (GWAS) of sleep-related traits have identified a number of single nucleotide polymorphisms (SNPs), but their relationships with symptoms of insomnia are not known. The aim of this study was to investigate whether SNPs previously reported in association with sleep-related phenotypes are associated with individual symptoms of insomnia. Methods: We selected participants from the HUNT study (Norway) who reported at least one symptom of insomnia, consisting of sleep onset, maintenance or early morning awakening difficulties (cases, N = 2563), and compared them to participants who presented no symptoms at all (controls, N = 3665). Cases were further divided into seven subgroups according to different combinations of these three symptoms. We used multinomial logistic regressions to test the association between different patterns of symptoms and 59 SNPs identified in past GWAS. Results: Although 16 SNPs were significantly associated (p < 0.05) with at least one symptom subgroup, none of the investigated SNPs remained significant after correction for multiple testing using the false discovery rate (FDR) method. Conclusions: SNPs associated with sleep-related traits do not replicate on any pattern of insomnia symptoms after multiple testing correction. However, correction in this case may be overly conservative.
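
The abstract above does not say which FDR procedure was used. As an illustration only, the following minimal sketch assumes the common Benjamini-Hochberg step-up adjustment and shows how nominally significant p-values can lose significance after correction. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values: two are nominally significant at 0.05.
    raw = [0.03, 0.04, 0.20, 0.45, 0.60, 0.75, 0.80, 0.95]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.2f}  adjusted p = {q:.3f}")
```

In this toy input the two p-values below 0.05 both receive adjusted values above 0.05, which mirrors the pattern the abstract reports.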

Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions

10.1101/433367 ◽

2018 ◽

Cited By ~ 3

Author(s):

David M. Howard ◽

Mark J. Adams ◽

Toni-Kim Clarke ◽

Jonathan D. Hafferty ◽

Jude Gibson ◽

...

Keyword(s):

Multiple Testing ◽

Drug Repositioning ◽

Association Studies ◽

Meta Analysis ◽

Enrichment Analysis ◽

Brain Regions ◽

Genome Wide Association Studies ◽

Multiple Testing Correction ◽

Synaptic Structure ◽

Genome Wide

Abstract Major depression is a debilitating psychiatric illness that is typically associated with low mood, anhedonia and a range of comorbidities. Depression has a heritable component that has remained difficult to elucidate with current sample sizes due to the polygenic nature of the disorder. To maximise sample size, we meta-analysed data on 807,553 individuals (246,363 cases and 561,190 controls) from the three largest genome-wide association studies of depression. We identified 102 independent variants, 269 genes, and 15 gene-sets associated with depression, including both genes and gene-pathways associated with synaptic structure and neurotransmission. Further evidence of the importance of prefrontal brain regions in depression was provided by an enrichment analysis. In an independent replication sample of 1,306,354 individuals (414,055 cases and 892,299 controls), 87 of the 102 associated variants were significant following multiple testing correction. Based on the putative genes associated with depression this work also highlights several potential drug repositioning opportunities. These findings advance our understanding of the complex genetic architecture of depression and provide several future avenues for understanding aetiology and developing new treatment approaches.

An approach to gene-based testing accounting for dependence of tests among nearby genes

10.1101/2021.05.24.445494 ◽

2021 ◽

Author(s):

Ronald J Yurko ◽

Kathryn Roeder ◽

Bernie Devlin ◽

Max G'Sell

Keyword(s):

Multiple Testing ◽

Association Studies ◽

Autism Spectrum ◽

P Value ◽

Genome Wide Association Studies ◽

Strongly Correlated ◽

Test Statistics ◽

Test Statistic ◽

Genome Wide ◽

Insight Into

In genome-wide association studies (GWAS), it has become commonplace to test millions of SNPs for phenotypic association. Gene-based testing can improve power to detect weak signal by reducing multiple testing and pooling signal strength. While such tests account for linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, which is contrasted with more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive p-value thresholding (AdaPT), guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably, our workflow is based on summary statistics.

The harmonic mean p-value for combining dependent tests

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1814092116 ◽

2019 ◽

Vol 116 (4) ◽

pp. 1195-1200 ◽

Cited By ~ 43

Author(s):

Daniel J. Wilson

Keyword(s):

Multiple Testing ◽

Statistical Power ◽

Scientific Discovery ◽

Association Studies ◽

Harmonic Mean ◽

P Value ◽

Genome Wide Association Studies ◽

Familywise Error Rate ◽

Significance Threshold ◽

Genome Wide

Analysis of “big data” frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using the generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human–pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini–Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.
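
As a rough illustration of the statistic named in the abstract above, the sketch below computes a weighted harmonic mean of p-values. It covers only the raw statistic: the abstract does not give the calibration needed to turn this value into an exact test, so that step is not reproduced here. The function name `harmonic_mean_p` and the default of equal weights are assumptions made for this example.

```python
def harmonic_mean_p(pvalues, weights=None):
    """Weighted harmonic mean of p-values: sum(w) / sum(w / p).

    This is the raw statistic only; it is not a calibrated p-value.
    Equal weights are assumed when none are supplied.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0 / len(pvalues)] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


if __name__ == "__main__":
    # Hypothetical group of individually modest p-values.
    group = [0.02, 0.03, 0.04, 0.60]
    print(harmonic_mean_p(group))
```

Because the harmonic mean is dominated by the smallest values, a few small p-values pull the combined statistic down even when other tests in the group are clearly null.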

Assessment of Power and False Discovery Rate in Genome-Wide Association Studies using the BarleyCAP Germplasm

Crop Science ◽

10.2135/cropsci2010.02.0064 ◽

2011 ◽

Vol 51 (1) ◽

pp. 52-59 ◽

Cited By ~ 35

Author(s):

Peter Bradbury ◽

Thomas Parker ◽

Martha T. Hamblin ◽

Jean-Luc Jannink

Keyword(s):

False Discovery Rate ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

False Discovery ◽

Genome Wide

Multiple testing in genome-wide association studies via hidden Markov models

Bioinformatics ◽

10.1093/bioinformatics/btp476 ◽

2009 ◽

Vol 25 (21) ◽

pp. 2802-2808 ◽

Cited By ~ 31

Author(s):

Zhi Wei ◽

Wenguang Sun ◽

Kai Wang ◽

Hakon Hakonarson

Keyword(s):

Hidden Markov Models ◽

Multiple Testing ◽

Markov Models ◽

Hidden Markov ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Jointly determining significance levels of primary and replication studies by controlling the false discovery rate in two-stage genome-wide association studies

Statistical Methods in Medical Research ◽

10.1177/0962280216687168 ◽

2017 ◽

Vol 27 (9) ◽

pp. 2795-2808 ◽

Cited By ~ 1

Author(s):

Wei Jiang ◽

Weichuan Yu

Keyword(s):

False Discovery Rate ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Two Stage ◽

Traditional Methods ◽

Replication Studies ◽

False Discovery ◽

Genome Wide ◽

Significance Levels

In genome-wide association studies, we normally discover associations between genetic variants and diseases/traits in primary studies, and validate the findings in replication studies. We consider the associations identified in both primary and replication studies as true findings. An important question under this two-stage setting is how to determine significance levels in both studies. In traditional methods, significance levels of the primary and replication studies are determined separately. We argue that the separate determination strategy reduces the power in the overall two-stage study. Therefore, we propose a novel method to determine significance levels jointly. Our method is a reanalysis method that needs summary statistics from both studies. We find the most powerful significance levels when controlling the false discovery rate in the two-stage study. To enjoy the power improvement from the joint determination method, we need to select single nucleotide polymorphisms for replication at a less stringent significance level. This is a common practice in studies designed for discovery purposes. We suggest this practice is also suitable in studies designed for validation purposes, in order to identify more true findings. Simulation experiments show that our method can provide more power than traditional methods and that the false discovery rate is well-controlled. Empirical experiments on datasets of five diseases/traits demonstrate that our method can help identify more associations. The R-package is available at: http://bioinformatics.ust.hk/RFdr.html .

Mixture model-based association analysis with case-control data in genome wide association studies

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2016-0022 ◽

2017 ◽

Vol 16 (3) ◽

Author(s):

Fadhaa Ali ◽

Jian Zhang

Keyword(s):

Mixture Model ◽

Multiple Testing ◽

Hypothesis Test ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Model Based ◽

Genome Wide ◽

The Individual

Abstract Multilocus haplotype analysis of candidate variants with genome-wide association study (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet the challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated disease penetrances. A theoretical justification of the above model is provided. Furthermore, we introduce a hypothesis test for haplotype inheritance patterns which underpin this model. The performance of the proposed approach is evaluated by simulations and real data analysis. The results show that the proposed approach outperforms an existing multiple testing method.

Controlling the false discovery rate in GWAS with population structure

10.1101/2020.08.04.236703 ◽

2020 ◽

Author(s):

Matteo Sesia ◽

Stephen Bates ◽

Emmanuel Candès ◽

Jonathan Marchini ◽

Chiara Sabatti

Keyword(s):

Population Structure ◽

False Discovery Rate ◽

Markov Models ◽

State Of The Art ◽

Hidden Markov ◽

Association Studies ◽

Genome Wide Association Studies ◽

False Discovery ◽

Identical By Descent ◽

Genome Wide

Abstract This paper proposes a novel statistical method to address population structure in genome-wide association studies while controlling the false discovery rate, which overcomes some limitations of existing approaches. Our solution accounts for linkage disequilibrium and diverse ancestries by combining conditional testing via knockoffs with hidden Markov models from state-of-the-art phasing methods. Furthermore, we account for familial relatedness by describing the joint distribution of haplotypes sharing long identical-by-descent segments with a generalized hidden Markov model. Extensive simulations affirm the validity of this method, while applications to UK Biobank phenotypes yield many more discoveries compared to BOLT-LMM, most of which are confirmed by the Japan Biobank and FinnGen data.

Identification of novel non-synonymous variants associated with type 2 diabetes-related metabolites in Korean population

Bioscience Reports ◽

10.1042/bsr20190078 ◽

2019 ◽

Vol 39 (10) ◽

Author(s):

Tae-Joon Park ◽

Heun-Sik Lee ◽

Young Jin Kim ◽

Bong-Jo Kim

Keyword(s):

Type 2 Diabetes ◽

Genetic Variants ◽

Multiple Testing ◽

Association Studies ◽

Linear Regression Analysis ◽

Genetic Regulation ◽

Genome Wide Association Studies ◽

Functional Variants ◽

Genome Wide

Abstract Metabolome-genome wide association studies (mGWASs) are useful for understanding the genetic regulation of metabolites in complex diseases, including type 2 diabetes (T2D). Numerous genetic variants associated with T2D-related metabolites have been identified in previous mGWASs; however, these analyses seem to have difficulty in detecting the genetic variants with functional effects. An exome array focussed on potentially functional variants is an alternative platform to obtain insight into the genetics of biochemical conversion processes. In the present study, we performed an mGWAS using 27,140 non-synonymous variants included in the Illumina HumanExome BeadChip and nine T2D-related metabolites identified by a targeted metabolomics approach to evaluate 2,338 Korean individuals from the Korea Association REsource (KARE) cohort. A linear regression analysis controlling for age, sex, BMI, and T2D status as covariates was performed to identify novel non-synonymous variants associated with T2D-related metabolites. We found significant associations between glycine and CPS1 (rs1047883) and PC ae C36:0 and CYP4F2 (rs2108622) variants (P < 2.05 × 10⁻⁷, after the Bonferroni correction for multiple testing). One of the two significantly associated variants, rs1047883, was newly identified, whereas rs2108622 had been previously reported to be associated with T2D-related traits. These findings expand our understanding of the genetic determinants of T2D-related metabolites and provide a basis for further functional validation.
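
The abstract above does not spell out how the Bonferroni threshold was derived. One plausible reading, assumed here, is a familywise level of 0.05 divided by the number of variant-metabolite tests (27,140 variants × 9 metabolites). The short sketch below checks that this arithmetic reproduces the reported threshold; the function name `bonferroni_threshold` and the value of `alpha` are assumptions for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Counts taken from the abstract; alpha = 0.05 is an assumption.
n_variants = 27_140
n_metabolites = 9
threshold = bonferroni_threshold(0.05, n_variants * n_metabolites)

if __name__ == "__main__":
    print(f"{threshold:.3g}")  # prints 2.05e-07
```

Under this assumption the computed value matches the reported P < 2.05 × 10⁻⁷ to three significant figures.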

SEMI-PARAMETRIC COVARIATE-MODULATED LOCAL FALSE DISCOVERY RATE FOR GENOME-WIDE ASSOCIATION STUDIES

10.1101/183384 ◽

2017 ◽

Author(s):

Rong W. Zablocki ◽

Richard A. Levine ◽

Andrew J. Schork ◽

Shujing Xu ◽

Yunpeng Wang ◽

...

Keyword(s):

False Discovery Rate ◽

Complex Traits ◽

Association Studies ◽

Logistic Function ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Local False Discovery Rate ◽

B Spline ◽

False Discovery ◽

Genome Wide

While genome-wide association studies (GWAS) have discovered thousands of risk loci for heritable disorders, so far even very large meta-analyses have recovered only a fraction of the heritability of most complex traits. Recent work utilizing variance components models has demonstrated that a larger fraction of the heritability of complex phenotypes is captured by the additive effects of SNPs than is evident only in loci surpassing genome-wide significance thresholds, typically set at a Bonferroni-inspired p ≤ 5 × 10⁻⁸. Procedures that control false discovery rate can be more powerful, yet these are still under-powered to detect the majority of non-null effects from GWAS. The current work proposes a novel Bayesian semi-parametric two-group mixture model and develops a Markov Chain Monte Carlo (MCMC) algorithm for a covariate-modulated local false discovery rate (cmfdr). The probability of being non-null depends on a set of covariates via a logistic function, and the non-null distribution is approximated as a linear combination of B-spline densities, where the weight of each B-spline density depends on a multinomial function of the covariates. The proposed methods were motivated by work on a large meta-analysis of schizophrenia GWAS performed by the Psychiatric Genetics Consortium (PGC). We show that the new cmfdr model fits the PGC schizophrenia GWAS test statistics well, performing better than our previously proposed parametric gamma model for estimating the non-null density and substantially improving power over usual fdr. Using loci declared significant at cmfdr ≤ 0.20, we perform follow-up pathway analyses using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Homo sapiens pathways database. We demonstrate that the increased yield from the cmfdr model results in an improved ability to test for pathways associated with schizophrenia compared to using those SNPs selected according to usual fdr.
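
To make the two-group idea in the abstract above concrete, here is a heavily simplified sketch of a covariate-modulated local false discovery rate. It is not the authors' model: a wide zero-mean normal stands in for the B-spline non-null density, the parameters are fixed by hand rather than estimated by MCMC, and every name and default (`local_fdr`, `non_null_sd`, the coefficients) is hypothetical.

```python
import math


def normal_pdf(z, sd=1.0):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prob_non_null(covariates, coefficients, intercept):
    """Prior probability of being non-null, via a logistic function."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def local_fdr(z, covariates, coefficients, intercept, non_null_sd=3.0):
    """Posterior probability that a test statistic z comes from the null.

    Null density: standard normal. Non-null density: a wider normal,
    used here only as a stand-in for a more flexible density.
    """
    pi1 = prob_non_null(covariates, coefficients, intercept)
    null_part = (1.0 - pi1) * normal_pdf(z)
    non_null_part = pi1 * normal_pdf(z, non_null_sd)
    return null_part / (null_part + non_null_part)


if __name__ == "__main__":
    # Same z-score, two hypothetical covariate values (e.g. an annotation score).
    for x in (0.0, 2.0):
        print(x, local_fdr(3.5, [x], [1.0], -2.0))
```

With a positive coefficient, a larger covariate raises the prior probability of being non-null and so lowers the local fdr for the same test statistic, which is the mechanism by which covariates can increase yield at a fixed cutoff.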