LD Score Regression Distinguishes Confounding from Polygenicity in Genome-Wide Association Studies

2014 ◽  
Author(s):  
Brendan K. Bulik-Sullivan ◽  
Po-Ru Loh ◽  
Hilary Finucane ◽  
Stephan Ripke ◽  
Jian Yang ◽  
...  

Abstract Both polygenicity [1,2] (i.e. many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification [3], can yield inflated distributions of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from bias and true signal from polygenicity. We have developed an approach that quantifies the contributions of each by examining the relationship between test statistics and linkage disequilibrium (LD). We term this approach LD Score regression. LD Score regression provides an upper bound on the contribution of confounding bias to the observed inflation in test statistics and can be used to estimate a more powerful correction factor than genomic control [4–14]. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size.
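The relationship between test statistics and LD that the abstract describes can be sketched as a regression of per-SNP chi-square statistics on LD Scores. The following is a minimal illustration, assuming the commonly cited model E[chi2_j] = N * h2 * l_j / M + N * a + 1, in which the slope carries the polygenic signal and the intercept carries confounding. The simulated data, the parameter values, and the unweighted least-squares fit are all assumptions made for illustration; this is not the authors' implementation, which among other things uses a weighted regression.

```python
import numpy as np

def ld_score_regression(chi2, ld_scores):
    """Unweighted least-squares fit of chi2 ~ intercept + slope * ld_score."""
    X = np.column_stack([np.ones_like(ld_scores), ld_scores])
    (intercept, slope), *_ = np.linalg.lstsq(X, chi2, rcond=None)
    return intercept, slope

# Hypothetical simulation settings (not taken from the paper).
rng = np.random.default_rng(0)
M = 200_000   # number of SNPs
N = 50_000    # GWAS sample size
h2 = 0.4      # heritability, which drives the slope
a = 4e-6      # confounding term, so N * a = 0.2 is added to the intercept

# Simulated LD Scores with mean 100.
ld_scores = rng.gamma(shape=2.0, scale=50.0, size=M)

# Expected chi-square per SNP under the assumed model.
expected = N * h2 * ld_scores / M + N * a + 1.0

# Scaled 1-df chi-square draws whose mean equals `expected`.
chi2 = expected * rng.chisquare(df=1, size=M)

intercept, slope = ld_score_regression(chi2, ld_scores)
h2_est = slope * M / N   # invert slope = N * h2 / M

print(f"intercept ~ {intercept:.2f} (simulated 1 + N*a = {1 + N * a:.2f})")
print(f"h2 estimate ~ {h2_est:.2f} (simulated h2 = {h2})")
```

In this toy setting an intercept above 1 reflects the simulated confounding, while the slope grows with heritability and sample size, which is the distinction the abstract draws between bias and polygenic signal.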

2021 ◽  
Author(s):  
Ronald J Yurko ◽  
Kathryn Roeder ◽  
Bernie Devlin ◽  
Max G'Sell

In genome-wide association studies (GWAS), it has become commonplace to test millions of SNPs for phenotypic association. Gene-based testing can improve power to detect weak signal by reducing multiple testing and pooling signal strength. While such tests account for the linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, which is contrasted with more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive p-value thresholding (AdaPT), guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably, our workflow is based on summary statistics.
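The agglomeration rule described above (test a gene alone when its statistic is independent, pool SNPs into a locus when nearby genes are strongly correlated) can be sketched as a single pass over genes in genomic order. This is only an illustrative sketch: the function name, the input layout, the use of one correlation value per adjacent gene pair, and the 0.5 threshold are all hypothetical and are not taken from the authors' algorithm.

```python
def agglomerate_genes(genes, adjacent_corr, threshold=0.5):
    """Merge runs of adjacent genes whose test statistics are strongly correlated.

    genes: list of (gene_name, set_of_snp_ids), ordered by genomic position.
    adjacent_corr: adjacent_corr[i] is the assumed correlation between the
        test statistics of genes[i] and genes[i + 1].
    Returns a list of loci, each a (list_of_gene_names, set_of_snp_ids) pair.
    """
    if not genes:
        return []
    first_name, first_snps = genes[0]
    loci = [([first_name], set(first_snps))]
    for i in range(1, len(genes)):
        name, snps = genes[i]
        if abs(adjacent_corr[i - 1]) >= threshold:
            # Strong correlation with the previous gene: pool SNPs into its locus.
            loci[-1][0].append(name)
            loci[-1][1].update(snps)
        else:
            # Weak correlation: start a new locus for this gene.
            loci.append(([name], set(snps)))
    return loci

# Hypothetical genes and SNP identifiers.
genes = [
    ("GENE_A", {"rs1", "rs2"}),
    ("GENE_B", {"rs2", "rs3"}),
    ("GENE_C", {"rs7"}),
]
loci = agglomerate_genes(genes, adjacent_corr=[0.8, 0.1])
for names, snps in loci:
    print(names, sorted(snps))
```

Here GENE_A and GENE_B are pooled into one locus and GENE_C is kept as a separate test; each resulting SNP set would then be passed to whatever gene-based or locus-based test is in use.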


2021 ◽  
Vol 22 (6) ◽  
pp. 365-373
Author(s):  
Sofia Coelho Abreu ◽  
Valéria Tavares ◽  
Filipa Carneiro ◽  
Rui Medeiros

Aim & methods: To review the existing literature on the relationship between venous thromboembolism (VTE) and prostate cancer (PC), and to explore the putative biological and clinical implications of VTE genetic markers in PC patients, by screening the PubMed database. Results: Given the roles that VTE genetic determinants identified by genome-wide association studies play in disease development in the general population, these variants might also underlie susceptibility to PC-related VTE. They could therefore help to identify patients with a positive benefit-to-harm ratio for thromboprophylaxis during cancer therapy management, thereby improving patients' prognosis. Conclusion: Future studies are needed to explore the relationship between VTE and PC and to dissect the predictive value of VTE genetic determinants identified by genome-wide association studies in PC patients, given their clinical implications.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Szymon Zmorzyński ◽  
Wojciech Styk ◽  
Waldemar Klinkosz ◽  
Justyna Iskra ◽  
Agata Anna Filip

Abstract Background: The most popular tool for measuring personality traits is the Five-Factor Model (FFM), which comprises neuroticism, extraversion, openness, agreeableness and conscientiousness. Many studies have indicated an association between genes encoding neurotransmitter receptors/transporters and personality traits. A relationship between polymorphic DNA sequences and FFM features has been described for genes encoding receptors of the cannabinoid and dopaminergic systems. Moreover, the dopaminergic system receives inputs from other neurotransmitter systems, such as the GABAergic and serotoninergic systems. Methods: We searched the PubMed Central (PMC), Science Direct, Scopus, Cochrane Library, Web of Science and EBSCO databases from their inception to November 19, 2020, to identify original studies, as well as peer-reviewed studies, examining the FFM and its association with gene polymorphisms affecting neurotransmitter function in the central nervous system. Results: Serotonin neurons modulate dopamine function. A polymorphism in SLC6A4, the gene encoding the serotonin transporter protein, was correlated with openness to experience (in a Swedish population) and with high neuroticism scores and low agreeableness (in a Caucasian population). Genome-wide association studies (GWASs) found associations of the 5q34-q35, 3p24 and 3q13 regions with higher scores of neuroticism, extraversion and agreeableness. However, as shown in our review, the results for the chromosome 3 regions are inconsistent. Conclusions: GWASs of polymorphisms are ongoing, with the aim of determining and better understanding the relationship between changes in DNA and personality traits.


2018 ◽  
Author(s):  
Armin Pourshafeie ◽  
Carlos D. Bustamante ◽  
Snehit Prabhu

Abstract Genome-wide association studies have been effective at revealing the genetic architecture of simple traits. Extending this approach to more complex phenotypes has necessitated a massive increase in cohort size. To achieve sufficient power, participants are recruited across multiple collaborating institutions, leaving researchers with two choices: either collect all the raw data at a single institution or rely on meta-analyses to test for association. In this work, we present a third alternative. Here, we implement an entire GWAS workflow (quality control, population structure control, and association) in a fully decentralized setting. Our iterative approach (a) does not rely on consolidating the raw data at a single coordination center, and (b) does not hinge upon large sample size assumptions at each silo. As we show, our approach overcomes challenges faced by meta-studies when it comes to associating rare alleles and when case/control proportions are wildly imbalanced at each silo. We demonstrate the feasibility of our method in cohorts ranging in size from 2K (small) to 500K (large), and recruited across 2 to 10 collaborating institutions.
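One way to picture an iterative, decentralized association test is a Newton-Raphson logistic regression in which each silo shares only a gradient and a Hessian per iteration. The sketch below assumes that scheme for a single SNP; it is a generic illustration and not the authors' protocol, and the function names, silo sizes, allele frequency and effect sizes are all hypothetical. It does not cover quality control, population structure control, or any privacy protection.

```python
import numpy as np

def local_gradient_hessian(X, y, beta):
    """Runs inside one silo; only these two aggregates leave the silo."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = (X * (p * (1.0 - p))[:, None]).T @ X
    return grad, hess

def federated_logistic_fit(silos, n_features, n_iter=25):
    """Coordinator: sum per-silo aggregates and take Newton steps."""
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        grads, hessians = zip(
            *(local_gradient_hessian(X, y, beta) for X, y in silos)
        )
        beta = beta + np.linalg.solve(sum(hessians), sum(grads))
    return beta

# Hypothetical simulation: intercept plus one SNP coded 0/1/2.
rng = np.random.default_rng(42)
true_beta = np.array([-1.0, 0.5])

def simulate_silo(n):
    genotype = rng.binomial(2, 0.3, size=n).astype(float)
    X = np.column_stack([np.ones(n), genotype])
    prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
    y = rng.binomial(1, prob).astype(float)
    return X, y

silos = [simulate_silo(n) for n in (500, 1500, 3000)]
beta_federated = federated_logistic_fit(silos, n_features=2)

# Reference fit on pooled raw data, which the decentralized setting avoids.
X_all = np.vstack([X for X, _ in silos])
y_all = np.concatenate([y for _, y in silos])
beta_pooled = federated_logistic_fit([(X_all, y_all)], n_features=2)

print(beta_federated, beta_pooled)
```

Because the summed gradients and Hessians equal those of the pooled data, the decentralized fit matches the pooled fit up to floating-point error, without assuming a large sample at any individual silo.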


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Yan Xu ◽  
Li Xing ◽  
Jessica Su ◽  
Xuekui Zhang ◽  
Weiliang Qiu

Abstract Genome-wide association studies (GWASs) aim to detect genetic risk factors for complex human diseases by identifying disease-associated single-nucleotide polymorphisms (SNPs). The traditional SNP-wise approach with multiple testing adjustment is over-conservative and lacks power in many GWASs. In this article, we propose a model-based clustering method that transforms the challenging high-dimension-small-sample-size problem into a low-dimension-large-sample-size problem and borrows information across SNPs by grouping SNPs into three clusters. We pre-specify the patterns of the clusters by the minor allele frequencies of SNPs in cases and controls, and enforce the patterns with prior distributions. In simulation studies, our proposed model outperforms the traditional SNP-wise approach, showing better control of the false discovery rate (FDR) and higher sensitivity. We re-analyzed two real studies to identify SNPs associated with severe bortezomib-induced peripheral neuropathy (BiPN) in patients with multiple myeloma (MM). The original analysis in the literature failed to identify SNPs after FDR adjustment. Our proposed method not only detected the reported SNPs after FDR adjustment but also discovered a novel BiPN-associated SNP, rs4351714, that has been reported to be related to MM in another study.
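For context, the SNP-wise baseline with FDR adjustment that the abstract compares against can be sketched as follows. The abstract does not say which FDR procedure was used; this sketch assumes the Benjamini-Hochberg step-up procedure, and the function name and p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Step-up thresholds alpha * k / m for the k-th smallest p-value.
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject everything up to the largest rank that meets its threshold.
        k = np.max(np.nonzero(passed)[0])
        rejected[order[: k + 1]] = True
    return rejected

# Hypothetical per-SNP p-values, in arbitrary order.
snp_pvalues = [0.039, 0.001, 0.216, 0.008]
mask = benjamini_hochberg(snp_pvalues, alpha=0.05)
print(mask)
```

With millions of SNPs and small samples, few p-values clear these thresholds, which is the loss of power that motivates borrowing information across SNPs.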


2020 ◽  
Vol 36 (14) ◽  
pp. 4154-4162
Author(s):  
Meiyue Wang ◽  
Ruidong Li ◽  
Shizhong Xu

Abstract Motivation: Genome-wide association studies (GWAS) are still the primary step toward gene discovery. Their importance is even more obvious in the big data era, when GWAS are conducted simultaneously for thousands of traits, e.g. transcriptomic and metabolomic traits. Efficient mixed model association (EMMA) and genome-wide efficient mixed model association (GEMMA) are widely used methods for GWAS, but an algorithm with high computational efficiency is badly needed. It is interesting to note that the test statistics of ordinary ridge regression (ORR) show the same patterns across the genome as those obtained from the EMMA method. However, ORR has never been used for GWAS because of its severe shrinkage of the estimated effects and the test statistics. Results: We introduce a degree of freedom for each marker effect obtained from ORR and use it to deshrink both the estimated effect and the standard error, so that the Wald test of ORR is brought back to the same level as that of EMMA. The new method is called deshrinking ridge regression (DRR). By evaluating the methods under three different model sizes (small, medium and large), we demonstrate that DRR generalizes to all model sizes, whereas EMMA only works for medium and large models. Furthermore, DRR detects all markers simultaneously instead of scanning one marker at a time. As a result, the computational time complexity of DRR is much lower than that of EMMA, and about m (number of genetic variants) times lower than that of GEMMA, when the sample size is much smaller than the number of markers. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
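The shrinkage problem that motivates DRR can be illustrated with the closed-form ridge estimator, which fits all marker effects at once. The sketch below shows only the shrinkage of ordinary ridge regression relative to an unpenalized fit; it does not implement the per-marker degree of freedom or the deshrinking step, whose exact form is not given in the abstract. The simulated markers, effect sizes and penalty value are hypothetical, and the example uses more samples than markers so that the unpenalized fit exists.

```python
import numpy as np

def ridge_effects(X, y, lam):
    """Closed-form ridge estimate of all marker effects simultaneously."""
    n_markers = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_markers), X.T @ y)

# Hypothetical standardized marker matrix and phenotype.
rng = np.random.default_rng(1)
n_samples, n_markers = 200, 20
X = rng.standard_normal((n_samples, n_markers))
true_beta = np.zeros(n_markers)
true_beta[:3] = [1.0, -0.5, 0.8]
y = X @ true_beta + rng.standard_normal(n_samples)

beta_ols = ridge_effects(X, y, lam=0.0)      # no penalty: ordinary least squares
beta_ridge = ridge_effects(X, y, lam=500.0)  # heavy penalty: shrunken effects

# Ratio below 1 shows how far the ridge effects are pulled toward zero.
shrinkage = np.linalg.norm(beta_ridge) / np.linalg.norm(beta_ols)
print(f"ridge / OLS effect norm: {shrinkage:.2f}")
```

Effects pulled this far toward zero produce correspondingly deflated test statistics, which is why the abstract describes a deshrinking correction before a Wald test is applied.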

