A response to Yurko et al: H-MAGMA, inheriting a shaky statistical foundation, yields excess false positives

2020 ◽  
Author(s):  
Christiaan de Leeuw ◽  
Nancy Y. A. Sey ◽  
Danielle Posthuma ◽  
Hyejung Won

Abstract: Hi-C coupled multimarker analysis of genomic annotation (H-MAGMA) was initially developed to advance MAGMA by assigning non-coding SNPs to their cognate genes based on three-dimensional chromatin architecture. Yurko and colleagues raised concerns that the SNP-wise mean gene-analysis model of MAGMA may allow inflation in type I errors. Accordingly, we updated MAGMA and found that the updated version (MAGMA v.1.08) effectively controls for error rate inflation. Intrigued by this result, we also updated H-MAGMA by implementing MAGMA v.1.08. As expected, H-MAGMA v.1.08 detected a smaller set of risk genes than its original version (v.1.07), but the overall statistical architecture remained largely unchanged between v.1.07 and v.1.08. We then applied H-MAGMA v.1.08 to genome-wide association studies (GWAS) of five psychiatric disorders and recapitulated our previous findings that psychiatric disorder risk genes display neuronal and prenatal enrichment. Therefore, the issues raised by Yurko and colleagues can be overcome by using (H-)MAGMA v.1.08.

Author(s):  
Greg Dyson ◽  
Charles F. Sing

Abstract: We have developed a modified Patient Rule-Induction Method (PRIM) as an alternative strategy for analyzing representative samples of non-experimental human data to estimate and test the role of genomic variations as predictors of disease risk in etiologically heterogeneous sub-samples. A computational limit of the proposed strategy is encountered when the number of genomic variations (predictor variables) under study is large (>500) because permutations are used to generate a null distribution to test the significance of a term (defined by values of particular variables) that characterizes a sub-sample of individuals through the peeling and pasting processes. As an alternative, in this paper we introduce a theoretical strategy that facilitates the quick calculation of Type I and Type II errors in the evaluation of terms in the peeling and pasting processes of a PRIM analysis; with a permutation-based hypothesis test, these errors are under-estimated and non-existent, respectively. The resultant savings in computational time make possible the consideration of larger numbers of genomic variations (an example genome-wide association study is given) in the selection of statistically significant terms in the formulation of PRIM prediction models.
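To make the computational limit mentioned in this abstract concrete, the following is a minimal sketch of a generic permutation test for a single term. It is not the authors' PRIM implementation. The function name `permutation_p_value`, the binary outcome coding, and the choice of test statistic (mean outcome inside the sub-sample) are assumptions made for illustration only.

```python
import random


def permutation_p_value(outcomes, in_subsample, n_permutations=1000, seed=0):
    """Estimate a one-sided permutation p-value for one term (illustrative only).

    outcomes:      list of 0/1 disease indicators, one per individual
    in_subsample:  list of booleans, True when the individual satisfies the term
    The statistic (mean outcome inside the sub-sample) is an assumption of this
    sketch, not a detail taken from the abstract.
    """
    rng = random.Random(seed)
    size = sum(in_subsample)
    if size == 0:
        raise ValueError("the term selects no individuals")

    def subsample_mean(values):
        # Mean outcome among individuals that satisfy the term.
        return sum(v for v, inside in zip(values, in_subsample) if inside) / size

    observed = subsample_mean(outcomes)

    shuffled = list(outcomes)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling outcomes breaks any link between the term and disease status,
        # which generates one draw from the null distribution.
        rng.shuffle(shuffled)
        if subsample_mean(shuffled) >= observed:
            at_least_as_extreme += 1

    # The add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

Each candidate term needs its own loop of `n_permutations` reshuffles, so the total cost grows with the number of terms evaluated during peeling and pasting. This is consistent with the abstract's remark that the approach becomes limiting when many genomic variations are under study.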


2020 ◽  
Author(s):  
Janet C. Harwood ◽  
Ganna Leonenko ◽  
Rebecca Sims ◽  
Valentina Escott-Price ◽  
Julie Williams ◽  
...  

Abstract: More than 50 genetic loci have been identified as being associated with Alzheimer’s disease (AD) from genome-wide association studies (GWAS), and many of these are involved in immune pathways and lipid metabolism. Therefore, we performed a transcriptome-wide association study (TWAS) of immune-relevant cells to study the mis-regulation of genes implicated in AD. We used expression and genetic data from naive and induced CD14+ monocytes and two GWAS of AD to study genetically controlled gene expression in monocytes at different stages of differentiation, and compared the results with those from TWAS of brain and blood. We identified nine genes with statistically independent TWAS signals: seven are known AD risk genes from GWAS (BIN1, PTK2B, SPI1, MS4A4A, MS4A6E, APOE and PVR), and two, LACTB2 and PLIN2/ADRP, are novel candidate genes for AD. Three genes, SPI1, PLIN2 and LACTB2, are TWAS significant specifically in monocytes. LACTB2 is a mitochondrial endoribonuclease, and PLIN2/ADRP associates with intracellular neutral lipid storage droplets (LSDs), which have been shown to play a role in the regulation of the immune response. Notably, LACTB2 and PLIN2 were not detected from GWAS alone.


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Baolin Wu ◽  
James S. Pankow

Multiple correlated traits are often collected in genetic studies. By jointly analyzing multiple traits, we can increase power by aggregating multiple weak effects and reveal additional insights into the genetic architecture of complex human diseases. In this article, we propose a multivariate linear regression-based method to test the joint association of multiple quantitative traits. It flexibly accommodates any covariates, has very accurate control of type I errors, and offers very competitive performance. We also discuss fast and accurate p-value computation, especially for genome-wide association studies with small-to-medium sample sizes. We demonstrate through extensive numerical studies that the proposed method has competitive performance. Its usefulness is further illustrated with an application to genome-wide association analysis of diabetes-related traits in the Atherosclerosis Risk in Communities (ARIC) study. We found some very interesting associations with diabetes traits which have not been reported before. We implemented the proposed methods in a publicly available R package.


2018 ◽  
Vol 43 (1) ◽  
pp. 102-111 ◽  
Author(s):  
Jeremy A. Sabourin ◽  
Cheryl D. Cropp ◽  
Heejong Sung ◽  
Lawrence C. Brody ◽  
Joan E. Bailey-Wilson ◽  
...  

BMC Genetics ◽  
2005 ◽  
Vol 6 (Suppl 1) ◽  
pp. S134 ◽  
Author(s):  
Qiong Yang ◽  
Jing Cui ◽  
Irmarie Chazaro ◽  
L Adrienne Cupples ◽  
Serkalem Demissie

Genes ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 736
Author(s):  
Xiaotian Dai ◽  
Guifang Fu ◽  
Shaofei Zhao ◽  
Yifei Zeng

Although imbalance between case and control groups is prevalent in genome-wide association studies (GWAS), it is often overlooked. This imbalance is becoming more significant and urgent as the rapid growth of biobanks and electronic health records has enabled the collection of thousands of phenotypes from large cohorts, in particular for diseases with low prevalence. Unbalanced binary traits pose serious challenges to traditional statistical methods in terms of both genomic selection and disease prediction. For example, the well-established linear mixed models (LMM) yield inflated type I error rates in the presence of unbalanced case-control ratios. In this article, we review multiple statistical approaches that have been developed to overcome the inaccuracy caused by the unbalanced case-control ratio, and comment on the advantages and limitations of each approach. We also explore the potential for applying several powerful and popular state-of-the-art machine-learning approaches that have not yet been applied to the GWAS field. This review paves the way for better analysis and understanding of unbalanced case-control disease data in GWAS.


2020 ◽  
Author(s):  
Anyi Yang ◽  
Jingqi Chen ◽  
Xing-Ming Zhao

Abstract: Motivation: Annotating genetic variants from summary statistics of genome-wide association studies (GWAS) is crucial for predicting risk genes of various disorders. The multi-marker analysis of genomic annotation (MAGMA) is one of the most popular tools for this purpose; it aggregates signals of single nucleotide polymorphisms (SNPs) to their nearby genes. However, SNPs may also affect genes at a distance, and such effects are missed by MAGMA. Although different upgrades of MAGMA have been proposed to extend gene-wise variant annotations with more information (e.g. Hi-C or eQTL), the regulatory relationships among genes and the tissue-specificity of signals have not been taken into account. Results: We propose a new approach, namely network-enhanced MAGMA (nMAGMA), for gene-wise annotation of variants from GWAS summary statistics. Compared with MAGMA and H-MAGMA, nMAGMA significantly extends the lists of genes that can be annotated to SNPs by integrating local signals, long-range regulation signals, and tissue-specific gene networks. When applied to schizophrenia, nMAGMA detects more risk genes (217% more than MAGMA and 57% more than H-MAGMA) that are reasonably involved in schizophrenia. Some disease-related functions (e.g. the ATPase pathway in cortex tissues) are also uncovered by nMAGMA but not by MAGMA or H-MAGMA. Moreover, nMAGMA provides tissue-specific risk signals, which are useful for understanding disorders with multi-tissue origins.

