A large-scale genome-wide enrichment analysis identifies new trait-associated genes, pathways and tissues across 31 human phenotypes

2017 ◽  
Author(s):  
Xiang Zhu ◽  
Matthew Stephens

Genome-wide association studies (GWAS) aim to identify genetic factors that are associated with complex traits. Standard analyses test individual genetic variants, one at a time, for association with a trait. However, variant-level associations are hard to identify (because of small effects) and can be difficult to interpret biologically. “Enrichment analyses” help address both these problems by focusing on sets of biologically-related variants. Here we introduce a new model-based enrichment analysis method that requires only GWAS summary statistics, and has several advantages over existing methods. Applying this method to interrogate 3,913 biological pathways and 113 tissue-based gene sets in 31 human phenotypes identifies many previously-unreported enrichments. These include enrichments of the endochondral ossification pathway for adult height, the NFAT-dependent transcription pathway for rheumatoid arthritis, brain-related genes for coronary artery disease, and liver-related genes for late-onset Alzheimer’s disease. A key feature of our method is that inferred enrichments automatically help identify new trait-associated genes. For example, accounting for enrichment in lipid transport genes yields strong evidence for association between MTTP and low-density lipoprotein levels, whereas conventional analyses of the same data found no significant variants near this gene.


2018 ◽  
Author(s):  
Doug Speed ◽  
David J Balding

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.



2019 ◽  
Author(s):  
Roman Teo Oliynyk

For more than a decade, genome-wide association studies have been making steady progress in discovering the causal gene variants that contribute to late-onset human diseases. Polygenic late-onset diseases in an aging population display a risk allele frequency decrease at older ages, caused by individuals with higher polygenic risk scores becoming ill proportionately earlier and bringing about a change in the distribution of risk alleles between new cases and the as-yet-unaffected population. This phenomenon is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes, while for late-onset diseases with relatively lower prevalence and heritability, exemplified by cancers, the effect is significantly lower. Computer simulations have determined that genome-wide association studies of the late-onset polygenic diseases showing high cumulative incidence together with high initial heritability will benefit from using the youngest possible age-matched cohorts. Moreover, rather than using age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.



2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Guoyi Yang ◽  
C. Mary Schooling

Statins have been suggested as a potential treatment for immune-related diseases. Conversely, statins might trigger auto-immune conditions. To clarify the role of statins in allergic diseases and auto-immune diseases, we conducted a Mendelian randomization (MR) study. Using established genetic instruments to mimic statins via 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMGCR) inhibition, we assessed the effects of statins on asthma, eczema, allergic rhinitis, rheumatoid arthritis (RA), psoriasis, type 1 diabetes, systemic lupus erythematosus (SLE), multiple sclerosis (MS), Crohn’s disease and ulcerative colitis in the largest available genome-wide association studies (GWAS). Genetically mimicked effects of statins via HMGCR inhibition were not associated with any immune-related diseases in either study after correcting for multiple testing; however, they were positively associated with the risk of asthma in East Asians (odds ratio (OR) 2.05 per standard deviation (SD) decrease in low-density lipoprotein cholesterol (LDL-C), 95% confidence interval (CI) 1.20 to 3.52, p value 0.009). These associations did not differ by sex and were robust to sensitivity analysis. These findings suggested that genetically mimicked effects of statins via HMGCR inhibition have little effect on allergic diseases or auto-immune diseases. However, we cannot exclude the possibility that genetically mimicked effects of statins via HMGCR inhibition might increase the risk of asthma in East Asians.
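As a rough consistency check on the asthma estimate quoted above, the standard error and two-sided p value can be recovered from the reported odds ratio and 95% confidence interval. The sketch below assumes a normal approximation on the log-odds scale; that assumption and the function name `p_from_or_ci` are illustrative and are not taken from the study itself.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z and a two-sided p value from an OR and its 95% CI.

    Assumes the estimate is normally distributed on the log-odds scale
    (an assumption of this sketch, not a statement about the study).
    """
    log_or = math.log(odds_ratio)
    # The CI spans +/- z_crit standard errors around the log odds ratio.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract: OR 2.05, 95% CI 1.20 to 3.52.
se, z, p = p_from_or_ci(2.05, 1.20, 3.52)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this approximation the recovered p value is close to the reported 0.009, so the quoted odds ratio, interval and p value are mutually consistent.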



2020 ◽  
Author(s):  
Min Zhao ◽  
Hong Qu

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar .



2019 ◽  
Author(s):  
Hiroshi Matsunaga ◽  
Kaoru Ito ◽  
Masato Akiyama ◽  
Atsushi Takahashi ◽  
Satoshi Koyama ◽  
...  

Background: Genome-wide association studies (GWAS) provided many biological insights into coronary artery disease (CAD), but these studies were mainly performed in Europeans. GWAS in diverse populations have the potential to advance our understanding of CAD. Methods and Results: We conducted two GWAS for CAD in the Japanese population, which included 12,494 cases and 28,879 controls, and 2,808 cases and 7,261 controls, respectively. Then, we performed transethnic meta-analysis using the results of the CARDIoGRAMplusC4D 1000 Genomes meta-analysis with UK Biobank. We identified 3 new loci on chromosome 1q21 (CTSS), 10q26 (WDR11-FGFR2), and 11q22 (RDX-FDX1). Quantitative trait locus analyses suggested the association of CTSS and RDX-FDX1 with atherosclerotic immune cells. Tissue/cell type enrichment analysis showed the involvement of arteries, adrenal glands and fat tissues in the development of CAD. Finally, we performed tissue/cell type enrichment analysis using East Asian-frequent and European-frequent variants according to the risk allele frequencies, and identified significant enrichment of adrenal glands in the East Asian-frequent group while the enrichment of arteries and fat tissues was found in the European-frequent group. These findings indicate biological differences in CAD susceptibility between Japanese and Europeans. Conclusions: We identified 3 new loci for CAD and highlighted the genetic differences between the Japanese and European populations. Moreover, our transethnic analyses showed both shared and unique genetic architectures between the Japanese and Europeans. While most of the underlying genetic bases for CAD are shared, further analyses in diverse populations will be needed to elucidate variations fully.



2018 ◽  
Vol 3 ◽  
pp. 114 ◽  
Author(s):  
Thomas Battram ◽  
Luke Hoskins ◽  
David A. Hughes ◽  
Johannes Kettunen ◽  
Susan M. Ring ◽  
...  

Background: Genome-wide association studies have identified genetic variants associated with coronary artery disease (CAD) in adults – the leading cause of death worldwide. It often occurs later in life, but variants may impact CAD-relevant phenotypes early and throughout the life-course. Cohorts with longitudinal and genetic data on thousands of individuals are letting us explore the antecedents of this adult disease. Methods: 149 metabolites, with a focus on the lipidome, measured using nuclear magnetic resonance (1H-NMR) spectroscopy, and genotype data were available from 5,905 individuals at ages 7, 15, and 17 years from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Linear regression was used to assess the association between the metabolites and an adult-derived genetic risk score (GRS) of CAD comprising 146 variants. Individual variant-metabolite associations were also examined. Results: The CAD-GRS associated with 118 of 149 metabolites (false discovery rate [FDR] < 0.05), the strongest associations being with low-density lipoprotein (LDL) and atherogenic non-LDL subgroups. Nine of 146 variants in the GRS associated with one or more metabolites (FDR < 0.05). Seven of these are within lipid loci: rs11591147 PCSK9, rs12149545 HERPUD1-CETP, rs17091891 LPL, rs515135 APOB, rs602633 CELSR2-PSRC1, rs651821 APOA5, rs7412 APOE-APOC1. All associated with metabolites in the LDL or atherogenic non-LDL subgroups or both including aggregate cholesterol measures. The other two variants identified were rs112635299 SERPINA1 and rs2519093 ABO. Conclusions: Genetic variants that influence CAD risk in adults are associated with large perturbations in metabolite levels in individuals as young as seven. The variants identified are mostly within lipid-related loci and the metabolites they associated with are primarily linked to lipoproteins. 
This knowledge could allow for preventative measures, such as increased monitoring of at-risk individuals and perhaps treatment earlier in life, to be taken years before any symptoms of the disease arise.
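The abstract above reports associations at a false discovery rate below 0.05 without naming the procedure used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, since the abstract does not specify it), the thresholding works as follows; the function name and the toy p values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is declared significant.

    Benjamini-Hochberg step-up: find the largest rank k such that
    p_(k) <= (k / m) * alpha, then reject the k smallest p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Toy p values (hypothetical, not from the study).
toy_p = [0.001, 0.008, 0.039, 0.035, 0.20]
flags = benjamini_hochberg(toy_p)
print(flags)
```

In this toy input, 0.035 exceeds its own rank-3 threshold of 0.03 but is still rejected because the rank-4 value 0.039 falls under its threshold of 0.04; this is the step-up behaviour that distinguishes the procedure from a per-test cutoff.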



2019 ◽  
Vol 3 ◽  
pp. 114 ◽  
Author(s):  
Thomas Battram ◽  
Luke Hoskins ◽  
David A. Hughes ◽  
Johannes Kettunen ◽  
Susan M. Ring ◽  
...  

Background: Genome-wide association studies have identified genetic variants associated with coronary artery disease (CAD) in adults – the leading cause of death worldwide. It often occurs later in life, but variants may impact CAD-relevant phenotypes early and throughout the life-course. Cohorts with longitudinal and genetic data on thousands of individuals are letting us explore the antecedents of this adult disease. Methods: 148 metabolites, with a focus on the lipidome, measured using nuclear magnetic resonance (1H-NMR) spectroscopy, and genotype data were available from 5,907 individuals at ages 7, 15, and 17 years from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Linear regression was used to assess the association between the metabolites and an adult-derived genetic risk score (GRS) of CAD comprising 146 variants. Individual variant-metabolite associations were also examined. Results: The CAD-GRS associated with 118 of 148 metabolites (false discovery rate [FDR] < 0.05), the strongest associations being with low-density lipoprotein (LDL) and atherogenic non-LDL subgroups. Nine of 146 variants in the GRS associated with one or more metabolites (FDR < 0.05). Seven of these are within lipid loci: rs11591147 PCSK9, rs12149545 HERPUD1-CETP, rs17091891 LPL, rs515135 APOB, rs602633 CELSR2-PSRC1, rs651821 APOA5, rs7412 APOE-APOC1. All associated with metabolites in the LDL or atherogenic non-LDL subgroups or both including aggregate cholesterol measures. The other two variants identified were rs112635299 SERPINA1 and rs2519093 ABO. Conclusions: Genetic variants that influence CAD risk in adults are associated with large perturbations in metabolite levels in individuals as young as seven. The variants identified are mostly within lipid-related loci and the metabolites they associated with are primarily linked to lipoproteins. 
Along with further research, this knowledge could allow for preventative measures, such as increased monitoring of at-risk individuals and perhaps treatment earlier in life, to be taken years before any symptoms of the disease arise.



2019 ◽  
Vol 9 (3) ◽  
pp. 38 ◽  
Author(s):  
Roman Teo Oliynyk

For more than a decade, genome-wide association studies have been making steady progress in discovering the causal gene variants that contribute to late-onset human diseases. Polygenic late-onset diseases in an aging population display a risk allele frequency decrease at older ages, caused by individuals with higher polygenic risk scores becoming ill proportionately earlier and bringing about a change in the distribution of risk alleles between new cases and the as-yet-unaffected population. This phenomenon is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes, while for late-onset diseases with relatively lower prevalence and heritability, exemplified by cancers, the effect is significantly lower. In this research, computer simulations have demonstrated that genome-wide association studies of late-onset polygenic diseases showing high cumulative incidence together with high initial heritability will benefit from using the youngest possible age-matched cohorts. Moreover, rather than using age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.



2020 ◽  
pp. 1-6
Author(s):  
Michalopoulou Helena ◽  
Ligga Georgia ◽  

Hypertension (HTN) is one of the major risk factors for almost all cardiovascular diseases including coronary artery disease, stroke, heart failure and renal failure. Nonetheless, blood pressure (BP) regulation is insufficient due to its multifactorial nature involving interactions among genetic, environmental, mechanistic and neuroendocrine factors. Essential hypertension is the most frequent diagnosis, indicating that a monocausal etiology has not been identified. The identification of causal genetic determinants has been unfulfilling. Analyses of rare monogenic syndromes of HTN focusing on renal sodium handling and steroid hormone metabolism have established a well-defined genetic framework for hypertension, though they do not affect the normal distribution of BP in the general population. Genome-wide association studies (GWAS) have revealed genetic variants that are associated with BP with small effect sizes, which cumulatively explain only a very small part of the variability of BP. New large-scale studies in the genomic arena will clarify the polygenic determinants of BP and open a perspective on translating progress in BP genetics into clinical use.



2018 ◽  
Author(s):  
Karl A. G. Kremling ◽  
Christine H. Diepenbrock ◽  
Michael A. Gore ◽  
Edward S. Buckler ◽  
Nonoy B. Bandillo

Modern improvement of complex traits in agricultural species relies on successful associations of heritable molecular variation with observable phenotypes. Historically, this pursuit has primarily been based on easily measurable genetic markers. The recent advent of new technologies allows assaying and quantifying biological intermediates (hereafter endophenotypes) which are now readily measurable at a large scale across diverse individuals. The potential of using endophenotypes for dissecting traits of interest remains underexplored in plants. The work presented here illustrated the utility of a large-scale (299 genotype and 7 tissue) gene expression resource to dissect traits across multiple levels of biological organization. Using single-tissue- and multi-tissue-based transcriptome-wide association studies (TWAS), we revealed that about half of the functional variation for agronomic and seed quality (carotenoid, tocochromanol) traits is regulatory. Comparing the efficacy of TWAS with genome-wide association studies (GWAS) and an ensemble approach that combines both GWAS and TWAS, we demonstrated that results of TWAS in combination with GWAS increase the power to detect known genes and aid in prioritizing likely causal genes. Using a variance partitioning approach in the independent maize Nested Association Mapping (NAM) population, we also showed that the most strongly associated genes identified by combining GWAS and TWAS explain more heritable variance for a majority of traits, beating the heritability captured by the random genes and the genes identified by GWAS or TWAS alone. This improves not only the ability to link genes to phenotypes, but also highlights the phenotypic consequences of regulatory variation in plants.
Author summary: We examined the ability to associate variability in gene expression directly with terminal phenotypes of interest, as a supplement linking genotype to phenotype. We found that transcriptome-wide association studies (TWAS) are a useful accessory to genome-wide association studies (GWAS). In a combined test with GWAS results, TWAS improves the capacity to re-detect genes known to underlie quantitative trait loci for kernel and agronomic phenotypes. This improves not only the capacity to link genes to phenotypes, but also illustrates the widespread importance of regulation for phenotype.


