Disease and phenotype relevant genetic variants identified from histone acetylomes in human hearts

Abstract: Identifying genetic markers for heterogeneous complex diseases such as heart failure has been challenging, and may require prohibitively large cohort sizes in genome-wide association studies (GWAS) in order to demonstrate statistical significance [1]. On the other hand, chromatin quantitative trait loci (QTL), elucidated by direct epigenetic profiling of specific human tissues, may contribute towards prioritising variants for disease-association. Here, we captured non-coding genetic variants by performing enhancer H3K27ac ChIP-seq in 70 human control and end-stage failing hearts, mapping out a comprehensive catalogue of 47,321 putative human heart enhancers. 3,897 differential acetylation peaks (FDR < 0.05) pointed to recognizable pathways altered in heart failure (HF). To identify cardiac histone acetylation QTLs (haQTLs), we regressed out confounding factors including HF disease status, and employed the G-SCI test [2] to call out 1,680 haQTLs (FDR < 0.1). A subset of these showed significant association to gene expression, either in cis (180), or through long range interactions (81), identified by Hi-C and Hi-ChIP performed on a subset of hearts. Furthermore, a concordant relationship was found between the gain or disruption of specific transcription factor (TF) binding motifs, inferred from alternative alleles at the haQTLs, associated with altered H3K27ac peak heights. Finally, colocalisation of our haQTLs with heart-related GWAS datasets allowed us to identify 62 unique loci. Disease-association for these new loci may indeed be mediated through modification of H3K27-acetylation enrichment and their corresponding gene expression differences.

Epigenomes of Human Hearts Reveal New Genetic Variants Relevant for Cardiac Disease and Phenotype

Circulation Research ◽

10.1161/circresaha.120.317254 ◽

2020 ◽

Vol 127 (6) ◽

pp. 761-777 ◽

Cited By ~ 1

Author(s):

Wilson Lek Wen Tan ◽

Chukwuemeka George Anene-Nzelu ◽

Eleanor Wong ◽

Chang Jie Mick Lee ◽

Hui San Tan ◽

...

Keyword(s):

Gene Expression ◽

Heart Failure ◽

Quantitative Trait Loci ◽

Genetic Variants ◽

Quantitative Trait ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Trait Loci

Rationale: Identifying genetic markers for heterogeneous complex diseases such as heart failure is challenging and requires prohibitively large cohort sizes in genome-wide association studies to meet the stringent threshold of genome-wide statistical significance. On the other hand, chromatin quantitative trait loci, elucidated by direct epigenetic profiling of specific human tissues, may contribute toward prioritizing subthreshold variants for disease association. Objective: Here, we captured noncoding genetic variants by performing epigenetic profiling for enhancer H3K27ac chromatin immunoprecipitation followed by sequencing in 70 human control and end-stage failing hearts. Methods and Results: We have mapped a comprehensive catalog of 47 321 putative human heart enhancers and promoters. Three thousand eight hundred ninety-seven differential acetylation peaks (FDR [false discovery rate], 5%) pointed to pathways altered in heart failure. To identify cardiac histone acetylation quantitative trait loci (haQTLs), we regressed out confounding factors including heart failure disease status and used the G-SCI (Genotype-independent Signal Correlation and Imbalance) test [1] to call out 1680 haQTLs (FDR, 10%). RNA sequencing performed on the same heart samples proved a subset of haQTLs to have significant association also to gene expression (expression quantitative trait loci), either in cis (180) or through long-range interactions (81), identified by Hi-C (high-throughput chromatin conformation assay) and HiChIP (high-throughput protein centric chromatin) performed on a subset of hearts. Furthermore, a concordant relationship between the gain or disruption of TF (transcription factor)-binding motifs, inferred from alternative alleles at the haQTLs, implied a surprising direct association between these specific TF and local histone acetylation in human hearts. Finally, 62 unique loci were identified by colocalization of haQTLs with the subthreshold loci of heart-related genome-wide association studies datasets. Conclusions: Disease and phenotype association for 62 unique loci are now implicated. These loci may indeed mediate their effect through modification of enhancer H3K27 acetylation enrichment and their corresponding gene expression differences (bioRxiv: https://doi.org/10.1101/536763 ).
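
The abstract reports peaks and haQTLs at FDR thresholds of 5% and 10% but does not say which FDR procedure was used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if rejected at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does
    not state which FDR method the authors actually applied.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten peaks (illustrative only).
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(example, fdr=0.05)
```

Raising `fdr` to 0.10, as for the haQTL calls, relaxes every rank threshold, so it can only keep or enlarge the set of rejected hypotheses.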

A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome

10.1101/563379 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tom G Richardson ◽

Gibran Hemani ◽

Tom R Gaunt ◽

Caroline L Relton ◽

George Davey Smith

Keyword(s):

Gene Expression ◽

Genetic Variants ◽

Complex Traits ◽

Mendelian Randomization ◽

Drug Repositioning ◽

Association Studies ◽

Thyroid Tissue ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genome Wide

Abstract: Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Conducting similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.
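
The abstract does not name the estimator behind its gene-trait associations. A common choice when a single eQTL serves as the instrument is the Wald ratio, and the sketch below assumes that estimator purely for illustration. The function name `wald_ratio` and the input values are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the variant-expression estimate.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_exposure: effect of the variant on gene expression (eQTL effect).
    beta_outcome:  effect of the same variant on the trait (GWAS effect).
    se_outcome:    standard error of beta_outcome.
    Assumption: first-order standard error, treating beta_exposure as fixed.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical summary statistics (illustrative only).
estimate, se, p_value = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
genome_wide = p_value < 5e-8  # threshold quoted in the abstract
```

With these made-up inputs the z-score is 5, which does not clear the P < 5 × 10⁻⁸ threshold quoted in the abstract.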

Disease association with frequented regions of genotype graphs

10.1101/2020.09.25.20201640 ◽

2020 ◽

Author(s):

Samuel Hokin ◽

Alan Cleary ◽

Joann Mudge

Keyword(s):

Rare Variants ◽

Disease Risk ◽

Association Studies ◽

Disease Status ◽

Disease Association ◽

Genome Wide Association Studies ◽

Entire Genome ◽

Machine Learning Classification ◽

Complementary Method ◽

Genome Wide

Complex diseases, with many associated genetic and environmental factors, are a challenging target for genomic risk assessment. Genome-wide association studies (GWAS) associate disease status with, and compute risk from, individual common variants, which can be problematic for diseases with many interacting or rare variants. In addition, GWAS typically employ a reference genome which is not built from the subjects of the study, whose genetic background may differ from the reference and whose genetic characterization may be limited. We present a complementary method based on disease association with collections of genotypes, called frequented regions, on a pangenomic graph built from subjects' genomes. We introduce the pangenomic genotype graph, which is better suited than sequence graphs to human disease studies. Our method draws out collections of features, across multiple genomic segments, which are associated with disease status. We show that the frequented regions method consistently improves machine-learning classification of disease status over GWAS classification, allowing incorporation of rare or interacting variants. Notably, genomic segments that have few or no variants of genome-wide significance (p < 5 × 10⁻⁸) provide much-improved classification with frequented regions, encouraging their application across the entire genome. Frequented regions may also be utilized for purposes such as choice of treatment in addition to prediction of disease risk.

A novel quantile regression approach for eQTL discovery

10.1101/070052 ◽

2016 ◽

Author(s):

Xiaoyu Song ◽

Gen Li ◽

Iuliana Ionita-Laza ◽

Ying Wei

Keyword(s):

Gene Expression ◽

Genetic Variation ◽

Linear Regression ◽

Genetic Variants ◽

Molecular Mechanisms ◽

Association Studies ◽

Expression Level ◽

Special Focus ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract: Over the past decade, there has been a remarkable improvement in our understanding of the role of genetic variation in complex human diseases, especially via genome-wide association studies. However, the underlying molecular mechanisms are still poorly characterized, impeding the development of therapeutic interventions. Identifying genetic variants that influence the expression level of a gene, i.e. expression quantitative trait loci (eQTLs), can help us understand how genetic variants influence traits at the molecular level. While most eQTL studies focus on identifying mean effects on gene expression using linear regression, evidence suggests that genetic variation can impact the entire distribution of the expression level. Indeed, several studies have already investigated higher order associations with a special focus on detecting heteroskedasticity. In this paper, we develop a Quantile Rank-score Based Test (QRBT) to identify eQTLs that are associated with the conditional quantile functions of gene expression. We have applied the proposed QRBT to the Genotype-Tissue Expression project, an international tissue bank for studying the relationship between genetic variation and gene expression in human tissues, and found that the proposed QRBT complements the existing methods, and identifies new eQTLs with heterogeneous effects genome-wide across different quantile levels. Notably, we show that the eQTLs identified by QRBT but missed by linear regression are more likely to be tissue specific, and also associated with greater enrichment in genome-wide significant SNPs from the GWAS catalog. An R package implementing QRBT is available on our website.
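
To make concrete why a mean-based test can miss a variant that changes the shape of the expression distribution, the sketch below uses the check (pinball) loss that underlies quantile regression in general. It is not an implementation of QRBT. The helper names and the toy expression values for two genotype groups are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile-regression check loss: tau * r if r >= 0, else (tau - 1) * r."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(values, tau):
    """Observed value that minimises the total check loss at quantile level tau."""
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))


# Toy expression levels for two genotype groups (illustrative only).
# Both groups share the same mean and median, but group 1 is more spread out.
group0 = [4.0, 4.5, 5.0, 5.5, 6.0]
group1 = [2.0, 3.5, 5.0, 6.5, 8.0]

mean_difference = sum(group1) / len(group1) - sum(group0) / len(group0)  # 0.0
median0, median1 = best_constant(group0, 0.5), best_constant(group1, 0.5)  # 5.0, 5.0
upper0, upper1 = best_constant(group0, 0.9), best_constant(group1, 0.9)    # 6.0, 8.0
```

In this toy data the groups are indistinguishable at the mean and the median, yet differ at the 0.9 quantile level, which is the kind of heterogeneous effect the abstract says linear regression tends to miss.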

Faculty Opinions recommendation of Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726582766.793522025 ◽

2016 ◽

Author(s):

Rik Lories

Keyword(s):

Gene Expression ◽

Network Analysis ◽

Gene Expression Data ◽

Association Studies ◽

Meta Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Expression Data ◽

Dupuytren's Disease ◽

Genome Wide

Migraine: Genetic Variants and Clinical Phenotypes

Current Medicinal Chemistry ◽

10.2174/0929867325666180719120215 ◽

2019 ◽

Vol 26 (34) ◽

pp. 6207-6221 ◽

Cited By ~ 1

Author(s):

Innocenzo Rainero ◽

Alessandro Vacca ◽

Flora Govone ◽

Annalisa Gai ◽

Lorenzo Pinessi ◽

...

Keyword(s):

Genetic Variants ◽

Association Studies ◽

Mthfr Gene ◽

Genome Wide Association Studies ◽

Clinical Phenotypes ◽

Network Analyses ◽

Genome Wide ◽

Genomic Studies ◽

The Common ◽

The Relationship

Migraine is a common, chronic neurovascular disorder caused by a complex interaction between genetic and environmental risk factors. In the last two decades, the molecular genetics of migraine has been intensively investigated. In a few cases, migraine is transmitted as a monogenic disorder, and the disease phenotype cosegregates with mutations in different genes like CACNA1A, ATP1A2, SCN1A, KCNK18, and NOTCH3. In the common forms of migraine, candidate gene studies as well as genome-wide association studies have shown that a large number of genetic variants may increase the risk of developing migraine. To date, few studies have investigated the genotype-phenotype correlation in patients with migraine. The purpose of this review was to discuss recent studies investigating the relationship between different genetic variants and the clinical characteristics of migraine. Analysis of genotype-phenotype correlations in migraineurs is complicated by several confounding factors and, to date, only polymorphisms of the MTHFR gene have been shown to have an effect on migraine phenotype. Additional genomic studies and network analyses are needed to clarify the complex pathways underlying migraine and its clinical phenotypes.

Editing GWAS: experimental approaches to dissect and exploit disease-associated genetic variation

Genome Medicine ◽

10.1186/s13073-021-00857-3 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Shuquan Rao ◽

Yao Yao ◽

Daniel E. Bauer

Keyword(s):

Genome Editing ◽

Genetic Variants ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Functional Studies ◽

Functional Genetics ◽

Genome Wide ◽

Causal Variants ◽

Experimental Approaches

Abstract: Genome-wide association studies (GWAS) have uncovered thousands of genetic variants that influence risk for human diseases and traits. Yet understanding the mechanisms by which these genetic variants, mainly noncoding, have an impact on associated diseases and traits remains a significant hurdle. In this review, we discuss emerging experimental approaches that are being applied for functional studies of causal variants and translational advances from GWAS findings to disease prevention and treatment. We highlight the use of genome editing technologies in GWAS functional studies to modify genomic sequences, with proof-of-principle examples. We discuss the challenges in interrogating causal variants, points for consideration in experimental design and interpretation of GWAS locus mechanisms, and the potential for novel therapeutic opportunities. With the accumulation of knowledge of functional genetics, therapeutic genome editing based on GWAS discoveries will become increasingly feasible.

Transcriptome-wide Mendelian randomization study prioritising novel tissue-dependent genes for glioma susceptibility

Scientific Reports ◽

10.1038/s41598-021-82169-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jamie W. Robinson ◽

Richard M. Martin ◽

Spiridon Tsavachidis ◽

Amy E. Howell ◽

Caroline L. Relton ◽

...

Keyword(s):

Gene Expression ◽

Association Studies ◽

Tissue Expression ◽

Tissue Type ◽

Mendelian Randomisation ◽

Genome Wide Association Studies ◽

Causal Pathways ◽

Genome Wide ◽

Glioma Risk ◽

Brain Tissues

Abstract: Genome-wide association studies (GWAS) have discovered 27 loci associated with glioma risk. Whether these loci are causally implicated in glioma risk, and how risk differs across tissues, has yet to be systematically explored. We integrated multi-tissue expression quantitative trait loci (eQTLs) and glioma GWAS data using a combined Mendelian randomisation (MR) and colocalisation approach. We investigated how genetically predicted gene expression affects risk across tissue type (brain, estimated effective n = 1194 and whole blood, n = 31,684) and glioma subtype (all glioma (7400 cases, 8257 controls), glioblastoma (GBM, 3112 cases) and non-GBM gliomas (2411 cases)). We also leveraged tissue-specific eQTLs collected from 13 brain tissues (n = 114 to 209). The MR and colocalisation results suggested that genetically predicted increased expression of 12 genes was associated with glioma, GBM and/or non-GBM risk, three of which are novel glioma susceptibility genes (RETREG2/FAM134A, FAM178B and MVB12B/FAM125B). The effect of gene expression appears to be relatively consistent across glioma subtype diagnoses. Examining how risk differed across 13 brain tissues highlighted five candidate tissues (cerebellum, cortex, and the putamen, nucleus accumbens and caudate basal ganglia) and four previously implicated genes (JAK1, STMN3, PICK1 and EGFR). These analyses identified robust causal evidence for 12 genes and glioma risk, three of which are novel. The correlation of MR estimates in brain and blood is consistently low, which suggests that tissue specificity needs to be carefully considered for glioma. Our results have implicated genes yet to be associated with glioma susceptibility and provided insight into putatively causal pathways for glioma risk.

CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies

Nucleic Acids Research ◽

10.1093/nar/gkz1026 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jianhua Wang ◽

Dandan Huang ◽

Yao Zhou ◽

Hongcheng Yao ◽

Huanhuan Liu ◽

...

Keyword(s):

Fine Mapping ◽

Genetic Variants ◽

Association Studies ◽

Complex Trait ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Genome Wide ◽

Credible Sets ◽

Causal Variants

Abstract: Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.

Challenges of Adjusting Single-Nucleotide Polymorphism Effect Sizes for Linkage Disequilibrium

Human Heredity ◽

10.1159/000513303 ◽

2021 ◽

pp. 1-11

Author(s):

Valentina Escott-Price ◽

Karl Michael Schmidt

Keyword(s):

Linkage Disequilibrium ◽

Association Studies ◽

Statistical Significance ◽

Ordinary Least Squares ◽

Effect Sizes ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Single Nucleotide ◽

Genome Wide ◽

Tikhonov Regularisation

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
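
As a rough illustration of the Tikhonov-regularised estimate mentioned above, the sketch below adjusts the marginal effect sizes of two SNPs by solving (R + λI) b = β_marginal, where R is their 2 × 2 LD correlation matrix. This summary-statistic form assumes standardised genotypes and is an assumption made for the example, not a formula quoted from the paper. The function name `ld_adjust_two_snps` and all numbers are hypothetical.

```python
def ld_adjust_two_snps(beta_marginal, r, lam):
    """Solve (R + lam * I) b = beta_marginal for two SNPs with LD correlation r.

    Assumption: genotypes are standardised, so R = [[1, r], [r, 1]].
    lam is the Tikhonov (ridge) regularisation parameter; lam = 0 gives
    the unregularised joint estimate when |r| < 1.
    """
    b1, b2 = beta_marginal
    d = 1.0 + lam
    det = d * d - r * r
    if det == 0:
        raise ValueError("R + lam * I is singular; increase lam")
    # Closed-form inverse of the symmetric 2x2 matrix [[d, r], [r, d]].
    return ((d * b1 - r * b2) / det, (d * b2 - r * b1) / det)


# Hypothetical locus: SNP 1 has a true joint effect of 0.5, SNP 2 has none,
# and the two SNPs are in LD with r = 0.8, so the marginal effects are (0.5, 0.4).
unregularised = ld_adjust_two_snps((0.5, 0.4), r=0.8, lam=0.0)
mild = ld_adjust_two_snps((0.5, 0.4), r=0.8, lam=0.1)
```

With `lam = 0.0` the tagging signal at SNP 2 is removed entirely, while a mild `lam = 0.1` shrinks SNP 1 and leaves a small residual effect at SNP 2. That trade-off between stability and completeness of the LD correction is the one the abstract describes.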
