Multiple genes in cis mediate the effects of a single chromatin accessibility variant on aberrant synaptic development and function in human neurons

Genome-wide association studies of neuropsychiatric disorders have identified hundreds of risk loci, but the causal variants and genes remain largely unknown. Here, in NEUROG2-induced human neurons, we identified 31 risk SNPs in 26 schizophrenia (SZ) risk loci that displayed allele-specific open chromatin (ASoC) and were likely to be functional. Editing the strongest ASoC SNP, rs2027349, near vacuolar protein sorting 45 homolog (VPS45) altered the expression of VPS45, the lncRNA AC244033.2, and a distal gene, C1orf54, in human neurons. Notably, the global gene expression changes in neurons were enriched for SZ risk and correlated with post-mortem brain gene expression signatures of neuropsychiatric disorders. Neurons carrying the risk allele exhibited increased dendritic complexity, synaptic puncta density, and hyperactivity, which were reversed by knocking down distinct cis-regulated genes (VPS45, AC244033.2, or C1orf54), suggesting a phenotypic contribution from all three genes. Interestingly, transcriptomic analysis of knockdown cells suggested non-additive effects of these genes. Our study reveals a compound effect of multiple genes at a single SZ locus on synaptic development and function, providing a mechanistic link between a non-coding SZ risk variant and disease-related cellular phenotypes.
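The idea of allele-specific open chromatin can be illustrated with a small sketch. At a heterozygous SNP, reads from an open-chromatin assay are counted per allele and compared against the 50:50 split expected without allelic bias. The abstract does not state which test the authors used; the exact binomial test below, the function names, and the read counts are all assumptions chosen for illustration.

```python
from math import comb
from typing import Tuple


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # A small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def allelic_imbalance(ref_reads: int, alt_reads: int) -> Tuple[float, float]:
    """Return (reference allele fraction, p-value against a 50:50 split)."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / n, binomial_two_sided_p(ref_reads, n)


# Hypothetical read counts at one heterozygous SNP.
ratio, p_value = allelic_imbalance(ref_reads=70, alt_reads=30)
print(f"ref allele fraction = {ratio:.2f}, p = {p_value:.2e}")
```

In a genome-wide setting the per-SNP p-values would still need a multiple-testing correction before any SNP is called allele-specific.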

Faculty Opinions recommendation of Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726582766.793522025 ◽

2016 ◽

Author(s):

Rik Lories

Keyword(s):

Gene Expression ◽

Network Analysis ◽

Gene Expression Data ◽

Association Studies ◽

Meta Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Expression Data ◽

Dupuytren's Disease ◽

Genome Wide

TCF19 Impacts a Network of Inflammatory and DNA Damage Response Genes in the Pancreatic β-Cell

Metabolites ◽

10.3390/metabo11080513 ◽

2021 ◽

Vol 11 (8) ◽

pp. 513

Author(s):

Grace H. Yang ◽

Danielle A. Fontaine ◽

Sukanya Lodh ◽

Joseph T. Blumer ◽

Avtar Roopra ◽

...

Keyword(s):

Gene Expression ◽

DNA Damage ◽

DNA Damage Response ◽

Association Studies ◽

Cell Cycle Gene ◽

Genome Wide Association Studies ◽

β Cell ◽

Nucleotide Incorporation ◽

Damage Response ◽

The Impact

Transcription factor 19 (TCF19) is a gene associated with type 1 diabetes (T1DM) and type 2 diabetes (T2DM) in genome-wide association studies. Prior studies have demonstrated that Tcf19 knockdown impairs β-cell proliferation and increases apoptosis. However, little is known about its role in diabetes pathogenesis or the effects of TCF19 gain-of-function. The aim of this study was to examine the impact of TCF19 overexpression in INS-1 β-cells and human islets on proliferation and gene expression. With TCF19 overexpression, there was an increase in nucleotide incorporation without any change in cell cycle gene expression, suggesting an alternate process of nucleotide incorporation. Analysis of RNA-seq of TCF19-overexpressing cells revealed increased expression of several DNA damage response (DDR) genes, as well as a tightly linked set of genes involved in viral responses, immune system processes, and inflammation. This connectivity between DNA damage and inflammatory gene expression has not been well studied in the β-cell and suggests a novel role for TCF19 in regulating these pathways. Future studies determining how TCF19 may modulate these pathways can provide potential targets for improving β-cell survival.

Transcriptome-wide Mendelian randomization study prioritising novel tissue-dependent genes for glioma susceptibility

Scientific Reports ◽

10.1038/s41598-021-82169-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jamie W. Robinson ◽

Richard M. Martin ◽

Spiridon Tsavachidis ◽

Amy E. Howell ◽

Caroline L. Relton ◽

...

Keyword(s):

Gene Expression ◽

Association Studies ◽

Tissue Expression ◽

Tissue Type ◽

Mendelian Randomisation ◽

Genome Wide Association Studies ◽

Causal Pathways ◽

Genome Wide ◽

Glioma Risk ◽

Brain Tissues

Genome-wide association studies (GWAS) have discovered 27 loci associated with glioma risk. Whether these loci are causally implicated in glioma risk, and how risk differs across tissues, has yet to be systematically explored. We integrated multi-tissue expression quantitative trait loci (eQTLs) and glioma GWAS data using a combined Mendelian randomisation (MR) and colocalisation approach. We investigated how genetically predicted gene expression affects risk across tissue type (brain, estimated effective n = 1194, and whole blood, n = 31,684) and glioma subtype (all glioma [7400 cases, 8257 controls], glioblastoma [GBM, 3112 cases] and non-GBM gliomas [2411 cases]). We also leveraged tissue-specific eQTLs collected from 13 brain tissues (n = 114 to 209). The MR and colocalisation results suggested that genetically predicted increased expression of 12 genes was associated with glioma, GBM and/or non-GBM risk, three of which are novel glioma susceptibility genes (RETREG2/FAM134A, FAM178B and MVB12B/FAM125B). The effect of gene expression appears to be relatively consistent across glioma subtype diagnoses. Examining how risk differed across 13 brain tissues highlighted five candidate tissues (cerebellum, cortex, and the putamen, nucleus accumbens and caudate basal ganglia) and four previously implicated genes (JAK1, STMN3, PICK1 and EGFR). These analyses identified robust causal evidence for 12 genes and glioma risk, three of which are novel. The correlation of MR estimates in brain and blood is consistently low, which suggests that tissue specificity needs to be carefully considered for glioma. Our results have implicated genes yet to be associated with glioma susceptibility and provided insight into putatively causal pathways for glioma risk.
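As a rough illustration of how an eQTL and a GWAS association can be combined in Mendelian randomisation, the sketch below computes a Wald ratio for a single genetic instrument. The abstract does not say which MR estimator was used, so the Wald ratio, the first-order standard error (which ignores uncertainty in the eQTL effect), and the example effect sizes are assumptions made for illustration only.

```python
from math import erfc, sqrt
from typing import Tuple


def wald_ratio(beta_eqtl: float, beta_gwas: float, se_gwas: float) -> Tuple[float, float, float]:
    """Single-instrument MR estimate of the effect of expression on disease risk.

    beta_eqtl: SNP effect on gene expression (must be non-zero).
    beta_gwas: SNP effect on disease risk (e.g. log odds ratio).
    se_gwas:   standard error of beta_gwas.
    Returns (estimate, standard error, two-sided p-value).
    """
    if beta_eqtl == 0:
        raise ValueError("the instrument must have a non-zero effect on expression")
    estimate = beta_gwas / beta_eqtl
    # First-order approximation: treats the eQTL effect as known without error.
    se = abs(se_gwas / beta_eqtl)
    z = estimate / se
    # Two-sided p-value under a standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return estimate, se, p_value


# Hypothetical effect sizes for one SNP-gene pair.
estimate, se, p_value = wald_ratio(beta_eqtl=0.5, beta_gwas=0.1, se_gwas=0.02)
print(f"MR estimate = {estimate:.3f} (SE {se:.3f}), p = {p_value:.2e}")
```

A colocalisation analysis, as described in the abstract, would be a separate step to check that the eQTL and GWAS signals share the same underlying variant.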

Analysis of chromatin organization and gene expression in T cells identifies functional genes for rheumatoid arthritis

10.1101/827923 ◽

2019 ◽

Author(s):

Jing Yang ◽

Amanda McGovern ◽

Paul Martin ◽

Kate Duffus ◽

Xiangyu Ge ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Gene Expression ◽

T Cells ◽

Complex Disease ◽

Target Genes ◽

Disease Risk ◽

Association Studies ◽

DNA Interaction ◽

Genome Wide Association Studies ◽

Causal Genes

Genome-wide association studies have identified genetic variation contributing to complex disease risk. However, assigning causal genes and mechanisms has been more challenging because disease-associated variants are often found in distal regulatory regions with cell-type specific behaviours. Here, we collect ATAC-seq, Hi-C, Capture Hi-C and nuclear RNA-seq data in stimulated CD4+ T-cells over 24 hours to identify functional enhancers regulating gene expression. We characterise changes in DNA interaction and activity dynamics that correlate with changes in gene expression, and find that the strongest correlations are observed within 200 kb of promoters. Using rheumatoid arthritis as an example of T-cell mediated disease, we demonstrate interactions of expression quantitative trait loci with target genes, and confirm assigned genes or show complex interactions for 20% of disease-associated loci, including FOXO1, which we confirm using CRISPR/Cas9.

A Network-based Deep Learning Framework Catalyzes GWAS and Multi-Omics Findings to Biology and Drug Repurposing for Alzheimer's Disease

10.1101/2021.10.20.465087 ◽

2021 ◽

Author(s):

Jielin Xu ◽

Yuan Hou ◽

Yadi Zhou ◽

Ming Hu ◽

Feixiong Cheng

Keyword(s):

Alzheimer's Disease ◽

Deep Learning ◽

Disease Risk ◽

Association Studies ◽

Open Chromatin ◽

Genome Wide Association Studies ◽

Risk Genes ◽

Learning Framework ◽

Protein Interactome

Human genome sequencing studies have identified numerous loci associated with complex diseases, including Alzheimer's disease (AD). Translating human genetic findings (i.e., genome-wide association studies [GWAS]) to pathobiology and therapeutic discovery, however, remains a major challenge. To address this critical problem, we present a network topology-based deep learning framework to identify disease-associated genes (NETTAG). NETTAG is capable of integrating multi-genomics data along with the protein-protein interactome to infer putative risk genes and drug targets impacted by GWAS loci. Specifically, we leverage non-coding GWAS loci effects on expression quantitative trait loci (eQTLs), histone-QTLs, and transcription factor binding-QTLs, enhancers and CpG islands, promoter regions, open chromatin, and promoter flanking regions. The key premises of NETTAG are that disease risk genes exhibit distinct functional characteristics compared to non-risk genes and therefore can be distinguished by their aggregated genomic features under the human protein interactome. Applying NETTAG to the latest AD GWAS data, we identified 156 putative AD-risk genes (e.g., APOE, BIN1, GSK3B, MARK4, and PICALM). We showed that predicted risk genes are: 1) significantly enriched in AD-related pathobiological pathways, 2) more likely to be differentially expressed in the transcriptome and proteome of AD brains, and 3) enriched in druggable targets with approved medicines (e.g., choline and ibudilast). In summary, our findings suggest that understanding of human pathobiology and therapeutic development could benefit from a network-based deep learning methodology that utilizes GWAS findings under multimodal genomic analyses.

Integrating Transcriptomics, Genomics, and Imaging in Alzheimer's Disease: A Federated Model

10.1101/2021.09.14.460367 ◽

2021 ◽

Author(s):

Jianfeng Wu ◽

Yanxi Chen ◽

Panwen Wang ◽

Richard J Caselli ◽

Paul M Thompson ◽

...

Keyword(s):

Gene Expression ◽

Alzheimer's Disease ◽

Imaging Modality ◽

Association Studies ◽

Imaging Genetics ◽

Health Concern ◽

Genome Wide Association Studies ◽

Imaging Data ◽

Stable Performance

Alzheimer's disease (AD) affects more than 1 in 9 people aged 65 and older and is becoming an urgent public health concern as the global population ages. In clinical practice, structural magnetic resonance imaging (sMRI) is the most accessible and widely used diagnostic imaging modality. Additionally, genome-wide association studies (GWAS) and transcriptomics, the study of gene expression, also play an important role in understanding AD etiology and progression. Sophisticated imaging genetics systems have been developed to discover genetic factors that consistently affect brain function and structure. However, most studies to date have focused on the relationships between brain sMRI and GWAS or between brain sMRI and transcriptomics. To our knowledge, few methods have been developed to discover and infer multimodal relationships among sMRI, GWAS, and transcriptomics. To address this, we propose a novel federated model, Genotype-Expression-Imaging Data Integration (GEIDI), to identify genetic and transcriptomic influences on brain sMRI measures. The relationships between brain imaging measures and gene expression are allowed to depend on a person's genotype at the single-nucleotide polymorphism (SNP) level, making the inferences adaptive and personalized. We performed extensive experiments on the publicly available Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Experimental results demonstrated that our proposed method outperformed state-of-the-art expression quantitative trait loci (eQTL) methods for detecting genetic and transcriptomic factors related to AD and had stable performance when data were integrated from multiple sites. Our GEIDI approach may offer novel insights into the relationship among imaging biomarkers, genotypes, and gene expression and help discover novel genetic targets for potential AD drug treatments.

A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome

10.1101/563379 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tom G Richardson ◽

Gibran Hemani ◽

Tom R Gaunt ◽

Caroline L Relton ◽

George Davey Smith

Keyword(s):

Gene Expression ◽

Genetic Variants ◽

Complex Traits ◽

Mendelian Randomization ◽

Drug Repositioning ◽

Association Studies ◽

Thyroid Tissue ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genome Wide

Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.

A systematic analysis of genetically regulated differences in gene expression and the role of co-expression networks across 16 psychiatric disorders and substance use phenotypes

10.1101/2021.01.28.428688 ◽

2021 ◽

Author(s):

Zachary F Gerring ◽

Jackson G Thorp ◽

Eric R Gamazon ◽

Eske M Derks

Keyword(s):

Gene Expression ◽

Mental Health ◽

Substance Use ◽

Prefrontal Cortex ◽

Psychiatric Disorders ◽

Developmental Disorders ◽

Association Studies ◽

Genetic Correlations ◽

Autism Spectrum ◽

Genome Wide Association Studies

Genome-wide association studies (GWASs) have identified thousands of risk loci for many psychiatric and substance use phenotypes; however, the biological consequences of these loci remain largely unknown. We performed a transcriptome-wide association study of 10 psychiatric disorders and 6 substance use phenotypes (collectively termed “mental health phenotypes”) using expression quantitative trait loci data from 532 prefrontal cortex samples. We estimated the correlation due to predicted genetically regulated expression between pairs of mental health phenotypes, and compared the results with the genetic correlations. We identified 1,645 genes with at least one significant trait association, comprising 2,176 significant associations across the 16 mental health phenotypes, of which 572 (26%) are novel. Overall, the transcriptomic correlations for phenotype pairs were significantly higher than the respective genetic correlations. For example, attention deficit hyperactivity disorder and autism spectrum disorder, both childhood developmental disorders, showed a much higher transcriptomic correlation (r = 0.84) than genetic correlation (r = 0.35). Finally, we tested the enrichment of phenotype-associated genes in gene co-expression networks built from prefrontal cortex. Phenotype-associated genes were enriched in multiple gene co-expression modules, and the implicated modules contained genes involved in mRNA splicing and glutamatergic receptors, among others. Together, our results highlight the utility of gene expression data in the understanding of functional gene mechanisms underlying psychiatric disorders and substance use phenotypes.
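To make the notion of a transcriptomic correlation between two phenotypes more concrete, the sketch below computes a Pearson correlation between gene-level association z-scores over the genes tested for both traits. This is a deliberately simplified stand-in, not the estimator used in the study, which the abstract does not specify; the gene labels and z-scores are invented for illustration.

```python
from math import sqrt
from typing import Dict


def transcriptomic_correlation(z_a: Dict[str, float], z_b: Dict[str, float]) -> float:
    """Pearson correlation of gene-level z-scores over genes shared by two traits."""
    genes = sorted(set(z_a) & set(z_b))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    xs = [z_a[g] for g in genes]
    ys = [z_b[g] for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("z-scores are constant for one trait")
    return cov / sqrt(var_x * var_y)


# Hypothetical z-scores; only genes present for both traits are compared.
trait_a = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5, "GENE_D": 3.0}
trait_b = {"GENE_A": 1.5, "GENE_B": -0.5, "GENE_C": 0.0, "GENE_E": 2.0}
print(f"r = {transcriptomic_correlation(trait_a, trait_b):.2f}")
```

A naive correlation like this ignores correlation between neighbouring genes and sample overlap between studies, which dedicated estimators are designed to handle.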

Combinatorial and statistical prediction of gene expression from haplotype sequence

Bioinformatics ◽

10.1093/bioinformatics/btaa318 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i194-i202

Author(s):

Berk A Alpay ◽

Pinar Demetci ◽

Sorin Istrail ◽

Derek Aguiar

Keyword(s):

Gene Expression ◽

Multiple Testing ◽

Association Studies ◽

Classification Problem ◽

Statistical Prediction ◽

Model Complexity ◽

Supplementary Information ◽

Prediction Methods ◽

Genome Wide Association Studies ◽

Regulatory Effects

Motivation: Genome-wide association studies (GWAS) have discovered thousands of significant genetic effects on disease phenotypes. By considering gene expression as the intermediary between genotype and disease phenotype, expression quantitative trait loci studies have interpreted many of these variants by their regulatory effects on gene expression. However, there remains a considerable gap between genotype-to-gene expression association and genotype-to-gene expression prediction. Accurate prediction of gene expression enables gene-based association studies to be performed post hoc for existing GWAS, reduces multiple testing burden, and can prioritize genes for subsequent experimental investigation. Results: In this work, we develop gene expression prediction methods that relax the independence and additivity assumptions between genetic markers. First, we consider gene expression prediction from a regression perspective and develop the HAPLEXR algorithm, which combines haplotype clusterings with allelic dosages. Second, we introduce the new gene expression classification problem, which focuses on identifying expression groups rather than continuous measurements; we formalize the selection of an appropriate number of expression groups using the principle of maximum entropy. Third, we develop the HAPLEXD algorithm, which models haplotype sharing with a modified suffix tree data structure and computes expression groups by spectral clustering. In both models, we penalize model complexity by prioritizing genetic clusters that indicate significant effects on expression. We compare HAPLEXR and HAPLEXD with three state-of-the-art expression prediction methods and two novel logistic regression approaches across five GTEx v8 tissues. HAPLEXD exhibits significantly higher classification accuracy overall; HAPLEXR shows higher prediction accuracy on approximately half of the genes tested and the largest number of best predicted genes (r² > 0.1) among all methods. We show that variant and haplotype features selected by HAPLEXR are smaller in size than those of competing methods (and thus more interpretable) and are significantly enriched in functional annotations related to gene regulation. These results demonstrate the importance of explicitly modeling non-dosage dependent and intragenic epistatic effects when predicting expression. Availability and implementation: Source code and binaries are freely available at https://github.com/rapturous/HAPLEX. Supplementary information: Supplementary data are available at Bioinformatics online.

A selective inference approach for false discovery rate control using multiomics covariates yields insights into disease risk

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1918862117 ◽

2020 ◽

Vol 117 (26) ◽

pp. 15028-15035 ◽

Cited By ~ 1

Author(s):

Ronald Yurko ◽

Max G’Sell ◽

Kathryn Roeder ◽

Bernie Devlin

Keyword(s):

Gene Expression ◽

Multiple Testing ◽

Disease Risk ◽

Association Studies ◽

Auxiliary Information ◽

Primary Data ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Hypothesis Tests ◽

Selective Inference

To correct for a large number of hypothesis tests, most researchers rely on simple multiple testing corrections. Yet, new methodologies of selective inference could potentially improve power while retaining statistical guarantees, especially those that enable exploration of test statistics using auxiliary information (covariates) to weight hypothesis tests for association. We explore one such method, adaptive P-value thresholding (AdaPT), in the framework of genome-wide association studies (GWAS) and gene expression/coexpression studies, with particular emphasis on schizophrenia (SCZ). Selected SCZ GWAS association P values play the role of the primary data for AdaPT; single-nucleotide polymorphisms (SNPs) are selected because they are gene expression quantitative trait loci (eQTLs). This natural pairing of SNPs and genes allows us to map the following covariate values to these pairs: GWAS statistics from genetically correlated bipolar disorder, the effect size of SNP genotypes on gene expression, and gene–gene coexpression, captured by subnetwork (module) membership. In all, 24 covariates per SNP/gene pair were included in the AdaPT analysis using flexible gradient boosted trees. We demonstrate a substantial increase in power to detect SCZ associations using gene expression information from the developing human prefrontal cortex. We interpret these results in light of recent theories about the polygenic nature of SCZ. Importantly, our entire process for identifying enrichment and creating features with independent complementary data sources can be implemented in many different high-throughput settings to ultimately improve power.
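For context on the "simple multiple testing corrections" that the abstract contrasts with AdaPT, the sketch below implements the Benjamini-Hochberg step-up procedure, a widely used covariate-free method for false discovery rate control. It is not AdaPT and uses no auxiliary covariates; the function name and the example p-values are assumptions for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a rejection flag per hypothesis, controlling FDR at level alpha."""
    m = len(p_values)
    # Indices of hypotheses sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight SNP/gene pairs.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(p_values, alpha=0.05)
print(sum(flags), "of", len(p_values), "hypotheses rejected")
```

Every hypothesis is treated identically here, which is the limitation covariate-aware methods aim to address by using side information such as eQTL effect sizes.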
