Powerful Variance-Component TWAS method identifies novel and known risk genes for clinical and pathologic Alzheimer’s dementia phenotypes

2020 ◽  
Author(s):  
Shizhen Tang ◽  
Aron S. Buchman ◽  
Philip L. De Jager ◽  
David A. Bennett ◽  
Michael P. Epstein ◽  
...  

Abstract: Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, existing TWAS methods impute gene expression by creating a weighted sum that aggregates SNPs with their corresponding cis-eQTL effects on the transcriptome estimated from reference datasets. Existing TWAS methods then apply a linear regression model to assess the association between imputed gene expression and the test phenotype, thereby assuming the effect of a cis-eQTL SNP on the test phenotype is a linear function of the eQTL’s estimated effect on the reference transcriptome. Thus, existing TWAS methods make a strong assumption that cis-eQTL effect sizes on the reference transcriptome are reflective of their corresponding SNP effect sizes on the test phenotype. To increase TWAS robustness to this assumption, we propose a Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to corresponding cis-eQTL effects in the reference dataset) rather than fixed. By doing so, we show VC-TWAS is more powerful than traditional TWAS when cis-eQTL SNP effects on the test phenotype truly differ from their eQTL effects within the reference dataset. We further applied VC-TWAS using cis-eQTL effect sizes estimated by a nonparametric Bayesian method to study Alzheimer’s dementia (AD) related phenotypes and detected 13 genes significantly associated with AD, including 6 known GWAS risk loci. All significant loci are proximal to APOE, the major known risk locus for AD. Further, we have added this VC-TWAS function to our previously developed tool TIGAR for public use.

PLoS Genetics ◽  
2021 ◽  
Vol 17 (4) ◽  
pp. e1009482
Author(s):  
Shizhen Tang ◽  
Aron S. Buchman ◽  
Philip L. De Jager ◽  
David A. Bennett ◽  
Michael P. Epstein ◽  
...  

Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, traditional two-stage TWAS methods first impute gene expression by creating a weighted sum that aggregates SNPs with their corresponding cis-eQTL effects on reference transcriptome. Traditional TWAS methods then employ a linear regression model to assess the association between imputed gene expression and test phenotype, thereby assuming the effect of a cis-eQTL SNP on test phenotype is a linear function of the eQTL’s estimated effect on reference transcriptome. To increase TWAS robustness to this assumption, we propose a novel Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to corresponding reference cis-eQTL effects) rather than fixed. VC-TWAS is applicable to both continuous and dichotomous phenotypes, as well as individual-level and summary-level GWAS data. Using simulated data, we show VC-TWAS is more powerful than traditional TWAS methods based on a two-stage Burden test, especially when eQTL genetic effects on test phenotype are no longer a linear function of their eQTL genetic effects on reference transcriptome. We further applied VC-TWAS to both individual-level (N = ~3.4K) and summary-level (N = ~54K) GWAS data to study Alzheimer’s dementia (AD). With the individual-level data, we detected 13 significant risk genes including 6 known GWAS risk genes such as TOMM40 that were missed by traditional TWAS methods. With the summary-level data, we detected 57 significant risk genes considering only cis-SNPs and 71 significant genes considering both cis- and trans- SNPs, which also validated our findings with the individual-level GWAS data. Our VC-TWAS method is implemented in the TIGAR tool for public use.
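The contrast between a two-stage Burden test and a variance-component test can be illustrated with a minimal sketch. It assumes a SKAT-style score statistic with per-SNP weights equal to the squared reference eQTL effects and uses a permutation p-value for simplicity; the function names are hypothetical and this is not the TIGAR implementation, which may differ in both the statistic and how its null distribution is evaluated.

```python
import numpy as np

def burden_statistic(y, G, w):
    """Two-stage (Burden) score: impute expression as G @ w, then score it against y."""
    yc = y - y.mean()                  # centered phenotype
    imputed_expression = G @ w         # weighted sum of SNP genotypes
    return float((imputed_expression @ yc) ** 2)

def vc_statistic(y, G, w):
    """Assumed SKAT-style variance-component score: sum_j w_j^2 * (g_j' (y - ybar))^2."""
    yc = y - y.mean()
    per_snp_scores = G.T @ yc          # one score per SNP
    return float(np.sum((w ** 2) * per_snp_scores ** 2))

def permutation_pvalue(stat_fn, y, G, w, n_perm=999, seed=0):
    """Permutation p-value (an illustrative stand-in for an analytic null distribution)."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(y, G, w)
    exceed = sum(stat_fn(rng.permutation(y), G, w) >= observed for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)

# Toy case: two SNPs with equal eQTL weights but opposite effects on the phenotype.
G_toy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
y_toy = np.array([1.0, -1.0, 0.0, 0.0])
w_toy = np.array([1.0, 1.0])

print(burden_statistic(y_toy, G_toy, w_toy))  # 0.0: opposite-signed effects cancel
print(vc_statistic(y_toy, G_toy, w_toy))      # 2.0: squared per-SNP scores do not cancel
```

In the toy case the SNP effects on the phenotype are not a linear function of the eQTL weights, so the Burden score collapses to zero while the variance-component score keeps the signal. This is the situation in which the abstract reports a power gain.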


Author(s):  
Arjun Bhattacharya ◽  
Yun Li ◽  
Michael I. Love

ABSTRACT: Traditional predictive models for transcriptome-wide association studies (TWAS) consider only single nucleotide polymorphisms (SNPs) local to genes of interest and perform parameter shrinkage with a regularization process. These approaches ignore the effect of distal-SNPs or other molecular effects underlying the SNP-gene association. Here, we outline multi-omics strategies for transcriptome imputation from germline genetics to allow more powerful testing of gene-trait associations by prioritizing distal-SNPs to the gene of interest. In one extension, we identify mediating biomarkers (CpG sites, microRNAs, and transcription factors) highly associated with gene expression and train predictive models for these mediators using their local SNPs. Imputed values for mediators are then incorporated into the final predictive model of gene expression, along with local SNPs. In the second extension, we assess distal-eQTLs (SNPs associated with genes not in a local window around them) for their mediation effect through mediating biomarkers local to these distal-eSNPs. Distal-eSNPs with large indirect mediation effects are then included in the transcriptomic prediction model with the local SNPs around the gene of interest. Using simulations and real data from ROS/MAP brain tissue and TCGA breast tumors, we show considerable gains of percent variance explained (1-2% additive increase) of gene expression and TWAS power to detect gene-trait associations. This integrative approach to transcriptome-wide imputation and association studies aids in identifying the complex interactions underlying genetic regulation within a tissue and important risk genes for various traits and disorders.
AUTHOR SUMMARY: Transcriptome-wide association studies (TWAS) are a powerful strategy to study gene-trait associations by integrating genome-wide association studies (GWAS) with gene expression datasets. TWAS increases study power and interpretability by mapping genetic variants to genes. However, traditional TWAS consider only variants that are close to a gene and thus ignore important variants far away from the gene that may be involved in complex regulatory mechanisms. Here, we present MOSTWAS (Multi-Omic Strategies for TWAS), a suite of tools that extends the TWAS framework to include these distal variants. MOSTWAS leverages multi-omic data of regulatory biomarkers (transcription factors, microRNAs, epigenetics) and borrows from techniques in mediation analysis to prioritize distal variants that are around these regulatory biomarkers. Using simulations and real public data from brain tissue and breast tumors, we show that MOSTWAS improves upon traditional TWAS in both predictive performance and power to detect gene-trait associations. MOSTWAS also aids in identifying possible mechanisms for gene regulation using a novel added-last test that assesses the added information gained from the distal variants beyond the local association. In conclusion, our method aids in detecting important risk genes for traits and disorders and the possible complex interactions underlying genetic regulation within a tissue.


2010 ◽  
Vol 31 (11) ◽  
pp. 1835-1842 ◽  
Author(s):  
Megan Szymanski ◽  
Ruihua Wang ◽  
M. Danielle Fallin ◽  
Susan S. Bassett ◽  
Dimitrios Avramopoulos

2021 ◽  
Author(s):  
Roshni A. Patel ◽  
Shaila A. Musharoff ◽  
Jeffrey P. Spence ◽  
Harold Pimentel ◽  
Catherine Tcheandjieu ◽  
...  

Despite the growing number of genome-wide association studies (GWAS) for complex traits, it remains unclear whether effect sizes of causal genetic variants differ between populations. In principle, effect sizes of causal variants could differ between populations due to gene-by-gene or gene-by-environment interactions. However, comparing causal variant effect sizes is challenging: it is difficult to know which variants are causal, and comparisons of variant effect sizes are confounded by differences in linkage disequilibrium (LD) structure between ancestries. Here, we develop a method to assess causal variant effect size differences that overcomes these limitations. Specifically, we leverage the fact that segments of European ancestry shared between European-American and admixed African-American individuals have similar LD structure, allowing for unbiased comparisons of variant effect sizes in European ancestry segments. We apply our method to two types of traits: gene expression and low-density lipoprotein cholesterol (LDL-C). We find that causal variant effect sizes for gene expression are significantly different between European-Americans and African-Americans; for LDL-C, we observe a similar point estimate although this is not significant, likely due to lower statistical power. Cross-population differences in variant effect sizes highlight the role of genetic interactions in trait architecture and will contribute to the poor portability of polygenic scores across populations, reinforcing the importance of conducting GWAS on individuals of diverse ancestries and environments.


2020 ◽  
Author(s):  
Janet C. Harwood ◽  
Ganna Leonenko ◽  
Rebecca Sims ◽  
Valentina Escott-Price ◽  
Julie Williams ◽  
...  

Abstract: More than 50 genetic loci have been identified as being associated with Alzheimer’s disease (AD) from genome-wide association studies (GWAS), and many of these are involved in immune pathways and lipid metabolism. Therefore, we performed a transcriptome-wide association study (TWAS) of immune-relevant cells to study the mis-regulation of genes implicated in AD. We used expression and genetic data from naive and induced CD14+ monocytes and two GWAS of AD to study genetically controlled gene expression in monocytes at different stages of differentiation and compared the results with those from TWAS of brain and blood. We identified nine genes with statistically independent TWAS signals: seven are known AD risk genes from GWAS (BIN1, PTK2B, SPI1, MS4A4A, MS4A6E, APOE and PVR) and two, LACTB2 and PLIN2/ADRP, are novel candidate genes for AD. Three genes, SPI1, PLIN2 and LACTB2, are TWAS significant specifically in monocytes. LACTB2 is a mitochondrial endoribonuclease and PLIN2/ADRP associates with intracellular neutral lipid storage droplets (LSDs), which have been shown to play a role in the regulation of the immune response. Notably, LACTB2 and PLIN2 were not detected from GWAS alone.


2015 ◽  
Author(s):  
Eric R Gamazon ◽  
Heather E Wheeler ◽  
Kaanan Shah ◽  
Sahar V Mozaffari ◽  
Keston Aquino-Michaels ◽  
...  

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual's genetic profile and correlates the “imputed” gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. The genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome datasets. PrediXcan enjoys the benefits of gene-based approaches such as reduced multiple testing burden, more comprehensive annotation of gene function compared to that derived from single variants, and a principled approach to the design of follow-up experiments while also integrating knowledge of regulatory function. Since no actual expression data are used in the analysis of GWAS data - only in silico expression - reverse causality problems are largely avoided. PrediXcan harnesses reference transcriptome data for disease mapping studies. Our results demonstrate that PrediXcan can detect known and novel genes associated with disease traits and provide insights into the mechanism of these associations.
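The train, impute, and correlate workflow described above can be sketched in a few lines. The example below is a hedged illustration only: it swaps in closed-form ridge regression for the prediction model because that needs only NumPy, the abstract does not specify the model at this level of detail, and all function names are hypothetical rather than part of the PrediXcan software.

```python
import numpy as np

def fit_ridge_weights(G_ref, expr_ref, lam=1.0):
    """Fit SNP weights on a reference panel with ridge regression (illustrative stand-in)."""
    Gc = G_ref - G_ref.mean(axis=0)    # center genotypes
    ec = expr_ref - expr_ref.mean()    # center measured expression
    n_snps = Gc.shape[1]
    return np.linalg.solve(Gc.T @ Gc + lam * np.eye(n_snps), Gc.T @ ec)

def impute_expression(G, weights):
    """Genetically regulated expression: weighted sum of genotypes in the GWAS cohort."""
    return G @ weights

def association(imputed, phenotype):
    """Pearson correlation between imputed expression and the phenotype."""
    return float(np.corrcoef(imputed, phenotype)[0, 1])

# Simulated reference panel (genotypes plus expression) and GWAS cohort (genotypes plus phenotype).
rng = np.random.default_rng(42)
true_w = np.array([0.6, -0.4, 0.0, 0.3])
G_ref = rng.binomial(2, 0.3, size=(300, 4)).astype(float)
expr_ref = G_ref @ true_w + rng.normal(scale=0.5, size=300)
G_gwas = rng.binomial(2, 0.3, size=(1000, 4)).astype(float)
pheno = 0.5 * (G_gwas @ true_w) + rng.normal(size=1000)

weights = fit_ridge_weights(G_ref, expr_ref, lam=1.0)
imputed = impute_expression(G_gwas, weights)
print(association(imputed, pheno))
```

Only genotypes are needed in the GWAS cohort. Expression is measured once in the reference panel and reused through the fitted weights.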


2021 ◽  
Author(s):  
Sihan Liu ◽  
Yu Chen ◽  
Feiran Wang ◽  
Yi Jiang ◽  
Fangyuan Duan ◽  
...  

Abstract: Understanding the genetic architecture of gene expression and splicing in human brain is critical to unlocking the mechanisms of complex neuropsychiatric disorders like schizophrenia (SCZ). Large-scale brain transcriptomic studies are based primarily on populations of European (EUR) ancestry. The uniformity of mono-racial resources may limit important insights into the disease etiology. Here, we characterized brain transcriptional regulatory architecture of East Asians (EAS; n=151), identifying 3,278 expression quantitative trait loci (eQTL) and 4,726 spliceQTL (sQTL). Comparing these to PsychENCODE/BrainGVEX confirmed our hypothesis that the transcriptional regulatory architecture in EAS and EUR brains aligns. Furthermore, distinctive allelic frequency and linkage disequilibrium impede QTL translation and gene-expression prediction accuracy. Integration of eQTL/sQTL with genome-wide association studies reveals common and novel SCZ risk genes. Pathway-based analyses showing shared SCZ biology point to synaptic and GTPase dysfunction as a prospective pathogenesis. This study elucidates the transcriptional landscape of the EAS brain and emphasizes an essential convergence between EAS and EUR populations.


2021 ◽  
Author(s):  
Yan Lv ◽  
Yukuan Huang ◽  
Xuejun Xu ◽  
Zhiwei Wang ◽  
Yunlong Ma ◽  
...  

Oral cavity cancer (OCC) is one of the most common carcinomas. Recent genome-wide association studies (GWAS) have reported numerous genetic variants associated with OCC susceptibility. However, the regulatory mechanisms of these genetic variants underlying OCC remain largely unclear. By combining GWAS summary statistics (N = 4,151) with expression quantitative trait loci (eQTL) across 49 different tissues from the GTEx database, we performed an integrative genomics analysis to uncover novel risk genes associated with OCC. By leveraging various computational methods based on multi-omics data, risk genes were prioritized as promising candidate genes for drug repurposing in OCC. Using two independent computational algorithms, we found 14 risk genes whose genetically modulated expression showed a notable association with OCC. Among them, nine genes were newly identified, such as IRF4 (P = 2.5 × 10⁻⁹ and P = 1.06 × 10⁻⁴), TNS3 (P = 1.44 × 10⁻⁶ and P = 4.45 × 10⁻³), ZFP90 (P = 2.37 × 10⁻⁶ and P = 2.93 × 10⁻⁴), and DRD2 (P = 2.0 × 10⁻⁵ and P = 6.12 × 10⁻³). These 14 genes were significantly overrepresented in several cancer-related terms, and 10 of 14 genes were enriched in 10 potential druggable gene categories. Based on differential gene expression analysis, the majority of these genes (71.43%) showed marked differential expression between OCC patients and paracancerous controls. Integrating multi-omics-based evidence from genetics, eQTL, and gene expression, we found that the novel risk gene IRF4 had the highest-ranked risk score for OCC. Survival analysis showed that dysregulation of IRF4 expression was significantly associated with cancer patient outcomes (P = 8.1 × 10⁻⁵). In summary, we prioritized 14 OCC-associated genes, including nine novel risk genes (notably IRF4), providing a drug repurposing resource for developing therapeutic drugs for oral cancer.


2019 ◽  
Author(s):  
Paulo Czarnewski ◽  
Sara M. Parigi ◽  
Chiara Sorini ◽  
Oscar E. Diaz ◽  
Srustidhar Das ◽  
...  

Abstract: Despite the fact that ulcerative colitis (UC) patients show heterogeneous clinical manifestation and diverse response to biological therapies, all UC patients are classified as one group. Therefore, there is a lack of tailored therapies. In order to design these, an unsupervised molecular re-classification of UC patients is needed. Classical clustering approaches based on tissue transcriptomic data were not able to classify UC patients into subgroups, likely due to associated covariates. In addition, while genome-wide association studies (GWAS) have identified potential new target genes, their temporal dynamics, which would reveal the optimal therapeutic window, remain to be elucidated. To overcome these limitations, we generated time-series transcriptome data from a mouse model of colitis, which was then cross-compared with human datasets. This allowed us to visualize IBD-risk gene expression kinetics and reveal that the expression of the majority of IBD-risk genes peaks during the inflammatory phase, and not the recovery phase. Moreover, by restricting the analysis to the most differentially expressed genes shared between mouse and human, we were able to cluster UC patients into two subgroups, termed UC1 and UC2. We found that UC1 patients expressed higher levels of genes involved in neutrophil recruitment, activation and degranulation compared to UC2. Of note, we found that over 87% of UC1 patients failed to respond to two of the most widely used biological therapies for UC. This study serves as a proof of concept that cross-species comparison of gene expression profiles enables the temporal annotation of disease-associated gene expression and the stratification of patients so far considered molecularly indistinguishable.

