Powerful Variance-Component TWAS method identifies novel and known risk genes for clinical and pathologic Alzheimer’s dementia phenotypes
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, existing TWAS methods impute gene expression as a weighted sum of SNP genotypes, with weights given by the corresponding cis-eQTL effects on the transcriptome estimated from reference datasets. Existing TWAS methods then apply a linear regression model to assess the association between imputed gene expression and the test phenotype, thereby assuming that the effect of a cis-eQTL SNP on the test phenotype is a linear function of the eQTL’s estimated effect on the reference transcriptome. Thus, existing TWAS methods make the strong assumption that cis-eQTL effect sizes on the reference transcriptome reflect the corresponding SNP effect sizes on the test phenotype. To increase TWAS robustness to this assumption, we propose a Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to the corresponding cis-eQTL effects in the reference dataset) rather than fixed. We show that VC-TWAS is consequently more powerful than traditional TWAS when cis-eQTL SNP effects on the test phenotype truly differ from their eQTL effects within the reference dataset. We further applied VC-TWAS, using cis-eQTL effect sizes estimated by a nonparametric Bayesian method, to study Alzheimer’s dementia (AD) related phenotypes and detected 13 genes significantly associated with AD, including 6 known GWAS risk loci. All significant loci are proximal to APOE, the major known risk locus for AD. Further, we have added this VC-TWAS function to our previously developed tool TIGAR for public use.
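
To make the contrast between the two test constructions concrete, the following minimal Python sketch compares a burden-style statistic (a weighted sum of genotypes followed by a linear score) with a variance-component score statistic. It is an illustration under stated assumptions, not the TIGAR implementation: the function names, the toy genotype matrix `G`, the weights `w`, and the phenotype `y` are all hypothetical, the variance of each SNP effect is assumed proportional to the squared cis-eQTL weight, covariate adjustment is reduced to centering the phenotype, and p-value computation is omitted.

```python
import numpy as np

def impute_expression(G, w):
    """Burden-style imputation: weighted sum of genotypes using cis-eQTL weights."""
    return G @ w

def burden_statistic(G, w, y):
    """Squared score for a linear regression of y on imputed expression.

    The per-SNP scores are combined with signed weights, so SNPs whose
    phenotype effects disagree in sign with their eQTL weights can cancel.
    """
    r = y - y.mean()                      # residual under the null (intercept only)
    return float((r @ impute_expression(G, w)) ** 2)

def vc_statistic(G, w, y):
    """Variance-component score statistic Q = r' G diag(w^2) G' r.

    Assumption for this sketch: each SNP effect is random with mean zero and
    variance proportional to its squared cis-eQTL weight.
    """
    r = y - y.mean()
    per_snp_score = G.T @ r               # one score per SNP
    return float(np.sum((w * per_snp_score) ** 2))

# Toy data (hypothetical): 4 samples, 2 SNPs with eQTL weights of opposite sign.
G = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [2.0, 1.0],
              [1.0, 2.0]])
w = np.array([0.5, -0.5])
y = np.array([0.0, 1.0, 2.0, 3.0])

print("burden statistic:", burden_statistic(G, w, y))
print("variance-component statistic:", vc_statistic(G, w, y))
```

In this toy example both SNPs are positively associated with the phenotype while their eQTL weights have opposite signs, so the signed contributions cancel in the burden statistic, whereas the variance-component statistic squares each weighted score and remains positive. This is the kind of mismatch between reference eQTL effects and test phenotype effects that the abstract describes.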