Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution
ABSTRACT
Association studies on omic-level data other than genotypes, namely epigenome- and transcriptome-wide association studies (EWAS/TWAS), are becoming increasingly common. However, a toolbox for the analysis of EWASs and TWASs is largely lacking, and approaches from genome-wide association studies (GWAS) are often applied even though epigenome and transcriptome data have very different characteristics than genotypes. Here, we show that EWASs and TWASs are prone not only to significant inflation but also to bias of the test statistics, and that neither is properly addressed by GWAS-based methodology (i.e., genomic control) or by state-of-the-art approaches to control for unmeasured confounding (i.e., RUV, sva and cate). We developed a novel approach based on estimating the empirical null distribution using Bayesian statistics. Using simulation studies and empirical data, we demonstrate that our approach maximizes power while properly controlling the false positive rate. Finally, we illustrate the utility of our method in meta-analysis by performing EWASs and TWASs on age and smoking, which highlighted an overlap in differential methylation and expression of associated genes. An implementation of our new method is available from http://bioconductor.org/packages/bacon/.
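To make the distinction between bias and inflation concrete, the following minimal Python sketch simulates test statistics whose null component is both shifted and too wide, then estimates the location and scale of the empirical null. It is not the Bayesian method implemented in bacon: as a simple stand-in it uses the median and the scaled median absolute deviation (MAD). The function names, the simulated proportions and the effect sizes are all illustrative assumptions.

```python
import numpy as np


def empirical_null_robust(z):
    """Estimate the empirical null's location (bias) and scale (inflation).

    Assumption: median/MAD is used here as a simple robust stand-in;
    it is NOT the Bayesian estimation described in the abstract.
    """
    z = np.asarray(z, dtype=float)
    bias = np.median(z)
    # 1.4826 rescales the MAD to the standard deviation of a normal distribution.
    inflation = 1.4826 * np.median(np.abs(z - bias))
    return bias, inflation


def genomic_inflation_factor(z):
    """Genomic-control lambda: median(z^2) over the median of a chi-square(1)."""
    z = np.asarray(z, dtype=float)
    return np.median(z ** 2) / 0.4549364


# Hypothetical simulation: 95% null tests with bias 0.2 and inflation 1.3,
# plus 5% true associations centred at z = 4.
rng = np.random.default_rng(0)
z = np.concatenate([
    rng.normal(loc=0.2, scale=1.3, size=9500),
    rng.normal(loc=4.0, scale=1.0, size=500),
])

bias, inflation = empirical_null_robust(z)
lam = genomic_inflation_factor(z)

# Correct the statistics for both bias (shift) and inflation (scale).
z_corrected = (z - bias) / inflation

print(f"estimated bias: {bias:.3f}")
print(f"estimated inflation: {inflation:.3f}")
print(f"genomic inflation factor (lambda): {lam:.3f}")
```

Genomic control summarizes the statistics with a single scale factor and therefore cannot represent a shift in location, whereas an empirical-null estimate has a separate term for each. In this sketch the true associations pull both robust estimates slightly upward, which is one reason a model that treats null and non-null statistics as separate components may be preferable.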