CSNet: Estimating cell-type-specific gene co-expression networks from bulk gene expression data
Inferring and characterizing gene co-expression networks has led to important insights into the molecular mechanisms and functional pathways in healthy and diseased individuals. Most co-expression analyses to date have been performed on gene expression data collected from bulk tissues whose cell type compositions vary across samples, so the resulting co-expression estimates are confounded by heterogeneity in cell type proportions. To address this limitation, we propose a flexible framework that estimates cell-type-specific gene co-expressions from bulk sample data without assuming that the cell-type-specific distributions of gene expression levels are known. To overcome the computational challenge of estimating covariances and correlations from a convolution of high-dimensional densities, we develop a novel thresholded least squares estimator, named CSNet, that is efficient to implement and has good theoretical properties. We further investigate the convergence rate of CSNet. The utility and efficacy of CSNet are demonstrated through simulation studies and an application to a gene co-expression study with bulk samples from Alzheimer's disease (AD) patients, where our analysis identified new cell-type-specific modules of AD risk genes.
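To make the idea of a thresholded least squares estimator concrete, the following is a minimal sketch under stated assumptions, not the CSNet implementation itself. It assumes that each bulk sample is a proportion-weighted sum of independent cell-type-specific expression vectors, so that the bulk mean is linear in the cell type proportions and the bulk covariance is linear in the squared proportions. The function name `thresholded_ls_coexpression`, the fixed hard threshold, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def thresholded_ls_coexpression(X, P, threshold=0.2):
    """Illustrative sketch (assumed model, not the published code).

    X: (n, p) bulk expression matrix, n samples by p genes.
    P: (n, K) cell type proportions for each sample.
    Returns a (K, p, p) array of hard-thresholded correlation estimates.
    """
    n, p = X.shape
    K = P.shape[1]

    # Step 1: least squares for cell-type-specific means, using E[X] = P @ mu.
    mu, *_ = np.linalg.lstsq(P, X, rcond=None)      # (K, p)
    R = X - P @ mu                                   # centered bulk data

    # Step 2: least squares for cell-type-specific covariances, using
    # E[R_j * R_j'] = sum_k P_k^2 * sigma^(k)_{jj'} under the assumed model.
    prods = (R[:, :, None] * R[:, None, :]).reshape(n, p * p)
    sigma, *_ = np.linalg.lstsq(P ** 2, prods, rcond=None)
    sigma = sigma.reshape(K, p, p)
    sigma = (sigma + sigma.transpose(0, 2, 1)) / 2   # enforce exact symmetry

    # Step 3: convert to correlations and zero out small entries.
    corr = np.empty_like(sigma)
    for k in range(K):
        # Guard against non-positive variance estimates.
        sd = np.sqrt(np.clip(np.diag(sigma[k]), 1e-12, None))
        c = np.clip(sigma[k] / np.outer(sd, sd), -1.0, 1.0)
        c[np.abs(c) < threshold] = 0.0               # hard thresholding
        np.fill_diagonal(c, 1.0)
        corr[k] = c
    return corr

# Hypothetical simulated data: 2 cell types, 4 genes, 500 bulk samples.
rng = np.random.default_rng(0)
n, p, K = 500, 4, 2
P = rng.dirichlet([2.0, 2.0], size=n)                # proportions sum to 1
cov1 = np.eye(p); cov1[0, 1] = cov1[1, 0] = 0.8      # genes 0,1 co-expressed
cov2 = np.eye(p); cov2[2, 3] = cov2[3, 2] = 0.8      # genes 2,3 co-expressed
X1 = rng.multivariate_normal(np.full(p, 5.0), cov1, size=n)
X2 = rng.multivariate_normal(np.full(p, 3.0), cov2, size=n)
X = P[:, [0]] * X1 + P[:, [1]] * X2                  # bulk mixture

est = thresholded_ls_coexpression(X, P, threshold=0.2)
print(np.round(est, 2))
```

In this sketch the threshold is a fixed constant purely for simplicity; in practice it would need to be chosen in a data-driven way, and a hard-thresholded estimate is not guaranteed to be positive semidefinite.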