Structure Learning for Hierarchical Regulatory Networks
Network analysis offers a powerful approach for modeling the relationships between genes within biological regulatory networks. Inference of biological network structures is often performed on high-dimensional data, yet it is hindered by the limited sample size of the high-throughput "omics" data typically available. To overcome this challenge, we exploit known organizing principles of biological networks: they are sparse, modular, and likely to share a large portion of their underlying architecture. We present SHINE (Structure Learning for Hierarchical Networks), a framework for defining data-driven structural constraints and incorporating a shared learning paradigm for efficiently learning multiple networks from high-dimensional data. We show through simulations that SHINE improves performance when relatively few samples are available and multiple networks are desired, both by reducing the complexity of the graphical search space and by taking advantage of shared structural information. We evaluated SHINE on TCGA Pan-Cancer data and found that the learned tumor-specific networks exhibit the expected graph properties of real biological networks, recapture previously validated interactions, and recapitulate findings in the literature. Application of SHINE to the analysis of subtype-specific breast cancer networks identified key genes and biological processes for tumor maintenance and survival, as well as potential therapeutic targets for modulating known breast cancer disease genes.