Causal integration of multi-omics data with prior knowledge to generate mechanistic hypotheses

Author(s):  
Aurelien Dugourd ◽  
Christoph Kuppe ◽  
Marco Sciacovelli ◽  
Enio Gjerga ◽  
Kristina B. Emdal ◽  
...  

Abstract: Multi-omics datasets can provide molecular insights beyond the sum of individual omics. Diverse tools have recently been developed to integrate such datasets, but there are limited strategies to systematically extract mechanistic hypotheses from them. Here, we present COSMOS (Causal Oriented Search of Multi-Omics Space), a method that integrates phosphoproteomics, transcriptomics, and metabolomics datasets. COSMOS combines extensive prior knowledge of signaling, metabolic, and gene regulatory networks with computational methods to estimate activities of transcription factors and kinases as well as network-level causal reasoning. COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets. We applied COSMOS to a dataset comprising transcriptomics, phosphoproteomics, and metabolomics data from healthy and cancerous tissue from nine renal cell carcinoma patients. We used COSMOS to generate novel hypotheses such as the impact of Androgen Receptor on nucleoside metabolism and the influence of the JAK-STAT pathway on propionyl coenzyme A production. We expect that our freely available method will be broadly useful to extract mechanistic insights from multi-omics studies.

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Neel Patel ◽  
William S. Bush

Abstract
Background: Transcriptional regulation is complex, requiring multiple cis-acting (local) and trans-acting mechanisms working in concert to drive gene expression, with disruption of these processes linked to multiple diseases. Previous computational attempts to understand the influence of regulatory mechanisms on gene expression have used prediction models containing input features derived from cis-regulatory factors. However, local chromatin looping and trans-acting mechanisms are known to also influence transcriptional regulation, and their inclusion may improve model accuracy and interpretation. In this study, we create a general model of transcription factor influence on gene expression by incorporating both cis and trans gene regulatory features.
Results: We describe a computational framework to model gene expression for the GM12878 and K562 cell lines. This framework weights the impact of transcription factor-based regulatory data using multi-omics gene regulatory networks to account for both cis-acting and trans-acting mechanisms, and measures of the local chromatin context. These prediction models perform significantly better than models containing cis-regulatory features alone. Models that additionally integrate long-distance chromatin interactions (or chromatin looping) between distal transcription factor binding regions and gene promoters also show improved accuracy. As a demonstration of their utility, effect estimates from these models were used to weight cis-regulatory rare variants for sequence kernel association test analyses of gene expression.
Conclusions: Our models generate refined effect estimates for the influence of individual transcription factors on gene expression, allowing characterization of their roles across the genome. This work also provides a framework for integrating multiple data types into a single model of transcriptional regulation.


2020 ◽  
Author(s):  
Mufang Ying ◽  
Peter Rehani ◽  
Panagiotis Roussos ◽  
Daifeng Wang

Abstract: Strong phenotype-genotype associations have been reported across brain diseases. However, understanding the underlying gene regulatory mechanisms remains challenging, especially at the cellular level. To address this, we integrated multi-omics data from the human brain at cellular resolution (cell-type chromatin interactions, epigenomics and single-cell transcriptomics) and predicted gene regulatory networks for individual cell types (e.g., excitatory and inhibitory neurons, microglia, oligodendrocytes), linking transcription factors, distal regulatory elements and target genes. Using these cell-type networks and disease risk variants, we further identified the cell-type disease genes and regulatory networks for schizophrenia and Alzheimer’s disease. The cell-type regulatory elements (e.g., enhancers) in the networks were also found to be potential pleiotropic regulatory loci for a variety of diseases. Further enrichment analyses, including gene ontology and KEGG pathways, revealed potential novel cross-disease and disease-specific molecular functions, advancing knowledge of the interplay among genetic, transcriptional and epigenetic risks at cellular resolution between neurodegenerative and neuropsychiatric diseases. Finally, we summarized our computational analyses as a general-purpose pipeline for predicting gene regulatory networks via multi-omics data.


2018 ◽  
Vol 15 (138) ◽  
pp. 20170809 ◽  
Author(s):  
Zhipeng Wang ◽  
Davit A. Potoyan ◽  
Peter G. Wolynes

Gene regulatory networks must relay information from extracellular signals to downstream genes in an efficient, timely and coherent manner. Many complex functional tasks, such as the immune response, require system-wide broadcasting of information not to one but to many genes carrying out distinct functions, whose dynamical binding and unbinding characteristics are widely distributed. In such broadcasting networks, the intended target sites are also often dwarfed in number by the even more numerous non-functional binding sites. Taking the genetic regulatory network of NF-κB as an exemplary system, we explore the impact of having numerous distributed sites on the stochastic dynamics of oscillatory broadcasting genetic networks, pointing out how resonances in binding cycles control the network's specificity and performance. We also show that active kinetic regulation of binding and unbinding through molecular stripping of DNA-bound transcription factors can lead to a higher coherence of gene co-expression and synchronous clearance.


2018 ◽  
Vol 17 (4) ◽  
pp. 246-254 ◽  
Author(s):  
Mark W E J Fiers ◽  
Liesbeth Minnoye ◽  
Sara Aibar ◽  
Carmen Bravo González-Blas ◽  
Zeynep Kalender Atak ◽  
...  

Author(s):  
Adriano V Werhli ◽  
Dirk Husmeier

There have been various attempts to reconstruct gene regulatory networks from microarray expression data in the past. However, owing to the limited number of independent experimental conditions and the noise inherent in the measurements, the results have been rather modest so far. For this reason it seems advisable to include biological prior knowledge, related, for instance, to transcription factor binding locations in promoter regions or partially known signalling pathways from the literature. In the present paper, we consider a Bayesian approach to systematically integrate expression data with multiple sources of prior knowledge. Each source is encoded via a separate energy function, from which a prior distribution over network structures in the form of a Gibbs distribution is constructed. The hyperparameters associated with the different sources of prior knowledge, which measure the influence of the respective prior relative to the data, are sampled from the posterior distribution with MCMC. We have evaluated the proposed scheme on the yeast cell cycle and the Raf signalling pathway. Our findings quantify to what extent the inclusion of independent prior knowledge improves the network reconstruction accuracy, and the values of the hyperparameters inferred with the proposed scheme were found to be close to optimal with respect to minimizing the reconstruction error.
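
The Gibbs-type prior over network structures mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the energy function, so the form used here (summed absolute deviation between a candidate adjacency matrix and a prior-knowledge matrix with entries in [0, 1]) is an assumption, as are the function names and the toy matrices. The normalising constant and the MCMC sampling of the hyperparameters are omitted.

```python
import numpy as np


def energy(graph, prior_matrix):
    """Assumed energy: total absolute disagreement between a binary
    adjacency matrix and a prior-knowledge matrix with entries in [0, 1]."""
    return float(np.abs(prior_matrix - graph).sum())


def log_gibbs_prior_unnormalized(graph, prior_sources, betas):
    """Unnormalised log prior: minus the sum over sources of beta_k * E_k(graph).

    Each source of prior knowledge gets its own energy and its own
    hyperparameter beta_k (hypothetical names), which scales that
    source's influence.
    """
    return -sum(beta * energy(graph, prior)
                for prior, beta in zip(prior_sources, betas))


# Toy example with three genes; every value below is illustrative only.
candidate = np.array([[0, 1, 0],
                      [0, 0, 1],
                      [0, 0, 0]])

binding_prior = np.full((3, 3), 0.5)   # 0.5 means "no prior information"
binding_prior[0, 1] = 0.9              # edge 0 -> 1 supported by binding data

pathway_prior = np.full((3, 3), 0.5)
pathway_prior[1, 2] = 0.8              # edge 1 -> 2 supported by the literature
pathway_prior[0, 2] = 0.1              # edge 0 -> 2 considered unlikely

betas = [2.0, 1.0]                     # hypothetical hyperparameter values
score = log_gibbs_prior_unnormalized(
    candidate, [binding_prior, pathway_prior], betas)
print(f"unnormalised log prior: {score:.2f}")
```

Under this assumed form, a candidate network that agrees with well-supported prior edges has lower energy and therefore higher prior probability, and setting a hyperparameter to zero switches the corresponding source off entirely.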


2021 ◽  
Author(s):  
Deborah Weighill ◽  
Marouen Ben Guebila ◽  
Kimberly Glass ◽  
John Quackenbush ◽  
John Platig

Abstract: The majority of disease-associated genetic variants are thought to have regulatory effects, including the disruption of transcription factor (TF) binding and the alteration of downstream gene expression. Identifying how a person’s genotype affects their individual gene regulatory network has the potential to provide important insights into disease etiology and to enable improved genotype-specific disease risk assessments and treatments. However, the impact of genetic variants is generally not considered when constructing gene regulatory networks. To address this unmet need, we developed EGRET (Estimating the Genetic Regulatory Effect on TFs), which infers a genotype-specific gene regulatory network (GRN) for each individual in a study population by using message passing to integrate genotype-informed TF motif predictions - derived from individual genotype data, the predicted effects of variants on TF binding and gene expression, and TF motif predictions - with TF protein-protein interactions and gene expression. Comparing EGRET networks for two blood-derived cell lines identified genotype-associated cell-line specific regulatory differences which were subsequently validated using allele-specific expression, chromatin accessibility QTLs, and differential TF binding from ChIP-seq. In addition, EGRET GRNs for three cell types across 119 individuals captured regulatory differences associated with disease in a cell-type-specific manner. Our analyses demonstrate that EGRET networks can capture the impact of genetic variants on complex phenotypes, supporting a novel fine-scale stratification of individuals based on their genetic background. EGRET is available through the Network Zoo R package (netZooR v0.9; netzoo.github.io).

