Accurate differential analysis of transcription factor activity from gene expression

2019, Vol 35 (23), pp. 5018-5029
Author(s): Viren Amin, Didem Ağaç, Spencer D Barnes, Murat Can Çobanoğlu

Abstract
Motivation: The activity of transcriptional regulators is crucial in elucidating the mechanisms of phenotypes. However, regulatory activity hypotheses are difficult to test experimentally. We therefore need accurate and reliable computational methods for regulator activity inference. There is extensive work in this area; however, current methods have difficulty with one or more of the following: resolving the activity of TFs with overlapping regulons, reflecting known regulatory relationships, or flexibly modeling TF activity over the regulon.
Results: We present Effector and Perturbation Estimation Engine (EPEE), a method for differential analysis of transcription factor (TF) activity from gene expression data. EPEE addresses each of these principal challenges in the field. Firstly, EPEE collectively models all TF activity in a single multivariate model, thereby accounting for the intrinsic coupling among TFs that share targets, which is very common. Secondly, EPEE incorporates context-specific TF-gene regulatory networks and therefore adapts the analysis to each biological context. Finally, EPEE can flexibly reflect different regulatory activity of a single TF among its potential targets. This flexibility allows it to implicitly recover other regulatory influences such as co-activators or repressors. We comparatively validated EPEE in 15 datasets from three well-studied contexts, namely immunology, cancer, and hematopoiesis. We show that addressing the aforementioned challenges enables EPEE to outperform alternative methods and reliably produce accurate results.
Availability and implementation: https://github.com/Cobanoglu-Lab/EPEE.
Supplementary information: Supplementary data are available at Bioinformatics online.
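The coupling among TFs that share targets can be illustrated with a toy calculation. The sketch below is not EPEE's formulation; it is a minimal least-squares example under assumed inputs (a hypothetical binary network `W`, a noiseless expression change `y`, and made-up regulon sizes). It contrasts scoring each TF on its own with fitting all TFs in one multivariate model.

```python
import numpy as np

# Hypothetical toy network: 9 genes, 2 TFs whose regulons overlap on genes 3-5.
n_genes = 9
W = np.zeros((n_genes, 2))
W[0:6, 0] = 1.0  # TF1 regulates genes 0..5
W[3:9, 1] = 1.0  # TF2 regulates genes 3..8

# Assumed ground truth: only TF1 changes activity between conditions.
true_activity = np.array([1.0, 0.0])
y = W @ true_activity  # noiseless per-gene expression change

# One-TF-at-a-time scoring: regress y on each regulon column separately.
separate = np.array([W[:, k] @ y / (W[:, k] @ W[:, k]) for k in range(2)])

# Joint scoring: a single multivariate least-squares fit over all TFs.
joint, *_ = np.linalg.lstsq(W, y, rcond=None)

print("separate:", separate)
print("joint:   ", joint)
```

In this toy setting the separate fit assigns TF2 a spurious activity of 0.5, purely because half of its regulon is shared with TF1, while the joint fit attributes the whole change to TF1.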

2018
Author(s): Viren Amin, Murat Can Cobanoglu

Abstract: We present EPEE (Effector and Perturbation Estimation Engine), a method for differential analysis of transcription factor (TF) activity from gene expression data. EPEE addresses two principal challenges in the field, namely incorporating context-specific TF-gene regulatory networks, and accounting for the fact that TF activity inference is intrinsically coupled for all TFs that share targets. Our validations in well-studied immune and cancer contexts show that addressing the overlap challenge and using state-of-the-art regulatory networks enable EPEE to consistently produce accurate results. (Accessible at: https://github.com/Cobanoglu-Lab/EPEE)


2017
Author(s): Yijie Wang, Dong-Yeon Cho, Hangnoh Lee, Justin Fear, Brian Oliver, ...

Abstract: Understanding gene regulation is a fundamental step towards understanding how cells function and respond to environmental cues and perturbations. An important step in this direction is the ability to infer the transcription factor (TF)-gene regulatory network (GRN). However, gene regulatory networks are typically constructed disregarding the fact that regulatory programs are conditioned on tissue type, developmental stage, sex, and other factors. Because they lack biological context specificity, these context-agnostic networks may not provide insight into the precise actions of genes in the specific biological system of interest. Collecting the multitude of features required for reliable construction of GRNs, such as physical features (TF binding, chromatin accessibility) and functional features (correlation of expression or chromatin patterns), for every context of interest is costly. Therefore, we need methods that are able to use knowledge about a context-agnostic network (or a network constructed in a related context) to construct a context-specific regulatory network.
To address this challenge, we developed a computational approach that uses expression data obtained in a specific biological context (such as a particular developmental stage, sex, or tissue type) together with a GRN constructed in a different but related context (alternatively, an incomplete or noisy network for the same context) to construct a context-specific GRN. Our method, NetREX, is inspired by network component analysis (NCA), which estimates TF activities and their influences on target genes given a predetermined topology of a TF-gene network. To predict a network under a different condition, NetREX removes the restriction that the topology of the TF-gene network is fixed and allows edges to be added to and removed from that network. To solve the corresponding optimization problem, which is non-convex and non-smooth, we provide a general mathematical framework allowing use of the recently proposed Proximal Alternative Linearized Maximization technique and prove that our formulation has the properties required for convergence.
We tested NetREX on simulated data and subsequently applied it to gene expression data in adult females from 99 hemizygotic lines of the Drosophila deletion (DrosDel) panel. The networks predicted by NetREX showed higher biological consistency than alternative approaches. In addition, we used the list of recently identified targets of the Doublesex (DSX) transcription factor to demonstrate the predictive power of our method.
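The fixed-topology setting that NCA assumes can be sketched as a masked least-squares problem. The example below is only an illustration, not NetREX or a full NCA implementation: it assumes expression factorizes as `E ≈ A @ P`, treats the TF activities `P` as already known, and fits the TF-gene weights `A` only where a hypothetical prior network `mask` permits an edge. The function name `fit_masked_weights` and all data are made up for the example.

```python
import numpy as np

def fit_masked_weights(E, P, mask):
    """Fit A in E ~ A @ P, keeping A[i, j] = 0 wherever mask[i, j] is False.

    E: (genes x samples) expression, P: (TFs x samples) TF activities,
    mask: (genes x TFs) boolean prior network (fixed topology).
    """
    n_genes, n_tfs = mask.shape
    A = np.zeros((n_genes, n_tfs))
    for i in range(n_genes):
        allowed = np.flatnonzero(mask[i])  # TFs allowed to regulate gene i
        if allowed.size == 0:
            continue
        # Regress gene i's expression on the activities of its allowed TFs only.
        coef, *_ = np.linalg.lstsq(P[allowed].T, E[i], rcond=None)
        A[i, allowed] = coef
    return A

# Hypothetical noiseless data: 4 genes, 2 TFs, 10 samples.
rng = np.random.default_rng(0)
mask = np.array([[1, 0], [1, 1], [0, 1], [1, 1]], dtype=bool)
A_true = np.where(mask, rng.normal(size=mask.shape), 0.0)
P = rng.normal(size=(2, 10))
E = A_true @ P

A_hat = fit_masked_weights(E, P, mask)
print(np.round(A_hat - A_true, 6))
```

Because the mask is fixed, edges can never appear or disappear here. Relaxing that restriction is the step the abstract attributes to NetREX.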


2021, Vol 22 (1)
Author(s): Neel Patel, William S. Bush

Abstract
Background: Transcriptional regulation is complex, requiring multiple cis (local) and trans-acting mechanisms working in concert to drive gene expression, with disruption of these processes linked to multiple diseases. Previous computational attempts to understand the influence of regulatory mechanisms on gene expression have used prediction models containing input features derived from cis-regulatory factors. However, local chromatin looping and trans-acting mechanisms are known to also influence transcriptional regulation, and their inclusion may improve model accuracy and interpretation. In this study, we create a general model of transcription factor influence on gene expression by incorporating both cis and trans gene regulatory features.
Results: We describe a computational framework to model gene expression for GM12878 and K562 cell lines. This framework weights the impact of transcription factor-based regulatory data using multi-omics gene regulatory networks to account for both cis- and trans-acting mechanisms, and measures of the local chromatin context. These prediction models perform significantly better than models containing cis-regulatory features alone. Models that additionally integrate long-distance chromatin interactions (or chromatin looping) between distal transcription factor binding regions and gene promoters also show improved accuracy. As a demonstration of their utility, effect estimates from these models were used to weight cis-regulatory rare variants for sequence kernel association test analyses of gene expression.
Conclusions: Our models generate refined effect estimates for the influence of individual transcription factors on gene expression, allowing characterization of their roles across the genome. This work also provides a framework for integrating multiple data types into a single model of transcriptional regulation.


2019, Vol 36 (1), pp. 197-204
Author(s): Xin Zhou, Xiaodong Cai

Abstract
Motivation: Gene regulatory networks (GRNs) of the same organism can differ under different conditions, although the overall network structure may be similar. Understanding how GRNs differ across conditions is important for understanding condition-specific gene regulation. When gene expression and other relevant data under two different conditions are available, they can be used by an existing network inference algorithm to estimate two GRNs separately, and then to identify the difference between the two GRNs. However, such an approach does not exploit the similarity between the two GRNs, and may sacrifice inference accuracy.
Results: In this paper, we model GRNs with the structural equation model (SEM), which can integrate gene expression and genetic perturbation data, and develop an algorithm named fused sparse SEM (FSSEM) to jointly infer GRNs under two conditions and then identify the difference between the two GRNs. Computer simulations demonstrate that the FSSEM algorithm outperforms approaches that estimate the two GRNs separately. Analysis of a lung cancer dataset and a gastric cancer dataset with FSSEM inferred differential GRNs in cancer versus normal tissues, whose genes with the largest network degrees have been reported to be implicated in tumorigenesis. The FSSEM algorithm provides a valuable tool for joint inference of two GRNs and identification of the differential GRN under two conditions.
Availability and implementation: The R package fssemR implementing the FSSEM algorithm is available at https://github.com/Ivis4ml/fssemR.git. It is also available on CRAN.
Supplementary information: Supplementary data are available at Bioinformatics online.


2011, Vol 28 (2), pp. 214-221
Author(s): Geert Geeven, Ronald E. van Kesteren, August B. Smit, Mathisca C. M. de Gunst

Author(s): Rodrigo Santibáñez, Daniel Garrido, Alberto J M Martin

Abstract
Motivation: Cells are complex systems composed of hundreds of genes whose products interact to produce elaborate behaviors. To control such behaviors, cells rely on transcription factors to regulate gene expression, and gene regulatory networks (GRNs) are employed to describe and understand such behavior. However, GRNs are static models, and dynamic models are difficult to obtain due to their size, complexity, stochastic dynamics and interactions with other cell processes.
Results: We developed Atlas, a Python tool that converts genome graphs and gene regulatory, interaction and metabolic networks into dynamic models. The software uses these biological networks to write rule-based models for the PySB framework. The underlying method is a divide-and-conquer strategy that builds sub-models and later combines them into an ensemble model. To exemplify the utility of Atlas, we used Escherichia coli networks of varying size and complexity and evaluated in silico modifications, such as gene knockouts and the insertion of promoters and terminators. Moreover, the methodology could be applied to the dynamic modeling of natural and synthetic networks of any bacterium.
Availability and implementation: Code, models and tutorials are available online (https://github.com/networkbiolab/atlas).
Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s): Anastasiya Belyaeva, Chandler Squires, Caroline Uhler

Abstract
Summary: Designing interventions to control gene regulation necessitates modeling a gene regulatory network by a causal graph. Currently, large-scale gene expression datasets from different conditions, cell types, disease states, and developmental time points are being collected. However, application of classical causal inference algorithms to infer gene regulatory networks based on such data is still challenging, requiring large sample sizes and computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between different conditions. Our difference causal inference (DCI) algorithm infers changes (i.e. edges that appeared, disappeared, or changed weight) between two causal graphs given gene expression data from the two conditions. This algorithm is efficient in its use of samples and computation since it infers the differences between causal graphs directly without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we also validate our algorithm by predicting the effects of interventions.
Availability and implementation: Python package freely available at http://uhlerlab.github.io/causaldag/dci.
Supplementary information: Supplementary data are available at Bioinformatics online.


2020
Author(s): Neel Patel, William Bush

Abstract
Background: Transcriptional regulation is complex, requiring multiple cis (local) and trans-acting mechanisms working in concert to drive gene expression, with disruption of these processes linked to multiple diseases. Previous computational attempts to understand the influence of regulatory mechanisms on gene expression have used prediction models containing input features derived from cis-regulatory factors. However, local chromatin looping and trans-acting mechanisms are known to also influence transcriptional regulation, and their inclusion may improve model accuracy and interpretation.
Results: We describe a computational framework to model gene expression for GM12878 and K562 cell lines. This framework weights the impact of transcription factor-based regulatory data using multi-omics gene regulatory networks to account for both cis- and trans-acting mechanisms, and the local chromatin context. These prediction models perform significantly better than models containing cis-regulatory features alone. Models that additionally integrate long-distance chromatin interactions (or chromatin looping) between distal transcription factor binding regions and gene promoters also show improved accuracy. As a demonstration of their utility, effect estimates from these models were used to weight cis-regulatory rare variants for SKAT (sequence kernel association test) analyses of gene expression.
Conclusions: Our models generate refined effect estimates for individual transcription factors, allow characterization of their roles across the genome, and provide a framework for integrating multiple data types into a single model of transcriptional regulation.

