Transcription factor enrichment analysis (TFEA): Quantifying the activity of hundreds of transcription factors from a single experiment

2020 ◽  
Author(s):  
Jonathan D. Rubin ◽  
Jacob T. Stanley ◽  
Rutendo F. Sigauke ◽  
Cecilia B. Levandowski ◽  
Zachary L. Maas ◽  
...  

Abstract: Detecting differential activation of transcription factors (TFs) in response to perturbation provides insight into cellular processes. Transcription Factor Enrichment Analysis (TFEA) is a robust and reliable computational method that detects differential activity of hundreds of TFs given any set of perturbation data. TFEA draws inspiration from gene set enrichment analysis (GSEA) and detects positional motif enrichment within a list of ranked regions of interest (ROIs). As ROIs are typically inferred from the data, we also introduce muMerge, a statistically principled method of generating a consensus list of ROIs from multiple replicates and conditions. TFEA is broadly applicable to data that inform on transcriptional regulation, including nascent transcription (e.g., PRO-Seq), CAGE, ChIP-Seq, and accessibility (e.g., ATAC-Seq) data. TFEA not only identifies the key regulators responding to a perturbation, but also temporally unravels regulatory networks with time series data. Consequently, TFEA serves as a hypothesis-generating tool that provides an easy, rigorous, and cost-effective means to broadly assess TF activity, yielding new biological insights.

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jonathan D. Rubin ◽  
Jacob T. Stanley ◽  
Rutendo F. Sigauke ◽  
Cecilia B. Levandowski ◽  
Zachary L. Maas ◽  
...  

Abstract: Detecting changes in the activity of a transcription factor (TF) in response to a perturbation provides insights into the underlying cellular process. Transcription Factor Enrichment Analysis (TFEA) is a robust and reliable computational method that detects positional motif enrichment associated with changes in transcription observed in response to a perturbation. TFEA detects positional motif enrichment within a list of ranked regions of interest (ROIs), typically sites of RNA polymerase initiation inferred from regulatory data such as nascent transcription. Therefore, we also introduce muMerge, a statistically principled method of generating a consensus list of ROIs from multiple replicates and conditions. TFEA is broadly applicable to data that inform on transcriptional regulation, including nascent transcription (e.g., PRO-Seq), CAGE, histone ChIP-Seq, and accessibility data (e.g., ATAC-Seq). TFEA not only identifies the key regulators responding to a perturbation, but also temporally unravels regulatory networks with time series data. Consequently, TFEA serves as a hypothesis-generating tool that provides an easy, rigorous, and cost-effective means to broadly assess TF activity, yielding new biological insights.
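
The idea of detecting enrichment within a ranked list can be illustrated with a GSEA-style running sum. The following is a minimal sketch, not TFEA's actual statistic: it assumes each ROI is simply flagged as containing a motif or not, and it ignores the position of the motif within the ROI, which the abstract indicates TFEA does take into account. The function name `running_enrichment` is hypothetical.

```python
from typing import List


def running_enrichment(hits: List[bool]) -> float:
    """Return a GSEA-style running-sum score for a ranked list of ROIs.

    hits[i] is True when the ROI at rank i contains the motif (assumed input).
    The score is the running-sum value with the largest absolute deviation
    from zero; positive means motif hits cluster near the top of the ranking.
    """
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0  # no contrast between hits and misses

    step_up = 1.0 / n_hits      # increment for each ROI with the motif
    step_down = 1.0 / n_miss    # decrement for each ROI without it

    running = 0.0
    best = 0.0
    for hit in hits:
        running += step_up if hit else -step_down
        if abs(running) > abs(best):
            best = running
    return best
```

A ranking with all motif-containing ROIs at the top yields a score of 1, all at the bottom yields -1, and an evenly interleaved ranking stays close to 0. In practice, significance would be judged against shuffled rankings, which this sketch omits.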


2020 ◽  
Vol 36 (19) ◽  
pp. 4885-4893 ◽  
Author(s):  
Baoshan Ma ◽  
Mingkun Fang ◽  
Xiangtian Jiao

Abstract

Motivation: Gene regulatory networks (GRNs) capture the regulatory interactions between genes, resulting from the fundamental biological process of transcription and translation. In some cases, the topology of GRNs is not known and has to be inferred from gene expression data. Most existing GRN reconstruction algorithms are applied to either time-series data or steady-state data. Although time-series data include more information about the system dynamics, steady-state data imply stability of the underlying regulatory networks.

Results: In this article, we propose a method for inferring GRNs from time-series and steady-state data jointly. We make use of a non-linear ordinary differential equations framework to model dynamic gene regulation and an importance measurement strategy to infer all putative regulatory links efficiently. The proposed method is evaluated extensively on the artificial DREAM4 dataset and two real gene expression datasets of yeast and Escherichia coli. Based on public benchmark datasets, the proposed method outperforms other popular inference algorithms in terms of overall score. By comparing the performance on datasets of different scales, the results show that our method maintains good robustness and accuracy at a low computational complexity.

Availability and implementation: The proposed method is written in the Python language, and is available at: https://github.com/lab319/GRNs_nonlinear_ODEs

Supplementary information: Supplementary data are available at Bioinformatics online.


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1343 ◽  
Author(s):  
Robin A. Choudhury ◽  
Neil McRoberts

In a previous study, air sampling using vortex air samplers combined with species-specific amplification of pathogen DNA was carried out over two years in four or five locations in the Salinas Valley of California. The resulting time series data for the abundance of pathogen DNA trapped per day displayed complex dynamics with features of both deterministic (chaotic) and stochastic uncertainty. Methods of nonlinear time series analysis developed for the reconstruction of low dimensional attractors provided new insights into the complexity of pathogen abundance data. In particular, the analyses suggested that the length of time series data that it is practical or cost-effective to collect may limit the ability to definitively classify the uncertainty in the data. Over the two years of the study, five location/year combinations were classified as having stochastic linear dynamics and four were not. Calculation of entropy values for either the number of pathogen DNA copies or for a binary string indicating whether the pathogen abundance data were increasing revealed (1) some robust differences in the dynamics between seasons that were not obvious in the time series data themselves and (2) that the series were almost all at their theoretical maximum entropy value when considered from the simple perspective of whether instantaneous change along the sequence was positive.
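
The binary-string entropy mentioned above can be sketched as follows. This is a minimal illustration that assumes the plain single-symbol Shannon entropy; the abstract does not specify the estimator used in the study, and the function names are hypothetical. For a binary string the theoretical maximum is 1 bit, reached when increases and non-increases are equally frequent.

```python
import math
from typing import Dict, List, Sequence


def increase_indicator(series: Sequence[float]) -> List[int]:
    """Return 1 where the series rises from one observation to the next, else 0."""
    return [1 if later > earlier else 0
            for earlier, later in zip(series, series[1:])]


def shannon_entropy_bits(symbols: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the empirical symbol frequencies."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts: Dict[int, int] = {}
    for s in symbols:
        counts[s] = counts.get(s, 0) + 1
    # Sum p * log2(1/p) over the observed symbols.
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```

For example, a series that alternates between rising and falling gives an indicator string with equal numbers of 0s and 1s and therefore the maximum entropy of 1 bit, while a strictly increasing series gives 0 bits.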


2015 ◽  
Vol 13 (03) ◽  
pp. 1541006 ◽  
Author(s):  
Asako Komori ◽  
Yukihiro Maki ◽  
Isao Ono ◽  
Masahiro Okamoto

Biological systems are composed of biomolecules such as genes, proteins, metabolites, and signaling components, which interact in complex networks. To understand complex biological systems, it is important to be capable of inferring regulatory networks from experimental time series data. In previous studies, we developed efficient numerical optimization methods for inferring these networks, but we have yet to test the performance of our methods when considering the error (noise) that is inherent in experimental data. In this study, we investigated the noise tolerance of our proposed inferring engine. We prepared the noise data using the Langevin equation, and compared the performance of our method with that of alternative optimization methods.
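
One common way to generate noisy time series from a Langevin equation is Euler-Maruyama integration of dx = f(x) dt + sigma dW. The sketch below assumes that scheme and a single-variable system; the abstract does not state which model, noise level, or integrator the authors used, so the function name `simulate_langevin` and all of its parameters are hypothetical.

```python
import math
import random
from typing import Callable, List


def simulate_langevin(drift: Callable[[float], float],
                      x0: float,
                      sigma: float,
                      dt: float,
                      n_steps: int,
                      seed: int = 0) -> List[float]:
    """Integrate dx = drift(x) dt + sigma dW with the Euler-Maruyama scheme.

    Returns the trajectory as a list of n_steps + 1 values, starting at x0.
    """
    rng = random.Random(seed)  # seeded for reproducible noise
    x = x0
    trajectory = [x]
    for _ in range(n_steps):
        # The Wiener increment over dt has standard deviation sqrt(dt).
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift(x) * dt + noise
        trajectory.append(x)
    return trajectory
```

Setting `sigma` to 0 recovers the deterministic Euler solution, which gives a noise-free baseline against which an inference method's tolerance to increasing `sigma` could be compared.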


2018 ◽  
Author(s):  
Philippa Borrill ◽  
Sophie A. Harrington ◽  
James Simmonds ◽  
Cristobal Uauy

Abstract: Senescence is a tightly regulated developmental programme which is coordinated by transcription factors. Identifying these transcription factors in crops will provide opportunities to tailor the senescence process to different environmental conditions and regulate the balance between yield and grain nutrient content. Here we use ten time points of gene expression data alongside gene network modelling to identify transcription factors regulating senescence in polyploid wheat. We observe two main phases of transcription changes during senescence: early downregulation of housekeeping and metabolic processes followed by upregulation of transport and hormone related genes. We have identified transcription factor families associated with these early and later waves of differential expression. Using gene regulatory network modelling alongside complementary publicly available datasets we identified candidate transcription factors for controlling senescence. We validated the function of one of these candidate transcription factors in senescence using wheat chemically-induced mutants. This study lays the groundwork to understand the transcription factors which regulate senescence in polyploid wheat and exemplifies the integration of time-series data with publicly available expression atlases and networks to identify candidate regulatory genes.


2021 ◽  
Author(s):  
Romain Bulteau ◽  
Mirko Francesconi

Abstract: Genome-wide gene expression profiling is a powerful tool for exploratory analyses, providing a high dimensional picture of the state of a biological system. However, uncontrolled variation among samples can obscure and confound the effect of variables of interest. Uncontrolled developmental variation is often a major source of unknown expression variation in developmental systems. Existing methods to sort samples from transcriptomes require many samples to infer developmental trajectories and only provide a relative pseudo-time. Here we present RAPToR (Real Age Prediction from Transcriptome staging on Reference), a simple computational method to estimate the absolute developmental age of even a single sample from its gene expression with up to minutes precision. We achieve this by staging samples on high-resolution reference developmental expression profiles we build from existing time series data. We implemented RAPToR for the most common animal model systems: nematode, fruit fly, zebrafish, and mouse, and demonstrate application for non-model organisms. We show how developmental variation discovered by RAPToR can be exploited to increase power to detect differential expression and to untangle the signal of perturbations of interest even when it is completely confounded with development. We anticipate our RAPToR post-profiling staging strategy will be especially useful in large scale single organism profiling because it eliminates the need for synchronization or for a tedious and potentially difficult step of accurate staging before profiling.
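
The staging idea, matching a single sample against a densely interpolated reference time series, can be sketched as below. This is an illustration under stated assumptions, not RAPToR's implementation: it assumes linear interpolation between reference time points and Pearson correlation as the similarity measure, neither of which is specified in the abstract, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def interpolate_reference(times: Sequence[float],
                          profiles: Sequence[Sequence[float]],
                          n_points: int) -> List[Tuple[float, List[float]]]:
    """Linearly interpolate reference profiles onto an even grid of n_points >= 2.

    times must be ascending; profiles[i] is the expression vector at times[i].
    """
    grid = []
    t_start, t_end = times[0], times[-1]
    for k in range(n_points):
        t = t_start + (t_end - t_start) * k / (n_points - 1)
        # Find the reference interval [times[j], times[j + 1]] containing t.
        j = 0
        while j < len(times) - 2 and t > times[j + 1]:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        profile = [(1 - w) * a + w * b
                   for a, b in zip(profiles[j], profiles[j + 1])]
        grid.append((t, profile))
    return grid


def estimate_age(sample: Sequence[float],
                 grid: Sequence[Tuple[float, Sequence[float]]]) -> float:
    """Return the grid time whose profile correlates best with the sample."""
    return max(grid, key=lambda entry: pearson(sample, entry[1]))[0]
```

Because correlation is unchanged by shifting or rescaling a sample, an estimate of this kind is insensitive to overall differences in expression level between the sample and the reference, and its time resolution is set by the density of the interpolated grid rather than by the spacing of the original time points.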

