Parameter inference with analytical propagators for stochastic models of autoregulated gene expression

Author(s):  
Frits Veerman ◽  
Nikola Popović ◽  
Carsten Marr

Abstract Stochastic gene expression in regulatory networks is conventionally modelled via the chemical master equation (CME). As explicit solutions to the CME, in the form of so-called propagators, are oftentimes not readily available, various approximations have been proposed. A recently developed analytical method is based on a separation of time scales that assumes significant differences in the lifetimes of mRNA and protein in the network, allowing for the efficient approximation of propagators from asymptotic expansions for the corresponding generating functions. Here, we showcase the applicability of that method to simulated data from a ‘telegraph’ model for gene expression that is extended with an autoregulatory mechanism. We demonstrate that the resulting approximate propagators can be applied successfully for parameter inference in the non-regulated model; moreover, we show that, in the extended autoregulated model, autoactivation or autorepression may be refuted under certain assumptions on the model parameters. These results indicate that our approach may allow for successful parameter inference and model identification from longitudinal single cell data.
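For orientation, simulated data of the kind mentioned in the abstract can be generated with the standard Gillespie algorithm. The following is a minimal sketch of a non-regulated telegraph model, not the authors' code: the reaction scheme (gene on/off, transcription, translation, decay) and all rate names (`k_on`, `k_off`, `k_m`, `d_m`, `k_p`, `d_p`) are illustrative assumptions.

```python
import random


def simulate_telegraph(k_on, k_off, k_m, d_m, k_p, d_p, t_end, seed=0):
    """Gillespie simulation of a telegraph model without autoregulation.

    State: gene (0 = off, 1 = on), mRNA count m, protein count p.
    All rate names are hypothetical and chosen only for this sketch.
    """
    rng = random.Random(seed)
    t, gene, m, p = 0.0, 0, 0, 0
    trajectory = [(t, gene, m, p)]
    while True:
        # Propensities of the six reactions in the current state.
        rates = [
            k_on * (1 - gene),  # gene switches on
            k_off * gene,       # gene switches off
            k_m * gene,         # transcription, only while the gene is on
            d_m * m,            # mRNA decay
            k_p * m,            # translation
            d_p * p,            # protein decay
        ]
        total = sum(rates)
        if total == 0.0:
            break
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the reaction with probability proportional to its propensity.
        reaction = rng.choices(range(6), weights=rates)[0]
        if reaction == 0:
            gene = 1
        elif reaction == 1:
            gene = 0
        elif reaction == 2:
            m += 1
        elif reaction == 3:
            m -= 1
        elif reaction == 4:
            p += 1
        else:
            p -= 1
        trajectory.append((t, gene, m, p))
    return trajectory
```

An autoregulated variant would presumably let `k_on` or `k_off` depend on the protein count `p`; that dependence is not specified in the abstract and is left out here.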

2018 ◽  
Author(s):  
Frits Veerman ◽  
Nikola Popović ◽  
Carsten Marr

ABSTRACT Stochastic gene expression in regulatory networks is conventionally modelled via the Chemical Master Equation (CME). As explicit solutions to the CME, in the form of so-called propagators, are oftentimes not readily available, various approximations have been proposed. A recently developed analytical method is based on a separation of scales that assumes significant differences in the lifetimes of mRNA and protein in the network, allowing for the efficient approximation of propagators from asymptotic expansions for the corresponding generating functions. Here, we showcase the applicability of that method to simulated data from a ‘telegraph’ model for gene expression that is extended with an autoregulatory mechanism. We demonstrate that the resulting approximate propagators can be applied successfully for Bayesian parameter inference in the non-regulated model; moreover, we show that, in the extended autoregulated model, autoactivation or autorepression may be refuted under certain assumptions on the model parameters. Our results indicate that the method showcased here may allow for successful parameter inference and model identification from longitudinal single cell data.


2005 ◽  
Vol 15 (04) ◽  
pp. 297-310 ◽  
Author(s):  
WAI-KI CHING ◽  
MICHAEL M. NG ◽  
ERIC S. FUNG ◽  
TATSUYA AKUTSU

Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
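To illustrate where a quadratic parameter count can come from, the sketch below estimates one transition matrix per ordered pair of genes by counting how often a state of one gene at time t is followed by a state of another gene at time t+1. This is an assumed, simplified reading of a multivariate Markov model, not the estimation method of the paper: the function names are hypothetical, and the weights that would combine the pairwise matrices are omitted.

```python
def transition_matrix(source, target, n_states):
    """Column-stochastic estimate of P(target[t+1] = i | source[t] = j)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for j, i in zip(source[:-1], target[1:]):
        counts[i][j] += 1
    col_totals = [sum(counts[i][j] for i in range(n_states))
                  for j in range(n_states)]
    matrix = []
    for i in range(n_states):
        row = []
        for j in range(n_states):
            if col_totals[j]:
                row.append(counts[i][j] / col_totals[j])
            else:
                # Source state never observed: fall back to a uniform column
                # (an illustrative choice, not taken from the paper).
                row.append(1.0 / n_states)
        matrix.append(row)
    return matrix


def all_pair_matrices(sequences, n_states):
    """One matrix per ordered gene pair (j, k), i.e. n * n matrices for n genes."""
    n = len(sequences)
    return {(j, k): transition_matrix(sequences[k], sequences[j], n_states)
            for j in range(n) for k in range(n)}
```

With binary expression states, `all_pair_matrices([gene_a, gene_b, gene_c], 2)` returns nine 2-by-2 matrices, so the number of estimated objects grows as n^2 rather than exponentially in n.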


2018 ◽  
Vol 19 (4) ◽  
pp. 444-465
Author(s):  
William Chad Young ◽  
Ka Yee Yeung ◽  
Adrian E Raftery

Gene regulatory network reconstruction is an essential task of genomics in order to further our understanding of how genes interact dynamically with each other. The most readily available data, however, are from steady-state observations. These data are not as informative about the relational dynamics between genes as knockout or over-expression experiments, which attempt to control the expression of individual genes. We develop a new framework for network inference using samples from the equilibrium distribution of a vector autoregressive (VAR) time-series model which can be applied to steady-state gene expression data. We explore the theoretical aspects of our method and apply the method to synthetic gene expression data generated using GeneNetWeaver.
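As background for the equilibrium distribution mentioned above: for a stable first-order VAR model x[t+1] = A x[t] + e[t] with Gaussian noise of covariance Sigma, the stationary covariance S satisfies S = A S A^T + Sigma. The sketch below computes S by fixed-point iteration. It assumes a VAR(1) model with all eigenvalues of A inside the unit circle; the abstract does not state the model order, and the helper names are hypothetical.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def stationary_covariance(a, sigma, iterations=200):
    """Iterate S <- A S A^T + Sigma, which converges when A is stable."""
    n = len(sigma)
    a_t = transpose(a)
    s = [row[:] for row in sigma]
    for _ in range(iterations):
        asa = matmul(matmul(a, s), a_t)
        s = [[asa[i][j] + sigma[i][j] for j in range(n)] for i in range(n)]
    return s
```

In the scalar case with A = 0.5 and Sigma = 1, the fixed point is 1 / (1 - 0.25) = 4/3, which gives a quick sanity check. Steady-state samples would then be draws from a zero-mean Gaussian with covariance S.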


2009 ◽  
Vol 2009 ◽  
pp. 1-10
Author(s):  
Martina Bremer ◽  
R. W. Doerge

We present a statistical method to rank observed genes in gene expression time series experiments according to their degree of regulation in a biological process. The ranking may be used to focus on specific genes or to select meaningful subsets of genes from which gene regulatory networks can be built. Our approach is based on a state space model that incorporates hidden regulators of gene expression. Kalman (K) smoothing and maximum (M) likelihood estimation techniques are used to derive optimal estimates of the model parameters upon which a proposed regulation criterion is based. The statistical power of the proposed algorithm is investigated, and a real data set is analyzed for the purpose of identifying regulated genes in time dependent gene expression data. This statistical approach supports the concept that meaningful biological conclusions can be drawn from gene expression time series experiments by focusing on strong regulation rather than large expression values.
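The building block behind Kalman smoothing and likelihood-based estimation in a state space model is the forward Kalman filter. The sketch below is a scalar filter for an assumed model x[t] = a x[t-1] + w, y[t] = x[t] + v, with noise variances `q` and `r`; it is a generic textbook recursion, not the model or criterion of the paper, and it omits the backward smoothing pass.

```python
import math


def kalman_filter(observations, a, q, r, m0=0.0, p0=1.0):
    """Scalar Kalman filter; returns filtered means, variances, log-likelihood.

    m0 and p0 are the prior mean and variance of the hidden state before the
    first prediction step. All parameter names are illustrative.
    """
    m, p = m0, p0
    means, variances, log_lik = [], [], 0.0
    for y in observations:
        # Predict the hidden state one step ahead.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update with the new observation.
        s = p_pred + r                 # innovation variance
        gain = p_pred / s              # Kalman gain
        innovation = y - m_pred
        m = m_pred + gain * innovation
        p = (1.0 - gain) * p_pred
        # Gaussian log-density of the innovation, accumulated over time.
        log_lik += -0.5 * (math.log(2.0 * math.pi * s) + innovation ** 2 / s)
        means.append(m)
        variances.append(p)
    return means, variances, log_lik
```

Maximum likelihood estimates of `a`, `q` and `r` could be obtained by maximising the returned log-likelihood over those parameters, for example on a grid or with a numerical optimiser.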


2017 ◽  
Author(s):  
Mariana Gómez-Schiavon ◽  
Liang-Fu Chen ◽  
Anne E. West ◽  
Nicolas E. Buchler

Abstract Single-molecule RNA fluorescence in situ hybridization (smFISH) provides unparalleled resolution on the abundance and localization of nascent and mature transcripts in single cells. Gene expression dynamics are typically inferred by measuring mRNA abundance in small numbers of fixed cells sampled from a population at multiple time-points after induction. The sparse data that arise from the small number of cells obtained using smFISH present a challenge for inferring transcription dynamics. Here, we developed a computational pipeline (BayFish) to infer kinetic parameters of gene expression from smFISH data at multiple time points after induction. Given an underlying model of gene expression, BayFish uses a Monte Carlo method to estimate the Bayesian posterior probability of the model parameters and quantify the parameter uncertainty given the observed smFISH data. We tested BayFish on smFISH measurements of the neuronal activity inducible gene Npas4 in primary neurons. We showed that a 2-state promoter model can recapitulate Npas4 dynamics after induction and we inferred that the transition rate from the promoter OFF state to the ON state is increased by the stimulus.

Author Summary Gene expression can exhibit cell-to-cell variability due to the stochastic nature of biochemical reactions. Single cell assays (e.g. smFISH) directly quantify stochastic gene expression by measuring the number of active promoters and transcripts per cell in a population of cells. The data are distributions and their shape and time-evolution contain critical information on the underlying process of gene expression. Recent work has combined models of stochastic gene expression with maximum likelihood methods to infer kinetic parameters from smFISH distributions. However, these approaches do not provide a probability distribution or likelihood of model parameters inferred from the smFISH data. This information is useful because it indicates which parameters are loosely constrained by the data and suggests follow up experiments. We developed a suite of MATLAB programs (BayFish) that estimate the Bayesian posterior probability of model parameters from smFISH data. The user specifies an underlying model of stochastic gene expression with unknown parameters (θ) and provides smFISH data (Y). BayFish uses a Monte Carlo algorithm to estimate the Bayesian posterior probability P(θ|Y) of model parameters. BayFish is easily modified and can be applied to other models of stochastic gene expression and smFISH data sets.
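The idea of estimating a posterior P(θ|Y) by Monte Carlo can be illustrated with a random-walk Metropolis sampler. This is a generic sketch in Python, not BayFish itself (which is written in MATLAB) and not necessarily the Monte Carlo variant it uses. The toy model is an assumption made for the example: transcript counts Y are Poisson with a single unknown rate θ, with an Exponential(1) prior on θ.

```python
import math
import random


def log_posterior(theta, counts):
    """Unnormalised log posterior: Poisson likelihood, Exponential(1) prior."""
    if theta <= 0.0:
        return -math.inf
    log_lik = sum(y * math.log(theta) - theta for y in counts)
    log_prior = -theta
    return log_lik + log_prior


def metropolis(counts, n_samples=20000, step=0.8, seed=0):
    """Random-walk Metropolis sampler for the rate theta."""
    rng = random.Random(seed)
    theta = 1.0
    current = log_posterior(theta, counts)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio); 1 - random() lies
        # in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples
```

For this toy model the exact posterior is a Gamma distribution with shape 1 + sum(Y) and rate 1 + len(Y), so the sample mean can be checked against (1 + sum(Y)) / (1 + len(Y)). The spread of the samples is what indicates how tightly the data constrain the parameter.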


2020 ◽  
Author(s):  
S. Thomas Kelly ◽  
Michael A. Black

Summary Transcriptomic analysis is used to capture the molecular state of a cell or sample in many biological and medical applications. In addition to identifying alterations in activity at the level of individual genes, understanding changes in the gene networks that regulate fundamental biological mechanisms is also an important objective of molecular analysis. As a result, databases that describe biological pathways are increasingly used to assist with the interpretation of results from large-scale genomics studies. Incorporating information from biological pathways and gene regulatory networks into a genomic data analysis is a popular strategy, and there are many methods that provide this functionality for gene expression data. When developing or comparing such methods, it is important to gain an accurate assessment of their performance. Simulation-based validation studies are frequently used for this. This necessitates the use of simulated data that correctly accounts for pathway relationships and correlations. Here we present a versatile statistical framework to simulate correlated gene expression data from biological pathways, by sampling from a multivariate normal distribution derived from a graph structure. This procedure has been released as the graphsim R package on CRAN and GitHub (https://github.com/TomKellyGenetics/graphsim) and is compatible with any graph structure that can be described using the igraph package. This package allows the simulation of biological pathways from a graph structure based on a statistical model of gene expression.
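The general idea of deriving a multivariate normal from a graph can be sketched as follows. This is a Python illustration under stated assumptions, not the graphsim API (graphsim is an R package): the covariance between two genes is assumed to decay geometrically with their shortest-path distance, `rho ** distance`, and all function names are hypothetical. For a simple chain this choice gives a valid covariance matrix; for arbitrary graphs positive definiteness is not guaranteed, so the Cholesky step checks for it.

```python
import math
import random
from collections import deque


def shortest_path_lengths(adjacency):
    """All-pairs hop counts for a connected undirected graph (BFS per node)."""
    nodes = sorted(adjacency)
    dist = {}
    for source in nodes:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[source] = seen
    return nodes, dist


def covariance_from_graph(adjacency, rho=0.8):
    """Covariance rho ** (hop distance): an assumed decay rule for this sketch."""
    nodes, dist = shortest_path_lengths(adjacency)
    return nodes, [[rho ** dist[a][b] for b in nodes] for a in nodes]


def cholesky(matrix):
    """Lower-triangular factor L with L L^T = matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


def sample_expression(adjacency, n_samples, rho=0.8, seed=0):
    """Draw zero-mean Gaussian expression vectors with graph-derived covariance."""
    rng = random.Random(seed)
    nodes, sigma = covariance_from_graph(adjacency, rho)
    lower = cholesky(sigma)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in nodes]
        samples.append([sum(lower[i][k] * z[k] for k in range(i + 1))
                        for i in range(len(nodes))])
    return nodes, samples
```

For a three-gene chain such as `{"A": ["B"], "B": ["A", "C"], "C": ["B"]}`, neighbouring genes get covariance `rho` and the two end genes get `rho ** 2`, so simulated expression values are more strongly correlated for genes that are closer in the pathway.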

