Scalable machine learning-assisted model exploration and inference using Sciope

Author(s):  
Prashant Singh ◽  
Fredrik Wrede ◽  
Andreas Hellander

Abstract

Summary: Discrete stochastic models are fundamental tools for the in silico study of stochastic gene regulatory networks. Likelihood-free inference and model exploration are critical applications for studying a system using such models. However, the massive computational cost of complex, high-dimensional and stochastic modelling currently limits systematic investigation to relatively simple systems. Recently, machine-learning-assisted methods have shown great promise for handling larger, more complex models. To support both the ease of use of this new class of methods and their further development, we have developed the scalable inference, optimization and parameter exploration (Sciope) toolbox. Sciope is designed to support new algorithms for machine-learning-assisted model exploration and likelihood-free inference. Moreover, it is built from the ground up to easily leverage distributed and heterogeneous computational resources for convenient parallelism across platforms from workstations to clouds.

Availability and implementation: The Sciope Python3 toolbox is freely available at https://github.com/Sciope/Sciope, and has been tested on Linux, Windows and macOS platforms.

Supplementary information: Supplementary information is available at Bioinformatics online.
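As an illustration of the likelihood-free inference mentioned in the summary above, the following is a minimal sketch of rejection-based approximate Bayesian computation (ABC) on a toy simulator. It does not use the Sciope API; the function names (`simulate`, `summary`, `abc_rejection`), the exponential toy model, the uniform prior and the tolerance are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate(rate, n, rng):
    """Toy stochastic simulator (assumed): n exponential waiting times."""
    return [rng.expovariate(rate) for _ in range(n)]


def summary(data):
    """Summary statistic used to compare simulated and observed data."""
    return statistics.mean(data)


def abc_rejection(observed, prior_low, prior_high, n_trials, tol, seed=0):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    rng = random.Random(seed)
    target = summary(observed)
    accepted = []
    for _ in range(n_trials):
        theta = rng.uniform(prior_low, prior_high)  # draw from a uniform prior
        sim = simulate(theta, len(observed), rng)   # run the simulator, no likelihood needed
        if abs(summary(sim) - target) <= tol:       # accept if summaries are close
            accepted.append(theta)
    return accepted


# Synthetic "observed" data generated with a known rate of 2.0.
observed = simulate(2.0, 50, random.Random(42))
posterior = abc_rejection(observed, 0.5, 5.0, 5000, 0.05)
print(len(posterior), "accepted samples")
```

The accepted values approximate the posterior over the rate parameter. Each trial is independent of the others, which is why this family of methods parallelizes well.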

Mathematics ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1022
Author(s):  
Gianluca D’Addese ◽  
Martina Casari ◽  
Roberto Serra ◽  
Marco Villani

In many complex systems one observes the formation of medium-level structures, whose detection could allow a high-level description of the dynamical organization of the system itself, and thus a better understanding of it. We have previously developed a powerful method to achieve this goal, which however incurs a heavy computational cost in several real-world cases. In this work we introduce a modified version of our approach, which reduces the computational burden. The design of the new algorithm allowed the realization of an original suite of methods able to work simultaneously at the micro level (that of the binary relationships between single variables) and at the meso level (the identification of dynamically relevant groups). We apply this suite to a particularly relevant case, in which we look for the dynamic organization of a gene regulatory network when it is subject to knock-outs. The approach combines information theory, graph analysis, and an iterated sieving algorithm in order to describe rather complex situations. Its application allowed us to derive some general observations on the dynamical organization of gene regulatory networks, and to observe interesting characteristics in an experimental case.


2016 ◽  
Vol 7 ◽  
Author(s):  
Ying Ni ◽  
Delasa Aghamirzaie ◽  
Haitham Elmarakeby ◽  
Eva Collakova ◽  
Song Li ◽  
...  

2019 ◽  
Vol 36 (1) ◽  
pp. 197-204 ◽  
Author(s):  
Xin Zhou ◽  
Xiaodong Cai

Abstract

Motivation: Gene regulatory networks (GRNs) of the same organism can be different under different conditions, although the overall network structure may be similar. Understanding the difference in GRNs under different conditions is important to understand condition-specific gene regulation. When gene expression and other relevant data under two different conditions are available, they can be used by an existing network inference algorithm to estimate two GRNs separately, and then to identify the difference between the two GRNs. However, such an approach does not exploit the similarity between the two GRNs, and may sacrifice inference accuracy.

Results: In this paper, we model GRNs with the structural equation model (SEM), which can integrate gene expression and genetic perturbation data, and develop an algorithm named fused sparse SEM (FSSEM) to jointly infer GRNs under two conditions and then to identify the differences between the two GRNs. Computer simulations demonstrate that the FSSEM algorithm outperforms the approaches that estimate two GRNs separately. Analysis of a lung cancer dataset and a gastric cancer dataset with FSSEM inferred differential GRNs in cancer versus normal tissues, whose genes with the largest network degrees have been reported to be implicated in tumorigenesis. The FSSEM algorithm provides a valuable tool for joint inference of two GRNs and identification of the differential GRN under two conditions.

Availability and implementation: The R package fssemR implementing the FSSEM algorithm is available at https://github.com/Ivis4ml/fssemR.git. It is also available on CRAN.

Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 5 (2) ◽  
pp. 171226 ◽  
Author(s):  
Faizan Ehsan Elahi ◽  
Ammar Hasan

Gene regulatory networks (GRNs) are quite large and complex. To better understand and analyse GRNs, mathematical models are being employed. Different types of models, such as logical, continuous and stochastic models, can be used to describe GRNs. In this paper, we present a new approach to identify continuous models, because they are more suitable for large numbers of genes and for quantitative analysis. One of the most promising techniques for identifying continuous models of GRNs is based on Hill functions and the generalized profiling method (GPM). The advantages of this approach are low computational cost and insensitivity to initial conditions. In the GPM, a constrained nonlinear optimization problem has to be solved that is usually underdetermined. In this paper, we propose a new optimization approach in which we reformulate the optimization problem such that the constraints are embedded implicitly in the cost function. Moreover, we propose to split the unknown parameters into two sets based on the structure of Hill functions. These two sets are estimated separately to resolve the issue of the underdetermined problem. As a case study, we apply the proposed technique to the SOS response in Escherichia coli and compare the results with the existing literature.
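For readers unfamiliar with the Hill functions this abstract refers to, the sketch below shows the standard activating and repressing forms. The abstract does not give the paper's exact parametrization, so the function names and the symbols `k` (half-saturation constant) and `n` (Hill coefficient) are assumptions following the common textbook convention.

```python
def hill_activation(x, k, n):
    """Activating Hill function: rises from 0 to 1 as regulator level x grows."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Repressing Hill function: falls from 1 to 0 as regulator level x grows."""
    return k**n / (k**n + x**n)


# At x == k both forms are at half of their maximum.
print(hill_activation(2.0, 2.0, 4))  # 0.5
print(hill_repression(2.0, 2.0, 4))  # 0.5
```

In this textbook form, `n` enters the function nonlinearly while any multiplicative rate constant in front of it would enter linearly. This may help explain why a split of the unknown parameters into two sets is natural, although the abstract does not specify how the split is made.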


2021 ◽  
Vol 12 ◽  
Author(s):  
Jiyoung Lee ◽  
Shuo Geng ◽  
Song Li ◽  
Liwu Li

Subclinical doses of LPS (SD-LPS) are known to cause low-grade inflammatory activation of monocytes, which could lead to inflammatory diseases including atherosclerosis and metabolic syndrome. Sodium 4-phenylbutyrate is a potential therapeutic compound which can reduce the inflammation caused by SD-LPS. To understand the gene regulatory networks of these processes, we have generated scRNA-seq data from mouse monocytes treated with these compounds and identified 11 novel cell clusters. We have developed a machine learning method to integrate scRNA-seq, ATAC-seq, and binding motifs to characterize the gene regulatory networks underlying these cell clusters. Using guided regularized random forest and feature selection, our method achieved high performance and outperformed a traditional enrichment-based method in selecting candidate regulatory genes. Our method is particularly efficient in selecting a few candidate genes to explain the observed expression patterns. In particular, among 531 candidate TFs, our method achieves an auROC of 0.961 with only 10 motifs. Finally, we found two novel subpopulations of monocyte cells in response to SD-LPS and we confirmed our analysis using independent flow cytometry experiments. Our results suggest that our new machine learning method can select candidate regulatory genes as potential targets for developing new therapeutics against low-grade inflammation.


2005 ◽  
Vol 15 (15) ◽  
pp. 691-711 ◽  
Author(s):  
Hana El Samad ◽  
Mustafa Khammash ◽  
Linda Petzold ◽  
Dan Gillespie

Patterns ◽  
2020 ◽  
Vol 1 (9) ◽  
pp. 100139
Author(s):  
Daniel Osorio ◽  
Yan Zhong ◽  
Guanxun Li ◽  
Jianhua Z. Huang ◽  
James J. Cai

Author(s):  
Anastasiya Belyaeva ◽  
Chandler Squires ◽  
Caroline Uhler

Abstract

Summary: Designing interventions to control gene regulation necessitates modeling a gene regulatory network by a causal graph. Currently, large-scale gene expression datasets from different conditions, cell types, disease states, and developmental time points are being collected. However, application of classical causal inference algorithms to infer gene regulatory networks based on such data is still challenging, requiring high sample sizes and computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between different conditions. Our difference causal inference (DCI) algorithm infers changes (i.e. edges that appeared, disappeared, or changed weight) between two causal graphs given gene expression data from the two conditions. This algorithm is efficient in its use of samples and computation since it infers the differences between causal graphs directly without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we also validate our algorithm by predicting the effects of interventions.

Availability and implementation: Python package freely available at http://uhlerlab.github.io/causaldag/dci.

Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
P. Tsakanikas ◽  
D. Manatakis ◽  
E. S. Manolakos

Abstract

Deciphering the dynamic gene regulatory mechanisms driving cells to make fate decisions remains elusive. We present a novel unsupervised machine learning methodology that can be used to analyze a dataset of heterogeneous single-cell gene expression profiles, determine the most probable number of states (major cellular phenotypes) represented and extract the corresponding cell sub-populations. Most importantly, for any transition of interest from a source to a destination state, our methodology can zoom in, identify the cells most specific for studying the dynamics of this transition, order them along a trajectory of biological progression in posterior probability space, determine the "key-player" genes governing the transition dynamics, partition the trajectory into consecutive phases (transition "micro-states"), and finally reconstruct causal gene regulatory networks for each phase. Application of the end-to-end methodology provides new insights on key-player genes and their dynamic interactions during the important HSC-to-LMPP cell state transition involved in hematopoiesis. Moreover, it allows us to reconstruct a probabilistic representation of the "epigenetic landscape" of transitions and correctly identify the major ones in the hematopoiesis hierarchy of states.

