Scalable machine learning-assisted model exploration and inference using Sciope

Bioinformatics ◽

10.1093/bioinformatics/btaa673 ◽

2020 ◽

Author(s):

Prashant Singh ◽

Fredrik Wrede ◽

Andreas Hellander

Keyword(s):

Machine Learning ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Computational Cost ◽

Stochastic Modelling ◽

Systematic Investigation ◽

Ease Of Use ◽

Supplementary Information ◽

Great Promise ◽

Gene Regulatory

Abstract. Summary: Discrete stochastic models of gene regulatory networks are fundamental tools for in silico study of stochastic gene regulatory networks. Likelihood-free inference and model exploration are critical applications to study a system using such models. However, the massive computational cost of complex, high-dimensional and stochastic modelling currently limits systematic investigation to relatively simple systems. Recently, machine-learning-assisted methods have shown great promise to handle larger, more complex models. To support both the ease of use of this new class of methods and their further development, we have developed the scalable inference, optimization and parameter exploration (Sciope) toolbox. Sciope is designed to support new algorithms for machine-learning-assisted model exploration and likelihood-free inference. Moreover, it is built from the ground up to easily leverage distributed and heterogeneous computational resources for convenient parallelism across platforms from workstations to clouds. Availability and implementation: The Sciope Python3 toolbox is freely available at https://github.com/Sciope/Sciope and has been tested on Linux, Windows and macOS platforms. Supplementary information: Supplementary information is available at Bioinformatics online.
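
To make "likelihood-free inference" for a discrete stochastic model concrete, the following is a minimal sketch of rejection-based approximate Bayesian computation (ABC) on a toy birth-death gene expression model. It does not use Sciope or its API; the model, the function names, the uniform prior, the summary statistic, and the tolerance are all assumptions chosen for illustration.

```python
import random


def simulate_birth_death(k_prod, k_deg, t_end, rng):
    """Gillespie simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x).

    Returns the copy number of X at time t_end. Assumes k_prod > 0.
    """
    t, x = 0.0, 0
    while True:
        a_birth = k_prod
        a_death = k_deg * x
        a_total = a_birth + a_death
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            return x
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def mean_copy_number(k_prod, k_deg, t_end, n_cells, rng):
    """Summary statistic: mean final copy number over n_cells independent runs."""
    total = sum(simulate_birth_death(k_prod, k_deg, t_end, rng) for _ in range(n_cells))
    return total / n_cells


def abc_rejection(observed, prior_low, prior_high, n_trials, epsilon, rng,
                  k_deg=1.0, t_end=5.0, n_cells=20):
    """Keep prior draws of k_prod whose simulated summary lies within epsilon of the data."""
    accepted = []
    for _ in range(n_trials):
        k_prod = rng.uniform(prior_low, prior_high)  # draw from a uniform prior
        simulated = mean_copy_number(k_prod, k_deg, t_end, n_cells, rng)
        if abs(simulated - observed) <= epsilon:
            accepted.append(k_prod)
    return accepted


rng = random.Random(0)
# Synthetic "observed" data generated with a hypothetical true rate k_prod = 10.
observed = mean_copy_number(10.0, 1.0, 5.0, 20, rng)
posterior = abc_rejection(observed, 1.0, 20.0, 200, 1.0, rng)
print(len(posterior), "accepted samples")
```

Every trial requires fresh stochastic simulations, which is the computational cost the abstract refers to; each trial is also independent of the others, which is why this kind of workload parallelizes well.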

A Fast and Effective Method to Identify Relevant Sets of Variables in Complex Systems

Mathematics ◽

10.3390/math9091022 ◽

2021 ◽

Vol 9 (9) ◽

pp. 1022

Author(s):

Gianluca D’Addese ◽

Martina Casari ◽

Roberto Serra ◽

Marco Villani

Keyword(s):

Complex Systems ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Computational Cost ◽

Graph Analysis ◽

The Past ◽

Medium Level ◽

Micro Level ◽

Gene Regulatory ◽

High Level

In many complex systems one observes the formation of medium-level structures, whose detection could allow a high-level description of the dynamical organization of the system itself, and thus a better understanding of it. We have previously developed a powerful method to achieve this goal, which however incurs a heavy computational cost in several real-world cases. In this work we introduce a modified version of our approach, which reduces the computational burden. The design of the new algorithm allowed the realization of an original suite of methods able to work simultaneously at the micro level (that of the binary relationships between single variables) and at the meso level (the identification of dynamically relevant groups). We apply this suite to a particularly relevant case, in which we look for the dynamic organization of a gene regulatory network when it is subject to knock-outs. The approach combines information theory, graph analysis, and an iterated sieving algorithm in order to describe rather complex situations. Its application allowed us to derive some general observations on the dynamical organization of gene regulatory networks, and to observe interesting characteristics in an experimental case.
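
As an illustration of the kind of information-theoretic quantity that can score a pairwise (micro-level) relationship between two variables, here is a minimal sketch that estimates mutual information from two observed state series. The abstract does not say which measures the authors use, so the choice of mutual information, the function name, and the toy data are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of the mutual information (in bits) between two discrete series."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("series must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy binary time series for three hypothetical genes.
gene_a = [0, 1, 0, 1]
gene_b = [0, 1, 0, 1]   # copies gene_a exactly
gene_c = [0, 0, 1, 1]   # statistically independent of gene_a in this sample

print(mutual_information(gene_a, gene_b))
print(mutual_information(gene_a, gene_c))
```

Scoring every pair this way costs on the order of the number of variables squared, and scoring candidate groups grows far faster, which is consistent with the abstract's concern about computational burden.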

A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis

Frontiers in Plant Science ◽

10.3389/fpls.2016.01936 ◽

2016 ◽

Vol 7 ◽

Cited By ~ 19

Author(s):

Ying Ni ◽

Delasa Aghamirzaie ◽

Haitham Elmarakeby ◽

Eva Collakova ◽

Song Li ◽

...

Keyword(s):

Machine Learning ◽

Seed Development ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Learning Approach ◽

Machine Learning Approach ◽

Gene Regulatory

Inference of differential gene regulatory networks based on gene expression and genetic perturbation data

Bioinformatics ◽

10.1093/bioinformatics/btz529 ◽

2019 ◽

Vol 36 (1) ◽

pp. 197-204 ◽

Cited By ~ 2

Author(s):

Xin Zhou ◽

Xiaodong Cai

Keyword(s):

Gene Expression ◽

Gene Regulatory Networks ◽

Structural Equation ◽

Regulatory Networks ◽

Supplementary Information ◽

Specific Gene ◽

Joint Inference ◽

Perturbation Data ◽

Gene Regulatory ◽

The Difference

Abstract. Motivation: Gene regulatory networks (GRNs) of the same organism can be different under different conditions, although the overall network structure may be similar. Understanding the difference in GRNs under different conditions is important to understand condition-specific gene regulation. When gene expression and other relevant data under two different conditions are available, they can be used by an existing network inference algorithm to estimate two GRNs separately, and then to identify the difference between the two GRNs. However, such an approach does not exploit the similarity between the two GRNs, and may sacrifice inference accuracy. Results: In this paper, we model GRNs with the structural equation model (SEM), which can integrate gene expression and genetic perturbation data, and develop an algorithm named fused sparse SEM (FSSEM) to jointly infer GRNs under two conditions and then identify the differences between the two GRNs. Computer simulations demonstrate that the FSSEM algorithm outperforms the approaches that estimate two GRNs separately. Analysis of a lung cancer dataset and a gastric cancer dataset with FSSEM inferred differential GRNs in cancer versus normal tissues, whose genes with the largest network degrees have been reported to be implicated in tumorigenesis. The FSSEM algorithm provides a valuable tool for joint inference of two GRNs and identification of the differential GRN under two conditions. Availability and implementation: The R package fssemR implementing the FSSEM algorithm is available at https://github.com/Ivis4ml/fssemR.git. It is also available on CRAN. Supplementary information: Supplementary data are available at Bioinformatics online.

Hitoshi Iba: Evolutionary approach to machine learning and deep neural networks: neuro-evolution and gene regulatory networks

Genetic Programming and Evolvable Machines ◽

10.1007/s10710-019-09350-8 ◽

2019 ◽

Vol 20 (2) ◽

pp. 151-153

Author(s):

Petra Vidnerová

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Deep Neural Networks ◽

Evolutionary Approach ◽

Gene Regulatory

A method for estimating Hill function-based dynamic models of gene regulatory networks

Royal Society Open Science ◽

10.1098/rsos.171226 ◽

2018 ◽

Vol 5 (2) ◽

pp. 171226 ◽

Cited By ~ 2

Author(s):

Faizan Ehsan Elahi ◽

Ammar Hasan

Keyword(s):

Gene Regulatory Networks ◽

Regulatory Networks ◽

Optimization Problem ◽

Dynamic Models ◽

Initial Conditions ◽

Computational Cost ◽

Optimization Approach ◽

Continuous Models ◽

Gene Regulatory ◽

The Cost

Gene regulatory networks (GRNs) are quite large and complex. To better understand and analyse GRNs, mathematical models are being employed. Different types of models, such as logical, continuous and stochastic models, can be used to describe GRNs. In this paper, we present a new approach to identify continuous models, because they are more suitable for large numbers of genes and for quantitative analysis. One of the most promising techniques for identifying continuous models of GRNs is based on Hill functions and the generalized profiling method (GPM). The advantages of this approach are low computational cost and insensitivity to initial conditions. In the GPM, a constrained nonlinear optimization problem has to be solved that is usually underdetermined. In this paper, we propose a new optimization approach in which we reformulate the optimization problem such that the constraints are embedded implicitly in the cost function. Moreover, we propose to split the unknown parameters into two sets based on the structure of Hill functions. These two sets are estimated separately to resolve the issue of the underdetermined problem. As a case study, we apply the proposed technique to the SOS response in Escherichia coli and compare the results with the existing literature.
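
For readers unfamiliar with Hill functions, the sketch below shows the standard textbook activating and repressing forms and how they might enter a simple one-gene rate equation. The paper's exact parameterization and the GPM fitting step are not given in the abstract and are not reproduced here; the function names, the rate equation, and the parameter values are assumptions for illustration.

```python
def hill_activation(x, k, n):
    """Activating Hill function: fraction of maximal expression at regulator level x.

    k is the half-saturation constant and n the Hill coefficient.
    """
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Repressing Hill function: expression falls as the regulator level x rises."""
    return k**n / (k**n + x**n)


def expression_rate(x, regulator, beta, gamma, k, n):
    """Right-hand side of dx/dt = beta * hill_activation(regulator) - gamma * x.

    beta is the maximal production rate and gamma the degradation rate
    (both hypothetical parameters for this toy model).
    """
    return beta * hill_activation(regulator, k, n) - gamma * x


# At regulator == k the activating Hill function is at half its maximum.
print(hill_activation(2.0, 2.0, 4))
print(expression_rate(x=1.0, regulator=2.0, beta=3.0, gamma=0.5, k=2.0, n=4))
```

In a model of this shape, beta and gamma enter the rate equation linearly while k and n enter nonlinearly through the Hill term. This may be the kind of structure the abstract alludes to when it splits the unknown parameters into two sets, though the abstract does not say so.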

Single Cell RNA-Seq and Machine Learning Reveal Novel Subpopulations in Low-Grade Inflammatory Monocytes With Unique Regulatory Circuits

Frontiers in Immunology ◽

10.3389/fimmu.2021.627036 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jiyoung Lee ◽

Shuo Geng ◽

Song Li ◽

Liwu Li

Keyword(s):

Machine Learning ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Regulatory Genes ◽

Low Grade ◽

Machine Learning Method ◽

Learning Method ◽

Cell Clusters ◽

Inflammatory Monocytes ◽

Gene Regulatory

Subclinical doses of LPS (SD-LPS) are known to cause low-grade inflammatory activation of monocytes, which could lead to inflammatory diseases including atherosclerosis and metabolic syndrome. Sodium 4-phenylbutyrate is a potential therapeutic compound which can reduce the inflammation caused by SD-LPS. To understand the gene regulatory networks of these processes, we have generated scRNA-seq data from mouse monocytes treated with these compounds and identified 11 novel cell clusters. We have developed a machine learning method to integrate scRNA-seq, ATAC-seq, and binding motifs to characterize the gene regulatory networks underlying these cell clusters. Using guided regularized random forest and feature selection, our method achieved high performance and outperformed a traditional enrichment-based method in selecting candidate regulatory genes. Our method is particularly efficient in selecting a few candidate genes to explain the observed expression patterns. In particular, among 531 candidate TFs, our method achieves an auROC of 0.961 with only 10 motifs. Finally, we found two novel subpopulations of monocyte cells in response to SD-LPS and we confirmed our analysis using independent flow cytometry experiments. Our results suggest that our new machine learning method can select candidate regulatory genes as potential targets for developing new therapeutics against low-grade inflammation.

Stochastic modelling of gene regulatory networks

International Journal of Robust and Nonlinear Control ◽

10.1002/rnc.1018 ◽

2005 ◽

Vol 15 (15) ◽

pp. 691-711 ◽

Cited By ~ 102

Author(s):

Hana El Samad ◽

Mustafa Khammash ◽

Linda Petzold ◽

Dan Gillespie

Keyword(s):

Gene Regulatory Networks ◽

Regulatory Networks ◽

Stochastic Modelling ◽

Gene Regulatory

scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data

Patterns ◽

10.1016/j.patter.2020.100139 ◽

2020 ◽

Vol 1 (9) ◽

pp. 100139

Author(s):

Daniel Osorio ◽

Yan Zhong ◽

Guanxun Li ◽

Jianhua Z. Huang ◽

James J. Cai

Keyword(s):

Machine Learning ◽

Single Cell ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Gene Regulatory ◽

Cell Data

DCI: learning causal differences between gene regulatory networks

Bioinformatics ◽

10.1093/bioinformatics/btab167 ◽

2021 ◽

Author(s):

Anastasiya Belyaeva ◽

Chandler Squires ◽

Caroline Uhler

Keyword(s):

Gene Expression ◽

Causal Inference ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Large Scale ◽

Developmental Time ◽

Supplementary Information ◽

Causal Graph ◽

Causal Graphs ◽

Gene Regulatory

Abstract. Summary: Designing interventions to control gene regulation necessitates modeling a gene regulatory network by a causal graph. Currently, large-scale gene expression datasets from different conditions, cell types, disease states, and developmental time points are being collected. However, application of classical causal inference algorithms to infer gene regulatory networks based on such data is still challenging, requiring high sample sizes and computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between different conditions. Our difference causal inference (DCI) algorithm infers changes (i.e. edges that appeared, disappeared, or changed weight) between two causal graphs given gene expression data from the two conditions. This algorithm is efficient in its use of samples and computation since it infers the differences between causal graphs directly without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we also validate our algorithm by predicting the effects of interventions. Availability and implementation: Python package freely available at http://uhlerlab.github.io/causaldag/dci. Supplementary information: Supplementary data are available at Bioinformatics online.
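
The three kinds of change named in the abstract (edges that appeared, disappeared, or changed weight) can be illustrated on two graphs whose edges are already known. The sketch below is not the DCI algorithm and does not use the causaldag package: DCI infers the difference directly from expression data without estimating either graph, whereas this toy simply compares two given weighted edge sets. The function name, the dictionary representation, and the example weights are assumptions.

```python
def classify_edge_changes(graph_a, graph_b, tol=1e-9):
    """Compare two weighted directed graphs given as {(source, target): weight} dicts.

    Returns three dicts: edges only in graph_b (appeared), edges only in
    graph_a (disappeared), and shared edges whose weights differ by more than tol.
    """
    appeared = {e: w for e, w in graph_b.items() if e not in graph_a}
    disappeared = {e: w for e, w in graph_a.items() if e not in graph_b}
    changed = {
        e: (graph_a[e], graph_b[e])
        for e in graph_a.keys() & graph_b.keys()
        if abs(graph_a[e] - graph_b[e]) > tol
    }
    return appeared, disappeared, changed


# Hypothetical regulatory edges under two conditions.
condition_1 = {("g1", "g2"): 0.8, ("g2", "g3"): -0.5, ("g1", "g3"): 0.3}
condition_2 = {("g1", "g2"): 0.8, ("g2", "g3"): -0.1, ("g3", "g4"): 0.6}

appeared, disappeared, changed = classify_edge_changes(condition_1, condition_2)
print("appeared:", appeared)
print("disappeared:", disappeared)
print("changed:", changed)
```

The unchanged edge g1 -> g2 appears in none of the three outputs, which reflects why a difference graph can be much smaller than either full network.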

Machine learning methods to reverse engineer dynamic gene regulatory networks governing cell state transitions

10.1101/264671 ◽

2018 ◽

Cited By ~ 2

Author(s):

P. Tsakanikas ◽

D. Manatakis ◽

E. S. Manolakos

Keyword(s):

Machine Learning ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Most Probable Number ◽

State Transitions ◽

Gene Expressions ◽

Dynamic Interactions ◽

Probable Number ◽

Cell State ◽

Gene Regulatory

Abstract. Deciphering the dynamic gene regulatory mechanisms driving cells to make fate decisions remains elusive. We present a novel unsupervised machine learning methodology that can be used to analyze a dataset of heterogeneous single-cell gene expression profiles, determine the most probable number of states (major cellular phenotypes) represented and extract the corresponding cell sub-populations. Most importantly, for any transition of interest from a source to a destination state, our methodology can zoom in, identify the cells most specific for studying the dynamics of this transition, order them along a trajectory of biological progression in posterior probability space, determine the "key-player" genes governing the transition dynamics, partition the trajectory into consecutive phases (transition "micro-states"), and finally reconstruct causal gene regulatory networks for each phase. Application of the end-to-end methodology provides new insights into key-player genes and their dynamic interactions during the important HSC-to-LMPP cell state transition involved in hematopoiesis. Moreover, it allows us to reconstruct a probabilistic representation of the "epigenetic landscape" of transitions and correctly identify the major ones in the hematopoiesis hierarchy of states.
