The use of machine learning to discover regulatory networks controlling biological systems

A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis

Frontiers in Plant Science ◽

10.3389/fpls.2016.01936 ◽

2016 ◽

Vol 7 ◽

Cited By ~ 19

Author(s):

Ying Ni ◽

Delasa Aghamirzaie ◽

Haitham Elmarakeby ◽

Eva Collakova ◽

Song Li ◽

...

Keyword(s):

Machine Learning ◽

Seed Development ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Learning Approach ◽

Machine Learning Approach ◽

Gene Regulatory

Special Issue on Uncertainty Quantification, Machine Learning, and Data-Driven Modeling of Biological Systems

Computer Methods in Applied Mechanics and Engineering ◽

10.1016/j.cma.2020.112832 ◽

2020 ◽

Vol 362 ◽

pp. 112832

Author(s):

Adrian Buganza Tepole ◽

David Nordsletten ◽

Krishna Garikipati ◽

Ellen Kuhl

Keyword(s):

Machine Learning ◽

Uncertainty Quantification ◽

Biological Systems ◽

Data Driven ◽

Special Issue ◽

Data Driven Modeling

Hitoshi Iba: Evolutionary approach to machine learning and deep neural networks: neuro-evolution and gene regulatory networks

Genetic Programming and Evolvable Machines ◽

10.1007/s10710-019-09350-8 ◽

2019 ◽

Vol 20 (2) ◽

pp. 151-153

Author(s):

Petra Vidnerová

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Deep Neural Networks ◽

Evolutionary Approach ◽

Gene Regulatory

Investigating noise tolerance in an efficient engine for inferring biological regulatory networks

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720015410061 ◽

2015 ◽

Vol 13 (03) ◽

pp. 1541006 ◽

Cited By ~ 1

Author(s):

Asako Komori ◽

Yukihiro Maki ◽

Isao Ono ◽

Masahiro Okamoto

Keyword(s):

Regulatory Networks ◽

Time Series Data ◽

Biological Systems ◽

Optimization Methods ◽

Series Data ◽

Noise Tolerance ◽

Biological Regulatory Networks ◽

Experimental Time Series ◽

Noise Data ◽

Signaling Components

Biological systems are composed of biomolecules such as genes, proteins, metabolites, and signaling components, which interact in complex networks. To understand complex biological systems, it is important to be able to infer regulatory networks from experimental time series data. In previous studies, we developed efficient numerical optimization methods for inferring these networks, but we had yet to test the performance of our methods in the presence of the error (noise) that is inherent in experimental data. In this study, we investigated the noise tolerance of our proposed inference engine. We prepared the noisy data using the Langevin equation, and compared the performance of our method with that of alternative optimization methods.
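
The abstract above does not give the form of the Langevin equation used to prepare the noisy data. As a minimal sketch, assuming a simple mean-reverting Langevin equation dx = -theta * x dt + sigma dW integrated with the Euler-Maruyama scheme, noise could be added to a clean time series as follows. The function names and the parameters `theta`, `sigma`, `dt`, and `seed` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def langevin_noise(n_steps, dt=0.01, theta=1.0, sigma=0.1, seed=0):
    """Integrate dx = -theta * x dt + sigma dW with Euler-Maruyama.

    Assumed (hypothetical) noise model; returns n_steps + 1 samples
    starting at x = 0.
    """
    rng = random.Random(seed)
    x = 0.0
    path = [x]
    for _ in range(n_steps):
        # Wiener increment: Gaussian with variance dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        x += -theta * x * dt + sigma * dw
        path.append(x)
    return path


def add_noise(clean_series, **kwargs):
    """Add one Langevin noise path to a clean time series, point by point."""
    noise = langevin_noise(len(clean_series) - 1, **kwargs)
    return [c + e for c, e in zip(clean_series, noise)]


# Hypothetical clean signal: an exponentially decaying concentration
clean = [math.exp(-0.05 * t) for t in range(50)]
noisy = add_noise(clean, sigma=0.05, seed=42)
```

Fixing `seed` makes each noisy data set reproducible, which helps when several optimization methods are compared on the same data.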

Prediction of Whole-Cell Transcriptional Response with Machine Learning

Bioinformatics ◽

10.1093/bioinformatics/btab676 ◽

2021 ◽

Author(s):

Mohammed Eslami ◽

Amin Espah-Borujeni ◽

Hamed Eramian ◽

Mark Weston ◽

George Zheng ◽

...

Keyword(s):

Machine Learning ◽

Differential Expression ◽

Regulatory Networks ◽

High Throughput Sequencing ◽

Differential Expression Analysis ◽

Transcriptional Response ◽

Predictive Performance ◽

Supplementary Information ◽

Whole Cell ◽

Using Data

Abstract. Motivation: Applications in synthetic and systems biology can benefit from measuring whole-cell response to biochemical perturbations. Execution of experiments to cover all possible combinations of perturbations is infeasible. In this paper, we present the host response model (HRM), a machine learning approach that maps the response to single perturbations to the transcriptional response to a combination of perturbations. Results: The HRM combines high-throughput sequencing with machine learning to infer links between experimental context, prior knowledge of cell regulatory networks, and RNASeq data to predict a gene’s dysregulation. We find that the HRM can predict the directionality of dysregulation in response to a combination of inducers with an accuracy of > 90% using data from single inducers. We further find that the use of prior, known cell regulatory networks doubles the predictive performance of the HRM (R² rises from 0.3 to 0.65). The model was validated in two organisms, E. coli and B. subtilis, using new experiments conducted post training. Finally, while the HRM is trained on gene expression data, the direct prediction of differential expression makes it possible to also conduct enrichment analyses using its predictions. We show that the HRM can accurately classify >95% of the pathway regulations. The HRM reduces the number of RNASeq experiments needed, as responses can be tested in silico to focus experiments. Availability: The HRM software and tutorial are available at https://github.com/sd2e/CDM and the configurable differential expression analysis tools and tutorials are available at https://github.com/SD2E/omics_tools. Supplementary information: Supplementary data are available at Bioinformatics online.

Robustness and lethality in multilayer biological molecular networks

10.1101/818963 ◽

2019 ◽

Author(s):

Xueming Liu ◽

Enrico Maiorino ◽

Arda Halu ◽

Joseph Loscalzo ◽

Jianxi Gao ◽

...

Keyword(s):

Biological Networks ◽

Regulatory Networks ◽

Metabolic Diseases ◽

Biological Systems ◽

Interaction Network ◽

Network Models ◽

Global Network ◽

Molecular Networks ◽

Cancer Genes ◽

Gene Regulatory

Abstract. Robustness is a prominent feature of most biological systems. In a cell, the structure of the interactions between genes, proteins, and metabolites has a crucial role in maintaining the cell’s functionality and viability in the presence of external perturbations and noise. Despite advances in characterizing the robustness of biological systems, most of the current efforts have been focused on studying homogeneous molecular networks in isolation, such as protein-protein or gene regulatory networks, neglecting the interactions among different molecular substrates. Here we propose a comprehensive framework for understanding how the interactions between genes, proteins and metabolites contribute to the determinants of robustness in a heterogeneous biological network. We integrate heterogeneous sources of data to construct a multilayer interaction network composed of a gene regulatory layer, a protein-protein interaction layer and a metabolic layer. We design a simulated perturbation process to characterize the contribution of each gene to the overall system’s robustness, defined as its influence over the global network. We find that highly influential genes are enriched in essential and cancer genes, confirming the central role of these genes in critical cellular processes. Further, we determine that the metabolic layer is more vulnerable to perturbations involving genes associated with metabolic diseases. By comparing the robustness of the network to multiple randomized network models, we find that the real network is comparably or more robust than expected in the random realizations. Finally, we analytically derive the expected robustness of multilayer biological networks starting from the degree distributions within or between layers. These results provide new insights into the non-trivial dynamics occurring in the cell after a genetic perturbation is applied, confirming the importance of including the coupling between different layers of interaction in models of complex biological systems.
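
The abstract above does not describe the perturbation process in detail. As a rough sketch only, one simple proxy for a gene's "influence over the global network" is the fraction of other nodes that can be reached from it by following directed edges within and between layers. The toy network, the node naming scheme (`layer:name`), and the `influence` function below are hypothetical and are not the authors' method.

```python
from collections import deque


def influence(graph, source):
    """Fraction of the other nodes reachable from `source` (breadth-first search).

    Reachability is an assumed stand-in for the paper's perturbation process.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # All nodes that appear as a source or a target of an edge
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    return (len(seen) - 1) / (len(nodes) - 1)


# Hypothetical three-layer network: gene regulatory, protein-protein, metabolic
toy_network = {
    "gene:g1": ["gene:g2", "protein:p1"],
    "gene:g2": ["protein:p2"],
    "protein:p1": ["protein:p2", "metabolite:m1"],
    "protein:p2": [],
    "metabolite:m1": [],
}

scores = {g: influence(toy_network, g) for g in ("gene:g1", "gene:g2")}
```

In this toy network, `gene:g1` reaches every other node while `gene:g2` reaches only one protein, so a perturbation of `gene:g1` would count as more influential under this proxy.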

rSeqTU – a machine-learning based R package for prediction of bacterial transcription units

10.1101/553057 ◽

2019 ◽

Author(s):

Sheng-Yong Niu ◽

Binqiang Liu ◽

Qin Ma ◽

Wen-Chi Chou

Keyword(s):

Machine Learning ◽

Random Forest ◽

Regulatory Networks ◽

Prediction Models ◽

R Package ◽

Transcription Unit ◽

Support Vector ◽

Rna Seq ◽

Accurate Identification ◽

Prediction Approach

Abstract. A transcription unit (TU) is composed of one or multiple adjacent genes on the same strand that are co-transcribed, mostly in prokaryotes. Accurate identification of TUs is a crucial first step to delineate the transcriptional regulatory networks and elucidate the dynamic regulatory mechanisms encoded in various prokaryotic genomes. Many genomic features, e.g., intergenic distance, and transcriptomic features, including continuous and stable RNA-seq read count signals, have been collected from a large amount of experimental data and integrated into classification techniques to computationally predict genome-wide TUs. Although some tools and web servers are able to predict TUs based on bacterial RNA-seq data and genome sequences, there is a need for an improved machine-learning prediction approach and a more comprehensive pipeline handling QC, TU prediction, and TU visualization. To enable users to efficiently perform TU identification on their local computers or high-performance clusters and to provide a more accurate prediction, we develop an R package named rSeqTU. rSeqTU uses a random forest algorithm to select essential features describing TUs and then uses a support vector machine (SVM) to build TU prediction models. rSeqTU (available at https://s18692001.github.io/rSeqTU/) has six computational functionalities: read quality control, read mapping, training set generation, random-forest-based feature selection, TU prediction, and TU visualization.

Boolean factor graph model for biological systems: the yeast cell-cycle network

BMC Bioinformatics ◽

10.1186/s12859-021-04361-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Stephen Kotiang ◽

Ali Eslami

Keyword(s):

Cell Cycle ◽

Yeast Cell ◽

Gene Regulatory Networks ◽

Biological Networks ◽

Error Propagation ◽

Regulatory Networks ◽

Biological Systems ◽

Yeast Cell Cycle ◽

Computational Framework ◽

Gene Regulatory

Abstract. Background: The desire to understand genomic functions and the behavior of complex gene regulatory networks has recently been a major research focus in systems biology. As a result, a plethora of computational and modeling tools have been proposed to identify and infer interactions among biological entities. Here, we consider the general question of the effect of perturbation on the global dynamical network behavior as well as error propagation in biological networks to incite research pertaining to intervention strategies. Results: This paper introduces a computational framework that combines the formulation of Boolean networks and factor graphs to explore the global dynamical features of biological systems. A message-passing algorithm is proposed for this formalism to evolve network states as messages in the graph. In addition, the mathematical formulation allows us to describe the dynamics and behavior of error propagation in gene regulatory networks by conducting a density evolution (DE) analysis. The model is applied to assess the network state progression and the impact of gene deletion in the budding yeast cell cycle. Simulation results show that our model predictions match published experimental data. Also, our findings reveal that the sample yeast cell-cycle network is not only robust but also consistent with real high-throughput expression data. Finally, our DE analysis serves as a tool to find the optimal values of network parameters for resilience against perturbations, especially in the inference of genetic graphs. Conclusion: Our computational framework provides a useful graphical model and analytical tools to study biological networks. It can be a powerful tool to predict the consequences of gene deletions before conducting wet bench experiments because it proves to be a quick route to predicting biologically relevant dynamic properties without tunable kinetic parameters.
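
To illustrate the Boolean-network part of the abstract above, here is a minimal sketch of a plain synchronous Boolean update with an optional gene deletion. It does not implement the paper's factor-graph message passing or density evolution, and the three-gene network (A activates B, B activates C, C represses A) is a hypothetical toy, not the yeast cell-cycle network.

```python
def step(state, rules, deleted=frozenset()):
    """Apply every update rule once, synchronously; deleted genes stay off."""
    return {
        gene: False if gene in deleted else rule(state)
        for gene, rule in rules.items()
    }


def trajectory(state, rules, deleted=frozenset(), max_steps=50):
    """Iterate until a state repeats; return the visited states and the repeat."""
    state = {g: (False if g in deleted else v) for g, v in state.items()}
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state, rules, deleted)
    return seen, state


# Hypothetical toy rules: A activates B, B activates C, C represses A
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
start = {"A": True, "B": False, "C": False}

wild_type, wt_repeat = trajectory(start, rules)
knockout, ko_repeat = trajectory(start, rules, deleted=frozenset({"C"}))
```

In this toy example the intact network cycles through six states and returns to `start`, whereas deleting C removes the repression of A and the network settles into a fixed point with A and B on.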

Single Cell RNA-Seq and Machine Learning Reveal Novel Subpopulations in Low-Grade Inflammatory Monocytes With Unique Regulatory Circuits

Frontiers in Immunology ◽

10.3389/fimmu.2021.627036 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jiyoung Lee ◽

Shuo Geng ◽

Song Li ◽

Liwu Li

Keyword(s):

Machine Learning ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Regulatory Genes ◽

Low Grade ◽

Machine Learning Method ◽

Learning Method ◽

Cell Clusters ◽

Inflammatory Monocytes ◽

Gene Regulatory

Subclinical doses of LPS (SD-LPS) are known to cause low-grade inflammatory activation of monocytes, which could lead to inflammatory diseases including atherosclerosis and metabolic syndrome. Sodium 4-phenylbutyrate is a potential therapeutic compound which can reduce the inflammation caused by SD-LPS. To understand the gene regulatory networks of these processes, we have generated scRNA-seq data from mouse monocytes treated with these compounds and identified 11 novel cell clusters. We have developed a machine learning method to integrate scRNA-seq, ATAC-seq, and binding motifs to characterize gene regulatory networks underlying these cell clusters. Using guided regularized random forest and feature selection, our method achieved high performance and outperformed a traditional enrichment-based method in selecting candidate regulatory genes. Our method is particularly efficient in selecting a few candidate genes to explain the observed expression patterns. In particular, among 531 candidate TFs, our method achieves an auROC of 0.961 with only 10 motifs. Finally, we found two novel subpopulations of monocyte cells in response to SD-LPS and we confirmed our analysis using independent flow cytometry experiments. Our results suggest that our new machine learning method can select candidate regulatory genes as potential targets for developing new therapeutics against low-grade inflammation.

Spatial heterogeneity of the cytosol revealed by machine learning-based 3D particle tracking

Molecular Biology of the Cell ◽

10.1091/mbc.e20-03-0210 ◽

2020 ◽

Vol 31 (14) ◽

pp. 1498-1511 ◽

Cited By ~ 1

Author(s):

Grace A. McLaughlin ◽

Erin M. Langdon ◽

John M. Crutchley ◽

Liam J. Holt ◽

M. Gregory Forest ◽

...

Keyword(s):

Machine Learning ◽

Spatial Heterogeneity ◽

Particle Tracking ◽

Cell Biology ◽

Physical State ◽

Biological Systems ◽

Physical Structure ◽

Length Scales ◽

3D Motion ◽

In Cells

The structure of the cytosol across different length scales is a debated topic in cell biology. Here we present tools to measure the physical state of the cytosol by analyzing the 3D motion of nanoparticles expressed in cells. We find evidence that the physical structure of the cytosol is a fundamental source of variability in biological systems.
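
The abstract above does not say which statistics the authors compute from the 3D tracks. A common first quantity for tracked particles is the time-averaged mean squared displacement (MSD), sketched below under that assumption. The function name and the synthetic random-walk track are illustrative only.

```python
import random


def mean_squared_displacement(track, max_lag):
    """Time-averaged MSD of one 3D track for lags 1..max_lag.

    `track` is a list of (x, y, z) positions sampled at equal time steps;
    max_lag must be smaller than len(track).
    """
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            sum((b - a) ** 2 for a, b in zip(track[i], track[i + lag]))
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


# Synthetic track: an unbiased 3D random walk with Gaussian steps
rng = random.Random(1)
position = (0.0, 0.0, 0.0)
walk = [position]
for _ in range(500):
    position = tuple(c + rng.gauss(0.0, 1.0) for c in position)
    walk.append(position)

walk_msd = mean_squared_displacement(walk, max_lag=10)
```

For free diffusion the MSD grows roughly linearly with lag, while for straight-line motion it grows with the square of the lag, so the shape of the curve gives a first hint about the local physical environment of a particle.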
