BioNetLink - An Architecture for Working with Network Data

Summary: The visualization of biological data has gained increasing importance in recent years. A large number of methods and software tools are available that visualize biological data, including the combination of measured experimental data and biological networks. As networks grow in size, handling and exploring them becomes a challenging task for the user. In addition, scientists are interested not just in investigating a single kind of network, but in combining different types of networks, such as metabolic, gene regulatory and protein interaction networks. Therefore, fast access, abstract and dynamic views, and intuitive exploratory methods should be provided to search and extract information from the networks. This paper introduces a conceptual framework for handling and combining multiple network sources that enables abstract viewing and exploration of large data sets, including additional experimental data. It introduces a three-tier structure that links network data to multiple network views, discusses a proof-of-concept implementation, and shows a specific visualization method for combining metabolic and gene regulatory networks in an example.

Inference of gene regulatory networks using pseudo-time series data

Bioinformatics ◽

10.1093/bioinformatics/btab099 ◽

2021 ◽

Author(s):

Yuelei Zhang ◽

Xiao Chang ◽

Xiaoping Liu

Keyword(s):

Time Series ◽

Gene Regulatory Networks ◽

Biological Networks ◽

Gene Networks ◽

Regulatory Networks ◽

Time Series Data ◽

Biological Data ◽

Supplementary Information ◽

Series Data ◽

Gene Regulatory

Abstract: Motivation: Inferring gene regulatory networks (GRNs) from high-throughput data is an important and challenging problem in systems biology. Although numerous GRN methods have been developed, most have focused on validation against specific datasets. However, it is difficult to establish directed topological networks that are suitable for both time-series and non-time-series datasets due to the complexity and diversity of biological networks. Results: Here, we proposed a novel method, GNIPLR (Gene networks inference based on projection and lagged regression), to infer GRNs from time-series or non-time-series gene expression data. GNIPLR projected gene data twice using the LASSO projection (LSP) algorithm and the linear projection (LP) approximation to produce a linear and monotonic pseudo-time series, and then determined the direction of regulation in combination with lagged regression analyses. The proposed algorithm was validated using simulated and real biological data. Moreover, we also applied the GNIPLR algorithm to the liver hepatocellular carcinoma (LIHC) and bladder urothelial carcinoma (BLCA) cancer expression datasets. These analyses revealed significantly higher accuracy and AUC values than other popular methods. Availability and implementation: The GNIPLR tool is freely available at https://github.com/zyllluck/GNIPLR. Supplementary information: Supplementary data are available at Bioinformatics online.
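
The role of lagged regression in choosing a regulatory direction can be illustrated with a small sketch. This is not the GNIPLR implementation: it assumes two expression series already ordered along a (pseudo-)time axis, uses a single lag and a simple one-variable least-squares fit, and the function names `lagged_r2` and `infer_direction` are hypothetical.

```python
def lagged_r2(cause, effect, lag=1):
    """R^2 of the simple linear regression effect[t] ~ cause[t - lag].

    Assumes both series have the same length and are ordered in (pseudo-)time.
    """
    if lag < 1 or lag >= len(cause):
        raise ValueError("lag must be between 1 and len(series) - 1")
    x = cause[:-lag]   # earlier values of the candidate regulator
    y = effect[lag:]   # later values of the candidate target
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0     # a constant series explains nothing
    return (sxy * sxy) / (sxx * syy)


def infer_direction(a, b, lag=1):
    """Pick the direction whose lagged fit is better (illustrative rule only)."""
    r_ab = lagged_r2(a, b, lag)
    r_ba = lagged_r2(b, a, lag)
    return ("a->b", r_ab) if r_ab >= r_ba else ("b->a", r_ba)


# Toy data: b follows a with a delay of one step (b[0] is arbitrary).
a = [1, 4, 2, 8, 5, 7, 3, 9, 6, 0]
b = [0] + [2 * v + 1 for v in a[:-1]]
direction, score = infer_direction(a, b)
```

On the toy data the fit of `b` on lagged `a` is exact, so the sketch reports `a->b`. A real analysis would need significance testing and multiple regulators per target, which this sketch omits.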

Degeneracy measures in biologically plausible random Boolean networks

10.1101/2021.04.29.441989 ◽

2021 ◽

Author(s):

Basak Kocaoglu ◽

William Alexander

Keyword(s):

Gene Regulatory Networks ◽

Biological Networks ◽

Regulatory Networks ◽

Biological Systems ◽

Boolean Networks ◽

Data Sets ◽

Random Boolean Networks ◽

Weighted Networks ◽

Underlying Network ◽

Gene Regulatory

Degeneracy, the ability of structurally different elements to perform similar functions, is a property of many biological systems. Systems exhibiting a high degree of degeneracy continue to exhibit the same macroscopic behavior following a lesion even though the underlying network dynamics are significantly different. Degeneracy thus suggests how biological systems can thrive despite changes to internal and external demands. Although degeneracy is a feature of network topologies and seems to be implicated in a wide variety of biological processes, research on degeneracy in biological networks is mostly limited to weighted networks (e.g., neural networks). To date, there has been no extensive investigation of information theoretic measures of degeneracy in other types of biological networks. In this paper, we apply existing approaches for quantifying degeneracy to random Boolean networks used for modeling biological gene regulatory networks. Using random Boolean networks with randomly generated rulesets to generate synthetic gene expression data sets, we systematically investigate the effect of network lesions on measures of degeneracy. Our results are comparable to measures of degeneracy using weighted networks, and this suggests that degeneracy measures may be a useful tool for investigating gene regulatory networks.
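
A minimal sketch of how a random Boolean network with randomly generated rulesets can produce synthetic expression trajectories, with and without a lesion, is given below. It assumes synchronous updates, a fixed number of inputs `k` per node, and a lesion modeled as clamping a node to 0; these choices and all function names are illustrative assumptions rather than the authors' setup, and the information-theoretic degeneracy measures themselves are not computed here.

```python
import random


def make_rbn(n, k, seed=0):
    """Build a random Boolean network: each of n nodes gets k random
    input nodes and a random truth table with 2**k entries."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables, lesioned=frozenset()):
    """Synchronously update every node; lesioned nodes stay clamped to 0."""
    new = []
    for i in range(len(state)):
        if i in lesioned:
            new.append(0)
            continue
        idx = 0
        for j in inputs[i]:            # encode the input states as a table index
            idx = (idx << 1) | state[j]
        new.append(tables[i][idx])
    return tuple(new)


def simulate(inputs, tables, start, steps, lesioned=frozenset()):
    """Return the trajectory (list of states) of length steps + 1."""
    state = tuple(0 if i in lesioned else s for i, s in enumerate(start))
    trajectory = [state]
    for _ in range(steps):
        state = step(state, inputs, tables, lesioned)
        trajectory.append(state)
    return trajectory


inputs, tables = make_rbn(n=8, k=2, seed=42)
start = (1, 0, 1, 1, 0, 0, 1, 0)
intact = simulate(inputs, tables, start, steps=20)
lesion = simulate(inputs, tables, start, steps=20, lesioned=frozenset({3}))
```

Comparing `intact` and `lesion` trajectories is the kind of input a degeneracy measure would consume; the comparison metric is left open here.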

A single-cell expression simulator guided by gene regulatory networks

10.1101/716811 ◽

2019 ◽

Cited By ~ 5

Author(s):

Payam Dibaeinia ◽

Saurabh Sinha

Keyword(s):

Experimental Data ◽

Single Cell ◽

Regulatory Networks ◽

Single Cell Analysis ◽

Single Cells ◽

Synthetic Data ◽

Cell Types ◽

Data Sets ◽

Statistical Measures ◽

Gene Regulatory

Abstract: A common approach to benchmarking of single-cell transcriptomics tools is to generate synthetic data sets that resemble experimental data in their statistical properties. However, existing single-cell simulators do not incorporate known principles of transcription factor-gene regulatory interactions that underlie expression dynamics. Here we present SERGIO, a simulator of single-cell gene expression data that models the stochastic nature of transcription as well as linear and non-linear influences of multiple transcription factors on genes according to a user-provided gene regulatory network. SERGIO is capable of simulating any number of cell types in steady-state or cells differentiating to multiple fates according to a provided trajectory, reporting both unspliced and spliced transcript counts in single cells. We show that data sets generated by SERGIO are comparable with experimental data in terms of multiple statistical measures. We also illustrate the use of SERGIO to benchmark several popular single-cell analysis tools, including GRN inference methods.

ReactomeFIViz: the Reactome FI Cytoscape app for pathway and network-based data analysis

F1000Research ◽

10.12688/f1000research.4431.1 ◽

2014 ◽

Vol 3 ◽

pp. 146 ◽

Cited By ~ 2

Author(s):

Guanming Wu ◽

Eric Dawson ◽

Adrian Duong ◽

Robin Haw ◽

Lincoln Stein

Keyword(s):

Experimental Data ◽

Data Analysis ◽

Graphical Models ◽

High Throughput ◽

Interaction Network ◽

Large Data ◽

Relevant Information ◽

Data Sets ◽

Data Types ◽

Biological Studies

High-throughput experiments are routinely performed in modern biological studies. However, extracting meaningful results from massive experimental data sets is a challenging task for biologists. Projecting data onto pathway and network contexts is a powerful way to unravel patterns embedded in seemingly scattered large data sets and assist knowledge discovery related to cancer and other complex diseases. We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes a highly reliable gene functional interaction network and human curated pathways from Reactome and other pathway databases. This app provides a suite of features to assist biologists in performing pathway- and network-based data analysis in a biologically intuitive and user-friendly way. Biologists can use this app to uncover network and pathway patterns related to their studies, search for gene signatures from gene expression data sets, reveal pathways significantly enriched by genes in a list, and integrate multiple genomic data types into a pathway context using probabilistic graphical models. We believe our app will give researchers substantial power to analyze intrinsically noisy high-throughput experimental data to find biologically relevant information.

Discriminating the Single-cell Gene Regulatory Networks of Human Pancreatic Islets: A Novel Deep Learning Application

10.1101/2020.08.30.273839 ◽

2020 ◽

Author(s):

Turki Turki ◽

Y-h. Taguchi

Keyword(s):

Deep Learning ◽

Single Cell ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Metabolic Diseases ◽

Large Data ◽

Data Repositories ◽

Cell Gene Expression ◽

Gene Regulatory ◽

Cell Gene

Abstract: Analyzing single-cell pancreatic data would play an important role in understanding various metabolic diseases and health conditions. Due to the sparsity and noise present in such single-cell gene expression data, analyzing various functions related to the inference of gene regulatory networks, derived from single-cell data, remains difficult, thereby posing a barrier to a deeper understanding of cellular metabolism. Since recent studies have led to the reliable inference of single-cell gene regulatory networks (SCGRNs), the challenge of discriminating between SCGRNs has now arisen. By accurately discriminating between SCGRNs (e.g., distinguishing SCGRNs of healthy pancreas from those of T2D pancreas), biologists would be able to annotate, organize, visualize, and identify common patterns of SCGRNs for metabolic diseases. Such annotated SCGRNs could play an important role in speeding up the process of building large data repositories. In this study, we aimed to contribute to the development of a novel deep learning (DL) application. First, we generated a dataset consisting of 224 SCGRNs belonging to both T2D and healthy pancreas and made it freely available. Next, we chose seven DL architectures, including VGG16, VGG19, Xception, ResNet50, ResNet101, DenseNet121, and DenseNet169, trained each of them on the dataset, and evaluated their predictions on a test set. We evaluated the DL architectures on an HP workstation platform with a single NVIDIA GeForce RTX 2080Ti GPU. Experimental results on the whole dataset, using several performance measures, demonstrated the superiority of the VGG19 DL model in the automatic classification of SCGRNs derived from the single-cell pancreatic data.

Application of deep learning methods in biological networks

Briefings in Bioinformatics ◽

10.1093/bib/bbaa043 ◽

2020 ◽

Cited By ~ 1

Author(s):

Shuting Jin ◽

Xiangxiang Zeng ◽

Feng Xia ◽

Wei Huang ◽

Xiangrong Liu

Keyword(s):

Deep Learning ◽

Biological Networks ◽

Learning Algorithm ◽

Biological Data ◽

Training Data ◽

Learning Ability ◽

Network Data ◽

Complex Data ◽

Layer By Layer ◽

Information Layer

Abstract: The increase in biological data and the formation of various biomolecule interaction databases enable us to obtain diverse biological networks. These biological networks provide a wealth of raw materials for further understanding of biological systems, the discovery of complex diseases and the search for therapeutic drugs. However, the increase in data also increases the difficulty of biological network analysis. Therefore, algorithms that can handle large, heterogeneous and complex data are needed to better analyze the data of these network structures and mine their useful information. Deep learning is a branch of machine learning that extracts more abstract features from a larger set of training data. Through the establishment of an artificial neural network with a network hierarchy structure, deep learning can extract and screen the input information layer by layer and has representation learning ability. The improved deep learning algorithm can be used to process complex and heterogeneous graph data structures and is increasingly being applied to the mining of network data information. In this paper, we first introduce the deep learning models used for network data. Afterwards, we summarize the application of deep learning on biological networks. Finally, we discuss the future development prospects of this field.

An artificial-intelligence technique for qualitatively deriving enzyme kinetic mechanisms from initial-velocity measurements and its application to hexokinase

Biochemical Journal ◽

10.1042/bj2640175 ◽

1989 ◽

Vol 264 (1) ◽

pp. 175-184 ◽

Cited By ~ 2

Author(s):

L Garfinkel ◽

D M Cohen ◽

V W Soo ◽

D Garfinkel ◽

C A Kulikowski

Keyword(s):

Artificial Intelligence ◽

Experimental Data ◽

Ionic Strength ◽

Initial Velocity ◽

Large Data ◽

Data Sets ◽

Enzyme Kinetic ◽

Experimental Conditions ◽

Enzyme Preparations ◽

Kinetic Mechanisms

We have developed a computer method based on artificial-intelligence techniques for qualitatively analysing steady-state initial-velocity enzyme kinetic data. We have applied our system to experiments on hexokinase from a variety of sources: yeast, ascites and muscle. Our system accepts qualitative stylized descriptions of experimental data, infers constraints from the observed data behaviour and then compares the experimentally inferred constraints with corresponding theoretical model-based constraints. It is desirable to have large data sets which include the results of a variety of experiments. Human intervention is needed to interpret non-kinetic information, differences in conditions, etc. Different strategies were used by the several experimenters whose data was studied to formulate mechanisms for their enzyme preparations, including different methods (product inhibitors or alternate substrates), different experimental protocols (monitoring enzyme activity differently), or different experimental conditions (temperature, pH or ionic strength). The different ordered and rapid-equilibrium mechanisms proposed by these experimenters were generally consistent with their data. On comparing the constraints derived from the several experimental data sets, they are found to be in much less disagreement than the mechanisms published, and some of the disagreement can be ascribed to different experimental conditions (especially ionic strength).

CytoMCS: A Multiple Maximum Common Subgraph Detection Tool for Cytoscape

Journal of Integrative Bioinformatics ◽

10.1515/jib-2017-0014 ◽

2017 ◽

Vol 14 (2) ◽

Cited By ~ 3

Author(s):

Simon J. Larsen ◽

Jan Baumbach

Keyword(s):

Biological Networks ◽

Regulatory Networks ◽

Conservation Score ◽

Common Edge ◽

Simple Graphs ◽

Protein Protein Interaction ◽

Local Search Heuristic ◽

Gene Regulatory ◽

Protein Protein Interaction Networks ◽

Or Gene

Abstract: Comparative analysis of biological networks is a major problem in computational integrative systems biology. By computing the maximum common edge subgraph between a set of networks, one is able to detect conserved substructures between them and quantify their topological similarity. To aid such analyses we have developed CytoMCS, a Cytoscape app for computing inexact solutions to the maximum common edge subgraph problem for two or more graphs. Our algorithm uses an iterative local search heuristic for computing conserved subgraphs, optimizing a squared edge conservation score that is able to detect not only fully conserved edges but also partially conserved edges. It can be applied to any set of directed or undirected, simple graphs loaded as networks into Cytoscape, e.g. protein-protein interaction networks or gene regulatory networks. CytoMCS is available as a Cytoscape app at http://apps.cytoscape.org/apps/cytomcs.
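
The idea of scoring partially conserved edges can be sketched as follows. The abstract does not define the squared edge conservation score, so the scoring here (sum over aligned edges of the squared number of graphs containing that edge) is an assumed illustration, not CytoMCS's documented formula. The sketch also takes a node alignment as given instead of searching for one, handles undirected graphs only, and `conservation_score` is a hypothetical name.

```python
from collections import Counter


def conservation_score(graphs, alignments):
    """Score a fixed alignment of several undirected simple graphs.

    graphs:     list of edge collections, each edge a (u, v) pair
    alignments: list of dicts mapping each graph's nodes to shared labels
    Returns (score, counts), where counts[edge] is the number of graphs
    containing that aligned edge and score sums the squared counts
    (an assumed stand-in for the score named in the abstract).
    """
    counts = Counter()
    for edges, mapping in zip(graphs, alignments):
        aligned = set()
        for u, v in edges:
            if u in mapping and v in mapping:
                # frozenset makes the edge orientation-independent
                aligned.add(frozenset((mapping[u], mapping[v])))
        counts.update(aligned)  # each graph contributes at most 1 per edge
    return sum(c * c for c in counts.values()), counts


g1 = [("a", "b"), ("b", "c")]
g2 = [("x", "y"), ("y", "z"), ("x", "z")]
m1 = {"a": 1, "b": 2, "c": 3}
m2 = {"x": 1, "y": 2, "z": 3}
score, counts = conservation_score([g1, g2], [m1, m2])
```

Squaring the counts rewards edges shared by many graphs more than the same number of edge occurrences spread thinly, which is one way a score can distinguish fully from partially conserved edges. A local search heuristic would repeatedly modify the alignments and keep changes that raise this score.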

Combining multiple types of biological data in constraint-based learning of gene regulatory networks

2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology ◽

10.1109/cibcb.2008.4675764 ◽

2008 ◽

Cited By ~ 11

Author(s):

Mehmet Tan ◽

Mohammed AlShalalfa ◽

Reda Alhajj ◽

Faruk Polat

Keyword(s):

Gene Regulatory Networks ◽

Regulatory Networks ◽

Biological Data ◽

Gene Regulatory

Robustness and lethality in multilayer biological molecular networks

10.1101/818963 ◽

2019 ◽

Author(s):

Xueming Liu ◽

Enrico Maiorino ◽

Arda Halu ◽

Joseph Loscalzo ◽

Jianxi Gao ◽

...

Keyword(s):

Biological Networks ◽

Regulatory Networks ◽

Metabolic Diseases ◽

Biological Systems ◽

Interaction Network ◽

Network Models ◽

Global Network ◽

Molecular Networks ◽

Cancer Genes ◽

Gene Regulatory

Abstract: Robustness is a prominent feature of most biological systems. In a cell, the structure of the interactions between genes, proteins, and metabolites has a crucial role in maintaining the cell’s functionality and viability in the presence of external perturbations and noise. Despite advances in characterizing the robustness of biological systems, most of the current efforts have been focused on studying homogeneous molecular networks in isolation, such as protein-protein or gene regulatory networks, neglecting the interactions among different molecular substrates. Here we propose a comprehensive framework for understanding how the interactions between genes, proteins and metabolites contribute to the determinants of robustness in a heterogeneous biological network. We integrate heterogeneous sources of data to construct a multilayer interaction network composed of a gene regulatory layer, a protein-protein interaction layer and a metabolic layer. We design a simulated perturbation process to characterize the contribution of each gene to the overall system’s robustness, defined as its influence over the global network. We find that highly influential genes are enriched in essential and cancer genes, confirming the central role of these genes in critical cellular processes. Further, we determine that the metabolic layer is more vulnerable to perturbations involving genes associated with metabolic diseases. By comparing the robustness of the network to multiple randomized network models, we find that the real network is comparably or more robust than expected in the random realizations. Finally, we analytically derive the expected robustness of multilayer biological networks starting from the degree distributions within or between layers. These results provide new insights into the non-trivial dynamics occurring in the cell after a genetic perturbation is applied, confirming the importance of including the coupling between different layers of interaction in models of complex biological systems.
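
A toy version of measuring a gene's influence over a multilayer network is sketched below. The abstract does not specify the perturbation process, so this sketch substitutes plain directed reachability: a perturbation at one node is assumed to spread along every outgoing edge, within and between layers. Node labels, the edge list, and the `influence` function are all hypothetical, and treating protein-protein interactions as directed is a simplification.

```python
from collections import deque


def influence(edges, source):
    """Fraction of the other nodes reachable from source via directed edges.

    Nodes are (layer, name) tuples, so intra- and inter-layer edges are
    handled uniformly by a breadth-first search.
    """
    adjacency = {}
    nodes = {source}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        nodes.update((u, v))
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return (len(seen) - 1) / max(len(nodes) - 1, 1)


edges = [
    (("gene", "G1"), ("gene", "G2")),           # regulation within the gene layer
    (("gene", "G1"), ("protein", "P1")),        # gene-to-protein coupling
    (("gene", "G2"), ("protein", "P2")),
    (("protein", "P1"), ("protein", "P2")),     # interaction within the protein layer
    (("protein", "P2"), ("metabolite", "M1")),  # protein-to-metabolite coupling
    (("gene", "G3"), ("protein", "P3")),
]
influence_g1 = influence(edges, ("gene", "G1"))
influence_g3 = influence(edges, ("gene", "G3"))
```

In this toy network `G1` reaches four of the six other nodes across all three layers, while `G3` reaches only its own protein, which shows why ignoring inter-layer edges would understate a gene's influence.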