An Overview of the Statistical Methods Used for Inferring Gene Regulatory Networks and Protein-Protein Interaction Networks

The large influx of data from high-throughput genomic and proteomic technologies has encouraged researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on recent advances in statistical graphical modeling techniques, state-space representation models, and information-theoretic methods proposed for inferring the topology of GRNs. The problem of inferring the structure of PPI networks appears to be quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some recent approaches using these techniques are also reviewed. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed.
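
As a rough illustration of the information-theoretic family mentioned in this abstract, the sketch below scores gene pairs by mutual information between discretized expression profiles and keeps pairs above a cutoff. It is a minimal example under stated assumptions, not a method taken from the review: the function names, the equal-width binning, the threshold, and the toy expression values are all hypothetical.

```python
from collections import Counter
from math import log2


def discretize(profile, n_bins=3):
    """Map a numeric expression profile to equal-width bin indices."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0] * len(profile)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in profile]


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written with raw counts.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def infer_edges(expression, threshold, n_bins=3):
    """Return undirected gene pairs whose estimated MI exceeds `threshold`."""
    genes = sorted(expression)
    binned = {g: discretize(expression[g], n_bins) for g in genes}
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            mi = mutual_information(binned[g], binned[h])
            if mi > threshold:
                edges[(g, h)] = mi
    return edges


# Hypothetical toy data: geneB is a scaled copy of geneA; geneC is unrelated.
toy_expression = {
    "geneA": [0.1, 0.9, 0.2, 0.8, 0.1, 0.9],
    "geneB": [0.2, 1.8, 0.4, 1.6, 0.2, 1.8],
    "geneC": [0.1, 0.1, 0.5, 0.5, 0.9, 0.9],
}
toy_edges = infer_edges(toy_expression, threshold=0.5)
```

Mutual information is symmetric, so a score of this kind yields undirected associations only; deciding the direction of regulation needs additional information or modeling.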

Kaposi’s sarcoma: a computational approach through protein–protein interaction and gene regulatory networks analysis

Virus Genes ◽

10.1007/s11262-012-0865-z ◽

2012 ◽

Vol 46 (2) ◽

pp. 242-254 ◽

Cited By ~ 4

Author(s):

Aubhishek Zaman ◽

Md. Habibur Rahaman ◽

Samsad Razzaque

Keyword(s):

Gene Regulatory Networks ◽

Protein Interaction ◽

Regulatory Networks ◽

Kaposi's Sarcoma ◽

Computational Approach ◽

Protein Protein Interaction ◽

Networks Analysis ◽

Gene Regulatory

A Linear Programming Framework for Inferring Gene Regulatory Networks by Integrating Heterogeneous Data

Handbook of Research on Computational Methodologies in Gene Regulatory Networks ◽

10.4018/978-1-60566-685-3.ch019 ◽

2010 ◽

pp. 450-475 ◽

Cited By ~ 1

Author(s):

Yong Wang ◽

Rui-Sheng Wang ◽

Trupti Joshi ◽

Dong Xu ◽

Xiang-Sun Zhang ◽

...

Keyword(s):

Linear Programming ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Heterogeneous Data ◽

Data Sources ◽

Protein Interaction Data ◽

Interaction Data ◽

Protein Protein Interaction ◽

Programming Framework ◽

Gene Regulatory

Many heterogeneous data sources are closely related to gene regulatory networks. These data sources provide rich information for depicting complex biological processes at different levels and from different aspects. Here, we introduce a linear programming framework to infer gene regulatory networks. Within this framework, we extensively integrate the available information derived from multiple time-course expression datasets, ChIP-chip data, regulatory motif-binding patterns, protein-protein interaction data, protein-small molecule interaction data, and documented regulatory relationships in the literature and databases. Results on both synthetic and real experimental data demonstrate that the linear programming framework allows us to recover gene regulations in a more robust and reliable manner.
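
The abstract does not state the optimization problem itself. As a sketch of how network inference from several time-course datasets could be posed as a linear program, one possible formulation is given below. Every symbol is an assumption introduced for illustration: $X^{(k)}$ is the expression matrix of dataset $k$, $\dot{X}^{(k)}$ its finite-difference time derivative, $J$ the unknown regulatory matrix, $w_k$ a per-dataset weight, and $\lambda$ a sparsity weight.

```latex
% Hypothetical L1 formulation (not taken from the chapter):
\min_{J}\; \sum_{k} w_k \,\bigl\lVert \dot{X}^{(k)} - J X^{(k)} \bigr\rVert_{1}
          \;+\; \lambda \,\lVert J \rVert_{1}

% Equivalent linear program with slack matrices E^{(k)} and U,
% all inequalities taken entrywise:
\min_{J,\,E^{(k)},\,U}\; \sum_{k} w_k \sum_{i,t} E^{(k)}_{it}
          \;+\; \lambda \sum_{i,j} U_{ij}
\quad\text{s.t.}\quad
-E^{(k)} \le \dot{X}^{(k)} - J X^{(k)} \le E^{(k)},
\qquad
-U \le J \le U .
```

In a formulation of this kind, prior evidence such as ChIP-chip binding or documented regulatory relationships could enter as extra linear constraints (for example, forcing $J_{ij} = 0$ where no regulation is plausible) or as entry-specific weights in the sparsity term. This is an assumption about how such data might be integrated, not a description of the authors' method.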

Common ancestry of heterodimerizing TALE homeobox transcription factors across Metazoa and Archaeplastida

10.1101/389700 ◽

2018 ◽

Author(s):

Sunjoo Joo ◽

Ming Hsiu Wang ◽

Gary Lui ◽

Jenny Lee ◽

Andrew Barnas ◽

...

Keyword(s):

Transcription Factors ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Bioinformatics Analysis ◽

Land Plants ◽

Protein Protein Interaction ◽

Functional Conservation ◽

Developmental Mechanisms ◽

Gene Regulatory ◽

Homeobox Transcription Factors

Abstract: Homeobox transcription factors (TFs) in the TALE superclass are deeply embedded in the gene regulatory networks that orchestrate embryogenesis. Knotted-like homeobox (KNOX) TFs, homologous to animal MEIS, have been found to drive the haploid-to-diploid transition in both unicellular green algae and land plants via heterodimerization with other TALE superclass TFs, representing remarkable functional conservation of a developmental TF across lineages that diverged one billion years ago. To delineate the ancestry of TALE-TALE heterodimerization, we analyzed TALE endowment in the algal radiations of Archaeplastida, ancestral to land plants. Homeodomain phylogeny and bioinformatics analysis partitioned TALEs into two broad groups, KNOX and non-KNOX. Each group shares previously defined heterodimerization domains, plant KNOX-homology in the KNOX group and animal PBC-homology in the non-KNOX group, indicating their deep ancestry. Protein-protein interaction experiments showed that the TALEs in the two groups all participated in heterodimerization. These results indicate that the TF dyads consisting of KNOX/MEIS and PBC-containing TALEs must have evolved early in eukaryotic evolution, a likely function being to accurately execute the haploid-to-diploid transitions during sexual development.

Author summary: Complex multicellularity requires elaborate developmental mechanisms, often based on the versatility of heterodimeric transcription factor (TF) interactions. Highly conserved TALE-superclass homeobox TF networks in major eukaryotic lineages suggest deep ancestry of developmental mechanisms. Our results support the hypothesis that in early eukaryotes, the TALE heterodimeric configuration provided transcription-on switches via dimerization-dependent subcellular localization, ensuring execution of the haploid-to-diploid transition only when the gamete fusion is correctly executed between appropriate partner gametes, a system that then diversified in the several lineages that engage in complex multicellular organization.

Validation of Gene Regulatory Networks from Protein-Protein Interaction Data: Application to Cell-Cycle Regulation

Pattern Recognition in Bioinformatics - Lecture Notes in Computer Science ◽

10.1007/978-3-540-75286-8_29 ◽

2007 ◽

pp. 300-310 ◽

Cited By ~ 3

Author(s):

Iti Chaturvedi ◽

Meena Kishore Sakharkar ◽

Jagath C. Rajapakse

Keyword(s):

Cell Cycle ◽

Protein Interaction ◽

Cell Cycle Regulation ◽

Regulatory Networks ◽

Protein Interaction Data ◽

Interaction Data ◽

Protein Protein Interaction ◽

Data Application ◽

Gene Regulatory ◽

Cycle Regulation

Network inference with ensembles of bi-clustering trees

BMC Bioinformatics ◽

10.1186/s12859-019-3104-y ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Konstantinos Pliakos ◽

Celine Vens

Keyword(s):

Machine Learning ◽

Gene Regulatory Networks ◽

Protein Interaction ◽

Protein Interaction Network ◽

Regulatory Networks ◽

Network Inference ◽

Interaction Network ◽

Inference Problem ◽

Gene Regulatory ◽

Biological Entities

Abstract

Background: Network inference is crucial for biomedicine and systems biology. Biological entities and their associations are often modeled as interaction networks. Examples include drug-protein interaction and gene regulatory networks. Studying and elucidating such networks can lead to the comprehension of complex biological processes. However, we usually have only partial knowledge of those networks, and the experimental identification of all the existing associations between biological entities is time-consuming and expensive. Many computational approaches have been proposed over the years for network inference; nonetheless, efficiency and accuracy remain open problems. Here, we propose bi-clustering tree ensembles as a new machine learning method for network inference, extending the traditional tree-ensemble models to the global network setting. The proposed approach addresses the network inference problem as a multi-label classification task. More specifically, the nodes of a network (e.g., drugs or proteins in a drug-protein interaction network) are modelled as samples described by features (e.g., chemical structure similarities or protein sequence similarities). The labels in our setting represent the presence or absence of links connecting the nodes of the interaction network (e.g., drug-protein interactions in a drug-protein interaction network).

Results: We extended traditional tree-ensemble methods, such as extremely randomized trees (ERT) and random forests (RF), to ensembles of bi-clustering trees, integrating background information from both node sets of a heterogeneous network into the same learning framework. We performed an empirical evaluation, comparing the proposed approach to currently used tree-ensemble-based approaches as well as other approaches from the literature. We demonstrated the effectiveness of our approach in different interaction prediction (network inference) settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein and gene regulatory networks. We also applied our proposed method to two versions of a chemical-protein association network extracted from the STITCH database, demonstrating the potential of our model in predicting non-reported interactions.

Conclusions: Bi-clustering trees outperform existing tree-based strategies as well as machine learning methods based on other algorithms. Because our approach is based on tree ensembles, it inherits the advantages of tree-ensemble learning, such as handling of missing values, scalability, and interpretability.
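
The data representation described in this abstract (nodes as feature vectors, links as binary labels) can be sketched as follows. This is only an illustration of the input layout, assuming two node sets with hypothetical names and feature values; it does not implement the bi-clustering trees themselves, and the helper names are not from the paper.

```python
from itertools import product


def label_matrix(row_features, col_features, interactions):
    """Multi-label view: one sample per row node, one binary label per column node."""
    cols = sorted(col_features)
    return {
        r: [1 if (r, c) in interactions else 0 for c in cols]
        for r in sorted(row_features)
    }


def build_pair_dataset(row_features, col_features, interactions):
    """Pairwise view: one sample per (row node, column node) pair.

    Each sample concatenates the features of both nodes, so information
    from both node sets is available to a single learner.
    """
    samples = []
    for r, c in product(sorted(row_features), sorted(col_features)):
        x = row_features[r] + col_features[c]      # features from both node sets
        y = 1 if (r, c) in interactions else 0     # link present or absent
        samples.append(((r, c), x, y))
    return samples


# Hypothetical toy network: two drugs, three proteins, two known links.
drug_features = {"d1": [1.0, 0.2], "d2": [0.2, 1.0]}
protein_features = {"p1": [1.0, 0.1, 0.3], "p2": [0.1, 1.0, 0.4], "p3": [0.3, 0.4, 1.0]}
known_links = {("d1", "p1"), ("d2", "p3")}

labels = label_matrix(drug_features, protein_features, known_links)
pairs = build_pair_dataset(drug_features, protein_features, known_links)
```

Any classifier that accepts feature vectors and binary labels could be trained on either view; the choice of learner is left open here.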
