TuxNet: A simple interface to process RNA sequencing data and infer gene regulatory networks

2018 ◽  
Author(s):  
Maria Angels de Luis Balaguer ◽  
Ryan J. Spurney ◽  
Natalie M. Clark ◽  
Adam P. Fisher ◽  
Rosangela Sozzani

ABSTRACT Predicting gene regulatory networks (GRNs) from gene expression profiles has become a common approach for identifying important biological regulators. Despite the increasing use of inference methods, existing computational approaches do not integrate RNA-sequencing data analysis, are often not automated, and are restricted to users with bioinformatics and programming backgrounds. To address these limitations, we have developed TuxNet, an integrated, user-friendly platform that, with just a few selections, allows users to process raw RNA-sequencing data (using the Tuxedo pipeline) and infer GRNs from the processed data. TuxNet is implemented as a graphical user interface and, using expression data from any organism with an existing reference genome, can mine the regulations among genes by applying either a dynamic Bayesian network inference algorithm, GENIST, or a regression tree-based pipeline that uses spatiotemporal data, RTP-STAR. To illustrate the use of TuxNet while gaining insight into the regulatory cascade downstream of the Arabidopsis root stem cell regulator PERIANTHIA (PAN), we obtained time course gene expression data of a PAN inducible line and inferred a GRN using GENIST. Using RTP-STAR, we then inferred the network of a PAN secondary downstream gene, ATHB13, for which we obtained wildtype and mutant expression profiles. Our case studies highlight the versatility of TuxNet in inferring networks from different types of gene expression data (i.e., time course and steady-state data), as well as how inferred networks are used to identify important regulators.

SUMMARY TuxNet offers a simple interface for non-computational biologists to infer GRNs from raw RNA-seq data.

2021 ◽  
Author(s):  
Hakimeh Khojasteh ◽  
Mohammad Hossein Olyaee ◽  
Alireza Khanteymoori

The development of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many machine learning methods, including supervised, unsupervised, and semi-supervised approaches, have been developed to infer GRNs. Most of these methods ignore the class imbalance problem, which can reduce the accuracy of predicted regulatory interactions in the network. Developing an effective method that accounts for imbalanced data therefore remains a challenge. In this paper, we propose EnGRNT, an ensemble-based approach for inferring GRNs with high accuracy. In addition to the gene expression data, the proposed approach considers the topological features of the GRN. We applied our approach to the simulated Escherichia coli dataset. Experimental results demonstrate that the appropriateness of the inference method depends on the size and type of the expression profiles in the microarray data. Except for multifactorial experimental conditions, the proposed approach outperforms unsupervised methods. The results support applying EnGRNT to imbalanced datasets.


2018 ◽  
Vol 27 (7) ◽  
pp. 1930-1955 ◽  
Author(s):  
Michelle Carey ◽  
Juan Camilo Ramírez ◽  
Shuang Wu ◽  
Hulin Wu

A biological host response to an external stimulus or intervention such as a disease or infection is a dynamic process, which is regulated by an intricate network of many genes and their products. Understanding the dynamics of this gene regulatory network allows us to infer the mechanisms involved in a host response to an external stimulus, and hence aids the discovery of biomarkers of phenotype and biological function. In this article, we propose a modeling/analysis pipeline for dynamic gene expression data, called Pipeline4DGEData, which consists of a series of statistical modeling techniques to construct dynamic gene regulatory networks from the large volumes of high-dimensional time-course gene expression data that are freely available in the Gene Expression Omnibus repository. This pipeline has a consistent and scalable structure that allows it to simultaneously analyze a large number of time-course gene expression data sets, and then integrate the results across different studies. We apply the proposed pipeline to influenza infection data from nine studies and demonstrate that interesting biological findings can be discovered with its implementation.


2019 ◽  
Vol 101 (3) ◽  
pp. 716-730 ◽  
Author(s):  
Ryan J. Spurney ◽  
Lisa Van den Broeck ◽  
Natalie M. Clark ◽  
Adam P. Fisher ◽  
Maria A. de Luis Balaguer ◽  
...  

Author(s):  
Gustavo H. Esteves ◽  
Luiz F. L. Reis

Abstract Motivation: Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. Results: We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The actual probability distribution of the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and measure the activation of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. Availability: This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
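
As a rough illustration of the permutation-based evaluation mentioned in the abstract above, the following Python sketch estimates a p-value by repeatedly shuffling sample labels. It is not the maigesPack implementation (which is an R package) and does not reproduce the authors' test statistic: the statistic used here (absolute difference in group means), the function name `permutation_p_value`, and the input values are all assumptions chosen only to show the general idea.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided permutation p-value.

    Assumed statistic (illustrative only): absolute difference between
    the mean values of the two groups of samples.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))

    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    hits = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values is equivalent to permuting group labels.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical activation scores for two tissue types (made-up numbers).
tissue_a = [10, 11, 12, 13, 14, 15]
tissue_b = [0, 1, 2, 3, 4, 5]
p = permutation_p_value(tissue_a, tissue_b)
```

The null distribution is built empirically from the shuffled data rather than assumed to follow a known parametric form, which is the appeal of permutation approaches when the distribution of a custom statistic is unknown.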

