BioNERO: an all-in-one R/Bioconductor package for comprehensive and easy biological network reconstruction

10.1101/2021.04.10.439287 ◽

2021 ◽

Author(s):

Fabricio Almeida-Silva ◽

Thiago M. Venancio

Keyword(s):

Network Analysis ◽

Network Inference ◽

Biological Network ◽

State Of The Art ◽

R Package ◽

Expression Data ◽

Bioconductor Package ◽

User Friendly ◽

Functional Analyses ◽

Network Comparisons

Currently, standard network analysis workflows rely on many different packages, often requiring users to have a solid statistics and programming background. Here, we present BioNERO, an R package that aims to integrate all aspects of network analysis workflows, including expression data preprocessing, gene coexpression and regulatory network inference, functional analyses, and intra- and interspecies network comparisons. The state-of-the-art methods implemented in BioNERO ensure that users can perform all analyses with a single package in a simple pipeline, without needing to learn a myriad of package-specific syntaxes. BioNERO offers a user-friendly framework that can be easily incorporated in systems biology pipelines. Availability and implementation: The package is available at Bioconductor (http://bioconductor.org/packages/BioNERO).

Modular network inference between miRNA–mRNA expression profiles using weighted co-expression network analysis

Journal of Integrative Bioinformatics ◽

10.1515/jib-2021-0029 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Nisar Wani ◽

Debmalya Barh ◽

Khalid Raza

Keyword(s):

Network Analysis ◽

Mrna Expression ◽

Mirna Expression ◽

Regulatory Networks ◽

Network Inference ◽

Expression Profiles ◽

Interaction Network ◽

Pathway Enrichment Analysis ◽

Expression Data ◽

Cancer Data

Connecting transcriptional and post-transcriptional regulatory networks solves an important puzzle in the elucidation of gene regulatory mechanisms. To decipher the complexity of these connections, we build co-expression network modules for mRNA as well as miRNA expression profiles of breast cancer data. We construct gene and miRNA co-expression modules using the weighted gene co-expression network analysis (WGCNA) method and establish the significance of these modules (genes/miRNAs) for the cancer phenotype. This work also infers an interaction network between the genes of the turquoise module from mRNA expression data and the hubs of the turquoise module from miRNA expression data. A pathway enrichment analysis of the miRNA hubs and some of their targets, performed with the miRsystem web tool, reveals their enrichment in several important pathways associated with the progression of cancer.
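
The core WGCNA step behind such modules can be illustrated with a minimal sketch. It assumes the unsigned soft-thresholding form, where the adjacency between two genes is the absolute Pearson correlation raised to a power beta. The gene names, the toy profiles, and the helper names `soft_threshold_adjacency` and `connectivity` are illustrative assumptions, not code from the paper or from the WGCNA package.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    names = list(profiles)
    adj = {}
    for i in names:
        for j in names:
            if i != j:
                adj[(i, j)] = abs(pearson(profiles[i], profiles[j])) ** beta
    return adj

def connectivity(adj, gene):
    """Sum of a gene's adjacencies; high values mark candidate hubs."""
    return sum(w for (i, _), w in adj.items() if i == gene)

# Toy expression profiles (made-up numbers, four samples per gene).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # weakly anti-correlated with geneA
}
adj = soft_threshold_adjacency(profiles, beta=6)
```

Raising the correlation to a power pushes weak correlations toward zero while leaving strong ones nearly intact, which is why hub genes stand out by their summed adjacency.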

target: an R package to predict combined function of transcription factors

F1000Research ◽

10.12688/f1000research.52173.2 ◽

2021 ◽

Vol 10 ◽

pp. 344

Author(s):

Mahmoud Ahmed ◽

Deok Ryong Kim

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Transcription Factors ◽

Binding Sites ◽

R Package ◽

Expression Data ◽

Bioconductor Package ◽

Chip Experiment ◽

Binding Data ◽

Two Factors

Researchers use ChIP binding data to identify potential transcription factor binding sites. Similarly, they use gene expression data from sequencing or microarrays to quantify the effect of the factor overexpression or knockdown on its targets. Therefore, the integration of the binding and expression data can be used to improve the understanding of a transcription factor function. Here, we implemented the binding and expression target analysis (BETA) in an R/Bioconductor package. This algorithm ranks the targets based on the distances of their assigned peaks from the factor ChIP experiment and the signed statistics from gene expression profiling with factor perturbation. We further extend BETA to integrate two sets of data from two factors to predict their targets and their combined functions. In this article, we briefly describe the workings of the algorithm and provide a workflow with a real dataset for using it. The gene targets and the aggregate functions of transcription factors YY1 and YY2 in HeLa cells were identified. Using the same datasets, we identified the shared targets of the two factors, which were found to be, on average, more cooperatively regulated.
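
The ranking idea described above can be sketched as follows. This is a rough illustration under stated assumptions, not the package's implementation: the distance-decay score `exp(-(0.5 + 4 * d / 100 kb))` is the form commonly attributed to the original BETA method and is assumed here, and the function names, gene names, and numbers are all hypothetical.

```python
from math import exp

def regulatory_potential(peak_distances, scale=100_000):
    """Sum a distance-decayed contribution per peak (assumed decay form)."""
    return sum(exp(-(0.5 + 4 * abs(d) / scale)) for d in peak_distances)

def ranks(scores):
    """Map each gene to its rank, with rank 1 for the largest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: r for r, gene in enumerate(ordered, start=1)}

def rank_product(peaks_by_gene, stat_by_gene):
    """Combine binding and expression evidence; smaller means stronger target."""
    potential = {g: regulatory_potential(d) for g, d in peaks_by_gene.items()}
    binding_rank = ranks(potential)
    expression_rank = ranks({g: abs(s) for g, s in stat_by_gene.items()})
    n = len(potential)
    return {g: (binding_rank[g] / n) * (expression_rank[g] / n) for g in potential}

# Toy inputs: peak-to-TSS distances in bp and signed expression statistics.
peaks_by_gene = {"geneX": [500, 2000], "geneY": [80_000], "geneZ": [10_000]}
stat_by_gene = {"geneX": 5.2, "geneY": -0.3, "geneZ": -4.1}
scores = rank_product(peaks_by_gene, stat_by_gene)
best_target = min(scores, key=scores.get)
```

The sign of the expression statistic is dropped for ranking here; a fuller treatment would keep it to separate induced from repressed targets.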

ParTBC: Faster Estimation of Top-k Betweenness Centrality Vertices on GPU

ACM Transactions on Design Automation of Electronic Systems ◽

10.1145/3486613 ◽

2022 ◽

Vol 27 (2) ◽

pp. 1-25

Author(s):

Somesh Singh ◽

Tejas Shah ◽

Rupesh Nasre

Keyword(s):

Network Analysis ◽

Real World ◽

Betweenness Centrality ◽

Biological Network ◽

State Of The Art ◽

Shortest Paths ◽

Weighted Graphs ◽

Content Type ◽

Practical Applications ◽

Biological Network Analysis

Betweenness centrality (BC) is a popular centrality measure, based on shortest paths, used to quantify the importance of vertices in networks. It is used in a wide array of applications including social network analysis, community detection, clustering, biological network analysis, and several others. The state-of-the-art Brandes’ algorithm for computing BC has time complexities of $O(|V||E|)$ and $O(|V||E| + |V|^2 \log |V|)$ for unweighted and weighted graphs, respectively. Brandes’ algorithm has been successfully parallelized on multicore and manycore platforms. However, the computation of vertex BC continues to be time-consuming for large real-world graphs. Often, in practical applications, it suffices to identify the most important vertices in a network; that is, those having the highest BC values. Such applications demand only the top vertices in the network as per their BC values but do not demand their actual BC values. In such scenarios, not only is computing the BC of all the vertices unnecessary but also exact BC values need not be computed. In this work, we attempt to marry controlled approximations with parallelization to estimate the k-highest BC vertices faster, without having to compute the exact BC scores of the vertices. We present a host of techniques to determine the top-k vertices faster, with a small inaccuracy, by computing approximate BC scores of the vertices. Aiding our techniques is a novel vertex-renumbering scheme to make the graph layout more structured, which results in faster execution of parallel Brandes’ algorithm on GPU. Our experimental results, on a suite of real-world and synthetic graphs, show that our best performing technique computes the top-k vertices with an average speedup of 2.5× compared to the exact parallel Brandes’ algorithm on GPU, with an error of less than 6%. Our techniques also exhibit high precision and recall, both in excess of 94%.
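
For orientation, here is a minimal sequential sketch of Brandes' algorithm for unweighted graphs followed by a top-k selection. It shows only the baseline that the abstract builds on, not ParTBC's GPU techniques or its vertex renumbering. The optional `sources` argument, which restricts the computation to a subset of start vertices, is a generic approximation included as an assumption for illustration.

```python
from collections import deque
import heapq

def brandes_bc(graph, sources=None):
    """Betweenness centrality of an unweighted graph given as {vertex: [neighbours]}.

    For undirected graphs every pair is counted in both directions, so values
    are twice the usual undirected score. Passing a subset as `sources` gives
    an approximation (illustrative assumption, not the paper's method).
    """
    bc = {v: 0.0 for v in graph}
    for s in (graph if sources is None else sources):
        stack = []                        # vertices in order of BFS discovery
        preds = {v: [] for v in graph}    # shortest-path predecessors
        sigma = {v: 0 for v in graph}     # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:           # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}   # dependency of s on each vertex
        while stack:                      # back-propagate, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def top_k(bc, k):
    """Return the k vertices with the highest BC scores."""
    return heapq.nlargest(k, bc, key=bc.get)

# Toy path graph a - b - c - d - e.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
scores = brandes_bc(path)
```

Each source costs one breadth-first search plus one backward pass, which is where the $O(|V||E|)$ bound for unweighted graphs comes from.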

COSIFER: a Python package for the consensus inference of molecular interaction networks

Bioinformatics ◽

10.1093/bioinformatics/btaa942 ◽

2020 ◽

Author(s):

Matteo Manica ◽

Charlotte Bunne ◽

Roland Mathis ◽

Joris Cadow ◽

Mehmet Eren Ahsen ◽

...

Keyword(s):

High Throughput ◽

Network Inference ◽

State Of The Art ◽

Interaction Network ◽

Molecular Networks ◽

Supplementary Information ◽

Web Based ◽

Molecular Interaction Networks ◽

Inference Methods ◽

User Friendly

Summary: The advent of high-throughput technologies has provided researchers with measurements of thousands of molecular entities and enabled the investigation of the internal regulatory apparatus of the cell. However, network inference from high-throughput data is far from being a solved problem. While a plethora of different inference methods have been proposed, they often lead to non-overlapping predictions, and many of them lack user-friendly implementations to enable their broad utilization. Here, we present Consensus Interaction Network Inference Service (COSIFER), a package and a companion web-based platform to infer molecular networks from expression data using state-of-the-art consensus approaches. COSIFER includes a selection of state-of-the-art methodologies for network inference and different consensus strategies to integrate the predictions of individual methods and generate robust networks. Availability and implementation: COSIFER Python source code is available at https://github.com/PhosphorylatedRabbits/cosifer. The web service is accessible at https://ibm.biz/cosifer-aas. Supplementary information: Supplementary data are available at Bioinformatics online.

PoLoBag: Polynomial Lasso Bagging for signed gene regulatory network inference from expression data

Bioinformatics ◽

10.1093/bioinformatics/btaa651 ◽

2020 ◽

Author(s):

Gourab Ghosh Roy ◽

Nicholas Geard ◽

Karin Verspoor ◽

Shan He

Keyword(s):

Gene Regulatory Networks ◽

Regulatory Networks ◽

Network Inference ◽

State Of The Art ◽

Supplementary Information ◽

Expression Data ◽

Gene Regulatory Network Inference ◽

Regulatory Interactions ◽

Inference Algorithms ◽

Gene Regulatory

Motivation: Inferring gene regulatory networks (GRNs) from expression data is a significant systems biology problem. A useful inference algorithm should not only unveil the global structure of the regulatory mechanisms but also the details of regulatory interactions such as edge direction (from regulator to target) and sign (activation/inhibition). Many popular GRN inference algorithms cannot infer edge signs, and those that can infer signed GRNs cannot simultaneously infer edge directions or network cycles. Results: To address these limitations of existing algorithms, we propose Polynomial Lasso Bagging (PoLoBag) for signed GRN inference with both edge directions and network cycles. PoLoBag is an ensemble regression algorithm in a bagging framework where Lasso weights estimated on bootstrap samples are averaged. These bootstrap samples incorporate polynomial features to capture higher-order interactions. Results demonstrate that PoLoBag is consistently more accurate for signed inference than state-of-the-art algorithms on simulated and real-world expression datasets. Availability and implementation: Algorithm and data are freely available at https://github.com/gourabghoshroy/PoLoBag. Supplementary information: Supplementary data are available at Bioinformatics online.

Ensemble-based network aggregation improves the accuracy of gene network reconstruction

10.7287/peerj.preprints.40v1 ◽

2013 ◽

Author(s):

Jeffrey D. Allen ◽

Yang Xie ◽

Guanghua Xiao

Keyword(s):

Mrna Expression ◽

Regulatory Networks ◽

Network Inference ◽

R Package ◽

Network Reconstruction ◽

Specific Gene ◽

Expression Data ◽

Mrna Expression Data ◽

Network Aggregation ◽

Context Specific

Reverse engineering approaches to construct context-specific gene regulatory networks (GRNs) based on genome-wide mRNA expression data have led to significant biological findings. However, the reliability and reproducibility of the reconstructed GRNs need to be improved. Here, we propose an ensemble-based network aggregation approach to improve the accuracy of the network topology constructed from mRNA expression data. To evaluate the performance of different approaches, we created dozens of simulated networks and also tested our methods on three Escherichia coli datasets. We demonstrate three novel applications from this development. First, bootstrapping can be done on the available samples, turning any network reconstruction approach into an ensemble method. Second, this aggregation approach can be used to combine GRNs from different network inference methods, creating a novel network reconstruction approach that consistently outperforms any constituent method. Third, the approach can be used to effectively integrate GRNs constructed from different studies, producing more accurate networks. We are releasing an implementation of these techniques as an R package “ENA”, which is able to run network inference in parallel across multiple servers. We made all of the code and data used in our simulations and analysis available online at https://github.com/QBRC/ENA-Research to ensure the reproducibility of our results.
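
The first two applications can be sketched in a few lines. This is an illustration under assumptions rather than the ENA package's code: edges are aggregated by their mean rank across networks (the paper's exact aggregation rule is not stated in the abstract), and the toy inference method, function names, and data are hypothetical.

```python
import random
from itertools import combinations

def aggregate_networks(networks):
    """Average each edge's rank across networks (rank 1 = strongest edge)."""
    edges = list(networks[0])
    mean_rank = {e: 0.0 for e in edges}
    for net in networks:
        ordered = sorted(edges, key=net.get, reverse=True)
        for r, e in enumerate(ordered, start=1):
            mean_rank[e] += r / len(networks)
    return mean_rank

def abs_covariance_network(samples):
    """Toy inference method: score each gene pair by |covariance|."""
    genes = list(samples[0])
    n = len(samples)
    mean = {g: sum(s[g] for s in samples) / n for g in genes}
    return {
        (a, b): abs(sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n)
        for a, b in combinations(genes, 2)
    }

def bootstrap_ensemble(samples, infer, n_boot=50, seed=0):
    """Resample the samples with replacement, infer a network per resample,
    then aggregate, turning a single method into an ensemble."""
    rng = random.Random(seed)
    nets = [infer([rng.choice(samples) for _ in samples]) for _ in range(n_boot)]
    return aggregate_networks(nets)

# Combining two hypothetical networks produced by different methods.
net1 = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.5}
net2 = {("a", "b"): 0.8, ("a", "c"): 0.6, ("b", "c"): 0.1}
consensus = aggregate_networks([net1, net2])

# Bootstrapping a single method on toy samples (g1 and g2 covary strongly).
samples = [{"g1": float(i), "g2": 2.0 * i, "g3": 0.1 * (i % 2)} for i in range(1, 7)]
boot = bootstrap_ensemble(samples, abs_covariance_network)
```

Working on ranks rather than raw scores lets networks from methods with different score scales be combined without rescaling.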

target: an R package to predict combined function of transcription factors

F1000Research ◽

10.12688/f1000research.52173.3 ◽

2021 ◽

Vol 10 ◽

pp. 344

Author(s):

Mahmoud Ahmed ◽

Deok Ryong Kim

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Transcription Factors ◽

Binding Sites ◽

R Package ◽

Expression Data ◽

Bioconductor Package ◽

Chip Experiment ◽

Binding Data ◽

Factor Function

Researchers use ChIP binding data to identify potential transcription factor binding sites. Similarly, they use gene expression data from sequencing or microarrays to quantify the effect of the transcription factor overexpression or knockdown on its targets. Therefore, the integration of the binding and expression data can be used to improve the understanding of a transcription factor function. Here, we implemented the binding and expression target analysis (BETA) in an R/Bioconductor package. This algorithm ranks the targets based on the distances of their assigned peaks from the transcription factor ChIP experiment and the signed statistics from gene expression profiling with transcription factor perturbation. We further extend BETA to integrate two sets of data from two transcription factors to predict their targets and their combined functions. In this article, we briefly describe the workings of the algorithm and provide a workflow with a real dataset for using it. The gene targets and the aggregate functions of transcription factors YY1 and YY2 in HeLa cells were identified. Using the same datasets, we identified the shared targets of the two transcription factors, which were found to be, on average, more cooperatively regulated.

Improving Gene Regulatory Network Inference by Incorporating Rates of Transcriptional Changes

10.1101/093807 ◽

2016 ◽

Author(s):

Jigar S. Desai ◽

Ryan C. Sartor ◽

Lovely Mae Lawas ◽

SV Krishna Jagadish ◽

Colleen J. Doherty

Keyword(s):

Regulatory Networks ◽

Network Inference ◽

R Package ◽

Rate Of Change ◽

Model Systems ◽

Data Sets ◽

Expression Data ◽

Model Species ◽

Gene Regulatory Network Inference ◽

Transcriptional Regulatory Networks

Organisms respond to changes in their environment through transcriptional regulatory networks (TRNs). The regulatory hierarchy of these networks can be inferred from expression data. Computational approaches to identify TRNs can be applied in any species where quality RNA can be acquired. However, ChIP-Seq and similar validation methods are challenging to employ in non-model species. Improving the accuracy of computational inference methods can significantly reduce the cost and time of subsequent validation experiments. We have developed ExRANGES, an approach that improves the ability to computationally infer TRNs from time series expression data. ExRANGES utilizes both the rate of change in expression and the absolute expression level to identify TRN connections. We evaluated ExRANGES in five data sets from different model systems. ExRANGES improved the identification of experimentally validated transcription factor targets for all species tested, even in unevenly spaced and sparse data sets. This improved ability to predict known regulator-target relationships enhances the utility of network inference approaches in non-model species where experimental validation is challenging. We integrated ExRANGES with two different network construction approaches and it has been implemented as an R package available here: http://github.com/DohertyLab/ExRANGES. To install the package, type: devtools::install_github("DohertyLab/ExRANGES")
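
The idea of pairing expression level with rate of change can be illustrated on an unevenly spaced time series. This sketch only shows the general principle: the combination rule (expression at the later time point multiplied by the absolute slope) is a hypothetical stand-in, not the published ExRANGES statistic, and the function names and values are made up.

```python
def rates_of_change(times, values):
    """Slope between consecutive time points; spacing may be uneven."""
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]

def rate_weighted_expression(times, values):
    """Hypothetical combination: expression at the later point times |slope|."""
    slopes = rates_of_change(times, values)
    return [values[i + 1] * abs(s) for i, s in enumerate(slopes)]

times = [0, 1, 2, 4, 8]            # hours, unevenly spaced
expr = [2.0, 2.0, 6.0, 8.0, 8.0]   # toy expression values for one gene
slopes = rates_of_change(times, expr)
weighted = rate_weighted_expression(times, expr)
```

Dividing by the actual time gap keeps a change spread over four hours from being scored like the same change within one hour, which matters for sparse sampling designs.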

TSSr: an R package for comprehensive analyses of TSS sequencing data

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab108 ◽

2021 ◽

Vol 3 (4) ◽

Author(s):

Zhaolian Lu ◽

Keenan Berry ◽

Zhenbin Hu ◽

Yu Zhan ◽

Tae-Hyuk Ahn ◽

...

Keyword(s):

Transcription Initiation ◽

Core Promoter ◽

R Package ◽

Sequencing Data ◽

Cellular Functions ◽

Accurate Identification ◽

Transcription Start Sites ◽

Core Promoters ◽

User Friendly ◽

Functional Analyses

Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps for studies of regulated transcription and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5′ end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analysis, and visualization of TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inferring core promoters, which are a prerequisite for subsequent functional analyses of TSS data. Furthermore, TSSr also enables users to export various types of TSS data that can be visualized in a genome browser for inspection of promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.

Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed

Biostatistics ◽

10.1093/biostatistics/kxv026 ◽

2015 ◽

Vol 17 (1) ◽

pp. 16-28 ◽

Cited By ~ 43

Author(s):

Laurent Jacob ◽

Johann A. Gagnon-Bartsch ◽

Terence P. Speed

Keyword(s):

Gene Expression ◽

Large Scale ◽

State Of The Art ◽

Synthetic Data ◽

Negative Control ◽

Expression Data ◽

Bioconductor Package ◽

Expression Studies ◽

Unobserved Factor ◽

Gene Expression Studies

When dealing with large-scale gene expression studies, observations are commonly contaminated by sources of unwanted variation such as platforms or batches. Not taking this unwanted variation into account when analyzing the data can lead to spurious associations and to missing important signals. When the analysis is unsupervised (e.g. when the goal is to cluster the samples or to build a corrected version of the dataset, as opposed to studying an observed factor of interest), taking unwanted variation into account can become a difficult task. The factors driving unwanted variation may be correlated with the unobserved factor of interest, so that correcting for the former can remove the latter if not done carefully. We show how negative control genes and replicate samples can be used to estimate unwanted variation in gene expression, and discuss how this information can be used to correct the expression data. The proposed methods are then evaluated on synthetic data and three gene expression datasets. They generally manage to remove unwanted variation without losing the signal of interest and compare favorably to state-of-the-art corrections. All proposed methods are implemented in the Bioconductor package RUVnormalize.
