COSIFER: a Python package for the consensus inference of molecular interaction networks

Author(s):  
Matteo Manica ◽  
Charlotte Bunne ◽  
Roland Mathis ◽  
Joris Cadow ◽  
Mehmet Eren Ahsen ◽  
...  

Abstract Summary The advent of high-throughput technologies has provided researchers with measurements of thousands of molecular entities and enabled the investigation of the internal regulatory apparatus of the cell. However, network inference from high-throughput data is far from being a solved problem. While a plethora of different inference methods have been proposed, they often lead to non-overlapping predictions, and many of them lack user-friendly implementations that would enable their broad use. Here, we present Consensus Interaction Network Inference Service (COSIFER), a package and a companion web-based platform to infer molecular networks from expression data using state-of-the-art consensus approaches. COSIFER includes a selection of state-of-the-art methodologies for network inference and different consensus strategies to integrate the predictions of individual methods and generate robust networks. Availability and implementation COSIFER Python source code is available at https://github.com/PhosphorylatedRabbits/cosifer. The web service is accessible at https://ibm.biz/cosifer-aas. Supplementary information Supplementary data are available at Bioinformatics online.
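The consensus idea described above can be illustrated with a minimal Python sketch. It is not the COSIFER API: the function names, the mean-rank aggregation rule and the toy edge scores are all assumptions chosen for illustration. Each method scores candidate edges, the scores are converted to ranks, and the ranks are averaged across methods.

```python
from collections import defaultdict


def rank_edges(scores):
    """Map each edge to its rank (1 = strongest) under one method's scores."""
    ordered = sorted(scores, key=lambda edge: scores[edge], reverse=True)
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}


def consensus_by_mean_rank(method_scores):
    """Average per-method ranks (hypothetical consensus rule).

    Edges that a method did not score receive the worst possible rank
    for that method, so missing predictions count against an edge.
    """
    all_edges = set().union(*(s.keys() for s in method_scores.values()))
    worst = len(all_edges)
    totals = defaultdict(float)
    for scores in method_scores.values():
        ranks = rank_edges(scores)
        for edge in all_edges:
            totals[edge] += ranks.get(edge, worst)
    n_methods = len(method_scores)
    mean_rank = {edge: total / n_methods for edge, total in totals.items()}
    # Lowest mean rank first; ties broken by edge name for determinism.
    return sorted(mean_rank.items(), key=lambda kv: (kv[1], kv[0]))


# Toy predictions from two hypothetical inference methods.
predictions = {
    "method_a": {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5},
    "method_b": {("g1", "g2"): 0.7, ("g2", "g3"): 0.8},
}
for edge, rank in consensus_by_mean_rank(predictions):
    print(edge, rank)
```

Rank averaging is only one possible strategy; it is used here because it does not require the methods to produce scores on a common scale.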

2020 ◽  
Vol 36 (12) ◽  
pp. 3913-3915
Author(s):  
Hemi Luan ◽  
Xingen Jiang ◽  
Fenfen Ji ◽  
Zhangzhang Lan ◽  
Zongwei Cai ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry-based non-targeted metabolomics is routinely performed to qualitatively and quantitatively analyze a tremendous number of metabolite signals in complex biological samples. However, many popular software tools commonly report false-positive peaks in the datasets as metabolite signals, resulting in unreliable measurements. Results To reduce false-positive calling, we developed an interactive web tool, termed CPVA, for visualization and accurate annotation of the detected peaks in non-targeted metabolomics data. We used a chromatogram-centric strategy to unfold the characteristics of chromatographic peaks through visualization of peak morphology metrics, with additional functions to annotate adducts, isotopes and contaminants. CPVA is a free, user-friendly tool that helps users identify peak background noise and contaminants, reducing false-positive or redundant peak calling and thereby improving the data quality of non-targeted metabolomics studies. Availability and implementation CPVA is freely available at http://cpva.eastus.cloudapp.azure.com. Source code and installation instructions are available on GitHub: https://github.com/13479776/cpva. Supplementary information Supplementary data are available at Bioinformatics online.


2015 ◽  
Author(s):  
Aurélie Pirayre ◽  
Camille Couprie ◽  
Frédérique Bidard ◽  
Laurent Duval ◽  
Jean-Christophe Pesquet

Background: Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions. Methods: Weighting all possible pairwise gene relationships by a probability of edge presence, we formulate the regulatory network inference as a discrete variational problem on graphs. We enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional coding for a priori structures. The optimization is carried out with Graph cuts, an approach popular in image processing and computer vision. We compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge. Results: Our BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6% to 11%). On a real Escherichia coli compendium, an improvement of 11.8% compared to CLR and 3% compared to GENIE3 is obtained in terms of Area Under Precision-Recall curve. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster. The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html Conclusions: BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. It is applicable as a generic network inference post-processing step, owing to its computational efficiency.


2020 ◽  
Author(s):  
Jianhao Peng ◽  
Ullas V. Chembazhi ◽  
Sushant Bangru ◽  
Ian M. Traniello ◽  
Auinash Kalsotra ◽  
...  

Abstract Motivation With the use of single-cell RNA sequencing (scRNA-Seq) technologies, it is now possible to acquire gene expression data for each individual cell in samples containing up to millions of cells. These cells can be further grouped into different states along an inferred cell differentiation path, which are potentially characterized by similar, but distinct enough, gene regulatory networks (GRNs). Hence, it would be desirable for scRNA-Seq GRN inference methods to capture the GRN dynamics across cell states. However, current GRN inference methods produce a unique GRN per input dataset (or independent GRNs per cell state), failing to capture these regulatory dynamics. Results We propose a novel single-cell GRN inference method, named SimiC, that jointly infers the GRNs corresponding to each state. SimiC models the GRN inference problem as a LASSO optimization problem with an added similarity constraint on the GRNs associated with contiguous cell states, which captures the inter-cell-state homogeneity. We show on mouse hepatocyte single-cell data generated after partial hepatectomy that, contrary to previous GRN methods for scRNA-Seq data, SimiC is able to capture the transcription factor (TF) dynamics across liver regeneration, as well as the cell-level behavior for the regulatory program of each TF across cell states. In addition, on a honey bee scRNA-Seq experiment, SimiC is able to capture the increased heterogeneity of cells in whole-brain tissue with respect to a regional analysis tissue, and the TFs associated specifically with each sequenced tissue. Availability SimiC is written in Python and includes an R API. It can be downloaded from https://github.com/jianhao2016/[email protected], [email protected] Supplementary information Supplementary data are available at the code repository.
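The joint formulation described above can be sketched as an objective of the following general form. This is an illustrative reading of "LASSO with an added similarity constraint", not necessarily the exact objective used by SimiC. The symbols are assumptions introduced here: $K$ ordered cell states, $X_k$ (TF expression) and $Y_k$ (target gene expression) for the $n_k$ cells in state $k$, $W_k$ the weight matrix encoding the GRN of state $k$, and $\lambda_1, \lambda_2 \ge 0$ tuning parameters.

```latex
\min_{W_1,\dots,W_K}\;
\sum_{k=1}^{K} \left( \frac{1}{n_k}\,\lVert Y_k - X_k W_k \rVert_F^2
  + \lambda_1 \lVert W_k \rVert_1 \right)
+ \lambda_2 \sum_{k=1}^{K-1} \lVert W_k - W_{k+1} \rVert_F^2
```

Under these assumptions, setting $\lambda_2 = 0$ decouples the states into $K$ independent LASSO problems, while a very large $\lambda_2$ pushes all states toward a single shared network. Intermediate values let contiguous states share structure while still differing.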


2008 ◽  
Vol 3 ◽  
pp. BMI.S467 ◽  
Author(s):  
Bernett T.K. Lee ◽  
Lailing Liew ◽  
Jiahao Lim ◽  
Jonathan K.L. Tan ◽  
Tze Chuen Lee ◽  
...  

CLUB (“Candidate List of yoUr Biomarkers”) is a freely available, web-based resource designed to support cancer biomarker research. It aims to provide a comprehensive list of candidate biomarkers for various cancers that have been reported by the research community. CLUB provides tools for comparison of marker candidates from different experimental platforms, with the ability to filter, search, query and explore molecular interaction networks associated with cancer biomarkers from the published literature and from data uploaded by the community. This complex and ambitious project is implemented in phases. As a first step, we have compiled from the literature an initial set of differentially expressed human candidate cancer biomarkers. Each candidate is annotated with information from publicly available databases such as Gene Ontology, the Swiss-Prot database, the National Center for Biotechnology Information's reference sequences, the Biomolecular Interaction Network Database and the IntAct interaction database. The user has the option to maintain private lists of biomarker candidates or to share and export these for use by the community. Furthermore, users may customize and combine commonly used sets of selection procedures and apply them as a stored workflow using selected candidate lists. To enable an assessment by the user before taking a candidate biomarker to the experimental validation stage, the platform contains the functionality to identify pathways associated with cancer risk, staging, prognosis, outcome in cancer and other clinically associated phenotypes. The system is available at http://club.bii.a-star.edu.sg.


2020 ◽  
Vol 36 (10) ◽  
pp. 3246-3247
Author(s):  
Vaclav Brazda ◽  
Jan Kolomaznik ◽  
Jean-Louis Mergny ◽  
Jiri Stastny

Abstract Motivation G-quadruplexes (G4) are important regulatory non-B DNA structures with therapeutic potential. A tool for rational design of mutations leading to decreased propensity for G4 formation should be useful in studying G4 functions. Although tools exist for G4 prediction, no easily accessible tool for the rational design of G4 mutations has been available. Results We developed a web-based tool termed G4Killer that is based on the G4Hunter algorithm. This new tool is a platform-independent and user-friendly application to design mutations crippling G4 propensity in a parsimonious way (i.e., keeping the primary sequence as close as possible to the original one). The tool is integrated into our DNA analyzer server and allows for generating mutated DNA sequences having the desired lowered G4Hunter score with minimal mutation steps. Availability and implementation The G4Killer web tool can be accessed at: http://bioinformatics.ibp.cz. Supplementary information Supplementary data are available at Bioinformatics online.
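As a rough illustration of the score that G4Killer aims to lower, the Python sketch below assumes the commonly described G4Hunter scheme: each G in a run of n consecutive Gs scores min(n, 4), each C in a run scores the negative equivalent, A and T score 0, and the sequence score is the mean of the per-base values. The function names are hypothetical and this is not the G4Killer or G4Hunter implementation. It shows only why a single substitution inside a G run can reduce the score.

```python
def g4hunter_scores(seq):
    """Per-base scores: G runs positive, C runs negative, capped at 4."""
    seq = seq.upper()
    scores = []
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        # Extend j to the end of the current run of identical bases.
        while j < len(seq) and seq[j] == base:
            j += 1
        run = j - i
        if base == "G":
            value = min(run, 4)
        elif base == "C":
            value = -min(run, 4)
        else:
            value = 0
        scores.extend([value] * run)
        i = j
    return scores


def g4hunter_mean(seq):
    """Mean of the per-base scores; 0.0 for an empty sequence."""
    scores = g4hunter_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


# Toy sequences (assumed for illustration): breaking one G run lowers the score.
original = "GGGTTAGGG"
mutant = "GAGTTAGGG"  # single G->A substitution in the first run
print(g4hunter_mean(original), g4hunter_mean(mutant))
```

A parsimonious design procedure in this spirit would search for the fewest substitutions that bring the mean score below a chosen threshold; that search is not shown here.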


2019 ◽  
Vol 35 (18) ◽  
pp. 3527-3529 ◽  
Author(s):  
David Aparício ◽  
Pedro Ribeiro ◽  
Tijana Milenković ◽  
Fernando Silva

Abstract Motivation Network alignment (NA) finds conserved regions between two networks. NA methods optimize node conservation (NC) and edge conservation. Dynamic graphlet degree vectors are a state-of-the-art dynamic NC measure, used within the fastest and most accurate NA method for temporal networks: DynaWAVE. Here, we use graphlet-orbit transitions (GoTs), a different graphlet-based measure of temporal node similarity, as a new dynamic NC measure within DynaWAVE, resulting in GoT-WAVE. Results On synthetic networks, GoT-WAVE improves DynaWAVE’s accuracy by 30% and speed by 64%. On real networks, when optimizing only dynamic NC, the methods are complementary. Furthermore, only GoT-WAVE supports directed edges. Hence, GoT-WAVE is a promising new temporal NA algorithm, which efficiently optimizes dynamic NC. We provide a user-friendly interface and source code for GoT-WAVE. Availability and implementation http://www.dcc.fc.up.pt/got-wave/ Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Ruoyi Cai ◽  
Cécile Ané

Abstract Motivation With growing genome-wide molecular datasets from next-generation sequencing, phylogenetic networks can be estimated using a variety of approaches. These phylogenetic networks include events like hybridization, gene flow or horizontal gene transfer explicitly. However, the most accurate network inference methods are computationally heavy. Methods that scale to larger datasets do not calculate a full likelihood, such that traditional likelihood-based tools for model selection are not applicable to decide how many past hybridization events best fit the data. We propose here a goodness-of-fit test to quantify the fit between data observed from genome-wide multi-locus data, and patterns expected under the multi-species coalescent model on a candidate phylogenetic network. Results We identified weaknesses in the previously proposed TICR test, and proposed corrections. The performance of our new test was validated by simulations on real-world phylogenetic networks. Our test provides one of the first rigorous tools for model selection, to select the adequate network complexity for the data at hand. The test can also work for identifying poorly inferred areas on a network. Availability and implementation Software for the goodness-of-fit test is available as a Julia package at https://github.com/cecileane/QuartetNetworkGoodnessFit.jl. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Hyun-Hwan Jeong ◽  
Seon Young Kim ◽  
Maxime W.C. Rousseaux ◽  
Huda Y. Zoghbi ◽  
Zhandong Liu

Abstract The simplicity and cost-effectiveness of CRISPR technology have made high-throughput pooled screening approaches available to many. However, the large amount of sequencing data derived from these studies often yields unwieldy datasets that require considerable bioinformatic resources to deconvolute, and such resources are simply not accessible to many wet labs. To address these needs, we have developed a cloud-based web tool, CRISPRCloud2, that provides state-of-the-art accuracy in mapping short reads to a CRISPR library, a powerful statistical test that aggregates information across multiple sgRNAs targeting the same gene, a user-friendly data visualization and query interface, as well as easy linking to other CRISPR tools and bioinformatics resources for target prioritization. CRISPRCloud2 is a one-stop shop for labs analyzing CRISPR screen data.


2019 ◽  
Vol 20 (S9) ◽  
Author(s):  
Salvatore Alaimo ◽  
Antonio Di Maria ◽  
Dennis Shasha ◽  
Alfredo Ferro ◽  
Alfredo Pulvirenti

Abstract Background Several large public repositories of microarray datasets and RNA-seq data are available. Two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data are stored in large files that require high bandwidth to download and special-purpose data manipulation tools to extract the subsets relevant to a specific analysis. Results TACITuS is a web-based system that supports rapid query access to high-throughput microarray and NGS repositories. The system is equipped with modules capable of managing large files, storing them in a cloud environment and extracting subsets of data in an easy and efficient way. The system also supports importing data into Galaxy for further analysis. Conclusions TACITuS automates most of the pre-processing needed to analyze high-throughput microarray and NGS data from large publicly available repositories. The system implements several modules to manage large files in an easy and efficient way. Furthermore, it integrates with the Galaxy environment, allowing users to analyze data through a user-friendly interface.


2019 ◽  
Vol 35 (18) ◽  
pp. 3493-3495 ◽  
Author(s):  
Václav Brázda ◽  
Jan Kolomazník ◽  
Jiří Lýsek ◽  
Martin Bartas ◽  
Miroslav Fojta ◽  
...  

Abstract Motivation Expanding research highlights the importance of guanine quadruplex structures. Therefore, easily accessible tools for quadruplex analyses in DNA and RNA molecules are important for the scientific community. Results We developed a web version of the G4Hunter application. This new web-based server is a platform-independent and user-friendly application for quadruplex analyses. It allows retrieval of gene/nucleotide sequence entries from NCBI databases and provides complete characterization of the localization and quadruplex propensity of quadruplex-forming sequences. The G4Hunter web application includes an interactive graphical data representation with many useful options including visualization, sorting, data storage and export. Availability and implementation The G4Hunter web application can be accessed at: http://bioinformatics.ibp.cz. Supplementary information Supplementary data are available at Bioinformatics online.

