Latest Publications





Published By Oxford University Press

1460-2059, 1367-4803
Updated Sunday, 17 October 2021

Chuming Chen ◽  
Karen E Ross ◽  
Sachin Gavali ◽  
Julie E Cowart ◽  
Cathy H Wu

Abstract
Summary: The global response to the COVID-19 pandemic has led to a rapid increase in scientific literature on this deadly disease. Extracting knowledge from biomedical literature and integrating it with relevant information from curated biological databases is essential to gain insight into COVID-19 etiology, diagnosis, and treatment. We used the Semantic Web technology RDF to integrate COVID-19 knowledge mined from literature by iTextMine, PubTator, and SemRep with relevant biological databases and formalized the knowledge in a standardized and computable COVID-19 Knowledge Graph (KG). We published the COVID-19 KG via a SPARQL endpoint to support federated queries on the Semantic Web and developed a knowledge portal with browsing and searching interfaces. We also developed a RESTful API to support programmatic access and provided RDF dumps for download.
Availability and implementation: The COVID-19 Knowledge Graph is publicly available under the CC-BY 4.0 license at
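
The RDF data model behind a knowledge graph of this kind can be illustrated with a minimal sketch. The triples, predicate names, and the `match` helper below are hypothetical and are not taken from the COVID-19 KG or its API. The sketch only shows how subject-predicate-object triples are matched against a pattern with wildcards, which is the basic operation a single SPARQL triple pattern performs.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical example triples; names and predicates are illustrative only
# and do not reflect the actual vocabulary of the COVID-19 KG.
TRIPLES = [
    ("SARS-CoV-2", "causes", "COVID-19"),
    ("SARS-CoV-2", "binds", "ACE2"),
    ("remdesivir", "studied_for", "COVID-19"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every non-None position of the pattern.

    None plays the role of a SPARQL variable (a wildcard).
    """
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


# Roughly analogous to: SELECT ?s WHERE { ?s ?p "COVID-19" }
linked_to_covid = sorted(t[0] for t in match(TRIPLES, o="COVID-19"))
print(linked_to_covid)
```

A real SPARQL engine joins many such patterns and can federate them across endpoints; this sketch covers only the single-pattern case.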

Anton Pirogov ◽  
Peter Pfaffelhuber ◽  
Angelika Börsch-Haubold ◽  
Bernhard Haubold

Valentine U Nlebedim ◽  
Roy R Chaudhuri ◽  
Kevin Walters

Jae Yong Ryu ◽  
Jeong Hyun Lee ◽  
Byung Ho Lee ◽  
Jin Sook Song ◽  
Sunjoo Ahn ◽  

Abstract
Motivation: Poor metabolic stability leads to drug development failure. Therefore, it is essential to evaluate the metabolic stability of small compounds for successful drug discovery and development. However, evaluating metabolic stability in vitro and in vivo is expensive, time-consuming, and laborious. Additionally, only a few free software programs are available for metabolic stability data and prediction. Therefore, in this study, we aimed to develop a model that predicts the metabolic stability of small compounds.
Results: We developed a computational model, PredMS, which predicts the metabolic stability of small compounds as stable or unstable in human liver microsomes. PredMS is based on a random forest model using an in-house database of metabolic stability data of 1,917 compounds. To validate the prediction performance of PredMS, we generated external test data of 61 compounds. PredMS achieved an accuracy of 0.74, Matthews correlation coefficient of 0.48, sensitivity of 0.70, specificity of 0.86, positive predictive value of 0.94, and negative predictive value of 0.46 on the external test dataset. PredMS will be a useful tool to predict the metabolic stability of small compounds in the early stages of drug discovery and development.
Availability and implementation: The source code for PredMS is available at, and the PredMS web server is available at
Supplementary information: Supplementary data are available at Bioinformatics online.
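
The six reported evaluation metrics all derive from a single binary confusion matrix, which the following sketch makes concrete. The abstract does not report the underlying counts, so the counts used here (33 true positives, 14 false negatives, 12 true negatives, 2 false positives, summing to 61) are an assumption: they are one set of counts that reproduces the reported values after rounding, not figures taken from the paper. The function name `binary_metrics` is likewise hypothetical.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes both classes and both predicted labels are non-empty,
    so that none of the ratio denominators is zero.
    """
    total = tp + fn + tn + fp
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts (not reported in the abstract); chosen to total 61 compounds.
metrics = binary_metrics(tp=33, fn=14, tn=12, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts, the high positive predictive value and low negative predictive value reflect a test set dominated by the positive class (47 of 61 compounds).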

Shuxia Guo ◽  
Xuan Zhao ◽  
Shengdian Jiang ◽  
Liya Ding ◽  
Hanchuan Peng

Abstract
Motivation: Digitally reconstructing 3D neuron morphologies has long been a major bottleneck in neuroscience. One obstacle to automating the procedure is the low signal-background contrast and the large dynamic range of signal and background, both within and across images.
Results: We developed a pipeline to enhance the neurite signal and to suppress the background, with the goal of high signal-background contrast and better within- and between-image homogeneity. The performance of the image enhancement was quantitatively verified using several figures of merit that benchmark image quality. Additionally, the method improved neuron reconstruction in approximately one third of the cases and degraded it in very few. This significantly outperformed three other image enhancement approaches. Moreover, the compression rate increased 5-fold on average for the enhanced images compared with the raw images. These results demonstrate the potential of the proposed method to benefit neuroscience by providing better 3D morphological reconstruction and lower data storage and transfer costs.
Availability: The study was conducted using the Vaa3D platform and Python 3.7.9. The Vaa3D platform is available on GitHub. The source code of the proposed image enhancement as a Vaa3D plugin, the source code to benchmark the image quality, and the example image blocks are available in the repository under vaa3d_tools/hackathon/SGuo/imPreProcess. The original fMOST images of mouse brains can be found at the BICCN's Brain Image Library (BIL).
Supplementary information: Supplementary data are available at Bioinformatics online.

Seungyoon Nam ◽  
Sungyoung Lee ◽  
Sungjin Park ◽  
Jinhyuk Lee ◽  
Aron Park ◽  

Abstract
Motivation: Drug repositioning reveals novel indications for existing drugs, in particular for diseases with no available drugs. Diverse computational drug repositioning methods have been proposed that measure either drug-treated gene expression signatures or the proximity of drug targets and disease proteins in prior networks. However, these methods do not explain which signaling subparts allow potential drugs to be selected, and do not consider polypharmacology, i.e., multiple targets of a known drug, in specific subparts.
Results: To address these limitations, we developed a subpathway-based polypharmacology drug repositioning method, PATHOME-Drug, based on drug-associated transcriptomes. Specifically, this tool locates subparts of signaling cascades related to phenotype changes (e.g., disease status changes), and identifies existing approved drugs whose multiple targets are enriched in those subparts. We show that our method outperformed WebGestalt and clusterProfiler in detecting signaling context and specific drugs/compounds, for both real biological and simulated datasets. We believe that our tool can successfully address the current shortage of targeted therapy agents.
Availability: The web service is available at The source code and data are available at
Supplementary information: Supplementary data are available at Bioinformatics online.
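
The idea of a drug's targets being "enriched" in a subpathway can be illustrated with a one-sided hypergeometric test, a common choice for over-representation analysis. The abstract does not state which test PATHOME-Drug uses, so the test itself, the function name `enrichment_p_value`, and the toy numbers below are assumptions made for illustration only.

```python
from math import comb


def enrichment_p_value(universe: int, subpathway: int, targets: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe:   total number of genes considered
    subpathway: number of those genes that belong to the subpathway
    targets:    number of genes targeted by the drug
    overlap:    number of drug targets that fall inside the subpathway
    """
    upper = min(subpathway, targets)
    total = comb(universe, targets)
    tail = sum(
        comb(subpathway, k) * comb(universe - subpathway, targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 genes, 4 in the subpathway,
# a drug with 3 targets, 2 of which lie in the subpathway.
p = enrichment_p_value(universe=10, subpathway=4, targets=3, overlap=2)
print(f"p = {p:.4f}")
```

A small p-value would indicate that the drug's targets fall in the subpathway more often than expected by chance; in practice such p-values are corrected for multiple testing across drugs and subpathways.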

Lyla Atta ◽  
Arpan Sahoo ◽  
Jean Fan

Abstract
Motivation: Single cell transcriptomics profiling technologies enable genome-wide gene expression measurements in individual cells but can currently only provide a static snapshot of cellular transcriptional states. RNA velocity analysis can help infer cell state changes using such single cell transcriptomics data. To interpret these cell state changes inferred from RNA velocity as part of underlying cellular trajectories, current approaches rely on visualization with principal components, t-distributed stochastic neighbor embedding, and other 2D embeddings derived from the observed single cell transcriptional states. However, these 2D embeddings can yield different representations of the underlying cellular trajectories, hindering the interpretation of cell state changes.
Results: We developed VeloViz to create RNA-velocity-informed 2D and 3D embeddings from single cell transcriptomics data. Using both real and simulated data, we demonstrate that VeloViz embeddings are able to capture underlying cellular trajectories across diverse trajectory topologies, even when intermediate cell states may be missing. By taking into consideration the predicted future transcriptional states from RNA velocity analysis, VeloViz can help visualize a more reliable representation of underlying cellular trajectories.
Availability: Source code is available on GitHub and Bioconductor, with additional tutorials at
Supplementary information: Supplementary data are available at Bioinformatics online.

Haitao Fu ◽  
Feng Huang ◽  
Xuan Liu ◽  
Yang Qiu ◽  
Wen Zhang

Abstract
Motivation: There are various interaction/association bipartite networks in biomolecular systems. Identifying unobserved links in biomedical bipartite networks helps to understand the underlying molecular mechanisms of human complex diseases and thus benefits the diagnosis and treatment of diseases. Although a great number of computational methods have been proposed to predict links in biomedical bipartite networks, most of them heavily depend on features and structures involving the bioentities in one specific bipartite network, which limits the generalization capacity of applying the models to other bipartite networks. Meanwhile, bioentities usually have multiple features, and how to leverage them has also been challenging.
Results: In this study, we propose a novel multi-view graph convolution network (MVGCN) framework for link prediction in biomedical bipartite networks. We first construct a multi-view heterogeneous network (MVHN) by combining the similarity networks with the biomedical bipartite network, and then perform a self-supervised learning strategy on the bipartite network to obtain node attributes as initial embeddings. Further, a neighborhood information aggregation (NIA) layer is designed for iteratively updating the embeddings of nodes by aggregating information from inter- and intra-domain neighbors in every view of the MVHN. Next, we combine embeddings of multiple NIA layers in each view, and integrate multiple views to obtain the final node embeddings, which are then fed into a discriminator to predict the existence of links. Extensive experiments show MVGCN performs better than or on par with baseline methods and has the generalization capacity on six benchmark datasets involving three typical tasks.
Availability: Source code and data can be downloaded from
Supplementary information: Supplementary data are available at Bioinformatics online.

Gianvito Pio ◽  
Paolo Mignone ◽  
Giuseppe Magazzù ◽  
Guido Zampieri ◽  
Michelangelo Ceci ◽  

Abstract
Motivation: Gene regulation is responsible for controlling numerous physiological functions and dynamically responding to environmental fluctuations. Reconstructing the human network of gene regulatory interactions is thus paramount to understanding the cell functional organisation across cell types, as well as to elucidating pathogenic processes and identifying molecular drug targets. Although significant effort has been devoted towards this direction, existing computational methods mainly rely on gene expression levels, possibly ignoring the information conveyed by mechanistic biochemical knowledge. Moreover, except for a few recent attempts, most of the existing approaches only consider the information of the organism under analysis, without exploiting the information of related model organisms.
Results: We propose a novel method for the reconstruction of the human gene regulatory network, based on a transfer learning strategy that synergically exploits information from human and mouse, conveyed by gene-related metabolic features generated in-silico from gene expression data. Specifically, we learn a predictive model from metabolic activity inferred via tissue-specific metabolic modelling of artificial gene knockouts. Our experiments show that the combination of our transfer learning approach with the constructed metabolic features provides a significant advantage in terms of reconstruction accuracy, as well as additional clues on the contribution of each constructed metabolic feature.
Availability: The system, the datasets and all the results obtained in this study are available at:
Supplementary information: Supplementary data are available at Bioinformatics online.

Qing Cheng ◽  
Tingting Qiu ◽  
Xiaoran Chai ◽  
Baoluo Sun ◽  
Yingcun Xia ◽  

Abstract
Motivation: Mendelian randomization (MR) is a valuable tool to examine the causal relationships between health risk factors and outcomes from observational studies. Along with the proliferation of genome-wide association studies (GWASs), a variety of two-sample MR methods for summary data have been developed to account for horizontal pleiotropy (HP), primarily based on the assumption that the effects of variants on exposure (γ) and horizontal pleiotropy (α) are independent. In practice, this assumption is too strict and can be easily violated because of correlated HP.
Results: To account for this correlated HP, we propose a Bayesian approach, MR-Corr2, that uses the orthogonal projection to reparameterize the bivariate normal distribution for γ and α, and a spike-slab prior to mitigate the impact of correlated HP. We have also developed an efficient algorithm with parallel Gibbs sampling. To demonstrate the advantages of MR-Corr2 over existing methods, we conducted comprehensive simulation studies comparing both type-I error control and point estimates in various scenarios. By applying MR-Corr2 to study the relationships between exposure-outcome pairs in complex traits, we did not identify the contradictory causal relationship between HDL-c and CAD. Moreover, the results provide a new perspective of the causal network among complex traits.
Availability: The developed R package and code to reproduce all the results are available at
Supplementary information: Supplementary data are available at Bioinformatics online.
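
Why correlation between γ and α matters can be seen from a short derivation. The notation below is assumed for illustration and need not match the parameterization used by MR-Corr2: Γ_j denotes the effect of variant j on the outcome, β the causal effect of interest, and (σ_γ, σ_α, ρ) the standard deviations and correlation of a zero-mean bivariate normal distribution for (γ_j, α_j).

```latex
\begin{align}
\Gamma_j &= \beta\,\gamma_j + \alpha_j, \\
\begin{pmatrix} \gamma_j \\ \alpha_j \end{pmatrix}
  &\sim \mathcal{N}\!\left( \mathbf{0},
  \begin{pmatrix}
    \sigma_\gamma^2 & \rho\,\sigma_\gamma\sigma_\alpha \\
    \rho\,\sigma_\gamma\sigma_\alpha & \sigma_\alpha^2
  \end{pmatrix} \right), \\
\alpha_j &= \delta\,\gamma_j + \tilde{\alpha}_j,
  \qquad \delta = \rho\,\frac{\sigma_\alpha}{\sigma_\gamma},
  \qquad \tilde{\alpha}_j \sim \mathcal{N}\!\left(0,\ (1-\rho^2)\,\sigma_\alpha^2\right),
  \qquad \tilde{\alpha}_j \perp\!\!\!\perp \gamma_j, \\
\Gamma_j &= (\beta + \delta)\,\gamma_j + \tilde{\alpha}_j .
\end{align}
```

The third line is the orthogonal projection of α_j onto γ_j, a standard property of the bivariate normal distribution. When ρ = 0, δ vanishes and the slope of Γ_j on γ_j is β. When ρ ≠ 0, the slope is β + δ, so a method that assumes independence absorbs δ into its estimate of the causal effect.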
