Multi-label Metabolic Pathway Prediction with Auto Molecular Structure Representation Learning

Author(s):  
Jiamin Chen ◽  
Jianliang Gao ◽  
Tengfei Lyu ◽  
Babatounde Moctard Oloulade ◽  
Xiaohua Hu
2010 ◽  
Vol 38 (Web Server) ◽  
pp. W138-W143 ◽  
Author(s):  
Y. Moriya ◽  
D. Shigemizu ◽  
M. Hattori ◽  
T. Tokimatsu ◽  
M. Kotera ◽  
...  

2010 ◽  
Vol 11 (1) ◽  
pp. 15 ◽  
Author(s):  
Joseph M Dale ◽  
Liviu Popescu ◽  
Peter D Karp

2012 ◽  
Vol 90 (8) ◽  
pp. 640-651
Author(s):  
Jing Song ◽  
Ying Zhang ◽  
Hui Hu ◽  
Hui Zhang ◽  
Lin Lin ◽  
...  

Quantitative structure–property relationship (QSPR) studies were performed to predict the gas-phase reduced ion mobility constants (K₀) of diverse compounds from a three-dimensional (3D) molecular structure representation. The full set of 159 compounds was divided into a training set of 120 compounds and a test set of 39 compounds using the Kennard-Stone algorithm. Multiple linear regression (MLR) analysis was used to select the best subset of descriptors and to build linear models, whereas nonlinear models were developed with an artificial neural network (ANN). The resulting five-descriptor models show good predictive power on the test set: the MLR model achieved a squared correlation coefficient (R²) of 0.9029 and a standard error of estimation (s) of 0.0549, whereas the ANN model achieved R² and s of 0.9292 and 0.496, respectively. These results compare favorably with previously reported prediction methods for ion mobility constants. In addition, the descriptors used in the models are discussed with respect to the structural features governing the mobility of the compounds.
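The training/test division mentioned in this abstract can be illustrated with a minimal sketch of the Kennard-Stone selection procedure. This is not the authors' code: the function name `kennard_stone`, the use of Euclidean distance, and the toy one-dimensional points are assumptions made for illustration only. The idea is to seed the training set with the two most distant samples and then repeatedly add the sample whose nearest already-selected neighbour is farthest away.

```python
import math


def kennard_stone(points, n_train):
    """Return indices of n_train points chosen by Kennard-Stone selection.

    points  : list of equal-length numeric tuples (descriptor vectors)
    n_train : number of points to select for the training set
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")

    # Pairwise Euclidean distances (an assumed metric for this sketch).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the selection with the two mutually most distant points.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    while len(selected) < n_train:
        # Add the candidate whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy descriptor vectors (hypothetical data, not from the study).
toy = [(0.0,), (1.0,), (10.0,), (5.0,), (4.9,)]
train_idx = kennard_stone(toy, 3)
test_idx = [k for k in range(len(toy)) if k not in train_idx]
```

Points left unselected form the test set, so the training set tends to span the descriptor space while the test set falls inside it.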


2006 ◽  
Vol 34 (Web Server) ◽  
pp. W714-W719 ◽  
Author(s):  
L. Pireddu ◽  
D. Szafron ◽  
P. Lu ◽  
R. Greiner

2020 ◽  
Author(s):  
Abdur Rahman M. A. Basher ◽  
Steven J. Hallam

We present reMap (relabeling multi-label pathway data based on bag approach), a simple yet generic framework that relabels examples to a different set of labels, characterized as bags. A bag comprises a subset of correlated pathways, and a pathway may be mixed over bags, so that a single pathway can overlap several bags. The bag-based approach was adopted to overcome the low sensitivity scores of triUMPF on the pathway prediction task. The relabeling process in reMap alternates between 1) assigning bags to each sample and 2) updating reMap's parameters. reMap's effectiveness was evaluated on metabolic pathway prediction, where the resulting performance metrics equaled or exceeded those of other prediction methods on organismal genomes, with an improved sensitivity score.

Availability and implementation: The software package is published on github.com/[email protected]


2020 ◽  
Author(s):  
Abdur Rahman M. A. Basher ◽  
Steven J. Hallam

Metabolic pathway reconstruction from genomic sequence information is a key step in predicting the regulatory and functional potential of cells at the individual, population, and community levels of organization. Although the most common methods for metabolic pathway reconstruction are gene-centric, e.g., mapping annotated proteins onto known pathways using a reference database, pathway-centric methods that use heuristics or machine learning to infer pathway presence provide a powerful engine for hypothesis generation in biological systems. Such methods rely on rule sets or rich feature information that may not be known or readily accessible. Here, we present pathway2vec, a software package consisting of six representation-learning-based modules used to automatically generate features for pathway inference. Specifically, we build a three-layered network composed of compounds, enzymes, and pathways, where nodes within a layer manifest inter-interactions and nodes between layers manifest betweenness interactions. This layered architecture captures relevant relationships used to learn a neural embedding-based low-dimensional space of metabolic features. We benchmark pathway2vec performance on node clustering, embedding visualization, and pathway prediction, using MetaCyc as a trusted source. In the pathway prediction task, results indicate that embeddings can be leveraged to improve pathway prediction outcomes.

Availability and implementation: The software package and installation instructions are published on github.com/[email protected]
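The three-layered network described in this abstract can be pictured with a small data-structure sketch. It is not taken from the pathway2vec package: the class `LayeredGraph`, its methods, and the node names are hypothetical, and the sketch only shows how typed nodes let an edge be classified as within a layer or between layers.

```python
from collections import defaultdict


class LayeredGraph:
    """Undirected graph whose nodes each belong to one of three layers."""

    LAYERS = ("compound", "enzyme", "pathway")

    def __init__(self):
        self.layer_of = {}            # node name -> layer name
        self.adj = defaultdict(set)   # node name -> set of neighbour names

    def add_node(self, name, layer):
        if layer not in self.LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.layer_of[name] = layer

    def add_edge(self, u, v):
        for node in (u, v):
            if node not in self.layer_of:
                raise KeyError(f"unknown node: {node}")
        self.adj[u].add(v)
        self.adj[v].add(u)

    def edge_kind(self, u, v):
        """Classify an edge by whether its endpoints share a layer."""
        same = self.layer_of[u] == self.layer_of[v]
        return "within-layer" if same else "between-layer"

    def neighbors(self, node, layer=None):
        """Sorted neighbours of node, optionally restricted to one layer."""
        return sorted(
            n for n in self.adj[node]
            if layer is None or self.layer_of[n] == layer
        )


# Toy network with placeholder names (hypothetical, not MetaCyc data).
g = LayeredGraph()
g.add_node("compound_A", "compound")
g.add_node("compound_B", "compound")
g.add_node("enzyme_1", "enzyme")
g.add_node("pathway_X", "pathway")
g.add_edge("compound_A", "compound_B")   # within the compound layer
g.add_edge("compound_A", "enzyme_1")     # compound to enzyme
g.add_edge("enzyme_1", "pathway_X")      # enzyme to pathway
```

An embedding method would then operate on a structure like this; how pathway2vec samples or weights the two kinds of edges is not stated in the abstract and is left out here.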


Author(s):  
Shuangjia Zheng ◽  
Xin Yan ◽  
Yuedong Yang ◽  
Jun Xu

Recognizing substructures and their relations embedded in a molecular structure representation is a key process for structure-activity or structure-property relationship (SAR/SPR) studies. A molecular structure can be explicitly represented either as a connection table (CT) or as a linear notation such as SMILES, a language describing the connectivity of atoms in the molecular structure. Conventional SAR/SPR approaches rely on partitioning the CT into a set of predefined substructures used as structural descriptors. In this work, we propose a new method for identifying SAR/SPR through syntax analysis of a linear notation (for example, SMILES) with a self-attention mechanism, an interpretable deep learning architecture. The method has been evaluated by predicting chemical properties, toxicology, and bioactivity from experimental data sets. Our results demonstrate that the method yields superior performance compared with state-of-the-art methods. Moreover, the method produces chemically interpretable results, which a chemist can use to design and synthesize compounds with improved activity or properties.
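To make the idea of self-attention over a linear notation concrete, here is a minimal sketch that tokenizes a SMILES string and applies scaled dot-product self-attention to the token embeddings. It is not the authors' architecture: the tokenizer pattern is a simplification, the embeddings are random rather than learned, and the attention uses the embeddings directly as queries, keys, and values with no learned projections. All names (`tokenize`, `embed`, `self_attention`) are hypothetical.

```python
import math
import random
import re

# Simplified SMILES tokenizer: bracket atoms, Br/Cl, single letters,
# ring-closure digits, and common bond/branch symbols.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|[A-Za-z]|\d|[=#()+\-/\\@%.]")


def tokenize(smiles):
    tokens = TOKEN_RE.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


def embed(tokens, dim=4, seed=0):
    """Assign each distinct token a fixed random vector (untrained)."""
    rng = random.Random(seed)
    table = {}
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [table[tok] for tok in tokens]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X."""
    d = len(X[0])
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X])
        for q in X
    ]
    out = [
        [sum(w * v[j] for w, v in zip(row, X)) for j in range(d)]
        for row in weights
    ]
    return weights, out


tokens = tokenize("CC(=O)Cl")          # acetyl chloride
weights, contextual = self_attention(embed(tokens))
```

Each row of `weights` is a probability distribution over the tokens of the string; in a trained model, inspecting such rows is what would allow attention to be related back to substructures.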

