Global and Local Computational Models for Aqueous Solubility Prediction of Drug-Like Molecules

Christel A. S. Bergström; Carola M. Wassvik; Ulf Norinder; Kristina Luthman; Per Artursson

doi:10.1021/ci049909h

Global and Local Computational Models for Aqueous Solubility Prediction of Drug-Like Molecules.

ChemInform ◽

10.1002/chin.200439214 ◽

2004 ◽

Vol 35 (39) ◽

Author(s):

Christel A. S. Bergstroem ◽

Carola M. Wassvik ◽

Ulf Norinder ◽

Kristina Luthman ◽

Per Artursson

Keyword(s):

Computational Models ◽

Aqueous Solubility ◽

Solubility Prediction ◽

Global And Local


Erratum: Predicting Melting Points of Organic Molecules: Applications to Aqueous Solubility Prediction Using the General Solubility Equation

Molecular Informatics ◽

10.1002/minf.201681041 ◽

2016 ◽

Vol 35 (10) ◽

pp. 538-538 ◽

Cited By ~ 1

Author(s):

J. L. McDonagh ◽

T. van Mourik ◽

J. B. O. Mitchell

Keyword(s):

Organic Molecules ◽

Aqueous Solubility ◽

Solubility Prediction ◽

Melting Points ◽

Solubility Equation


Antimetastatic Potential of Quercetin Analogues with Improved Pharmacokinetic Profile: Pharmacoinformatic Preliminary Study

Anti-Cancer Agents in Medicinal Chemistry ◽

10.2174/1871520621666210608102452 ◽

2021 ◽

Vol 21 ◽

Author(s):

Nebojša Pavlović ◽

Nastasija Milošević ◽

Maja Đanić ◽

Svetlana Goločorbin-Kon ◽

Bojan Stanimirov ◽

...

Keyword(s):

Membrane Permeability ◽

Anticancer Agents ◽

Inhibitory Activity ◽

Computational Models ◽

Aqueous Solubility ◽

Pharmacokinetic Profile ◽

Limiting Factor ◽

Hydroxyl Groups ◽

Molegro Virtual Docker ◽

uPA Receptor

Background: The urokinase-type plasminogen activator (uPA) system is a crucial pathway for tumor invasion and metastasis. Recently, multiple anticancer effects of quercetin have been described, including inhibitory activity against uPA. However, the clinical use of this flavonoid has been limited due to its low oral bioavailability. Objective: The objectives of the study were to assess the antimetastatic potential of quercetin analogues by analyzing their binding affinity for uPA and to select the compounds with improved pharmacological profiles. Methods: Binding affinities of structural analogues of quercetin to the uPA receptor were determined by molecular docking analysis using Molegro Virtual Docker software, and molecular descriptors relevant for estimating the pharmacological profile were calculated from ligand structures using computational models. Results: Among 44 quercetin analogues, only one (3,6,2’,4’,5’-pentahydroxyflavone) was found to possess higher aqueous solubility, higher membrane permeability, and a stronger affinity for uPA than quercetin, which makes it a potential lead compound for anticancer drug development. Like quercetin, this compound has five hydroxyl groups, but they are arranged differently, which contributes to its higher aqueous solubility and higher amphiphilic moment compared to quercetin. Since membrane permeability is not recognized as the limiting factor for quercetin absorption, analogues with higher aqueous solubility and retained or stronger uPA inhibitory activity should also be further experimentally validated for potential therapeutic use. Conclusion: The identified quercetin analogues with better physicochemical and pharmacological properties have a high potential to succeed in later stages of research in biological systems as potential anticancer agents with antimetastatic activity.


Aqueous Solubility Prediction of Drugs Based on Molecular Topology and Neural Network Modeling

Journal of Chemical Information and Computer Sciences ◽

10.1021/ci970100x ◽

1998 ◽

Vol 38 (3) ◽

pp. 450-456 ◽

Cited By ~ 117

Author(s):

Jarmo Huuskonen ◽

Marja Salo ◽

Jyrki Taskinen

Keyword(s):

Neural Network ◽

Network Modeling ◽

Aqueous Solubility ◽

Neural Network Modeling ◽

Molecular Topology ◽

Solubility Prediction


Improved Prediction of miRNA-Disease Associations Based on Matrix Completion with Network Regularization

Cells ◽

10.3390/cells9040881 ◽

2020 ◽

Vol 9 (4) ◽

pp. 881 ◽

Cited By ~ 1

Author(s):

Jihwan Ha ◽

Chihyun Park ◽

Chanyoung Park ◽

Sanghyun Park

Keyword(s):

Computational Models ◽

Matrix Completion ◽

Area Under The Curve ◽

Excellent Performance ◽

Novel miRNA ◽

Lack Of Information ◽

Disease Associations ◽

ROC Area ◽

Leave One Out ◽

Global And Local

The identification of potential microRNA (miRNA)-disease associations enables the elucidation of the pathogenesis of complex human diseases, owing to the crucial role of miRNAs in various biologic processes, and it yields insights into novel prognostic markers. Considering the time and costs involved in wet experiments, computational models for finding novel miRNA-disease associations would be a great alternative. However, computational models to date are biased towards known miRNA-disease associations; this is not suitable for rare miRNAs (i.e., miRNAs with few known disease associations) and uncommon diseases (i.e., diseases with few known miRNA associations). This leads to poor prediction accuracies. The most straightforward way of improving the performance is by increasing the number of known miRNA-disease associations. However, due to lack of information, increasing attention has been paid to developing computational models that can handle insufficient data via a technical approach. In this paper, we present a general framework, improved prediction of miRNA-disease associations (IMDN), based on matrix completion with network regularization to discover potential disease-related miRNAs. The success of adopting matrix factorization is demonstrated by its excellent performance in recommender systems. This approach considers a miRNA network as additional implicit feedback and makes predictions for disease associations relevant to a given miRNA based on its direct neighbors. Our experimental results demonstrate that IMDN achieved excellent performance, with reliable area under the receiver operating characteristic (ROC) curve (AUC) values of 0.9162 and 0.8965 in the frameworks of global and local leave-one-out cross-validation (LOOCV), respectively. Further, case studies demonstrated that our method can not only validate true miRNA-disease associations but also suggest novel disease-related miRNA candidates.


Computational aqueous solubility prediction for drug-like compounds in congeneric series

European Journal of Medicinal Chemistry ◽

10.1016/j.ejmech.2007.04.009 ◽

2008 ◽

Vol 43 (3) ◽

pp. 501-512 ◽

Cited By ~ 26

Author(s):

Lei Du-Cuny ◽

Jörg Huwyler ◽

Michael Wiese ◽

Manfred Kansy

Keyword(s):

Aqueous Solubility ◽

Solubility Prediction


Pushing the limits of solubility prediction via quality-oriented data selection

10.21203/rs.3.rs-84771/v1 ◽

2020 ◽

Author(s):

Murat Sorkun ◽

J. M. Koelman ◽

Süleyman Er

Keyword(s):

Prediction Models ◽

Aqueous Solubility ◽

Data Selection ◽

Data Driven ◽

Solubility Prediction ◽

Quality Of Data ◽

Statistical Validation ◽

Solubility Predictions ◽

Machine Learning Approach

Accurate prediction of the solubility of chemical substances in solvents remains a challenge. The sparsity of high-quality solubility data is recognized as the biggest hurdle in the development of robust data-driven methods for practical use. Nonetheless, the effects of the quality and quantity of data on aqueous solubility predictions have not yet been scrutinized. In this study, the roles of the size and the quality of datasets in the performance of solubility prediction models are unraveled, and the concepts of actual and observed performance are introduced. In an effort to curtail the gap between actual and observed performance, a quality-oriented data selection method, which evaluates the quality of data and extracts the most accurate part of it through statistical validation, is designed. By applying this method to the largest publicly available solubility database and using a consensus machine learning approach, a top-performing solubility prediction model is achieved.


RedDB, a Computational Database of Electroactive Molecules for Aqueous Redox Flow Batteries

10.26434/chemrxiv.14398067 ◽

2021 ◽

Author(s):

Elif Sorkun ◽

Qi Zhang ◽

Abhishek Khetan ◽

Murat Cihan Sorkun ◽

Süleyman Er

Keyword(s):

High Performance ◽

Chemical Space ◽

Aqueous Solubility ◽

Solubility Prediction ◽

Chemical Library ◽

Redox Flow Batteries ◽

Flow Batteries ◽

Virtual Chemical Libraries ◽

Electroactive Molecules ◽

Screening Approaches

An increasing number of electroactive compounds have recently been explored for their use in high-performance redox flow batteries for grid-scale energy storage. Given the vast and highly diverse chemical space of the candidate compounds, it is alluring to access their physicochemical properties in a speedy way. High-throughput virtual screening approaches, which use powerful combinatorial techniques for systematic enumerations of large virtual chemical libraries and respective property evaluations, are indispensable tools for an agile exploration of the designated chemical space. Herein, RedDB, a computational database that contains 31,677 molecules from two prominent classes of organic electroactive compounds, quinones and aza-aromatics, is presented. RedDB incorporates miscellaneous physicochemical property information of the compounds that can potentially be employed as battery performance descriptors. RedDB’s development steps are described, including: i) chemical library generation, ii) molecular property prediction based on quantum chemical calculations, iii) aqueous solubility prediction using machine learning, and iv) data processing and database creation.


Six global and local QSPR models of aqueous solubility at pH = 7.4 based on structural similarity and physicochemical descriptors

SAR and QSAR in Environmental Research ◽

10.1080/1062936x.2017.1368704 ◽

2017 ◽

Vol 28 (8) ◽

pp. 661-676 ◽

Cited By ~ 3

Author(s):

O. A. Raevsky ◽

V. Y. Grigorev ◽

D. E. Polianczyk ◽

O. E. Raevskaja ◽

J. C. Dearden

Keyword(s):

Aqueous Solubility ◽

Structural Similarity ◽

Physicochemical Descriptors ◽

Global And Local


Different molecular enumeration influences in deep learning: an example using aqueous solubility

Briefings in Bioinformatics ◽

10.1093/bib/bbaa092 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jen-Hao Chen ◽

Yufeng Jane Tseng

Keyword(s):

Neural Network ◽

Deep Learning ◽

Single Molecule ◽

Chemical Structure ◽

Molecular Graph ◽

Aqueous Solubility ◽

Solubility Prediction ◽

Input Line ◽

Biological Phenomena ◽

Feature Based

Aqueous solubility is the key property driving many chemical and biological phenomena, and it impacts experimental and computational attempts to assess those phenomena. Accurate prediction of solubility is essential and challenging, even with modern computational algorithms. Fingerprint-based, feature-based and molecular graph-based representations have all been used with different deep learning methods for aqueous solubility prediction. It has been clearly demonstrated that different molecular representations impact the model prediction and explainability. In this work, we reviewed different representations and also focused on using graph and line notations for modeling. In general, one canonical chemical structure is used to represent one molecule when computing its properties. We carefully examined the commonly used simplified molecular-input line-entry specification (SMILES) notation representing a single molecule and proposed to use the full enumeration of SMILES to achieve better accuracy. A convolutional neural network (CNN) was used. The full enumeration of SMILES can improve the representation of a molecule and describe the molecule from all possible angles. This CNN model can be very robust when dealing with large datasets, since no additional explicit chemistry knowledge is necessary to predict the solubility. Also, traditionally it is hard to use a neural network to explain the contribution of chemical substructures to a single property. We demonstrated the use of attention in the decoding network to detect the part of a molecule that is relevant to solubility, which can be used to explain the contribution from the CNN.
