common substructure

2021 ◽  
Author(s):  
José Jiménez Luna ◽  
Miha Skalic ◽  
Nils Weskamp

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the provided inputs used by an underlying supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can be useful for making a decision of which compounds to synthesize or prioritize. The consistency of the highlighted moieties alongside expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. We here present an approach that is based on maximum common substructure algorithms applied to experimentally-determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, deep-learning-based alternatives. However, none of the tested feature attribution methods sufficiently and consistently generalized when confronted with unseen examples.
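
The maximum common substructure idea that this benchmark relies on can be illustrated with a small sketch. The code below is not the authors' implementation and does not use any cheminformatics library. It is a brute-force search, under the simplifying assumption that a molecule is just a list of element labels plus a list of bonds, for the largest atom-to-atom mapping that preserves both element types and bond presence or absence. The function name and the toy molecules are hypothetical, and the result may be a disconnected fragment, whereas production tools usually require connectivity.

```python
def max_common_substructure(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return the largest mapping {atom index in A: atom index in B}.

    Exhaustive backtracking search; only suitable for very small graphs.
    """
    bonds_a = {frozenset(b) for b in bonds_a}
    bonds_b = {frozenset(b) for b in bonds_b}
    best = {}

    def compatible(i, j, mapping):
        # Elements must match, and every already-mapped pair must agree
        # on whether a bond to the new atom exists.
        if atoms_a[i] != atoms_b[j]:
            return False
        for k, l in mapping.items():
            bonded_a = frozenset((i, k)) in bonds_a
            bonded_b = frozenset((j, l)) in bonds_b
            if bonded_a != bonded_b:
                return False
        return True

    def extend(i, mapping):
        nonlocal best
        # Prune: even mapping every remaining atom cannot beat the best so far.
        if len(mapping) + (len(atoms_a) - i) <= len(best):
            return
        if i == len(atoms_a):
            best = dict(mapping)
            return
        used = set(mapping.values())
        for j in range(len(atoms_b)):
            if j not in used and compatible(i, j, mapping):
                mapping[i] = j
                extend(i + 1, mapping)
                del mapping[i]
        extend(i + 1, mapping)  # also try leaving atom i unmapped

    extend(0, {})
    return best


# Toy heavy-atom graphs: ethanol (C-C-O) and 1-propanol (C-C-C-O).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propanol = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
mapping = max_common_substructure(*ethanol, *propanol)
print(len(mapping), mapping)
```

For an activity-cliff pair, the atoms left outside such a mapping are the ones that differ between the two compounds, which is where one would expect a faithful coloring method to place its attribution.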


2021 ◽  
Vol 12 ◽  
Author(s):  
Ewa D. Micewicz ◽  
Robert D. Damoiseaux ◽  
Gang Deng ◽  
Adrian Gomez ◽  
Keisuke S. Iwamoto ◽  
...  

We previously reported several vignettes on types and classes of drugs able to mitigate acute and, in at least one case, late radiation syndromes in mice. Most of these had emerged from high-throughput screening (HTS) of bioactive and chemical drug libraries using ionizing radiation-induced lymphocytic apoptosis as a readout. Here we report the full analysis of the HTS of libraries with 85,000 small-molecule chemicals that identified 220 “hits.” Most of these hits could be allocated by maximal common substructure analysis to one of 11 clusters, each containing at least three active compounds. Further screening validated 23 compounds as being most active; 15 of these were cherry-picked based on drug availability and tested for their ability to mitigate acute hematopoietic radiation syndrome (H-ARS) in mice. Of these, five bore a 4-nitrophenylsulfonamide motif while four had a quinoline scaffold. All but two of the 15 significantly (p < 0.05) mitigated H-ARS in mice. We had previously reported that the lead 4-(nitrophenylsulfonyl)-4-phenylpiperazine compound (NPSP512) was active in mitigating multiple acute and late radiation syndromes in mice of more than one sex and strain. Unfortunately, the formulation of this drug had to be changed for regulatory reasons, and we report here on the synthesis and testing of active analogs of NPSP512 (QS1 and 52A1) that have increased solubility in water and in vivo bioavailability while retaining mitigator activity against H-ARS (p < 0.0001) and other radiation syndromes. The lead quinoline 057 was also active in multiple murine models of radiation damage. Taken together, HTS of a total of 150,000 bioactive or chemical substances, combined with maximal common substructure analysis, has resulted in the discovery of diverse groups of compounds that can mitigate H-ARS, at least some of which can mitigate multiple radiation syndromes when given starting 24 h after exposure. We discuss what is known about how these agents might work, and the importance of formulation and bioavailability.


Author(s):  
M. Vijey Aanandhi ◽  
Anbhule Sachin J

Benzimidazole and its derivatives are used in organic synthesis and as vermicides or fungicides, as they inhibit the action of certain microorganisms. The molecules to be analysed were aligned on an appropriate template, which is considered to be the common substructure. The protein structure, along with its inhibitor, was retrieved from the RCSB Protein Data Bank (PDB entry code: 6T1O). The protein structure was subjected to energy minimization and charge calculation (AMBER7FF99), and the docking scores of the compounds on 6T1O describe the ligand interactions. A virtual library of benzimidazole derivatives was used to find lead structures to test against C. albicans. Twenty compounds were designed in which a heterocyclic ring is substituted at the NH group of the substituted ortho-phenylenediamine moiety, while some compounds also bear chloro and nitro groups at the para position of the aromatic ring.


2021 ◽  
Author(s):  
Drazen Petrov

Free-energy calculations play an important role in the application of computational chemistry to a range of fields, including protein biochemistry, rational drug design and materials science. Importantly, the free-energy difference is directly related to experimentally measurable quantities such as partition and adsorption coefficients, water activity and binding affinities. Among several techniques aimed at predicting free-energy differences, perturbation approaches, involving alchemical transformation of one molecule into another through intermediate states, stand out as rigorous methods based on statistical mechanics. However, despite the importance of efficient and accurate free-energy predictions, applicability of the perturbation approaches is still largely impeded by a number of challenges. This study aims at addressing two of them: 1) the definition of the perturbation path, i.e., the alchemical changes leading to the transformation of one molecule into the other, and 2) determining the amount of sampling along the path needed to reach the desired convergence. In particular, an automatic perturbation builder based on a graph matching algorithm is developed that is able to identify the maximum common substructure of two molecules and provide the perturbation topologies suitable for free-energy calculations using the GROMOS and GROMACS simulation packages. Moreover, it was used to calculate the changes in free energy of a set of post-translational modifications and analyze their convergence behavior. Different methods were tested, which showed that MBAR and extended thermodynamic integration (TI) in combination with MBAR perform better than BAR, extended TI with linear interpolation and plain TI. Also, a number of error estimators were explored, along with how they relate to the true error, estimated as the difference in free energy from an extensive set of simulation data. This analysis shows that most of the estimators provide only a qualitative agreement with the true error, with little quantitative predictive power. This notwithstanding, the performed analyses provided insight into the convergence of free-energy calculations, which allowed for development of an iterative update scheme for perturbation simulations that aims at minimizing the simulation time needed to reach convergence, i.e., optimizing the efficiency. Importantly, this toolkit is made available online as an open-source Python package (https://github.com/drazen-petrov/SMArt).
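
As a point of reference for the estimators compared in this abstract, plain thermodynamic integration estimates the free-energy difference as the integral of the ensemble-averaged derivative of the Hamiltonian with respect to the coupling parameter λ, from λ = 0 to λ = 1. The sketch below approximates that integral with the trapezoidal rule. It is only an illustration: it is not the SMArt API, it does not implement extended TI, BAR or MBAR, and the function name and the input values are invented for the example.

```python
def thermodynamic_integration(lambdas, dhdl_means):
    """Trapezoidal estimate of the integral of <dH/dlambda> over lambda.

    lambdas    : increasing coupling-parameter values, e.g. from 0.0 to 1.0
    dhdl_means : ensemble average of dH/dlambda at each of those values
    """
    if len(lambdas) != len(dhdl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching lambda points")
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        if width <= 0:
            raise ValueError("lambda values must be strictly increasing")
        # Area of one trapezoid between neighbouring lambda points.
        total += 0.5 * width * (dhdl_means[k] + dhdl_means[k + 1])
    return total


# Synthetic averages following <dH/dlambda> = 4 * lambda (arbitrary units),
# chosen so that the exact integral is 2.0.
lambda_points = [0.0, 0.25, 0.5, 0.75, 1.0]
dhdl = [4.0 * lam for lam in lambda_points]
delta_g = thermodynamic_integration(lambda_points, dhdl)
print(delta_g)
```

With real simulation data the averages are noisy and the curve is rarely linear, which is why the number and placement of λ points, and the choice of estimator, affect convergence.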


2020 ◽  
Author(s):  
Vinita Periwal ◽  
Stefan Bassler ◽  
Sergej Andrejev ◽  
Natalia Gabrielli ◽  
Athanasios Typas ◽  
...  

Summary: Natural products constitute a vast yet largely untapped resource of molecules with therapeutic properties. Computational approaches based on structural similarity offer a scalable way of evaluating their bioactivity potential. However, this remains challenging due to the immense structural diversity of natural compounds and the complexity of structure-activity relationships. We here assess the bioactivity potential of natural compounds using random forest models utilizing structural fingerprints, maximum common substructure, and molecular descriptors. The models are trained with small-molecule drugs for which the corresponding protein targets are known (1,410 drugs, 0.9 million pairs). Using these models, we evaluated circa 11k natural compounds for functional similarity with therapeutic drugs (1.7 million pairs). The resulting natural compound-drug similarity network consists of several links with support from the published literature as well as links suggestive of unexplored bioactivity of natural compounds. As a proof of concept, we experimentally validated the model-predicted Cox-1 inhibitory activity of 5-methoxysalicylic acid, a compound commonly found in tea, herbs and spices. In contrast, a control compound, with the highest similarity score when using the most weighted fingerprint metric, did not inhibit Cox-1. Our results illustrate the importance of complementing structural similarity with prior data on molecular interactions, and present a resource for exploring the therapeutic potential of natural compounds.


2020 ◽  
Vol 49 (11) ◽  
pp. 1302-1305
Author(s):  
Toshiki Higashino ◽  
Kazunori Kuribara ◽  
Naoya Toda ◽  
Sei Uemura ◽  
Hiroaki Tachibana ◽  
...  

2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Aurelio Antelo-Collado ◽  
Ramón Carrasco-Velar ◽  
Nicolás García-Pedrajas ◽  
Gonzalo Cerruela-García

Abstract: The maximum common property similarity (MCPhd) method is presented using descriptors as a new approach to determine the similarity between two chemical compounds or molecular graphs. This method uses the concept of maximum common property, which arises from the concept of maximum common substructure, and is based on the electrotopographic state index for atoms. A new algorithm to quantify the similarity values of chemical structures based on the presented maximum common property concept is also developed in this paper. To verify the validity of this approach, the similarity of a sample of compounds with antimalarial activity is calculated and compared with the results obtained by four different similarity methods: the small molecule subgraph detector (SMSD), molecular fingerprint based (OBabel_FP2), ISIDA descriptors and shape-feature similarity (SHAFTS). The results obtained by the MCPhd method differ significantly from those obtained by the compared methods, improving the quantification of the similarity. A major advantage of the proposed method is that it helps to understand the analogy or proximity between physicochemical properties of the molecular fragments or subgraphs compared with the biological response or biological activity. In this new approach, more than one property can potentially be used. The method can be considered a hybrid procedure because it combines the descriptor and fragment approaches.


Molecules ◽  
2020 ◽  
Vol 25 (15) ◽  
pp. 3446 ◽  
Author(s):  
Soumitra Samanta ◽  
Steve O’Hagan ◽  
Neil Swainston ◽  
Timothy J. Roberts ◽  
Douglas B. Kell

Molecular similarity is an elusive but core “unsupervised” cheminformatics concept, yet different “fingerprint” encodings of molecular structures return very different similarity values, even when using the same similarity metric. Each encoding may be of value when applied to other problems with objective or target functions, implying that a priori none are “better” than the others, nor than encoding-free metrics such as maximum common substructure (MCSS). We here introduce a novel approach to molecular similarity, in the form of a variational autoencoder (VAE). This learns the joint distribution p(z|x) where z is a latent vector and x are the (same) input/output data. It takes the form of a “bowtie”-shaped artificial neural network. In the middle is a “bottleneck layer” or latent vector in which inputs are transformed into, and represented as, a vector of numbers (encoding), with a reverse process (decoding) seeking to return the SMILES string that was the input. We train a VAE on over six million druglike molecules and natural products (including over one million in the final holdout set). The VAE vector distances provide a rapid and novel metric for molecular similarity that is both easily and rapidly calculated. We describe the method and its application to a typical similarity problem in cheminformatics.
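
The observation that different encodings give different similarity values under the same metric can be shown with a toy calculation. The sketch below computes the Tanimoto (Jaccard) coefficient on feature sets. The two "encodings" are invented for illustration and are not real fingerprints, and returning 0.0 for two empty sets is an assumed convention.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when both sets are empty
    return len(a & b) / len(union)


# Two hypothetical encodings of the same pair of molecules X and Y.
element_encoding = {
    "X": {"C", "O", "N"},
    "Y": {"C", "O"},
}
bond_encoding = {
    "X": {"C-C", "C-O", "C-N", "C=O"},
    "Y": {"C-C", "C-O"},
}

sim_elements = tanimoto(element_encoding["X"], element_encoding["Y"])
sim_bonds = tanimoto(bond_encoding["X"], bond_encoding["Y"])
print(sim_elements, sim_bonds)
```

The same pair of molecules scores 2/3 under the first encoding and 1/2 under the second, even though the metric is unchanged. A learned latent vector, as in the abstract above, replaces such hand-made feature sets with coordinates whose distances serve as the similarity measure.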

