Improved Selection of Rare Reactions in Template-Based Retrosynthesis Predictions

Author(s):  
Mads Koerstz ◽  
Samuel Genheden ◽  
Ola Engkvist ◽  
Jan H. Jensen ◽  
Esben Jannik Bjerrum

Identifying synthetic routes for molecules of interest is a crucial step when discovering new drugs or materials. To find synthetic routes, we can use computer-assisted synthesis planning with expansion policy networks trained on reaction templates extracted from patents and the literature. However, experience has shown that these networks are biased towards frequently reported reactions. This study shows that changing the molecular representation from an extended-connectivity fingerprint to a simple graph representation can increase the accuracy for templates used fewer than five times by 5.0-8.5 percentage points. We also illustrate that a simple oversampling of the training set yielded a top-1 accuracy increase of 17-20 percentage points for templates used five times or fewer.
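The abstract does not say how the oversampling was carried out, so the following is only a minimal sketch of one common approach: duplicating training examples of rare templates until each template reaches a minimum count. The function name `oversample_rare_templates`, the `min_count` threshold, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def oversample_rare_templates(samples, min_count=5, seed=0):
    """Return a new training list in which every template occurs at least
    `min_count` times, by re-drawing (with replacement) from its own examples.

    `samples` is a list of (product_smiles, template_id) pairs.
    """
    rng = random.Random(seed)

    # Group the training examples by the template they are labelled with.
    by_template = {}
    for sample in samples:
        by_template.setdefault(sample[1], []).append(sample)

    result = list(samples)
    for group in by_template.values():
        deficit = min_count - len(group)
        if deficit > 0:
            # Rare template: add randomly chosen copies of its examples.
            result.extend(rng.choices(group, k=deficit))
    return result


# Toy data (hypothetical): one frequent template and two rare ones.
data = [("CCO", "t1")] * 8 + [("CCN", "t2"), ("CCC", "t3"), ("CCCl", "t3")]
balanced = oversample_rare_templates(data, min_count=5)
counts = Counter(template for _, template in balanced)
print(counts)
```

Frequent templates are left untouched in this sketch, so only the rare classes grow; whether the study balanced the classes in this way or used a different scheme is not stated in the abstract.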

2019 ◽  
Author(s):  
Amol Thakkar ◽  
Thierry Kogej ◽  
Jean-Louis Reymond ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

Computer Assisted Synthesis Planning (CASP) has gained considerable interest of late. Herein we investigate a template-based retrosynthetic planning tool, trained on a variety of datasets consisting of up to 17.5 million reactions. We demonstrate that models trained on datasets such as internal Electronic Laboratory Notebooks (ELN), and the publicly available United States Patent and Trademark Office (USPTO) extracts, are sufficient for the prediction of full synthetic routes to compounds of interest in medicinal chemistry. To this end, we have assessed the models on 1,731 compounds from 41 virtual libraries for which experimental results were known. Furthermore, we show that accuracy is a misleading metric for assessment of the ‘filter network’, and propose that the number of successfully applied templates, in conjunction with the overall ability to generate full synthetic routes, be examined instead. We found that the specificity of the templates comes at the cost of generalizability and overall model performance. This is supplemented by a comparison of the underlying datasets and their corresponding models.


2020 ◽  
Author(s):  
Ryosuke Shibukawa ◽  
Shoichi Ishida ◽  
Kazuki Yoshizoe ◽  
Kunihiro Wasa ◽  
Kiyosei Takasu ◽  
...  

In computer-assisted synthesis planning (CASP) programs, providing as many synthetic routes as possible is essential for considering optimal and alternative routes in a chemical reaction network. As the majority of CASP programs have been designed to provide one or a few optimal routes, it is likely that the desired route will not be included. To avoid this, an exact algorithm that lists the possible synthetic routes in the chemical reaction network is required, alongside a recommendation of synthetic routes that meet specified criteria based on the chemist's objectives. Herein, we propose a chemical-reaction-network-based synthetic route recommendation framework called "CompRet" with a mathematically guaranteed enumeration algorithm. In a preliminary experiment, CompRet was shown to successfully provide alternative routes for a known antihistaminic drug, cetirizine. CompRet is expected to promote desirable enumeration-based chemical synthesis searches and aid the development of an interactive CASP framework for chemists.
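To make the idea of exhaustive route enumeration concrete, here is a small sketch that lists every route in a toy reaction network by depth-first search. It is not the CompRet algorithm, whose details the abstract does not give; the network encoding (a dictionary from a molecule to its alternative precursor sets), the `stock` of purchasable molecules, and all molecule names are hypothetical.

```python
from itertools import product


def enumerate_routes(target, reactions, stock, _visiting=frozenset()):
    """Yield every synthetic route to `target` as a nested tuple.

    A purchasable molecule is the leaf (name,); any other molecule is
    (name, (sub_route_1, sub_route_2, ...)) for one choice of reaction.
    """
    if target in stock:
        yield (target,)
        return
    if target in _visiting:
        # Cycle: this molecule is already being expanded higher up the route.
        return
    visiting = _visiting | {target}
    for precursors in reactions.get(target, []):
        # All routes to each precursor; one empty list rules out the reaction.
        sub_routes = [
            list(enumerate_routes(p, reactions, stock, visiting))
            for p in precursors
        ]
        # A route needs one sub-route per precursor, in every combination.
        for combination in product(*sub_routes):
            yield (target, combination)


# Toy network (hypothetical): each molecule maps to its alternative precursor sets.
reactions = {
    "T": [("A", "B"), ("C",)],
    "A": [("D",)],
    "C": [("T",), ("D", "E")],
}
stock = {"B", "D", "E"}

routes = list(enumerate_routes("T", reactions, stock))
for route in routes:
    print(route)
```

The number of routes can grow combinatorially with network size, which is why a plain recursive listing like this one is only suitable for small examples.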


2019 ◽  
Author(s):  
Shuangjia Zheng ◽  
Jiahua Rao ◽  
Zhongyue Zhang ◽  
Jun Xu ◽  
Yuedong Yang

Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes, but at present it is cumbersome and provides results of unsatisfactory quality. In this study, we develop a template-free self-corrected retrosynthesis predictor (SCROP) that performs the retrosynthesis prediction task using the Transformer neural network architecture. In this method, retrosynthesis planning is cast as a machine translation problem between the molecular linear notations of the reactants and the products. Coupled with a neural network-based syntax corrector, our method achieves an accuracy of 59.0% on a standard benchmark dataset, an improvement of >21% over other deep learning methods and >6% over template-based methods. More importantly, our method shows an accuracy 1.7 times higher than other state-of-the-art methods for compounds not appearing in the training set.


Biomolecules ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 897
Author(s):  
Oliwia Koszła ◽  
Piotr Stępnicki ◽  
Agata Zięba ◽  
Angelika Grudzińska ◽  
Dariusz Matosiuk ◽  
...  

Parkinson’s disease is a progressive neurodegenerative disorder characterized by the death of nerve cells in the substantia nigra of the brain. Treatment options are very limited: current treatment is mainly symptomatic, and the available drugs can only slow the progression of the disease, not stop it completely. There is still a need to search for new compounds with an optimal pharmacological profile that would stop the rapidly progressing disease. An increasing understanding of Parkinson’s pathogenesis and the discovery of new molecular targets pave the way to develop new therapeutic agents. The selection and use of appropriate cell and animal models that better reflect pathogenic changes in the brain is a key aspect of the research. In addition, computer-assisted drug design methods are a promising approach to developing effective compounds with potential therapeutic effects. In light of the above, in this review, we present current approaches for developing new drugs for Parkinson’s disease.


2020 ◽  
Vol 24 (8) ◽  
pp. 817-854
Author(s):  
Anil Kumar ◽  
Nishtha Saxena ◽  
Arti Mehrotra ◽  
Nivedita Srivastava

Quinolone derivatives have attracted considerable attention due to their medicinal properties. This review covers many synthetic routes for the preparation of quinolones, together with their antibacterial properties. A detailed study of the structure-activity relationships among quinolone derivatives will be helpful in designing new drugs in this field.


2020 ◽  
Vol 16 (6) ◽  
pp. 784-795
Author(s):  
Krisnna M.A. Alves ◽  
Fábio José Bonfim Cardoso ◽  
Kathia M. Honorio ◽  
Fábio A. de Molfetta

Background: Leishmaniosis is a neglected tropical disease and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) is a key enzyme in the design of new drugs to fight this disease. Objective: The present study aimed to evaluate potential inhibitors of the GAPDH enzyme found in Leishmania mexicana (L. mexicana). Methods: A search for novel antileishmanial molecules was carried out based on pharmacophoric similarity to the binding site of the crystallographic enzyme, using the ZINCPharmer server. The molecules selected in this screening were subjected to molecular docking and molecular dynamics simulations. Results: Consensual analysis of the docking energy values was performed, resulting in the selection of ten compounds. These ligand-receptor complexes were visually inspected in order to analyze the main interactions and subjected to toxicophoric evaluation, culminating in the selection of three compounds, which were subsequently submitted to molecular dynamics simulations. The docking results showed that the selected compounds interacted with GAPDH from L. mexicana, especially by hydrogen bonds with Cys166, Arg249, His194, Thr167, and Thr226. From the results obtained from molecular dynamics, it was observed that one of the loop regions, corresponding to residues 195-222, can be related to the fitting of the substrate at the binding site, assisting in the positioning and the molecular recognition via residues responsible for the catalytic activity. Conclusion: The use of molecular modeling techniques enabled the identification of promising compounds as inhibitors of the GAPDH enzyme from L. mexicana, and the results obtained here can serve as a starting point to design new and more effective compounds than those currently available.


Author(s):  
Søren Ager Meldgaard ◽  
Jonas Köhler ◽  
Henrik Lund Mortensen ◽  
Mads-Peter Verner Christiansen ◽  
Frank Noé ◽  
...  

Chemical space is routinely explored by machine learning methods to discover interesting molecules before time-consuming experimental synthesis is attempted. However, these methods often rely on a graph representation, ignoring the 3D information necessary for determining the stability of the molecules. We propose a reinforcement learning approach for generating molecules in Cartesian coordinates, allowing for quantum chemical prediction of their stability. To improve sample efficiency, we learn basic chemical rules from imitation learning on the GDB-11 database to create an initial model applicable to all stoichiometries. We then deploy multiple copies of the model, each conditioned on a specific stoichiometry, in a reinforcement learning setting. The models correctly identify low-energy molecules in the database and produce novel isomers not found in the training set. Finally, we apply the model to larger molecules to show how reinforcement learning further refines the imitation learning model in domains far from the training data.

