RetroPrime: A Chemistry-Inspired and Transformer-based Method for Retrosynthesis Predictions

2020 ◽  
Author(s):  
Xiaorui Wang ◽  
Jiezhong Qiu ◽  
Yuquan Li ◽  
Guangyong Chen ◽  
Huanxiang Liu ◽  
...  

Retrosynthesis prediction is a crucial task for organic synthesis. In this work, we propose a template-free, Transformer-based method dubbed RetroPrime, which integrates chemists’ retrosynthetic strategy of (1) decomposing a molecule into synthons and then (2) generating reactants by attaching leaving groups. Each of these two steps is accomplished with a versatile Transformer model. While RetroPrime performs competitively against all state-of-the-art models on the standard USPTO-50K dataset, it shows remarkable generalizability and outperforms the only published result by a non-trivial margin of 4.8% in Top-1 accuracy on the large-scale USPTO-full dataset. The outputs of Transformer-based retrosynthesis models are known to suffer from insufficient diversity and high invalidity. These problems may limit the potential of Transformer-based methods in practice, yet no prior work addresses both issues simultaneously. RetroPrime is designed to tackle these challenges. Finally, we provide convincing results to support the claim that RetroPrime can generalize more effectively across chemical space.
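
The Top-1 accuracy quoted above is a ranking metric: a target counts as solved if the reference reactant set appears among the first k ranked candidates. The following is a minimal sketch of how such a Top-k score could be computed, not the authors' evaluation code. The function name `top_k_accuracy`, the toy SMILES strings, and the use of exact string matching are all assumptions made for illustration; real evaluations typically canonicalize SMILES before comparing.

```python
def top_k_accuracy(ranked_predictions, references, k):
    """Fraction of targets whose reference is among the first k candidates.

    ranked_predictions: one list of candidate strings per target, best first.
    references: one ground-truth string per target.
    Assumption: candidates and references are already in a comparable
    (for example canonical) form, so exact string equality is meaningful.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("need exactly one reference per prediction list")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    hits = sum(
        1
        for candidates, reference in zip(ranked_predictions, references)
        if reference in candidates[:k]
    )
    return hits / len(references)


# Hypothetical toy data: three targets with two ranked candidates each.
predictions = [
    ["CC(=O)O.OCC", "CC(=O)Cl.OCC"],   # reference is ranked second
    ["Brc1ccccc1.OB(O)c1ccccc1", "Ic1ccccc1.OB(O)c1ccccc1"],  # ranked first
    ["CCO", "CCN"],                    # reference is missing
]
references = ["CC(=O)Cl.OCC", "Brc1ccccc1.OB(O)c1ccccc1", "CCBr"]

top1 = top_k_accuracy(predictions, references, k=1)
top2 = top_k_accuracy(predictions, references, k=2)
print(f"Top-1: {top1:.3f}, Top-2: {top2:.3f}")
```

On this toy set only the second target is solved at k=1, while the first target is also recovered at k=2, which is why Top-k scores are usually reported for several values of k.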


2020 ◽  
Author(s):  
Chaochao Yan ◽  
Qianggang Ding ◽  
Peilin Zhao ◽  
Shuangjia Zheng ◽  
Jinyu Yang ◽  
...  

Retrosynthesis is the process of recursively decomposing target molecules into available building blocks. It plays an important role in solving problems in organic synthesis planning. To automate retrosynthesis analysis, many retrosynthesis prediction methods have been proposed. However, most of them are cumbersome and their predictions lack interpretability. In this paper, we devise a novel template-free algorithm, RetroXpert, for automatic retrosynthetic expansion by automating the procedure that chemists have traditionally performed. Our method breaks retrosynthesis down into two steps: i) we identify the potential reaction center within the target molecule through a graph neural network and generate intermediate synthons; and ii) we predict the associated reactants based on the obtained synthons via a reactant generation model. While outperforming the state-of-the-art baselines by a significant margin, our model also provides chemically reasonable interpretation.


2019 ◽  
Author(s):  
Kyle Konze ◽  
Pieter Bos ◽  
Markus Dahlgren ◽  
Karl Leswing ◽  
Ivan Tubert-Brohman ◽  
...  

We report a new computational technique, PathFinder, that uses retrosynthetic analysis followed by combinatorial synthesis to generate novel compounds in synthetically accessible chemical space. Coupling PathFinder with active learning and cloud-based free energy calculations allows for large-scale potency predictions of compounds on a timescale that impacts drug discovery. The process is further accelerated by using a combination of population-based statistics and active learning techniques. Using this approach, we rapidly optimized R-groups and core hops for inhibitors of cyclin-dependent kinase 2. We explored more than 300,000 ideas and identified 35 ligands with diverse commercially available R-groups and a predicted IC50 < 100 nM, and four unique cores with a predicted IC50 < 100 nM. The rapid turnaround time and the scale of chemical exploration suggest that this is a useful approach to accelerate the discovery of novel chemical matter in drug discovery campaigns.


2018 ◽  
Vol 14 (12) ◽  
pp. 1915-1960 ◽  
Author(s):  
Rudolf Brázdil ◽  
Andrea Kiss ◽  
Jürg Luterbacher ◽  
David J. Nash ◽  
Ladislava Řezníčková

Abstract. The use of documentary evidence to investigate past climatic trends and events has become a recognised approach in recent decades. This contribution presents the state of the art in its application to droughts. The range of documentary evidence is very wide, including general annals, chronicles, memoirs and diaries kept by missionaries, travellers and those specifically interested in the weather; records kept by administrators tasked with keeping accounts and other financial and economic records; legal-administrative evidence; religious sources; letters; songs; newspapers and journals; pictographic evidence; chronograms; epigraphic evidence; early instrumental observations; society commentaries; and compilations and books. These are available from many parts of the world. This variety of documentary information is evaluated with respect to the reconstruction of hydroclimatic conditions (precipitation, drought frequency and drought indices). Documentary-based drought reconstructions are then addressed in terms of long-term spatio-temporal fluctuations, major drought events, relationships with external forcing and large-scale climate drivers, socio-economic impacts and human responses. Documentary-based drought series are also considered from the viewpoint of spatio-temporal variability for certain continents, and their employment together with hydroclimate reconstructions from other proxies (in particular tree rings) is discussed. Finally, conclusions are drawn, and challenges for the future use of documentary evidence in the study of droughts are presented.


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they rely strongly on the augmentation of unannotated data. This remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
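
To make the reliance on augmentation concrete, the sketch below shows the unlabeled-data loss of FixMatch as it is generally described: the prediction on a weakly augmented input becomes a hard pseudo-label if its confidence exceeds a threshold, and the prediction on a strongly augmented view of the same input is trained towards that label. This is an illustrative pure-Python version, not the implementation evaluated in the paper. The function names, the toy logits, and the threshold of 0.95 are assumptions, and the augmentations themselves are left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Mean pseudo-label cross-entropy over a batch of unlabeled examples.

    weak_logits / strong_logits: model outputs for the weakly and strongly
    augmented views of the same examples, in the same order.
    Examples whose weak-view confidence is below the threshold contribute
    zero loss but still count in the batch size used for averaging.
    """
    if len(weak_logits) != len(strong_logits):
        raise ValueError("weak and strong batches must have the same size")
    if not weak_logits:
        raise ValueError("cannot compute a loss on an empty batch")
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # prediction too uncertain to serve as a pseudo-label
        pseudo_label = weak_probs.index(confidence)
        strong_probs = softmax(strong)
        total += -math.log(strong_probs[pseudo_label])
    return total / len(weak_logits)


# Hypothetical logits for two unlabeled clips and three classes.
weak_batch = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]    # confident, uncertain
strong_batch = [[2.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
loss = fixmatch_unlabeled_loss(weak_batch, strong_batch)
print(f"unlabeled loss: {loss:.4f}")
```

Only the first clip passes the confidence threshold here, so the second clip is masked out. In training, this term is added to the ordinary supervised loss on the labeled data with a weighting factor.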

