Data-driven Chemical Reaction Prediction and Retrosynthesis

2019 ◽  
Vol 73 (12) ◽  
pp. 997-1000
Author(s):  
Vishnu H Nair ◽  
Philippe Schwaller ◽  
Teodoro Laino

The synthesis of organic compounds, which is central to many areas such as drug discovery, material synthesis and biomolecular chemistry, requires chemists to have years of knowledge and experience. The development of technologies with the potential to learn and support experts in the design of synthetic routes is a half-century-old challenge with an interesting revival in the last decade. In fact, the renewed interest in artificial intelligence (AI), driven mainly by data availability, is profoundly changing the landscape of computer-aided chemical reaction prediction and retrosynthetic analysis. In this article, we briefly review different approaches to predict forward reactions and retrosynthesis, with a strong focus on data-driven ones. While data-driven technologies still need to demonstrate their full potential compared to expert rule-based systems in synthetic chemistry, the acceleration experienced in the last decade is a convincing sign that where we use software today, there will be AI tomorrow. This revolution will help and empower bench chemists, driving the transformation of chemistry towards a high-tech business over the next decades.

2019 ◽  
Author(s):  
Shoichi Ishida ◽  
Kei Terayama ◽  
Ryosuke Kojima ◽  
Kiyosei Takasu ◽  
Yasushi Okuno

Recently, many research groups have pursued data-driven approaches to reaction prediction and retrosynthetic analysis. Although the performance of data-driven approaches has improved thanks to recent advances in machine learning and deep learning, two problems still hinder practical use by chemists: limited reaction-prediction capability and the black-box nature of neural networks. To make data-driven approaches more accessible to chemists, we focused on two challenges: improving reaction prediction and making the predictions interpretable. In this paper, we propose an interpretable prediction framework that uses Graph Convolutional Networks (GCN) for reaction prediction and Integrated Gradients (IGs) to visualize contributions to the prediction. Our model outperformed an approach based on Extended-Connectivity Fingerprints (ECFP). Furthermore, IG-based visualization of the GCN predictions successfully highlighted reaction-related atoms.
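
The attribution idea behind Integrated Gradients can be illustrated with a minimal sketch. It is not the authors' GCN framework: the toy logistic model, its weights, and the all-zero baseline below are assumptions chosen only to show how gradients are averaged along a straight path from a baseline to the input and then scaled by the input difference.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    # Toy stand-in for a trained network: logistic function of a linear score.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    # Analytic gradient of the toy model with respect to each input feature.
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    # Average the gradient along the straight path baseline -> x
    # (midpoint Riemann sum), then scale by (x - baseline).
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        g = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]

# Hypothetical weights and input, for illustration only.
w, b = [2.0, -1.0, 0.0], 0.0
x, baseline = [1.0, 0.5, 3.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
print(attributions)
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline. A feature the model ignores (weight zero here) receives zero attribution.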


2020 ◽  
Author(s):  
Shoichi Ishida ◽  
Kei Terayama ◽  
Ryosuke Kojima ◽  
Kiyosei Takasu ◽  
Yasushi Okuno

Computer-aided synthesis planning (CASP) aims to assist chemists in performing retrosynthetic analysis, for which they exploit their experiments, intuition, and knowledge. Recent breakthroughs in machine learning techniques, including deep neural networks, have significantly improved data-driven synthetic route design without human intervention. However, such CASP applications have yet to incorporate retrosynthesis knowledge into their algorithms well enough to reflect chemists' way of thinking flexibly. In this study, we developed "ReTReK", a hybrid CASP application that combines data-driven techniques with various kinds of retrosynthesis knowledge and integrates that knowledge, as adjustable parameters, into the evaluation of promising search directions. Experimental results showed that ReTReK successfully found synthetic routes based on the specified retrosynthesis knowledge, and that routes found with the knowledge were preferred to those found without it. Integrating retrosynthesis knowledge as adjustable parameters into data-driven CASP applications is expected to contribute to their further development and to their wider adoption among chemists.


2020 ◽  
Author(s):  
Tsuyoshi Mita ◽  
Yu Harabuchi ◽  
Satoshi Maeda

The systematic exploration of synthetic pathways to afford a desired product through quantum chemical calculations remains a considerable challenge. In 2013, Maeda et al. introduced ‘quantum chemistry aided retrosynthetic analysis’ (QCaRA), which uses quantum chemical calculations to search systematically for decomposition paths of the target product and propose a synthesis method. However, until now, no new reactions suggested by QCaRA have been reported to lead to experimental discoveries. Using a difluoroglycine derivative as a target, this study investigated the ability of QCaRA to suggest various synthetic paths to the target without relying on previous data or the knowledge and experience of chemists. Furthermore, experimental verification of the seemingly most promising path led to the discovery of a synthesis method for the difluoroglycine derivative. The extent of the hands-on expertise of chemists required during the verification process was also evaluated. These insights are expected to advance the applicability of QCaRA to the discovery of viable experimental synthetic routes.


10.29007/fbh3 ◽  
2018 ◽  
Author(s):  
Xiaohan Li ◽  
Patrick Willems

Urban flood pre-warning decisions based on urban flood modeling are crucial for safeguarding people and property in urban areas. However, urbanization, changing environmental conditions and climate change challenge the adaptability of urban sewer models. While hydraulic models can make accurate flood predictions, they are less flexible and more computationally expensive than conceptual models, which are simpler and more efficient. In an era of rapidly growing data availability and computing capability, data-driven models are gaining popularity in urban flood modeling, but they suffer from data sparseness. To overcome this issue, a hybrid urban flood modeling approach is proposed in this study. It combines a conceptual model, which accounts for the dominant sewer hydrological processes, with a logistic regression model that predicts the probability of flooding on a sub-urban scale. The approach is demonstrated for a highly urbanized area in Antwerp, Belgium. Comparison with a 1D/0D hydrodynamic model shows promising results: the approach makes probabilistic flood predictions regardless of rainfall type or seasonal variation. In addition, the model is more tolerant of input data quality and is fully adaptable to real-time applications.
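
The logistic regression stage of such a hybrid model can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single predictor (a normalized sewer filling level, imagined as the output of a conceptual model), the synthetic observations, and the plain gradient-descent fit are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    # One-feature logistic regression fitted by batch gradient descent
    # on the mean log-loss.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def flood_probability(x, w, b):
    # Probability of flooding given the conceptual-model output x.
    return sigmoid(w * x + b)

# Synthetic, made-up data: normalized filling level (hypothetical
# conceptual-model output) and whether flooding was observed (1) or not (0).
filling = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
flooded = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(filling, flooded)
for level in (0.2, 0.5, 0.9):
    print(level, round(flood_probability(level, w, b), 3))
```

The output is a probability rather than a yes/no answer, which is what allows a warning threshold to be tuned to the cost of false alarms versus missed floods.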


2017 ◽  
Vol 19 (1) ◽  
pp. 127-139 ◽  
Author(s):  
Jun Li ◽  
Eric M. Simmons ◽  
Martin D. Eastgate

A predictive analytics approach to understanding process mass intensity (PMI) is described. This method leverages real-world data to predict probable PMI outcomes for a potential synthetic route and to compare PMI outcomes to the summation of prior experience.


MRS Advances ◽  
2020 ◽  
Vol 5 (29-30) ◽  
pp. 1497-1511
Author(s):  
Sergey V. Barabash

We describe how the development of advanced materials via high-throughput experimentation at Intermolecular® is accelerated using guidance from modelling, machine learning (ML) and other data-driven approaches. Focusing on rapid development of materials for the semiconductor industry at a reasonable cost, we review the strengths and the limitations of data-driven methods. ML applied to the experimental data accelerates the development of record-breaking materials, but needs a supply of physically meaningful descriptors to succeed in a practical setting. Theoretical materials design greatly benefits from the external modelling ecosystems that have arisen over the last decade, enabling a rapid theoretical screening of materials, including additional material layers introduced to improve the performance of the material stack as a whole, “dopants” to stabilize a given phase of a polymorphic material, etc. We discuss the relative importance of different approaches, and note that the success rates for seemingly similar problems can be drastically different. We then discuss the methods that assist experimentation by providing better phase identification. Finally, we compare the strengths of different approaches, using as an example the problem of identifying regions of thermodynamic stability in multicomponent systems.

