Molecular Generation Targeting Desired Electronic Properties via Deep Generative Models

Author(s):  
Qi Yuan ◽  
Alejandro Santana-Bonilla ◽  
Martijn Zwijnenburg ◽  
Kim Jelfs

The chemical space for novel electronic donor-acceptor oligomers with targeted properties was explored using deep generative models and transfer learning. A General Recurrent Neural Network model was trained on the ChEMBL database to generate chemically valid SMILES strings. The parameters of the General Recurrent Neural Network were then fine-tuned via transfer learning on the electronic donor-acceptor database from the Computational Material Repository to generate novel donor-acceptor oligomers. Six transfer learning models were developed, each trained on a different subset of the donor-acceptor database. We concluded that electronic properties of the training sets, such as HOMO-LUMO gaps and dipole moments, can be learned from the SMILES representation with deep generative models, and that the chemical space of the training sets can be explored efficiently. This approach identified approximately 1700 new molecules with promising electronic properties (HOMO-LUMO gap <2 eV and dipole moment <2 Debye), six times more than in the original database. Among the molecular transformations, the deep generative model has learned to produce novel molecules by trading off selected atomic substitutions (such as halogenation or methylation) against molecular features such as the spatial extension of the oligomer. The method can be extended as a plausible source of new chemical combinations to explore chemical space effectively for targeted properties.
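The screening criterion quoted in the abstract (HOMO-LUMO gap <2 eV and dipole moment <2 Debye) can be sketched as a simple filter over generated candidates. This is a minimal illustration, not the authors' pipeline: the `Candidate` class, the function names, and all property values below are hypothetical, and the numbers are invented placeholders rather than computed results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one generated molecule."""
    smiles: str          # SMILES string emitted by the generative model
    gap_ev: float        # HOMO-LUMO gap in eV (assumed to be precomputed)
    dipole_debye: float  # dipole moment in Debye (assumed to be precomputed)


def is_promising(c, max_gap_ev=2.0, max_dipole_debye=2.0):
    """Apply the two thresholds stated in the abstract (strict inequalities)."""
    return c.gap_ev < max_gap_ev and c.dipole_debye < max_dipole_debye


def select_promising(candidates):
    """Drop repeated SMILES strings, then keep candidates passing both thresholds."""
    seen = set()
    kept = []
    for c in candidates:
        if c.smiles in seen:
            continue  # the generator may emit the same string more than once
        seen.add(c.smiles)
        if is_promising(c):
            kept.append(c)
    return kept


# Invented placeholder values, NOT computed properties and NOT from the paper.
candidates = [
    Candidate("c1ccsc1", 1.6, 0.9),
    Candidate("c1ccccc1", 3.1, 0.0),    # gap above threshold
    Candidate("N#Cc1ccsc1", 1.8, 3.4),  # dipole above threshold
    Candidate("c1ccsc1", 1.6, 0.9),     # duplicate string
    Candidate("c1ccoc1", 1.9, 1.2),
]
promising = select_promising(candidates)
```

Deduplicating on the raw string is a simplification; in practice one would likely canonicalize the SMILES first, since a single molecule can be written in several ways.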

2019 ◽  
Author(s):  
Qi Yuan ◽  
Alejandro Santana-Bonilla ◽  
Martijn Zwijnenburg ◽  
Kim Jelfs

The chemical space for novel electronic donor-acceptor oligomers with targeted properties was explored using deep generative models and transfer learning. A General Recurrent Neural Network model was trained on the ChEMBL database to generate chemically valid SMILES strings. The parameters of the General Recurrent Neural Network were then fine-tuned via transfer learning on the electronic donor-acceptor database from the Computational Material Repository to generate novel donor-acceptor oligomers. Six transfer learning models were developed, each trained on a different subset of the donor-acceptor database. We concluded that electronic properties of the training sets, such as HOMO-LUMO gaps and dipole moments, can be learned from the SMILES representation with deep generative models, and that the chemical space of the training sets can be explored efficiently. This approach identified approximately 1700 new molecules with promising electronic properties (HOMO-LUMO gap <2 eV and dipole moment <2 Debye), six times more than in the original database. Among the molecular transformations, the deep generative model has learned to produce novel molecules by trading off selected atomic substitutions (such as halogenation or methylation) against molecular features such as the spatial extension of the oligomer. The method can be extended as a plausible source of new chemical combinations to explore chemical space effectively for targeted properties.



Nanoscale ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 6744-6758 ◽  
Author(s):  
Qi Yuan ◽  
Alejandro Santana-Bonilla ◽  
Martijn A. Zwijnenburg ◽  
Kim E. Jelfs

A generative recurrent neural network (RNN) model was developed to target and explore the chemical space of electronic donor–acceptor oligomers effectively.


2020 ◽  
Author(s):  
Shuheng Huang ◽  
Hu Mei ◽  
Laichun Lu ◽  
Tingting Shi ◽  
Linxin Chen ◽  
...  

Owing to their potential in the treatment of neurodegenerative diseases, caspase-6 inhibitors have attracted widespread attention. Herein, a gated recurrent unit (GRU)-based recurrent neural network (RNN) combined with transfer learning was used to build a molecular generative model of caspase-6 inhibitors. The results showed that the GRU-based RNN model can accurately learn the SMILES grammar of about 2.4 million molecules, including ionic and isomeric compounds, and can generate potential caspase-6 inhibitors after transfer learning on the 433 known caspase-6 inhibitors. Further exploration of the chemical space and molecular docking showed that the generated potential inhibitors have chemical space distributions and binding mechanisms similar to those of the known caspase-6 inhibitors. In addition, 3 potential caspase-6 inhibitors with nanomolar-level activities were obtained and proved to be the most promising candidates for further research. In general, this paper provides an efficient combinational strategy for de novo molecular design of caspase-6 inhibitors.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Yongtae Kim ◽  
Youngsoo Kim ◽  
Charles Yang ◽  
Kundo Park ◽  
Grace X. Gu ◽  
...  

Neural network-based generative models have been actively investigated as an inverse design method for finding novel materials in a vast design space. However, the applicability of conventional generative models is limited because they cannot access data outside the range of their training sets. Advanced generative models devised to overcome this limitation also suffer from weak predictive power in the unseen domain. In this study, we propose a deep neural network-based forward design approach that enables an efficient search for superior materials far beyond the domain of the initial training set. This approach compensates for the weak predictive power of neural networks in an unseen domain by gradually updating the neural network with active transfer learning and data augmentation methods. We demonstrate the potential of our framework on a grid composite optimization problem that has an astronomical number of possible design configurations. Results show that the proposed framework can provide excellent designs close to the global optima, even when the added dataset is very small, corresponding to less than 0.5% of the initial training dataset size.


Author(s):  
Justin S Smith ◽  
Benjamin T. Nebgen ◽  
Roman Zubatyuk ◽  
Nicholas Lubbers ◽  
Christian Devereux ◽  
...  

Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist's toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields are cheap and scalable but lack transferability to new systems. Machine learning can be used to achieve the best of both approaches. Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network on DFT data and then using transfer learning techniques to retrain it on a dataset of gold-standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology and chemistry, and billions of times faster than CCSD(T)/CBS calculations.


2021 ◽  
Vol 14 (12) ◽  
pp. 1249
Author(s):  
Shuheng Huang ◽  
Hu Mei ◽  
Laichun Lu ◽  
Minyao Qiu ◽  
Xiaoqi Liang ◽  
...  

Due to their potential in the treatment of neurodegenerative diseases, caspase-6 inhibitors have attracted widespread attention. However, the existing caspase-6 inhibitors have deficiencies that restrict their clinical development and application. Therefore, there is an urgent need to develop novel caspase-6 candidate inhibitors. Herein, a gated recurrent unit (GRU)-based recurrent neural network (RNN) combined with transfer learning was used to build a molecular generative model of caspase-6 inhibitors. The results showed that the GRU-based RNN model can accurately learn the SMILES grammar of about 2.4 million molecules, including ionic and isomeric compounds, and can generate potential caspase-6 inhibitors after transfer learning on the 433 known caspase-6 inhibitors. Based on the novel molecules derived from the molecular generative model, an optimal logistic regression model and Surflex-dock were employed to predict and rank the inhibitory activities. According to the prediction results, three potential caspase-6 inhibitors with different scaffolds were selected as promising candidates for further research. In general, this paper provides an efficient combinational strategy for de novo molecular design of caspase-6 inhibitors.
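Before an RNN can learn SMILES grammar, each string has to be split into tokens and mapped to integer indices. The sketch below shows one common way to do this with a regular expression that keeps bracket atoms (which carry the charge and stereo information of ionic and isomeric compounds) and two-letter halogens as single tokens. It is an assumption for illustration only: the abstract does not describe the authors' tokenizer, and the names `SMILES_TOKEN`, `tokenize`, `build_vocab`, and `encode` are hypothetical.

```python
import re

# Alternatives are tried left to right, so multi-character tokens come first.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms, e.g. [O-], [NH4+], [C@@H]
    r"|Br|Cl"                # two-letter halogens of the organic subset
    r"|%\d{2}"               # two-digit ring-closure labels, e.g. %12
    r"|[A-Za-z]"             # any other single letter (no element validation)
    r"|\d"                   # single-digit ring-closure labels
    r"|[=#$:/\\.()+\-~*@]"   # bonds, branches, dot separators, and similar
)


def tokenize(smiles):
    """Split a SMILES string into tokens; reject characters the pattern skips."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_vocab(corpus):
    """Map every token seen in the corpus to an integer index (sorted order)."""
    tokens = sorted({tok for smi in corpus for tok in tokenize(smi)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(smiles, vocab):
    """Convert a SMILES string into the index sequence an RNN would consume."""
    return [vocab[tok] for tok in tokenize(smiles)]


corpus = ["C[C@H](N)C(=O)[O-]", "Clc1ccccc1Br"]
vocab = build_vocab(corpus)
encoded = encode("Clc1ccccc1Br", vocab)
```

The tokenizer only checks that every character is consumed; it does not verify that the string is chemically valid, which would need a cheminformatics toolkit.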


2020 ◽  
Author(s):  
Shuheng Huang ◽  
Hu Mei ◽  
Laichun Lu ◽  
Tingting Shi ◽  
Linxin Chen ◽  
...  

Owing to their potential in the treatment of neurodegenerative diseases, caspase-6 inhibitors have attracted widespread attention. However, the existing caspase-6 inhibitors have deficiencies that restrict their clinical development and application. Therefore, there is an urgent need to develop novel caspase-6 candidate inhibitors. Herein, a gated recurrent unit (GRU)-based recurrent neural network (RNN) combined with transfer learning was used to build a molecular generative model of caspase-6 inhibitors. The results showed that the GRU-based RNN model can accurately learn the SMILES grammar of about 2.4 million molecules, including ionic and isomeric compounds, and can generate potential caspase-6 inhibitors after transfer learning on the 433 known caspase-6 inhibitors. Based on the novel molecules derived from the molecular generative model, an optimal machine learning model and Surflex-dock were further employed to predict and rank the inhibitory activities. Three potential caspase-6 inhibitors with different scaffolds were selected as the most promising candidates for further research. In general, this paper provides an efficient combinational strategy for de novo molecular design of caspase-6 inhibitors.

