De novo design and bioactivity prediction of SARS-CoV-2 main protease inhibitors using recurrent neural network-based transfer learning

BMC Chemistry ◽  
2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Marcos V. S. Santana ◽  
Floriano P. Silva-Jr

Abstract: The global pandemic of coronavirus disease (COVID-19) caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) created a rush to discover drug candidates. Despite the efforts, so far no vaccine or drug has been approved for treatment. Artificial intelligence offers solutions that could accelerate the discovery and optimization of new antivirals, especially in the current scenario dominated by the scarcity of compounds active against SARS-CoV-2. The main protease (Mpro) of SARS-CoV-2 is an attractive target for drug discovery due to its absence in humans and its essential role in viral replication. In this work, we developed a deep learning platform for de novo design of putative inhibitors of SARS-CoV-2 main protease (Mpro). Our methodology consists of three main steps: (1) training and validation of a general chemistry-based generative model; (2) fine-tuning of the generative model for the chemical space of SARS-CoV-2 Mpro inhibitors; and (3) training of a classifier for bioactivity prediction using transfer learning. The fine-tuned chemical model generated > 90% valid, diverse and novel (not present in the training set) structures. The generated molecules showed a good overlap with Mpro chemical space, displaying similar physicochemical properties and chemical structures. In addition, novel scaffolds were also generated, showing the potential to explore new chemical series. The classification model outperformed the baseline area under the precision-recall curve, showing it can be used for prediction. In addition, the model also outperformed the freely available model Chemprop on an external test set of fragments screened against SARS-CoV-2 Mpro, showing its potential to identify putative antivirals to tackle the COVID-19 pandemic. Finally, among the top-20 predicted hits, we identified nine hits via molecular docking displaying binding poses and interactions similar to experimentally validated inhibitors.
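The "baseline area under the precision-recall curve" mentioned above can be made concrete. For a classifier that ranks compounds at random, the expected area under the precision-recall curve equals the fraction of active compounds in the data set. The sketch below is illustrative only and is not the authors' code: the labels and scores are made-up toy values, the function names `average_precision` and `baseline_auprc` are hypothetical, and the area is estimated as average precision without special handling of tied scores.

```python
def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one active compound")
    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each active
    return total / n_pos


def baseline_auprc(labels):
    """Expected area for a random ranking: the prevalence of actives."""
    return sum(labels) / len(labels)


# Toy data (assumed): 2 actives among 10 compounds.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.30, 0.70, 0.20, 0.10, 0.40, 0.35, 0.05, 0.15]

model_ap = average_precision(labels, scores)
baseline = baseline_auprc(labels)
print(f"model AP = {model_ap:.3f}, baseline = {baseline:.3f}")
```

Because the baseline depends on class balance, an area that looks modest in absolute terms can still be far above chance on a data set with few actives.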

2020 ◽  
Author(s):  
Marcos Santana ◽  
Floriano Paes Silva

Abstract: The global pandemic of coronavirus disease (COVID-19) caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) created a rush to discover drug candidates. Despite the efforts, so far no vaccine or drug has been approved for treatment. Artificial intelligence offers solutions that could accelerate the discovery and optimization of new antivirals, especially in the current scenario dominated by the scarcity of compounds active against SARS-CoV-2. The main protease (Mpro) of SARS-CoV-2 is an attractive target for drug discovery due to its absence in humans and its essential role in viral replication. In this work, we developed a deep learning platform for de novo design of putative inhibitors of SARS-CoV-2 main protease (Mpro). Our methodology consists of three main steps: (1) training and validation of a general chemistry-based generative model; (2) fine-tuning of the generative model for the chemical space of SARS-CoV-2 Mpro inhibitors; and (3) training of a classifier for bioactivity prediction using transfer learning. The fine-tuned chemical model generated >90% valid, diverse and novel (not present in the training set) structures. The generated molecules showed a good overlap with Mpro chemical space, displaying similar physicochemical properties and chemical structures. In addition, novel scaffolds were also generated, showing the potential to explore new chemical series. The classification model outperformed the baseline area under the precision-recall curve, showing it can be used for prediction. In addition, the model also outperformed the freely available model Chemprop on an external test set of fragments screened against SARS-CoV-2 Mpro, showing its potential to identify putative antivirals to tackle the COVID-19 pandemic. Finally, among the top-20 predicted hits, we identified nine hits via molecular docking displaying binding poses and interactions similar to experimentally validated inhibitors.


2020 ◽  
Author(s):  
Mingyuan Xu ◽  
Ting Ran ◽  
Hongming Chen

De novo molecule design through molecular generative models has gained increasing attention in recent years. Here, a novel generative model is proposed that integrates the 3D structural information of the protein binding pocket into a conditional RNN (cRNN) model to control the generation of drug-like molecules. In this model, the composition of the protein binding pocket is characterized through a coarse-graining strategy, and the three-dimensional information of the pocket is represented by the sorted eigenvalues of the Coulomb matrix (EGCM) of the coarse-grained atoms composing the binding pocket. In the current work, we used our EGCM method and a previously reported binding pocket descriptor, DeeplyTough, to train cRNN models and compared their performance. Molecules generated under the control of protein environment information show a clear tendency toward higher similarity to the original X-ray bound ligand than those from a normal RNN model, and they also achieve better docking scores. Our results demonstrate the potential of the EGCM-controlled generative model for targeted molecule generation and guided exploration of drug-like chemical space.
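As a rough illustration of a descriptor built from sorted Coulomb-matrix eigenvalues, the sketch below assumes the standard Coulomb matrix definition (diagonal entries 0.5·Z_i^2.4, off-diagonal entries Z_i·Z_j / |R_i − R_j|). The abstract does not say which charges or coordinates are assigned to the coarse-grained pocket atoms, so the values used here are placeholders, and the function names are hypothetical. Eigenvalues are computed with a plain cyclic Jacobi iteration only to keep the example free of dependencies; a linear algebra library would normally be used instead.

```python
import math


def coulomb_matrix(charges, coords):
    """Standard Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/|R_i-R_j| elsewhere."""
    n = len(charges)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i][j] = 0.5 * charges[i] ** 2.4
            else:
                m[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return m


def sorted_eigenvalues(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # update columns p and q (A <- A J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q (A <- J^T A)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Placeholder pocket: three coarse-grained sites with assumed charges and positions.
charges = [6.0, 7.0, 8.0]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
descriptor = sorted_eigenvalues(coulomb_matrix(charges, coords))
print(descriptor)
```

One reason sorted eigenvalues are convenient as a descriptor is that they do not change when the sites are listed in a different order, so the same pocket always maps to the same vector.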


2016 ◽  
Vol 56 (10) ◽  
pp. 1885-1893 ◽  
Author(s):  
Shunichi Takeda ◽  
Hiromasa Kaneko ◽  
Kimito Funatsu

2020 ◽  
Author(s):  
Navneet Bung ◽  
Sowmya Ramaswamy Krishnan ◽  
Gopalakrishnan Bulusu ◽  
Arijit Roy

The novel coronavirus SARS-CoV-2 is the cause of the global COVID-19 pandemic, which has severely affected the health and economy of several countries. Multiple studies are in progress, employing diverse approaches to design novel therapeutics against potential target proteins in SARS-CoV-2. One of the well-studied protein targets for coronaviruses is the chymotrypsin-like (3CL) protease, responsible for post-translational modification of the viral polyproteins essential for viral survival and replication in the host. There are ongoing attempts to repurpose existing viral protease inhibitors against the 3CL protease of SARS-CoV-2. Recent studies have proven the efficiency of artificial intelligence techniques in learning the known chemical space and generating novel small molecules. In this study, we employed deep neural network-based generative and predictive models for de novo design of new small molecules capable of inhibiting the 3CL protease. The generated small molecules were filtered and screened against the binding site of the 3CL protease structure of SARS-CoV-2. Based on the screening results and further analysis, we have identified 31 potential compounds as ideal candidates for further synthesis and testing against SARS-CoV-2. The generated small molecules were also compared with available natural products. Two of the generated small molecules showed high similarity to a plant natural product, Aurantiamide, which can be used for rapid testing during this time of crisis.
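The abstract does not state how similarity between generated molecules and natural products was measured. A common choice in cheminformatics is the Tanimoto coefficient on molecular fingerprints, and the sketch below shows that measure as an assumption, not as the authors' method. The fingerprints are represented as sets of "on" bit indices with made-up values; in practice they would come from a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Made-up fingerprints for one generated molecule and two reference compounds.
generated = {1, 2, 3, 4, 9, 12}
reference_close = {1, 2, 3, 4, 9, 15}
reference_far = {20, 21, 22}

print(tanimoto(generated, reference_close))
print(tanimoto(generated, reference_far))
```

The coefficient ranges from 0 (no shared bits) to 1 (identical fingerprints), so a cutoff on it can be used to flag generated molecules that resemble a known compound.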


2021 ◽  
Vol 14 (12) ◽  
pp. 1249
Author(s):  
Shuheng Huang ◽  
Hu Mei ◽  
Laichun Lu ◽  
Minyao Qiu ◽  
Xiaoqi Liang ◽  
...  

Due to their potential in the treatment of neurodegenerative diseases, caspase-6 inhibitors have attracted widespread attention. However, existing caspase-6 inhibitors have deficiencies that restrict their clinical development and application. Therefore, there is an urgent need to develop novel caspase-6 inhibitor candidates. Herein, a gated recurrent unit (GRU)-based recurrent neural network (RNN) combined with transfer learning was used to build a molecular generative model of caspase-6 inhibitors. The results showed that the GRU-based RNN model can accurately learn the SMILES grammar of about 2.4 million chemical molecules, including ionic and isomeric compounds, and can generate potential caspase-6 inhibitors after transfer learning on 433 known caspase-6 inhibitors. Based on the novel molecules derived from the molecular generative model, an optimal logistic regression model and Surflex-dock were employed to predict and rank inhibitory activities. According to the prediction results, three potential caspase-6 inhibitors with different scaffolds were selected as promising candidates for further research. Overall, this paper provides an efficient combinational strategy for de novo molecular design of caspase-6 inhibitors.
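Before an RNN can learn SMILES grammar, each string has to be split into tokens and mapped to integers. The abstract does not describe the tokenization used, so the sketch below is one plausible scheme rather than the authors' pipeline: bracket atoms (which carry charges and stereo marks, as in ionic and isomeric compounds), the two-letter halogens, and two-digit ring closures are kept whole, and every other character is its own token. The special tokens `<pad>`, `<bos>` and `<eos>` and all function names are hypothetical.

```python
import re

# Bracket atoms such as [Na+] or [C@@H], the halogens Br and Cl, and ring
# closures such as %10 stay whole; any other character is a single token.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Map every token seen in the corpus, plus special tokens, to an integer index."""
    tokens = sorted({tok for smi in smiles_list for tok in tokenize(smi)})
    return {tok: idx for idx, tok in enumerate(["<pad>", "<bos>", "<eos>"] + tokens)}


def encode(smiles, vocab):
    """Integer sequence with start and end markers, ready to feed to an RNN."""
    return [vocab["<bos>"]] + [vocab[t] for t in tokenize(smiles)] + [vocab["<eos>"]]


corpus = ["CC(=O)Oc1ccccc1C(=O)O", "[Na+].[Cl-]", "C[C@@H](N)C(=O)O", "ClCCBr"]
vocab = build_vocab(corpus)
encoded = [encode(smi, vocab) for smi in corpus]
print(tokenize("C[C@@H](N)C(=O)O"))
print(encoded[1])
```

In a transfer-learning setup like the one described, the same vocabulary would be shared between the large pre-training corpus and the small set of known inhibitors, so that the fine-tuned model reuses the token embeddings it has already learned.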


2020 ◽  
Author(s):  
Thomas Blaschke ◽  
Josep Arús-Pous ◽  
Hongming Chen ◽  
Christian Margreitter ◽  
Christian Tyrchan ◽  
...  

With this application note we aim to offer the community a production-ready tool for de novo design. It can be effectively applied to drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating chemical space. By releasing the code, we aim to facilitate research on using generative methods for drug discovery problems and to promote collaborative efforts in this area, so that it can serve as an interaction point for future scientific collaborations.

