Masked graph modeling for molecule generation

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Omar Mahmood ◽  
Elman Mansimov ◽  
Richard Bonneau ◽  
Kyunghyun Cho

Abstract: De novo, in-silico design of molecules is a challenging problem with applications in drug discovery and material design. We introduce a masked graph model, which learns a distribution over graphs by capturing conditional distributions over unobserved nodes (atoms) and edges (bonds) given observed ones. We train and then sample from our model by iteratively masking and replacing different parts of initialized graphs. We evaluate our approach on the QM9 and ChEMBL datasets using the GuacaMol distribution-learning benchmark. We find that validity, KL-divergence and Fréchet ChemNet Distance scores are anti-correlated with novelty, and that we can trade off between these metrics more effectively than existing models. On distributional metrics, our model outperforms previously proposed graph-based approaches and is competitive with SMILES-based approaches. Finally, we show our model generates molecules with desired values of specified properties while maintaining physiochemical similarity to the training distribution.
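The iterative mask-and-replace sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabularies `ATOM_TYPES` and `BOND_TYPES`, the `MASK` token, and the functions `toy_conditional` and `mask_and_replace` are hypothetical names, and the trained model is replaced by a uniform sampler that ignores the observed graph.

```python
import random

# Hypothetical vocabularies, chosen only for illustration.
ATOM_TYPES = ["C", "N", "O", "F"]
BOND_TYPES = [0, 1, 2, 3]  # 0 means "no bond"
MASK = "<mask>"


def toy_conditional(kind, nodes, edges, rng):
    """Placeholder for the learned conditional distribution.

    A trained model would condition on the unmasked parts of the graph;
    this stand-in ignores them and samples uniformly from the vocabulary.
    """
    return rng.choice(ATOM_TYPES if kind == "node" else BOND_TYPES)


def mask_and_replace(nodes, edges, mask_fraction=0.2, steps=10, seed=0):
    """Iteratively mask a fraction of graph components and resample them."""
    rng = random.Random(seed)
    nodes, edges = list(nodes), dict(edges)  # work on copies
    components = [("node", i) for i in range(len(nodes))]
    components += [("edge", pair) for pair in sorted(edges)]
    n_mask = min(len(components), max(1, round(mask_fraction * len(components))))

    for _ in range(steps):
        chosen = rng.sample(components, n_mask)
        # Masking step: hide the chosen atoms and bonds.
        for kind, key in chosen:
            if kind == "node":
                nodes[key] = MASK
            else:
                edges[key] = MASK
        # Replacement step: resample each masked component.
        for kind, key in chosen:
            value = toy_conditional(kind, nodes, edges, rng)
            if kind == "node":
                nodes[key] = value
            else:
                edges[key] = value
    return nodes, edges
```

In this sketch, `mask_fraction` plays the role of the fraction of the graph masked and replaced at each generation step, and the `nodes` and `edges` arguments stand for the initialized graph.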


2021 ◽  
Author(s):  
Omar Mahmood ◽  
Elman Mansimov ◽  
Richard Bonneau ◽  
Kyunghyun Cho

De novo, in-silico design of molecules is a challenging problem with applications in drug discovery and material design. Here, we introduce a masked graph model which learns a distribution over graphs by capturing all possible conditional distributions over unobserved nodes and edges given observed ones. We train our masked graph model on existing molecular graphs and then sample novel molecular graphs from it by iteratively masking and replacing different parts of initialized graphs. We evaluate our approach on the QM9 and ChEMBL datasets using the distribution-learning benchmark from the GuacaMol framework. The benchmark contains five metrics: the validity, uniqueness, novelty, KL-divergence and Fréchet ChemNet Distance scores, the last two of which are measures of the similarity of the generated samples to the training, validation and test distributions. We find that KL-divergence and Fréchet ChemNet Distance scores are anti-correlated with novelty scores. By varying generation initialization and the fraction of the graph masked and replaced at each generation step, we can increase the Fréchet score at the cost of novelty. In this way, we show that our model offers transparent and tunable control of the trade-off between these metrics, a point of control currently lacking in other approaches to molecular graph generation. We observe that our model outperforms previously proposed graph-based approaches and is competitive with SMILES-based approaches. Finally, we show that our model can generate molecules with desired values of specified properties while maintaining physiochemical similarity to molecules from the training distribution.
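As a rough illustration of the first three benchmark metrics, the sketch below computes simplified validity, uniqueness and novelty scores for a list of generated molecule strings. It is not the GuacaMol code: the function name `distribution_metrics` and the `is_valid` predicate are hypothetical, fixed sample sizes are ignored, and in practice a cheminformatics toolkit would decide validity and canonicalize the strings before comparison.

```python
def distribution_metrics(generated, training, is_valid):
    """Simplified validity, uniqueness and novelty scores.

    generated: list of molecule strings produced by a model
    training:  iterable of molecule strings seen during training
    is_valid:  caller-supplied predicate (hypothetical stand-in for a
               real chemical validity check)
    """
    valid = [mol for mol in generated if is_valid(mol)]
    unique = set(valid)
    novel = unique - set(training)

    # Each score is a fraction in [0, 1]; empty denominators give 0.0.
    validity = len(valid) / len(generated) if generated else 0.0
    uniqueness = len(unique) / len(valid) if valid else 0.0
    novelty = len(novel) / len(unique) if unique else 0.0
    return {"validity": validity, "uniqueness": uniqueness, "novelty": novelty}
```

A model that copies its training set would score high on similarity-based metrics but have a novelty of zero under this definition, which is consistent with the anti-correlation reported above.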


2020 ◽  
Author(s):  
Omar Mahmood ◽  
Elman Mansimov ◽  
Richard Bonneau ◽  
Kyunghyun Cho

De novo, in-silico design of molecules is a challenging problem with applications in drug discovery and material design. Here, we introduce a masked graph model which learns a distribution over graphs by capturing all possible conditional distributions over unobserved nodes and edges given observed ones. We train our masked graph model on existing molecular graphs and then sample novel molecular graphs from it by iteratively masking and replacing different parts of initialized graphs. We evaluate our approach on the QM9 and ChEMBL datasets using the distribution-learning benchmark from the GuacaMol framework. The benchmark contains five metrics: the validity, uniqueness, novelty, KL-divergence and Fréchet ChemNet Distance scores, the last two of which are measures of the similarity of the generated samples to the training, validation and test distributions. We find that KL-divergence and Fréchet ChemNet Distance scores are anti-correlated with novelty scores. By varying generation initialization and the fraction of the graph masked and replaced at each generation step, we can increase the Fréchet score at the cost of novelty. In this way, we show that our model offers transparent and tunable control of the trade-off between these metrics, a key point of control in design applications currently lacking in other approaches to molecular graph generation. Our model outperforms previously proposed graph-based approaches and is competitive with SMILES-based approaches. Finally, we observe that minimizing validation loss on the training task is a suitable proxy for improving generation quality.


2020 ◽  
Vol 10 (2) ◽  
pp. 2063-2069

One of the largest families of membrane proteins, the G protein-coupled receptors (GPCRs), has been a very important target of drug discovery because its members regulate a variety of cellular signaling pathways in response to external stimuli. Modern in-silico and crystallographic approaches have made it easier to examine their structures. In this study, the β2 adrenergic receptor (β2AR) was targeted, and a new ligand molecule was proposed using the de novo approach. Using 1-Amino-3-(2,3-dihydro-1H-indol-4-yloxy)-propan-2-ol, the best-fitting binding fragments were established with a dissociation constant of 5-7 nanomolar. The flexibility of specific active sites was also investigated, and residues 114 (V), 117 (V), 203 (S), 286 (W), and 289 (F) were observed to play a crucial role in accommodating the ligand for the best binding. Upon examination of the bioavailability parameters, the ligand var9 exhibited significant inhibitory characteristics, with lower toxicity values and high drug-likeness. These findings are significant for targeting GPCRs and offer insight into structure-based drug design and drug discovery.


2022 ◽  
Author(s):  
Fatemeh Hosseini ◽  
Mehrdad Azin ◽  
Hamideh Ofoghi ◽  
Tahereh Alinejad

Unfortunately, to date, there is no approved specific antiviral drug treatment against COVID-19. Because the de novo drug discovery and development process is costly and time-consuming, computational drug repositioning has recently been highly regarded as a way to accelerate drug discovery. The selection of drug target molecule(s), preparation of a library of approved therapeutic agents, and in silico evaluation of their affinity to the selected target(s) are the main steps of a molecular docking-based drug repositioning process, which is the most common computational drug re-tasking process. In this chapter, after a review of the origin, pathophysiology, molecular biology, and drug development strategies against COVID-19, the recent advances, challenges, and future perspective of molecular docking-based drug repositioning for COVID-19 are discussed. Furthermore, as a case study, a molecular docking-based drug repurposing process was planned to screen for 3CLpro inhibitor(s) among the nine Food and Drug Administration (FDA)-approved antiviral protease inhibitors. The results demonstrated that Fosamprenavir had the highest binding affinity to 3CLpro and can be considered for further in silico, in vitro, and in vivo evaluations as an effective repurposed anti-COVID-19 drug.


2021 ◽  
Author(s):  
Ben Geoffrey ◽  
Rafal Madaj ◽  
Pavan Preetham Valluri ◽  
Akhil Sanker

The past decade has seen a surge in the application of data science, machine learning, deep learning, and AI methods to drug discovery. The presented work assembles a variety of AI methods for drug discovery, together with in silico techniques, to provide a holistic tool for automated drug discovery. When drug candidates must be identified for a particular drug target of interest, the user provides the tool with target signatures in the form of an amino acid sequence or its corresponding nucleotide sequence. The tool collects the data registered on PubChem required to perform an automated QSAR and, with the validated QSAR model, prediction and drug lead generation are carried out. This protocol we call Target2Drug. This is followed by a protocol we call Target2DeNovoDrug, wherein novel molecules with likely activity against the target are generated de novo using a generative LSTM model. Drug discovery often requires that the generated molecules possess certain properties such as drug-likeness; therefore, to optimize the generated de novo molecules toward the required drug-like property, we use a deep learning model called DeepFMPO, and this protocol we call Target2DeNovoDrugPropMax. This is followed by fast, automated AutoDock-Vina-based in silico modeling and profiling of the interaction of the optimized drug leads and the drug target. An automated Molecular Dynamics protocol is then executed for the complex identified with the best protein-ligand interaction from the AutoDock-Vina-based virtual screening. The results are stored in the working folder of the user.
The code is maintained, supported, and provided for use in the following GitHub repository: https://github.com/bengeof/Target2DeNovoDrugPropMax
Anticipating the rise in the use of quantum computing and quantum machine learning in drug discovery, we use the PennyLane interface to quantum hardware to turn classical Keras layers used in our machine/deep learning models into quantum layers, and introduce these quantum layers into our classical models to produce a quantum-classical hybrid machine/deep learning model of our tool. The corresponding code is provided at https://github.com/bengeof/QPoweredTarget2DeNovoDrugPropMax


2019 ◽  
Vol 26 (28) ◽  
pp. 5340-5362 ◽  
Author(s):  
Xin Chen ◽  
Giuseppe Gumina ◽  
Kristopher G. Virga

As a long-term degenerative disorder of the central nervous system that mostly affects older people, Parkinson’s disease is a growing health threat to our ever-aging population. Despite remarkable advances in our understanding of this disease, all currently available therapeutics only improve symptoms and cannot stop disease progression. Therefore, it is essential that more effective drug discovery methods and approaches are developed, validated, and used for the discovery of disease-modifying treatments for Parkinson’s disease. Drug repurposing, also known as drug repositioning, is the process of finding new uses for existing or abandoned pharmaceuticals. It has been recognized as a cost-effective and time-efficient way to develop new drugs, and as equally promising as de novo drug discovery in the field of neurodegeneration and, more specifically, for Parkinson’s disease. The availability of several established libraries of clinical drugs and rapid progress in disease biology, genomics and bioinformatics have stimulated the momentum of both in silico and activity-based drug repurposing. With the successful clinical introduction of several repurposed drugs for Parkinson’s disease, drug repurposing has now become a robust alternative approach to the discovery and development of novel drugs for this disease. In this review, recent advances in drug repurposing for Parkinson’s disease will be discussed.

