Relevant Applications of Generative Adversarial Networks in Drug Design and Discovery: Molecular De Novo Design, Dimensionality Reduction, and De Novo Peptide and Protein Design

Molecules ◽  
2020 ◽  
Vol 25 (14) ◽  
pp. 3250 ◽  
Author(s):  
Eugene Lin ◽  
Chieh-Hsin Lin ◽  
Hsien-Yuan Lane

A growing body of evidence suggests that artificial intelligence and machine learning techniques can serve as an indispensable foundation for drug design and discovery. In light of the latest advancements in computing technologies, deep learning algorithms are being created during the development of clinically useful drugs for the treatment of a number of diseases. In this review, we focus on the latest developments in three particular arenas of drug design and discovery research that use deep learning approaches such as generative adversarial network (GAN) frameworks. Firstly, we review drug design and discovery studies that leverage various GAN techniques for one main application, molecular de novo design. In addition, we describe various GAN models that perform dimension reduction of single-cell data in the preclinical stage of the drug development pipeline. Furthermore, we describe several studies in de novo peptide and protein design using GAN frameworks. Moreover, we outline the limitations of previous drug design and discovery studies using GAN models. Finally, we present a discussion of directions and challenges for future research.

2021 ◽  
Vol 61 (2) ◽  
pp. 621-630
Author(s):  
Sowmya Ramaswamy Krishnan ◽  
Navneet Bung ◽  
Gopalakrishnan Bulusu ◽  
Arijit Roy

2019 ◽  
Vol 35 (14) ◽  
pp. i183-i190 ◽  
Author(s):  
Hao Yang ◽  
Hao Chi ◽  
Wen-Feng Zeng ◽  
Wen-Jing Zhou ◽  
Si-Min He

Abstract. Motivation: De novo peptide sequencing based on tandem mass spectrometry data is the key technology of shotgun proteomics for identifying peptides without any database and assembling unknown proteins. However, owing to the low ion coverage in tandem mass spectra, the order of certain consecutive amino acids cannot be determined if all of their supporting fragment ions are missing, which results in the low precision of de novo sequencing. Results: In order to solve this problem, we developed pNovo 3, which used a learning-to-rank framework to distinguish similar peptide candidates for each spectrum. Three metrics for measuring the similarity between each experimental spectrum and its corresponding theoretical spectrum were used as important features, in which the theoretical spectra can be precisely predicted by the pDeep algorithm using deep learning. On seven benchmark datasets from six diverse species, pNovo 3 recalled 29–102% more correct spectra, and the precision was 11–89% higher than three other state-of-the-art de novo sequencing algorithms. Furthermore, compared with the newly developed DeepNovo, which also used the deep learning approach, pNovo 3 still identified 21–50% more spectra on the nine datasets used in the study of DeepNovo. In summary, the deep learning and learning-to-rank techniques implemented in pNovo 3 significantly improve the precision of de novo sequencing, and such a machine learning framework is worth extending to other related research fields to distinguish similar sequences. Availability and implementation: pNovo 3 can be freely downloaded from http://pfind.ict.ac.cn/software/pNovo/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.
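The idea of scoring each candidate peptide by how closely its theoretical spectrum matches the experimental one can be illustrated with a minimal sketch. The abstract does not name the three similarity metrics, so cosine similarity is used here purely as an assumed stand-in, and sorting by a single score replaces the learning-to-rank model. The function names, peptide labels, and intensity vectors are hypothetical and are not part of pNovo 3 or pDeep.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def rank_candidates(experimental, candidates):
    """Sort candidate peptides by similarity to the experimental spectrum.

    `candidates` maps a peptide label to its theoretical intensity vector
    (assumed to be binned on the same m/z grid as `experimental`).
    """
    scored = [
        (peptide, cosine_similarity(experimental, theoretical))
        for peptide, theoretical in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical binned intensities, for illustration only.
experimental = [1.0, 0.0, 2.0]
candidates = {
    "candidate_A": [1.0, 0.0, 2.0],
    "candidate_B": [0.0, 3.0, 0.0],
}
ranked = rank_candidates(experimental, candidates)
```

In a learning-to-rank setting, a score like this would be one feature among several fed to a trained ranking model rather than the final ordering criterion.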


2021 ◽  
Author(s):  
Xuhan Liu ◽  
Kai Ye ◽  
Herman W. T. van Vlijmen ◽  
Adriaan P. IJzerman ◽  
Gerard J. P. van Westen

Due to the large drug-like chemical space available to search for feasible drug-like molecules, rational drug design often starts from specific scaffolds to which side chains/substituents are added or modified. With the rapid growth of the application of deep learning in drug discovery, a variety of effective approaches have been developed for de novo drug design. In previous work, we proposed a method named DrugEx, which can be applied in polypharmacology based on multi-objective deep reinforcement learning. However, the previous version is trained under fixed objectives similar to other known methods and does not allow users to input any prior information (i.e. a desired scaffold). In order to improve the general applicability, we updated DrugEx to design drug molecules based on scaffolds which consist of multiple fragments provided by users. In this work, the Transformer model was employed to generate molecular structures. The Transformer is a multi-head self-attention deep learning model containing an encoder to receive scaffolds as input and a decoder to generate molecules as output. In order to deal with the graph representation of molecules we proposed a novel positional encoding for each atom and bond based on an adjacency matrix to extend the architecture of the Transformer. Each molecule was generated by growing and connecting procedures for the fragments in the given scaffold that were unified into one model. Moreover, we trained this generator under a reinforcement learning framework to increase the number of desired ligands. As a proof of concept, our proposed method was applied to design ligands for the adenosine A2A receptor (A2AAR) and compared with SMILES-based methods. The results demonstrated the effectiveness of our method in that 100% of the generated molecules are valid and most of them had a high predicted affinity value towards A2AAR with given scaffolds.
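The adjacency matrix that the positional encoding is based on can be sketched for a small molecular graph. This is only an illustration of the graph representation, assuming atoms are indexed from zero and bonds are given as (atom, atom, bond order) triples. It does not reproduce the positional encoding proposed for DrugEx, and the helper name `adjacency_matrix` is hypothetical.

```python
def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric adjacency matrix whose entries are bond orders.

    `bonds` is a list of (i, j, order) triples with zero-based atom indices.
    A zero entry means the two atoms are not bonded.
    """
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, order in bonds:
        matrix[i][j] = order
        matrix[j][i] = order  # bonds are undirected
    return matrix


# Heavy atoms of ethanol (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]
adj = adjacency_matrix(len(atoms), bonds)
```

Each row of `adj` describes how one atom connects to every other atom, which is the kind of structural information a graph-aware positional encoding would need in place of a sequence position.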


2021 ◽  
Vol 22 (18) ◽  
pp. 9983
Author(s):  
Jintae Kim ◽  
Sera Park ◽  
Dongbo Min ◽  
Wankyu Kim

Drug discovery based on artificial intelligence has been in the spotlight recently as it significantly reduces the time and cost required for developing novel drugs. With the advancement of deep learning (DL) technology and the growth of drug-related data, numerous deep-learning-based methodologies are emerging at all steps of drug development processes. In particular, pharmaceutical chemists have faced significant issues with regard to selecting and designing potential drugs for a target of interest to enter preclinical testing. The two major challenges are prediction of interactions between drugs and druggable targets and generation of novel molecular structures suitable for a target of interest. Therefore, we reviewed recent deep-learning applications in drug–target interaction (DTI) prediction and de novo drug design. In addition, we introduce a comprehensive summary of a variety of drug and protein representations, DL models, and commonly used benchmark datasets or tools for model training and testing. Finally, we present the remaining challenges for the promising future of DL-based DTI prediction and de novo drug design.


2020 ◽  
Author(s):  
Oscar Méndez-Lucio ◽  
Paula Andrea Marin Zapata ◽  
Joerg Wichard ◽  
David Rouquié ◽  
Djork-Arné Clevert

Developing new small molecules that are bioactive is time-consuming, costly and rarely successful. As a mitigation strategy, we apply, for the first time, generative adversarial networks to de novo design of small molecules using a phenotype-based drug discovery approach. We trained our model on a set of 30,000 compounds and their respective morphological profiles extracted from high content images; no target information was used to train the model. Using this approach, we were able to automatically design agonist-like compounds of different molecular targets.


2019 ◽  
Author(s):  
Mostafa Karimi ◽  
Shaowen Zhu ◽  
Yue Cao ◽  
Yang Shen

Abstract. Motivation: Facing data quickly accumulating on protein sequence and structure, this study addresses the following question: to what extent could current data alone reveal deep insights into the sequence-structure relationship, such that new sequences can be designed accordingly for novel structure folds? Results: We have developed novel deep generative models, constructed a low-dimensional and generalizable representation of fold space, exploited sequence data with and without paired structures, and developed an ultra-fast fold predictor as an oracle providing feedback. The resulting semi-supervised gcWGAN is assessed with the oracle over 100 novel folds not in the training set and found to generate more yields and cover 3.6 times more target folds compared to a competing data-driven method (cVAE). Assessed with a structure predictor over representative novel folds (including one not even part of the basis folds), gcWGAN designs are found to have comparable or better fold accuracy yet much more sequence diversity and novelty than cVAE. gcWGAN explores uncharted sequence space to design proteins by learning from current sequence-structure data. The ultra-fast data-driven model can be a powerful addition to principle-driven design methods through generating seed designs or tailoring sequence space. Availability: Data and source codes will be available upon [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Thomas Blaschke ◽  
Josep Arús-Pous ◽  
Hongming Chen ◽  
Christian Margreitter ◽  
Christian Tyrchan ◽  
...  

With this application note we aim to offer the community a production-ready tool for de novo design. It can be effectively applied on drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating the chemical space. By releasing the code we are aiming to facilitate the research on using generative methods on drug discovery problems and to promote the collaborative efforts in this area so that it can be used as an interaction point for future scientific collaborations.


2021 ◽  
Author(s):  
Keisuke Shimizu ◽  
Batsaikhan Mijiddorj ◽  
Masataka Usami ◽  
Shuhei Yoshida ◽  
Shiori Akayama ◽  
...  

Abstract. The amino acid sequence of a protein encodes information on its three-dimensional structure and specific functionality. De novo protein design has emerged as a method to manipulate the primary structure for the development of artificial proteins and peptides with desired functionality. This paper describes the de novo design of a pore-forming peptide that has a β-hairpin structure and assembles to form a stable nanopore in a bilayer lipid membrane. This large synthetic nanopore is an entirely artificial device with practical applications. This peptide, named SV28, forms nanopore structures ranging from 1.6 to 6.2 nm in diameter assembled from 7 to 18 monomers. The nanopore formed with a diameter of 5 nm is able to detect long double-stranded DNA (dsDNA) with 1 kbp length. Moreover, the larger sized nanopore can discriminate and human telomeric DNA (G-quadruplex, G4). The blocking current signals allowed us to investigate the translocation behavior of dsDNA or G4 structure at the single-molecule level. Such de novo design of peptide sequences has the potential to create novel nanopores, which would be applicable to molecular transport across lipid membranes.


2018 ◽  
Vol 16 (06) ◽  
pp. 1840027 ◽  
Author(s):  
Wen Juan Hou ◽  
Bamfa Ceesay

Information on changes in a drug’s effect when taken in combination with a second drug, known as drug–drug interaction (DDI), is relevant in the pharmaceutical industry. DDIs can delay, decrease, or enhance absorption of either drug and thus decrease or increase their action or cause adverse effects. Information Extraction (IE) can be of great benefit in allowing identification and extraction of relevant information on DDIs. We here propose an approach for the extraction of DDI from text using neural word embedding to train a machine learning system. Results show that our system is competitive against other systems for the task of extracting DDIs, and that significant improvements can be achieved by learning from word features and using a deep-learning approach. Our study demonstrates that machine learning techniques such as neural networks and deep learning methods can efficiently aid in IE from text. Our proposed approach is well suited to play a significant role in future research.

