Transformer Neural Network-Based Molecular Optimization Using General Transformations

Author(s):  
Jiazhen He ◽  
Eva Nittinger ◽  
Christian Tyrchan ◽  
Werngard Czechtizky ◽  
Atanas Patronov ◽  
...  

Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist's intuition in terms of matched molecular pairs (MMPs). Although the MMP approach is a typical strategy widely used by medicinal chemists, it offers limited capability for exploring the space of solutions. There are more ways to modify a starting molecule to achieve desirable properties; for example, one can modify the molecule at several places simultaneously, including changing the scaffold. This study trains the same Transformer architecture on different datasets, each consisting of molecular pairs that reflect a different type of transformation. Beyond MMP transformations, datasets reflecting general transformations are constructed from ChEMBL using two approaches: Tanimoto similarity (which allows multiple modifications) and scaffold matching (which allows multiple modifications but keeps the scaffold constant). We investigate how the model behavior can be altered by tailoring the dataset while keeping the same model architecture. Our results show that models trained on differently prepared datasets transform a given starting molecule in a way that reflects the nature of the dataset used for training. These models could complement each other and give chemists different options for improving a starting molecule.
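A minimal sketch of the Tanimoto-based pairing idea described above, assuming each molecule has already been reduced to the set of "on" bits of a fingerprint. The toy fingerprints, the 0.5 threshold and the function names are illustrative assumptions, not values from the study; in practice the fingerprints would come from a cheminformatics toolkit.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(fingerprints: dict, threshold: float = 0.5) -> list:
    """Return (id_a, id_b, similarity) for every pair at or above the threshold."""
    pairs = []
    for (id_a, fp_a), (id_b, fp_b) in combinations(sorted(fingerprints.items()), 2):
        score = tanimoto(fp_a, fp_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


# Toy fingerprints (hypothetical): each molecule is the set of its 'on' bit indices.
toy = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 5}),
    "mol_c": frozenset({7, 8, 9}),
}
pairs = similar_pairs(toy, threshold=0.5)
```

Here only mol_a and mol_b form a pair: they share 3 of 5 distinct bits, giving a similarity of 0.6, while mol_c shares no bits with either. A scaffold-matching dataset would replace the similarity test with an equality check on a scaffold identifier.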

2021 ◽  
Author(s):  
Jiazhen He ◽  
Eva Nittinger ◽  
Christian Tyrchan ◽  
Werngard Czechtizky ◽  
Atanas Patronov ◽  
...  



2020 ◽  
Author(s):  
Yang Liu ◽  
Hansaim Lim ◽  
Lei Xie

Motivation: Drug discovery is time-consuming and costly. Machine learning, especially deep learning, shows great potential for accelerating the drug discovery process and reducing its cost. A big challenge in developing robust and generalizable deep learning models for drug design is the lack of large amounts of high-quality data with balanced labels. To address this challenge, we developed a self-training method, PLANS, that exploits millions of unlabeled chemical compounds as well as partially labeled pharmacological data to improve the performance of neural network models. Results: We evaluated self-training with PLANS on a cytochrome P450 binding activity prediction task and showed that our method improves the performance of the neural network model by a large margin. Compared with the baseline deep neural network model, the PLANS-trained neural network model improved accuracy, precision, recall, and F1 score by 13.4%, 12.5%, 8.3%, and 10.3%, respectively. Self-training with PLANS is model agnostic and can be applied to any deep learning architecture. Thus, PLANS provides a general solution for using unlabeled and partially labeled data to improve predictive modeling for drug discovery. Availability: The code that implements PLANS is available at https://github.com/XieResearchGroup/PLANS
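The general idea of self-training can be sketched as a pseudo-labeling loop. This is not the PLANS implementation (see the repository above for that); it is a generic illustration that assumes 1-D features and a nearest-centroid classifier, and every function name and the `min_margin` threshold are hypothetical.

```python
def fit_centroids(xs, ys):
    """Mean feature value per class for 1-D features."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Return (label, margin): the nearest centroid and its gap to the runner-up."""
    ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, rounds=3):
    """Repeatedly pseudo-label confident unlabeled points and refit the model."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        centroids = fit_centroids(xs, ys)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:  # nothing new to learn from; stop early
            break
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = [x for x, _ in rest]
    return fit_centroids(xs, ys), pool


# Two labeled points, three unlabeled ones; 5.0 is ambiguous and stays unlabeled.
model, leftover = self_train([0.0, 10.0], [0, 1], [1.0, 9.0, 5.0])
```

The loop only absorbs unlabeled points whose prediction is clearly separated from the alternative, which is the usual guard against reinforcing early mistakes.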


2020 ◽  
Author(s):  
Mohammed Maaz ◽  
Sabah Mohammed

The advancement of Artificial Intelligence & Deep Learning has catalyzed the field of technology. Progress in these fields is accelerating, and discoveries that were once only imagined are now becoming reality. The evolution of cars each year has changed how people travel from one place to another. One such development involving Artificial Intelligence & Deep Learning is the birth of the self-driving car, which promises to let people reach their destination safely and hassle-free, without the fear of accidents. This paper introduces a practical model of a self-driving robotic car that can travel from one position to another on different types of tracks. A Pi camera module attached to a Raspberry Pi sends a series of image frames to a convolutional neural network, which then predicts the direction in which the car should move: right, left, forward or reverse. The outcome is a robotic car that travels in the desired direction without human intervention.
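As a hedged illustration of the final step described above, the sketch below turns four raw network scores into a steering decision. The score values, the class order in `DIRECTIONS` and the function names are assumptions for the example; the abstract does not specify how the network's output is decoded.

```python
import math

# Hypothetical class order; the paper's actual label order is not given.
DIRECTIONS = ("left", "right", "forward", "reverse")


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choose_direction(scores):
    """Return the most probable direction and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DIRECTIONS[best], probs[best]


# Example scores for one camera frame (made up for illustration).
direction, confidence = choose_direction([0.1, 0.2, 3.0, -1.0])
```

With these made-up scores the third class has the largest value, so the selected direction is "forward".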


Author(s):  
Oleksii Prykhodko ◽  
Simon Viet Johansson ◽  
Panagiotis-Christos Kotsias ◽  
Josep Arús-Pous ◽  
Esben Jannik Bjerrum ◽  
...  

Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases: sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.


2022 ◽  
Vol 2022 ◽  
pp. 1-8
Author(s):  
Mohammad Manthouri ◽  
Zhila Aghajari ◽  
Sheida Safary

Infectious diseases are among the top global issues, with negative impacts on health, the economy, and society as a whole. One of the most effective ways to detect these diseases is to analyse microscopic images of blood cells. Artificial intelligence (AI) techniques are now widely used to detect these blood cells and explore their structures. In recent years, deep learning architectures have been adopted because they are powerful tools for big data analysis. In this work, we present a deep neural network for processing microscopic images of blood cells. Processing these images is particularly important because white blood cells and their structures are used to diagnose different diseases. We design and implement a reliable processing system for blood samples and classify five different types of white blood cells in microscopic images. We use the Gram-Schmidt algorithm for segmentation. To classify the different types of white blood cells, we combine the Scale-Invariant Feature Transform (SIFT) feature detection technique with a deep convolutional neural network. To evaluate our work, we tested our method on the LISC and WBCis databases, achieving segmentation accuracies of 95.84% and 97.33%, respectively. Our work illustrates that deep learning models are promising for designing and developing a reliable system for microscopic image processing.
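For readers unfamiliar with the Gram-Schmidt algorithm mentioned above, the following is a minimal sketch of the orthogonalization step itself, written in its modified (numerically more stable) form. How the authors apply it to colour vectors for segmentation is not described in the abstract, so the three input vectors below are arbitrary examples and the function names are assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, eps=1e-9):
    """Return an orthonormal basis for the span of `vectors` (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            coeff = dot(w, q)  # projection of the current residual onto q
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors that are (numerically) linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


# Three example 3-D vectors; the third is the sum of the first two.
basis = gram_schmidt([[1, 1, 0], [1, 0, 1], [2, 1, 1]])
```

Because the third vector lies in the span of the first two, the result contains two orthonormal vectors rather than three.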


2021 ◽  
Vol 35 (5) ◽  
pp. 375-381
Author(s):  
Putra Sumari ◽  
Wan Muhammad Azimuddin Wan Ahmad ◽  
Faris Hadi ◽  
Muhammad Mazlan ◽  
Nur Anis Liyana ◽  
...  

Fruits come in different variants and subspecies. While some subspecies can be easily differentiated, others require expertise to tell apart. Although farmers rely on traditional methods to identify and classify fruit types, these methods are prone to many challenges. Training a machine to identify and classify fruit types in place of traditional methods can make fruit classification more precise. Taking advantage of state-of-the-art image recognition techniques, we approach fruit classification from another perspective by proposing a high-performing hybrid deep learning model for precise mangosteen classification. The proposed optimized Convolutional Neural Network (CNN) model is compared with other optimized models, namely Xception, VGG16, and ResNet50, using the Adam, RMSprop, Adagrad, and Stochastic Gradient Descent (SGD) optimizers with specified numbers of dense layers and filters. The proposed CNN model consists of three types of layers: 1) convolutional layers, 2) pooling layers, and 3) fully connected (FC) layers. The first convolutional layer uses 3x3 filters, whose weights are initialized before training and then updated at each iteration. The CNN architecture is formed by stacking these layers. Our self-acquired dataset, composed of four types of Malaysian mangosteen fruit, namely Manggis Hutan, Manggis Mesta, Manggis Putih and Manggis Ungu, was used to train and test the proposed CNN model. The proposed CNN model achieved 94.99% classification accuracy, ahead of the optimized Xception model, which ranked second with 90.62%.
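The three layer types named above can be illustrated with a small forward pass in plain Python. This is only a sketch of the generic operations: the 6x6 input, the averaging kernel and the 2x2 pooling window are assumptions chosen for readability and are not the authors' architecture or weights.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding, stride 1) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]  # 6x6 toy input
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]                      # 3x3 averaging filter
features = conv2d_valid(image, kernel)     # 3x3 filter on 6x6 input gives 4x4
pooled = max_pool2d(features)              # 2x2 pooling on 4x4 gives 2x2
flat = [v for row in pooled for v in row]  # flattened input for a fully connected layer
```

A 3x3 filter without padding shrinks each spatial dimension by 2, and 2x2 pooling then halves it, so the fully connected layer in this toy example receives 4 values.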


2020 ◽  
Vol 60 (10) ◽  
pp. 4487-4496
Author(s):  
Paul Maragakis ◽  
Hunter Nisonoff ◽  
Brian Cole ◽  
David E. Shaw

2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Oleksii Prykhodko ◽  
Simon Viet Johansson ◽  
Panagiotis-Christos Kotsias ◽  
Josep Arús-Pous ◽  
Esben Jannik Bjerrum ◽  
...  

Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases. Sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.


2019 ◽  
Vol 26 (4) ◽  
pp. 1361-1366
Author(s):  
Sho Ito ◽  
Go Ueno ◽  
Masaki Yamamoto

High-throughput protein crystallography using a synchrotron light source is an important method in drug discovery. Beamline components for automated experiments, including automatic sample changers, have been used to accelerate the measurement of large numbers of macromolecular crystals. However, unlike cryo-loop centering, crystal centering involving automated crystal detection is difficult to automate fully. Here, DeepCentering, a new automated crystal centering system, is presented. DeepCentering uses a convolutional neural network, a deep learning method. The system achieves fully automated, accurate crystal centering without X-ray irradiation of the crystals and can be used for fully automated data collection in high-throughput macromolecular crystallography.

