Transmol: Repurposing Language Model for Molecular Generation

2021 ◽  
Author(s):  
Rustam Zhumagambetov ◽  
Vsevolod A. Peshkov ◽  
Siamac Fazli

Recent advances in convolutional neural networks have inspired the application of deep learning to other disciplines. Although image processing and natural language processing have turned out to be the most successful, many other areas have benefited as well, including computational chemistry in general and drug design in particular. Since 2018 the scientific community has seen a surge of machine learning methodologies for generating diverse molecular libraries. However, no algorithm had used an attention mechanism for <i>de novo</i> molecular generation. Here we employ a variant of transformers, a recent NLP architecture, for this purpose. We achieve a statistically significant increase in some of the core metrics of the MOSES benchmark. Furthermore, we describe a novel way of generating libraries by fusing two molecules as seeds.
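The attention mechanism this abstract refers to is the scaled dot-product attention at the heart of transformers. As a minimal, stdlib-only sketch (not the Transmol implementation, which operates on molecular SMILES tokens), each query scores every key, the scores are softmax-normalized, and the output is the weighted mix of the values:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over small Python lists.

    queries, keys: lists of d-dimensional vectors; values: list of vectors.
    Returns one output per query: a softmax-weighted mix of the values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of the query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Convex combination of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

In a real transformer the queries, keys, and values are learned linear projections of token embeddings; this sketch only shows the weighting step itself.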


2021 ◽  
Author(s):  
Sanjar Adilov

Generative neural networks have shown promising results in <i>de novo</i> drug design. Recent studies suggest that one efficient way to produce novel molecules matching target properties is to model SMILES sequences with deep learning, in a manner similar to language modeling in natural language processing. In this paper, we present a survey of machine learning methods for SMILES-based language modeling and report our benchmarking results on a standardized subset of the ChEMBL database.
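"Modeling SMILES sequences like a language" means estimating the probability of each next character given the preceding ones, then sampling. The deep models surveyed use neural sequence architectures; as a hedged, dependency-free illustration of the same idea, here is a character-bigram model over a toy SMILES corpus (`^` and `$` are assumed start/end markers, not SMILES syntax):

```python
import random
from collections import defaultdict

def train_char_lm(smiles_list):
    """Count character-bigram transitions over SMILES strings.

    '^' and '$' mark the start and end of each string, so sampling
    knows how strings typically begin and terminate.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for s in smiles_list:
        chars = ["^"] + list(s) + ["$"]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def sample(counts, rng, max_len=50):
    """Sample one string from the bigram model."""
    out, cur = [], "^"
    while len(out) < max_len:
        nxt_chars = list(counts[cur])
        weights = [counts[cur][c] for c in nxt_chars]
        cur = rng.choices(nxt_chars, weights=weights)[0]
        if cur == "$":
            break
        out.append(cur)
    return "".join(out)
```

A bigram model produces many syntactically invalid SMILES; the neural language models in the survey exist precisely because longer-range context is needed for validity and property control.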


2020 ◽  
pp. 1-38
Author(s):  
Amandeep Kaur ◽  
Anjum Mohammad Aslam

In this chapter we discuss the core concepts of Artificial Intelligence. We define the term Artificial Intelligence and its interconnected terms such as machine learning, deep learning, and neural networks. We describe these concepts from the perspective of their usage in business. We further analyze various applications and case studies that can be realized using Artificial Intelligence and its subfields. Numerous Artificial Intelligence applications are already being utilized in business, and more are expected in the future, where machines will augment the Artificial Intelligence, natural language processing, and machine learning abilities of humans in various zones.


2021 ◽  
Vol 11 (7) ◽  
pp. 3184
Author(s):  
Ismael Garrido-Muñoz  ◽  
Arturo Montejo-Ráez  ◽  
Fernando Martínez-Santiago  ◽  
L. Alfonso Ureña-López 

Deep neural networks have become the hegemonic approach in many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms through self-supervised learning (also known as “pre-training”), versatile and high-performing models are released continuously for every new network design. These networks learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies, and biases contained in that collection. As pre-trained models have proven very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
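The central object in this abstract, a similarity graph over one batch of intermediate representations, can be sketched without any deep learning framework. Assuming cosine similarity and a fixed threshold (the paper's actual construction and similarity measure may differ), the graph is just a thresholded pairwise-similarity matrix:

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def latent_geometry_graph(batch, threshold=0.5):
    """Adjacency matrix of a similarity graph over one batch of
    intermediate representations: edge (i, j) is present when the
    cosine similarity of representations i and j exceeds threshold."""
    n = len(batch)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(batch[i], batch[j]) > threshold:
                adj[i][j] = adj[j][i] = 1
    return adj
```

In the paper's setting, one such graph is built per latent space, and training constraints compare graphs across spaces (or against a teacher network's graphs).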


2018 ◽  
Vol 7 (2.7) ◽  
pp. 614 ◽  
Author(s):  
M Manoj krishna ◽  
M Neelima ◽  
M Harshali ◽  
M Venu Gopala Rao

Image classification is a classical problem in image processing, computer vision, and machine learning. In this paper we study image classification using deep learning, employing the AlexNet architecture with convolutional neural networks. Four test images are selected from the ImageNet database for classification. We cropped various portions of the images and conducted experiments. The results show the effectiveness of deep-learning-based image classification using AlexNet.
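The cropping step described here, feeding the classifier different portions of each image, can be illustrated with a minimal center-crop over a 2D pixel grid (a simplification: the paper works on real ImageNet images fed to AlexNet, and the exact crop regions it used are not specified in the abstract):

```python
def center_crop(image, crop_h, crop_w):
    """Center-crop a 2D pixel grid (list of rows) to crop_h x crop_w.

    Assumes crop_h <= len(image) and crop_w <= len(image[0]).
    """
    h, w = len(image), len(image[0])
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]
```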


Author(s):  
Shatakshi Singh ◽  
Kanika Gautam ◽  
Prachi Singhal ◽  
Sunil Kumar Jangir ◽  
Manish Kumar

The recent development of artificial intelligence in this decade is quite astounding. Machine learning is one of the core subareas of AI, a field that is growing incessantly, with rising demand and importance. It has transformed the way data is extracted, analyzed, and interpreted: computers are trained to learn, grow, change, and develop themselves from new data without explicit programming. This helps make useful predictions that can guide better decisions in real-life situations without human interference. Selecting an ML tool is always a challenging task, since choosing an appropriate tool can save time and make it faster and easier to deliver a solution. This chapter provides a classification of various machine learning tools along the following aspects: tools for non-programmers, for model deployment, for computer vision, natural language processing, and audio, for reinforcement learning, and for data mining.


2019 ◽  
Vol 63 (4) ◽  
pp. 243-252 ◽  
Author(s):  
Jaret Hodges ◽  
Soumya Mohan

Machine learning algorithms are used in language processing, automated driving, and prediction. Though the theory of machine learning has existed since the 1950s, it was not until the advent of advanced computing that its potential began to be realized. Gifted education is a field where machine learning has yet to be utilized, even though one of its underlying problems is classification, an area where learning algorithms have become exceptionally accurate. We provide a brief overview of machine learning with a focus on neural networks and supervised learning, followed by a demonstration using simulated data and neural networks for classification, with a practical explanation of the mechanics of the neural network and the associated R code. Implications for gifted education are then discussed, followed by the limitations of supervised learning. Code used in this article can be found at https://osf.io/4pa3b/
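The article's demonstration uses neural networks in R (see the OSF link above); as a language-agnostic sketch of the same supervised-classification idea on simulated data, here is the classic single-neuron perceptron rule, which updates its weights only on misclassified examples (this is a simpler model than the networks in the article, shown purely to illustrate the training loop):

```python
def train_perceptron(data, labels, epochs=20, lr=0.1):
    """Classic perceptron: learn weights w and bias b so that
    sign(w.x + b) matches the +1/-1 labels of separable data."""
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # update only on mistakes
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Classify one point with the learned weights."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

On linearly separable data the perceptron is guaranteed to converge; multi-layer networks such as those in the article extend this to non-separable problems.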

