Transmol: Repurposing Language Model for Molecular Generation

2021 ◽  
Author(s):  
Rustam Zhumagambetov ◽  
Vsevolod A. Peshkov ◽  
Siamac Fazli

Recent advances in convolutional neural networks have inspired the application of deep learning to other disciplines. Although image processing and natural language processing have turned out to be the most successful, many other areas have benefited as well, including computational chemistry in general and drug design in particular. Since 2018 the scientific community has seen a surge of machine learning methodologies for generating diverse molecular libraries. However, no algorithm had used an attention mechanism for <i>de novo</i> molecular generation. Here we employ a variant of transformers, a recent NLP architecture, for this purpose. We achieve a statistically significant increase in some of the core metrics of the MOSES benchmark. Furthermore, we describe a novel way of generating libraries by fusing two molecules as seeds.
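The attention mechanism this abstract refers to is the scaled dot-product attention at the heart of transformers. As a minimal, stdlib-only sketch (not the Transmol implementation, which operates on molecular SMILES tokens), each query scores every key, the scores are softmax-normalized, and the output is the weighted mix of the values:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over small Python lists.

    queries, keys: lists of d-dimensional vectors; values: list of vectors.
    Returns one output per query: a softmax-weighted mix of the values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of the query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Convex combination of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

In a real transformer the queries, keys, and values are learned linear projections of token embeddings; this sketch only shows the weighting step itself.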


2021 ◽  
Author(s):  
Sanjar Adilov

Generative neural networks have shown promising results in <i>de novo</i> drug design. Recent studies suggest that one efficient way to produce novel molecules matching target properties is to model SMILES sequences with deep learning, in a manner similar to language modeling in natural language processing. In this paper, we present a survey of machine learning methods for SMILES-based language modeling and report our benchmarking results on a standardized subset of the ChEMBL database.
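"Modeling SMILES sequences like a language" means estimating the probability of each next character given the preceding ones, then sampling. The deep models surveyed use neural sequence architectures; as a hedged, dependency-free illustration of the same idea, here is a character-bigram model over a toy SMILES corpus (`^` and `$` are assumed start/end markers, not SMILES syntax):

```python
import random
from collections import defaultdict

def train_char_lm(smiles_list):
    """Count character-bigram transitions over SMILES strings.

    '^' and '$' mark the start and end of each string, so sampling
    knows how strings typically begin and terminate.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for s in smiles_list:
        chars = ["^"] + list(s) + ["$"]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def sample(counts, rng, max_len=50):
    """Sample one string from the bigram model."""
    out, cur = [], "^"
    while len(out) < max_len:
        nxt_chars = list(counts[cur])
        weights = [counts[cur][c] for c in nxt_chars]
        cur = rng.choices(nxt_chars, weights=weights)[0]
        if cur == "$":
            break
        out.append(cur)
    return "".join(out)
```

A bigram model produces many syntactically invalid SMILES; the neural language models in the survey exist precisely because longer-range context is needed for validity and property control.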


2020 ◽  
pp. 1-38
Author(s):  
Amandeep Kaur ◽  
Anjum Mohammad Aslam

In this chapter we discuss the core concepts of Artificial Intelligence. We define the term Artificial Intelligence and its interconnected terms such as machine learning, deep learning, and neural networks. We describe these concepts from the perspective of their usage in business. We further analyze various applications and case studies that can be realized using Artificial Intelligence and its subfields. Numerous Artificial Intelligence applications are already being utilized in business, and more are expected in the future, where machines will augment the Artificial Intelligence, natural language processing, and machine learning abilities of humans in various zones.


2021 ◽  
Vol 11 (7) ◽  
pp. 3184
Author(s):  
Ismael Garrido-Muñoz  ◽  
Arturo Montejo-Ráez  ◽  
Fernando Martínez-Santiago  ◽  
L. Alfonso Ureña-López 

Deep neural networks have become the hegemonic approach in many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms through self-supervised learning (also known as “pre-training”), versatile and high-performing models are released continuously for every new network design. These networks learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies, and biases contained in that collection. As pre-trained models have proven very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
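The central object in this abstract, a similarity graph over one batch of intermediate representations, can be sketched without any deep learning framework. Assuming cosine similarity and a fixed threshold (the paper's actual construction and similarity measure may differ), the graph is just a thresholded pairwise-similarity matrix:

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def latent_geometry_graph(batch, threshold=0.5):
    """Adjacency matrix of a similarity graph over one batch of
    intermediate representations: edge (i, j) is present when the
    cosine similarity of representations i and j exceeds threshold."""
    n = len(batch)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(batch[i], batch[j]) > threshold:
                adj[i][j] = adj[j][i] = 1
    return adj
```

In the paper's setting, one such graph is built per latent space, and training constraints compare graphs across spaces (or against a teacher network's graphs).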


2018 ◽  
Vol 7 (2.7) ◽  
pp. 614 ◽  
Author(s):  
M Manoj krishna ◽  
M Neelima ◽  
M Harshali ◽  
M Venu Gopala Rao

Image classification is a classical problem in image processing, computer vision, and machine learning. In this paper we study image classification using deep learning, employing the AlexNet architecture with convolutional neural networks. Four test images are selected from the ImageNet database for classification. We cropped various portions of the images and conducted experiments. The results show the effectiveness of deep-learning-based image classification using AlexNet.
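The cropping step described here, feeding the classifier different portions of each image, can be illustrated with a minimal center-crop over a 2D pixel grid (a simplification: the paper works on real ImageNet images fed to AlexNet, and the exact crop regions it used are not specified in the abstract):

```python
def center_crop(image, crop_h, crop_w):
    """Center-crop a 2D pixel grid (list of rows) to crop_h x crop_w.

    Assumes crop_h <= len(image) and crop_w <= len(image[0]).
    """
    h, w = len(image), len(image[0])
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]
```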


Author(s):  
Shatakshi Singh ◽  
Kanika Gautam ◽  
Prachi Singhal ◽  
Sunil Kumar Jangir ◽  
Manish Kumar

The recent development of artificial intelligence in this decade is quite astounding. Machine learning is one of the core subareas of AI, a field that is growing incessantly, with rising demand and importance. It has transformed the way data is extracted, analyzed, and interpreted: computers are trained to learn, grow, change, and develop themselves from new data without explicit programming. This helps make useful predictions that can guide better decisions in real-life situations without human interference. Selecting an ML tool is always a challenging task, since choosing an appropriate tool can save time and make it faster and easier to deliver a solution. This chapter provides a classification of various machine learning tools along the following aspects: tools for non-programmers, for model deployment, for computer vision, natural language processing, and audio, for reinforcement learning, and for data mining.


2019 ◽  
Vol 63 (4) ◽  
pp. 243-252 ◽  
Author(s):  
Jaret Hodges ◽  
Soumya Mohan

Machine learning algorithms are used in language processing, automated driving, and prediction. Though the theory of machine learning has existed since the 1950s, it was not until the advent of advanced computing that its potential began to be realized. Gifted education is a field where machine learning has yet to be utilized, even though one of its underlying problems is classification, an area where learning algorithms have become exceptionally accurate. We provide a brief overview of machine learning with a focus on neural networks and supervised learning, followed by a demonstration using simulated data and neural networks for classification, with a practical explanation of the mechanics of the neural network and the associated R code. Implications for gifted education are then discussed, followed by the limitations of supervised learning. Code used in this article can be found at https://osf.io/4pa3b/
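The article's demonstration uses neural networks in R (see the OSF link above); as a language-agnostic sketch of the same supervised-classification idea on simulated data, here is the classic single-neuron perceptron rule, which updates its weights only on misclassified examples (this is a simpler model than the networks in the article, shown purely to illustrate the training loop):

```python
def train_perceptron(data, labels, epochs=20, lr=0.1):
    """Classic perceptron: learn weights w and bias b so that
    sign(w.x + b) matches the +1/-1 labels of separable data."""
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # update only on mistakes
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Classify one point with the learned weights."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

On linearly separable data the perceptron is guaranteed to converge; multi-layer networks such as those in the article extend this to non-separable problems.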

