Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation increased performance over both non-augmented and conventionally SMILES-randomization-augmented data when used for training the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as “attentional gain”: an enhancement in the underlying network’s ability to recognize molecular motifs.
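To make the idea concrete, here is a minimal Python sketch of one plausible reading of the procedure: randomized reactant SMILES are generated with RDKit, and the variant with the smallest Levenshtein distance to the product SMILES is kept as the training input. The `n_variants` parameter and the exact selection rule are illustrative assumptions, not the authors' published algorithm.

```python
# Sketch of Levenshtein-guided SMILES augmentation (illustrative, not the
# authors' exact procedure). Requires RDKit: pip install rdkit
from rdkit import Chem

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def levenshtein_augment(reactant_smiles: str, product_smiles: str,
                        n_variants: int = 50) -> str:
    """Return the randomized reactant SMILES closest (in edit distance)
    to the product SMILES, so training pairs share local sub-sequences."""
    mol = Chem.MolFromSmiles(reactant_smiles)
    variants = {Chem.MolToSmiles(mol, canonical=False, doRandom=True)
                for _ in range(n_variants)}
    return min(variants, key=lambda s: levenshtein(s, product_smiles))

# Example: align an esterification reactant to its product string.
print(levenshtein_augment("CCO", "CC(=O)OCC"))
```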


Mathematics ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 2178
Author(s):  
Yi-Chung Chen ◽  
Tsu-Chiang Lei ◽  
Shun Yao ◽  
Hsin-Ping Wang

Airborne particulate matter 2.5 (PM2.5) can have a profound effect on the health of the population. Many researchers have reported highly accurate numerical predictions based on raw PM2.5 data imported directly into deep learning models; however, there is still considerable room for improvement in terms of implementation costs due to heavy computational overhead. From the perspective of environmental science, PM2.5 values in a given location can be attributed to local sources as well as external sources. Local sources tend to have a dramatic short-term impact on PM2.5 values, whereas external sources tend to have more subtle but longer-lasting effects. When PM2.5 from both sources is present at the same time, this combination of effects can undermine the predictive accuracy of the model. This paper presents a novel combinational Hammerstein recurrent neural network (CHRNN) to enhance predictive accuracy and overcome the heavy computational and monetary burden imposed by deep learning models. The CHRNN comprises a base neural network tasked with learning gradual (long-term) fluctuations in conjunction with add-on neural networks to deal with dramatic (short-term) fluctuations. The CHRNN can be coupled with a random forest model to determine the degree to which short-term effects influence long-term outcomes. We also developed novel feature selection and normalization methods to enhance prediction accuracy. Using real-world air quality and PM2.5 measurement data from Taiwan, the proposed system achieved numerical PM2.5 prediction precision comparable to that of state-of-the-art deep learning models, such as deep recurrent neural networks and long short-term memory networks, despite far lower implementation costs and computational overhead.
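The base/add-on decomposition can be sketched as follows. This is a deliberately simplified PyTorch illustration: the module sizes, the wiring, and the use of a single add-on network are assumptions rather than the paper's CHRNN.

```python
# Highly simplified sketch of the base/add-on idea behind the CHRNN
# (module sizes and wiring are assumptions, not the paper's architecture).
import torch
import torch.nn as nn

class BaseAddOnPM25(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        # Base recurrent network: tracks gradual, long-term PM2.5 trends.
        self.base_rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.base_head = nn.Linear(hidden, 1)
        # Add-on network: corrects for dramatic, short-term local spikes
        # using only the most recent time step.
        self.addon = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x: torch.Tensor, short_term_weight: torch.Tensor):
        # x: (batch, time, features); short_term_weight: (batch, 1),
        # e.g. produced by a random forest judging spike influence.
        _, h = self.base_rnn(x)
        long_term = self.base_head(h[-1])
        short_term = self.addon(x[:, -1, :])
        return long_term + short_term_weight * short_term

model = BaseAddOnPM25(n_features=8)
x = torch.randn(4, 24, 8)   # 4 stations, 24 hourly readings
w = torch.rand(4, 1)        # per-sample short-term influence
print(model(x, w).shape)    # torch.Size([4, 1])
```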


10.29007/j5hd ◽  
2020 ◽  
Author(s):  
Bartosz Piotrowski ◽  
Josef Urban

In this work we develop a new learning-based method for selecting facts (premises) when proving new goals over large formal libraries. Unlike previous methods that choose sets of facts independently of each other by their rank, the new method uses a notion of state that is updated each time a fact is chosen. Our stateful architecture is based on recurrent neural networks, which have recently been very successful in stateful tasks such as language translation. The new method is combined with data augmentation techniques, evaluated in several ways on a standard large-theory benchmark, and compared to a state-of-the-art premise selection approach based on gradient boosted trees. It is shown to perform significantly better and to solve many new problems.
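The stateful selection loop can be illustrated with a small PyTorch sketch. The GRU-cell state update and dot-product scoring below are assumed simplifications, not the paper's architecture.

```python
# Sketch of stateful premise selection: a GRU cell carries the proof
# state, updated after each chosen fact, and remaining facts are scored
# against it (an assumed simplification of the paper's model).
import torch
import torch.nn as nn

class StatefulSelector(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)  # state update per chosen fact
        self.proj = nn.Linear(dim, dim)   # maps state into fact space

    def select(self, goal_emb, fact_embs, k: int = 3):
        state = goal_emb                  # initial state from the goal
        chosen, remaining = [], list(range(fact_embs.size(0)))
        for _ in range(k):
            query = self.proj(state)      # relevance = fact . query
            scores = torch.stack([fact_embs[i] @ query for i in remaining])
            best = remaining.pop(int(scores.argmax()))
            chosen.append(best)
            # Fold the chosen fact into the state before the next pick.
            state = self.cell(fact_embs[best].unsqueeze(0),
                              state.unsqueeze(0)).squeeze(0)
        return chosen

sel = StatefulSelector()
print(sel.select(torch.randn(64), torch.randn(10, 64)))
```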


Information ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 157 ◽  
Author(s):  
Daniel S. Berman

Domain generation algorithms (DGAs) represent a class of malware used to generate large numbers of new domain names to achieve command-and-control (C2) communication between the malware program and its C2 server while avoiding detection by cybersecurity measures. Deep learning has proven successful for real-time DGA detection, specifically through the use of recurrent neural networks (RNNs) and convolutional neural networks (CNNs). This paper compares several state-of-the-art deep-learning implementations of DGA detection found in the literature with two novel models: a deeper CNN model and a one-dimensional (1D) capsule network (CapsNet) model. The comparison shows that the 1D CapsNet model performs as well as the best-performing model from the literature.
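For orientation, a minimal character-level CNN for domain classification might look like the following. The alphabet, maximum length, and layer sizes are illustrative assumptions; this is a generic baseline, not the paper's deeper CNN or CapsNet.

```python
# Minimal character-level 1D CNN for DGA detection (an illustrative
# baseline in the spirit of the compared models).
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-."
MAXLEN = 63  # longest DNS label

def encode(domain: str) -> torch.Tensor:
    # 0 is padding; unknown characters also map to 0 in this sketch.
    ids = [CHARS.find(c) + 1 for c in domain.lower()[:MAXLEN]]
    return torch.tensor(ids + [0] * (MAXLEN - len(ids)))

class DGACnn(nn.Module):
    def __init__(self, emb: int = 32, filters: int = 64):
        super().__init__()
        self.embed = nn.Embedding(len(CHARS) + 1, emb, padding_idx=0)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3, padding=1)
        self.head = nn.Linear(filters, 1)  # logit: benign vs. DGA

    def forward(self, x):
        z = self.embed(x).transpose(1, 2)               # (batch, emb, length)
        z = torch.relu(self.conv(z)).max(dim=2).values  # global max pooling
        return self.head(z).squeeze(-1)

model = DGACnn()
batch = torch.stack([encode("google"), encode("xj4kqzp0v")])
print(torch.sigmoid(model(batch)))  # untrained scores for two domains
```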


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6410
Author(s):  
Ke Zang ◽  
Wenqi Wu ◽  
Wei Luo

Deep learning models, especially recurrent neural networks (RNNs), have recently been successfully applied to automatic modulation classification (AMC) problems. However, deep neural networks are usually overparameterized, i.e., most of the connections between neurons are redundant. The large model size hinders the deployment of deep neural networks in applications such as Internet-of-Things (IoT) networks. Therefore, reducing parameters without compromising network performance via sparse learning is often desirable, since it can alleviate the computational and storage burdens of deep learning models. In this paper, we propose a sparse learning algorithm that can directly train a sparsely connected neural network based on the statistics of weight magnitude and gradient momentum. We first used the MNIST and CIFAR10 datasets to demonstrate the effectiveness of this method. Subsequently, we applied it to RNNs with different pruning strategies on recurrent and non-recurrent connections for AMC problems. Experimental results demonstrated that the proposed method can effectively reduce the parameters of the neural networks while maintaining model performance. Moreover, we show that appropriate sparsity can further improve the network's generalization ability.
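A rough sketch of a pruning criterion that blends weight magnitude with gradient momentum is given below. The blending weight `alpha` and the thresholding rule are assumptions, not the paper's exact algorithm.

```python
# Sketch of sparsity driven by weight magnitude and gradient momentum
# (an assumed simplification of the paper's sparse-learning criterion).
import torch

def prune_mask(weight: torch.Tensor, momentum: torch.Tensor,
               sparsity: float = 0.9, alpha: float = 0.5) -> torch.Tensor:
    """Keep connections whose combined magnitude/momentum score is high.

    The score blends |w| (current importance) with |momentum| (whether
    the optimizer still wants to move the weight); alpha is an assumption.
    """
    score = alpha * weight.abs() + (1 - alpha) * momentum.abs()
    k = int(sparsity * weight.numel())          # connections to drop
    threshold = score.flatten().kthvalue(k).values
    return (score > threshold).float()          # 1 = keep, 0 = prune

w = torch.randn(256, 128)
m = torch.randn_like(w) * 0.1                   # e.g. SGD momentum buffer
mask = prune_mask(w, m)
print(f"kept {mask.mean().item():.1%} of connections")
# During training one would apply `w.data *= mask` after each step.
```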


2021 ◽  
Vol 11 (9) ◽  
pp. 3883
Author(s):  
Spyridon Kardakis ◽  
Isidoros Perikos ◽  
Foteini Grivokostopoulou ◽  
Ioannis Hatzilygeroudis

Attention-based methods for deep neural networks constitute a technique that has attracted increased interest in recent years. Attention mechanisms can focus on important parts of a sequence and, as a result, enhance the performance of neural networks in a variety of tasks, including sentiment analysis, emotion recognition, machine translation, and speech recognition. In this work, we study attention-based models built on recurrent neural networks (RNNs) and examine their performance in various contexts of sentiment analysis. Self-attention, global-attention, and hierarchical-attention methods are examined under various deep neural models, training methods, and hyperparameters. Even though attention mechanisms are a powerful recent concept in the field of deep learning, their exact effectiveness in sentiment analysis has yet to be thoroughly assessed. A comparative analysis is performed on a text sentiment classification task, where baseline models are compared with and without the use of attention for every experiment. The experimental study additionally examines the proposed models' ability to recognize opinions and emotions in movie reviews. The results indicate that attention-based models lead to clear improvements in the performance of deep neural models, with accuracy gains of up to 3.5%.
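A representative attention-pooled LSTM classifier of the kind studied could be sketched as follows; vocabulary and layer sizes are assumed for illustration.

```python
# Minimal self-attention pooling over LSTM outputs for sentiment
# classification (a generic sketch, not the paper's exact models).
import torch
import torch.nn as nn

class AttentiveLSTM(nn.Module):
    def __init__(self, vocab: int = 10000, emb: int = 100, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)   # one score per time step
        self.head = nn.Linear(hidden, 2)   # positive / negative

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))          # (batch, time, hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # attention over time
        context = (weights * h).sum(dim=1)            # weighted average
        return self.head(context)

model = AttentiveLSTM()
logits = model(torch.randint(0, 10000, (8, 40)))  # 8 reviews, 40 tokens
print(logits.shape)  # torch.Size([8, 2])
```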


2019 ◽  
Author(s):  
Janayna Moura ◽  
Lucas Bissaro ◽  
Fernanda Santos ◽  
Murillo Carneiro

Credit evaluation models have been widely studied in the accounting and finance literature. With the support of such models, usually developed as part of a data mining process, it is possible to classify credit applicants more accurately into "good" or "bad" risk groups. Although many machine learning techniques have been extensively evaluated on this problem, deep learning models have barely been explored, even though they have provided state-of-the-art results for a myriad of applications. In this paper, we propose deep learning models for the credit evaluation problem. To be specific, we investigate the abilities of deep neural networks (DNN) and convolutional neural networks (CNN) on this problem and systematically compare their classification accuracy against five commonly adopted techniques on three real-world credit evaluation datasets. The results show that random forest, a state-of-the-art technique for this problem, presented the most consistent performance, although CNN demonstrated strong potential to outperform it on larger datasets.
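The comparison protocol can be imitated in a few lines of scikit-learn. Synthetic data stands in for the credit datasets and an MLP stands in for the deeper models, so this shows the shape of the experiment rather than the paper's results.

```python
# Sketch comparing a random forest baseline with a simple neural model on
# tabular credit-style data (synthetic features; illustrative only).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Imbalanced binary labels mimic "good" vs. "bad" risk groups.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("random forest", RandomForestClassifier(random_state=0)),
                  ("neural net", MLPClassifier(hidden_layer_sizes=(64, 32),
                                               max_iter=500, random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```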


Author(s):  
Liuyu Xiang ◽  
Xiaoming Jin ◽  
Lan Yi ◽  
Guiguang Ding

Deep learning models such as convolutional neural networks and recurrent networks are widely applied in text classification. In spite of their great success, most deep learning models neglect the importance of modeling context information, which is crucial to understanding texts. In this work, we propose Adaptive Region Embedding, which learns context representations to improve text classification. Specifically, a meta-network is learned to generate a context matrix for each region, and each word interacts with its corresponding context matrix to produce the regional representation used for further classification. Compared to previous models designed to capture context information, our model contains fewer parameters and is more flexible. We extensively evaluate our method on 8 benchmark datasets for text classification. The experimental results show that our method achieves state-of-the-art performance and effectively avoids word ambiguity.
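An assumed simplification of the meta-network idea is sketched below: a linear meta-network maps each region's center word to a context matrix, through which the region's words are projected and pooled. Region size, pooling, and the meta-network form are illustrative guesses, not the paper's design.

```python
# Rough sketch of a meta-network generating per-region context matrices
# (an assumed simplification of Adaptive Region Embedding).
import torch
import torch.nn as nn

class RegionEmbedding(nn.Module):
    def __init__(self, vocab: int = 10000, emb: int = 64, region: int = 5):
        super().__init__()
        self.region = region
        self.embed = nn.Embedding(vocab, emb)
        # Meta-network: maps a center word to its context matrix (emb x emb).
        self.meta = nn.Linear(emb, emb * emb)

    def forward(self, tokens):                     # tokens: (batch, time)
        e = self.embed(tokens)
        b, t, d = e.shape
        n = t // self.region                       # number of regions
        words = e[:, :n * self.region].view(b, n, self.region, d)
        center = words[:, :, self.region // 2]     # (batch, n, d)
        ctx = self.meta(center).view(b, n, d, d)   # per-region matrix
        projected = torch.matmul(words, ctx)       # words through context
        return projected.amax(dim=2).mean(dim=1)   # (batch, d) document vector

model = RegionEmbedding()
print(model(torch.randint(0, 10000, (4, 30))).shape)  # torch.Size([4, 64])
```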


2021 ◽  
Author(s):  
Kanimozhi V ◽  
T. Prem Jacob

Abstract: Although numerous deep learning models have been proposed, this research article investigates artificial deep learning models for online protection of IoT network traffic on practical IoT devices, using the realistic IoT-23 dataset. This recent dataset was generated from real-time network traffic of IoT appliances. IoT products are used in a wide range of applications, including home and commercial automation and various forms of wearable technology. IoT security is more critical than conventional network security because of the massive attack surface and multiplied weak spots of IoT devices; worldwide, the total number of IoT devices deployed by 2025 is expected to reach 41,600 million. Hence, this article implements and executes IoT anomaly detection systems for detecting IoT-based attacks on the realistic IoT-23 big data, using the artificial neural network architectures of convolutional neural networks (CNN), recurrent neural networks (RNN), and multilayer perceptrons (MLP). As a result, the convolutional neural network produces outstanding performance, with an accuracy score of 0.998234 and a minimal loss of 0.008842, compared to the multilayer perceptron and recurrent neural network for IoT anomaly detection. Clear plots of model accuracy and the learning curves of the MLP, CNN, and RNN deep learning algorithms are also provided.
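As a minimal illustration of the CNN variant, a flow-level 1D convolutional classifier might be structured as follows. The feature count and layer sizes are assumptions, and the IoT-23 preprocessing is not shown.

```python
# Illustrative 1D CNN over preprocessed flow features for IoT anomaly
# detection (layer sizes assumed; not the paper's exact model).
import torch
import torch.nn as nn

class FlowCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, 1))                 # logit: benign vs. malicious

    def forward(self, x):                     # x: (batch, n_features)
        return self.net(x.unsqueeze(1)).squeeze(-1)

model = FlowCNN()
flows = torch.randn(16, 20)                   # 16 flow records, 20 features
print(torch.sigmoid(model(flows)).shape)      # torch.Size([16])
```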

