Fragment-based Sequential Translation for Molecular Optimization

Author(s):  
Benson Chen ◽  
Xiang Fu ◽  
Tommi Jaakkola ◽  
Regina Barzilay

Searching for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing frameworks generate molecules one atom at a time. We instead propose a flexible editing paradigm that generates molecules using learned molecular fragments, that is, meaningful substructures of molecules. To do so, we train a variational autoencoder (VAE) to encode molecular fragments in a coherent latent space, which we then utilize as a vocabulary for editing molecules to explore the complex chemical property space. Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties. Empirical evaluation shows that FaST significantly improves over state-of-the-art methods on benchmark single- and multi-objective molecular optimization tasks.
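
To make the editing paradigm concrete, here is a heavily simplified Python sketch of a FaST-style translation loop: molecules are abstracted as sets of fragment ids, a random choice stands in for the learned RL policy, and `property_score` is a toy placeholder for the trained property predictors. None of these stand-ins reflect the authors' actual implementation.

```python
# Hypothetical simplification of fragment-based sequential translation.
import random

FRAGMENT_VOCAB = list(range(50))          # stand-in for the learned fragment vocabulary

def property_score(mol):                  # toy oracle; FaST uses trained property predictors
    return len(mol) / 10.0

def translate(mol):
    """Apply one fragment-level edit: add or remove a fragment."""
    mol = set(mol)
    if mol and random.random() < 0.5:
        mol.discard(random.choice(sorted(mol)))      # deletion edit
    else:
        mol.add(random.choice(FRAGMENT_VOCAB))       # addition edit
    return frozenset(mol)

def fast_search(seed, steps=100, threshold=0.5):
    pool, discovered = [seed], set()
    for _ in range(steps):
        parent = random.choice(pool)                 # a learned policy would pick this
        child = translate(parent)
        if property_score(child) >= threshold and child not in discovered:
            discovered.add(child)
            pool.append(child)                       # novel and property-satisfying: keep exploring
    return discovered

print(len(fast_search(frozenset({1, 2, 3}))))
```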


Author(s):  
Bidisha Samanta ◽  
Sharmila Reddy ◽  
Hussain Jagirdar ◽  
Niloy Ganguly ◽  
Soumen Chakrabarti

Code-switching, the interleaving of two or more languages within a sentence or discourse, is pervasive in multilingual societies, and accurate language models for code-switched text are critical for NLP tasks. State-of-the-art data-intensive neural language models, however, are difficult to train well from the scarce language-labeled code-switched text available. A potential solution is to use deep generative models to synthesize large volumes of realistic code-switched text. Although generative adversarial networks and variational autoencoders can synthesize plausible monolingual text from a continuous latent space, they cannot adequately address code-switched text, owing to its informal style and the complex interplay between the constituent languages. We introduce VACS, a novel variational autoencoder architecture specifically tailored to code-switching phenomena. VACS encodes to and decodes from a two-level hierarchical representation, which models syntactic contextual signals in the lower level and language-switching signals in the upper level. Sampling representations from the prior and decoding them produces well-formed, diverse code-switched sentences. Extensive experiments show that combining synthetic code-switched text with natural monolingual data yields a significant (33.06%) drop in perplexity.
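
The two-level hierarchy can be illustrated with a minimal PyTorch sketch in which a lower-level latent captures syntactic/content signals and an upper-level latent, conditioned on it, captures switching signals. The bag-of-words input, the dimensions, and the single-sample reparameterization are assumptions for illustration, not VACS's exact architecture.

```python
import torch
import torch.nn as nn

class TwoLevelVAE(nn.Module):
    def __init__(self, vocab_size=1000, h=128, z_syn=32, z_switch=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(vocab_size, h), nn.ReLU())
        self.syn_head = nn.Linear(h, 2 * z_syn)            # lower level: syntactic/content signal
        self.switch_head = nn.Linear(z_syn, 2 * z_switch)  # upper level: language-switching signal
        self.dec = nn.Sequential(nn.Linear(z_syn + z_switch, h), nn.ReLU(),
                                 nn.Linear(h, vocab_size))

    @staticmethod
    def reparam(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return z, mu, logvar

    def forward(self, bow):                                 # bow: (batch, vocab_size) counts
        h = self.enc(bow)
        z_syn, mu1, lv1 = self.reparam(self.syn_head(h))
        z_sw, mu2, lv2 = self.reparam(self.switch_head(z_syn))
        logits = self.dec(torch.cat([z_syn, z_sw], dim=-1))
        kl = (-0.5 * (1 + lv1 - mu1.pow(2) - lv1.exp()).sum(-1)
              - 0.5 * (1 + lv2 - mu2.pow(2) - lv2.exp()).sum(-1))
        return logits, kl

model = TwoLevelVAE()
logits, kl = model(torch.rand(4, 1000))
print(logits.shape, kl.shape)
```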


Author(s):  
Ridhi Arora ◽  
Vipul Bansal ◽  
Himanshu Buckchash ◽  
Rahul Kumar ◽  
Vinodh J Sahayasheela ◽  
...  

According to the WHO, COVID-19 is an infectious disease with a significant social and economic impact. The main challenge in fighting this disease is its scale: with the outbreak, medical facilities are overstretched and unable to accommodate the piling cases, so a quick diagnosis system is required. To this end, a stochastic deep learning model is proposed. The main idea is to constrain the deep representations over a Gaussian prior to reinforce discriminability in the feature space. The model can work on chest X-ray or CT-scan images, provides a fast diagnosis of COVID-19, and can scale seamlessly. This work presents a comprehensive evaluation of previously proposed approaches for X-ray-based disease diagnosis. Our approach works by learning a latent space over the X-ray image distribution from an ensemble of state-of-the-art convolutional nets, and then linearly regressing the predictions from an ensemble of classifiers that take the latent vector as input. We experimented with publicly available datasets having three classes: COVID-19, normal, and pneumonia. Moreover, for robust evaluation, experiments were performed on a large chest X-ray dataset with five different, very similar diseases. Extensive empirical evaluation shows how the proposed approach advances the state of the art.
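
A rough scikit-learn sketch of the described pipeline, built entirely from stand-ins: random vectors play the role of the convolutional-net embeddings, PCA plays the role of the Gaussian-constrained latent encoder, and a logistic-regression combiner plays the role of the linear model stacked over the ensemble's predictions. All components and sizes are illustrative assumptions, not the authors' model.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 512))           # stand-in for ensemble conv-net features
y = rng.integers(0, 3, size=300)          # 3 classes: COVID-19 / normal / pneumonia

latent = PCA(n_components=32).fit_transform(X)        # stand-in for the Gaussian-like latent space

ensemble = [RandomForestClassifier(n_estimators=50, random_state=0),
            SVC(probability=True, random_state=0),
            LogisticRegression(max_iter=1000)]
probas = [clf.fit(latent, y).predict_proba(latent) for clf in ensemble]

stacker = LogisticRegression(max_iter=1000)           # linear combiner over the ensemble's predictions
stacker.fit(np.hstack(probas), y)
print("train accuracy:", stacker.score(np.hstack(probas), y))
```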


Author(s):  
Dazhong Shen ◽  
Chuan Qin ◽  
Chao Wang ◽  
Hengshu Zhu ◽  
Enhong Chen ◽  
...  

As one of the most popular generative models, the Variational Autoencoder (VAE) approximates the posterior of latent variables via amortized variational inference. However, when the decoder network is sufficiently expressive, the VAE may suffer from posterior collapse; that is, uninformative latent representations may be learned. To this end, in this paper we propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space, so that representations can be learned in a meaningful and compact manner. Specifically, we first demonstrate theoretically that controlling the distribution of the posterior's parameters across the whole dataset yields a latent space with high diversity and low uncertainty. Then, without introducing new loss terms or modifying the training strategy, we propose to exploit Dropout on the variances and Batch-Normalization on the means simultaneously to regularize their distributions implicitly. Furthermore, to evaluate the generalization of the approach, we also apply DU-VAE to the inverse autoregressive flow-based VAE (VAE-IAF). Finally, extensive experiments on three benchmark datasets clearly show that our approach outperforms state-of-the-art baselines on both likelihood estimation and downstream classification tasks.
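
The regularization idea translates into a very small PyTorch sketch: Batch-Normalization is applied to the posterior means and Dropout to the posterior variances before reparameterization. Layer sizes and the variance parameterization are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DUVAEEncoder(nn.Module):
    def __init__(self, x_dim=784, h_dim=256, z_dim=32, p_drop=0.5):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu_head = nn.Linear(h_dim, z_dim)
        self.logvar_head = nn.Linear(h_dim, z_dim)
        self.mu_bn = nn.BatchNorm1d(z_dim)       # Batch-Normalization on the means
        self.var_dropout = nn.Dropout(p_drop)    # Dropout on the variances

    def forward(self, x):
        h = self.body(x)
        mu = self.mu_bn(self.mu_head(h))
        var = self.var_dropout(self.logvar_head(h).exp())
        z = mu + torch.sqrt(var + 1e-8) * torch.randn_like(mu)   # reparameterization
        return z, mu, var

enc = DUVAEEncoder()
z, mu, var = enc(torch.rand(16, 784))
print(z.shape)
```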


2022 ◽  
Vol 16 (2) ◽  
pp. 1-37
Author(s):  
Hangbin Zhang ◽  
Raymond K. Wong ◽  
Victor W. Chu

E-commerce platforms heavily rely on automatic personalized recommender systems, e.g., collaborative filtering models, to improve customer experience. Several hybrid models have been proposed recently to address the deficiencies of existing models, but their performance drops significantly when the dataset is sparse. Most recent works have failed to fully address this shortcoming; at best, some of them alleviate the problem by considering either user-side or item-side content information. In this article, we propose a novel recommender model called Hybrid Variational Autoencoder (HVAE) to improve performance on sparse datasets. Different from existing approaches, we encode both user and item information into a latent space for semantic relevance measurement. In parallel, we utilize collaborative filtering to find the implicit factors of users and items, and combine the two outputs to deliver a hybrid solution. In addition, we compare the performance of the Gaussian distribution and the multinomial distribution in learning representations of the textual data. Our experimental results show that HVAE significantly and robustly outperforms state-of-the-art models.
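
A minimal sketch of the hybrid idea, under stated assumptions: user and item text features are encoded into a shared latent space (HVAE uses variational encoders; plain linear layers stand in here), collaborative-filtering embeddings capture the implicit factors, and the two scores are simply added. The fusion-by-addition and all dimensions are illustrative choices.

```python
import torch
import torch.nn as nn

class HybridRecommender(nn.Module):
    def __init__(self, n_users, n_items, text_dim=300, z_dim=32):
        super().__init__()
        self.user_cf = nn.Embedding(n_users, z_dim)      # implicit CF factors
        self.item_cf = nn.Embedding(n_items, z_dim)
        self.user_enc = nn.Linear(text_dim, z_dim)       # content encoders (variational in HVAE)
        self.item_enc = nn.Linear(text_dim, z_dim)

    def forward(self, u, i, u_text, i_text):
        cf_score = (self.user_cf(u) * self.item_cf(i)).sum(-1)
        content_score = (self.user_enc(u_text) * self.item_enc(i_text)).sum(-1)
        return cf_score + content_score                  # hybrid prediction

model = HybridRecommender(n_users=100, n_items=200)
score = model(torch.tensor([3]), torch.tensor([7]),
              torch.rand(1, 300), torch.rand(1, 300))
print(score)
```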


2019 ◽  
Author(s):  
Gabriel O. Ramos ◽  
Ana L. C. Bazzan ◽  
Bruno C. Da Silva

Traffic congestion presents a major challenge in large cities. Considering the distributed, self-interested nature of traffic, we tackle congestion using multiagent reinforcement learning (MARL). In this thesis, we advance the state of the art by delivering the first MARL convergence guarantees in congestion-like problems. We introduce an algorithm through which drivers can learn optimal routes by locally estimating the regret associated with their decisions, and we prove that it converges to an equilibrium. In order to mitigate the effects of selfishness, we also devise a decentralised tolling scheme, which we prove minimises traffic congestion levels. Our theoretical results are supported by an extensive empirical evaluation on realistic traffic networks.
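
A toy Python sketch of regret-based route choice on a two-route network: each driver keeps a local regret estimate per route and chooses routes with probability proportional to positive regret. The congestion cost function, learning rate, and update rule are generic assumptions, not the thesis's exact algorithm or its convergence-guaranteed variant.

```python
import random

ROUTES = [0, 1]
N_DRIVERS = 100

def travel_cost(route, load):                 # simple linear congestion cost (assumed)
    free_flow = [10.0, 15.0]
    return free_flow[route] + 0.1 * load

regrets = [[0.0, 0.0] for _ in range(N_DRIVERS)]

for _ in range(200):
    # each driver picks a route with probability proportional to positive regret
    choices = []
    for r in regrets:
        pos = [max(x, 0.0) for x in r]
        total = sum(pos)
        p = [x / total for x in pos] if total > 0 else [0.5, 0.5]
        choices.append(0 if random.random() < p[0] else 1)
    loads = [choices.count(rt) for rt in ROUTES]
    costs = [travel_cost(rt, loads[rt]) for rt in ROUTES]
    # local regret update: how much better the alternative route would have been
    for d, c in enumerate(choices):
        other = 1 - c
        regrets[d][other] += 0.1 * ((costs[c] - costs[other]) - regrets[d][other])

print("final loads:", loads, "costs:", [round(c, 1) for c in costs])
```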


Author(s):  
Rémy Portelas ◽  
Cédric Colas ◽  
Lilian Weng ◽  
Katja Hofmann ◽  
Pierre-Yves Oudeyer

Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization, and to solve sparse reward problems, among others. To do so, ACL mechanisms can act on many aspects of learning problems: they can optimize domain randomization for Sim2Real transfer, organize task presentations in multi-task robotic settings, order sequences of opponents in multi-agent scenarios, etc. The ambition of this work is twofold: (1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature, and (2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas.
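
As a purely generic illustration of the ACL principle (not any specific method from the survey), the sketch below samples tasks whose recent success rate is closest to 50%, i.e., tasks that are "just hard enough" for the current agent. The scoring rule, constants, and the simulated improvement curve are all assumptions.

```python
import random

tasks = {f"task_{i}": [] for i in range(10)}      # per-task history of successes (0/1)

def difficulty_match_score(history, window=20):
    recent = history[-window:]
    if not recent:
        return 1.0                                # unexplored tasks get priority
    success = sum(recent) / len(recent)
    return 1.0 - abs(success - 0.5) * 2           # prefer ~50% success rate

def sample_task():
    scores = {t: difficulty_match_score(h) + 1e-3 for t, h in tasks.items()}
    r, acc = random.uniform(0, sum(scores.values())), 0.0
    for t, s in scores.items():
        acc += s
        if r <= acc:
            return t
    return t

for step in range(500):                           # fake training loop
    t = sample_task()
    tasks[t].append(int(random.random() < 0.3 + 0.001 * step))  # agent slowly improves
print({t: round(difficulty_match_score(h), 2) for t, h in tasks.items()})
```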


2021 ◽  
Vol 15 (3) ◽  
pp. 1-35
Author(s):  
Muhammad Anis Uddin Nasir ◽  
Cigdem Aslay ◽  
Gianmarco De Francisci Morales ◽  
Matteo Riondato

“Perhaps he could dance first and think afterwards, if it isn’t too much to ask him.” (S. Beckett, Waiting for Godot)

Given a labeled graph, the collection of k-vertex induced connected subgraph patterns that appear in the graph more frequently than a user-specified minimum threshold provides a compact summary of the characteristics of the graph, and finds applications ranging from biology to network science. However, finding these patterns is challenging, even more so for dynamic graphs that evolve over time, due to the streaming nature of the input and the exponential time complexity of the problem. We study this task in both incremental and fully-dynamic streaming settings, where arbitrary edges can be added to or removed from the graph. We present TipTap, a suite of algorithms to compute high-quality approximations of the frequent k-vertex subgraphs w.r.t. a given threshold, at any time (i.e., at any point of the stream), with high probability. In contrast to existing state-of-the-art solutions, which require iterating over the entire set of subgraphs in the vicinity of the updated edge, TipTap operates by efficiently maintaining a uniform sample of connected k-vertex subgraphs, thanks to an optimized neighborhood-exploration procedure. We provide a theoretical analysis of the proposed algorithms in terms of their unbiasedness and of the sample size needed to obtain a desired approximation quality. Our analysis relies on sample-complexity bounds that use the Vapnik–Chervonenkis dimension, a key concept from statistical learning theory, which allows us to derive a sufficient sample size that is independent of the size of the graph. The results of our empirical evaluation demonstrate that TipTap returns high-quality results more efficiently and accurately than existing baselines.
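
The sampling principle can be illustrated with a heavily simplified sketch: maintain a fixed-size uniform reservoir of items from a stream, and estimate frequencies from the sample at any point of the stream. Sampling whole edges here, rather than connected k-vertex subgraphs with TipTap's neighborhood-exploration procedure, is a deliberate simplification for illustration.

```python
import random

def reservoir_stream(stream, sample_size):
    """Classic reservoir sampling: a uniform sample of the items seen so far."""
    sample = []
    for t, item in enumerate(stream, start=1):
        if len(sample) < sample_size:
            sample.append(item)
        elif random.random() < sample_size / t:       # keep the t-th item with prob k/t
            sample[random.randrange(sample_size)] = item
    return sample

edges = [(random.randrange(50), random.randrange(50)) for _ in range(10_000)]
sample = reservoir_stream(edges, sample_size=200)

# estimate the number of edges touching vertex 0 from the sample
est = sum(1 for u, v in sample if 0 in (u, v)) / len(sample) * len(edges)
true = sum(1 for u, v in edges if 0 in (u, v))
print(f"estimated {est:.0f} vs true {true}")
```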


Algorithms ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 226
Author(s):  
Wenzel Pilar von Pilchau ◽  
Anthony Stein ◽  
Jörg Hähner

State-of-the-art deep reinforcement learning algorithms such as DQN and DDPG use a replay buffer, known as Experience Replay. By default, this buffer contains only the experiences gathered during the run. We propose a method called Interpolated Experience Replay that uses the stored (real) transitions to create synthetic ones that assist the learner. In this first approach to the field, we limit ourselves to discrete and non-deterministic environments and use a simple, equally weighted average of the rewards in combination with observed follow-up states. We demonstrate a significantly improved overall mean performance compared to a DQN with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
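
A minimal sketch of the interpolation idea under stated assumptions: stored transitions are grouped by (state, action), their rewards are averaged with equal weights, and each average is paired with an observed follow-up state to form a synthetic transition. The data layout and the random toy buffer are illustrative choices, not the authors' implementation.

```python
import random
from collections import defaultdict

# toy buffer of (state, action, reward, next_state) tuples for a small discrete env
real_buffer = [(random.randrange(8), random.randrange(4),
                random.random(), random.randrange(8))
               for _ in range(1000)]

groups = defaultdict(list)
for s, a, r, s2 in real_buffer:
    groups[(s, a)].append((r, s2))

synthetic = []
for (s, a), outcomes in groups.items():
    avg_reward = sum(r for r, _ in outcomes) / len(outcomes)   # equally weighted average
    for _, s2 in outcomes:                                     # pair with observed follow-up states
        synthetic.append((s, a, avg_reward, s2))

training_buffer = real_buffer + synthetic                      # the learner samples from both
print(len(real_buffer), "real +", len(synthetic), "synthetic transitions")
```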

