Unsupervised Learning of KB Queries in Task-Oriented Dialogs

2021 ◽  
Vol 9 ◽  
pp. 374-390
Author(s):  
Dinesh Raghu ◽  
Nikhil Gupta ◽  
Mausam

Abstract Task-oriented dialog (TOD) systems often need to formulate knowledge base (KB) queries corresponding to the user intent and use the query results to generate system responses. Existing approaches require dialog datasets to explicitly annotate these KB queries; these annotations can be time-consuming and expensive. In response, we define the novel problems of predicting the KB query and training the dialog agent without explicit KB query annotation. For query prediction, we propose a reinforcement learning (RL) baseline, which rewards the generation of those queries whose KB results cover the entities mentioned in subsequent dialog. Further analysis reveals that correlation among query attributes in the KB can significantly confuse memory augmented policy optimization (MAPO), an existing state-of-the-art RL agent. To address this, we improve the MAPO baseline with simple but important modifications suited to our task. To train the full TOD system for our setting, we propose a pipelined approach: it independently predicts when to make a KB query (query position predictor), then predicts a KB query at the predicted position (query predictor), and uses the results of the predicted query in subsequent dialog (next response predictor). Overall, our work proposes first solutions to our novel problem, and our analysis highlights the research challenges in training TOD systems without query annotation.
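The reward idea in this abstract (favoring queries whose KB results cover entities mentioned later in the dialog) can be illustrated with a minimal sketch. The toy KB, the exact-match query semantics, and the names `run_query` and `coverage_reward` are all assumptions made for illustration; they are not taken from the paper, whose actual reward and MAPO modifications may differ.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def run_query(kb: List[Row], query: Dict[str, str]) -> List[Row]:
    """Return the KB rows that match every attribute constraint in the query."""
    return [row for row in kb if all(row.get(k) == v for k, v in query.items())]


def coverage_reward(kb: List[Row], query: Dict[str, str],
                    later_entities: Iterable[str]) -> float:
    """Fraction of entities from the subsequent dialog found in the query results."""
    targets = set(later_entities)
    if not targets:
        return 0.0
    returned = {value for row in run_query(kb, query) for value in row.values()}
    return len(targets & returned) / len(targets)


# Hypothetical toy KB and dialog entities, for illustration only.
kb = [
    {"name": "Pizza Hut", "cuisine": "italian", "area": "north"},
    {"name": "Curry House", "cuisine": "indian", "area": "north"},
    {"name": "Roma", "cuisine": "italian", "area": "south"},
]
later = ["Roma", "south"]

print(coverage_reward(kb, {"cuisine": "italian"}, later))  # 1.0
print(coverage_reward(kb, {"area": "south"}, later))       # 1.0
print(coverage_reward(kb, {"area": "north"}, later))       # 0.0
```

In this toy setup two different queries earn the same full reward, which is one plausible way that correlated KB attributes could make the learning signal ambiguous.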

2022 ◽  
pp. 1-12
Author(s):  
Shuailong Li ◽  
Wei Zhang ◽  
Huiwen Zhang ◽  
Xin Zhang ◽  
Yuquan Leng

Model-free reinforcement learning methods have successfully been applied to practical applications such as decision-making problems in Atari games. However, these methods have inherent shortcomings, such as high variance and low sample efficiency. To improve the policy performance and sample efficiency of model-free reinforcement learning, we propose proximal policy optimization with model-based methods (PPOMM), a fusion method of both model-based and model-free reinforcement learning. PPOMM considers not only the information of past experience but also the prediction information of the future state. PPOMM adds the information of the next state to the objective function of the proximal policy optimization (PPO) algorithm through a model-based method. This method uses two components to optimize the policy: the error of PPO and the error of model-based reinforcement learning. We use the latter to optimize a latent transition model and predict the information of the next state. For most games, this method outperforms the state-of-the-art PPO algorithm when we evaluate across 49 Atari games in the Arcade Learning Environment (ALE). The experimental results show that PPOMM performs better than or the same as the original algorithm in 33 games.
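The two-component objective described here can be sketched for a single sample as the standard PPO clipped surrogate loss plus a prediction error for the next state. The mean-squared-error form of the model term, the `model_weight` coefficient, and all function names are assumptions for illustration; the abstract does not specify how PPOMM combines the two errors.

```python
from typing import Sequence


def ppo_clip_loss(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Negated PPO clipped surrogate for one sample (lower is better)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return -min(ratio * advantage, clipped * advantage)


def model_loss(predicted_next: Sequence[float], actual_next: Sequence[float]) -> float:
    """Mean squared error between predicted and observed next-state features."""
    squared = [(p - a) ** 2 for p, a in zip(predicted_next, actual_next)]
    return sum(squared) / len(squared)


def combined_loss(ratio: float, advantage: float,
                  predicted_next: Sequence[float], actual_next: Sequence[float],
                  model_weight: float = 0.5, eps: float = 0.2) -> float:
    """Policy error plus weighted model error (the weighting is an assumption)."""
    policy_term = ppo_clip_loss(ratio, advantage, eps)
    return policy_term + model_weight * model_loss(predicted_next, actual_next)


# Hypothetical numbers: the probability ratio 1.5 is clipped to 1.2.
loss = combined_loss(ratio=1.5, advantage=2.0,
                     predicted_next=[1.0, 2.0], actual_next=[1.0, 4.0])
print(round(loss, 6))
```

With these inputs the policy term is about -2.4 and the model term is 2.0, so a weight of 0.5 gives a combined loss of roughly -1.4.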


Author(s):  
Man Luo ◽  
Wenzhe Zhang ◽  
Tianyou Song ◽  
Kun Li ◽  
Hongming Zhu ◽  
...  

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.


2021 ◽  
Vol 9 (1) ◽  
pp. 69-110
Author(s):  
Shailesh Kumar Shivakumar

In this paper, the authors introduce the novel concept of intent-based code search, which categorizes code search goals into a hierarchy. They explore state-of-the-art techniques in source code search, covering the related tools, techniques, and algorithms. They survey the code search field through its core use cases, such as code reusability, code understanding, and code repair. They propose a user intent-based taxonomy of code search goals. The taxonomy is derived from a deep analysis of the code search literature and is validated through an exclusive developer survey conducted as part of this paper. The taxonomy is based on a logical categorization of code search goals and on the shared characteristics (query type, expected response, and such) of each category. The paper also details the latest trends, surveys the code search tools, and discusses the implications for tool design.


2021 ◽  
Vol 11 (11) ◽  
pp. 4887
Author(s):  
Ting He ◽  
Xiaohong Xu ◽  
Yating Wu ◽  
Huazhen Wang ◽  
Jian Chen

Intent detection and slot filling are important modules in task-oriented dialog systems. To make full use of the relationship between the modules and of resource sharing, and to address the lack of semantics, this paper proposes a multi-task learning intent-detection system based on a knowledge base and a joint slot-filling model. The approach shares information and rich external utility between the intent and slot modules in a three-part process. First, the model obtains shared parameters and features between the two modules based on long short-term memory and convolutional neural networks. Second, a knowledge base is introduced into the model to improve its performance. Finally, a weighted-loss function is built to optimize the joint model. Experimental results demonstrate that our model achieves better performance than state-of-the-art algorithms on the benchmark Airline Travel Information System (ATIS) dataset and the Snips dataset. Our joint model achieves state-of-the-art results on the ATIS dataset with a 1.33% intent-detection accuracy improvement and a 0.94% slot-filling F value improvement, and with 0.19% and 0.31% improvements, respectively, on the Snips dataset.
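A weighted loss for a joint intent and slot model can be sketched as a convex combination of an utterance-level intent loss and a token-averaged slot loss. The cross-entropy terms, the single `intent_weight` parameter, and the function names below are assumptions for illustration; the abstract does not give the paper's actual weighting scheme.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], gold: int) -> float:
    """Negative log-probability assigned to the gold label."""
    return -math.log(probs[gold])


def joint_loss(intent_probs: Sequence[float], gold_intent: int,
               slot_probs: Sequence[Sequence[float]], gold_slots: Sequence[int],
               intent_weight: float = 0.5) -> float:
    """Weighted sum of the intent loss and the mean per-token slot loss."""
    intent_term = cross_entropy(intent_probs, gold_intent)
    slot_terms = [cross_entropy(p, g) for p, g in zip(slot_probs, gold_slots)]
    slot_term = sum(slot_terms) / len(slot_terms)
    return intent_weight * intent_term + (1.0 - intent_weight) * slot_term


# Hypothetical predictions for a two-token utterance with two intents and two slot tags.
example = joint_loss(
    intent_probs=[0.5, 0.5], gold_intent=0,
    slot_probs=[[1.0, 0.0], [0.5, 0.5]], gold_slots=[0, 1],
)
print(round(example, 4))
```

Raising `intent_weight` toward 1.0 makes the optimizer prioritize intent accuracy over slot tagging, and lowering it does the opposite.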


Author(s):  
Yunshi Lan ◽  
Shuohang Wang ◽  
Jing Jiang

Knowledge base question answering (KBQA) is an important task in natural language processing. Existing methods for KBQA usually start with entity linking, which considers mostly named entities found in a question as the starting points in the KB to search for answers to the question. However, relying only on entity linking to look for answer candidates may not be sufficient. In this paper, we propose to perform topic unit linking where topic units cover a wider range of units of a KB. We use a generation-and-scoring approach to gradually refine the set of topic units. Furthermore, we use reinforcement learning to jointly learn the parameters for topic unit linking and answer candidate ranking in an end-to-end manner. Experiments on three commonly used benchmark datasets show that our method consistently works well and outperforms the previous state of the art on two datasets.


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 290 ◽  
Author(s):  
SeungYoon Choi ◽  
Tuyen Le ◽  
Quang Nguyen ◽  
Md Layek ◽  
SeungGwan Lee ◽  
...  

In this paper, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm. We use a reward function and a deep neural network to build the controller. By using the proposed controller, a bicycle can not only be stably balanced but also travel to any specified location. We confirm that the controller with DDPG shows better performance than the other baselines such as Normalized Advantage Function (NAF) and Proximal Policy Optimization (PPO). For the performance evaluation, we implemented the proposed algorithm in various settings such as fixed and random speed, start location, and destination location.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1278
Author(s):  
Jiang Hua ◽  
Liangcai Zeng ◽  
Gongfa Li ◽  
Zhaojie Ju

Dexterous manipulation by robots is an important part of realizing intelligence, but manipulators can only perform simple tasks such as sorting and packing in a structured environment. In view of this problem, this paper presents a state-of-the-art survey on intelligent robots capable of autonomous decision-making and learning. The paper first reviews the main achievements and research in robotics, which were mainly based on breakthroughs in automatic control and mechanical hardware. With the evolution of artificial intelligence, many studies have made further progress in adaptive and robust control. The survey reveals that the latest research in deep learning and reinforcement learning has paved the way for highly complex tasks to be performed by robots. Furthermore, deep reinforcement learning, imitation learning, and transfer learning in robot control are discussed in detail. Finally, major achievements based on these methods are summarized and analyzed thoroughly, and future research challenges are proposed.


Foods ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 316
Author(s):  
Marco Montemurro ◽  
Erica Pontonio ◽  
Rossana Coda ◽  
Carlo Giuseppe Rizzello

Due to the increasing demand for milk alternatives, related to both health and ethical needs, plant-based yogurt-like products have been widely explored in recent years. With the main goal of obtaining snacks similar to conventional yogurt in terms of textural and sensory properties and the ability to host viable lactic acid bacteria during long-term storage, several plant-derived ingredients (e.g., cereals, pseudocereals, legumes, and fruits) as well as technological solutions (e.g., enzymatic and thermal treatments) have been investigated. The central role of fermentation in yogurt-like production led to specific selections of lactic acid bacteria strains to be used as starters to guarantee optimal textural (e.g., through the synthesis of exo-polysaccharides), nutritional (high protein digestibility and low content of anti-nutritional compounds), and functional (synthesis of bioactive compounds) features of the products. This review provides an overview of the novel insights on fermented yogurt-like products. The state of the art on the use of unconventional ingredients, traditional and innovative biotechnological processes, and the effects of fermentation on the textural, nutritional, functional, and sensory features and on the shelf life are described. The supplementation of prebiotics and probiotics and the related health effects are also reviewed.

