Unsupervised Learning of KB Queries in Task-Oriented Dialogs

Abstract: Task-oriented dialog (TOD) systems often need to formulate knowledge base (KB) queries corresponding to the user intent and use the query results to generate system responses. Existing approaches require dialog datasets to explicitly annotate these KB queries; such annotations can be time-consuming and expensive. In response, we define the novel problems of predicting the KB query and training the dialog agent without explicit KB query annotation. For query prediction, we propose a reinforcement learning (RL) baseline, which rewards the generation of those queries whose KB results cover the entities mentioned in subsequent dialog. Further analysis reveals that correlation among query attributes in the KB can significantly confuse memory augmented policy optimization (MAPO), an existing state-of-the-art RL agent. To address this, we improve the MAPO baseline with simple but important modifications suited to our task. To train the full TOD system for our setting, we propose a pipelined approach: it independently predicts when to make a KB query (query position predictor), then predicts a KB query at the predicted position (query predictor), and uses the results of the predicted query in subsequent dialog (next response predictor). Overall, our work proposes the first solutions to our novel problem, and our analysis highlights the research challenges in training TOD systems without query annotation.
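The reward idea in this abstract (favoring queries whose KB results cover entities mentioned later in the dialog) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy `KB`, the exact-match `run_query` helper, and the choice to score coverage as a simple fraction are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical toy knowledge base; the attribute names are illustrative only.
KB: List[Row] = [
    {"name": "Pizza Hut", "cuisine": "italian", "area": "north"},
    {"name": "Curry Garden", "cuisine": "indian", "area": "north"},
    {"name": "Prezzo", "cuisine": "italian", "area": "south"},
]


def run_query(kb: List[Row], query: Dict[str, str]) -> List[Row]:
    """Return the rows whose attributes match every constraint in the query."""
    return [row for row in kb if all(row.get(k) == v for k, v in query.items())]


def coverage_reward(results: List[Row], future_entities: List[str]) -> float:
    """Fraction of later-mentioned entities that appear in the query results.

    Assumption: coverage is scored as a plain fraction in [0, 1].
    """
    if not future_entities:
        return 0.0
    result_values = {value for row in results for value in row.values()}
    covered = sum(1 for entity in future_entities if entity in result_values)
    return covered / len(future_entities)


if __name__ == "__main__":
    later_mentions = ["Prezzo", "south"]
    for candidate in ({"cuisine": "italian"}, {"cuisine": "indian"}):
        reward = coverage_reward(run_query(KB, candidate), later_mentions)
        print(candidate, reward)
```

In this toy setup the broad query `{"cuisine": "italian"}` earns the same reward as the more specific `{"cuisine": "italian", "area": "south"}`, which hints at why a coverage signal alone may leave the agent unsure which attributes belong in the query.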

Proximal policy optimization with model-based methods

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-211935 ◽ 2022 ◽ pp. 1-12

Author(s): Shuailong Li ◽ Wei Zhang ◽ Huiwen Zhang ◽ Xin Zhang ◽ Yuquan Leng

Keyword(s): Reinforcement Learning ◽ State Of The Art ◽ Transition Model ◽ Practical Applications ◽ Original Algorithm ◽ Policy Performance ◽ Model Based ◽ Model Free ◽ Future State ◽ Policy Optimization

Model-free reinforcement learning methods have been applied successfully to practical applications such as decision-making problems in Atari games. However, these methods have inherent shortcomings, such as high variance and low sample efficiency. To improve the policy performance and sample efficiency of model-free reinforcement learning, we propose proximal policy optimization with model-based methods (PPOMM), which fuses model-based and model-free reinforcement learning. PPOMM considers not only the information of past experience but also the predicted information of the future state. It adds the information of the next state to the objective function of the proximal policy optimization (PPO) algorithm through a model-based method. The method uses two components to optimize the policy: the error of PPO and the error of model-based reinforcement learning. We use the latter to optimize a latent transition model and predict the information of the next state. We evaluate the method across 49 Atari games in the Arcade Learning Environment (ALE), where it outperforms the state-of-the-art PPO algorithm in most games. The experimental results show that PPOMM performs as well as or better than the original algorithm in 33 games.
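The two-component objective described in this abstract can be sketched in plain Python. The clipped surrogate below is the standard PPO form; everything else is an assumption made for illustration, since the abstract does not give the exact formulation: the model error is taken to be a mean squared error on next-state predictions, and the hypothetical `model_weight` parameter controls how the two terms are combined.

```python
from typing import Sequence


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_squared_error(
    predicted: Sequence[Sequence[float]], actual: Sequence[Sequence[float]]
) -> float:
    """Mean squared error over all components of all state vectors."""
    errors = [
        (p - a) ** 2
        for p_vec, a_vec in zip(predicted, actual)
        for p, a in zip(p_vec, a_vec)
    ]
    return sum(errors) / len(errors)


def combined_loss(
    ratios: Sequence[float],
    advantages: Sequence[float],
    predicted_next: Sequence[Sequence[float]],
    actual_next: Sequence[Sequence[float]],
    model_weight: float = 1.0,  # assumption: a simple weighted sum of the two errors
    eps: float = 0.2,
) -> float:
    """PPO loss (negated surrogate) plus a weighted next-state prediction error."""
    ppo_loss = -sum(
        clipped_surrogate(r, a, eps) for r, a in zip(ratios, advantages)
    ) / len(ratios)
    model_loss = mean_squared_error(predicted_next, actual_next)
    return ppo_loss + model_weight * model_loss
```

A real implementation would compute both terms on batched tensors and backpropagate through the policy and transition networks; the scalar version here only shows how the two errors could be combined.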

Rebalancing Expanding EV Sharing Systems with Deep Reinforcement Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2020/186 ◽ 2020

Author(s): Man Luo ◽ Wenzhe Zhang ◽ Tianyou Song ◽ Kun Li ◽ Hongming Zhu ◽ ...

Keyword(s): Reinforcement Learning ◽ State Of The Art ◽ Charging Time ◽ Novel Approach ◽ The World ◽ Net Revenue ◽ Multi Agent ◽ User Demand ◽ Policy Optimization ◽ Over Time

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.

A Survey and Taxonomy of Intent-Based Code Search

International Journal of Software Innovation ◽ 10.4018/ijsi.2021010106 ◽ 2021 ◽ Vol 9 (1) ◽ pp. 69-110

Author(s): Shailesh Kumar Shivakumar

Keyword(s): State Of The Art ◽ Source Code ◽ The Novel ◽ User Intent ◽ Code Search ◽ Search Field ◽ Query Type ◽ Novel Concept ◽ Code Understanding ◽ Source Code Search

In this paper, the authors introduce the novel concept of intent-based code search, which categorizes code search goals into a hierarchy. They explore state-of-the-art tools, techniques, and algorithms in source code search and survey the field through its core use cases, such as code reusability, code understanding, and code repair. They propose a user intent-based taxonomy of code search goals. The taxonomy is derived from an in-depth analysis of the code search literature and validated through the authors' own developer survey, conducted as part of this paper. It rests on a logical categorization of code search goals and on the characteristics shared within each category (query type, expected response, and the like). The paper also details the latest trends, surveys code search tools, and discusses the implications for tool design.

Multitask Learning with Knowledge Base for Joint Intent Detection and Slot Filling

Applied Sciences ◽ 10.3390/app11114887 ◽ 2021 ◽ Vol 11 (11) ◽ pp. 4887

Author(s): Ting He ◽ Xiaohong Xu ◽ Yating Wu ◽ Huazhen Wang ◽ Jian Chen

Keyword(s): Knowledge Base ◽ Resource Sharing ◽ Short Term Memory ◽ State Of The Art ◽ Detection System ◽ Joint Model ◽ Detection Accuracy ◽ Dialog Systems ◽ Task Oriented ◽ Slot Filling

Intent detection and slot filling are important modules in task-oriented dialog systems. To make full use of the relationship between these modules, share resources between them, and address the lack of semantics, this paper proposes a multitask learning system for intent detection, based on a joint model that combines a knowledge base with slot filling. The approach shares information and rich external utility between the intent and slot modules in a three-part process. First, the model obtains shared parameters and features between the two modules using long short-term memory and convolutional neural networks. Second, a knowledge base is introduced into the model to improve its performance. Finally, a weighted-loss function is built to optimize the joint model. Experimental results demonstrate that our model outperforms state-of-the-art algorithms on the benchmark Airline Travel Information System (ATIS) dataset and the Snips dataset. On ATIS, the joint model improves intent-detection accuracy by 1.33% and the slot-filling F value by 0.94%; on Snips, the corresponding improvements are 0.19% and 0.31%.
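A weighted loss for a joint intent and slot model can be sketched as follows. The abstract does not state the actual loss, so this is only one plausible shape: cross-entropy is assumed for both tasks, the slot term is averaged over tokens, and the hypothetical `intent_weight` parameter balances the two terms.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])


def joint_loss(
    intent_probs: Sequence[float],
    intent_label: int,
    slot_probs: Sequence[Sequence[float]],
    slot_labels: Sequence[int],
    intent_weight: float = 0.5,  # assumption: a convex combination of the two losses
) -> float:
    """Weighted sum of the intent loss and the mean per-token slot loss."""
    intent_term = cross_entropy(intent_probs, intent_label)
    slot_term = sum(
        cross_entropy(p, y) for p, y in zip(slot_probs, slot_labels)
    ) / len(slot_labels)
    return intent_weight * intent_term + (1.0 - intent_weight) * slot_term
```

Setting `intent_weight` closer to 1.0 makes the optimizer prioritize intent detection, while values closer to 0.0 favor slot filling; in practice the weight would be tuned on held-out data.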

Knowledge Base Question Answering with Topic Units

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2019/701 ◽ 2019

Author(s): Yunshi Lan ◽ Shuohang Wang ◽ Jing Jiang

Keyword(s): Natural Language Processing ◽ Reinforcement Learning ◽ Knowledge Base ◽ Language Processing ◽ Question Answering ◽ State Of The Art ◽ Entity Linking ◽ Named Entities ◽ Benchmark Datasets ◽ Previous State

Knowledge base question answering (KBQA) is an important task in natural language processing. Existing methods for KBQA usually start with entity linking, which considers mostly named entities found in a question as the starting points in the KB to search for answers to the question. However, relying only on entity linking to look for answer candidates may not be sufficient. In this paper, we propose to perform topic unit linking where topic units cover a wider range of units of a KB. We use a generation-and-scoring approach to gradually refine the set of topic units. Furthermore, we use reinforcement learning to jointly learn the parameters for topic unit linking and answer candidate ranking in an end-to-end manner. Experiments on three commonly used benchmark datasets show that our method consistently works well and outperforms the previous state of the art on two datasets.

Toward Self-Driving Bicycles Using State-of-the-Art Deep Reinforcement Learning Algorithms

Symmetry ◽ 10.3390/sym11020290 ◽ 2019 ◽ Vol 11 (2) ◽ pp. 290 ◽ Cited By ~ 4

Author(s): SeungYoon Choi ◽ Tuyen Le ◽ Quang Nguyen ◽ Md Layek ◽ SeungGwan Lee ◽ ...

Keyword(s): Reinforcement Learning ◽ Deep Neural Network ◽ Learning Algorithm ◽ State Of The Art ◽ The Other ◽ Gradient Algorithm ◽ Reward Function ◽ Policy Gradient ◽ Policy Optimization ◽ Start Location

In this paper, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm. We use a reward function and a deep neural network to build the controller. By using the proposed controller, a bicycle can not only be stably balanced but also travel to any specified location. We confirm that the controller with DDPG shows better performance than the other baselines such as Normalized Advantage Function (NAF) and Proximal Policy Optimization (PPO). For the performance evaluation, we implemented the proposed algorithm in various settings such as fixed and random speed, start location, and destination location.

Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning

Sensors ◽ 10.3390/s21041278 ◽ 2021 ◽ Vol 21 (4) ◽ pp. 1278

Author(s): Jiang Hua ◽ Liangcai Zeng ◽ Gongfa Li ◽ Zhaojie Ju

Keyword(s): Reinforcement Learning ◽ Transfer Learning ◽ Robot Control ◽ Learning Transfer ◽ State Of The Art ◽ Imitation Learning ◽ Future Research ◽ Intelligent Robot ◽ Dexterous Manipulation ◽ Research Challenges

Dexterous manipulation is an important part of realizing robot intelligence, but manipulators can currently perform only simple tasks, such as sorting and packing, in a structured environment. To address this problem, this paper presents a state-of-the-art survey of intelligent robots capable of autonomous decision-making and learning. The paper first reviews the main achievements and research in robotics, which were based mainly on breakthroughs in automatic control and mechanical hardware. With the evolution of artificial intelligence, much research has made further progress in adaptive and robust control. The survey reveals that the latest research in deep learning and reinforcement learning has paved the way for robots to perform highly complex tasks. Furthermore, deep reinforcement learning, imitation learning, and transfer learning in robot control are discussed in detail. Finally, major achievements based on these methods are summarized and analyzed thoroughly, and future research challenges are proposed.

Legislation to Create a State-of-the-Art Substance Abuse Treatment and Training Facility Introduced in the New Mexico Legislature

PsycEXTRA Dataset ◽ 10.1037/e522062010-004 ◽ 2007

Keyword(s): Substance Abuse ◽ New Mexico ◽ Substance Abuse Treatment ◽ State Of The Art ◽ Training Facility ◽ Abuse Treatment ◽ And Training

BiOSS® Clinical Programme – State Of The Art – BiOSS LIM® Stent – The Novel Option For Coronary Bifurcation Treatment

Interventional Cardiology Review ◽ 10.15420/icr.2015.10.2.s1 ◽ 2015 ◽ Vol 10 (2) ◽ pp. 1

Author(s): Jacek Bil ◽ Robert J Gil ◽ Dobrin Vassilev

Keyword(s): State Of The Art ◽ The Novel ◽ Coronary Bifurcation

Plant-Based Alternatives to Yogurt: State-of-the-Art and Perspectives of New Biotechnological Challenges

Foods ◽ 10.3390/foods10020316 ◽ 2021 ◽ Vol 10 (2) ◽ pp. 316

Author(s): Marco Montemurro ◽ Erica Pontonio ◽ Rossana Coda ◽ Carlo Giuseppe Rizzello

Keyword(s): Lactic Acid ◽ Lactic Acid Bacteria ◽ State Of The Art ◽ Protein Digestibility ◽ The Novel ◽ Sensory Features ◽ Long Time ◽ Biotechnological Processes ◽ Nutritional Compounds ◽ Increasing Demand

Due to the increasing demand for milk alternatives, related to both health and ethical needs, plant-based yogurt-like products have been widely explored in recent years. With the main goal of obtaining snacks similar to conventional yogurt in terms of textural and sensory properties and the ability to host viable lactic acid bacteria during long-term storage, several plant-derived ingredients (e.g., cereals, pseudocereals, legumes, and fruits) as well as technological solutions (e.g., enzymatic and thermal treatments) have been investigated. The central role of fermentation in yogurt-like production has led to the selection of specific lactic acid bacteria strains as starters to guarantee optimal textural (e.g., through the synthesis of exopolysaccharides), nutritional (high protein digestibility and low content of anti-nutritional compounds), and functional (synthesis of bioactive compounds) features of the products. This review provides an overview of novel insights into fermented yogurt-like products. It describes the state of the art on the use of unconventional ingredients, on traditional and innovative biotechnological processes, and on the effects of fermentation on the textural, nutritional, functional, and sensory features and the shelf life of the products. The supplementation of prebiotics and probiotics and the related health effects are also reviewed.
