Regret Minimisation and System-Efficiency in Route Choice

Traffic congestion presents a major challenge in large cities. Considering the distributed, self-interested nature of traffic, we tackle congestion using multiagent reinforcement learning (MARL). In this thesis, we advance the state-of-the-art by delivering the first MARL convergence guarantees in congestion-like problems. We introduce an algorithm through which drivers can learn optimal routes by locally estimating the regret associated with their decisions, which we prove to converge to an equilibrium. In order to mitigate the effects of selfishness, we also devise a decentralised tolling scheme, which we prove to minimise traffic congestion levels. Our theoretical results are supported by an extensive empirical evaluation on realistic traffic networks.

A Decentralised Approach to Intersection Traffic Management

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/73 ◽

2018 ◽

Cited By ~ 4

Author(s):

Huan Vu ◽

Samir Aknine ◽

Sarvapali D. Ramchurn

Keyword(s):

Quality Of Life ◽

Waiting Time ◽

Traffic Congestion ◽

Traffic Management ◽

State Of The Art ◽

The State ◽

Management Mechanism ◽

Traffic Conditions

Traffic congestion has a significant impact on quality of life and the economy. This paper presents a decentralised traffic management mechanism for intersections using a distributed constraint optimisation (DCOP) approach. Our solution outperforms the state-of-the-art solution both in stable traffic conditions (about 60% reduced waiting time) and in robustness to unpredictable events.

Textual Deblurring using Convolutional Neural Network

10.36227/techrxiv.16760632 ◽

2021 ◽

Author(s):

Muhammad Shahroz Nadeem ◽

Sibt Hussain ◽

Fatih Kurugollu

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Loss Function ◽

State Of The Art ◽

Empirical Evaluation ◽

The State

This paper addresses the textual deblurring problem. We propose a new loss function and provide an empirical evaluation of the design choices, based on which we propose a memory-friendly CNN model that performs better than the state-of-the-art CNN method.

Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016722 ◽

2019 ◽

Vol 33 ◽

pp. 6722-6729 ◽

Cited By ~ 4

Author(s):

Ziming Li ◽

Julia Kiseleva ◽

Maarten De Rijke

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

The State ◽

Experimental Results ◽

Imitation Learning ◽

Local Optimum ◽

Inverse Reinforcement Learning ◽

High Quality ◽

Overall Performance

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate more high-quality responses and achieve higher overall performance than the state-of-the-art.

AsymDPOP: Complete Inference for Asymmetric Distributed Constraint Optimization Problems

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/32 ◽

2019 ◽

Author(s):

Yanchen Deng ◽

Ziyu Chen ◽

Dingding Chen ◽

Wenxin Zhang ◽

Xingqiong Jiang

Keyword(s):

Optimization Problems ◽

State Of The Art ◽

Empirical Evaluation ◽

The State ◽

Constraint Optimization ◽

Memory Consumption ◽

Distributed Constraint Optimization ◽

Constraint Optimization Problems

Asymmetric distributed constraint optimization problems (ADCOPs) are an emerging model for coordinating agents with personal preferences. However, the existing inference-based complete algorithms which use local eliminations cannot be applied to ADCOPs, as the parent agents are required to transfer their private functions to their children. Rather than disclosing private functions explicitly to facilitate local eliminations, we solve the problem by enforcing delayed eliminations and propose AsymDPOP, the first inference-based complete algorithm for ADCOPs. To solve the severe scalability problems incurred by delayed eliminations, we propose to reduce the memory consumption by propagating a set of smaller utility tables instead of a joint utility table, and to reduce the computation efforts by sequential optimizations instead of joint optimizations. The empirical evaluation indicates that AsymDPOP significantly outperforms the state-of-the-art, as well as the vanilla DPOP with PEAV formulation.

Deep Reinforcement Learning Overview of the State of the Art

Journal of Automation Mobile Robotics & Intelligent Systems ◽

10.14313/jamris_3-2018/15 ◽

2018 ◽

Vol 12 (3) ◽

pp. 20-39 ◽

Cited By ~ 2

Author(s):

Youssef Fenjiro ◽

Houda Benbrahim

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

The State

DIFFERENTIAL-REFLECTANCE SPECTROSCOPY AND REFLECTANCE-ANISOTROPY SPECTROSCOPY ON SEMICONDUCTOR SURFACES

Surface Review and Letters ◽

10.1142/s0218625x99000482 ◽

1999 ◽

Vol 06 (03n04) ◽

pp. 517-528 ◽

Cited By ~ 52

Author(s):

P. CHIARADIA ◽

R. DEL SOLE

Keyword(s):

State Of The Art ◽

Reflectance Spectroscopy ◽

The State ◽

Semiconductor Surfaces ◽

Short Account ◽

Reflectance Anisotropy ◽

Reflection Of Light ◽

Theoretical Results

The development and the state of the art of surface methods based on the reflection of light are briefly reviewed. A short account is given of the main experimental and theoretical results obtained on semiconductor surfaces over about three decades.

Comparative Evaluation of Link-Based Approaches for Candidate Ranking in Link-to-Wikipedia Systems

Journal of Artificial Intelligence Research ◽

10.1613/jair.4129 ◽

2014 ◽

Vol 49 ◽

pp. 733-773 ◽

Cited By ~ 6

Author(s):

N. Fernandez Garcia ◽

J. Arias Fisteus ◽

L. Sanchez Fernandez

Keyword(s):

Comparative Evaluation ◽

State Of The Art ◽

Empirical Evaluation ◽

The State ◽

Context Information ◽

Research Attention ◽

Link Information

In recent years, the task of automatically linking pieces of text (anchors) mentioned in a document to Wikipedia articles that represent the meaning of these anchors has received extensive research attention. Typically, link-to-Wikipedia systems try to find a set of Wikipedia articles that are candidates to represent the meaning of the anchor and, later, rank these candidates to select the most appropriate one. In this ranking process the systems rely on context information obtained from the document where the anchor is mentioned and/or from Wikipedia. In this paper we center our attention on the use of Wikipedia links as context information. In particular, we offer a review of several state-of-the-art candidate ranking approaches that rely on Wikipedia link information. In addition, we provide a comparative empirical evaluation of the different approaches on five different corpora: the TAC 2010 corpus and four corpora built from actual Wikipedia articles and news items.

Fragment-based Sequential Translation for Molecular Optimization

10.33774/chemrxiv-2021-fzxmk-v2 ◽

2021 ◽

Author(s):

Benson Chen ◽

Xiang Fu ◽

Regina Barzilay ◽

Tommi Jaakkola

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Empirical Evaluation ◽

Molecular Fragments ◽

Complex Chemical ◽

Latent Space ◽

Variational Autoencoder ◽

Property Space ◽

Molecular Compounds ◽

Do So

Searching for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing frameworks generate molecules one atom at a time. We instead propose a flexible editing paradigm that generates molecules using learned molecular fragments (meaningful substructures of molecules). To do so, we train a variational autoencoder (VAE) to encode molecular fragments in a coherent latent space, which we then utilize as a vocabulary for editing molecules to explore the complex chemical property space. Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties. Empirical evaluation shows that FaST significantly improves over state-of-the-art methods on benchmark single/multi-objective molecular optimization tasks.

Playing First-Person Perspective Games with Deep Reinforcement Learning Using the State-of-the-Art Game-AI Research Platforms

10.1007/978-3-030-77939-9_18 ◽

2021 ◽

pp. 635-667

Author(s):

Adil Khan ◽

Asad Masood Khattak ◽

Muhammad Zubair Asghar ◽

Muhammad Naeem ◽

Aziz Ud Din

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

The State ◽

First Person ◽

Person Perspective ◽

Game Ai ◽

First Person Perspective ◽

Art Game

Preference-based interactive multi-document summarisation

Information Retrieval ◽

10.1007/s10791-019-09367-8 ◽

2019 ◽

Vol 23 (6) ◽

pp. 555-585

Author(s):

Yang Gao ◽

Christian M. Meyer ◽

Iryna Gurevych

Keyword(s):

Reinforcement Learning ◽

Active Learning ◽

Upper Bound ◽

State Of The Art ◽

Interactive Learning ◽

User Feedback ◽

The State ◽

Ranking Function ◽

Preference Learning ◽

Interactive Method

Interactive NLP is a promising paradigm to close the gap between automatic NLP systems and the human upper bound. Preference-based interactive learning has been successfully applied, but the existing methods require several thousand interaction rounds even in simulations with perfect user feedback. In this paper, we study preference-based interactive summarisation. To reduce the number of interaction rounds, we propose the Active Preference-based ReInforcement Learning (APRIL) framework. APRIL uses active learning to query the user, preference learning to learn a summary ranking function from the preferences, and neural reinforcement learning to efficiently search for the (near-)optimal summary. Our results show that users can easily provide reliable preferences over summaries and that APRIL outperforms the state-of-the-art preference-based interactive method in both simulation and real-user experiments.