Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Latest Publications


TOTAL DOCUMENTS

965
(FIVE YEARS 965)

H-INDEX

14
(FIVE YEARS 14)

Published By International Joint Conferences On Artificial Intelligence Organization

9780999241141

Author(s):  
Wojciech Jamroga ◽  
Michał Knapik

Model checking strategic abilities in multi-agent systems is hard, especially for agents with partial observability of the state of the system. In that case, it ranges from NP-complete to undecidable, depending on the precise syntax and the semantic variant. That, however, is the worst case complexity, and the problem might as well be easier when restricted to particular subclasses of inputs. In this paper, we look at the verification of models with "extreme" epistemic structure, and identify several special cases for which model checking is easier than in general. We also prove that, in the other cases, no gain is possible even if the agents have almost full (or almost nil) observability. To prove the latter kind of results, we develop generic techniques that may be useful also outside of this study.


Author(s):  
Adriana Pacheco ◽  
Cédric Pralet ◽  
Stéphanie Roussel

In this paper, we consider scheduling problems involving resources that must perform complex setup operations between the tasks they realize. To deal with such problems, we introduce a simple yet efficient iterative two-layer decision process that alternates between the fast synthesis of high-level schedules based on a coarse-grain model of setup operations, and the production of detailed schedules based on a fine-grain model. Experiments realized on representative benchmarks of a multi-robot application show the efficiency of the approach.


Author(s):  
Anton Andreychuk ◽  
Konstantin Yakovlev ◽  
Dor Atzmon ◽  
Roni Stern

Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. Most prior work on MAPF were on grids, assumed agents' actions have uniform duration, and that time is discretized into timesteps. In this work, we propose a MAPF algorithm that do not assume any of these assumptions, is complete, and provides provably optimal solutions. This algorithm is based on a novel combination of Safe Interval Path Planning (SIPP), a continuous time single agent planning algorithms, and Conflict-Based Search (CBS). We analyze this algorithm, discuss its pros and cons, and evaluate it experimentally on several standard benchmarks.


Author(s):  
Tejas Gokhale

Deep neural networks trained in an end-to-end fashion have brought about exceptional advances in computer vision, especially in computational perception. We go beyond perception and seek to enable vision modules to reason about perceived visual entities such as scenes, objects and actions. We introduce a challenging visual reasoning task, Image-Based Event Sequencing (IES) and compile the first IES dataset, Blocksworld Image Reasoning Dataset (BIRD). Motivated by the blocksworld concept, we propose a modular approach supported by literature in cognitive psychology and children's development. We decompose the problem into two stages - visual perception and event sequencing, and show that our approach can be extended to natural images without re-training.


Author(s):  
Zhu Zhang ◽  
Zhou Zhao ◽  
Zhijie Lin ◽  
Jingkuan Song ◽  
Deng Cai

Action localization in untrimmed videos is an important topic in the field of video understanding. However, existing action localization methods are restricted to a pre-defined set of actions and cannot localize unseen activities. Thus, we consider a new task to localize unseen activities in videos via image queries, named Image-Based Activity Localization. This task faces three inherent challenges: (1) how to eliminate the influence of semantically inessential contents in image queries; (2) how to deal with the fuzzy localization of inaccurate image queries; (3) how to determine the precise boundaries of target segments. We then propose a novel self-attention interaction localizer to retrieve unseen activities in an end-to-end fashion. Specifically, we first devise a region self-attention method with relative position encoding to learn fine-grained image region representations. Then, we employ a local transformer encoder to build multi-step fusion and reasoning of image and video contents. We next adopt an order-sensitive localizer to directly retrieve the target segment. Furthermore, we construct a new dataset ActivityIBAL by reorganizing the ActivityNet dataset. The extensive experiments show the effectiveness of our method.


Author(s):  
Yi Wan ◽  
Muhammad Zaheer ◽  
Adam White ◽  
Martha White ◽  
Richard S. Sutton

Distribution and sample models are two popular model choices in model-based reinforcement learning (MBRL). However, learning these models can be intractable, particularly when the state and action spaces are large. Expectation models, on the other hand, are relatively easier to learn due to their compactness and have also been widely used for deterministic environments. For stochastic environments, it is not obvious how expectation models can be used for planning as they only partially characterize a distribution. In this paper, we propose a sound way of using approximate expectation models for MBRL. In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm. 


Author(s):  
Yang Gao ◽  
Christian M. Meyer ◽  
Mohsen Mesgar ◽  
Iryna Gurevych

Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning due to the huge search spaces and the delayed rewards. Learning input-specific RL policies is a more efficient alternative, but so far depends on handcrafted rewards, which are difficult to design and yield poor performance. We propose RELIS, a novel RL paradigm that learns a reward function with Learning-to-Rank (L2R) algorithms at training time and uses this reward function to train an input-specific RL policy at test time. We prove that RELIS guarantees to generate near-optimal summaries with appropriate L2R and RL algorithms. Empirically, we evaluate our approach on extractive multi-document summarisation. We show that RELIS reduces the training time by two orders of magnitude compared to the state-of-the-art models while performing on par with them.


Author(s):  
Taoran Ji ◽  
Zhiqian Chen ◽  
Nathan Self ◽  
Kaiqun Fu ◽  
Chang-Tien Lu ◽  
...  

Modeling and forecasting forward citations to a patent is a central task for the discovery of emerging technologies and for measuring the pulse of inventive progress. Conventional methods for forecasting these forward citations cast the problem as analysis of temporal point processes which rely on the conditional intensity of previously received citations. Recent approaches model the conditional intensity as a chain of recurrent neural networks to capture memory dependency in hopes of reducing the restrictions of the parametric form of the intensity function. For the problem of patent citations, we observe that forecasting a patent's chain of citations benefits from not only the patent's history itself but also from the historical citations of assignees and inventors associated with that patent. In this paper, we propose a sequence-to-sequence model which employs an attention-of-attention mechanism to capture the dependencies of these multiple time sequences. Furthermore, the proposed model is able to forecast both the timestamp and the category of a patent's next citation. Extensive experiments on a large patent citation dataset collected from USPTO demonstrate that the proposed model outperforms state-of-the-art models at forward citation forecasting.


Author(s):  
Lin Lan ◽  
Zhenguo Li ◽  
Xiaohong Guan ◽  
Pinghui Wang

Despite significant progress, deep reinforcement learning (RL) suffers from data-inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task could be solved quickly. Though specific in some ways, different tasks in meta-RL are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the specific and shared information among different tasks, which limits their ability to learn training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and meta-learn how to quickly abstract the specific information about a task on the other hand. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy which is shared across all tasks and conditioned on task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns compared to baselines.


Author(s):  
Ziyao Li ◽  
Liang Zhang ◽  
Guojie Song

Graph Convolutional Networks (GCNs) have proved to be a most powerful architecture in aggregating local neighborhood information for individual graph nodes. Low-rank proximities and node features are successfully leveraged in existing GCNs, however, attributes that graph links may carry are commonly ignored, as almost all of these models simplify graph links into binary or scalar values describing node connectedness. In our paper instead, links are reverted to hypostatic relationships between entities with descriptional attributes. We propose GCN-LASE (GCN with Link Attributes and Sampling Estimation), a novel GCN model taking both node and link attributes as inputs. To adequately captures the interactions between link and node attributes, their tensor product is used as neighbor features, based on which we define several graph kernels and further develop according architectures for LASE. Besides, to accelerate the training process, the sum of features in entire neighborhoods are estimated through Monte Carlo method, with novel  sampling strategies designed for LASE to minimize the estimation variance. Our experiments show that LASE outperforms strong baselines over various graph datasets, and further experiments corroborate the informativeness of link attributes and our model's ability of adequately leveraging them.


Sign in / Sign up

Export Citation Format

Share Document