Adversarial Imitation Learning from Incomplete Demonstrations

Author(s):  
Mingfei Sun ◽  
Xiaojuan Ma

Imitation learning aims to derive a mapping from states to actions, i.e., a policy, from expert demonstrations. Existing methods for imitation learning typically require all actions in the demonstrations to be fully available, which is hard to ensure in real applications. Though algorithms for learning with unobservable actions have been proposed, they focus solely on state information and overlook the fact that the action sequence could still be partially available and provide useful information for deriving the policy. In this paper, we propose a novel algorithm called Action-Guided Adversarial Imitation Learning (AGAIL) that learns a policy from demonstrations with incomplete action sequences, i.e., incomplete demonstrations. The core idea of AGAIL is to separate demonstrations into state and action trajectories, and train a policy with state trajectories while using actions as auxiliary information to guide the training whenever applicable. Built upon Generative Adversarial Imitation Learning, AGAIL has three components: a generator, a discriminator, and a guide. The generator learns a policy with rewards provided by the discriminator, which tries to distinguish state distributions between demonstrations and samples generated by the policy. The guide provides additional rewards to the generator when demonstrated actions for specific states are available. We compare AGAIL to other methods on benchmark tasks and show that AGAIL consistently delivers performance comparable to the state-of-the-art methods even when the action sequence in demonstrations is only partially available.
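
The interplay between the discriminator and the guide described above can be illustrated with a small sketch. It is not the authors' implementation: the function name `combined_reward`, the `-log(1 - D(s))` form of the state reward, the squared-distance guide bonus, and the `guide_weight` parameter are all assumptions chosen for illustration.

```python
import math
from typing import Optional, Sequence


def combined_reward(d_score: float,
                    policy_action: Sequence[float],
                    demo_action: Optional[Sequence[float]],
                    guide_weight: float = 0.5) -> float:
    """Hypothetical per-step reward: state-based term plus optional guide term.

    d_score is an assumed discriminator output in (0, 1) for the visited state.
    demo_action is None when the demonstration has no action for this state.
    """
    eps = 1e-8  # avoids log(0) when the discriminator saturates
    state_reward = -math.log(max(1.0 - d_score, eps))

    # No demonstrated action for this state: rely on the state signal alone.
    if demo_action is None:
        return state_reward

    # Assumed guide bonus: penalise distance to the demonstrated action.
    sq_dist = sum((a - b) ** 2 for a, b in zip(policy_action, demo_action))
    return state_reward - guide_weight * sq_dist
```

With this shape, a step without a demonstrated action still receives the state-based reward, so missing actions degrade the guidance signal rather than removing the learning signal.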

Author(s):  
Wenbin Li ◽  
Lei Wang ◽  
Jing Huo ◽  
Yinghuan Shi ◽  
Yang Gao ◽  
...  

The core idea of metric-based few-shot image classification is to directly measure the relations between query images and support classes to learn transferable feature embeddings. Previous work mainly focuses on image-level feature representations, which actually cannot effectively estimate a class's distribution due to the scarcity of samples. Some recent work shows that local descriptor based representations can achieve richer representations than image-level based representations. However, such works are still based on a less effective instance-level metric, especially a symmetric metric, to measure the relation between a query image and a support class. Given the natural asymmetric relation between a query image and a support class, we argue that an asymmetric measure is more suitable for metric-based few-shot learning. To that end, we propose a novel Asymmetric Distribution Measure (ADM) network for few-shot learning by calculating a joint local and global asymmetric measure between two multivariate local distributions of a query and a class. Moreover, a task-aware Contrastive Measure Strategy (CMS) is proposed to further enhance the measure function. On popular miniImageNet and tieredImageNet, ADM can achieve the state-of-the-art results, validating our innovative design of asymmetric distribution measures for few-shot learning. The source code can be downloaded from https://github.com/WenbinLee/ADM.git.
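
To make the notion of an asymmetric measure between two distributions concrete, the sketch below computes the Kullback-Leibler divergence between two diagonal Gaussians and shows that swapping the arguments changes the value. The abstract does not state which measure ADM uses, so the choice of KL divergence, the diagonal-covariance simplification, and the name `kl_diag_gaussians` are assumptions made for illustration.

```python
import math
from typing import Sequence


def kl_diag_gaussians(mean_p: Sequence[float], var_p: Sequence[float],
                      mean_q: Sequence[float], var_q: Sequence[float]) -> float:
    """KL(P || Q) for Gaussians with diagonal covariance (variances > 0)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


# Stand-ins for a query distribution and a class distribution (made-up values).
query = ([0.0], [1.0])
support = ([1.0], [4.0])

forward = kl_diag_gaussians(*query, *support)   # query measured against class
backward = kl_diag_gaussians(*support, *query)  # class measured against query
```

The two directions give different values, which is the property a symmetric metric such as Euclidean distance cannot express.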


Author(s):  
Jarne R. Verpoorten ◽  
Michèle Auglaire ◽  
Frank Bertels

During a hypothetical Severe Accident (SA), core damage is to be expected due to insufficient core cooling. If the lack of core cooling persists, the degradation of the core can continue and could lead to the presence of corium in the lower plenum. There, the thermo-mechanical attack of the lower head by the corium could eventually lead to vessel failure and corium release to the reactor cavity pit. This paper describes how international state-of-the-art knowledge has been applied in combination with plant-specific data to obtain a custom Severe Accident Management (SAM) approach and hardware adaptations for existing NPPs. The interest of Tractebel Engineering in future SA research projects related to this topic is also addressed from the viewpoint of keeping the analysis up to date with state-of-the-art knowledge.


Author(s):  
Ziming Li ◽  
Julia Kiseleva ◽  
Maarten De Rijke

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate more high-quality responses and achieve higher overall performance than the state-of-the-art.


2019 ◽  
Author(s):  
Francis M. Tyers ◽  
Jonathan N. Washington ◽  
Darya Kavitskaya ◽  
Memduh Gökırmak

This paper describes a weighted finite-state morphological transducer for Crimean Tatar able to analyse and generate in both Latin and Cyrillic orthographies. This transducer was developed by a team including a community member and language expert, a field linguist who works with the community, a Turkologist with computational linguistics expertise, and an experienced computational linguist with Turkic expertise. Dealing with two orthographic systems in the same transducer is challenging as they employ different strategies to deal with the spelling of loan words and encode the full range of the language's phonemes and their interaction. We develop the core transducer using the Latin orthography and then design a separate transliteration transducer to map the surface forms to Cyrillic. To help control the non-determinism in the orthographic mapping, we use weights to prioritise forms seen in the corpus. We perform an evaluation of all components of the system, finding an accuracy above 90% for morphological analysis and near 90% for orthographic conversion. This comprises the state of the art for Crimean Tatar morphological modelling, and, to our knowledge, is the first biscriptual single morphological transducer for any language.
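
The use of weights to prioritise forms seen in the corpus can be sketched as follows. This is not the authors' transducer: it is plain Python rather than a finite-state toolkit, the negative-log-frequency weighting is an assumed scheme, and the function names and candidate strings are hypothetical placeholders rather than real Crimean Tatar data.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List


def weights_from_corpus(candidates: Iterable[str],
                        corpus_counts: Counter) -> Dict[str, float]:
    """Assign each candidate a weight where lower means preferred.

    Seen forms get -log(relative frequency); unseen forms get infinite weight.
    """
    total = sum(corpus_counts.values())
    weights = {}
    for form in candidates:
        count = corpus_counts.get(form, 0)
        weights[form] = -math.log(count / total) if count > 0 else math.inf
    return weights


def best_candidate(candidates: List[str], corpus_counts: Counter) -> str:
    """Pick the lowest-weight candidate; ties keep the earlier candidate."""
    weights = weights_from_corpus(candidates, corpus_counts)
    return min(candidates, key=lambda form: weights[form])


# Placeholder output forms that a non-deterministic mapping might produce.
counts = Counter({"form_a": 3, "form_b": 1})
chosen = best_candidate(["form_c", "form_b", "form_a"], counts)
```

Here the most frequent attested form wins, while a form never seen in the corpus is selected only if no attested alternative exists.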


2020 ◽  
Author(s):  
Ke Shang ◽  
Hisao Ishibuchi

In this paper, a new hypervolume-based evolutionary multi-objective optimization algorithm (EMOA), namely R2HCA-EMOA (R2-based Hypervolume Contribution Approximation EMOA), is proposed for many-objective optimization. The core idea of the algorithm is to use an R2 indicator variant to approximate the hypervolume contribution. The basic framework of the proposed algorithm is the same as SMS-EMOA. In order to make the algorithm computationally efficient, a utility tensor structure is introduced for the calculation of the R2 indicator variant. Moreover, a normalization mechanism is incorporated into R2HCA-EMOA to enhance the performance of the algorithm. Through experimental studies, R2HCA-EMOA is compared with three hypervolume-based EMOAs and several other state-of-the-art EMOAs on 5-, 10- and 15-objective DTLZ, WFG problems and their minus versions. Our results show that R2HCA-EMOA is more efficient than the other hypervolume-based EMOAs, and is superior to all the compared state-of-the-art EMOAs.


2019 ◽  
Author(s):  
Ke Shang

In this paper, a new hypervolume-based evolutionary multi-objective optimization algorithm (EMOA), namely R2HCA-EMOA (R2-based Hypervolume Contribution Approximation EMOA), is proposed for many-objective optimization. The core idea of the algorithm is to use an R2 indicator variant to approximate the hypervolume contribution. The basic framework of the proposed algorithm is the same as SMS-EMOA. In order to make the algorithm computationally efficient, a utility tensor structure is introduced for the calculation of the R2 indicator variant. Moreover, a normalization mechanism is incorporated into R2HCA-EMOA to enhance the performance of the algorithm. Through experimental studies, R2HCA-EMOA is compared with three hypervolume-based EMOAs and several other state-of-the-art EMOAs on 5-, 10- and 15-objective DTLZ, WFG problems and their minus versions. Our results show that R2HCA-EMOA is more efficient than the other hypervolume-based EMOAs, and is superior to all the compared state-of-the-art EMOAs.


Author(s):  
Valeria Fionda ◽  
Giuseppe Pirrò

We tackle fact checking using Knowledge Graphs (KGs) as a source of background knowledge. Our approach leverages the KG schema to generate candidate evidence patterns, that is, schema-level paths that capture the semantics of a target fact in alternative ways. Patterns verified in the data are used both to assemble semantic evidence for a fact and to provide a numerical assessment of its truthfulness. We present efficient algorithms to generate and verify evidence patterns, and to assemble evidence. We also provide a translation of the core of our algorithms into the SPARQL query language. Not only is our approach faster than the state of the art while offering comparable accuracy, but it can also use any SPARQL-enabled KG.


2020 ◽  
Vol 68 ◽  
pp. 691-752
Author(s):  
Enrico Scala ◽  
Patrik Haslum ◽  
Sylvie Thiébaux ◽  
Miquel Ramirez

This paper studies novel subgoaling relaxations for automated planning with propositional and numeric state variables. Subgoaling relaxations address one source of complexity of the planning problem: the requirement to satisfy conditions simultaneously. The core idea is to relax this requirement by recursively decomposing conditions into atomic subgoals that are considered in isolation. Such relaxations are typically used for pruning, or as the basis for computing admissible or inadmissible heuristic estimates to guide optimal or satisficing heuristic search planners. In the last decade or so, the subgoaling principle has underpinned the design of an abundance of relaxation-based heuristics whose formulations have greatly extended the reach of classical planning. This paper extends subgoaling relaxations to support numeric state variables and numeric conditions. We provide both theoretical and practical results, with the aim of reaching a good trade-off between accuracy and computation costs within a heuristic state-space search planner. Our experimental results validate the theoretical assumptions, and indicate that subgoaling substantially improves on the state of the art in optimal and satisficing numeric planning via forward state-space search.
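
The idea of decomposing a conjunctive condition into atomic subgoals considered in isolation can be sketched with the classical h_max heuristic for purely propositional planning. This is an illustration of the general subgoaling principle only, not the numeric extension the paper contributes; the action encoding as `(preconditions, add_effects, cost)` tuples and the function name `h_max` are assumptions.

```python
import math
from typing import FrozenSet, Iterable, List, Set, Tuple

Action = Tuple[FrozenSet[str], FrozenSet[str], float]


def h_max(state: Set[str], goal: Iterable[str], actions: List[Action]) -> float:
    """Estimate goal cost as the cost of the most expensive single subgoal.

    Assumes non-negative action costs. Each atom is costed in isolation, so
    interactions between subgoals are ignored (the relaxation).
    """
    cost = {atom: 0.0 for atom in state}
    changed = True
    while changed:  # iterate to a fixpoint over atom costs
        changed = False
        for pre, adds, action_cost in actions:
            if all(p in cost for p in pre):
                # A conjunction costs as much as its hardest atom.
                reach = action_cost + max((cost[p] for p in pre), default=0.0)
                for atom in adds:
                    if reach < cost.get(atom, math.inf):
                        cost[atom] = reach
                        changed = True
    return max((cost.get(g, math.inf) for g in goal), default=0.0)


# Toy problem with made-up atoms: d needs both b and c, each reachable from a.
toy_actions: List[Action] = [
    (frozenset({"a"}), frozenset({"b"}), 1.0),
    (frozenset({"a"}), frozenset({"c"}), 2.0),
    (frozenset({"b", "c"}), frozenset({"d"}), 1.0),
]
estimate = h_max({"a"}, {"d"}, toy_actions)
```

Taking the maximum over subgoals keeps the estimate admissible in the propositional setting; replacing `max` with a sum gives the inadmissible additive variant.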


Author(s):  
Ziru Xu ◽  
Yunbo Wang ◽  
Mingsheng Long ◽  
Jianmin Wang

Predicting future frames in videos remains an unsolved and challenging problem. Mainstream recurrent models suffer from huge memory usage and computation cost, while convolutional models are unable to effectively capture the temporal dependencies between consecutive video frames. To tackle this problem, we introduce an entirely CNN-based architecture, PredCNN, that models the dependencies between the next frame and the sequential video inputs. Inspired by the core idea of recurrent models that previous states have more transition operations than future states, we design a cascade multiplicative unit (CMU) that provides relatively more operations for previous video frames. This newly proposed unit enables PredCNN to predict future spatiotemporal data without any recurrent chain structures, which eases gradient propagation and enables fully parallel optimization. We show that PredCNN outperforms the state-of-the-art recurrent models for video prediction on the standard Moving MNIST dataset and two challenging crowd flow prediction datasets, and achieves a faster training speed and lower memory footprint.


Author(s):  
Armin W. Schulz

This chapter develops a new account of the evolution of cognitive representational decision making—i.e. of decision making that relies on representations about the state of the world. The core idea behind this account is that cognitive representational decision making can—at times—be more cognitively efficient than non-cognitive representational decision making. In particular, cognitive representational decision making, by being able to draw on the inferential resources of higher-level mental states, can enable organisms to adjust more easily to changes in their environment and to streamline their neural decision making machinery (relative to non-representational decision makers). While these cognitive efficiency gains will sometimes be outweighed by the costs of this way of making decisions—i.e. the fact that representational decision making is generally slower and more concentration- and attention-hungry than non-representational decision making—this will not always be the case. Moreover, it is possible to say in more detail which kinds of circumstances will favor the evolution of cognitive representational decision making, and which do not.

