Robust Adversarial Imitation Learning via Adaptively-Selected Demonstrations

Author(s):  
Yunke Wang ◽  
Chang Xu ◽  
Bo Du

The agent in imitation learning (IL) is expected to mimic the behavior of the expert, so its performance depends heavily on the quality of the given expert demonstrations. However, the assumption that collected demonstrations are optimal does not always hold in real-world tasks, which can seriously degrade the performance of the learned agent. In this paper, we propose a robust method within the framework of Generative Adversarial Imitation Learning (GAIL) to address the imperfect demonstration issue, in which good demonstrations are adaptively selected for training while bad demonstrations are abandoned. Specifically, a binary weight is assigned to each expert demonstration to indicate whether to select it for training. The reward function in GAIL is employed to determine this weight (i.e., a higher reward results in a higher weight). Unlike existing solutions that require auxiliary information about this weight, we set up the connection between weight and model so that we can jointly optimize GAIL and learn the latent weight. Besides hard binary weighting, we also propose a soft weighting scheme. Experiments in MuJoCo demonstrate that the proposed method outperforms other GAIL-based methods when dealing with imperfect demonstrations.
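
The contrast between hard binary weighting and soft weighting described in the abstract above can be illustrated with a small sketch. The abstract does not give the exact selection rule, so everything here is an assumption made for illustration: the function names, the fixed `threshold`, the sigmoid form of the soft weight, and the sample reward values are all hypothetical, and the sketch omits the joint optimization with GAIL entirely.

```python
import math

def hard_weights(rewards, threshold):
    """Binary selection (assumed rule): 1 keeps a demonstration, 0 discards it."""
    return [1 if r >= threshold else 0 for r in rewards]

def soft_weights(rewards, threshold, temperature=1.0):
    """Smooth alternative (assumed form): sigmoid of the reward margin, in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-(r - threshold) / temperature)) for r in rewards]

# Hypothetical per-demonstration rewards, standing in for the GAIL reward function.
rewards = [2.1, -0.4, 1.3, -1.8]

hard = hard_weights(rewards, threshold=0.0)
soft = soft_weights(rewards, threshold=0.0)

print(hard)
print([round(w, 3) for w in soft])
```

In both variants a higher reward never yields a lower weight, which is the only property the abstract states. The hard scheme drops low-reward demonstrations outright, while the soft scheme only down-weights them.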

2021 ◽  
pp. 1-10
Author(s):  
Jin Yi ◽  
Jiajin Huang ◽  
Jin Qin

Recommender systems have become widely used in recent years, so improving recommendation performance is important. Generally, recommendation methods use users’ historical ratings of items to predict ratings for their unrated items and make recommendations. However, as the number of users and items grows, data sparsity increases and the quality of recommendations decreases sharply. To address the sparsity problem, auxiliary information is combined with ratings to mine users’ preferences for higher recommendation quality. Like rating data, review data also contain rich information about users’ preferences for items. This paper proposes a novel recommendation model that harnesses adversarial learning among auto-encoders to improve recommendation quality by minimizing the gap between the rating-based and review-based relations between a user and an item. Empirical studies on real-world datasets show that the proposed method improves recommendation performance.


2021 ◽  
Author(s):  
Handi Chen ◽  
Xiaojie Wang ◽  
Zhaolong Ning ◽  
Lei Guo

With the advocacy of green renewable energy, Electric Vehicles (EVs) have gradually become the mainstream in the automobile market. Because the edge resources of the Internet of EVs are finite, this paper integrates the idle communication, caching and computational resources of EVs to enrich the resources available for vehicular task migration. Considering the limited capacity and resources of EVs, a distributed lightweight imitation learning-based efficient Task cOoperative migration Policy Integrating 3C resource policy, named TOPIC, is proposed to maximize the obtained quality of service (QoS). Experimental results based on a real-world traffic dataset of Hangzhou (China) demonstrate that the QoS obtained with the expert policy and agent policy of TOPIC is about 3 times higher than that of other representative policies.


Author(s):  
Stephen Verderber

The interdisciplinary field of person-environment relations has, from its origins, addressed the transactional relationship between human behavior and the built environment. This body of knowledge has been based upon qualitative and quantitative assessment of phenomena in the “real world.” This knowledge base has been instrumental in advancing the quality of real, physical environments globally at various scales of inquiry and with myriad user/client constituencies. By contrast, scant attention has been devoted to using simulation as a means to examine and represent person-environment transactions, or to how what is learned can be applied. The present discussion posits that press-competency theory, with related aspects drawn from functionalist-evolutionary theory, can help us learn how the medium of film can yield further insights into person-environment (P-E) transactions in the real world. Sampling, combined with extemporary behavior setting analysis, provides the basis for this analysis of healthcare settings as expressed throughout the history of cinema. This method can be of significant aid in examining P-E transactions across diverse historical periods, building types and places, healthcare and otherwise, that are logistically, geographically, or temporally unattainable in real time and space.


In the production of fabrics, including fabrics for agricultural purposes, correct adjustment of the machine's main regulator plays an important role. The quality of the main regulator setup is determined by the proper selection of the rotation angle of the warp beam per filling thread. During operation, mistakes in adjustment, wear of the transmission gear and backlash in the connections between parts cause random changes in thread length. The purpose of the article is to study the properties of random errors in warp feed by the STB machine regulator. Errors can be both negative and positive. If only negative or only positive errors occur, operation of the machine becomes impossible because the errors accumulate consecutively. Processing of experimental data for a stable weaving process and a constant winding diameter of the warp threads shows that the random feed error is a linear function of a normally distributed random length. Measurements of random deviations in warp feed by the main regulator made it possible to construct a normal distribution curve of the actual fed length for one pass of the weft thread. The presented distribution curve of random warp feed errors is a shifted normal distribution curve of the random values. We also define the probability density of the normal distribution of warp feed errors associated with the margin of error in the operation of the main regulator; knowing it makes it possible to plan ways of reducing these errors, which is important for improving the quality of the produced fabrics.
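
The statement that the feed error is a linear function of a normally distributed length, and that its distribution curve is therefore a shifted normal curve, can be written out as a short worked formula. The symbols $X$, $\varepsilon$, $a$, $b$, $\mu$ and $\sigma$ are illustrative assumptions, not notation taken from the article, and the article's measured values are not reproduced here.

```latex
% Assumed notation: X is the random fed length, \varepsilon the feed error,
% a \neq 0 and b the coefficients of the linear relation.
X \sim \mathcal{N}(\mu, \sigma^{2}), \qquad \varepsilon = aX + b
\;\Longrightarrow\;
\varepsilon \sim \mathcal{N}\!\left(a\mu + b,\; a^{2}\sigma^{2}\right),

f(\varepsilon) = \frac{1}{|a|\,\sigma\sqrt{2\pi}}
\exp\!\left(-\frac{\left(\varepsilon - a\mu - b\right)^{2}}{2a^{2}\sigma^{2}}\right).
```

Under these assumptions the error density keeps the normal shape, with its center moved to $a\mu + b$ and its spread scaled by $|a|$.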


2020 ◽  
Vol 19 (10) ◽  
pp. 943-948
Author(s):  
Peter Lio ◽  
Andreas Wollenberg ◽  
Jacob Thyssen ◽  
Evangeline Pierce ◽  
Maria Rueda ◽  
...  

1991 ◽  
Vol 24 (10) ◽  
pp. 171-177
Author(s):  
T. Vellinga ◽  
J. P. J. Nijssen

Much of the material dredged from the port of Rotterdam is contaminated to such a degree that it must be placed in specially constructed sites. The aim of Rotterdam is to ensure that the dredged material will once again be clean. This will entail the thorough cleansing of the sources of the contamination of the sediment in the harbours and in the River Rhine. The Rotterdam Rhine Research Project (RRP) is one of the means to achieve this based on: technical research, legal research, public relations and dialogues with dischargers. The programme for five selected heavy metals is almost complete. For many heavy metal discharge points between Rotterdam and Rheinfelden, a specially devised independent load assessment has been carried out four times. Balance studies were used to determine the relative contributions of the point discharges to the total. Currently the results are being used in an attempt to negotiate agreements with a selected number of the major dischargers. At present, more detailed balance studies are being set up and exploratory measurements carried out for organic micropollutants. It may be concluded that the research is progressing successfully and methods and techniques developed seem satisfactory and broadly applicable. The Rhine Action Programme encompasses an international effort to improve the quality of the Rhine water. Although the RRP plays a modest complementary role to the Rhine Action Plan, there is no doubt of the value of this Rotterdam initiative. The mode of work followed in the RRP contains elements that can be of use in combatting the contamination of the North Sea by rivers other than the Rhine.


1988 ◽  
Vol 20 (4-5) ◽  
pp. 249-251
Author(s):  
Jacques Bernard

The flow and water quality of rivers vary throughout the year. Very frequently, environmental protection authorities set a quality objective for the river water, and this minimum quality level is constant. It would therefore seem possible to accept variable quality standards for plant effluents. A first approach to the problem by a small French task group, based on three actual cases, leads to the provisional conclusion that such a regulation is suitable and economically beneficial only in some very limited cases.


2015 ◽  
Vol 3 (2) ◽  
pp. 1-7
Author(s):  
Achin Jain ◽  
M P Venkatesh ◽  
Pramod T.M. Kumar

In Tanzania, the Tanzania Food and Drugs Authority (TFDA) is the regulatory body responsible for controlling the quality, safety and effectiveness of food, drugs, herbal drugs, cosmetics and medical devices. The Authority ensures the safety, efficacy and quality of medicines through quality control tests, in addition to other quality assessment mechanisms. The guidelines laid down by the TFDA also emanate from a commitment to democracy and give strong emphasis to fulfilling the needs of the less privileged rural population. Tanzania is an emerging market; the pharmaceutical market is valued at over US$250 million, is growing at an annual rate of around 16.5%, and is expected to reach approximately US$550 billion in 2020. Currently, the market is highly dependent on imports, which account for around 75% of the total pharmaceutical market. The procedures and approval requirements for new drugs, variations, import, export and disposal have been set up by the TFDA and help maintain the quality of drug products that are imported as well as those produced locally.

