Implementation of assembly task based on guided policy search algorithm

Trust region policy optimization (TRPO) is a popular and empirically successful policy search algorithm in Reinforcement Learning (RL) in which a surrogate problem that restricts consecutive policies to be ‘close’ to one another is iteratively solved. Nevertheless, TRPO has been considered a heuristic algorithm inspired by Conservative Policy Iteration (CPI). We show that the adaptive scaling mechanism used in TRPO is in fact the natural “RL version” of traditional trust-region methods from convex analysis. We first analyze TRPO in the planning setting, in which we have access to the model and the entire state space. Then, we consider sample-based TRPO and establish an Õ(1/√N) convergence rate to the global optimum. Importantly, the adaptive scaling mechanism allows us to analyze TRPO in regularized MDPs, for which we prove fast rates of Õ(1/N), much like results in convex optimization. This is the first result in RL of better rates when regularizing the instantaneous cost or reward.
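The idea of keeping consecutive policies 'close' can be illustrated with a minimal sketch. It assumes a tabular setting, a single state with known action costs, and a cost-minimising convention. The closed-form exponentiated update is the standard minimiser of a KL-penalised linear surrogate; it is not claimed to be the exact update analysed in the abstract above. The names `kl_regularized_step`, `kl_divergence`, `q_values`, and `step_size` are illustrative assumptions.

```python
import math


def kl_regularized_step(policy, q_values, step_size):
    """One KL-penalised policy update for a single state (illustrative).

    Minimises  <p, q_values> + (1 / step_size) * KL(p || policy)
    over the probability simplex. The minimiser is
    p(a) proportional to policy(a) * exp(-step_size * q_values(a)).
    """
    # Shifting by the minimum cost improves numerical stability and
    # cancels out in the normalisation.
    q_min = min(q_values)
    weights = [
        p * math.exp(-step_size * (q - q_min))
        for p, q in zip(policy, q_values)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with matching support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example: three actions with costs 1, 2, 3 and a uniform policy.
old_policy = [1 / 3, 1 / 3, 1 / 3]
costs = [1.0, 2.0, 3.0]
cautious = kl_regularized_step(old_policy, costs, step_size=0.1)
aggressive = kl_regularized_step(old_policy, costs, step_size=2.0)
```

A smaller `step_size` weights the KL penalty more heavily, so `cautious` stays nearer to `old_policy` than `aggressive` does. This is the sense in which the step size plays the role of a trust region.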

Policy search for multi-robot coordination under uncertainty

The International Journal of Robotics Research ◽ 10.1177/0278364916679611 ◽ 2016 ◽ Vol 35 (14) ◽ pp. 1760-1778 ◽ Cited By ~ 13

Author(s): Christopher Amato ◽ George Konidaris ◽ Ariel Anders ◽ Gabriel Cruz ◽ Jonathan P How ◽ ...

Keyword(s): Search Algorithm ◽ Infinite Horizon ◽ Tree Representation ◽ Policy Search ◽ Performance Improvements ◽ Robot Systems ◽ Finite State ◽ Planning Algorithm ◽ Significant Performance ◽ Multi Robot

We introduce a principled method for multi-robot coordination based on a general model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing, and communication limitations. A new MacDec-POMDP planning algorithm is presented that searches over policies represented as finite-state controllers, rather than the previous policy tree representation. Finite-state controllers can be much more concise than trees, are much easier to interpret, and can operate over an infinite horizon. The resulting policy search algorithm requires a substantially simpler simulator that models only the outcomes of executing a given set of motor controllers, not the details of the executions themselves, and it can solve significantly larger problems than existing MacDec-POMDP planners. We demonstrate significant performance improvements over previous methods and show that our method can be used for actual multi-robot systems through experiments on a cooperative multi-robot bartending domain.
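As a rough illustration of why a finite-state controller is concise and can run over an unbounded horizon, the following minimal sketch represents a single robot's policy as a small graph. It assumes deterministic node transitions, and the macro-action and observation names are hypothetical; it is not the planner or representation from the paper.

```python
from dataclasses import dataclass


@dataclass
class FiniteStateController:
    """A policy as a finite graph: each node picks a macro-action,
    and each observation moves the controller to a next node."""

    actions: dict        # node id -> macro-action name
    transitions: dict    # (node id, observation) -> next node id
    node: int = 0        # current controller node

    def act(self):
        """Return the macro-action selected by the current node."""
        return self.actions[self.node]

    def observe(self, observation):
        """Advance to the next node; stay put on an unlisted observation."""
        self.node = self.transitions.get((self.node, observation), self.node)


# Hypothetical two-node controller for a delivery robot.
controller = FiniteStateController(
    actions={0: "go_to_bar", 1: "deliver_drink"},
    transitions={(0, "order_ready"): 1, (1, "delivered"): 0},
)

trace = []
for obs in ["no_order", "order_ready", "delivered", "order_ready"]:
    trace.append(controller.act())
    controller.observe(obs)
```

Because the controller loops back to node 0, the same two nodes describe behaviour for any number of steps, whereas a policy tree grows with the horizon.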

Musculoskeletal disorder risk associated with auto rotation angle during an assembly task

PsycEXTRA Dataset ◽ 10.1037/e578482012-002 ◽ 2009

Author(s): Sue A. Ferguson ◽ William S. Marras ◽ W. Gary Allread ◽ Gregory G. Knapik ◽ Kimberly A. Vandlen ◽ ...

Keyword(s): Musculoskeletal Disorder ◽ Rotation Angle ◽ Assembly Task ◽ Disorder Risk

Fast search algorithm for VQ-based recognition of isolated words

IEE Proceedings I Communications Speech and Vision ◽ 10.1049/ip-i-2.1989.0059 ◽ 1989 ◽ Vol 136 (6) ◽ pp. 391 ◽ Cited By ~ 15

Author(s): S.H. Chen ◽ J.S. Pan

Keyword(s): Search Algorithm ◽ Fast Search ◽ Fast Search Algorithm

Optimal broadcast scheduling method for VANETs: An adaptive discrete firefly approach

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189134 ◽ 2020 ◽ Vol 39 (6) ◽ pp. 8125-8137

Author(s): Jackson J Christy ◽ D Rekha ◽ V Vijayakumar ◽ Glaucio H.S. Carvalho

Keyword(s): Intelligent Transportation System ◽ Search Algorithm ◽ Mac Protocol ◽ High Mobility ◽ Optimal Solution ◽ Cuckoo Search ◽ Cuckoo Search Algorithm ◽ Traffic Density ◽ Scheduling Problem ◽ Broadcast Scheduling

Vehicular ad hoc networks (VANETs) are considered a mainstay of Intelligent Transportation Systems (ITS). For an efficient VANET, broadcasting, i.e. sharing a safety-related message across all vehicles and infrastructure throughout the network, is pivotal. Hence an efficient TDMA-based MAC protocol for VANETs would serve the purpose of broadcast scheduling. At the same time, high mobility, the influence of traffic density, and a changing network topology make it difficult to form an efficient broadcast schedule. In this paper an evolutionary approach has been chosen to solve the broadcast scheduling problem in VANETs. The paper focuses on identifying an optimal solution with minimal TDMA frames and increased transmissions. These two parameters serve as the convergence factors for the evolutionary algorithms employed. The proposed approach uses an Adaptive Discrete Firefly Algorithm (ADFA) to solve the Broadcast Scheduling Problem (BSP). The results are compared with traditional evolutionary approaches such as the Genetic Algorithm and the Cuckoo Search algorithm. A Markov chain analysis is used to derive the probability of obtaining a time slot.
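The two quantities named in the abstract, the number of TDMA slots in a frame and the number of transmissions, can be computed for a candidate schedule with a short sketch. It assumes the interference rule commonly used in broadcast scheduling formulations, namely that nodes within two hops of each other may not share a slot; the abstract does not state this rule. The function names and the four-node topology are hypothetical, and the sketch only scores a given schedule; it does not implement the firefly search itself.

```python
def two_hop_neighbors(adjacency, node):
    """Return every node within two hops of `node`, excluding `node`."""
    one_hop = adjacency[node]
    reachable = set(one_hop)
    for neighbor in one_hop:
        reachable |= adjacency[neighbor]
    reachable.discard(node)
    return reachable


def evaluate_schedule(adjacency, schedule):
    """Score a TDMA schedule given as a list of slots (sets of nodes).

    Returns (frame_length, total_transmissions), or None if the
    schedule is infeasible under the assumed two-hop rule or leaves
    some node without any slot.
    """
    for slot in schedule:
        for node in slot:
            # Assumed constraint: no two nodes within two hops share a slot.
            if slot & two_hop_neighbors(adjacency, node):
                return None
    scheduled = set().union(*schedule)
    if scheduled != set(adjacency):
        return None
    return len(schedule), sum(len(slot) for slot in schedule)


# Hypothetical line topology: 0 - 1 - 2 - 3
line = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
feasible = evaluate_schedule(line, [{0, 3}, {1}, {2}])
infeasible = evaluate_schedule(line, [{0, 2}, {1}, {3}])
```

An optimiser would prefer schedules with a shorter frame and more transmissions. Here nodes 0 and 3 are three hops apart, so they can share a slot, while nodes 0 and 2 cannot.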

Comparison of Meta-heuristic Algorithms on Benchmark Functions

Academic Perspective Procedia ◽ 10.33793/acperpro.02.03.41 ◽ 2019 ◽ Vol 2 (3) ◽ pp. 508-517

Author(s): FerdaNur Arıcı ◽ Ersin Kaya

Keyword(s): Heuristic Algorithms ◽ Optimization Problems ◽ Search Algorithm ◽ Optimization Algorithms ◽ Gravitational Search Algorithm ◽ Differential Evolution Algorithm ◽ Population Based ◽ Time Interval ◽ Bee Colony ◽ Different Characteristics

Optimization is the process of searching for the most suitable solution to a problem within an acceptable time interval. Algorithms that solve optimization problems are called optimization algorithms. In the literature, there are many optimization algorithms with different characteristics. These algorithms can behave differently depending on the size, characteristics and complexity of the optimization problem. In this study, six well-known population-based optimization algorithms were used: artificial algae algorithm (AAA), artificial bee colony algorithm (ABC), differential evolution algorithm (DE), genetic algorithm (GA), gravitational search algorithm (GSA) and particle swarm optimization (PSO). These six algorithms were run on the CEC'17 test functions. The algorithms were then compared and their performance evaluated on the basis of the experimental results.
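To show what running a population-based algorithm on a test function looks like in practice, here is a minimal sketch of one of the six methods, differential evolution, in its common DE/rand/1/bin form. It is applied to the sphere function as a stand-in objective, not to the CEC'17 suite, and the parameter values (`pop_size`, `f`, `cr`, `generations`) are illustrative assumptions rather than the settings used in the study.

```python
import random


def sphere(x):
    """Simple benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def differential_evolution(objective, dim, bounds, pop_size=20,
                           f=0.5, cr=0.9, generations=100, seed=0):
    """Minimise `objective` with DE/rand/1/bin; returns (best_x, best_f)."""
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)]
           for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    value = min(max(value, low), high)  # clip to bounds
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness

    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


best_x, best_f = differential_evolution(sphere, dim=5, bounds=(-100.0, 100.0))
```

A comparison like the one described would repeat such runs for each algorithm and test function with several seeds and then compare the resulting best values.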
