Optimization of Multilayer Optical Films with Unsupervised Learning, Reinforcement Learning, and a Genetic Algorithm

Author(s):  
Jiang Anqing ◽  
Osamu Yoshie


Author(s):
Darryl Charles ◽  
Colin Fyfe ◽  
Daniel Livingstone ◽  
Stephen McGlinchey

Just as there are many different types of supervised and unsupervised learning, so there are many different types of reinforcement learning. Reinforcement learning is appropriate for an AI or agent that is actively exploring its environment while also learning which actions are best to take in different situations. Reinforcement learning is so called because, when an AI performs a beneficial action, it receives some reward which reinforces its tendency to perform that beneficial action again. An excellent overview of reinforcement learning (on which this brief chapter is based) is given by Sutton and Barto (1998).
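
This reward-driven reinforcement can be made concrete with a minimal tabular Q-learning sketch in the spirit of Sutton and Barto (1998); the environment interface (`reset`, `step`, `actions`) and the parameter values are illustrative assumptions, not part of the chapter:

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning: a reward reinforces the tendency to repeat the
    action just taken by raising its estimated value Q(state, action)."""
    q = defaultdict(float)  # maps (state, action) -> estimated payoff
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore occasionally; otherwise exploit the best-known action.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # The received reward nudges Q(state, action) upward (or downward),
            # reinforcing beneficial actions for future visits to this state.
            best_next = max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q
```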


Author(s):  
Ahmad Roihan ◽  
Po Abas Sunarya ◽  
Ageng Setiani Rafika

Abstract - Machine learning is a branch of artificial intelligence that is widely used to solve a variety of problems. This article reviews problem solving in recent studies by classifying machine learning into three categories: supervised learning, unsupervised learning, and reinforcement learning. The review shows that all three categories remain applicable to recent cases and can be improved to reduce computational cost and accelerate performance while achieving high levels of accuracy and precision. This review aims to identify research gaps and to serve as a guideline for future research.

Keywords: machine learning, reinforcement learning, supervised learning, unsupervised learning


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 471
Author(s):  
Jai Hoon Park ◽  
Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure as an inner optimization to train its behavior and evaluate its potential learning ability. Compared with evolving both structure and behavior simultaneously, evolving only the robotic structure and optimizing behavior with a separate training algorithm reduces the size of the design space significantly. Mutual dependence between evolution and learning is achieved by taking the mean cumulative reward of a candidate structure in reinforcement learning as its fitness in the genetic algorithm. Our method therefore searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were generated automatically in experiments with an actual modular robotics kit.
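
The coupling described here, a genetic-algorithm outer loop whose fitness signal is the reward earned by an inner reinforcement-learning run, can be sketched as follows. The genome encoding (a list of module IDs), the truncation selection, and the dummy `evaluate` score are assumptions for illustration; in the paper's setting, `evaluate` would train a controller with RL and return its mean cumulative reward:

```python
import random

def evaluate(structure):
    """Stand-in for the inner optimization: train the structure's controller
    with RL and return its mean cumulative reward. A dummy score is used here
    so the sketch runs end to end."""
    return sum(structure) / len(structure)  # hypothetical placeholder fitness

def crossover(a, b):
    cut = random.randrange(1, len(a))       # one-point crossover
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1, n_modules=8):
    return [random.randrange(n_modules) if random.random() < rate else g
            for g in genome]

def evolve(pop_size=20, genome_len=10, generations=50):
    """Outer GA loop over modular robot structures, encoded as module-ID lists."""
    population = [[random.randrange(8) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[:pop_size // 2]     # keep the fitter half (truncation)
        population = parents + [mutate(crossover(random.choice(parents),
                                                 random.choice(parents)))
                                for _ in range(pop_size - len(parents))]
    return max(population, key=evaluate)
```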


Minerals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 587
Author(s):  
Joao Pedro de Carvalho ◽  
Roussos Dimitrakopoulos

This paper presents a new truck dispatching policy approach that adapts to different mining complex configurations in order to deliver the supply material extracted by the shovels to the processors. The method aims to improve adherence to the operational plan and fleet utilization in a mining complex context. Several sources of operational uncertainty arising from the loading, hauling and dumping activities can influence the dispatching strategy. Given a fixed extraction sequence of the mining blocks provided by the short-term plan, a discrete event simulator emulates the interactions arising from these mining operations. Repeatedly running this simulator, together with a reward function that assigns a score to each dispatching decision, generates sample experiences used to train a deep Q-learning reinforcement learning model. The model learns from past dispatching experience, so that when a new task arises, a well-informed decision can be taken quickly. The approach is tested at a copper–gold mining complex, characterized by uncertainties in equipment performance and geological attributes, and the results show improvements in terms of production targets, metal production, and fleet management.
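
A condensed sketch of the training loop the abstract describes: a discrete-event simulator produces (state, action, reward, next state) experiences that train a deep Q-network over dispatch destinations. The state dimension, network size, and simulator interface are assumptions for illustration, not the paper's actual configuration:

```python
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_DESTINATIONS = 32, 6   # assumed state encoding and action count

# Q-network: maps the mining-complex state to one value per dispatch option.
q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                      nn.Linear(128, N_DESTINATIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Filled by running the discrete-event simulator: each entry is
# (state, chosen destination index, reward-function score, next state).
replay = deque(maxlen=100_000)

def train_step(batch_size=64, gamma=0.99):
    """One deep Q-learning update from simulator-generated experiences."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s = torch.stack([b[0] for b in batch])
    a = torch.tensor([b[1] for b in batch])              # action indices
    r = torch.tensor([b[2] for b in batch])              # reward scores
    s2 = torch.stack([b[3] for b in batch])
    with torch.no_grad():                                # bootstrapped target
        target = r + gamma * q_net(s2).max(dim=1).values
    pred = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```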


2021 ◽  
Vol 01 ◽  
Author(s):  
Ying Li ◽  
Chubing Guo ◽  
Jianshe Wu ◽  
Xin Zhang ◽  
Jian Gao ◽  
...  

Background: Unmanned systems have been widely used in multiple fields, and many algorithms have been proposed to solve path planning problems. Each algorithm has its own advantages and defects and cannot adapt to every kind of requirement, so an appropriate path planning method is needed for each application. Objective: To select an appropriate algorithm quickly for a given application, which could improve the efficiency of path planning for unmanned systems. Methods: This paper proposes representing and quantifying the features of algorithms based on the physical indicators of their results. At the same time, an algorithmic collaborative scheme is developed to search for the appropriate algorithm according to the requirements of the application. As an illustration of the scheme, four algorithms, including the A-star (A*) algorithm, reinforcement learning, a genetic algorithm, and an ant colony optimization algorithm, are implemented and represented by their features. Results: In different simulations, the algorithmic collaborative scheme selects an appropriate algorithm for a given application based on the representation of the algorithms, and the selected algorithm plans a feasible and effective path. Conclusion: An algorithmic collaborative scheme is proposed, based on the representation of algorithms and the requirements of the application. The simulation results demonstrate the feasibility of the scheme and of the representation of algorithms.
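
One plausible reading of such a selection scheme: represent each algorithm by a vector of normalized physical indicators of its typical results, express the application's requirement as weights over the same indicators, and pick the best-scoring match. The indicator names and all numeric values below are made-up illustrations, not the paper's measurements:

```python
# Hypothetical feature vectors (higher is better on each normalized indicator).
algorithms = {
    "A*":           {"path_length": 0.9, "runtime": 0.8, "smoothness": 0.5},
    "reinforcement": {"path_length": 0.7, "runtime": 0.3, "smoothness": 0.7},
    "genetic":      {"path_length": 0.8, "runtime": 0.4, "smoothness": 0.6},
    "ant_colony":   {"path_length": 0.8, "runtime": 0.5, "smoothness": 0.8},
}

def select_algorithm(requirement):
    """Pick the algorithm whose feature vector best matches the application's
    requirement, expressed as weights over the same physical indicators."""
    def score(features):
        return sum(requirement[k] * features[k] for k in requirement)
    return max(algorithms, key=lambda name: score(algorithms[name]))

# Example: an application that prizes fast planning over path optimality.
print(select_algorithm({"path_length": 0.2, "runtime": 0.6, "smoothness": 0.2}))
```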


1995 ◽  
Vol 3 (2) ◽  
pp. 149-175 ◽  
Author(s):  
Stewart W. Wilson

In many classifier systems, the classifier strength parameter serves as a predictor of future payoff and as the classifier's fitness for the genetic algorithm. We investigate a classifier system, XCS, in which each classifier maintains a prediction of expected payoff, but the classifier's fitness is given by a measure of the prediction's accuracy. The system executes the genetic algorithm in niches defined by the match sets, instead of panmictically. These aspects of XCS result in its population tending to form a complete and accurate mapping X × A → P from inputs and actions to payoff predictions. Further, XCS tends to evolve classifiers that are maximally general, subject to an accuracy criterion. Besides introducing a new direction for classifier system research, these properties of XCS make it suitable for a wide range of reinforcement learning situations where generalization over states is desirable.
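
The key departure from strength-based systems, fitness derived from the accuracy of the payoff prediction rather than from the prediction itself, can be sketched with the standard XCS update rules. Parameter names and defaults follow common XCS conventions; numerosity weighting and the experience-based learning-rate schedule are omitted from this simplified sketch:

```python
def update_action_set(action_set, payoff, beta=0.2, eps0=0.01,
                      alpha=0.1, nu=5.0):
    """Update prediction, error, and accuracy-based fitness for every
    classifier in the action set after receiving payoff."""
    for cl in action_set:
        # Widrow-Hoff updates: error first (using the old prediction).
        cl["error"] += beta * (abs(payoff - cl["prediction"]) - cl["error"])
        cl["prediction"] += beta * (payoff - cl["prediction"])
    # Accuracy: 1 inside the error tolerance eps0, decaying power law outside.
    kappas = [1.0 if cl["error"] < eps0
              else alpha * (cl["error"] / eps0) ** -nu
              for cl in action_set]
    total = sum(kappas)
    # Fitness tracks accuracy *relative* to the rest of the action set, so an
    # accurate predictor of low payoff can still prosper in the GA.
    for cl, kappa in zip(action_set, kappas):
        cl["fitness"] += beta * (kappa / total - cl["fitness"])
```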


Sensors ◽  
2019 ◽  
Vol 19 (7) ◽  
pp. 1576 ◽  
Author(s):  
Xiaomao Zhou ◽  
Tao Bai ◽  
Yanbin Gao ◽  
Yuntao Han

Extensive studies have shown that many animals’ capability of forming spatial representations for self-localization, path planning, and navigation relies on the functionalities of place and head-direction (HD) cells in the hippocampus. Although there are numerous hippocampal modeling approaches, only a few span the wide range of functionalities from processing raw sensory signals to planning and action generation. This paper presents a vision-based navigation system that involves generating place and HD cells through learning from visual images, building topological maps based on the learned cell representations, and performing navigation using hierarchical reinforcement learning. First, place and HD cells are trained from sequences of visual stimuli in an unsupervised learning fashion. A modified Slow Feature Analysis (SFA) algorithm is proposed to learn the different cell types in an intentional way by restricting their learning to separate phases of the spatial exploration. Then, to extract the encoded metric information from these unsupervised learning representations, a self-organized learning algorithm is adopted to learn over the emergent cell activities and to generate topological maps that reveal the topology of the environment and information about a robot’s head direction, respectively. This enables the robot to perform self-localization and orientation detection based on the generated maps. Finally, goal-directed navigation is performed using reinforcement learning in continuous state spaces, which are represented by the population activities of place cells. In particular, considering that the topological map provides a natural hierarchical representation of the environment, hierarchical reinforcement learning (HRL) is used to exploit this hierarchy to accelerate learning. The HRL works on different spatial scales, where a high-level policy learns to select subgoals and a low-level policy learns over primitive actions to specialize in the selected subgoals. Experimental results demonstrate that our system is able to navigate a robot to the desired position effectively, and the HRL shows a much better learning performance than the standard RL in solving our navigation tasks.
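
The slowness objective at the heart of SFA, finding projections of the input whose outputs vary most slowly over time, has a closed-form linear solution via two eigendecompositions. The sketch below shows plain linear SFA as a simplified stand-in for the paper's modified version (which additionally restricts learning to separate exploration phases); the toy signal at the end is an assumed example, not the paper's visual data:

```python
import numpy as np

def linear_sfa(x, n_components=2):
    """Linear Slow Feature Analysis: whiten the signal, then keep the
    directions in which the temporal derivative has the least variance."""
    x = x - x.mean(axis=0)                     # center the signal
    # Whiten with PCA so every direction has unit variance.
    cov = np.cov(x, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    whitener = eigvec / np.sqrt(eigval)
    z = x @ whitener
    # Slowness objective: minimize the variance of the time difference.
    dz = np.diff(z, axis=0)
    dcov = np.cov(dz, rowvar=False)
    dval, dvec = np.linalg.eigh(dcov)          # eigenvalues ascend
    w = dvec[:, :n_components]                 # slowest directions first
    return z @ w, whitener @ w                 # outputs, combined weights

# Example: recover a slow latent variable buried in noisy mixtures.
t = np.linspace(0, 8 * np.pi, 2000)
slow = np.sin(0.25 * t)
x = np.column_stack([slow + 0.1 * np.random.randn(t.size),
                     np.random.randn(t.size),
                     slow ** 2 + 0.1 * np.random.randn(t.size)])
features, weights = linear_sfa(x, n_components=1)
```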

