Reinforcement Learning for Improving the Accuracy of PM2.5 Pollution Forecast Under the Neural Network Framework

The paper examines the life cycle of field development, analyzes the processes of the field development design stage for the application of machine learning methods. For each process, relevant problems are highlighted, existing solutions based on machine learning methods, ideas and problems are proposed that could be effectively solved by machine learning methods. For the main part of the processes, examples of solutions are briefly described; the advantages and disadvantages of the approaches are identified. The most common solution method is feed-forward neural networks. Subject to preliminary normalization of the input data, this is the most versatile algorithm for regression and classification problems. However, in the problem of selecting wells for hydraulic fracturing, a whole ensemble of machine learning models was used, where, in addition to a neural network, there was a random forest, gradient boosting and linear regression. For the problem of optimizing the placement of a grid of oil wells, the disadvantages of existing solutions based on a neural network and a simple reinforcement learning approach based on Markov decision-making process are identified. A deep reinforcement learning algorithm called Alpha Zero is proposed, which has previously shown significant results in the role of artificial intelligence for games. This algorithm is a decision tree search that directs the neural network: only those branches that have received the best estimates from the neural network are considered more thoroughly. The paper highlights the similarities between the tasks for which Alpha Zero was previously used, and the task of optimizing the placement of a grid of oil producing wells. Conclusions are made about the possibility of using and modifying the algorithm of the optimization problem being solved. Аn approach is proposed to take into account symmetric states in a Monte Carlo tree to reduce the number of required simulations.

Download Full-text

Analysis of Symbol Generation and Integration in a Unified Model Based on a Neural Network

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2005.p0297 ◽

2005 ◽

Vol 9 (3) ◽

pp. 297-303

Author(s):

Yukari Yamauchi ◽

◽

Shun'ichi Tano ◽

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Simple Model ◽

Unified Model ◽

Symbolic Processing ◽

Advantages And Disadvantages ◽

Intelligent Processing ◽

Knowledge Based ◽

The Neural Network ◽

Trained Neural Network

The computational (numerical information) and symbolic (knowledge-based) processing used in intelligent processing has advantages and disadvantages. A simple model integrating symbols into a neural network was proposed as a first step toward fusing computational and symbolic processing. To verify the effectiveness of this model, we first analyze the trained neural network and generate symbols manually. Then we discuss generation methods that are able to discover effective symbols during training of the neural network. We evaluated these through simulations of reinforcement learning in simple football games. Results indicate that the integration of symbols into the neural network improved the performance of player agents.

Download Full-text

Reinforcement Learning-Based Satellite Attitude Stabilization Method for Non-Cooperative Target Capturing

Sensors ◽

10.3390/s18124331 ◽

2018 ◽

Vol 18 (12) ◽

pp. 4331 ◽

Cited By ~ 3

Author(s):

Zhong Ma ◽

Yuejiao Wang ◽

Yidai Yang ◽

Zhuping Wang ◽

Lei Tang ◽

...

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Network Model ◽

Neural Network Model ◽

Attitude Control ◽

Dynamics Simulation ◽

Continuous Control ◽

Simulation Environment ◽

Control Torque ◽

The Neural Network

When a satellite performs complex tasks such as discarding a payload or capturing a non-cooperative target, it will encounter sudden changes in the attitude and mass parameters, causing unstable flying and rolling of the satellite. In such circumstances, the change of the movement and mass characteristics are unpredictable. Thus, the traditional attitude control methods are unable to stabilize the satellite since they are dependent on the mass parameters of the controlled object. In this paper, we proposed a reinforcement learning method to re-stabilize the attitude of a satellite under such circumstances. Specifically, we discretize the continuous control torque, and build a neural network model that can output the discretized control torque to control the satellite. A dynamics simulation environment of the satellite is built, and the deep Q Network algorithm is then performed to train the neural network in this simulation environment. The reward of the training is the stabilization of the satellite. Simulation experiments illustrate that, with the iteration of training progresses, the neural network model gradually learned to re-stabilize the attitude of a satellite after unknown disturbance. As a contrast, the traditional PD (Proportion Differential) controller was unable to re-stabilize the satellite due to its dependence on the mass parameters. The proposed method adopts self-learning to control satellite attitudes, shows considerable intelligence and certain universality, and has a strong application potential for future intelligent control of satellites performing complex space tasks.

Download Full-text

On the use of neural regulators

Transaction of Scientific Papers of the Novosibirsk State Technical University ◽

10.17212/2307-6879-2021-1-53-63 ◽

2021 ◽

pp. 53-63

Author(s):

Alexsander Voevoda ◽

◽

Dmitry Romannikov ◽

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Control Systems ◽

State Vector ◽

Neural Controller ◽

Discrete Form ◽

Learning Methods ◽

Data Set ◽

The Neural Network ◽

Input Error

The application of neural networks for the synthesis of control systems is considered. Examples of synthesis of control systems using methods of reinforcement learning, in which the state vector is involved, are given. And the synthesis of a neural controller for objects with an inaccessible state vector is discussed: 1) a variant using a neural network with recurrent feedbacks; 2) a variant using the input error vector, where each error (except for the first one) enters the input of the neural network passing through the delay element. The disadvantages of the first method include the fact that for such a structure of a neural network it is not possible to apply existing learning methods with confirmation and for training it is required to use a data set obtained, for example, from a previously calculated linear controller. The structure of the neural network used in the second option allows the application of reinforcement learning methods, but the article provides a statement and its proof that for the synthesis of a control system for objects with three or more integrators, a neural network without recurrent connections cannot be used. The application of the above structures is given on examples of the synthesis of control systems for objects 1/s2 and 1/s3 presented in a discrete form.

Download Full-text

Intelligent scheduling using a neural network model in conjunction with reinforcement learning

Proceedings of the Institution of Mechanical Engineers Part B Journal of Engineering Manufacture ◽

10.1243/095440505x8181 ◽

2005 ◽

Vol 219 (2) ◽

pp. 229-235

Author(s):

C J Fourie

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Reinforcement Learning ◽

Learning From Experience ◽

Scheduling System ◽

Simulated Environment ◽

The Neural Network ◽

Learning Techniques ◽

Proposed Model ◽

Intelligent Scheduling

This paper describes the use of an artificial neural network in conjunction with reinforcement learning techniques to develop an intelligent scheduling system that is capable of learning from experience. In a simulated environment the model controls a mobile robot that transports material to machines. States of ‘happiness’ are defined for each machine, which are the inputs to the neural network. The output of the neural network is the decision on which machine to service next. After every decision, a critic evaluates the decision and a teacher ‘rewards’ the network to encourage good decisions and discourage bad decisions. From the results obtained, it is concluded that the proposed model is capable of learning from past experience and thereby improving the intelligence of the system.

Download Full-text

A Parametric Study of a Deep Reinforcement Learning Control System Applied to the Swing-Up Problem of the Cart-Pole

Applied Sciences ◽

10.3390/app10249013 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9013

Author(s):

Camilo Andrés Manrique Escobar ◽

Carmine Maria Pappalardo ◽

Domenico Guida

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Dry Friction ◽

Network Architecture ◽

Control Policy ◽

Neural Network Architecture ◽

Learning Capability ◽

Friction Forces ◽

The Neural Network ◽

Pole System

In this investigation, the nonlinear swing-up problem associated with the cart-pole system modeled as a multibody dynamical system is solved by developing a deep Reinforcement Learning (RL) controller. Furthermore, the sensitivity analysis of the deep RL controller applied to the cart-pole swing-up problem is carried out. To this end, the influence of modifying the physical properties of the system and the presence of dry friction forces are analyzed employing the cumulative reward during the task. Extreme limits for the modifications of the parameters are determined to prove that the neural network architecture employed in this work features enough learning capability to handle the task under modifications as high as 90% on the pendulum mass, as well as a 100% increment on the cart mass. As expected, the presence of dry friction greatly affects the performance of the controller. However, a post-training of the agent in the modified environment takes only thirty-nine episodes to find the optimal control policy, resulting in a promising path for further developments of robust controllers.

Download Full-text

Autonomous guidewire navigation in a two dimensional vascular phantom

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-0007 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Lennart Karstensen ◽

Tobias Behr ◽

Tim Philipp Pusch ◽

Franziska Mathis-Ullrich ◽

Jan Stallkamp

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Two Dimensional ◽

Level Control ◽

Control Task ◽

Vascular Tree ◽

The Neural Network ◽

Experience Replay ◽

Catheter Navigation ◽

High Level

AbstractThe treatment of cerebro- and cardiovascular diseases requires complex and challenging navigation of a catheter. Previous attempts to automate catheter navigation lack the ability to be generalizable. Methods of Deep Reinforcement Learning show promising results and may be the key to automate catheter navigation through the tortuous vascular tree. This work investigates Deep Reinforcement Learning for guidewire manipulation in a complex and rigid vascular model in 2D. The neural network trained by Deep Deterministic Policy Gradients with Hindsight Experience Replay performs well on the low-level control task, however the high-level control of the path planning must be improved further.

Download Full-text

Learning Unmanned Aerial Vehicle Control for Autonomous Target Following

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/685 ◽

2018 ◽

Cited By ~ 2

Author(s):

Siyi Li ◽

Tianbo Liu ◽

Chi Zhang ◽

Dit-Yan Yeung ◽

Shaojie Shen

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Unmanned Aerial Vehicle ◽

Learning Process ◽

Real World ◽

Model Free ◽

The Neural Network ◽

Policy Gradient ◽

Aerial Vehicle ◽

Efficient Learning

While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process. However, real-world robotic applications often need a data-efficient learning process with safety-critical constraints. In this paper, we consider the challenging problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire a strategy that combines perception and control, we represent the policy by a convolutional neural network. We develop a hierarchical approach that combines a model-free policy gradient method with a conventional feedback proportional-integral-derivative (PID) controller to enable stable learning without catastrophic failure. The neural network is trained by a combination of supervised learning from raw images and reinforcement learning from games of self-play. We show that the proposed approach can learn a target following policy in a simulator efficiently and the learned behavior can be successfully transferred to the DJI quadrotor platform for real-world UAV control.

Download Full-text

Utilizing reinforcement learning and deep neural networks to optimize non-pharmaceutical COVID-19 interventions in Florida

Journal of Student Research ◽

10.47611/jsrhs.v10i3.1802 ◽

2021 ◽

Vol 10 (3) ◽

Author(s):

Megan Yang ◽

Leya Joykutty

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Reinforcement Learning ◽

Deep Neural Networks ◽

The Neural Network ◽

Long Term Effect ◽

Training Samples ◽

Economy And Education

Under the umbrella of artificial intelligence is machine learning that allows a system to improve through experience without any explicit programs telling it to. It is able to find patterns in massive amounts of data from works, images, numbers, to statistics. One approach to machine learning is neural networks in which the computer learns to finish a task by analyzing training samples. Another approach used in this study is reinforcement learning which manipulates it environment to discover errors and rewards. This study aimed developed a deep neural network and used reinforcement learning to develop a system that was able to predict whether the cases will increase or decrease, then using that information, was able to predict which actions would most effectively cause a decline in cases while keeping things like economy and education in mind for a better long term effect. These models were made based on Florida using eight different counties’ data including things like mobility, temperature, dates of government actions, etc. Based on this information, data exploration and feature engineering was conducted to add dimensions that would further the accuracy of the neural network. The reinforcement learning model’s actions consisted of first, a shutdown for about two months before reopening schools and allowing things to return to normal. Then interestingly the model decided to keep school operating in a hybrid model with some students going back to school while others continue to study remotely.

Download Full-text

The Eminence Grise

How the Brain Makes Decisions ◽

10.1093/oso/9780198824367.003.0008 ◽

2020 ◽

pp. 61-69

Author(s):

Thomas Boraud

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Basal Ganglia ◽

Social Behaviour ◽

Dynamic Properties ◽

Leading Role ◽

Cortical Level ◽

The Neural Network ◽

Main Input ◽

Controlling Behaviour

This chapter demonstrates how the cortex communicates with the basal ganglia and the thalamus progressively during the evolution of vertebrates. The telencephalic loop is involved in social behaviour and cognitive processes. In mammals, the telencephalic loop, in which the cortex now replaces the pallium, takes a leading role in controlling behaviour. Thus, the telencephalon, a very minor input of the basal ganglia in anamniots, gradually becomes the main input as evolution progresses. Ultimately, the resulting neural network possesses the same dynamic properties as those described in Chapter 5. The neural network is also able to perform reinforcement learning through the subcortical loop, and also to automatize some skills at the cortical level.

Download Full-text