An Algorithm of Reinforcement Learning for Maneuvering Parameter Self-Tuning Applying in Satellite Cluster

Satellite cluster is a type of artificial cluster, which is attracting wide attention at present. Although the traditional empirical parameter method (TEPM) has the potential to deal with the mission of satellite flocking, it is difficult to select the proper parameters. In order to improve the flight effect in the problem of satellite cluster, as well as to make the selection of flight parameters more reasonable, the traditional sensing zones are improved. A 3σ position error ellipsoid and an induction ellipsoid are applied for substituting the traditional repulsing zone and attracting zone, respectively. Besides, we propose an algorithm of reinforcement learning for parameter self-tuning (RLPST), which is based on the actor-critic framework, to automatically learn the suitable flight parameters. To obtain the parameters in the repulsing zone, orientating zone, and attracting zone of each member in the cluster, a three-channel learning framework is designed. The learning process makes the framework finally find the suitable parameters. Numerical experimental results have shown the superiorities compared to the traditional method, which include trajectory deviation and sensing rate or terminal matching rate, as well as the improvement of the flight paths under the learning framework.

Download Full-text

An Introduction to Intertask Transfer for Reinforcement Learning

AI Magazine ◽

10.1609/aimag.v32i1.2329 ◽

2011 ◽

Vol 32 (1) ◽

pp. 15 ◽

Cited By ~ 18

Author(s):

Matthew E. Taylor ◽

Peter Stone

Keyword(s):

Reinforcement Learning ◽

Transfer Learning ◽

Learning Problem ◽

Open Problems ◽

Learning Framework ◽

Learning Domains ◽

Multiple Tasks ◽

Exciting Area ◽

Generalize Information ◽

Selection Of

Transfer learning has recently gained popularity due to the development of algorithms that can successfully generalize information across multiple tasks. This article focuses on transfer in the context of reinforcement learning domains, a general learning framework where an agent acts in an environment to maximize a reward signal. The goals of this article are to (1) familiarize readers with the transfer learning problem in reinforcement learning domains, (2) explain why the problem is both interesting and difﬁcult, (3) present a selection of existing techniques that demonstrate different solutions, and (4) provide representative open problems in the hope of encouraging additional research in this exciting area.

Download Full-text

Competition of Verb Aspects and Its Resolution as a Creative Act (Based on the Material of Aspects Variants in the Texts by F. M. Dostoevsky)

Russian language at school ◽

10.30515/0131-6141-2020-81-2-56-63 ◽

2020 ◽

Vol 81 (2) ◽

pp. 56-63

Author(s):

S. A. Karpukhin

Keyword(s):

Traditional Method ◽

Contextual Analysis ◽

Verb Aspect ◽

Component Theory ◽

Two Component ◽

Artistic Interpretation ◽

New Perspective ◽

First Time ◽

Selection Of ◽

Aspectual Meaning

The article considers the competition of verbal aspects from a new perspective. Instead of employing the traditional method of demonstrating this phenomenon — an empirical replacement of the aspect of a verb in a phrase with the opposite — the author examines Dostoevsky’s choice between the variants found in different manuscripts of the same text. For the first time, based on a two-component theory of the semantic invariant of a verb type, the aspectual meaning of the selection of a verb aspect is revealed and, as a result of contextual analysis, an artistic interpretation of the selected type is proposed.

Download Full-text

Self-Tuning Two Degree-of-Freedom Proportional–Integral Control System Based on Reinforcement Learning for a Multiple-Input Multiple-Output Industrial Process That Suffers from Spatial Input Coupling

Processes ◽

10.3390/pr9030487 ◽

2021 ◽

Vol 9 (3) ◽

pp. 487

Author(s):

Fumitake Fujii ◽

Akinori Kaneishi ◽

Takafumi Nii ◽

Ryu’ichiro Maenishi ◽

Soma Tanaka

Keyword(s):

Control System ◽

Reinforcement Learning ◽

Pid Control ◽

Multiple Input Multiple Output ◽

Industrial Process ◽

Pi Controller ◽

Multiple Input ◽

Input Multiple Output ◽

Self Tuning ◽

Two Degree Of Freedom

Proportional–integral–derivative (PID) control remains the primary choice for industrial process control problems. However, owing to the increased complexity and precision requirement of current industrial processes, a conventional PID controller may provide only unsatisfactory performance, or the determination of PID gains may become quite difficult. To address these issues, studies have suggested the use of reinforcement learning in combination with PID control laws. The present study aims to extend this idea to the control of a multiple-input multiple-output (MIMO) process that suffers from both physical coupling between inputs and a long input/output lag. We specifically target a thin film production process as an example of such a MIMO process and propose a self-tuning two-degree-of-freedom PI controller for the film thickness control problem. Theoretically, the self-tuning functionality of the proposed control system is based on the actor-critic reinforcement learning algorithm. We also propose a method to compensate for the input coupling. Numerical simulations are conducted under several likely scenarios to demonstrate the enhanced control performance relative to that of a conventional static gain PI controller.

Download Full-text

Hierarchical Reinforcement Learning Framework for Secure UAV Communication in the Presence of Multiple UAV Adaptive Eavesdroppers

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344970 ◽

2020 ◽

Author(s):

Liu Jue ◽

Yang Weiwei

Keyword(s):

Reinforcement Learning ◽

Hierarchical Reinforcement Learning ◽

Learning Framework

Download Full-text

A Reinforcement Learning Framework for Spiking Networks with Dynamic Synapses

Computational Intelligence and Neuroscience ◽

10.1155/2011/869348 ◽

2011 ◽

Vol 2011 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Karim El-Laithy ◽

Martin Bogdan

Keyword(s):

Reinforcement Learning ◽

Spike Timing ◽

Neural Representation ◽

Model Parameters ◽

Learning Framework ◽

Reference Target ◽

Wide Range ◽

Spiking Network ◽

Dynamic Synapses ◽

Exclusive Or

An integration of both the Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested to learn the exclusive-OR computation on a temporally coded basis. Reward values are calculated with the distance between the output spike train of the network and a reference target one. Results show that the network is able to capture the required dynamics and that the proposed framework can reveal indeed an integrated version of Hebbian and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the used neural representation. This generality, along with the reported results, supports adopting the introduced approach to benefit from the biologically plausible synaptic models in a wide range of intuitive signal processing.

Download Full-text

Adaptive Reinforcement Learning Framework for NOMA-UAV Networks

IEEE Communications Letters ◽

10.1109/lcomm.2021.3093385 ◽

2021 ◽

pp. 1-1

Author(s):

Syed Khurram Mahmud ◽

Yuanwei Liu ◽

Yue Chen ◽

Kok Keong Chai

Keyword(s):

Reinforcement Learning ◽

Learning Framework

Download Full-text

A Dual-Critic Reinforcement Learning Framework for Frame-Level Bit Allocation in HEVC/H.265

2021 Data Compression Conference (DCC) ◽

10.1109/dcc50243.2021.00009 ◽

2021 ◽

Author(s):

Yung-Han Ho ◽

Guo-Lun Jin ◽

Yun Liang ◽

Wen-Hsiao Peng ◽

Xiaobo Li

Keyword(s):

Reinforcement Learning ◽

Bit Allocation ◽

Learning Framework

Download Full-text

A novel reinforcement learning framework for sensor subset selection

2010 International Conference on Networking, Sensing and Control (ICNSC) ◽

10.1109/icnsc.2010.5461532 ◽

2010 ◽

Cited By ~ 3

Author(s):

Omkar Tilak ◽

Snehasis Mukhopadhyay ◽

Mihran Tuceryan ◽

Rajeev Raje

Keyword(s):

Reinforcement Learning ◽

Subset Selection ◽

Learning Framework

Download Full-text

A Reinforcement Learning Framework for Relevance Feedback

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval ◽

10.1145/3397271.3401099 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ali Montazeralghaem ◽

Hamed Zamani ◽

James Allan

Keyword(s):

Reinforcement Learning ◽

Relevance Feedback ◽

Learning Framework

Download Full-text

Predicting Human Mobility with Reinforcement-Learning-Based Long-Term Periodicity Modeling

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3469860 ◽

2021 ◽

Vol 12 (6) ◽

pp. 1-23

Author(s):

Shuo Tao ◽

Jingang Jiang ◽

Defu Lian ◽

Kai Zheng ◽

Enhong Chen

Keyword(s):

Reinforcement Learning ◽

Human Mobility ◽

Recurrent Network ◽

Mobility Prediction ◽

Learning Framework ◽

Temporal Features ◽

Wide Range ◽

Spatio Temporal ◽

Historical Trajectory

Mobility prediction plays an important role in a wide range of location-based applications and services. However, there are three problems in the existing literature: (1) explicit high-order interactions of spatio-temporal features are not systemically modeled; (2) most existing algorithms place attention mechanisms on top of recurrent network, so they can not allow for full parallelism and are inferior to self-attention for capturing long-range dependence; (3) most literature does not make good use of long-term historical information and do not effectively model the long-term periodicity of users. To this end, we propose MoveNet and RLMoveNet. MoveNet is a self-attention-based sequential model, predicting each user’s next destination based on her most recent visits and historical trajectory. MoveNet first introduces a cross-based learning framework for modeling feature interactions. With self-attention on both the most recent visits and historical trajectory, MoveNet can use an attention mechanism to capture the user’s long-term regularity in a more efficient way. Based on MoveNet, to model long-term periodicity more effectively, we add the reinforcement learning layer and named RLMoveNet. RLMoveNet regards the human mobility prediction as a reinforcement learning problem, using the reinforcement learning layer as the regularization part to drive the model to pay attention to the behavior with periodic actions, which can help us make the algorithm more effective. We evaluate both of them with three real-world mobility datasets. MoveNet outperforms the state-of-the-art mobility predictor by around 10% in terms of accuracy, and simultaneously achieves faster convergence and over 4x training speedup. Moreover, RLMoveNet achieves higher prediction accuracy than MoveNet, which proves that modeling periodicity explicitly from the perspective of reinforcement learning is more effective.

Download Full-text