Reinforcement Learning Based Network Selection for Hybrid VLC and RF Systems

2018 ◽  
Vol 173 ◽  
pp. 03014 ◽  
Author(s):  
Chunxi Wang ◽  
Guofeng Wu ◽  
Zhiyong Du ◽  
Bin Jiang

For hybrid indoor network scenarios with LTE, WLAN, and Visible Light Communication (VLC), selecting networks intelligently based on user service requirements is essential for ensuring a high user quality of experience. To tackle the challenges posed by dynamic environments and complicated service requirements, we propose a reinforcement learning solution for indoor network selection. In particular, a transfer-learning-based network selection algorithm, i.e., reinforcement learning with knowledge transfer, is proposed by revealing and exploiting context information about the features of traffic, networks, and network load distribution. Simulations show that the proposed algorithm has an efficient online learning ability and achieves much better performance with faster convergence than the traditional reinforcement learning algorithm.
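The knowledge-transfer idea can be illustrated with a minimal sketch: stateless Q-learning over the three candidate networks, once from a cold start and once with a Q-table seeded from transferred context knowledge. All reward values below are hypothetical stand-ins, not the paper's model.

```python
import random

NETWORKS = ["LTE", "WLAN", "VLC"]
# Hypothetical mean QoE rewards per network; illustrative only.
MEAN_REWARD = {"LTE": 0.5, "WLAN": 0.7, "VLC": 0.9}

def run_q_learning(q_init, episodes=500, alpha=0.1, eps=0.1, seed=0):
    """Stateless (bandit-style) Q-learning over the network choice."""
    rng = random.Random(seed)
    q = dict(q_init)
    for _ in range(episodes):
        if rng.random() < eps:                   # explore
            a = rng.choice(NETWORKS)
        else:                                    # exploit current estimates
            a = max(q, key=q.get)
        r = MEAN_REWARD[a] + rng.gauss(0, 0.05)  # noisy QoE reward
        q[a] += alpha * (r - q[a])               # Q-learning update
    return q

cold = run_q_learning({n: 0.0 for n in NETWORKS})
# Knowledge transfer: seed the table with context-derived prior estimates,
# so the learner exploits the best network almost immediately.
warm = run_q_learning({"LTE": 0.4, "WLAN": 0.6, "VLC": 0.8})
```

The warm-started table begins near the true values, which is the mechanism behind the faster convergence the abstract reports.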

Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 471
Author(s):  
Jai Hoon Park ◽  
Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. The size of the design space is reduced significantly compared to evolving both structure and behavior simultaneously, because only the robotic structure is evolved while behavioral optimization is performed by a separate training algorithm. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward a candidate structure obtains in reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with an actual modular robotics kit.
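The outer/inner structure can be sketched as follows. The `train_and_evaluate` stub stands in for a full RL training run and simply scores a hypothetical target module pattern, so every number and name here is illustrative, not from the paper.

```python
import random

rng = random.Random(42)

def train_and_evaluate(genome):
    """Inner optimization stub: 'train' the candidate structure and return
    its mean cumulative reward. Here a stand-in that rewards a hypothetical
    target module pattern; a real system would run an RL algorithm."""
    target = [1, 0, 1, 1, 0, 1]
    return sum(g == t for g, t in zip(genome, target)) + rng.gauss(0, 0.1)

def evolve(pop_size=20, genome_len=6, generations=30, mut_rate=0.1):
    """Outer optimization: genetic algorithm over module-presence genomes."""
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=train_and_evaluate, reverse=True)
        parents = scored[: pop_size // 2]           # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g ^ (rng.random() < mut_rate) for g in child]  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=train_and_evaluate)

best = evolve()
```

Using the (noisy) mean reward as fitness is what couples the two loops: structures that learn well under the inner algorithm survive the outer selection.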


2021 ◽  
Author(s):  
Asghar Mohammadian ◽  
Houman Zarrabi ◽  
Sam Jabbehdari ◽  
Amir Masoud Rahmani

Abstract In this paper, using edge processing in an IoT network, we introduce a method for deciding on task offloading to edge devices and for improving the energy management of the network of edge devices. First, the utility-maximization problem is defined and decomposed based on the status of task offloading from end devices to a smart gateway with the lowest battery consumption and the lowest possible use of communication bandwidth; independent optimal models are obtained and then recombined, yielding the general problem of maximizing utility and increasing the lifetime of the end devices under processing-time and energy-resource constraints. Because the environment, the end devices, and the network edge are not fully known, an iterative reinforcement learning algorithm is used to generate the optimal answer that maximizes the utility gain. The results show that processing overhead and network load grow as the number of devices increases. The proposed method, while improving energy consumption when only a few devices are present at the edge, reduces latency, increases processing speed, and maximizes system performance compared with the central cloud system. The operating efficiency of the whole system is improved by 36%, and the energy consumption at the edge of the network is optimized by 12.5%. It should be noted that, as a large number of end devices are added, our method outperforms similar works.
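A toy version of the offloading decision can be written as a contextual Q-learning loop. The cost constants below (local vs. edge processing plus a fixed transmission overhead) are assumptions for the sketch, not values from the paper.

```python
import random

rng = random.Random(1)

# Hypothetical per-unit costs (combined energy + latency); illustrative only.
LOCAL_COST_PER_UNIT = 1.0    # slow CPU, high battery drain on the end device
EDGE_COST_PER_UNIT = 0.3     # faster processing on the smart gateway
OFFLOAD_OVERHEAD = 2.0       # fixed communication cost of shipping the task

STATES = [1, 2, 3, 4, 5]     # task sizes in abstract units
ACTIONS = ["local", "offload"]

def cost(size, action):
    if action == "local":
        return LOCAL_COST_PER_UNIT * size
    return OFFLOAD_OVERHEAD + EDGE_COST_PER_UNIT * size

q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
for _ in range(5000):
    s = rng.choice(STATES)                       # a task arrives
    a = (rng.choice(ACTIONS) if rng.random() < 0.2
         else max(ACTIONS, key=lambda x: q[(s, x)]))
    r = -cost(s, a) + rng.gauss(0, 0.05)         # utility = negative cost
    q[(s, a)] += 0.1 * (r - q[(s, a)])           # Q-learning update

policy = {s: max(ACTIONS, key=lambda x: q[(s, x)]) for s in STATES}
```

With these numbers the break-even size is 2 / 0.7 ≈ 2.9, so the learned policy keeps small tasks local and offloads large ones, mirroring the utility/energy trade-off the abstract describes.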


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Wei Li ◽  
Kai Xian ◽  
Jiateng Yin ◽  
Dewang Chen

Train station parking (TSP) accuracy is important for enhancing the efficiency of train operation and the safety of passengers in urban rail transit. However, TSP is always subject to a series of uncertain factors such as extreme weather and uncertain rail track resistance. To increase parking accuracy, robustness, and self-learning ability, we propose new train station parking frameworks that use reinforcement learning (RL) theory combined with information from balises. Three algorithms were developed to reduce the parking error in urban rail transit: a stochastic optimal selection algorithm (SOSA), a Q-learning algorithm (QLA), and a fuzzy-function-based Q-learning algorithm (FQLA). Meanwhile, five braking rates are adopted as the action vector of the three algorithms, and several statistical indices are developed to evaluate parking errors. Simulation results based on real-world data show that the parking errors of the three algorithms are all within ±30 cm, which meets the requirements of urban rail transit.
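The QLA variant can be sketched as tabular Q-learning whose actions are the five braking rates. All numbers below (brake rates, entry speeds, balise-to-stop distance, disturbance magnitude) are assumptions for illustration, not the paper's data.

```python
import random

rng = random.Random(7)

BRAKE_RATES = [0.6, 0.8, 1.0, 1.2, 1.4]   # candidate decelerations (m/s^2)
SPEEDS = [8.0, 10.0, 12.0]                # entry speed at the last balise (m/s)
TARGET = 50.0                             # balise-to-stop-point distance (m)

def parking_error(v0, brake):
    """Stopping distance under constant deceleration, with a random
    disturbance standing in for weather and track-resistance uncertainty."""
    resist = rng.gauss(0, 0.05)           # uncertain extra deceleration
    return v0 ** 2 / (2 * (brake + resist)) - TARGET

q = {(v, b): 0.0 for v in SPEEDS for b in BRAKE_RATES}
for _ in range(20000):
    v = rng.choice(SPEEDS)                # observed speed at the balise
    b = (rng.choice(BRAKE_RATES) if rng.random() < 0.1
         else max(BRAKE_RATES, key=lambda x: q[(v, x)]))
    r = -abs(parking_error(v, b))         # reward small parking errors
    q[(v, b)] += 0.05 * (r - q[(v, b)])   # Q-learning update

policy = {v: max(BRAKE_RATES, key=lambda x: q[(v, x)]) for v in SPEEDS}
```

The exact rate for each speed would be v² / (2 · 50), so the learner should pick the nearest discrete rate (0.6, 1.0, and 1.4 respectively) despite the disturbances.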


2021 ◽  
Vol 2107 (1) ◽  
pp. 012027
Author(s):  
Annapoorni Mani ◽  
Shahriman Abu Bakar ◽  
Pranesh Krishnan ◽  
Sazali Yaacob

Abstract Reinforcement learning is among the most preferred algorithms for optimization problems in industrial automation. Model-free reinforcement learning algorithms optimize for rewards without knowledge of the environmental dynamics and require less computation. Regulating the quality of the raw materials in the inbound inventory can improve the manufacturing process. In this paper, the raw materials arriving at the incoming inspection process are categorized and labeled based on their quality, as reflected by the path traveled. A model-free temporal difference learning approach is used to predict the acceptance and rejection paths of raw materials in the incoming inspection process. The algorithm identified eight paths that the raw materials could travel. Four pathways correspond to material acceptance, while the rest lead to material rejection. The materials are annotated using the total scores acquired in the incoming inspection process. Materials traveling along the ideal path (path A) get the highest total score. Among the remaining accepted materials, those on path B score 7.37% lower, whereas paths C and D score 37.28% and 42.44% lower than the ideal path, respectively.
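The model-free temporal-difference idea can be shown on a reduced inspection chain: TD(0) learns the expected terminal score of each inspection state without a model of the transition dynamics. The chain below has one pass/fail stage instead of the paper's eight paths, and all probabilities and scores are invented for the sketch.

```python
import random

rng = random.Random(3)

# Reduced inspection chain: "start" -> "pass1"/"fail1" -> "end" with a
# terminal quality score. Probabilities and scores are hypothetical.
def step(state):
    if state == "start":
        return ("pass1", 0.0) if rng.random() < 0.7 else ("fail1", 0.0)
    if state == "pass1":                                  # accepted material
        return ("end", 1.0 if rng.random() < 0.8 else 0.9)
    return ("end", 0.2 if rng.random() < 0.5 else 0.0)    # rejected material

V = {"start": 0.0, "pass1": 0.0, "fail1": 0.0, "end": 0.0}
alpha = 0.05
for _ in range(20000):
    s = "start"
    while s != "end":
        s2, r = step(s)
        V[s] += alpha * (r + V[s2] - V[s])    # TD(0) update, gamma = 1
        s = s2
```

After training, `V["pass1"]` sits near the expected accept score (0.98 here) and `V["fail1"]` near the reject score (0.1), so the value table itself predicts whether a material's path leads to acceptance or rejection.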


Author(s):  
Ioannis Partalas ◽  
Dimitris Vrakas ◽  
Ioannis Vlahavas

This article presents a detailed survey of Artificial Intelligence approaches that combine Reinforcement Learning and Automated Planning. There is a close relationship between those two areas, as they both deal with the process of guiding an agent, situated in a dynamic environment, in order to achieve a set of predefined goals. It is therefore straightforward to integrate learning and planning in a single guiding mechanism, and there have been many approaches in this direction over the past years. The approaches are organized and presented according to various characteristics, such as the planning mechanism or the reinforcement learning algorithm used.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 372
Author(s):  
Dongji Li ◽  
Shaoyi Xu ◽  
Pengyu Li

With the rapid development of vehicular networks, vehicle-to-everything (V2X) communications involve a huge number of tasks to be computed, which challenges the scarce network resources. Cloud servers can alleviate the lack of computing ability of vehicular user equipment (VUE), but the limited resources, the dynamic environment of vehicles, and the long distances between the cloud servers and VUE induce potential issues, such as extra communication delay and energy consumption. Fortunately, mobile edge computing (MEC), a promising computing paradigm, can ameliorate these problems by enhancing the computing abilities of VUE through allocating computational resources to it. In this paper, we propose a joint optimization algorithm based on a deep reinforcement learning algorithm, the double deep Q network (double DQN), to minimize a cost composed of energy consumption and the latency of computation and communication under a proper policy. The proposed algorithm is better suited to dynamic, low-latency vehicular scenarios in the real world. Compared with other reinforcement learning algorithms, it improves performance in terms of convergence, the defined cost, and speed by around 30%, 15%, and 17%, respectively.
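The core of double DQN is how the bootstrap target is formed: the online network selects the next action and the target network evaluates it, which damps the overestimation bias of the plain max. A minimal sketch, with the network outputs replaced by plain lists of hypothetical Q-values:

```python
GAMMA = 0.9

def dqn_target(reward, q_target_next, done):
    """Vanilla DQN: max over the target network's own estimates."""
    return reward if done else reward + GAMMA * max(q_target_next)

def double_dqn_target(reward, q_online_next, q_target_next, done):
    """Double DQN: decouple action selection from action evaluation."""
    if done:
        return reward
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + GAMMA * q_target_next[a_star]

# Example: the target net overestimates action 2; the online net prefers
# action 0, so double DQN avoids propagating the inflated value.
r, q_on, q_tg = 1.0, [0.8, 0.3, 0.5], [0.7, 0.2, 1.5]
vanilla = dqn_target(r, q_tg, done=False)               # 1 + 0.9 * 1.5
double = double_dqn_target(r, q_on, q_tg, done=False)   # 1 + 0.9 * 0.7
```

In a full agent these targets would be the regression labels for the online network, with the target network's weights periodically copied from it.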


2020 ◽  
Vol 2020 (4) ◽  
pp. 43-54
Author(s):  
S.V. Khoroshylov ◽  
M.O. Redka
The aim of the article is to approximate optimal relative control of an underactuated spacecraft using reinforcement learning and to study the influence of various factors on the quality of such a solution. In the course of this study, methods of theoretical mechanics, control theory, stability theory, machine learning, and computer modeling were used. The problem of in-plane spacecraft relative control using only control actions applied tangentially to the orbit is considered. This approach makes it possible to reduce the propellant consumption of reactive actuators and to simplify the architecture of the control system. However, in some cases, methods of the classical control theory do not allow one to obtain acceptable results. In this regard, the possibility of solving this problem by reinforcement learning methods has been investigated, which allows designers to find control algorithms close to optimal ones as a result of interactions of the control system with the plant using a reinforcement signal characterizing the quality of control actions. The well-known quadratic criterion is used as a reinforcement signal, which makes it possible to take into account both the accuracy requirements and the control costs. A search for control actions based on reinforcement learning is made using the policy iteration algorithm. This algorithm is implemented using the actor–critic architecture. Various representations of the actor for control law implementation and the critic for obtaining value function estimates using neural network approximators are considered. It is shown that the optimal control approximation accuracy depends on a number of features, namely, an appropriate structure of the approximators, the neural network parameter updating method, and the learning algorithm parameters. The investigated approach makes it possible to solve the considered class of control problems for controllers of different structures. 
Moreover, the approach allows the control system to refine its control algorithms during spacecraft operation.
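The actor-critic/policy-iteration structure with a quadratic criterion has an exact scalar analogue that is easy to verify: for a linear plant x' = a·x + b·u with cost q·x² + r·u², the critic evaluates the quadratic value V(x) = p·x² for the current gain and the actor improves the feedback gain k. The plant numbers below are hypothetical, and the model-based evaluation stands in for the article's neural-network approximators.

```python
a, b = 1.1, 0.5    # unstable scalar plant: x' = a*x + b*u
q, r = 1.0, 0.1    # quadratic criterion: tracking accuracy vs. control cost

k = 1.0            # initial stabilizing gain (|a - b*k| < 1)
for _ in range(50):
    # Critic (policy evaluation): p solves p = q + r*k^2 + p*(a - b*k)^2
    closed = a - b * k
    p = (q + r * k ** 2) / (1 - closed ** 2)
    # Actor (policy improvement): minimize r*u^2 + p*(a*x + b*u)^2 over u,
    # giving the improved gain k = a*b*p / (r + b^2 * p)
    k = a * b * p / (r + b ** 2 * p)
```

The iteration converges to the Riccati solution (p ≈ 1.375, k ≈ 1.704 for these numbers); the learning-based version in the article estimates the same quantities from the reinforcement signal instead of the model.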


Cloud computing has become the basic alternative platform for most user applications in recent years. The increasing complexity of the cloud environment, due to the continuous development of resources and applications, calls for a concentrated, integrated fault-tolerance approach to provide quality of service. Focusing on reliability enhancement in an environment with dynamic changes such as the cloud, we developed a multi-agent scheduler using a Reinforcement Learning (RL) algorithm and Neural Fitted Q (NFQ) to effectively schedule user requests. Our approach accounts for the queue buffer size of each resource by applying queueing theory to design a queue model in which each scheduler agent has its own queue that receives user requests from the global queue. A central learning agent is responsible for learning from the output of the scheduler agents and directing them through feedback obtained from the previous step. The dynamicity problem in the cloud environment is managed in our system by employing a neural network that supports the reinforcement learning algorithm through a specified function. The numerical results demonstrate the efficiency of our proposed approach and its enhanced reliability.
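NFQ's distinguishing feature is batch-mode training: transitions are collected with the current policy, then the Q-function is refit over the whole batch. A reduced sketch on a toy two-resource scheduler, where an averaging tabular refit stands in for the neural-network fit and all service rates and rewards are assumptions:

```python
import random

rng = random.Random(9)

# Two resources with different service rates; the agent routes each request.
SERVICE = {"fast": 0.8, "slow": 0.4}   # per-step completion probabilities

def collect_batch(q, n=2000, eps=0.3, cap=3):
    """Run the scheduler epsilon-greedily, recording (s, a, r, s') tuples."""
    batch, queues = [], {"fast": 0, "slow": 0}
    for _ in range(n):
        s = (min(queues["fast"], cap), min(queues["slow"], cap))
        a = (rng.choice(list(SERVICE)) if rng.random() < eps
             else max(SERVICE, key=lambda x: q.get((s, x), 0.0)))
        queues[a] += 1                            # enqueue the request
        for res, rate in SERVICE.items():         # stochastic service step
            if queues[res] and rng.random() < rate:
                queues[res] -= 1
        r = -sum(queues.values())                 # reward: keep queues short
        s2 = (min(queues["fast"], cap), min(queues["slow"], cap))
        batch.append((s, a, r, s2))
    return batch

def fitted_q(batch, q, gamma=0.9, sweeps=20):
    """Batch-mode (NFQ-style) refit: recompute targets over the whole batch."""
    for _ in range(sweeps):
        targets = {}
        for s, a, r, s2 in batch:
            t = r + gamma * max(q.get((s2, x), 0.0) for x in SERVICE)
            targets.setdefault((s, a), []).append(t)
        q = {k: sum(v) / len(v) for k, v in targets.items()}
    return q

q = {}
for _ in range(5):                                # alternate collect / refit
    q = fitted_q(collect_batch(q), q)
policy = {s: max(SERVICE, key=lambda x: q.get((s, x), 0.0))
          for s in {k[0] for k in q}}
```

The learned policy routes requests to the faster resource when queues are balanced, which is the queue-aware behavior the multi-agent scheduler aims for; the real system replaces the averaging fit with a neural network.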

