Reinforcement Learning Aided UAV Base Station Location Optimization for Rate Maximization

Electronics ◽  
2021 ◽  
Vol 10 (23) ◽  
pp. 2953
Author(s):  
Sudheesh Puthenveettil Gopi ◽  
Maurizio Magarini

The application of unmanned aerial vehicles (UAVs) as base stations (BSs) is gaining popularity. In this paper, we consider maximization of the overall data rate through intelligent deployment of UAV BSs in the downlink of a cellular system. We investigate a reinforcement learning (RL)-aided approach to optimize the position of flying BSs mounted on board UAVs to support a macro BS (MBS). We propose an algorithm to avoid collisions between multiple UAVs undergoing exploratory movements and to restrict UAV BS movement to a predefined area. A Q-learning technique is used to optimize the UAV BS positions, where the reward is equal to the sum of user equipment (UE) data rates. We consider a framework where the UAV BSs carry out exploratory movements in the beginning and exploitative movements in later stages to maximize the overall data rate. Our results show that a cellular system with three UAV BSs and one MBS serving 72 UEs reaches 69.2% of the best possible data rate, which is identified by brute-force search. Finally, the RL algorithm is compared with a K-means algorithm to study the need for accurate UE locations. Our results show that the RL algorithm outperforms the K-means clustering algorithm when the measure of imperfection is higher. The proposed algorithm can be used in a practical MBS–UAV BS–UE system to protect the UAV BSs while maximizing the data rate.
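To make the Q-learning setup above concrete, here is a minimal tabular sketch in Python, assuming a single UAV BS on a discretized position grid; the toy `sum_ue_rates` channel model, the grid size, and all learning constants are illustrative assumptions, not values from the paper. The epsilon decay mimics the explore-first, exploit-later schedule the abstract describes.

```python
import numpy as np

# Minimal tabular Q-learning sketch for positioning one UAV BS on a
# discretized grid. State = grid cell, actions = stay/N/S/E/W.
# `sum_ue_rates` is a hypothetical stand-in for the paper's reward
# (sum of UE data rates at the UAV's current position).

GRID = 10                                  # 10x10 search area (assumed)
ACTIONS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]

def sum_ue_rates(pos):
    # Placeholder channel model: peak rate near the grid center.
    cx, cy = GRID / 2, GRID / 2
    return 1.0 / (1.0 + (pos[0] - cx) ** 2 + (pos[1] - cy) ** 2)

def step(pos, a):
    # Clamp movement to the predefined area, echoing the paper's
    # boundary-restriction rule.
    x = min(max(pos[0] + ACTIONS[a][0], 0), GRID - 1)
    y = min(max(pos[1] + ACTIONS[a][1], 0), GRID - 1)
    return (x, y)

Q = np.zeros((GRID, GRID, len(ACTIONS)))
alpha, gamma = 0.1, 0.9
pos = (0, 0)
for t in range(20000):
    eps = max(0.05, 1.0 - t / 10000)       # explore first, exploit later
    a = np.random.randint(len(ACTIONS)) if np.random.rand() < eps \
        else int(np.argmax(Q[pos[0], pos[1]]))
    nxt = step(pos, a)
    r = sum_ue_rates(nxt)
    Q[pos[0], pos[1], a] += alpha * (r + gamma * Q[nxt[0], nxt[1]].max()
                                     - Q[pos[0], pos[1], a])
    pos = nxt
print("best cell:", np.unravel_index(Q.max(axis=2).argmax(), (GRID, GRID)))
```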

Author(s):  
Akindele Segun Afolabi ◽  
Shehu Ahmed ◽  
Olubunmi Adewale Akinola

Due to the increased demand for scarce wireless bandwidth, it has become insufficient to serve network user equipment using macrocell base stations only. Network densification through the addition of low-power nodes (picocells) to conventional high-power nodes addresses the bandwidth dearth, but unfortunately introduces unwanted interference into the network, which reduces throughput. This paper develops a reinforcement learning model that assists in coordinating interference in a heterogeneous network comprising macrocell and picocell base stations. The learning mechanism is derived from Q-learning and consists of agent, state, action, and reward. The base station is modeled as the agent, the state represents the condition of the user equipment in terms of signal-to-interference-plus-noise ratio (SINR), the action is the transmission power level, and the reward is given in terms of throughput. Simulation results showed that the proposed Q-learning scheme improved the average user equipment throughput in the network. In particular, multi-agent systems with a normal learning rate increased the throughput of associated user equipment by a whopping 212.5% compared to a macrocell-only scheme.
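The agent/state/action/reward mapping above translates directly into a tabular Q-learning loop. The sketch below is a hedged illustration: the SINR model is a toy placeholder and the four-level power action set is an assumption, not the paper's system model.

```python
import numpy as np

# Sketch of the abstract's mapping: agent = base station, state =
# quantized UE SINR, action = transmit-power level, reward = Shannon
# throughput. The link model is an illustrative placeholder.

POWER_LEVELS = [10.0, 20.0, 30.0, 40.0]   # dBm, assumed action set
N_SINR_BINS = 8

def sinr_db(p_dbm):
    # Toy link: SINR tracks transmit power with random fading.
    return p_dbm - 15.0 + np.random.randn()

def to_state(s_db):
    # Quantize SINR into coarse bins to index the Q-table.
    return int(np.clip((s_db + 10) // 5, 0, N_SINR_BINS - 1))

Q = np.zeros((N_SINR_BINS, len(POWER_LEVELS)))
alpha, gamma, eps = 0.1, 0.9, 0.1
state = to_state(sinr_db(POWER_LEVELS[0]))
for t in range(50000):
    a = np.random.randint(len(POWER_LEVELS)) if np.random.rand() < eps \
        else int(np.argmax(Q[state]))
    s_db = sinr_db(POWER_LEVELS[a])
    reward = np.log2(1.0 + 10 ** (s_db / 10))   # throughput-style reward
    nxt = to_state(s_db)
    Q[state, a] += alpha * (reward + gamma * Q[nxt].max() - Q[state, a])
    state = nxt
print("preferred power per SINR bin:",
      [POWER_LEVELS[int(np.argmax(q))] for q in Q])
```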


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1855
Author(s):  
José M. de C. Neto ◽  
Sildolfo F. G. Neto ◽  
Pedro M. de Santana ◽  
Vicente A. de Sousa

Cellular broadband Internet of Things (IoT) applications are expected to keep growing year by year, generating demand for high-throughput services. Since some of these applications are deployed over licensed mobile networks, such as Long Term Evolution (LTE), an already familiar problem arises: the scarcity of licensed spectrum to cope with the increasing demand for data rate. The LTE-Unlicensed (LTE-U) Forum, aiming to tackle this problem, proposed LTE-U to operate in the 5 GHz unlicensed spectrum. However, Wi-Fi is already the consolidated technology operating in this portion of the spectrum, and new technologies in the unlicensed band need mechanisms to promote fair coexistence with the legacy ones. In this work, we extend the literature by analyzing a multi-cell LTE-U/Wi-Fi coexistence scenario with a high interference profile and data rates targeting a cellular broadband IoT deployment. We then propose a centralized, coordinated reinforcement learning framework to improve the LTE-U/Wi-Fi aggregate data rate. The added value of the proposed solution is assessed with the ns-3 simulator, showing improvements not only in the overall system data rate but also in the average user data rate, even with the high interference of a multi-cell environment.
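The paper's framework is evaluated in ns-3 and is not reproduced here; the sketch below only illustrates the centralized, coordinated idea with a stateless (bandit-style) Q-update over joint per-cell LTE-U duty-cycle actions, using a toy aggregate-rate model. All names and constants are assumptions.

```python
import numpy as np

# Hedged sketch: a central controller picks an LTE-U duty cycle per
# cell to maximize the LTE-U + Wi-Fi aggregate rate. The rate model is
# a toy placeholder for the ns-3 evaluation in the paper.

DUTY_CYCLES = [0.2, 0.4, 0.6, 0.8]   # fraction of time LTE-U transmits
N_CELLS = 3

def aggregate_rate(duty):
    # Toy model: LTE-U rate grows with duty cycle, Wi-Fi rate shrinks,
    # and neighboring cells interfere with each other.
    lte = sum(duty)
    wifi = sum((1 - d) * 0.8 for d in duty)
    interference = sum(duty[i] * duty[(i + 1) % N_CELLS]
                       for i in range(N_CELLS))
    return lte + wifi - 0.9 * interference

# One central Q-table over joint actions (small enough to enumerate).
joint_actions = [(i, j, k) for i in range(4) for j in range(4)
                 for k in range(4)]
Q = np.zeros(len(joint_actions))
alpha, eps = 0.1, 0.2
for t in range(20000):
    a = np.random.randint(len(joint_actions)) if np.random.rand() < eps \
        else int(np.argmax(Q))
    duty = [DUTY_CYCLES[i] for i in joint_actions[a]]
    r = aggregate_rate(duty) + 0.01 * np.random.randn()  # noisy feedback
    Q[a] += alpha * (r - Q[a])       # stateless (bandit-style) update
print("best joint duty cycle:",
      [DUTY_CYCLES[i] for i in joint_actions[int(np.argmax(Q))]])
```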


Author(s):  
Faxin Qi ◽  
Xiangrong Tong ◽  
Lei Yu ◽  
Yingjie Wang

With the development of the Internet and the progress of human-centered computing (HCC), the mode of man-machine collaborative work has become more and more popular. Valuable information on the Internet, such as user behavior and social labels, is often provided by users. Trust-based recommendation is an important human-computer interaction application in social networks. However, previous studies generally assume that the trust value between users is static and are thus unable to respond in a timely manner to dynamic changes in user trust and preferences. In fact, after receiving a recommendation, there is a difference between the actual evaluation and the expected evaluation, and this difference is correlated with the trust value. Based on the dynamics of trust and the changing process of trust between users, this paper proposes a trust-boosting method based on reinforcement learning. A recursive least squares (RLS) algorithm is used to learn the dynamic impact of the evaluation difference on a user's trust. In addition, a reinforcement learning method, Deep Q-Learning (DQN), is studied to simulate the process of learning a user's preferences and boosting the trust value. Experiments indicate that our method, applied to recommendation systems, can respond quickly to changes in user preferences. Compared with other methods, our method achieves better recommendation accuracy.
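As a hedged illustration of the RLS component, the following sketch regresses a synthetic trust change on the difference between actual and expected evaluation. The feature layout, the forgetting factor, and the synthetic data are assumptions, not the paper's model.

```python
import numpy as np

# Minimal recursive least squares (RLS) sketch for the trust-update
# idea: regress the change in trust on the actual-minus-expected
# evaluation difference. All names and constants are illustrative.

lam = 0.98                      # forgetting factor (assumed)
w = np.zeros(2)                 # [bias, effect of evaluation difference]
P = np.eye(2) * 1000.0          # inverse correlation matrix

def rls_update(w, P, x, y):
    # Standard RLS step: gain, prediction error, weight and P update.
    Px = P @ x
    k = Px / (lam + x @ Px)
    e = y - w @ x
    w = w + k * e
    P = (P - np.outer(k, Px)) / lam
    return w, P

rng = np.random.default_rng(0)
for _ in range(500):
    diff = rng.normal()                              # actual - expected
    trust_delta = 0.3 * diff + 0.05 * rng.normal()   # synthetic target
    w, P = rls_update(w, P, np.array([1.0, diff]), trust_delta)
print("learned sensitivity of trust to evaluation difference:", w[1])
```

The recovered coefficient should approach the synthetic value 0.3, showing how RLS tracks the evaluation-difference effect online as new feedback arrives.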


Minerals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 587
Author(s):  
Joao Pedro de Carvalho ◽  
Roussos Dimitrakopoulos

This paper presents a new truck dispatching policy approach that adapts to different mining complex configurations in order to deliver the supply material extracted by the shovels to the processors. The method aims to improve adherence to the operational plan and fleet utilization in a mining complex context. Several sources of operational uncertainty arising from the loading, hauling, and dumping activities can influence the dispatching strategy. Given a fixed sequence of extraction of the mining blocks provided by the short-term plan, a discrete event simulator emulates the interactions arising from these mining operations. Repeated runs of this simulator, together with a reward function that assigns a score to each dispatching decision, generate sample experiences to train a deep Q-learning reinforcement learning model. The model learns from past dispatching experience, such that when a new task is required, a well-informed decision can be taken quickly. The approach is tested at a copper–gold mining complex characterized by uncertainties in equipment performance and geological attributes, and the results show improvements in terms of production targets, metal production, and fleet management.
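A hedged sketch of the experience-generation loop: a toy discrete event simulator issues dispatching decisions, a reward scores each one, and the resulting transitions are buffered for DQN training. The shovel names, queue dynamics, and reward weights below are illustrative assumptions, not values from the paper.

```python
import random
from collections import deque

# Toy simulator emitting (state, action, reward, next_state) tuples
# for a truck-dispatching DQN. State = shovel queue lengths.

SHOVELS = ["S1", "S2", "S3"]        # candidate destinations (assumed)
replay_buffer = deque(maxlen=10000)

def simulate_trip(shovel, queue_len):
    # Toy dynamics: longer queues mean more waiting time.
    wait = queue_len[shovel] * 5.0
    tons = 100.0
    return wait, tons

def reward(wait, tons, plan_tons=100.0):
    # Score adherence to the short-term plan, penalize idle time.
    return tons / plan_tons - 0.01 * wait

queue_len = {s: random.randint(0, 4) for s in SHOVELS}
state = tuple(queue_len[s] for s in SHOVELS)
for event in range(1000):
    action = random.randrange(len(SHOVELS))          # exploratory dispatch
    wait, tons = simulate_trip(SHOVELS[action], queue_len)
    r = reward(wait, tons)
    queue_len[SHOVELS[action]] = random.randint(0, 4)  # queues evolve
    next_state = tuple(queue_len[s] for s in SHOVELS)
    replay_buffer.append((state, action, r, next_state))
    state = next_state
print("buffered experiences:", len(replay_buffer))
```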


Aerospace ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 113
Author(s):  
Pedro Andrade ◽  
Catarina Silva ◽  
Bernardete Ribeiro ◽  
Bruno F. Santos

This paper presents a Reinforcement Learning (RL) approach to optimize the long-term scheduling of maintenance for an aircraft fleet. The problem considers fleet status, maintenance capacity, and other maintenance constraints to schedule hangar checks for a specified time horizon. The checks are scheduled within an interval, and the goal is to schedule them as close as possible to their due dates. In doing so, the number of checks is reduced and fleet availability increases. A Deep Q-learning algorithm is used to optimize the scheduling policy. The model is validated in a real scenario using maintenance data from 45 aircraft. The maintenance plan generated with our approach is compared with a previous study, which presented a Dynamic Programming (DP)-based approach, and with airline estimations for the same period. The results show a reduction in the number of checks scheduled, which indicates the potential of RL for solving this problem. The adaptability of RL is also tested by introducing small disturbances in the initial conditions. After training the model with these simulated scenarios, the results show the robustness of the RL approach and its ability to generate efficient maintenance plans in only a few seconds.
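As an illustration of the scheduling objective, one possible reward shaping is sketched below: it peaks when a check lands on its due date and penalizes infeasible slots. The scale factors and penalty values are assumptions, not taken from the paper.

```python
# Hedged sketch of a reward that pushes hangar checks toward their
# due dates, in the spirit of the objective described above.

def check_reward(scheduled_day, due_day, hangar_free):
    """Score a candidate hangar-check slot: best on the due date,
    worse the earlier the check is pulled forward, and strongly
    penalized when the slot is infeasible."""
    if not hangar_free:
        return -10.0                  # no hangar capacity (assumed penalty)
    days_early = due_day - scheduled_day
    if days_early < 0:
        return -10.0                  # past the due date: not allowed
    return 1.0 - 0.05 * days_early    # closer to the due date is better

# Example: a check due on day 30, hangar free on day 27.
print(check_reward(27, 30, hangar_free=True))   # 0.85
```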


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 737
Author(s):  
Fengjie Sun ◽  
Xianchang Wang ◽  
Rui Zhang

An Unmanned Aerial Vehicle (UAV) can greatly reduce manpower in agricultural plant protection tasks such as watering, sowing, and pesticide spraying. It is essential to develop a Decision-making Support System (DSS) that helps UAVs choose the correct action in each state according to a policy. In an unknown environment, hand-crafting rules to guide UAV actions is not applicable, and obtaining the optimal policy through reinforcement learning is a feasible solution. However, experiments show that existing reinforcement learning algorithms cannot obtain the optimal policy for a UAV in the agricultural plant protection environment. In this work, we propose an improved Q-learning algorithm based on similar state matching, and we prove theoretically that a UAV has a greater probability of choosing the optimal action under the policy learned by our algorithm than under the classic Q-learning algorithm in the agricultural plant protection environment. The proposed algorithm is implemented and tested on evenly distributed datasets built from real UAV parameters and real farm information. The performance of the algorithm is discussed in detail. Experimental results show that our algorithm can efficiently learn the optimal policy for UAVs in the agricultural plant protection environment.
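The "similar state matching" idea can be sketched as a Q-table whose entries for an unseen state are initialized from the most similar previously visited state, rather than from zeros. The state features, similarity metric, and constants below are illustrative assumptions.

```python
import numpy as np

# Sketch: Q-learning where a new state bootstraps its action values
# from the nearest previously visited state (simple similarity match).

N_ACTIONS = 4
Q = {}                               # state (tuple) -> action values

def nearest_seen_state(s):
    # Euclidean nearest neighbor among visited states.
    if not Q:
        return None
    return min(Q, key=lambda v: sum((a - b) ** 2 for a, b in zip(v, s)))

def q_values(s):
    # Initialize unseen states from their most similar neighbor.
    if s not in Q:
        src = nearest_seen_state(s)
        Q[s] = Q[src].copy() if src is not None else np.zeros(N_ACTIONS)
    return Q[s]

def update(s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Standard Q-learning update on the matched/initialized values.
    q_values(s)[a] += alpha * (r + gamma * q_values(s_next).max()
                               - q_values(s)[a])

# Toy usage: two nearby states share learned structure.
update((1.0, 2.0), 0, 1.0, (1.1, 2.0))
print(q_values((1.05, 2.0)))   # bootstrapped from the nearest seen state
```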

