Trajectory Optimization for Autonomous Flying Base Station via Reinforcement Learning

Author(s):  
Harald Bayerlein ◽  
Paul De Kerret ◽  
David Gesbert
2021 ◽  
Vol 10 (1) ◽  
pp. 21
Author(s):  
Omar Nassef ◽  
Toktam Mahmoodi ◽  
Foivos Michelinakis ◽  
Kashif Mahmood ◽  
Ahmed Elmokashfi

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered from a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of these metrics, depending on the user-end device and the application. Reinforcement learning, employing both gradient descent and a genetic algorithm, is adopted in tandem with machine learning and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the deep neural network in predicting intermediary environmental states; they also show the superior optimisation performance of the genetic reinforcement learning algorithm.
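A minimal sketch of the genetic configuration search described above, assuming an illustrative NB-IoT configuration space and a fitness that blends the three target metrics; the predict_state hook stands in for the paper's state-prediction models, and all names, value sets, and weights are hypothetical rather than the paper's actual implementation:

import random

CONFIG_SPACE = {
    "tx_power_dbm": [0, 7, 14, 23],       # assumed NB-IoT power levels
    "repetitions": [1, 2, 4, 8, 16],      # assumed coverage-enhancement repetitions
    "psm_timer_s": [60, 300, 600, 1200],  # assumed power-saving-mode timer
}

def random_config():
    return {k: random.choice(v) for k, v in CONFIG_SPACE.items()}

def fitness(cfg, predict_state):
    # predict_state stands in for the learned models that estimate the
    # intermediary environmental state for a candidate configuration.
    energy, delay, throughput = predict_state(cfg)
    # Weighted objective; weights depend on the device and application.
    return 0.4 * throughput - 0.3 * energy - 0.3 * delay

def evolve(predict_state, pop_size=20, generations=50, mut_prob=0.1):
    population = [random_config() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, predict_state), reverse=True)
        parents = population[: pop_size // 2]     # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)      # uniform crossover
            child = {k: random.choice([a[k], b[k]]) for k in CONFIG_SPACE}
            if random.random() < mut_prob:        # point mutation
                k = random.choice(list(CONFIG_SPACE))
                child[k] = random.choice(CONFIG_SPACE[k])
            children.append(child)
        population = parents + children
    return max(population, key=lambda c: fitness(c, predict_state))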


Author(s):  
Akindele Segun Afolabi ◽  
Shehu Ahmed ◽  
Olubunmi Adewale Akinola

Due to the increased demand for scarce wireless bandwidth, it has become insufficient to serve network user equipment using macrocell base stations alone. Network densification through the addition of low-power nodes (picocells) to conventional high-power nodes addresses the bandwidth dearth, but unfortunately introduces unwanted interference into the network, which reduces throughput. This paper developed a reinforcement learning model that assisted in coordinating interference in a heterogeneous network comprising macrocell and picocell base stations. The learning mechanism was derived from Q-learning and consisted of agent, state, action, and reward. The base station was modeled as the agent, while the state represented the condition of the user equipment in terms of signal-to-interference-plus-noise ratio (SINR). The action was represented by the transmission power level, and the reward was given in terms of throughput. Simulation results showed that the proposed Q-learning scheme improved average user equipment throughput in the network. In particular, multi-agent systems with a normal learning rate increased the throughput of associated user equipment by a whopping 212.5% compared to a macrocell-only scheme.
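The abstract's agent/state/action/reward mapping translates directly into a tabular Q-learning update. The sketch below follows that mapping, assuming illustrative power levels and SINR bin edges; the environment step (measuring SINR and throughput after a power change) is left abstract:

import random
from collections import defaultdict

POWER_LEVELS_DBM = [10, 20, 30, 40]  # action set: transmit power levels (assumed)
SINR_BINS = [-5, 0, 5, 10, 15]       # state discretization in dB (assumed)

def sinr_state(sinr_db):
    # Quantize a measured SINR into a bin index 0..len(SINR_BINS).
    return sum(sinr_db >= edge for edge in SINR_BINS)

Q = defaultdict(float)  # (state, action_index) -> estimated value

def choose_action(state, epsilon=0.1):
    if random.random() < epsilon:                      # explore
        return random.randrange(len(POWER_LEVELS_DBM))
    return max(range(len(POWER_LEVELS_DBM)),           # exploit
               key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # Standard Q-learning: reward here is the observed throughput.
    best_next = max(Q[(next_state, a)] for a in range(len(POWER_LEVELS_DBM)))
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

In the multi-agent setting the abstract evaluates, each base station would run its own copy of this learner; alpha corresponds to the "normal learning rate" mentioned above.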


Aerospace ◽  
2021 ◽  
Vol 8 (10) ◽  
pp. 299
Author(s):  
Bin Yang ◽  
Pengxuan Liu ◽  
Jinglang Feng ◽  
Shuang Li

This paper presents a novel and robust two-stage pursuit strategy for incomplete-information impulsive space pursuit-evasion missions that considers the J2 perturbation. The strategy first decomposes the impulsive pursuit-evasion game into a far-distance rendezvous stage and a close-distance game stage according to the perception range of the evader. The far-distance rendezvous stage is transformed into a rendezvous trajectory optimization problem, for which a new objective function is proposed to obtain the pursuit trajectory with the optimal terminal pursuit capability. For the close-distance game stage, a closed-loop pursuit approach is proposed using a reinforcement learning algorithm, the deep deterministic policy gradient (DDPG) algorithm, to solve and update the pursuit trajectory under incomplete information. The feasibility of this strategy, and its robustness to different initial states of the pursuer and evader and to different evasion strategies, are demonstrated for sun-synchronous orbit pursuit-evasion scenarios. Monte Carlo tests show that the successful pursuit ratio of the proposed method exceeds 91% in all the given scenarios.
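For reference, a condensed DDPG update step, sketched in PyTorch. State and action dimensions, network shapes, and hyperparameters are placeholders, and the J2-perturbed orbital dynamics and reward shaping of the paper are not reproduced here; this only shows the algorithmic core named in the abstract:

import torch
import torch.nn as nn

state_dim, action_dim = 12, 3  # e.g. relative orbital state, impulsive delta-v (assumed)

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_dim))

actor, actor_tgt = mlp(state_dim, action_dim), mlp(state_dim, action_dim)
critic, critic_tgt = mlp(state_dim + action_dim, 1), mlp(state_dim + action_dim, 1)
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(batch, gamma=0.99, tau=0.005):
    s, a, r, s2, done = batch  # tensors sampled from a replay buffer
    with torch.no_grad():      # TD target from slowly-updated target nets
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        target = r + gamma * (1 - done) * q_next
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, target)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Deterministic policy gradient: ascend the critic w.r.t. the actor.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    # Polyak-average the target networks toward the online networks.
    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1 - tau).add_(tau * p.data)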


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Fitsum Debebe Tilahun ◽  
Chung G. Kang

Enhanced licensed-assisted access (eLAA) is an operational mode that allows use of the unlicensed band to support long-term evolution (LTE) service via carrier aggregation. The additional bandwidth helps meet the demands of growing mobile traffic. In uplink eLAA, which is prone to unexpected interference from WiFi access points, having the base station schedule resources and then requiring users to perform a listen-before-talk (LBT) procedure can seriously degrade resource utilization. In this paper, we present a decentralized deep reinforcement learning (DRL)-based approach in which each user independently learns a dynamic band selection strategy that maximizes its own rate. Through extensive simulations, we show that the proposed DRL-based band selection scheme improves resource utilization while supporting a certain minimum quality of service (QoS).
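The decentralized structure can be illustrated with a deliberately simplified, tabular stand-in for the paper's deep RL agents: each user keeps its own running rate estimate per band and chooses epsilon-greedily. The band names, moving-average step, and agent count are assumptions, and the deep network is replaced by a two-entry value table:

import random

BANDS = ["licensed", "unlicensed"]

class UserAgent:
    """Each UE learns independently from its own observed rate."""
    def __init__(self, epsilon=0.1, step=0.1):
        self.value = {b: 0.0 for b in BANDS}  # running rate estimate per band
        self.epsilon, self.step = epsilon, step

    def select_band(self):
        if random.random() < self.epsilon:        # explore
            return random.choice(BANDS)
        return max(BANDS, key=self.value.get)     # exploit

    def observe(self, band, achieved_rate):
        # Exponential moving average tracks the non-stationary channel,
        # e.g. the unlicensed-band rate dropping when WiFi interference
        # causes LBT to block a scheduled transmission.
        self.value[band] += self.step * (achieved_rate - self.value[band])

agents = [UserAgent() for _ in range(8)]  # e.g. 8 UEs, each learning on its own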


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 67625-67634
Author(s):  
Yingqian Huang ◽  
Miao Cui ◽  
Guangchi Zhang ◽  
Wei Chen

2014 ◽  
Vol 10 (2) ◽  
pp. 173-196 ◽  
Author(s):  
M. Louta ◽  
P. Sarigiannidis ◽  
S. Misra ◽  
P. Nicopolitidis ◽  
G. Papadimitriou

WiMAX (Worldwide Interoperability for Microwave Access) is a candidate networking technology for realizing the 4G vision. By adopting the Orthogonal Frequency Division Multiple Access (OFDMA) technique, the latest IEEE 802.16x amendments manage to provide QoS-aware access services with full mobility support. A number of interesting scheduling and mapping schemes have been proposed in the research literature. However, they neglect a considerable asset of OFDMA-based wireless systems: the dynamic adjustment of the downlink-to-uplink width ratio. In order to fully exploit the supported mobile WiMAX features, we design, develop, and evaluate a rigorous adaptive model that inherits its main aspects from the reinforcement learning field. The proposed model aims to efficiently determine the downlink-to-uplink width ratio on a frame-by-frame basis, taking into account both the downlink and uplink traffic at the Base Station (BS). Extensive evaluation results indicate that the proposed model provides quite accurate estimations, keeping the average error rate below 15% with respect to the optimal sub-frame configurations. It also outperforms other learning methods (e.g., learning automata) and shows notable improvements over static schemes that maintain a fixed predefined ratio, in terms of service ratio and resource utilization.
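As a simple point of comparison, the sketch below shows a linear reward-inaction learning automaton, one of the baseline learning methods the abstract mentions, choosing among candidate downlink-to-uplink splits frame by frame. The ratio set and the reward definition are assumptions, and this is the baseline family rather than the paper's own model:

import random

RATIOS = [(35, 12), (29, 18), (26, 21), (23, 24)]  # assumed DL:UL symbol splits
probs = [1.0 / len(RATIOS)] * len(RATIOS)          # one probability per split

def pick_ratio():
    # Sample a sub-frame configuration for the coming frame.
    return random.choices(range(len(RATIOS)), weights=probs)[0]

def reinforce(choice, reward, rate=0.05):
    # reward in [0, 1], e.g. served traffic vs. queued DL+UL demand this
    # frame. Linear reward-inaction: shift probability mass toward the
    # chosen split in proportion to the reward; probabilities stay summed to 1.
    for i in range(len(probs)):
        if i == choice:
            probs[i] += rate * reward * (1.0 - probs[i])
        else:
            probs[i] -= rate * reward * probs[i]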

