Resource allocation in clustered M2M networks: a Q-learning approach

10.32920/ryerson.14649174.v1 ◽

2021 ◽

Author(s):

Fatima Hussain

Keyword(s):

Resource Allocation ◽

Learning Algorithm ◽

Random Access ◽

Cumulative Distribution ◽

Convergence Time ◽

Mixed Integer ◽

M2M Communication ◽

Q Learning ◽

Threshold Levels ◽

Slot Assignment

Machine-to-machine (M2M) communication has received increasing attention in recent years. An M2M network exhibits salient features such as a large number of machines/devices, low data rates, delay-tolerant or delay-sensitive traffic, small packets, energy constraints, and low or no mobility. A large number of M2M terminals may exist in a small area, with many trying to access channel resources simultaneously and randomly, which results in overload and access problems. The resulting signaling overhead and the diverse requirements of machine-type communication devices (MTCDs) call for the development of flexible and efficient scheduling and random access techniques. In this thesis, we first review and compare various scheduling and random access techniques in LTE-based cellular networks for M2M communication. We also discuss how well they fulfill the unique requirements of M2M communication and networking. Resource management in M2M networks with a large number of devices is also reviewed from the access point of view. We propose a multi-objective optimization-based solution to the problem of resource allocation in interference-limited M2M communication. We consider MTCDs in a clustered network structure, where they are divided into clusters and the devices belonging to a cluster communicate with the cluster head (or controller). We maximize the number of admitted MTCD controllers and the throughput while causing the least interference to conventional primary users. We formulate the problem as a mixed-integer non-linear problem with multiple objectives and solve it using the mesh adaptive direct search (MADS) algorithm. Simulation results show the effects of varying different parameters on the cumulative throughput and the number of admitted MTCD controllers. We then formulate the slot selection problem in M2M networks with admitted MTCDs as an optimization problem. We present a solution using the Q-learning algorithm to select a conflict-free slot assignment in a random access network with MTCD controllers. The performance of the solution depends on parameters such as the learning rate and the reward. We thoroughly analyze the performance of the proposed algorithm with respect to the parameters that govern its operation. We also compare it with simple ALOHA and channel-based scheduled allocation and show that the proposed Q-learning-based technique has a higher probability of assigning slots than these techniques. We then present a block-based Q-learning algorithm for the scheduling of MTCDs in clustered M2M communication networks. First, centralized slot assignment is performed and an algorithm is proposed for minimizing the inter-cluster interference. We then propose a Q-learning algorithm to assign slots in a distributed manner and compare the two schemes. Afterwards, we show the effects of distributed slot assignment on the convergence rate and convergence probability for varying signal-to-interference ratio (SIR). The cumulative distribution function is used to study the effect of various SIR threshold levels on the convergence probability. As the SIR threshold level increases, the convergence time increases and the convergence probability decreases, because fewer block configurations fulfill the required threshold in the M2M network.
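The idea of learning a conflict-free slot assignment with Q-learning can be illustrated with a minimal sketch. It is not the thesis's algorithm: the function name `learn_slots`, the stateless one-Q-value-per-slot formulation, the epsilon-greedy exploration, and the default values of `alpha`, `epsilon`, `reward`, and `penalty` are all assumptions chosen for illustration.

```python
import random


def learn_slots(num_devices, num_slots, alpha=0.1, epsilon=0.1,
                reward=1.0, penalty=-1.0, max_frames=5000, seed=0):
    """Stateless Q-learning: each device learns one Q-value per slot.

    Returns (assignment, frames_used); assignment is None if no
    conflict-free assignment was learned within max_frames.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = [[0.0] * num_slots for _ in range(num_devices)]

    for frame in range(1, max_frames + 1):
        # Each device picks a slot: explore with probability epsilon,
        # otherwise pick uniformly among its highest-valued slots.
        choices = []
        for row in q:
            if rng.random() < epsilon:
                choices.append(rng.randrange(num_slots))
            else:
                best = max(row)
                choices.append(rng.choice(
                    [s for s, v in enumerate(row) if v == best]))

        # A slot used by exactly one device succeeds; shared slots collide.
        for dev, slot in enumerate(choices):
            r = reward if choices.count(slot) == 1 else penalty
            q[dev][slot] += alpha * (r - q[dev][slot])

        # Converged when every device's preferred slot has a positive
        # Q-value and no two devices prefer the same slot.
        preferred = [row.index(max(row)) for row in q]
        if (len(set(preferred)) == num_devices
                and all(q[d][s] > 0 for d, s in enumerate(preferred))):
            return preferred, frame

    return None, max_frames
```

Calling `learn_slots(4, 6)` returns one preferred slot per device together with the number of frames used. Varying `alpha` and the reward values gives a rough feel for how such parameters can affect convergence time.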

Q-learning algorithm for resource allocation in WDMA-based optical wireless communication networks

10.23919/splitech52315.2021.9566383 ◽

2021 ◽

Author(s):

Abdelrahman S. Elgamal ◽

Osama Z. Alsulami ◽

Ahmad Adnan Qidan ◽

Taisir E.H. El-Gorashi ◽

Jaafar M. H. Elmirghani

Keyword(s):

Resource Allocation ◽

Wireless Communication ◽

Communication Networks ◽

Learning Algorithm ◽

Wireless Communication Networks ◽

Optical Wireless Communication ◽

Optical Wireless ◽

Q Learning

Resource allocation and congestion control in clustered M2M communication using Q-learning

Transactions on Emerging Telecommunications Technologies ◽

10.1002/ett.3039 ◽

2016 ◽

Vol 28 (4) ◽

pp. e3039 ◽

Cited By ~ 8

Author(s):

Fatima Hussain ◽

Alagan Anpalagan ◽

Ahmed Shaharyar Khwaja ◽

Muhammad Naeem

Keyword(s):

Resource Allocation ◽

Congestion Control ◽

M2M Communication ◽

Q Learning

Degree-Constrained k-Minimum Spanning Tree Problem

Complexity ◽

10.1155/2020/7628105 ◽

2020 ◽

Vol 2020 ◽

pp. 1-25

Author(s):

Pablo Adasme ◽

Ali Dehghan Firoozabadi

Keyword(s):

Learning Strategies ◽

Spanning Tree ◽

Minimum Spanning Tree ◽

Learning Strategy ◽

Learning Algorithm ◽

Hamiltonian Path ◽

Minimum Cost ◽

Mixed Integer ◽

Neighborhood Search ◽

Q Learning

Let G = (V, E) be a simple undirected complete graph with vertex and edge sets V and E, respectively. In this paper, we consider the degree-constrained k-minimum spanning tree (DCkMST) problem, which consists of finding a minimum cost subtree of G formed with at least k vertices of V where the degree of each vertex is less than or equal to an integer value d ≤ k − 2. In particular, in this paper, we consider degree values of d ∈ {2, 3}. Notice that DCkMST generalizes both the classical degree-constrained and k-minimum spanning tree problems simultaneously. In particular, when d = 2, it reduces to a k-Hamiltonian path problem. Application domains where DCkMST can be adapted or directly utilized include backbone network structures in telecommunications, facility location, and transportation networks, to name a few. It is easy to see from the literature that the DCkMST problem has not been studied in depth so far. Thus, our main contributions in this paper can be highlighted as follows. We propose three mixed-integer linear programming (MILP) models for the DCkMST problem and derive for each one an equivalent counterpart by using the handshaking lemma. Then, we further propose ant colony optimization (ACO) and variable neighborhood search (VNS) algorithms. Each proposed ACO and VNS method is also compared with another variant of it which is obtained by embedding a Q-learning strategy. We also propose a pure Q-learning algorithm that is competitive with the ACO ones. Finally, we conduct substantial numerical experiments using benchmark input graph instances from TSPLIB and randomly generated ones with uniform and Euclidean distance costs with up to 400 nodes. Our numerical results indicate that the proposed models and algorithms allow obtaining optimal and near-optimal solutions, respectively. Moreover, we report better solutions than CPLEX for the large-size instances. Ultimately, the empirical evidence shows that the proposed Q-learning strategies can bring considerable improvements.
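The constraints that define a feasible solution can be made concrete with a small checker. This is only a sketch of the problem definition given in the abstract, not one of the paper's MILP models or heuristics; the function name `is_feasible_dckmst` and the edge-list input format are assumptions. The handshaking lemma appears here as a sanity check: the vertex degrees of any graph sum to twice its edge count, so in a tree on n vertices they sum to 2(n − 1).

```python
from collections import defaultdict


def is_feasible_dckmst(edges, k, d):
    """Return True if `edges` form a tree on at least k vertices
    in which every vertex has degree at most d.

    `edges` is a list of (u, v) pairs of a simple graph (assumed format).
    """
    degree = defaultdict(int)
    adjacency = defaultdict(list)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacency[u].append(v)
        adjacency[v].append(u)

    n = len(degree)
    # A tree on n vertices has exactly n - 1 edges.
    if n < k or len(edges) != n - 1:
        return False

    # Handshaking lemma: degrees sum to twice the number of edges.
    assert sum(degree.values()) == 2 * len(edges)

    if max(degree.values()) > d:
        return False

    # With n - 1 edges, the graph is a tree iff it is connected.
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == n
```

For example, the path 1-2-3-4 is feasible for k = 4 and d = 2, which matches the abstract's remark that the d = 2 case reduces to a k-Hamiltonian path problem, whereas a star with three leaves needs d = 3.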

Distributed Q-Learning Algorithm for Dynamic Resource Allocation With Unknown Objective Functions and Application to Microgrid

IEEE Transactions on Cybernetics ◽

10.1109/tcyb.2021.3082639 ◽

2021 ◽

pp. 1-11

Author(s):

Pengcheng Dai ◽

Wenwu Yu ◽

Duxin Chen

Keyword(s):

Resource Allocation ◽

Learning Algorithm ◽

Dynamic Resource Allocation ◽

Objective Functions ◽

Q Learning ◽

Dynamic Resource

Optimal Control of Microgrids with Multi-stage Mixed-integer Nonlinear Programming Guided Q-learning Algorithm

Journal of Modern Power Systems and Clean Energy ◽

10.35833/mpce.2020.000506 ◽

2020 ◽

Vol 8 (6) ◽

pp. 1151-1159

Author(s):

Yeliz Yoldas ◽

Selcuk Goren ◽

Ahmet Onen

Keyword(s):

Optimal Control ◽

Nonlinear Programming ◽

Learning Algorithm ◽

Mixed Integer Nonlinear Programming ◽

Mixed Integer ◽

Q Learning ◽

Multi Stage ◽

Integer Nonlinear Programming

Base Station Selection in M2M Communication Using Q-Learning Algorithm in LTE-A Networks

2015 IEEE 29th International Conference on Advanced Information Networking and Applications ◽

10.1109/aina.2015.160 ◽

2015 ◽

Cited By ~ 7

Author(s):

A.H. Mohammed ◽

A.S. Khwaja ◽

A. Anpalagan ◽

I. Woungang

Keyword(s):

Learning Algorithm ◽

Base Station ◽

M2M Communication ◽

Q Learning

Improved Q-Learning Method for Multirobot Formation and Path Planning with Concave Obstacles

Journal of Sensors ◽

10.1155/2021/4294841 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Zhilin Fan ◽

Fei Liu ◽

Xinshun Ning ◽

Yilin Han ◽

Jian Wang ◽

...

Keyword(s):

Path Planning ◽

Learning Algorithm ◽

Convergence Time ◽

Multirobot Systems ◽

Selection Strategy ◽

Q Learning ◽

Unknown Environment ◽

Traditional Algorithm ◽

Leader Following ◽

Tracking Strategy

Aiming at the formation and path planning of multirobot systems in an unknown environment, a path planning method for multirobot formation based on improved Q-learning is proposed. Based on the leader-following approach, the leader robot uses an improved Q-learning algorithm to plan the path, and the follower robot achieves a gravitational potential field (GPF) tracking strategy by designing a cost function to select actions. Specifically, to improve the Q-learning, the Q-value is initialized using environmental guidance from the target’s GPF. Then, a virtual obstacle-filling avoidance strategy is presented, which fills with virtual obstacles the non-obstacle areas judged to lead into concave obstacles. In addition, the simulated annealing (SA) algorithm, whose controlling temperature is adjusted in real time according to the learning situation of the Q-learning, is applied to improve the action selection strategy. The experimental results show that the improved Q-learning algorithm reduces the convergence time by 89.9% and the number of convergence rounds by 63.4% compared with the traditional algorithm. With the help of the method, multiple robots have a clear division of labor and quickly plan a globally optimized formation path in a completely unknown environment.
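One common way to combine simulated annealing with Q-learning action selection is a Metropolis-style acceptance test, sketched below. This is an assumption about the general technique, not the paper's exact rule: the function name `sa_select_action` is hypothetical, and how the temperature is adjusted during learning is left to the caller.

```python
import math
import random


def sa_select_action(q_values, temperature, rng=random):
    """Metropolis-style action selection over a list of Q-values.

    A random candidate action replaces the greedy one with probability
    exp((Q[candidate] - Q[greedy]) / temperature). High temperatures
    favour exploration; as the temperature approaches zero the choice
    becomes greedy. Illustrative sketch only.
    """
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    if temperature <= 0:
        return greedy
    candidate = rng.randrange(len(q_values))
    # The exponent is never positive, so the probability lies in (0, 1].
    accept_prob = math.exp((q_values[candidate] - q_values[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy
```

A caller would typically lower `temperature` as learning progresses, so that early episodes explore widely and later episodes exploit the learned Q-values.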

Resource Allocation For D2D Communications With A Novel Distributed Q-Learning Algorithm In Heterogeneous Networks

2018 International Conference on Machine Learning and Cybernetics (ICMLC) ◽

10.1109/icmlc.2018.8526955 ◽

2018 ◽

Cited By ~ 3

Author(s):

Yung-Fa Huang ◽

Tan-Hsu Tan ◽

Neng-Chung Wang ◽

Young-Long Chen ◽

Yu-Ling Li

Keyword(s):

Resource Allocation ◽

Heterogeneous Networks ◽

Learning Algorithm ◽

D2D Communications ◽

Q Learning

Mobility, Residual Energy, and Link Quality Aware Multipath Routing in MANETs with Q-learning Algorithm

Applied Sciences ◽

10.3390/app9081582 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1582 ◽

Cited By ~ 12

Author(s):

Valmik Tilwari ◽

Kaharudin Dimyati ◽

MHD Hindia ◽

Anas Fattouh ◽

Iraj Amiri

Keyword(s):

Ad Hoc ◽

Learning Algorithm ◽

Residual Energy ◽

Quality Parameters ◽

Link Quality ◽

Convergence Time ◽

Mobile Nodes ◽

Q Learning ◽

Routing Scheme ◽

End To End

To facilitate connectivity to the internet, the easiest way to establish communication infrastructure in areas affected by natural disasters and in remote locations with intermittent cellular service and/or a lack of Wi-Fi coverage is to deploy an end-to-end connection over Mobile Ad-hoc Networks (MANETs). However, the potential of MANETs is yet to be fully realized, as existing MANET routing protocols still suffer from major technical drawbacks in the areas of mobility, link quality, and battery constraints of mobile nodes between the overlay connections. To address these problems, a routing scheme named Mobility, Residual energy and Link quality Aware Multipath (MRLAM) is proposed for routing in MANETs. The proposed scheme makes routing decisions by determining the optimal route with energy-efficient nodes to maintain the stability, reliability, and lifetime of the network over a sustained period of time. The MRLAM scheme uses a Q-learning algorithm to select optimal intermediate nodes based on their current energy level, mobility, and link quality parameters, and then provides positive and negative reward values accordingly. The proposed routing scheme reduces energy cost by approximately 33% and 23%, end-to-end delay by 15% and 10%, packet loss ratio by 30.76% and 24.59%, and convergence time by 16.49% and 11.34%, compared with the well-known routing schemes Multipath Optimized Link State Routing protocol (MP-OLSR) and MP-OLSRv2, respectively. Overall, the results indicate that the proposed MRLAM routing scheme significantly improves the overall performance of the network.
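How a node might turn energy, mobility, and link quality into a positive or negative reward and feed it into a Q-learning update can be sketched as follows. The abstract does not give MRLAM's reward function, so everything here is an assumption: the function names, the equal weights, the normalization of all inputs to [0, 1], and the values of `alpha` and `gamma`. Only the update rule itself is the standard Q-learning update.

```python
def neighbour_reward(energy, mobility, link_quality,
                     weights=(1 / 3, 1 / 3, 1 / 3)):
    """Map a neighbour's state to a reward in [-1, 1].

    All inputs are assumed to be normalized to [0, 1]. High residual
    energy and link quality are good; high mobility is bad.
    The weighting is a hypothetical choice, not taken from the paper.
    """
    w_energy, w_mobility, w_link = weights
    score = (w_energy * energy
             + w_mobility * (1.0 - mobility)
             + w_link * link_quality)
    return 2.0 * score - 1.0  # positive for good neighbours, negative for poor ones


def q_update(q_table, node, next_hop, reward, best_next_q,
             alpha=0.5, gamma=0.8):
    """Standard Q-learning update for the (node, next_hop) pair."""
    old = q_table.get((node, next_hop), 0.0)
    new = old + alpha * (reward + gamma * best_next_q - old)
    q_table[(node, next_hop)] = new
    return new
```

A node would then forward traffic via the neighbour with the highest Q-value, and repeated updates steer routes away from neighbours with low energy, high mobility, or poor links.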
