Multipath TCP Path Scheduling Optimization Based on Q-Learning in Vehicular Heterogeneous Networks

Author(s): Haitao Zhao, Mengkang Zhang, Hongsu Yu, Tianqi Mao, Hongbo Zhu
2019, Vol 2019, pp. 1-12

Author(s): Lin Sun, Qi Zhu

This paper proposes a WiFi offloading algorithm based on Q-learning and MADM (multi-attribute decision making) in heterogeneous networks, for a mobile-user scenario in which cellular and WiFi networks coexist. A Markov model is used to describe changes in the network environment. Four attributes, namely user throughput, terminal power consumption, user cost, and communication delay, are used to define a user satisfaction function reflecting QoS (Quality of Service), which Q-learning then optimizes. AHP (Analytic Hierarchy Process) and TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) from MADM establish the connection between each attribute and the reward function. The user applies Q-learning to make offloading decisions based on the current network conditions and its own offloading history, ultimately maximizing its satisfaction. Simulation results show that the proposed algorithm achieves higher user satisfaction than traditional WiFi offloading algorithms.
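As a rough illustration of how AHP/TOPSIS weighting can feed a Q-learning reward, here is a minimal Python sketch of that pipeline. The attribute set (throughput, power, cost, delay) follows the abstract, but the state space, action encoding, comparison matrix, and constants are illustrative assumptions rather than the paper's actual design.

```python
import numpy as np

# Hypothetical sketch: attribute set follows the abstract; state space,
# discretization, and constants below are illustrative assumptions.

def ahp_weights(pairwise):
    """AHP: attribute weights = normalized principal eigenvector of a
    pairwise-comparison matrix."""
    vals, vecs = np.linalg.eig(pairwise)
    w = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    return w / w.sum()

def topsis_reward(attrs, weights, ideal_best, ideal_worst):
    """TOPSIS-style closeness to the ideal attribute vector, used here as the
    scalar reward (attributes assumed normalized so that larger is better)."""
    v = weights * attrs
    d_best = np.linalg.norm(v - weights * ideal_best)
    d_worst = np.linalg.norm(v - weights * ideal_worst)
    return d_worst / (d_best + d_worst + 1e-12)

# Tabular Q-learning over discretized network states and two actions:
# 0 = stay on the cellular network, 1 = offload to WiFi.
N_STATES, N_ACTIONS = 8, 2
Q = np.zeros((N_STATES, N_ACTIONS))
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

def choose_action(state, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def q_update(s, a, reward, s_next):
    """Standard one-step Q-learning update."""
    Q[s, a] += ALPHA * (reward + GAMMA * Q[s_next].max() - Q[s, a])
```

A typical step in such a scheme would observe the current normalized attribute vector, compute the TOPSIS closeness as the reward for the previous offloading decision, and apply `q_update` before choosing the next action.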


2021
Author(s): Shiva Raj Pokhrel, Anwar Walid

Multipath TCP (MPTCP) has emerged as a facilitator for harnessing and pooling available bandwidth in wireless/wireline communication networks and in data centers. Existing implementations of MPTCP, such as the Linked Increase Algorithm (LIA), Opportunistic LIA (OLIA), and BAlanced LInked Adaptation (BALIA), use separate algorithms for congestion control and packet scheduling, with pre-selected control parameters. We propose a Deep Q-Learning (DQL) based framework for joint congestion control and packet scheduling in MPTCP. At the heart of the solution is an intelligent agent for interfacing, learning, and actuation, which learns an optimal congestion control and scheduling mechanism from experience using DQL techniques with policy gradients. We provide a rigorous stability analysis of the system dynamics, which yields important practical design insights. In addition, the proposed DQL-MPTCP algorithm uses a recurrent neural network with long short-term memory (LSTM) to (i) continuously learn the dynamic behavior of subflows (paths) and (ii) respond promptly to that behavior using prioritized experience replay. With extensive emulations, we show that the proposed DQL-based MPTCP algorithm outperforms the MPTCP LIA, OLIA, and BALIA algorithms. Moreover, the DQL-MPTCP algorithm is robust to time-varying network characteristics and provides dynamic exploration and exploitation of paths. The revised version is to appear in IEEE Transactions on Mobile Computing.
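As a concrete but deliberately simplified illustration of the components named above, the Python/PyTorch sketch below pairs an LSTM-based Q-network over per-subflow measurements with a toy prioritized replay buffer. The feature set, action space, and buffer mechanics are assumptions for illustration, not the authors' DQL-MPTCP implementation, which additionally couples congestion control with packet scheduling and uses policy-gradient techniques.

```python
import random

import torch
import torch.nn as nn

# Hypothetical sketch: feature set, action space, and buffer design are
# illustrative assumptions, not the paper's actual DQL-MPTCP design.

class RecurrentQNet(nn.Module):
    """LSTM-based Q-network: maps a window of per-subflow measurements
    (e.g. RTT, loss rate, cwnd) to Q-values over discrete control actions
    (e.g. cwnd adjustment and packet-allocation choices)."""
    def __init__(self, n_features=6, hidden=64, n_actions=9):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq):           # obs_seq: (batch, time, n_features)
        out, _ = self.lstm(obs_seq)
        return self.head(out[:, -1])      # Q-values from the last time step

class PrioritizedReplay:
    """Toy prioritized experience replay: transitions are sampled with
    probability proportional to their absolute TD error."""
    def __init__(self, capacity=10_000):
        self.buffer, self.priorities, self.capacity = [], [], capacity

    def push(self, transition, td_error):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(abs(td_error) + 1e-3)  # keep priorities positive

    def sample(self, k):
        k = min(k, len(self.buffer))
        return random.choices(self.buffer, weights=self.priorities, k=k)

# Example forward pass: a batch of 4 observation windows, each 10 steps long.
qnet = RecurrentQNet()
q_values = qnet(torch.randn(4, 10, 6))    # shape: (4, 9)
```

The recurrent layer is what lets the agent condition on a history of subflow measurements rather than a single snapshot, which is the role the abstract assigns to the RNN/LSTM component.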

