Intelligent Stretch Optimization in Information Centric Networking-Based Tactile Internet Applications

2021 ◽  
Vol 11 (16) ◽  
pp. 7351
Author(s):  
Hussain Ahmad ◽  
Muhammad Zubair Islam ◽  
Rashid Ali ◽  
Amir Haider ◽  
Hyungseok Kim

The fifth-generation (5G) mobile network services are currently being made available for different use-case scenarios such as enhanced mobile broadband, ultra-reliable and low-latency communication, and massive machine-type communication. Ever-increasing data requests from users have shifted the communication paradigm toward being based on the type of requested data content, the so-called information-centric networking (ICN). ICN primarily aims to enhance the performance of the network infrastructure in terms of stretch, so as to opt for the best routing path. A reduction in stretch directly reduces the end-to-end (E2E) latency, helping to meet the requirements of 5G-enabled tactile internet (TI) services. The foremost challenge tackled by an ICN-based system is to minimize the stretch while selecting an optimal routing path. Therefore, in this work, a reinforcement learning-based intelligent stretch optimization (ISO) strategy is proposed to reduce stretch and obtain an optimal routing path in ICN-based systems for the realization of 5G-enabled TI services. The problem is formulated as a Markov decision process and solved with a Q-learning algorithm, which explores and exploits the different routing paths within the ICN infrastructure. The simulation results indicate that the proposed strategy finds the optimal routing path for the delay-sensitive, haptic-driven services of 5G-enabled TI, such as augmented reality/virtual reality applications, based upon their stretch profile over ICN. Moreover, we compare and evaluate the simulation results of the proposed ISO strategy against a random routing strategy and the history-aware routing protocol (HARP). The proposed ISO strategy reduces delay by 33.33% and 33.69% compared with random routing and HARP, respectively. Thus, the proposed strategy yields an optimal routing path with lower stretch, minimizing the E2E latency.
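
The abstract casts next-hop selection as a Markov decision process solved with tabular Q-learning. The sketch below illustrates that formulation on a toy topology; the graph, the negative-delay reward, and all hyperparameters are invented here, since the paper's actual state encoding and stretch-based reward are not reproduced in the abstract.

```python
import random

# Hypothetical ICN topology: node -> {neighbor: link delay in ms}.
# Everything here is an illustrative assumption, not the paper's setup.
TOPOLOGY = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "D": 3},
    "C": {"A": 2, "D": 9},
    "D": {"B": 3, "C": 9},   # "D" hosts the requested content
}
GOAL = "D"
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.2, 2000

Q = {(s, a): 0.0 for s in TOPOLOGY for a in TOPOLOGY[s]}

def choose(state):
    """Epsilon-greedy selection among the node's neighbors."""
    if random.random() < EPSILON:
        return random.choice(list(TOPOLOGY[state]))
    return max(TOPOLOGY[state], key=lambda a: Q[(state, a)])

for _ in range(EPISODES):
    state = "A"
    while state != GOAL:
        nxt = choose(state)
        reward = -TOPOLOGY[state][nxt]       # penalize per-hop delay
        best_next = 0.0 if nxt == GOAL else max(Q[(nxt, a)] for a in TOPOLOGY[nxt])
        Q[(state, nxt)] += ALPHA * (reward + GAMMA * best_next - Q[(state, nxt)])
        state = nxt

# Greedy rollout of the learned routing path.
state, path = "A", ["A"]
while state != GOAL:
    state = max(TOPOLOGY[state], key=lambda a: Q[(state, a)])
    path.append(state)
print("learned path:", path)   # expected: A -> B -> D (8 ms total)
```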

2017 ◽  
Vol 7 (1.5) ◽  
pp. 269
Author(s):  
D. Ganesha ◽  
Vijayakumar Maragal Venkatamuni

This research introduces a self-learning modified Q-learning technique in EMCAP (Enhanced Mind Cognitive Architecture of Pupils). Q-learning is a model-free reinforcement learning (RL) technique; in particular, it can be applied to establish an optimal action-selection strategy for any given Markov decision process. The EMCAP architecture [1] enables and presents various agent control strategies for static and dynamic environments. Experiments are conducted to evaluate the performance of each agent individually, and the same statistics are collected for comparison among the different agents. This work considers various kinds of agents at different levels of the architecture for the experimental analysis. The Fungus World testbed, implemented using SWI-Prolog 5.4.6, is used for the experiments; fixed obstructions make specific locations in the Fungus World environment, and various parameters are introduced into the environment to test an agent's performance. The modified Q-learning algorithm is well suited to the EMCAP architecture: in the experiments conducted, the modified Q-learning system obtains more rewards than existing Q-learning.
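
As a point of reference for the modified variant, a minimal tabular Q-learning agent in a gridworld standing in for the Fungus World testbed might look like the sketch below; the grid, rewards, and parameters are assumptions, and the comment marks where a modification to the standard backup would plug in.

```python
from collections import defaultdict
import random

# Illustrative 4x4 gridworld standing in for the Fungus World testbed;
# cell contents, rewards, and parameters are assumptions, not EMCAP's.
SIZE = 4
FUNGUS = {(3, 3): 10}            # reward cells
OBSTACLES = {(1, 1), (2, 1)}     # fixed obstructions
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def step(state, action):
    dx, dy = ACTIONS[action]
    nxt = (min(max(state[0] + dx, 0), SIZE - 1),
           min(max(state[1] + dy, 0), SIZE - 1))
    if nxt in OBSTACLES:
        nxt = state                      # blocked: stay in place
    return nxt, FUNGUS.get(nxt, -0.1)    # small step cost elsewhere

def q_update(s, a, r, s2):
    """Standard Q-learning backup; a 'modified' variant would alter
    this target (e.g. reward shaping or an adaptive learning rate)."""
    best = max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])

total = 0.0
for episode in range(500):
    s = (0, 0)
    for _ in range(50):
        a = (random.choice(list(ACTIONS)) if random.random() < epsilon
             else max(ACTIONS, key=lambda b: Q[(s, b)]))
        s2, r = step(s, a)
        q_update(s, a, r, s2)
        total += r
        s = s2
        if s in FUNGUS:
            break
print("cumulative reward over training:", round(total, 1))
```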


1995 ◽  
Vol 4 (1) ◽  
pp. 3-28 ◽  
Author(s):  
Mance E. Harmon ◽  
Leemon C. Baird ◽  
A. Harry Klopf

An application of reinforcement learning to a linear-quadratic differential game is presented. The reinforcement learning system uses a recently developed algorithm, the residual-gradient form of advantage updating. The game is a Markov decision process with continuous time, states, and actions, linear dynamics, and a quadratic cost function. The game consists of two players, a missile and a plane; the missile pursues the plane and the plane evades the missile. Although a missile-and-plane scenario was the chosen test bed, the reinforcement learning approach presented here is equally applicable to biologically based systems, such as a predator pursuing prey. The reinforcement learning algorithm for optimal control is modified for differential games to find the minimax point rather than the maximum. Simulation results are compared to the analytical solution, demonstrating that the simulated reinforcement learning system converges to the optimal answer. The performances of the residual-gradient and non-residual-gradient forms of advantage updating and of Q-learning are compared, demonstrating that advantage updating converges faster than Q-learning in all simulations. Advantage updating is also demonstrated to converge regardless of the time step duration; Q-learning is unable to converge as the time step duration grows small.
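
The key change for differential games, replacing the max in the backup with a minimax over the two players' actions, can be illustrated with a toy discrete pursuit-evasion game. The ring dynamics, rewards, and parameters below are made up; the paper's continuous linear-quadratic game solved with residual-gradient advantage updating is substantially more involved.

```python
import itertools, random

# Toy 1-D pursuit-evasion on a ring of N cells: the pursuer (max player)
# tries to reach the evader (min player). Dynamics and rewards are
# illustrative assumptions, not the paper's missile/plane game.
N = 8
MOVES = (-1, 0, 1)
ALPHA, GAMMA = 0.1, 0.9
Q = {}  # state = relative distance; joint action = (pursuer, evader)
for d, ja in itertools.product(range(N), itertools.product(MOVES, MOVES)):
    Q[(d, ja)] = 0.0

def minimax_value(d):
    """Max over pursuer moves of the min over evader moves: the
    minimax point sought instead of a plain max."""
    return max(min(Q[(d, (p, e))] for e in MOVES) for p in MOVES)

for _ in range(5000):
    d = random.randrange(1, N)
    p, e = random.choice(MOVES), random.choice(MOVES)
    d2 = (d + e - p) % N
    r = 1.0 if d2 == 0 else -0.05          # capture reward / time penalty
    target = r + (0.0 if d2 == 0 else GAMMA * minimax_value(d2))
    Q[(d, (p, e))] += ALPHA * (target - Q[(d, (p, e))])

print("minimax value at distance 3:", round(minimax_value(3), 3))
```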


2020 ◽  
Vol 17 (9) ◽  
pp. 4683-4687
Author(s):  
Yogesh Chaba ◽  
Mridul Chaba

Nowadays, wireless networks have become popular as mobile applications increase day by day, and node mobility has become an important feature. The property that separates mobile networks from wireless networks is the mobility of the communication devices. Routing mechanisms therefore need to be designed so that they can easily adapt to frequent changes in the mobility pattern of the network. In this paper, the Optimized Link State Routing protocol is modified by incorporating Q-learning, a reinforcement learning algorithm that guides the network in selecting the next node to which packets should be forwarded, by first calculating the reward R and then computing the Q-values associated with the neighbors. The performance of this modified routing protocol is evaluated for delay, throughput, and delivery ratio under two mobility models, Random Waypoint and Random Walk. It is observed that performance in terms of the above parameters improves considerably under both mobility patterns when the intelligent Q-learning algorithm is incorporated into Optimized Link State Routing.
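
A hedged sketch of the described per-neighbor scheme, computing a reward R from delivery feedback and then updating the Q-value kept for each neighbor, might look as follows; the ack-based reward shaping and the parameters are assumptions, since the paper does not spell out its exact R.

```python
import random

# Hypothetical per-neighbor Q-table for a Q-learning-augmented OLSR
# node. Reward shaping and parameters are invented for illustration.
class QRoutingNode:
    def __init__(self, neighbors, alpha=0.2, gamma=0.8, epsilon=0.1):
        self.q = {n: 0.0 for n in neighbors}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def next_hop(self):
        """Epsilon-greedy choice of the neighbor to forward to."""
        if random.random() < self.epsilon:
            return random.choice(list(self.q))   # explore alternate links
        return max(self.q, key=self.q.get)       # exploit best neighbor

    def feedback(self, neighbor, delivered, delay_ms, neighbor_best_q):
        """Compute reward R from delivery feedback, then back up the
        Q-value kept for that neighbor."""
        r = (1.0 - 0.01 * delay_ms) if delivered else -1.0
        target = r + self.gamma * neighbor_best_q
        self.q[neighbor] += self.alpha * (target - self.q[neighbor])

node = QRoutingNode(["n1", "n2", "n3"])
node.feedback("n1", delivered=True, delay_ms=20, neighbor_best_q=0.5)
print(node.next_hop())
```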


2012 ◽  
Vol 433-440 ◽  
pp. 6033-6037
Author(s):  
Xiao Ming Liu ◽  
Xiu Ying Wang

The movement characteristics of nearby traffic flow have an important influence on the main line. A control method for an expressway off-ramp based on Q-learning and extension control is established by analyzing the parameters of the off-ramp and the auxiliary road. First, the basic descriptions of the Q-learning algorithm and extension control are given and analyzed. Then, a reward function is derived through extension control theory to judge the state of the traffic light. Simulation results show that, comparing the queue lengths of the off-ramp and the auxiliary road, the control method based on the Q-learning algorithm and extension control greatly reduces the queue length of the off-ramp, demonstrating the feasibility of the control strategy.
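
One plausible reading of the extension-control reward is a dependent function k(x) on queue length that grades the traffic-light state; the interval bounds, piecewise form, and weighting below are assumptions for illustration only.

```python
# Sketch of an extension-style reward for the off-ramp signal: a 1-D
# dependent function on queue length classifies the state and doubles
# as the Q-learning reward. All bounds and weights are assumptions.
def dependent_k(q_len, ideal_max=10.0, feasible_max=30.0):
    """Extension-style dependent function: positive inside the ideal
    queue interval, increasingly negative toward the feasible bound."""
    if q_len <= ideal_max:
        return 1.0 - q_len / ideal_max
    return -(q_len - ideal_max) / (feasible_max - ideal_max)

def reward(offramp_queue, auxiliary_queue, w=0.7):
    # Weighted so clearing the off-ramp dominates, matching the paper's
    # emphasis on reducing the off-ramp queue.
    return w * dependent_k(offramp_queue) + (1 - w) * dependent_k(auxiliary_queue)

print(reward(offramp_queue=5, auxiliary_queue=20))   # positive: good state
print(reward(offramp_queue=35, auxiliary_queue=20))  # negative: extend green
```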


Author(s):  
Mohamed A. Aref ◽  
Sudharman K. Jayaweera

This article presents the design of a wideband autonomous cognitive radio (WACR) for anti-jamming and interference avoidance. The proposed system model allows multiple WACRs to operate simultaneously over the same spectrum range, producing a multi-agent environment. The objective of each radio is to predict and evade a dynamic jammer signal while avoiding the transmissions of other WACRs. The proposed cognitive framework consists of two operations, sensing and transmission, each aided by its own Q-learning-based algorithm, with both experiencing the same RF environment. The simulation results indicate that the proposed cognitive anti-jamming technique has low computational complexity, significantly outperforms a non-cognitive sub-band selection policy, and is sufficiently robust against the impact of sensing errors.
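
The two-operation design can be caricatured with two independent Q-learners sharing one RF environment, as in the sketch below; the parked-jammer model and all parameters are invented and far simpler than the paper's dynamic jammer.

```python
import random

# Minimal sketch of the two-learner idea: one Q-table chooses the
# sub-band to sense, the other the sub-band to transmit on; both face
# the same RF environment. Jammer model and parameters are assumptions.
BANDS, JAMMED = 8, 3
ALPHA, GAMMA, EPS = 0.15, 0.6, 0.1
q_sense = [0.0] * BANDS
q_tx = [0.0] * BANDS

def pick(q):
    """Epsilon-greedy sub-band choice."""
    return random.randrange(BANDS) if random.random() < EPS else q.index(max(q))

for _ in range(3000):
    s, x = pick(q_sense), pick(q_tx)
    r_sense = 1.0 if s == JAMMED else 0.0    # reward locating the jammer
    r_tx = 1.0 if x != JAMMED else -1.0      # reward a clean transmission
    q_sense[s] += ALPHA * (r_sense + GAMMA * max(q_sense) - q_sense[s])
    q_tx[x] += ALPHA * (r_tx + GAMMA * max(q_tx) - q_tx[x])

print("sense band:", q_sense.index(max(q_sense)))  # should settle on 3
print("tx band:", q_tx.index(max(q_tx)))           # should avoid 3
```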


Nowadays people share their images on many social media platforms without being certain about their privacy, which leads to many issues. In light of these incidents, a tool is needed to provide access control during users' file sharing. To that end, an Adaptive Privacy Policy Prediction (A3P) system is proposed for setting the privacy parameters of users' images. The image data, the user's social context, and the image metadata are used to infer privacy-setting preferences. The authors propose a two-stage framework for determining and setting a privacy policy in accordance with the user's history on the website. The proposed solution depends on the image categories and the social features of the user, as well as on the user's privacy data and the relevant features. The experimental and simulation results establish that privacy preservation is one of the important tasks in internet applications.
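
A minimal sketch of the two-stage idea, classifying the image and then predicting a policy from the user's history with a social-context fallback, is shown below; the categories, policies, and lookup rules are placeholders rather than the paper's actual model.

```python
# Hedged sketch of the two-stage A3P idea: stage 1 assigns the image a
# category, stage 2 predicts a policy from the user's history for that
# category, falling back to a social-context default. All categories,
# policies, and rules here are placeholders, not the paper's model.
HISTORY = {                      # user's past (category -> policy) choices
    ("alice", "family"): "friends-only",
    ("alice", "travel"): "public",
}
SOCIAL_DEFAULT = {"family": "friends-only", "travel": "public",
                  "selfie": "private"}

def classify(image_metadata):
    # Stage 1: stand-in classifier driven by tags in the metadata.
    return image_metadata.get("tag", "selfie")

def predict_policy(user, image_metadata):
    # Stage 2: history first, social-context default otherwise.
    category = classify(image_metadata)
    return HISTORY.get((user, category), SOCIAL_DEFAULT[category])

print(predict_policy("alice", {"tag": "family"}))   # friends-only
print(predict_policy("bob", {"tag": "selfie"}))     # private (default)
```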


Author(s):  
Ying Wang ◽  
Clarence W. de Silva

Multi-robot systems have received more and more attention in the robotics community over the past decade. The most important issue in this area is multi-robot coordination, which focuses on how to make multiple autonomous robots cooperate or compete with each other to complete a common task. Due to its complexity, conventional planning-based or behavior-based approaches do not work well for multi-robot coordination, especially in a dynamic, unknown environment. Machine learning is therefore becoming a promising method for helping robots work in unknown dynamic environments and progressively improve their performance. The Q-learning algorithm has been selected by most multi-robot researchers to accomplish this objective because of its simplicity and low computational requirements. However, directly extending the single-agent Q-learning algorithm violates its Markov assumption, resulting in a low convergence speed and a failure to learn a good cooperative policy. In this paper, the team Q-learning algorithm, originally designed for the framework of stochastic games (SG), is proposed to make decisions for a purely cooperative multi-robot project: multi-robot object transportation. First, the basic ideas of the stochastic-game framework and the team Q-learning algorithm are introduced. Next, the algorithm is extended to a multi-robot object transportation task, and the implementation details are presented. Computer simulation results demonstrate that the team Q-learning algorithm works well for making decisions in the proposed multi-robot system. Finally, the effects of some parameters of team Q-learning are assessed and some interesting conclusions are drawn. In particular, the simulation results show that training helps improve the performance of multi-robot decision-making, but its effect is limited. It is also pointed out that team Q-learning results in a huge learning space when the number of robots exceeds ten, indicating that a new Q-learning algorithm integrating single-agent Q-learning and team Q-learning urgently needs to be developed for multi-robot systems.
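
The defining feature of team Q-learning, one shared Q-table over the joint action space updated with the common reward, and the resulting exponential blow-up can both be seen in a toy cooperative pushing task; the task, rewards, and single-state (bandit-style) backup below are illustrative simplifications.

```python
import itertools, random

# Toy team Q-learning for a fully cooperative task: one shared Q-table
# over the JOINT action space, updated with the common reward. The task
# (both robots must push the same side) is invented for illustration.
ACTIONS = ("push_left", "push_right")
ROBOTS = 2
JOINT = list(itertools.product(ACTIONS, repeat=ROBOTS))
Q = {ja: 0.0 for ja in JOINT}
ALPHA = 0.1

for _ in range(1000):
    ja = random.choice(JOINT)                 # exploratory joint action
    r = 1.0 if len(set(ja)) == 1 else -0.5    # cooperate: push same side
    Q[ja] += ALPHA * (r - Q[ja])              # single-state (bandit) backup

print(max(Q, key=Q.get))                      # a coordinated joint action

# Joint-action count grows as |A|**n: with 4 actions and 10 robots the
# table already needs 4**10 = 1,048,576 entries per state, the
# explosion the authors note.
print(len(ACTIONS) ** ROBOTS, "joint actions for", ROBOTS, "robots")
```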


Author(s):  
Shuhuan Wen ◽  
Xueheng Hu ◽  
Zhen Li ◽  
Hak Keung Lam ◽  
Fuchun Sun ◽  
...  

Purpose: This paper aims to propose a novel active SLAM framework that achieves obstacle avoidance and autonomous navigation in indoor environments. Design/methodology/approach: An improved fuzzy optimized Q-learning (FOQL) algorithm is used to solve the robot's obstacle-avoidance problem in the environment. To reduce the motion deviation of the robot, a fractional-order controller is designed. The localization of the robot is based on the FastSLAM algorithm. Findings: Simulation results of obstacle avoidance using the traditional Q-learning algorithm, an optimized Q-learning algorithm, and the FOQL algorithm are compared; they show that the improved FOQL algorithm learns faster than the other two. To verify the simulation results, the FOQL algorithm was implemented on a NAO robot, and the experimental results demonstrate that the improved fuzzy optimized Q-learning obstacle-avoidance algorithm is feasible and effective. Originality/value: The improved FOQL algorithm, together with the fractional-order controller designed to reduce motion deviation, is validated on a NAO robot, demonstrating the feasibility and effectiveness of the approach.
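
One way the fuzzy component of such an algorithm can work is to fuzzify raw range readings into coarse labels that index the Q-table, with the membership degree scaling the update; the membership bounds and actions below are assumptions, and the paper's FOQL rules are richer.

```python
# Minimal sketch of the fuzzified state space for a fuzzy-optimized
# Q-learning controller: range readings map to coarse labels that index
# the Q-table, shrinking the state space. Bounds and actions are
# assumptions; the paper's FOQL design is richer than this.
def fuzzify(dist_m):
    """Map a range reading to (label, membership degree)."""
    if dist_m < 0.4:
        return "near", min(1.0, (0.4 - dist_m) / 0.4 + 0.5)
    if dist_m < 1.0:
        return "medium", 1.0 - abs(dist_m - 0.7) / 0.3
    return "far", 1.0

ACTIONS = ("forward", "turn_left", "turn_right")
Q = {(s, a): 0.0 for s in ("near", "medium", "far") for a in ACTIONS}
ALPHA, GAMMA = 0.2, 0.9

def update(reading, action, reward, next_reading):
    s, mu = fuzzify(reading)
    s2, _ = fuzzify(next_reading)
    best = max(Q[(s2, a)] for a in ACTIONS)
    # Membership degree scales the step: confident states learn faster.
    Q[(s, action)] += ALPHA * mu * (reward + GAMMA * best - Q[(s, action)])

update(reading=0.3, action="turn_left", reward=1.0, next_reading=0.8)
print(Q[("near", "turn_left")])
```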

