The Rising E-sports Industry: Machine Learning/AI to Improve In-Game Performance Using Deep Reinforcement Learning

Author(s):  
Xianzuo Du ◽  
Xiwei Fuqian ◽  
Jiaxi Hu ◽  
Zechen Wang ◽  
Dongju Yang


Author(s):  
Ivan Herreros

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and then introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, between reinforcement learning and operant conditioning, and between unsupervised learning and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs anticipatory and adaptive control. Finally, it argues that this framework of translating knowledge between formal and biological disciplines can serve not only to structure and advance our understanding of brain function but also to enrich engineering solutions at the level of robot learning and control with insights coming from biology.
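To make the feedback vs feed-forward distinction concrete, here is a minimal sketch in Python; the toy plant, gains, and the combined strategy are illustrative assumptions, not the chapter's own formulation.

```python
def feedback_control(setpoint, measured, kp=0.2):
    # Feedback (reactive) control: respond to the observed error.
    return kp * (setpoint - measured)

def feedforward_control(setpoint, plant_gain=2.0):
    # Feed-forward (anticipatory) control: act from a model of the
    # plant before any error is observed.
    return setpoint / plant_gain

# Toy plant: output = plant_gain * control + constant disturbance.
plant_gain, disturbance = 2.0, 0.3
y = 0.0
for _ in range(30):
    u = feedforward_control(1.0) + feedback_control(1.0, y)
    y = plant_gain * u + disturbance
print(round(y, 3))  # settles near the setpoint, offset by the disturbance
```

With proportional-only feedback the disturbance leaves a steady-state offset, which is exactly the kind of error an adaptive or anticipatory component would learn to cancel.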


Photonics ◽  
2021 ◽  
Vol 8 (2) ◽  
pp. 33
Author(s):  
Lucas Lamata

Quantum machine learning has emerged as a promising paradigm that could accelerate machine learning calculations. Within this field, quantum reinforcement learning aims at designing and building quantum agents that can exchange information with their environment and adapt to it in pursuit of some goal. Different quantum platforms have been considered for quantum machine learning and specifically for quantum reinforcement learning. Here, we review the field of quantum reinforcement learning and its implementation with quantum photonics. This quantum technology may enhance quantum computation and communication, as well as machine learning, via the fruitful marriage between these previously unrelated fields.


2021 ◽  
pp. 027836492098785
Author(s):  
Julian Ibarz ◽  
Jie Tan ◽  
Chelsea Finn ◽  
Mrinal Kalakrishnan ◽  
Peter Pastor ◽  
...  

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which do not reflect the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as embodied agents in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building on these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and for machine learning researchers interested in furthering the progress of deep RL in the real world.


Author(s):  
Ali Fakhry

Applications of Deep Q-Networks are seen throughout the field of reinforcement learning, a large subfield of machine learning. Using a classic environment from OpenAI, CarRacing-v0, a 2D car-racing environment, alongside a custom modification of that environment, a Deep Q-Network (DQN) was created to solve both the classic and custom environments. The environments are tested using custom-made CNN architectures and transfer learning from ResNet18. While DQNs were state of the art years ago, using one for CarRacing-v0 appears somewhat unappealing and less effective than other reinforcement learning techniques. Overall, while the model did train and the agent learned various parts of the environment, reaching the environment's reward threshold with this technique proved problematic and difficult, and other techniques would likely be more useful.
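As a rough illustration of the setup described, here is a minimal PyTorch sketch of a DQN with a ResNet18 backbone for transfer learning; the discretised action set, input size, and hyperparameters are assumptions for illustration, not the author's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class DQN(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")  # transfer learning
        backbone.fc = nn.Identity()                   # keep 512-d features
        self.backbone = backbone
        self.head = nn.Linear(512, n_actions)         # one Q-value per action

    def forward(self, x):                             # x: (B, 3, 96, 96) frames
        return self.head(self.backbone(x))

# Illustrative discretisation of CarRacing-v0's continuous controls.
ACTIONS = ["left", "right", "gas", "brake", "noop"]
policy_net = DQN(len(ACTIONS))
target_net = DQN(len(ACTIONS))
target_net.load_state_dict(policy_net.state_dict())

def td_loss(batch, gamma=0.99):
    # Standard DQN temporal-difference loss with a frozen target network.
    s, a, r, s2, done = batch
    q = policy_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(s2).max(1).values
    target = r + gamma * (1 - done) * q_next
    return nn.functional.smooth_l1_loss(q, target)
```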


Author(s):  
Dómhnall J. Jennings ◽  
Eduardo Alonso ◽  
Esther Mondragón ◽  
Charlotte Bonardi

Standard associative learning theories typically fail to conceptualise the temporal properties of a stimulus, and hence cannot easily make predictions about the effects such properties might have on the magnitude of conditioning phenomena. Despite this, in intuitive terms we might expect the temporal properties of a stimulus that is paired with some outcome to be important. In particular, there is no previous research addressing the way that fixed- or variable-duration stimuli can affect overshadowing. In this chapter we report results showing that the degree of overshadowing depends on the distribution form (fixed or variable) of the overshadowing stimulus, and argue that conditioning is weaker under conditions of temporal uncertainty. These results are discussed in terms of models of conditioning and timing. We conclude that the temporal difference model, which has been extensively applied to the reinforcement learning problem in machine learning, accounts for the key findings of our study.
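For readers unfamiliar with it, the temporal difference model updates a value estimate toward the reward plus the discounted value of the next state; a minimal TD(0) sketch follows, with conditioning-style toy states and parameters chosen purely for illustration.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.95):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)  # prediction error
    V[s] = V.get(s, 0.0) + alpha * delta
    return delta  # in conditioning terms, the reward-prediction error

# Toy trial: a CS state reliably precedes a US (reward) state.
V = {}
for _ in range(50):
    td0_update(V, "CS", 0.0, "US")
    td0_update(V, "US", 1.0, "end")
print(round(V["CS"], 3), round(V["US"], 3))  # CS value approaches gamma * V(US)
```

Temporal uncertainty in the overshadowing stimulus would, on this account, spread the prediction error over time and weaken the learned association.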


Author(s):  
Yongbiao Gao ◽  
Yu Zhang ◽  
Xin Geng

Label distribution learning (LDL) is a novel machine learning paradigm that assigns each label a description degree for an instance. However, most training datasets contain only simple logical labels rather than label distributions, owing to the difficulty of obtaining label distributions directly. We propose to use prior knowledge to recover the label distributions. The process of recovering label distributions from logical labels is called label enhancement. In this paper, we formulate label enhancement as a dynamic decision process: the label distribution is adjusted by a series of actions conducted by a reinforcement learning agent according to sequential state representations, with the target state defined by the prior knowledge. Experimental results show that the proposed approach outperforms state-of-the-art methods in both age estimation and image emotion recognition.
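A minimal sketch of the general idea, iteratively nudging a label distribution from logical labels toward a prior-defined target state, is given below; the greedy action rule and update scheme are illustrative stand-ins, not the paper's actual agent or reward formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def enhance(logical_labels, target_prior, steps=100, lr=0.5):
    """Adjust a label distribution step by step toward a prior-defined target."""
    logits = np.log(logical_labels + 1e-6)  # start from the logical labels
    for _ in range(steps):
        d = softmax(logits)
        # Action: raise or lower the logit of the label whose description
        # degree deviates most from the target state given by the prior.
        i = np.argmax(np.abs(target_prior - d))
        logits[i] += lr * np.sign(target_prior[i] - d[i])
    return softmax(logits)

logical = np.array([1.0, 1.0, 0.0])  # simple logical labels
prior = np.array([0.6, 0.3, 0.1])    # hypothetical prior-knowledge target
print(enhance(logical, prior).round(2))
```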


2012 ◽  
pp. 1434-1444
Author(s):  
Adam E. Gaweda

This chapter presents an application of reinforcement learning to drug dosing personalization in the treatment of chronic conditions. Reinforcement learning is a machine learning paradigm that mimics the trial-and-error skill acquisition typical of humans and animals. In the treatment of chronic illnesses, finding the optimal dose amount for an individual is also a process usually based on trial and error. In this chapter, the author focuses on the challenge of personalized anemia treatment with recombinant human erythropoietin. The author demonstrates the application of a standard reinforcement learning method, called Q-learning, to guide the physician in selecting the optimal erythropoietin dose. The author further addresses the issue of random exploration in Q-learning from the drug dosing perspective and proposes a "smart" exploration method. Finally, the author performs computer simulations to compare the outcomes of reinforcement learning-based anemia treatment to those achieved by a standard dosing protocol used at a dialysis unit.
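A minimal tabular Q-learning sketch in the spirit of this setup follows; the discretised states and dose levels are hypothetical, and restricting exploration to doses adjacent to the current best is only one plausible reading of the chapter's "smart" exploration, not its actual method.

```python
import random

DOSES = [0, 1, 2, 3, 4]   # discretised EPO dose levels (hypothetical)
STATES = range(8)         # discretised hemoglobin bands (hypothetical)
Q = {(s, a): 0.0 for s in STATES for a in DOSES}

def choose_dose(s, eps=0.2):
    best = max(DOSES, key=lambda a: Q[(s, a)])
    if random.random() < eps:
        # "Smart" exploration: only try doses near the current best,
        # avoiding clinically implausible random jumps.
        neighbors = [a for a in DOSES if abs(a - best) == 1]
        return random.choice(neighbors or [best])
    return best

def q_update(s, a, r, s2, alpha=0.1, gamma=0.9):
    # Standard Q-learning update toward reward plus discounted best value.
    target = r + gamma * max(Q[(s2, b)] for b in DOSES)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
```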


2021 ◽  
Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Hugh Kwon ◽  
Bo Liu ◽  
Wen Dong ◽  
...  

Human-robot interactive decision-making is becoming increasingly ubiquitous, and explainability is an influential factor in determining reliance on autonomy. However, it is not reasonable to trust systems beyond our comprehension, and typical machine learning and data-driven decision-making are black-box paradigms that impede explainability. It is therefore critical to establish computationally efficient decision-making mechanisms enhanced by explainability-aware strategies. To this end, we propose Trustworthy Decision-Making (TDM), an explainable neuro-symbolic approach that integrates symbolic planning into hierarchical reinforcement learning. The TDM framework enables subtask-level explainability through causally related and understandable subtasks. TDM also demonstrates the advantage of integrating symbolic planning with reinforcement learning, reaping the benefits of both worlds. Experimental results validate the effectiveness of the proposed method while improving the explainability of the decision-making process.
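The planner-over-learner pattern that TDM exemplifies can be sketched as follows: a symbolic planner yields a human-readable subtask sequence, and a low-level RL policy executes each subtask. The plan, policy interface, and environment hooks below are illustrative placeholders, not TDM's actual components.

```python
def symbolic_plan(state, goal):
    # Stand-in for a symbolic planner (e.g., an answer-set or PDDL
    # solver): returns an explainable sequence of subtasks.
    return ["navigate_to_door", "open_door", "reach_goal"]

class SubtaskPolicy:
    """One low-level RL policy per symbolic subtask (placeholder)."""
    def __init__(self, name):
        self.name = name

    def act(self, obs):
        return 0  # action chosen by the learned subtask policy

def execute(goal, reset, step):
    """Run the plan; each completed subtask is a readable explanation."""
    obs = reset()
    for subtask in symbolic_plan(obs, goal):
        policy = SubtaskPolicy(subtask)
        done = False
        while not done:
            obs, done = step(policy.act(obs))
        print(f"completed subtask: {subtask}")  # subtask-level explanation
```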


Author(s):  
Jonathan Becker ◽  
Aveek Purohit ◽  
Zheng Sun

The USARSim group at NIST developed a simulated robot that operated in the Unreal Tournament 3 (UT3) gaming environment. They used a software PID controller to control the robot in UT3 worlds. Unfortunately, the PID controller did not work well, so NIST asked us to develop a better controller using machine learning techniques. In the process, we characterized the software PID controller and the robot's behavior in UT3 worlds. Using data collected from our simulations, we compared different machine learning techniques, including linear regression and reinforcement learning (RL). Finally, we implemented an RL-based controller in MATLAB and ran it in the UT3 environment via a TCP/IP link between MATLAB and UT3.
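For reference, a discrete-time PID controller of the kind being replaced can be sketched in a few lines; the gains and time step are illustrative, and the Python rendering is an assumption (the original ran against UT3, the replacement in MATLAB).

```python
class PID:
    def __init__(self, kp, ki, kd, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def control(self, setpoint, measured):
        # Proportional + integral + derivative terms on the tracking error.
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=1.2, ki=0.1, kd=0.05)
command = pid.control(setpoint=1.0, measured=0.7)  # e.g., sent over TCP/IP to UT3
```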


Author(s):  
Chang-Shing Lee ◽  
Mei-Hui Wang ◽  
Yi-Lin Tsai ◽  
Wei-Shan Chang ◽  
Marek Reformat ◽  
...  

Current developments in Artificial Intelligence (AI) and its influence on different industries mean that human-robot cooperation is of special importance. Various types of robots have been applied to the so-called field of Edutainment, i.e., the field that combines education with entertainment. This paper introduces a novel fuzzy-based system for human-robot cooperative Edutainment. This co-learning system includes a brain-computer interface (BCI) ontology model and a Fuzzy Markup Language (FML)-based Reinforcement Learning Agent (FRL-Agent). The proposed FRL-Agent is composed of (1) a human learning agent, (2) a robotic teaching agent, (3) a Bayesian estimation agent, (4) a robotic BCI agent, (5) a fuzzy machine learning agent, and (6) a fuzzy BCI ontology. In order to verify the effectiveness of the proposed system, the FRL-Agent is used as a robot teacher in a number of elementary schools, junior high schools, and a university, allowing robot teachers and students to learn together in the classroom. The participating students use handheld devices to interact directly or indirectly with the robot teachers to learn English. Additionally, a number of university students wear a commercial EEG device with eight electrode channels while they learn English and listen to music. In the experiments, the robotic BCI agent analyzes the signals collected from the EEG device and transforms them into five physiological indices while the students are learning or listening. The Bayesian estimation agent and fuzzy machine learning agent optimize the parameters of the FRL-Agent and store them in the fuzzy BCI ontology. The experimental results show that the robot teachers motivate students to learn and stimulate their progress. The fuzzy machine learning agent is able to predict the five physiological indices based on the eight-channel EEG data and the trained model. In addition, we train the model to predict other students' feelings based on the analyzed physiological indices and labeled feelings. The FRL-Agent is able to provide personalized learning content based on the developed human-robot cooperative edutainment approaches. To our knowledge, the FRL-Agent has not been applied to teaching settings such as elementary schools before, and it opens up a promising new line of research in human-robot co-learning. In the future, we hope the FRL-Agent will address an existing classroom problem: high-performing students find the learning content too simple to stay motivated, while low-performing students cannot keep up with the learning progress and give up.
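To give a feel for the fuzzy-rule style such a system rests on, here is a minimal fuzzy inference sketch; the membership functions, index names, and rules are entirely hypothetical, standing in for the paper's FML rule base over the five physiological indices.

```python
def tri(x, a, b, c):
    """Triangular membership function over [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def infer_engagement(attention, relaxation):
    # Hypothetical FML-style rules over two indices scaled to [0, 1]:
    #   IF attention is high AND relaxation is low THEN engagement is high
    #   IF attention is low                        THEN engagement is low
    high_att = tri(attention, 0.5, 1.0, 1.5)
    low_att = tri(attention, -0.5, 0.0, 0.5)
    low_rel = tri(relaxation, -0.5, 0.0, 0.5)
    fire_high = min(high_att, low_rel)
    fire_low = low_att
    # Weighted-average defuzzification with singleton outputs 1.0 and 0.0.
    total = fire_high + fire_low
    return 0.5 if total == 0 else (fire_high * 1.0 + fire_low * 0.0) / total

print(round(infer_engagement(attention=0.8, relaxation=0.2), 2))
```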

