Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Author(s):  
Guiliang Liu ◽  
Oliver Schulte

A variety of machine learning models have been proposed to assess the performance of players in professional sports. However, they have only a limited ability to model how player performance depends on the game context. This paper proposes a new approach to capturing game context: we apply Deep Reinforcement Learning (DRL) to learn an action-value Q function from 3M play-by-play events in the National Hockey League (NHL). The neural network representation integrates both continuous context signals and game history, using a possession-based LSTM. The learned Q-function is used to value players' actions under different game contexts. To assess a player's overall performance, we introduce a novel Game Impact Metric (GIM) that aggregates the values of the player's actions. Empirical evaluation shows that GIM is consistent throughout a season and correlates highly with standard success measures and future salary.
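As a rough illustration of the approach (not the authors' code), the sketch below builds a small LSTM-based Q-network over event sequences and aggregates, per player, the change in Q-value associated with each of their actions. The feature dimension, player identifiers, and the single "home team scores" Q-output used here are assumptions made for the example.

```python
# Minimal sketch (illustrative, not the paper's implementation): an LSTM Q-network
# over play-by-play event sequences, plus a Game Impact Metric (GIM) that credits
# each player with the change in Q produced by their actions.
import torch
import torch.nn as nn
from collections import defaultdict

class PossessionQNet(nn.Module):
    def __init__(self, feat_dim=12, hidden=64, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)   # e.g. Q(home scores, away scores, game ends)

    def forward(self, seq):                         # seq: (batch, time, feat_dim)
        out, _ = self.lstm(seq)
        return self.head(out)                       # Q-values at every event

# Dummy possession: 5 events, each with a 12-dim context vector and an acting player.
events = torch.randn(1, 5, 12)
players = ["p1", "p2", "p1", "p3", "p2"]

qnet = PossessionQNet()
with torch.no_grad():
    q_home = qnet(events)[0, :, 0]                  # Q for "home team scores next"

# GIM (simplified): sum, per player, of the Q-value change caused by their action.
gim = defaultdict(float)
prev = 0.0
for t, player in enumerate(players):
    gim[player] += q_home[t].item() - prev
    prev = q_home[t].item()

print(dict(gim))
```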

Author(s):  
Yudong Luo ◽  
Oliver Schulte ◽  
Pascal Poupart

A major task of sports analytics is to rank players based on the impact of their actions. Recent methods have applied reinforcement learning (RL) to assess the value of actions from a learned action value or Q-function. A fundamental challenge for estimating action values is that explicit reward signals (goals) are very sparse in many team sports, such as ice hockey and soccer. This paper combines Q-function learning with inverse reinforcement learning (IRL) to provide a novel player ranking method. We treat professional play as expert demonstrations for learning an implicit reward function. Our method alternates single-agent IRL to learn a reward function for multiple agents; we provide a theoretical justification for this procedure. Knowledge transfer is used to combine learned rewards and observed rewards from goals. Empirical evaluation, based on 4.5M play-by-play events in the National Hockey League (NHL), indicates that player ranking using the learned rewards achieves high correlations with standard success measures and temporal consistency throughout a season.
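A minimal sketch of the overall idea, under assumed rewards and synthetic play-by-play tuples: a dense learned (IRL-style) reward is blended with the sparse observed goal reward, a tabular Q-function is fit by TD learning, and players are ranked by the summed values of their observed actions. The mixing weight, toy state space, and one-pass TD loop are illustrative choices, not the paper's procedure.

```python
# Minimal sketch (assumptions, not the paper's implementation): combine a dense
# learned reward with the sparse goal reward, fit a Q-function by TD(0) on
# play-by-play tuples, and rank players by their summed action values.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr, alpha = 20, 4, 0.9, 0.1, 0.5

# Hypothetical rewards: r_learned would come from IRL; r_goal is nonzero only on goals.
r_learned = rng.normal(0.0, 0.1, size=(n_states, n_actions))
r_goal = np.zeros((n_states, n_actions)); r_goal[n_states - 1, :] = 1.0
r_combined = alpha * r_learned + (1 - alpha) * r_goal       # knowledge-transfer step

# Synthetic play-by-play tuples: (state, action, next state, acting player).
events = [(rng.integers(n_states), rng.integers(n_actions),
           rng.integers(n_states), f"p{rng.integers(5)}") for _ in range(5000)]

Q = np.zeros((n_states, n_actions))
for s, a, s2, _ in events:                                   # one pass of TD(0)
    target = r_combined[s, a] + gamma * Q[s2].max()
    Q[s, a] += lr * (target - Q[s, a])

ranking = defaultdict(float)
for s, a, _, player in events:
    ranking[player] += Q[s, a]                               # aggregate action values

print(sorted(ranking.items(), key=lambda kv: -kv[1]))
```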


2021 ◽  
Vol 11 (2) ◽  
pp. 796
Author(s):  
Alhanoof Althnian ◽  
Duaa AlSaeed ◽  
Heyam Al-Baity ◽  
Amani Samha ◽  
Alanoud Bin Dris ◽  
...  

Dataset size is considered a major concern in the medical domain, where lack of data is a common occurrence. This study aims to investigate the impact of dataset size on the overall performance of supervised classification models. We examined the performance of six models widely used in the medical field, namely support vector machine (SVM), neural networks (NN), C4.5 decision tree (DT), random forest (RF), AdaBoost (AB), and naïve Bayes (NB), on eighteen small medical UCI datasets. We further implemented three dataset size reduction scenarios on two large datasets and analyzed the performance of the models when trained on each resulting dataset with respect to accuracy, precision, recall, F-score, specificity, and area under the ROC curve (AUC). Our results indicate that the overall performance of classifiers depends on how well a dataset represents the original distribution rather than on its size. Moreover, we found that the models most robust to limited medical data are AB and NB, followed by SVM, and then RF and NN, while the least robust model is DT. Furthermore, an interesting observation is that a model's robustness to limited data does not necessarily imply that it provides the best performance compared to other models.
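A hedged sketch of the kind of protocol described, using scikit-learn stand-ins (CART in place of C4.5, an MLP for the neural network) and a single public dataset rather than the UCI collections used in the study: each classifier is retrained on progressively smaller stratified subsets and scored on a fixed held-out test set.

```python
# Minimal sketch (assumed setup, not the study's protocol): train several classifiers
# on shrinking stratified subsets and report accuracy and AUC on a fixed test set.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

models = {"SVM": SVC(probability=True), "NN": MLPClassifier(max_iter=1000),
          "DT": DecisionTreeClassifier(), "RF": RandomForestClassifier(),
          "AB": AdaBoostClassifier(), "NB": GaussianNB()}

for frac in (1.0, 0.5, 0.1):                      # dataset size reduction scenarios
    if frac < 1.0:
        X_sub, _, y_sub, _ = train_test_split(X_tr, y_tr, train_size=frac,
                                              stratify=y_tr, random_state=0)
    else:
        X_sub, y_sub = X_tr, y_tr
    for name, model in models.items():
        model.fit(X_sub, y_sub)
        proba = model.predict_proba(X_te)[:, 1]
        print(f"frac={frac:.1f} {name}: acc={accuracy_score(y_te, model.predict(X_te)):.3f} "
              f"auc={roc_auc_score(y_te, proba):.3f}")
```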


Author(s):  
Carles Gelada ◽  
Marc G. Bellemare

In this paper we revisit the method of off-policy corrections for reinforcement learning (COP-TD) pioneered by Hallak et al. (2017). Under this method, online updates to the value function are reweighted to avoid divergence issues typical of off-policy learning. While Hallak et al.’s solution is appealing, it cannot easily be transferred to nonlinear function approximation. First, it requires a projection step onto the probability simplex; second, even though the operator describing the expected behavior of the off-policy learning algorithm is convergent, it is not known to be a contraction mapping, and hence, may be more unstable in practice. We address these two issues by introducing a discount factor into COP-TD. We analyze the behavior of discounted COP-TD and find it better behaved from a theoretical perspective. We also propose an alternative soft normalization penalty that can be minimized online and obviates the need for an explicit projection step. We complement our analysis with an empirical evaluation of the two techniques in an off-policy setting on the game Pong from the Atari domain where we find discounted COP-TD to be better behaved in practice than the soft normalization penalty. Finally, we perform a more extensive evaluation of discounted COP-TD in 5 games of the Atari domain, where we find performance gains for our approach.
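A small tabular sketch of discounted COP-TD as I read it: the ratio estimate c(s) ≈ d_pi(s)/d_mu(s) is updated toward (1 − γ̂) + γ̂·ρ·c(s), with ρ = π(a|s)/μ(a|s) the importance ratio. The toy MDP, policies, and step size are made up; the paper's experiments use nonlinear function approximation on Atari, not a tabular setting.

```python
# Minimal sketch (my reading, not the paper's code): tabular discounted COP-TD.
# Learn c(s) ~ d_pi(s)/d_mu(s) from off-policy transitions via
#   c(s') <- c(s') + lr * ((1 - gamma_hat) + gamma_hat * rho * c(s) - c(s')).
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 6, 2
gamma_hat, lr = 0.95, 0.05

mu = np.full((n_states, n_actions), 1.0 / n_actions)               # behavior policy
pi = rng.dirichlet(np.ones(n_actions), size=n_states)              # target policy
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # toy dynamics

c = np.ones(n_states)                                              # ratio estimates
s = 0
for _ in range(20000):
    a = rng.choice(n_actions, p=mu[s])
    s2 = rng.choice(n_states, p=P[s, a])
    rho = pi[s, a] / mu[s, a]                                      # importance ratio
    target = (1.0 - gamma_hat) + gamma_hat * rho * c[s]            # discounted COP-TD target
    c[s2] += lr * (target - c[s2])
    s = s2

print(np.round(c, 3))
```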


Author(s):  
Y Srinivasa Rao ◽  
G. Ravi Kumar ◽  
G. Kesava Rao

This paper presents fault detection and classification for power system transmission lines using the discrete wavelet transform and artificial neural networks. The analysis applies the discrete wavelet transform to the measured fault-phase currents. The work concentrates mainly on fault classification: the energy values obtained from the wavelet decomposition are used as inputs to the neural network, which determines the fault type. The proposed system and analysis are implemented in MATLAB Simulink.
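An illustrative sketch, not the authors' MATLAB/Simulink model: each phase current is decomposed with a discrete wavelet transform, the energies of the detail bands are used as features, and a small neural network classifies the fault type. The wavelet choice, decomposition level, and synthetic signals are assumptions.

```python
# Minimal sketch (illustrative only): DWT energy features from three phase currents
# feeding a small neural-network classifier, on toy synthetic signals.
import numpy as np
import pywt
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def wavelet_energies(signal, wavelet="db4", level=3):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs[1:]])      # detail-band energies

def synthetic_phase_currents(fault_class):
    # Toy three-phase currents: the fault class scales a high-frequency transient.
    t = np.linspace(0, 0.1, 512)
    base = np.sin(2 * np.pi * 50 * t)
    transient = (fault_class + 1) * 0.3 * rng.normal(size=t.size) * (t > 0.05)
    return [base + transient + rng.normal(scale=0.01, size=t.size) for _ in range(3)]

X, y = [], []
for _ in range(300):
    label = rng.integers(4)                                     # 4 hypothetical fault types
    feats = np.concatenate([wavelet_energies(s) for s in synthetic_phase_currents(label)])
    X.append(feats); y.append(label)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(np.array(X), np.array(y))
print("training accuracy:", clf.score(np.array(X), np.array(y)))
```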


Author(s):  
Ziming Li ◽  
Julia Kiseleva ◽  
Maarten De Rijke

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that provides a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate higher-quality responses and achieve higher overall performance than the state of the art.
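A hedged sketch of the discriminator-as-reward idea: a discriminator scores (context, reply) pairs, its output is mapped to an AIRL-style reward log D − log(1 − D), and that reward drives a REINFORCE-style generator update. The dummy encodings and reward form here are illustrative stand-ins, not the paper's architecture.

```python
# Minimal sketch (not the paper's model): a discriminator over (context, reply)
# pairs whose output yields an AIRL-style reward used in a policy-gradient update.
import torch
import torch.nn as nn

emb_dim = 32
disc = nn.Sequential(nn.Linear(2 * emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def reward(context_vec, reply_vec):
    logit = disc(torch.cat([context_vec, reply_vec], dim=-1))
    d = torch.sigmoid(logit)
    return torch.log(d + 1e-8) - torch.log(1 - d + 1e-8)        # dense reward signal

# Dummy encodings standing in for a dialogue encoder; a real model would share one.
context = torch.randn(4, emb_dim)                # batch of 4 dialogue contexts
generated = torch.randn(4, emb_dim)              # replies sampled from the generator
log_probs = torch.randn(4, requires_grad=True)   # generator log-probs of those replies

# REINFORCE-style generator loss: push up the log-prob of replies with high reward.
with torch.no_grad():
    r = reward(context, generated).squeeze(-1)
gen_loss = -(r * log_probs).mean()
gen_loss.backward()
print("rewards:", r.numpy(), "generator loss:", float(gen_loss))
```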


2015 ◽  
Vol 4 (1) ◽  
pp. 44-53
Author(s):  
Chris Chard ◽  
Kirsty K. Spence

Three years ago, Steve Thornton purchased the South End Mustangs, a professional ice hockey team competing in the D1 division in the United Kingdom. Unfortunately, Thornton has experienced challenging times during his ownership tenure. The team has achieved mediocre results on the ice and poor results off it. Thornton knows he needs help to turn the Mustangs franchise around, so he turns to John Tapner, a sport business owner, operator, entrepreneur, and advisor. Tapner is best known as a professional sport consultant and TV personality, representing his company Sports Rescue, which shares its name with his hit television show. When an owner calls Tapner, it is because a professional sports team is in trouble and needs to be rescued.

