Expected hypothetical completion probability

2020 ◽  
Vol 16 (2) ◽  
pp. 85-94
Author(s):  
Sameer K. Deshpande ◽  
Katherine Evans

Abstract: Using high-resolution player tracking data made available by the National Football League (NFL) for their 2019 Big Data Bowl competition, we introduce the Expected Hypothetical Completion Probability (EHCP), an objective framework for evaluating plays. At the heart of EHCP is the question “on a given passing play, did the quarterback throw the pass to the receiver who was most likely to catch it?” To answer this question, we first built a Bayesian non-parametric catch probability model that automatically accounts for complex interactions between inputs like the receiver’s speed and distances to the ball and nearest defender. While building such a model is, in principle, straightforward, using it to reason about a hypothetical pass is challenging because many of the model inputs corresponding to a hypothetical are necessarily unobserved. To wit, it is impossible to observe how close an un-targeted receiver would be to his nearest defender had the pass been thrown to him instead of the receiver who was actually targeted. To overcome this fundamental difficulty, we propose imputing the unobservable inputs and averaging our model predictions across these imputations to derive EHCP. In this way, EHCP can track how the completion probability evolves for each receiver over the course of a play in a way that accounts for the uncertainty about missing inputs.
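The impute-then-average step described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `catch_probability` is a toy logistic stand-in with invented coefficients (not the authors' Bayesian non-parametric model), and the imputed defender separations are drawn from a made-up truncated normal distribution.

```python
import math
import random


def catch_probability(receiver_speed, dist_to_ball, defender_separation):
    """Toy stand-in for a fitted catch-probability model.

    The coefficients are hypothetical and chosen only so that more
    separation from the defender raises the predicted probability.
    """
    score = 0.5 + 0.4 * defender_separation - 0.1 * dist_to_ball - 0.05 * receiver_speed
    return 1.0 / (1.0 + math.exp(-score))


def ehcp(receiver_speed, dist_to_ball, imputed_separations):
    """Average the model prediction over imputed values of the unobserved input."""
    preds = [
        catch_probability(receiver_speed, dist_to_ball, s)
        for s in imputed_separations
    ]
    return sum(preds) / len(preds)


rng = random.Random(0)
# Hypothetical imputation: plausible separations in yards, truncated at zero.
imputations = [max(0.0, rng.gauss(2.0, 1.0)) for _ in range(1000)]
estimate = ehcp(receiver_speed=6.0, dist_to_ball=10.0, imputed_separations=imputations)
print(f"Averaged completion probability for this receiver: {estimate:.3f}")
```

Repeating the calculation for each eligible receiver at each tracked frame would give the per-receiver trajectories the abstract refers to; how the imputations are actually generated is not specified in the abstract.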

2020 ◽  
Vol 16 (2) ◽  
pp. 163-182
Author(s):  
Ronald Yurko ◽  
Francesca Matano ◽  
Lee F. Richardson ◽  
Nicholas Granered ◽  
Taylor Pospisil ◽  
...  

Abstract: Continuous-time assessments of game outcomes in sports have become increasingly common in the last decade. In American football, only discrete-time estimates of play value were possible, since the most advanced public football datasets were recorded at the play-by-play level. While measures such as expected points and win probability are useful for evaluating football plays and game situations, there has been no research into how these values change throughout the course of a play. In this work, we make two main contributions: First, we introduce a general framework for continuous-time within-play valuation in the National Football League using player-tracking data. Our framework comprises several modular sub-models, making it easy to incorporate recent work involving player tracking data in football. Second, we use a long short-term memory recurrent neural network to construct a ball-carrier model to estimate how many yards the ball-carrier is expected to gain from their current position, conditional on the locations and trajectories of the ball-carrier, their teammates and opponents. Additionally, we demonstrate an extension with conditional density estimation so that the expectation of any measure of play value can be calculated in continuous-time, which was never before possible at such a granular level.


2020 ◽  
Vol 16 (2) ◽  
pp. 73-79
Author(s):  
Michael J. Lopez

Abstract: Most historical National Football League (NFL) analysis, both mainstream and academic, has relied on public, play-level data to generate team and player comparisons. Given the number of oft-omitted variables that impact on-field results, such as play call, game situation, and opponent strength, findings tend to be more anecdotal than actionable. With the release of player tracking data, however, analysts can better ask and answer questions to isolate skill and strategy. In this article, we highlight the limitations of traditional analyses, and use a decades-old punching bag for analysts, fourth-down strategy, as a microcosm for why tracking data is needed. Specifically, we assert that, in the absence of using the precise yardage needed for a first down, past findings supporting an aggressive fourth-down strategy may have been overstated. Next, we synthesize recent work that comprises this special Journal of Quantitative Analysis in Sports issue into player tracking data in football. Finally, we conclude with some best practices and limitations regarding usage of this data. The release of player tracking data marks a transition for the league and its analysts, and we hope this issue helps guide innovation in football analytics for years to come.


2005 ◽  
Vol 201 ◽  
pp. 476-477
Author(s):  
Lindsay King ◽  
Douglas Clowe ◽  
Peter Schneider ◽  
Volker Springel

In our ongoing work, we use high resolution cluster simulations to study gravitational lensing. These simulations have a softening length of 0.7 $h^{-1}$ kpc and a particle mass of $4.68 \times 10^{7}\,M_\odot$ (Springel 1999). Questions that can be addressed include the accuracy with which substructure on various scales can be recovered using the information from lensing. This is very important in determining the power of lensing in studying the evolution of cluster substructure as a function of redshift. We briefly consider how a weak lensing non-parametric reconstruction technique and the $M_{\mathrm{ap}}$-statistic can be applied to the simulations.


Author(s):  
Brett Pollard ◽  
Fabian Held ◽  
Lina Engelen ◽  
Lauren Powell ◽  
Richard de Dear

Sports ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 130 ◽  
Author(s):  
Varuna De Silva ◽  
Mike Caine ◽  
James Skinner ◽  
Safak Dogan ◽  
Ahmet Kondoz ◽  
...  

Background: Global positioning system (GPS) based player movement tracking data are widely used by professional football (soccer) clubs and academies to provide insight into activity demands during training and competitive matches. However, the use of movement tracking data to inform the design of training programmes is still an open research question. Objectives: The objective of this study is to analyse player tracking data to understand activity level differences between training and match sessions, with respect to different playing positions. Methods: This study analyses the per-session summary of historical movement data collected through GPS tracking to profile high-speed running activity as well as distance covered during training sessions as a whole and competitive matches. We utilise 20,913 data points collected from 53 football players aged between 18 and 23 at an elite football academy across four full seasons (2014–2018). Through ANOVA analysis and probability distribution analysis, we compare the activity demands, measured by the number of high-speed runs, the amount of high-speed distance, and distance covered by players in key playing positions, such as Central Midfielders, Full Backs, and Centre Forwards. Results and Implications: While there are significant positional differences in physical activity demands during competitive matches, the physical activity levels during training sessions do not show positional variations. In matches, the Centre Forwards face the highest demand for High Speed Runs (HSRs), compared to Central Midfielders and Full Backs. However, on average the Central Midfielders tend to cover more distance than Centre Forwards and Full Backs. An increase in high-speed work demand in matches and training over the past four seasons, also shown by a gradual change in the extreme values of high-speed running activity, was also found. 
This large-scale, longitudinal study makes an important contribution to the literature, providing novel insights from an elite performance environment about the relationship between player activity levels during training and match play, and how these vary by playing position.
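The positional comparison described above rests on one-way ANOVA. A minimal sketch of the F statistic follows; the per-position values are invented toy numbers, not data from the study, and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy high-speed-run counts per match for three positions (illustrative only).
toy_runs = {
    "Centre Forward": [14, 16, 15, 17],
    "Central Midfielder": [11, 12, 10, 13],
    "Full Back": [12, 11, 13, 12],
}
f_stat = one_way_anova_f(list(toy_runs.values()))
print(f"F statistic across positions: {f_stat:.2f}")
```

A large F value relative to the F distribution with (groups − 1, observations − groups) degrees of freedom indicates that mean activity differs between positions; the significance test itself is omitted here.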


2015 ◽  
Vol 12 (12) ◽  
pp. 12987-13018
Author(s):  
C. I. Meier ◽  
J. S. Moraga ◽  
G. Pranzini ◽  
P. Molnar

Abstract. Traditional frequency analysis of annual precipitation requires the fitting of a probability model to yearly precipitation totals. There are three potential problems with this approach: a long record (at least 25 to 30 years) is required in order to fit the model, years with missing data cannot be used, and the data need to be homogeneous. To overcome these limitations, we test an alternative methodology proposed by Eagleson (1978), based on the derived distribution approach (DDA). This allows for better estimation of the probability density function (pdf) of annual rainfall without requiring long records, provided that high-resolution precipitation data are available to derive external storm properties. The DDA combines marginal pdfs for storm depth and inter-arrival time to arrive at an analytical formulation of the distribution of annual precipitation under the assumption of independence between events. We tested the DDA at two temperate locations in different climates (Concepción, Chile, and Lugano, Switzerland), quantifying the effects of record length. Our results show that, as compared to the fitting of a normal or log-normal distribution, the DDA significantly reduces the uncertainty in annual precipitation estimates (especially interannual variability) when only short records are available. The DDA also reduces the bias in annual precipitation quantiles with high return periods. We also show that using precipitation data aggregated every 24 h, as commonly available at most weather stations, introduces a noticeable bias in the DDA. Our results point to the tangible benefits of installing high-resolution (hourly or less) precipitation gauges at previously ungauged locations. 
We show that the DDA, in combination with high resolution gauging, provides more accurate and less uncertain estimates of long-term precipitation statistics such as interannual variability and quantiles of annual precipitation with high return periods even for records as short as 5 years.
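The idea of deriving the annual distribution from storm-level distributions can be illustrated by simulation. This is a Monte Carlo stand-in, not the analytical formulation the abstract describes, and it rests on assumptions not stated there: exponential inter-arrival times, gamma-distributed storm depths, and invented parameter values.

```python
import random


def simulate_annual_total(rng, mean_interarrival_days, depth_shape, depth_scale_mm, days=365.0):
    """Sum independent storm depths arriving over one year.

    Assumes exponential inter-arrival times and gamma storm depths;
    both distributional choices are illustrative.
    """
    t = rng.expovariate(1.0 / mean_interarrival_days)  # time of first storm
    total = 0.0
    while t < days:
        total += rng.gammavariate(depth_shape, depth_scale_mm)  # shape, scale
        t += rng.expovariate(1.0 / mean_interarrival_days)
    return total


rng = random.Random(42)
# Hypothetical storm statistics: one storm every 5 days, mean depth 2 * 10 = 20 mm.
totals = [simulate_annual_total(rng, 5.0, 2.0, 10.0) for _ in range(2000)]

mean_total = sum(totals) / len(totals)
sorted_totals = sorted(totals)
q99 = sorted_totals[int(0.99 * len(sorted_totals))]  # empirical 99th percentile
print(f"Mean annual total: {mean_total:.0f} mm, 99th percentile: {q99:.0f} mm")
```

Under these assumptions the expected annual total is (365 / 5) × 20 = 1460 mm, which the simulated mean should approximate. The point of the sketch is that storm-level parameters can be estimated from only a few years of high-resolution data, while the annual distribution is then derived rather than fitted directly.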


2016 ◽  
Vol 79 (1) ◽  
pp. 148-152 ◽  
Author(s):  
TIAN DING ◽  
YAN-YAN YU ◽  
CHENG-AN HWANG ◽  
QING-LI DONG ◽  
SHI-GUO CHEN ◽  
...  

ABSTRACT The objectives of this study were to develop a probability model of Staphylococcus aureus enterotoxin A (SEA) production as affected by water activity ($a_w$), pH, and temperature in broth and assess its applicability for milk. The probability of SEA production was assessed in tryptic soy broth using 24 combinations of $a_w$ (0.86 to 0.99), pH (5.0 to 7.0), and storage temperature (10 to 30°C). The observed probabilities were fitted with a logistic regression to develop a probability model. The model had a concordant value of 97.5% and concordant index of 0.98, indicating that the model satisfactorily describes the probability of SEA production. The model showed that $a_w$, pH, and temperature were significant factors affecting the probability of toxin production. The model predictions were in good agreement with the observed values obtained from milk. The model may help manufacturers in selecting product pH and $a_w$ and storage temperatures to prevent SEA production.
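A minimal sketch of how a logistic probability model of this kind maps conditions to a probability, and how a percent-concordant figure can be computed, is given below. The coefficients are invented placeholders (the abstract reports no fitted values), the main-effects-only form is an assumption, and the tie-handling convention in `percent_concordant` may differ from the one used in the study.

```python
import math

# Hypothetical coefficients: intercept, water activity, pH, temperature (deg C).
HYPOTHETICAL_COEF = (-60.0, 50.0, 1.0, 0.2)


def sea_probability(a_w, ph, temp_c, coef=HYPOTHETICAL_COEF):
    """Logistic model: probability of toxin production under given conditions."""
    b0, b_aw, b_ph, b_t = coef
    z = b0 + b_aw * a_w + b_ph * ph + b_t * temp_c
    return 1.0 / (1.0 + math.exp(-z))


def percent_concordant(probs, outcomes):
    """Percent of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    score = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return 100.0 * score / (len(pos) * len(neg))


favourable = sea_probability(a_w=0.99, ph=7.0, temp_c=30.0)
unfavourable = sea_probability(a_w=0.86, ph=5.0, temp_c=10.0)
print(f"Favourable conditions: {favourable:.3f}, unfavourable: {unfavourable:.5f}")
```

With real data, the coefficients would come from fitting the logistic regression to the observed production outcomes, and the concordance would be computed from the fitted probabilities against those outcomes.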


2016 ◽  
Vol 144 (11) ◽  
pp. 4395-4420 ◽  
Author(s):  
Falko Judt ◽  
Shuyi S. Chen

Abstract Rapid intensification (RI) of tropical cyclones (TCs) remains one of the most challenging issues in TC prediction. This study investigates the predictability of RI, the uncertainty in predicting RI timing, and the dynamical processes associated with RI. To address the question of environmental versus internal control of RI, five high-resolution ensembles of Hurricane Earl (2010) were generated with scale-dependent stochastic perturbations from synoptic to convective scales. Although most members undergo RI and intensify into major hurricanes, the timing of RI is highly uncertain. While environmental conditions including SST control the maximum TC intensity and the likelihood of RI during the TC lifetime, both environmental and internal factors contribute to uncertainty in RI timing. Complex interactions among environmental vertical wind shear, the mean vortex, and internal convective processes govern the TC intensification process and lead to diverse pathways to maturity. Although the likelihood of Earl undergoing RI seems to be predictable, the exact timing of RI has a stochastic component and low predictability. Despite RI timing uncertainty, two dominant modes of RI emerged. One group of members undergoes RI early in the storm life cycle; the other one later. In the early RI cases, a rapidly contracting radius of maximum wind accompanies the development of the eyewall during RI. The late RI cases have a well-developed eyewall prior to RI, while an upper-level warm core forms during the RI process. These differences indicate that RI is associated with distinct physical processes during particular stages of the TC life cycle.

