Use of Machine Learning to Automate the Identification of Basketball Strategies Using Whole Team Player Tracking Data

2019 ◽  
Vol 10 (1) ◽  
pp. 24 ◽  
Author(s):  
Changjia Tian ◽  
Varuna De Silva ◽  
Michael Caine ◽  
Steve Swanson

The use of machine learning to identify and classify offensive and defensive strategies in team sports through spatio-temporal tracking data has received significant interest recently in the literature and the global sport industry. This paper focuses on data-driven defensive strategy learning in basketball. Most research to date on basketball strategy learning has focused on offensive effectiveness and is based on the interaction between the on-ball player and the principal on-ball defender, thereby ignoring the contribution of the remaining players. Furthermore, most sports analytics systems that provide play-by-play data are heavily biased towards offensive metrics such as passes, dribbles, and shots. The aim of the current study was to use machine learning to classify the different defensive strategies basketball players adopt when deviating from their initial defensive action. An analytical model was developed to recognise the one-on-one (matched) relationships of the players, which is utilised to automatically identify any change of defensive strategy. A classification model was developed, based on a player and ball tracking dataset from National Basketball Association (NBA) game play, to classify the defensive strategy adopted against pick-and-roll play. The methodology described is the first to analyse the defensive strategy of all in-game players (both on-ball and off-ball). The cross-validation results indicate that the proposed technique for automatic defensive strategy identification can achieve up to 69% classification accuracy. Machine learning techniques such as the one adopted here have the potential to enable a deeper understanding of player decision making and defensive game strategies in basketball and other sports by leveraging player and ball tracking data.
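The abstract does not spell out how the one-on-one (matched) relationships are recognised. A minimal sketch, assuming matchups are approximated by pairing each defender with the nearest unmatched attacker from the tracking coordinates (the function name, player ids and positions below are all hypothetical):

```python
from math import dist

def match_defenders(attackers, defenders):
    """Greedily assign each defender to the nearest unmatched attacker.

    attackers, defenders: dicts mapping player id -> (x, y) court position.
    Returns a dict: defender id -> matched attacker id.
    """
    # All defender-attacker distances, cheapest pairs first.
    pairs = sorted(
        (dist(d_xy, a_xy), d_id, a_id)
        for d_id, d_xy in defenders.items()
        for a_id, a_xy in attackers.items()
    )
    matched_d, matched_a, assignment = set(), set(), {}
    for _, d_id, a_id in pairs:
        if d_id not in matched_d and a_id not in matched_a:
            assignment[d_id] = a_id
            matched_d.add(d_id)
            matched_a.add(a_id)
    return assignment

attackers = {"A1": (10.0, 5.0), "A2": (20.0, 8.0)}
defenders = {"D1": (11.0, 5.5), "D2": (19.0, 7.0)}
print(match_defenders(attackers, defenders))  # {'D1': 'A1', 'D2': 'A2'}
```

A switch in defensive strategy could then be flagged whenever this assignment changes between frames. A globally optimal pairing would use the Hungarian algorithm rather than this greedy pass.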

Impact ◽  
2019 ◽  
Vol 2019 (10) ◽  
pp. 84-86
Author(s):  
Keisuke Fujii

The coordination and movement of people in large crowds, during sports games or when socialising, seems readily explicable. Sometimes this occurs according to specific rules or instructions, such as in a sport or game; at other times the motivations for movement may be more focused on an individual's needs or fears. Over the last decade, the computational ability to identify and track a given individual in video footage has increased. Conventional methods of gathering and interpreting data in biology rely on fitting statistical results to particular models or hypotheses. However, data from tracking movements in social groups or team sports are so complex that these methods cannot easily analyse the vast amounts of information and highly varied patterns they contain. The author is an expert in human behaviour and machine learning based at the Graduate School of Informatics at Nagoya University. His challenge is to bridge the gap between rule-based theoretical modelling and data-driven modelling, and he is employing machine learning techniques to attempt to solve this problem as a visiting scientist at the RIKEN Center for Advanced Intelligence Project.


Author(s):  
KM Jyoti Rani

Diabetes is a chronic disease with the potential to cause a worldwide health care crisis. According to the International Diabetes Federation, 382 million people are living with diabetes across the world; by 2035, this number is expected to nearly double to 592 million. Diabetes is a disease caused by an elevated level of blood glucose. This high blood glucose produces symptoms of frequent urination, increased thirst, and increased hunger. Diabetes is one of the leading causes of blindness, kidney failure, amputation, heart failure and stroke. When we eat, our body turns food into sugars, or glucose. At that point, the pancreas is supposed to release insulin. Insulin serves as a key that opens our cells, allowing the glucose to enter and be used for energy. With diabetes, this system does not work. Type 1 and type 2 diabetes are the most common forms of the disease, but there are also other kinds, such as gestational diabetes, which occurs during pregnancy. Machine learning is an emerging scientific field in data science dealing with the ways in which machines learn from experience. The aim of this project is to develop a system that can perform early prediction of diabetes for a patient with higher accuracy by combining the results of different machine learning techniques. The algorithms used are K-nearest neighbours, logistic regression, random forest, support vector machine and decision tree. The accuracy of the model under each algorithm is calculated, and the one with the best accuracy is taken as the model for predicting diabetes.
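One of the algorithms listed, K-nearest neighbours, is simple enough to sketch without a library. A minimal illustration with made-up (non-clinical) feature values, where each patient is represented by two features and classified by majority vote among the k closest training points:

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (features, label) pairs; features are numeric tuples.
    """
    neighbours = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Illustrative, invented data: (glucose mg/dL, BMI) -> diabetic (1) or not (0)
train = [
    ((85, 22.0), 0), ((90, 24.5), 0), ((100, 26.0), 0),
    ((150, 31.0), 1), ((165, 33.5), 1), ((180, 35.0), 1),
]
print(knn_predict(train, (170, 34.0)))  # 1
```

In practice features should be scaled before distance computation, and the project described would compare this against logistic regression, random forest, SVM and decision-tree models on held-out data.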


2021 ◽  
pp. 1-9
Author(s):  
Dimitrios P. Panagoulias ◽  
Dionisios N. Sotiropoulos ◽  
George A. Tsihrintzis

The doctrine of the “one size fits all” approach in the field of disease diagnosis and patient management is being replaced by a more patient-specific approach known as “personalized medicine”. In this spirit, biomarkers are key variables in the research and development of new methods for prognostic and classification model training based on advances in the field of artificial intelligence [1, 2, 3]. Metabolomics refers to the systematic study of the unique chemical fingerprints that cellular processes leave behind. The metabolic profile of a person can provide a snapshot of cell physiology and, by extension, metabolomics provides a direct “functional reading of the physiological state” of an organism. By employing machine learning methodologies, a general evaluation chart of nutritional biomarkers is formulated and an optimised prediction method for body mass index is investigated, with the aim of discovering dietary patterns.


2020 ◽  
Vol 21 (15) ◽  
pp. 5280
Author(s):  
Irini Furxhi ◽  
Finbarr Murphy

The practice of non-testing approaches in nanoparticle hazard assessment is necessary to identify and classify potential risks in a cost-effective and timely manner. Machine learning techniques have been applied in the field of nanotoxicology with encouraging results. A neurotoxicity classification model for diverse nanoparticles is presented in this study. A dataset compiled from multiple literature sources, consisting of nanoparticle physicochemical properties, exposure conditions and in vitro characteristics, is used to predict cell viability. Pre-processing techniques were applied, including normalization methods and two supervised instance methods: a synthetic minority over-sampling technique (SMOTE) to address biased predictions, and the production of subsamples via bootstrapping. The classification model was developed using a random forest, and goodness-of-fit together with additional robustness and predictability metrics was used to evaluate its performance. Information gain analysis identified the exposure dose and duration, toxicological assay, cell type, and zeta potential as the five most important attributes for predicting neurotoxicity in vitro. This is the first tissue-specific machine learning tool for predicting neurotoxicity caused by nanoparticles in in vitro systems, and the model performs better than non-tissue-specific models.
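The core idea of the synthetic minority over-sampling technique mentioned above is to create new minority-class points by interpolating between an existing sample and one of its nearest minority-class neighbours. A dependency-free sketch (the data points and parameter values are illustrative, not from the study):

```python
import random

def smote_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples: pick a sample, pick one of
    its k nearest minority-class neighbours, and interpolate between them."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, x)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # random point along the segment x -> nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Toy minority-class points (e.g. two scaled physicochemical features)
minority = [(0.1, 0.2), (0.15, 0.25), (0.2, 0.1)]
print(len(smote_oversample(minority, n_new=4)))  # 4
```

Production pipelines would typically use the `SMOTE` implementation from the imbalanced-learn package rather than hand-rolling this step.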


2020 ◽  
Vol 24 (5) ◽  
pp. 1141-1160
Author(s):  
Tomás Alegre Sepúlveda ◽  
Brian Keith Norambuena

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data were collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we used a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG, we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS, we obtained an estimate of 27.77%, compared with a range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained by the fact that some candidates were omitted, even though they received a significant share of the votes.
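The abstract does not reproduce the exact voting-intention formula. One plausible construction, shown purely as an assumed sketch (the counts below are invented, and the paper's actual formula may differ), scores each candidate by their positive-tweet ratio and normalises the scores into shares:

```python
def voting_intention(counts):
    """Estimate vote shares from per-candidate (positive, negative) tweet
    counts. Assumed formula: score each candidate by p / (p + n), then
    normalise so the shares sum to 100%."""
    scores = {c: p / (p + n) for c, (p, n) in counts.items()}
    total = sum(scores.values())
    return {c: round(100 * s / total, 2) for c, s in scores.items()}

# Hypothetical tweet counts for the three candidates
counts = {"SP": (4200, 3800), "AG": (3900, 3400), "BS": (2500, 2900)}
print(voting_intention(counts))
```

Normalising only over the analysed candidates is exactly why omitting minor candidates skews the estimates, as the abstract notes.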


2020 ◽  
Vol 10 (18) ◽  
pp. 6527 ◽  
Author(s):  
Omar Sharif ◽  
Mohammed Moshiul Hoque ◽  
A. S. M. Kayes ◽  
Raza Nowrozy ◽  
Iqbal H. Sarker

Due to the substantial growth in internet users and easy access via electronic devices, the amount of electronic content has grown enormously in recent years through instant messaging, social networking posts, blogs, online portals and other digital platforms. Unfortunately, the misapplication of technologies has increased with this rapid growth of online content, leading to a rise in suspicious activities. People misuse web media to disseminate malicious content, carry out illegal activities, abuse other people, and publicize suspicious content on the web. Suspicious content is usually available in the form of text, audio, or video, with text being the medium used in most cases. Thus, one of the most challenging issues for NLP researchers is to develop a system that can identify suspicious text efficiently. In this paper, a machine learning (ML)-based classification model (hereafter called STD) is proposed to classify Bengali text into suspicious and non-suspicious categories based on its content. A set of ML classifiers with various features was applied to our developed corpus of 7000 Bengali text documents, of which 5600 were used for training and 1400 for testing. The performance of the proposed system is compared with a human baseline and existing ML techniques. The SGD classifier with ‘tf-idf’ and a combination of unigram and bigram features achieved the highest accuracy of 84.57%.
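The tf-idf representation over combined unigram and bigram features is the key feature-engineering step here. A dependency-free sketch of that step on a toy English corpus (the paper itself works on Bengali text, and a real pipeline would typically use scikit-learn's `TfidfVectorizer` with `ngram_range=(1, 2)` feeding an `SGDClassifier`):

```python
import math
from collections import Counter

def ngrams(tokens):
    """Unigram plus bigram features for a token list."""
    return list(tokens) + [" ".join(pair) for pair in zip(tokens, tokens[1:])]

def tfidf(docs):
    """Plain tf-idf over unigram+bigram features (no smoothing or norming)."""
    featurised = [Counter(ngrams(d.split())) for d in docs]
    n = len(docs)
    df = Counter(f for doc in featurised for f in doc)  # document frequency
    idf = {f: math.log(n / df[f]) for f in df}
    return [{f: tf * idf[f] for f, tf in doc.items()} for doc in featurised]

docs = ["send the money now", "meet me now", "send help"]
vectors = tfidf(docs)
print("send the" in vectors[0])  # True: bigram feature present
```

Each document becomes a sparse weight map over unigrams and bigrams; a linear classifier trained with stochastic gradient descent then separates suspicious from non-suspicious vectors.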


Geosciences ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 265
Author(s):  
Stefan Rauter ◽  
Franz Tschuchnigg

The classification of soils into categories with a similar range of properties is a fundamental geotechnical engineering procedure. At present, this classification is based on various types of cost- and time-intensive laboratory and/or in situ tests. These soil investigations are essential for each individual construction site and have to be performed prior to the design of a project. Since Machine Learning could play a key role in reducing the costs and time needed for a suitable site investigation program, the basic ability of Machine Learning models to classify soils from Cone Penetration Tests (CPT) is evaluated. To find an appropriate classification model, 24 different Machine Learning models, based on three different algorithms, are built and trained on a dataset consisting of 1339 CPTs. The applied algorithms are a Support Vector Machine, an Artificial Neural Network and a Random Forest. As input features, different combinations of direct cone penetration test data (tip resistance qc, sleeve friction fs, friction ratio Rf, depth d), combined with “defined” data that are not directly measured (total vertical stress σv, effective vertical stress σ’v and hydrostatic pore pressure u0), are used. Standard soil classes based on grain size distributions and soil classes based on soil behavior types according to Robertson are applied as targets. The different models are compared with respect to their prediction performance and the required learning time. The best results for all targets were obtained with models using a Random Forest classifier. For the soil classes based on grain size distribution, an accuracy of about 75% was reached, and for soil classes according to Robertson, an accuracy of about 97–99%.
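The “defined” input features can be computed from the raw CPT measurements with standard geotechnical relations. A minimal sketch, where the bulk unit weight and water-table depth are illustrative assumptions (the paper does not state the values it used):

```python
GAMMA_W = 9.81  # unit weight of water [kN/m^3]

def derived_cpt_features(qc, fs, depth, gamma_soil=19.0, water_table=2.0):
    """Derive the not-directly-measured CPT inputs from raw readings.

    qc: tip resistance [kPa], fs: sleeve friction [kPa], depth [m].
    gamma_soil [kN/m^3] and water_table [m] are assumed, illustrative values.
    """
    rf = 100.0 * fs / qc                             # friction ratio Rf [%]
    sigma_v = gamma_soil * depth                     # total vertical stress [kPa]
    u0 = max(0.0, GAMMA_W * (depth - water_table))   # hydrostatic pore pressure [kPa]
    sigma_v_eff = sigma_v - u0                       # effective vertical stress [kPa]
    return {"Rf": rf, "sigma_v": sigma_v, "u0": u0, "sigma_v_eff": sigma_v_eff}

print(derived_cpt_features(qc=2000.0, fs=40.0, depth=5.0))
```

Feature dictionaries like this, assembled per depth increment, would then form the input rows for the Random Forest, SVM and neural network classifiers compared in the study.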

