Data Mining and Polar Coordinates in the Analysis by Gender of Finishing Behaviors in Professional Basketball Pick and Roll

The open nature of basketball introduces considerable uncertainty, which makes tactical analysis of game situations difficult. Screens are among the most frequently used offensive tactical elements in basketball and an example of a tactical situation that demands the highest level of preparation to perform well in competition. The aim of this study is to differentiate player behaviors in these situations by gender using data mining and polar coordinates analysis. To that end, an ad hoc observational tool consisting of 17 criteria and 97 exhaustive and mutually exclusive (E/ME) categories was designed and validated through data quality analysis (correlation coefficients and concordance index of 0.98) and generalizability analysis (G coefficients of 0.94). The observational design is nomothetic, punctual, and multidimensional. Using the Hoisan software tool, a total of 176 ball screen situations were analyzed for the men's category and 118 for the women's category, corresponding to three teams of each gender playing at the highest competition level in Spain during the 2018/2019 season. Relationships among behaviors were analyzed using polar coordinates analysis together with two data mining techniques: clustering and a decision tree classifier. The results show significant relationships that allow a tactical interpretation of pick and roll situations in men's and women's professional basketball in Spain and support the development of intervention programs to optimize training and improve player performance.


Student Academic Performance Prediction using Supervised Learning Techniques

International Journal of Emerging Technologies in Learning (iJET) ◽

10.3991/ijet.v14i14.10310 ◽

2019 ◽

Vol 14 (14) ◽

pp. 92 ◽

Cited By ~ 1

Author(s):

Muhammad Imran ◽

Shahzad Latif ◽

Danish Mehmood ◽

Muhammad Saqlain Shah

Keyword(s):

Data Mining ◽

Supervised Learning ◽

Student Performance ◽

Performance Prediction ◽

Class Imbalance ◽

Ensemble Methods ◽

Fine Tuning ◽

Classification Error ◽

Decision Tree Classifier ◽

Tree Classifier

Automatic prediction of student performance is a crucial task because of the large volume of data in educational databases. This task is addressed by educational data mining (EDM), which develops methods for exploring data derived from educational environments. These methods are used to understand students and their learning environments. Educational institutions often want to know how many students will pass or fail so that they can make the necessary arrangements. Previous studies show that many researchers concentrate on selecting an appropriate classification algorithm and ignore the problems that arise during the data mining phases, such as high data dimensionality, class imbalance, and classification error. Such problems reduce the accuracy of the model. Several well-known classification algorithms have been applied in this domain; this paper proposes a student performance prediction model based on a supervised learning decision tree classifier. In addition, an ensemble method is applied to improve the performance of the classifier. Ensemble methods are designed to solve classification and prediction problems. The study demonstrates the importance of data preprocessing and algorithm fine-tuning in resolving data quality issues. The experimental dataset comes from the Alentejo region of Portugal and was obtained from the UCI Machine Learning Repository. Three supervised learning algorithms (J48, NNge, and MLP) are employed for experimental purposes. The results show that J48 achieved the highest accuracy of the three, 95.78%.


Data Mining: A Bagged Decision Tree Classifier Algorithm For IDS Intrusion Detection System Based Attacks Classification

Design Engineering ◽

10.17762/de.v2021i04.1800 ◽

2021 ◽

pp. 1826-1839

Author(s):

Sandeep Adhikari, Dr. Sunita Chaudhary

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Decision Tree ◽

Intrusion Detection System ◽

Detection System ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Decision Tree Classifier ◽

Tree Classifier

The exponential growth in the use of networked computers, together with the proliferation of applications that run on different platforms, has drawn attention to network security. Attacks in this setting take advantage of security flaws, present in all operating systems, that are both technically difficult and costly to fix. As a result, intrusions threaten the credibility, availability, and confidentiality of computer resources. The Intrusion Detection System (IDS) is critical in detecting network anomalies and attacks. In this paper, data mining principles are combined with an IDS to identify important hidden data of interest to the user quickly and efficiently. The proposed algorithm addresses four issues: data classification, high levels of human interaction, lack of labeled data, and the effectiveness of distributed denial of service attacks. We also develop a decision tree classifier that has a variety of parameters. The previous algorithm classified IDS data correctly up to 90% of the time and was not appropriate for large data sets. Our proposed algorithm is designed to classify large data sets accurately. In addition, we quantify several further decision tree classifier parameters.
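The abstract does not describe the authors' algorithm in detail, so the following is only a minimal sketch of the general bagging idea named in the title: train several trees on bootstrap resamples and combine them by majority vote. One-level trees (stumps) stand in for full decision trees, and the feature names, toy traffic records, and function names are all hypothetical.

```python
import random
from collections import Counter

def train_stump(rows):
    """Fit a one-level decision tree on (features, label) rows by minimising training errors."""
    best = None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split (all rows identical): always predict the majority label.
        label = Counter(y for _, y in rows).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]

def predict_stump(stump, x):
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label

def bagged_fit(rows, n_models=15, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement, same size as rows)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]

def bagged_predict(models, x):
    """Combine the individual stump predictions by majority vote."""
    return Counter(predict_stump(m, x) for m in models).most_common(1)[0][0]

# Hypothetical records: (packets_per_second, failed_logins) -> label
traffic = [
    ((10, 0), "normal"), ((20, 1), "normal"), ((35, 0), "normal"), ((50, 2), "normal"),
    ((200, 6), "attack"), ((300, 7), "attack"), ((450, 8), "attack"), ((400, 5), "attack"),
]
models = bagged_fit(traffic)
print(bagged_predict(models, (5, 0)), bagged_predict(models, (500, 9)))
```

Averaging over resamples mainly reduces the variance of unstable learners such as trees; whether it reaches the accuracy reported for the proposed algorithm is not something this sketch can show.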


Student Performance Predictions Using Knowledge Discovery Database and Data Mining, DPU Students Records as Sample

Academic Journal of Nawroz University ◽

10.25007/ajnu.v10n3a875 ◽

2021 ◽

Vol 10 (3) ◽

pp. 121-127

Author(s):

Bareen Haval ◽

Karwan Jameel Abdulrahman ◽

Araz Rajab

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Educational Data Mining ◽

Data Sets ◽

Decision Tree Classifier ◽

Data Mining Techniques ◽

Academic History ◽

Tree Classifier ◽

Using Data

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest, and Deep Learning) were developed to analyze data sets and predict student performance. The predictive performance of the three classifiers was calculated and compared. The academic history and data of the students, obtained from the Office of the Registrar, were used to train the models. The analysis aims to evaluate student results using various variables, such as the student's grade. Data from 221 students with 9 different attributes were used. The results provide a better understanding of how student success is assessed and underline the importance of data mining in education. The main purpose of the study is to show that student success can be forecast with data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.


DEVELOPING A PARALLEL CLASSIFIER FOR MINING IN BIG DATA SETS

IIUM Engineering Journal ◽

10.31436/iiumej.v22i2.1541 ◽

2021 ◽

Vol 22 (2) ◽

pp. 119-134

Author(s):

Ahad Shamseen ◽

Morteza Mohammadi Zanjireh ◽

Mahdi Bahaghighat ◽

Qin Xin

Keyword(s):

Data Mining ◽

Big Data ◽

Decision Tree ◽

Main Memory ◽

Experimental Results ◽

Primary Data ◽

Data Sets ◽

Decision Tree Classifier ◽

Vast Amount ◽

Tree Classifier

Data mining is the extraction of information and its roles from a vast amount of data, and it is one of the most important topics today. Massive amounts of data are now generated and stored each day. This data contains useful information in different fields, which attracts the attention of programmers and engineers. One of the primary data mining classification algorithms is the decision tree. Decision tree techniques have several advantages but also present drawbacks. One of the main drawbacks is that the data must reside in main memory. SPRINT is one of the decision tree classifiers that has proposed a fix for this problem. In this paper, we develop a new parallel decision tree classifier that builds on the results of SPRINT. Our experimental results show considerable improvements in runtime and memory requirements compared to the SPRINT classifier. The proposed classifier algorithm can be implemented in serial and parallel environments and can deal with big data.


Mixed Methods in Tactical Analysis Through Polar Coordinates and Function Estimation: The Transition Play in ACB Basketball

Frontiers in Sports and Active Living ◽

10.3389/fspor.2021.739308 ◽

2021 ◽

Vol 3 ◽

Author(s):

José Luis Pastrana-Brincones ◽

Belén Troyano-Gallegos ◽

Juan Pablo Morillo-Baro ◽

Raimundo López de Vinuesa-Piote ◽

Juan Antonio Vázquez-Diz ◽

...

Keyword(s):

Ad Hoc ◽

High Reliability ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Polar Coordinates ◽

Function Estimation ◽

Mating Behaviors ◽

The One ◽

High Level ◽

And Function

Creating advantageous offensive situations in high-level basketball is becoming increasingly difficult, so exploiting every situation from the moment the team gains possession of the ball is essential to being competitive. The goal of this study is therefore to evaluate, using a mixed methods strategy, the behaviors that occur when technical–tactical means are applied in the transition play of professional basketball in Spain. An ad hoc observation tool consisting of 11 criteria and 83 exhaustive and mutually exclusive (E/ME) categories was designed and validated by means of data quality and generalizability analyses. The indexes obtained show high reliability and validity, allowing the proposed actions to be recorded (correlation coefficients are above 0.95 and generalizability coefficients are above 0.90 in all cases). A total of 128 situations, corresponding to eight games of Unicaja de Málaga in the Endesa League in the 18/19 season, were observed with the Hoisan software. The relationships among behaviors were analyzed using the polar coordinates technique, with the one-on-one initiation outside the zone as the focal behavior. The functions representing the vectors were also estimated, in order to model the best fit that describes, starting from a focal category, the relationship between this focal behavior and the rest of the mating behaviors for possible future observations. The results show significant relationships between the selected focal behavior and the mating behaviors, indicating behaviors that allow a tactical interpretation of the game and the definition of intervention programs to improve the performance of the team.
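The abstract does not spell out how the vectors are obtained. As a hedged illustration of the polar coordinates technique as it is commonly described in observational methodology, the sketch below aggregates the adjusted residuals (z scores) at prospective and retrospective lags with Zsum = Σz / √n, and then derives the vector length and angle for one focal/mating behavior pair. The function names and the sample z values are assumptions, not data from the study.

```python
import math

def zsum(z_values):
    """Aggregate z scores over n lags: Zsum = sum(z) / sqrt(n)."""
    return sum(z_values) / math.sqrt(len(z_values))

def polar_vector(prospective_z, retrospective_z):
    """Return (length, angle_degrees, quadrant, significant) for one behavior pair."""
    x = zsum(prospective_z)    # prospective Zsum on the horizontal axis
    y = zsum(retrospective_z)  # retrospective Zsum on the vertical axis
    length = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x)) % 360.0
    if angle >= 360.0:         # guard against floating-point wrap-around
        angle = 0.0
    # Quadrant from the signs: I (+,+) mutual activation, III (-,-) mutual inhibition,
    # II (-,+) and IV (+,-) asymmetric relationships.
    if x >= 0:
        quadrant = 1 if y >= 0 else 4
    else:
        quadrant = 2 if y >= 0 else 3
    # A length above 1.96 is the usual threshold for significance at p < 0.05.
    return length, angle, quadrant, length > 1.96

# Hypothetical z scores at four prospective and four retrospective lags.
length, angle, quadrant, significant = polar_vector([2.0, 2.0, 2.0, 2.0], [1.5, 1.5, 1.5, 1.5])
print(round(length, 2), round(angle, 2), quadrant, significant)
```

Using atan2 gives the same angle as the arcsine-plus-quadrant correction often found in the literature, without handling each quadrant separately.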


Prediction of warning level in aircraft accidents using data mining techniques

The Aeronautical Journal ◽

10.1017/s0001924000009623 ◽

2014 ◽

Vol 118 (1206) ◽

pp. 935-952 ◽

Cited By ~ 6

Author(s):

A. B. Arockia Christopher ◽

S. Appavu alias Balamurugan

Keyword(s):

Data Mining ◽

Principal Components ◽

Information Gain ◽

Decision Tree Classifier ◽

Aircraft Accidents ◽

Analysis Process ◽

Tree Classifier ◽

Using Data ◽

Amount Of Knowledge ◽

Better Than

Data mining is a data analysis process designed for large amounts of data. This paper proposes a methodology for evaluating risk and safety and describes the main issues of aircraft accidents. Aviation companies hold a huge amount of knowledge and collected data. The paper focuses on different feature selection techniques applied to the datasets of airline databases in order to understand and clean the data. The CFS subset evaluator, consistency subset evaluator, gain ratio feature evaluator, information gain attribute evaluator, OneR attribute evaluator, principal components attribute transformer, ReliefF attribute evaluator, and symmetrical uncertainty attribute evaluator are used to reduce the number of initial attributes. Classification algorithms such as DT, KNN, SVM, NN, and NB are used to predict the warning level of the component as the class attribute. We explore the use of different classification techniques on aviation components data, using the Weka software tools for this purpose. The study also shows that principal components attribute transformation combined with a decision tree classifier performs better than the other attribute evaluators and techniques on airline data, with a substantial improvement in accuracy. This work may help an aviation company make better predictions. Some safety recommendations are also addressed to airline companies.
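Of the attribute evaluators listed above, information gain is the simplest to work through by hand. The following sketch computes it for a nominal attribute as the class entropy minus the weighted entropy after splitting on the attribute. It is a generic illustration, not the Weka implementation, and the attribute values and warning levels are invented.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the class minus the weighted entropy within each attribute value."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical component records: two candidate attributes and a warning level.
warning = ["high", "high", "low", "low"]
engine_type = ["A", "A", "B", "B"]      # separates the classes perfectly
inspection_day = ["mon", "tue", "mon", "tue"]  # carries no class information

print(information_gain(engine_type, warning), information_gain(inspection_day, warning))
```

Ranking attributes by this score and keeping the top few is one way to reduce the number of initial attributes before training a classifier.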


Using T3, an Improved Decision Tree Classifier, for Mining Stroke-related Medical Data

Methods of Information in Medicine ◽

10.1160/me0317 ◽

2007 ◽

Vol 46 (05) ◽

pp. 523-529 ◽

Cited By ~ 8

Author(s):

M. Saraee ◽

B. Theodoulidis ◽

J. A. Keane ◽

C. Tjortjis

Keyword(s):

Data Mining ◽

Decision Tree ◽

Predictive Models ◽

Medical Data ◽

Classification Algorithm ◽

Medical Decision ◽

Classification Error ◽

Decision Tree Classifier ◽

Data Set ◽

Tree Classifier

Objectives: Medical data are a valuable resource from which novel and potentially useful knowledge can be discovered by using data mining. Data mining can assist and support medical decision making and enhance clinical management and investigative research. The objective of this work is to propose a method for building accurate descriptive and predictive models based on classification of past medical data. We also aim to compare this method with other well-established data mining methods and identify strengths and weaknesses. Method: We propose T3, a decision tree classifier which builds predictive models based on known classes by allowing for a certain amount of misclassification error in training in order to achieve better descriptive and predictive accuracy. We then experiment with a real medical data set on stroke, and various subsets, in order to identify strengths and weaknesses. We also compare performance with a very successful and well-established decision tree classifier. Results: T3 demonstrated impressive performance when predicting unseen cases of stroke, resulting in as little as 0.4% classification error, while the state-of-the-art decision tree classifier resulted in 33.6% classification error. Conclusions: This paper presents and evaluates T3, a classification algorithm that builds decision trees of depth at most three and achieves high accuracy whilst keeping the tree size reasonably small. T3 demonstrates strong descriptive and predictive power without compromising simplicity and clarity. We evaluate T3 based on real stroke register data and compare it with C4.5, a well-known classification algorithm, showing that T3 produces significantly more accurate and readable classifiers.


Analysing Road Accident Criticality using Data mining

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1953138 ◽

2019 ◽

pp. 408-415

Author(s):

Shahsitha Siddique V ◽

Nithin Ramakrishnan

Keyword(s):

Data Mining ◽

Road Transport ◽

Road Accident ◽

Machine Learning Algorithms ◽

Road Accidents ◽

Decision Tree Classifier ◽

Efficient Manner ◽

Tree Classifier ◽

Accident Severity ◽

Accident Data

Road transport is one of the most vital forms of transportation, connecting both long and short distances in our country. Several attributes affect the intensity of a road accident, such as the speed of the vehicle, road conditions, and the time of the accident. Analysing these attributes gives an idea of the factors that lead to the severity of the accident. Data mining is a method for analysing huge amounts of traffic data efficiently in order to identify the factors that affect road accidents. Several machine learning algorithms can be used to find the relation between traffic attributes and the severity of accidents. In this work, we use three methods for predicting accident criticality. First, a Naive Bayesian classifier is used to estimate accident severity based on Bayes' rule. Then, a Decision Tree classifier is used for the same purpose. Finally, a K-Nearest Neighbour (KNN) classifier is employed for severity calculation. The accuracies of the algorithms are compared, and KNN is found to perform better than the other two. The major aim of the work is to determine accident severity. The work also aims to reduce road accidents by raising public awareness using the above method.
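As a rough illustration of the KNN step described above, the sketch below classifies a query accident by majority vote among its k nearest training records under Euclidean distance. The two features (vehicle speed and hour of day), the severity labels, and the records themselves are hypothetical, and the features are left unscaled for brevity, which a real study would likely not do.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to the query."""
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Hypothetical accident records: (speed_kmh, hour_of_day) -> severity
accidents = [
    ((30, 14), "minor"), ((35, 10), "minor"), ((40, 16), "minor"),
    ((110, 23), "severe"), ((120, 2), "severe"), ((100, 1), "severe"),
]

print(knn_predict(accidents, (38, 12)), knn_predict(accidents, (115, 3)))
```

The choice of k and of the distance measure both affect the result, so a comparison against Naive Bayes or a decision tree is only meaningful when those settings are reported.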


Data mining approach for in-hospital treatment outcome in patients with acute coronary syndrome

Medicinski pregled ◽

10.2298/mpns1506157s ◽

2015 ◽

Vol 68 (5-6) ◽

pp. 157-161 ◽

Cited By ~ 5

Author(s):

Miroslava Sladojevic ◽

Milenko Cankovic ◽

Snezana Cemerlic ◽

Bojan Mihajlovic ◽

Filip Adjic ◽

...

Keyword(s):

Data Mining ◽

Acute Coronary Syndrome ◽

Coronary Intervention ◽

Left Ventricular ◽

Left Ventricular Ejection ◽

Decision Tree Classifier ◽

Ventricular Ejection Fraction ◽

Coronary Syndrome ◽

Data Mining Approach ◽

Tree Classifier

Introduction. Risk stratification is nowadays crucial when estimating the patient's prognosis in terms of treatment outcome, and it also helps in clinical decision making. Several risk assessment models have been developed to predict short-term outcomes in patients with acute coronary syndrome. This study was aimed at developing an outcome prediction model for patients with acute coronary syndrome undergoing percutaneous coronary intervention, using a data mining approach. Material and Methods. A total of 2030 patients hospitalized for acute coronary syndrome and treated with percutaneous coronary intervention from December 2008 to December 2011 were assigned to a derivation cohort. Demographic and anamnestic data, clinical characteristics on admission, biochemical analysis of blood parameters on admission, and left ventricular ejection fraction formed the basis of the study. A number of machine learning algorithms available within the Waikato Environment for Knowledge Analysis (Weka) were evaluated and the most successful was chosen. The predictive model was subsequently validated in a different population of 931 patients (validation cohort), hospitalized during 2012. Results. The best prediction results were achieved using the Alternating Decision Tree classifier, which was able to predict in-hospital mortality with 89% accuracy and preserved good performance on the validation cohort, with 87% accuracy. The Alternating Decision Tree classifier identified a subset of 6 attributes most relevant to mortality prediction: systolic and diastolic blood pressure, heart rate, left ventricular ejection fraction, age, and troponin value. Conclusion. The data mining approach enabled the authors to develop a model capable of predicting the in-hospital outcome following percutaneous coronary intervention. The model showed excellent sensitivity and specificity during internal validation.
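The abstract reports accuracy, sensitivity, and specificity without showing how they relate. The sketch below derives all three from a confusion matrix for a binary outcome, where 1 stands for in-hospital death. The outcome vectors are invented for illustration and are not the study's data, and the function names are assumptions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, true negatives, and false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def diagnostic_metrics(actual, predicted, positive=1):
    """Sensitivity, specificity, and accuracy; assumes both classes occur in `actual`."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives that were detected
        "specificity": tn / (tn + fp),  # share of actual negatives that were cleared
        "accuracy": (tp + tn) / len(actual),
    }

# Hypothetical outcomes for eight patients (1 = died in hospital, 0 = survived).
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(actual, predicted)
print(metrics)
```

When deaths are rare, accuracy alone can look high even for a model that misses most of them, which is why sensitivity and specificity are usually reported alongside it.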
