Online Data Migration Model and ID3 Algorithm in Sports Competition Action Data Mining Application

The ID3 algorithm is a key method in data mining: its rules are simple, easy to understand, and of high application value. Applied to the online data migration of sports competition actions, a decision tree algorithm can extract competition rules from the relationships within massive data and so guide competition. This paper analyzes the performance of the traditional ID3 algorithm in online data migration of sports competition actions; implements its application steps and data processing pipeline, including original data collection, original data preprocessing, data preparation, decision tree construction, and data mining; evaluates the traditional ID3 algorithm comprehensively; and identifies its problems, mainly missing attributes and overfitting, which give direction for subsequent optimization. The paper then proposes a k-nearest neighbor-based ID3 optimization algorithm, which addresses the missing-attribute problem of the traditional ID3 algorithm by filling in missing values with values taken from the k nearest neighbors. The improved algorithm is applied to the online data migration of sports competition actions, and its effect is evaluated. The results show that the performance of the k-nearest neighbor-based ID3 optimization algorithm is significantly improved and that it can also solve the overfitting problem of the traditional ID3 algorithm. For the overall classification problem of six types of travel-pattern samples, the experimental data samples are of high quality, considerable in number, and clearly differentiated. Therefore, this paper also uses the deep factorization machine algorithm based on deep learning to classify the six classes of travel patterns of sports competition action data using the previously extracted relevant features. The research in this paper provides a more accurate method and a higher-performance online data migration model for sports competition action data mining.
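As an illustration of the idea of filling missing attribute values from the k nearest neighbors before building an ID3 tree, the following is a minimal sketch. It is not the paper's implementation: the Hamming distance on categorical attributes, the majority vote among neighbors, and the names `hamming`, `knn_impute`, and `records` are all assumptions made for this example.

```python
from collections import Counter

def hamming(a, b, skip):
    """Count mismatching attributes between two records, ignoring index `skip`."""
    return sum(1 for i, (x, y) in enumerate(zip(a, b)) if i != skip and x != y)

def knn_impute(records, k=3):
    """Return a copy of `records` in which each None is replaced by the majority
    value of that attribute among the k nearest complete records."""
    complete = [r for r in records if None not in r]
    filled = []
    for rec in records:
        rec = list(rec)
        for j, value in enumerate(rec):
            if value is None:
                # Rank complete records by distance on the other attributes.
                neighbors = sorted(complete, key=lambda c: hamming(rec, c, j))[:k]
                rec[j] = Counter(n[j] for n in neighbors).most_common(1)[0][0]
        filled.append(tuple(rec))
    return filled

# Hypothetical categorical records: (pace, intensity, outcome).
records = [
    ("fast", "high", "win"),
    ("fast", "high", "win"),
    ("slow", "low", "loss"),
    ("slow", "low", "loss"),
    ("fast", None, "win"),   # missing intensity
]
imputed = knn_impute(records, k=2)
print(imputed[4])
```

With these toy records the two nearest complete neighbors of the last record both have `"high"` intensity, so that value is used to fill the gap. A numeric dataset would need a different distance and, typically, a mean or median instead of a vote.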


The Application of Data Mining Technology in the Teaching Evaluation in Colleges and Universities

Journal of Computational and Theoretical Nanoscience, 2017, Vol 14 (1), pp. 7-12. DOI: 10.1166/jctn.2017.6115. Cited by ~2.

Author(s): Xiaoqi Liu

Keyword(s): Data Mining, Decision Tree, Evaluation System, Information Gain, Original Data, Teaching Evaluation, Personal Factors, Mining Technology, ID3 Algorithm, College Work

As teaching management becomes increasingly computerized, network-based teaching evaluation systems have come into wide use, and a large amount of original evaluation data has accumulated. Taking the most recent five years of teaching evaluation data on college work as its basis, this research analyzes teachers' personal factors and teaching operation factors separately using the decision tree ID3 algorithm. The corresponding decision tree is obtained by calculating the information entropy and information gain of each factor. The teaching evaluation results are thus put to real use rather than remaining a mere formality, providing a sound basis for effective and scientific teaching evaluation.
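The entropy and information gain calculation that ID3 uses to choose a splitting attribute can be sketched as follows. This is a generic illustration of the standard formulas, not code from the paper; the attribute names (`title`, `class_size`) and the evaluation labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of `labels` minus the weighted entropy after splitting on `attr`."""
    n = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical teaching-evaluation records and their evaluation results.
rows = [
    {"title": "professor", "class_size": "small"},
    {"title": "professor", "class_size": "large"},
    {"title": "lecturer", "class_size": "small"},
    {"title": "lecturer", "class_size": "large"},
]
labels = ["good", "good", "fair", "fair"]

gains = {attr: information_gain(rows, labels, attr) for attr in ("title", "class_size")}
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy data `title` separates the two classes perfectly, so it has the higher gain and ID3 would split on it first, then recurse on each branch.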


Research on data mining and prediction of large-scale competitions based on online data migration model

Journal of Intelligent & Fuzzy Systems, 2021, pp. 1-12. DOI: 10.3233/jifs-219062.

Author(s): Junjian Wang

Keyword(s): Data Mining, Control Strategy, Large Scale, Cost Optimization, System Structure, Data Migration, Target System, Relay Nodes, Online Data, Migration Model

To improve data mining for large-scale (sports) competitions and the resulting prediction and analysis, this paper builds on the online data migration model to establish a system model for the applications processed by the transmitting, relay, and receiving nodes in the competition network, and proposes an online distributed cost optimization control strategy that governs how applications are run and processed in the communication system. While keeping the application queue stable, the control strategy brings the system overhead, which is the optimization target, arbitrarily close to its theoretical optimum. In addition, in line with the requirements of competition data mining and prediction, the paper constructs a system structure model and designs experiments to verify system performance. The results show that the performance of the resulting data mining and prediction system for large-scale (sports) competitions meets actual needs.


Penerapan Algoritma ID3 dalam Prediksi Kebutuhan Pupuk [Application of the ID3 Algorithm in Predicting Fertilizer Needs]

Journal of Information System Research (JOSH), 2021, Vol 2 (4), pp. 247-253. DOI: 10.47065/josh.v2i4.565.

Author(s): Milyani Aritonang

Keyword(s): Data Mining, Decision Tree, Plant Protection, Organic Fertilizer, Urea Fertilizer, ID3 Algorithm, Development Unit, NPK Fertilizer

Fertilizer demand at the Plant Protection Development Unit (UPPT) is uncertain because it depends on farmers' demand, so fertilizer needs must be predicted. The unit predicts needs for five types of fertilizer: Urea, ZA, SP-36, NPK, and Organic. The prediction is made by data mining with the ID3 algorithm, which calculates entropy and gain values to produce a decision tree and its rules. Testing is done with the Tanagra software. Both the tests run in Tanagra with the ID3 algorithm and the manual calculation yield a decision tree.


Developed third iterative dichotomizer based on feature decisive values for educational data mining

Indonesian Journal of Electrical Engineering and Computer Science, 2020, Vol 18 (1), pp. 209. DOI: 10.11591/ijeecs.v18.i1.pp209-217.

Author(s): Saja Taha Ahmed, Rafah Al-Hamdani, Muayad Sadik Croock

Keyword(s): Data Mining, Feature Selection, Decision Tree, Predictive Analytics, Educational Data Mining, Target Class, ID3 Algorithm, Feature Weight, Holdout Validation, Fold Cross Validation

Decision trees have recently become one of the most widely used classification models. They owe their popularity to their efficiency in predictive analytics, their ease of interpretation, and the fact that they implicitly perform feature selection. This last property is of essential significance in Educational Data Mining (EDM), where selecting the most relevant features has a major impact on classification accuracy. The main contribution of this work is a new multi-objective decision tree that can be used for both feature selection and classification. The proposed Decisive Decision Tree (DDT) is constructed from a decisive feature value, a feature weight related to the target class label. The traditional Iterative Dichotomizer 3 (ID3) algorithm and the proposed DDT are compared on three datasets with respect to several ID3 issues, including the complexity of logarithmic calculation and the selection of multi-valued features. The results indicate that the proposed DDT outperforms ID3 in development time. Classification accuracy improves under 10-fold cross-validation for all datasets, with the highest accuracy achieved by the proposed method being 92% on the student.por dataset, and under holdout validation for two datasets, namely Iraqi and Student-Math. The experiment also shows that the proposed DDT tends to select attributes that are important rather than attributes that merely have many values.
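For readers unfamiliar with 10-fold cross-validation, the following sketch shows how sample indices can be partitioned so that every sample is used for testing exactly once. It is a generic illustration, not the authors' experimental code; the function name `k_fold_indices` is hypothetical, and the sketch assumes the caller has already shuffled the data if shuffling is wanted.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(25, k=10))
for train_idx, test_idx in folds[:2]:
    print(len(train_idx), test_idx)
```

A model would be trained on `train_idx` and scored on `test_idx` for each fold, and the ten scores averaged. Holdout validation, by contrast, uses a single train/test split.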


Appraisal of the Classification Technique in Data Mining of Student Performance using J48 Decision Tree, K-Nearest Neighbor and Multilayer Perceptron Algorithms

International Journal of Computer Applications, 2018, Vol 179 (33), pp. 39-46. DOI: 10.5120/ijca2018916751. Cited by ~1.

Author(s): Faiza Umar, Najim Ussiph

Keyword(s): Data Mining, Decision Tree, Student Performance, Multilayer Perceptron, Nearest Neighbor, K Nearest Neighbor, Classification Technique, J48 Decision Tree


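Analisis Komparatif Evaluasi Performa Algoritma Klasifikasi pada Readmisi Pasien Diabetes [Comparative Analysis of Classification Algorithm Performance Evaluation on Diabetic Patient Readmission]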

Jurnal Buana Informatika, 2016, Vol 7 (4). DOI: 10.24002/jbi.v7i4.770.

Author(s): Mochammad Yusa, Ema Utami, Emha T. Luthfi

Keyword(s): Data Mining, Decision Tree, Cross Validation, Nearest Neighbor, Naive Bayes, Kappa Statistic, Naïve Bayes, Validation Dataset, K Nearest Neighbor, Fold Cross Validation

Abstract. Readmission is associated with quality measures on patients in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, make the calculation of quality of care tend to be complicated. Data mining classification techniques can solve this problem. In this paper, three classifiers, namely Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using 10-fold cross-validation. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE.

Keywords: k-NN, naive bayes, diabetes, readmission

Abstrak (translated from Indonesian). Readmission is associated with measuring the quality of patient care in hospitals. Differences in the attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, make the quality calculation tend to be complicated. Data mining classification techniques can be a solution for this calculation. Classification is one of the data mining techniques whose development has been quite significant. In this study, the classification algorithm models Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, with various parameter settings, are evaluated on Accuracy, Mean Absolute Error (MAE), and Kappa Statistic using 10-fold cross-validation. The evaluated dataset has 47 attributes and 49,735 records. The results show that the best Accuracy, MAE, and Kappa Statistic performance is obtained from the Naive Bayes model.

Kata Kunci: k-NN, naive bayes, diabetes, readmisi
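Two of the metrics named above, accuracy and the Kappa statistic, can be computed from true and predicted labels as in the sketch below. It uses the standard definition of Cohen's kappa (observed agreement corrected for chance agreement) and is an illustration only; the label values and the names `accuracy` and `cohen_kappa` are assumptions, not taken from the paper.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1,
    i.e. when both label lists contain a single identical class.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability both pick class c independently, summed over c.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical readmission labels.
y_true = ["yes", "yes", "no", "no"]
y_pred = ["yes", "no", "no", "no"]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(acc, kappa)
```

Here three of four predictions are correct (accuracy 0.75), but half of that agreement is expected by chance, so kappa is 0.5. This is why kappa is often reported alongside accuracy on imbalanced data.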


A Combination Strategy of Feature Selection Based on an Integrated Optimization Algorithm and Weighted K-Nearest Neighbor to Improve the Performance of Network Intrusion Detection

Electronics, 2020, Vol 9 (8), pp. 1206. DOI: 10.3390/electronics9081206.

Author(s): Hui Xu, Krzysztof Przystupa, Ce Fang, Andrzej Marciniak, Orest Kochan, ...

Keyword(s): Feature Selection, Intrusion Detection, Optimization Algorithm, Nearest Neighbor, Original Data, Network Intrusion Detection, K Nearest Neighbor, Combination Strategy, Integrated Optimization, Network Intrusion

With the widespread use of the Internet, network security has attracted growing attention, and network intrusion detection has become one of the main security technologies. In network intrusion detection, the original data are typically high-dimensional and large in volume, which greatly affects both efficiency and accuracy. Feature selection and the classifier therefore both play a significant role in raising detection performance. This paper considers the classification results of an optimized weighted K-nearest neighbor (KNN) together with those of the feature selection algorithm, and proposes a combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN to improve the performance of network intrusion detection. Experimental results show that weighted KNN increases efficiency at the expense of a small amount of accuracy, and that the proposed combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN improves both the efficiency and the accuracy of network intrusion detection.
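The general idea of weighted KNN, in which closer neighbors count for more in the vote, can be sketched as follows. The abstract does not state the paper's weighting scheme, so this example assumes inverse-distance weights with Euclidean distance; the function name, the `eps` guard, and the toy feature vectors are all hypothetical.

```python
from collections import defaultdict
from math import dist

def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors."""
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = defaultdict(float)
    for x, label in neighbors:
        # eps avoids division by zero when the query coincides with a training point.
        votes[label] += 1.0 / (dist(x, query) + eps)
    return max(votes, key=votes.get)

# Toy 2-D feature vectors labelled as normal traffic or attack.
train_X = [(0.0, 0.0), (1.0, 1.0), (1.1, 1.0)]
train_y = ["normal", "attack", "attack"]
prediction = weighted_knn_predict(train_X, train_y, (0.1, 0.1), k=3)
print(prediction)
```

With k=3 an unweighted majority vote would return `"attack"` (two of three neighbors), but the single `"normal"` point is much closer to the query, so its weight dominates and the weighted vote returns `"normal"`.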


Application of Decision Tree Algorithm in Lumber Hierarchies

Advanced Materials Research, 2012, Vol 466-467, pp. 308-313. DOI: 10.4028/www.scientific.net/amr.466-467.308.

Author(s): Dan Guo

Keyword(s): Data Mining, Decision Tree, Inductive Reasoning, Decision Tree Algorithm, Discrete Function, Tree Algorithm, ID3 Algorithm, Noise Data, Construction Model

The decision tree algorithm is a high-precision method for approximating discrete-valued functions. Its classification models are simple to construct and robust to noisy data, and it is currently one of the most widely used inductive reasoning algorithms in data mining, drawing extensive attention from researchers. This paper uses the decision tree ID3 algorithm to standardize the division of lumber into grades, ensuring the accuracy of the division while improving its speed.


Sistem Prediksi Penyakit Kanker Serviks Menggunakan CART, Naive Bayes, dan k-NN [Cervical Cancer Prediction System Using CART, Naive Bayes, and k-NN]

Creative Information Technology Journal, 2018, Vol 4 (2), pp. 83. DOI: 10.24076/citec.2017v4i2.100.

Author(s): Tutus Praningki, Indra Budi

Keyword(s): Data Mining, Decision Tree, Pap Smear, Nearest Neighbor, Naive Bayes, Confusion Matrix, Regression Trees, Naïve Bayes, K Nearest Neighbor, Classification And Regression

The availability of historical medical record data for cervical cancer patients at health care institutions has not been accompanied by a process for extracting it into knowledge or information. Data mining techniques have strong potential to be implemented in a system that can predict cervical cancer. This study focuses on a dataset of medical diagnoses of patients who are about to undergo a Pap smear test. The algorithms used to classify cervical cancer are Classification And Regression Trees (CART), Naive Bayes, and k-Nearest Neighbor (k-NN). The CART decision tree, Naive Bayes, and k-NN algorithms are tested with a confusion matrix, using the holdout technique to split the dataset. The results show that Naive Bayes has the best accuracy at 94.44%, while CART and k-NN reach 88.89% and 85.04%, respectively. The performance obtained by each algorithm makes it possible to use the cervical cancer prediction system to support clinical decisions for new patients.
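The evaluation procedure described here, a holdout split followed by a confusion matrix, can be sketched generically as below. This is not the study's code: the 30% test ratio, the fixed seed, the function names, and the toy labels are assumptions chosen only to make the example self-contained.

```python
import random
from collections import Counter

def holdout_split(samples, test_ratio=0.3, seed=0):
    """Shuffle a copy of `samples` and split it once into (train, test)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def confusion_matrix(y_true, y_pred):
    """Count occurrences of each (true label, predicted label) pair."""
    return Counter(zip(y_true, y_pred))

def accuracy_from_matrix(matrix):
    """Share of all counted pairs where true and predicted labels agree."""
    correct = sum(count for (t, p), count in matrix.items() if t == p)
    return correct / sum(matrix.values())

samples = list(range(10))
train, test = holdout_split(samples)

# Hypothetical true and predicted labels for a held-out test set.
y_true = ["pos", "pos", "neg", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "pos"]
matrix = confusion_matrix(y_true, y_pred)
acc = accuracy_from_matrix(matrix)
print(len(train), len(test), dict(matrix), acc)
```

The matrix keeps false positives and false negatives separate, which matters in a clinical setting where the two kinds of error have different costs; accuracy collapses them into one number.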


Human Heart Disease Prediction System Using Data Mining Techniques

Sir Syed Research Journal of Engineering & Technology, 2019, Vol 8 (II). DOI: 10.33317/ssurj.v8iii.92.

Author(s): Sidra Javed, Hamza Javed, Ayesha Saddique, Beenish Rafiq

Keyword(s): Data Mining, Heart Disease, Decision Tree, Heart Diseases, Heavy Load, Data Mining Techniques, ID3 Algorithm, Wide Range, Using Data, Human Heart Disease

Prediction of heart disease is a major concern nowadays, because people are busy and, under heavy workloads, pay little attention to their health. Diagnosing a disease is a big challenge: the issue is to extract data that carry meaningful knowledge, and data mining techniques are used for this purpose. Many researchers and practitioners are familiar with heart disease prediction, and a wide range of techniques is available. In this study, the collected data are pre-processed, and the Decision Tree algorithm and ID3 are then applied to predict heart disease.

Index Terms: Decision Tree, ID3 Algorithm, Data Mining, Decision Support System (DSS), Knowledge Discovery from Databases (KDD).
