Online Data Migration Model and ID3 Algorithm in Sports Competition Action Data Mining Application

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Li Ju ◽  
Lei Huang ◽  
Sang-Bing Tsai

The ID3 algorithm is an important method in data mining; its rules are simple and easy to understand, and it has high application value. Applied to the online data migration of sports competition actions, the decision tree algorithm can extract competition rules from the relationships within massive data and use them to guide sports competition. This paper analyzes the application performance of the traditional ID3 algorithm in online data migration of sports competition actions; implements its application steps and data processing pipeline, including original data collection, original data preprocessing, data preparation, decision tree construction, and data mining; makes a comprehensive evaluation of the traditional ID3 algorithm; and clarifies its problems, chiefly missing attributes and overfitting, which provide directions for subsequent optimization. The paper then proposes a k-nearest neighbor-based ID3 optimization algorithm, which addresses the missing-attribute problem of the traditional ID3 algorithm by filling in missing values with values taken from the k nearest neighbors. The improved algorithm is applied to the online data migration of sports competition actions, and its application effect is evaluated. The results show that the performance of the k-nearest neighbor-based ID3 optimization algorithm is significantly improved and that it can also solve the overfitting problem of the traditional ID3 algorithm. For the overall classification problem of six types of travel pattern samples, the experimental data samples have high data quality, a considerable sample size, and clear differentiation between samples. Therefore, this paper also uses a deep factorization machine algorithm based on deep learning to classify the six classes of travel patterns of sports competition action data using the previously extracted features. The research provides a more accurate method and a higher-performance online data migration model for sports competition action data mining.
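
The k-nearest-neighbor filling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the similarity measure (number of matching known attributes), the function names `similarity` and `knn_fill`, and the toy records are all assumptions, since the abstract does not specify them.

```python
from collections import Counter

MISSING = None  # marker for a missing attribute value (assumption)

def similarity(a, b, skip):
    """Count attributes (other than `skip`) on which records a and b agree."""
    return sum(
        1
        for key in a
        if key != skip and a[key] is not MISSING and a[key] == b.get(key)
    )

def knn_fill(records, attr, k=3):
    """Return copies of `records` with missing `attr` filled by a k-NN majority vote."""
    donors = [r for r in records if r[attr] is not MISSING]
    filled = []
    for rec in records:
        rec = dict(rec)  # copy, so the input records stay untouched
        if rec[attr] is MISSING and donors:
            # Rank complete records by similarity; sorted() is stable, so ties keep input order.
            nearest = sorted(
                donors, key=lambda d: similarity(rec, d, attr), reverse=True
            )[:k]
            votes = Counter(d[attr] for d in nearest)
            rec[attr] = votes.most_common(1)[0][0]
        filled.append(rec)
    return filled

# Hypothetical sports-action records; the last one lacks the "angle" attribute.
records = [
    {"speed": "fast", "angle": "high", "result": "score"},
    {"speed": "fast", "angle": "high", "result": "score"},
    {"speed": "slow", "angle": "low", "result": "miss"},
    {"speed": "fast", "angle": None, "result": "score"},
]
filled = knn_fill(records, "angle", k=2)
print(filled[3]["angle"])
```

The filled records could then be passed to an ordinary ID3 tree builder; numeric attributes would need a distance function instead of the match count used here.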

2017 ◽  
Vol 14 (1) ◽  
pp. 7-12 ◽  
Author(s):  
Xiaoqi Liu

As teaching management becomes increasingly informatized, network-based teaching evaluation systems have been widely adopted, and a large amount of original evaluation data has accumulated. Taking the college's teaching evaluation data from the most recent five years as its basis, this research analyzes teachers' personal factors and teaching operation factors separately using the decision tree ID3 algorithm. By calculating the information entropy and information gain of each factor, the corresponding decision tree is obtained. In this way the teaching evaluation results are put to real use rather than becoming a mere formality, and they provide a strong basis for effective and scientific teaching evaluation.
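
To make the entropy and information gain calculation concrete, here is a small sketch of the two standard ID3 quantities. The attribute names (`title`, `workload`, `rating`) and the four toy rows are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy of `target` minus its weighted entropy after splitting on `attr`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical teaching-evaluation rows.
rows = [
    {"title": "senior", "workload": "high", "rating": "good"},
    {"title": "senior", "workload": "low", "rating": "good"},
    {"title": "junior", "workload": "high", "rating": "poor"},
    {"title": "junior", "workload": "low", "rating": "poor"},
]
gain_title = information_gain(rows, "title", "rating")
gain_workload = information_gain(rows, "workload", "rating")
print(gain_title, gain_workload)
```

In this toy data `title` separates the ratings perfectly while `workload` carries no information, so ID3 would pick `title` as the root of the tree.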


2021 ◽  
pp. 1-12
Author(s):  
Junjian Wang

To improve the data mining effect for large-scale (sports) competitions and the results of competition prediction and analysis, this paper, based on the online data migration model, establishes a system model for processing the applications of transmitting nodes, relay nodes, and receiving nodes in the competition network, and proposes an online distributed cost optimization control strategy that governs the operation and processing of applications in the communication system. While keeping the application queue stable, the control strategy brings the system overhead, which is the optimization target, arbitrarily close to its theoretical optimum. In addition, according to the competition data mining and prediction requirements, this paper constructs a system structure model and designs experiments to verify the system performance. The research results show that the performance of the data mining and prediction system for large-scale (sports) competitions constructed in this paper meets actual needs.


2021 ◽  
Vol 2 (4) ◽  
pp. 247-253
Author(s):  
Milyani Aritonang

Fertilizer demand at the Plant Protection Development Unit (UPPT) is uncertain because it depends on farmers' requests, so fertilizer needs must be predicted. The UPPT predicts needs for five types of fertilizer: Urea, ZA, SP-36, NPK, and Organic. The prediction uses the ID3 data mining algorithm, which calculates entropy and gain values to produce a final result in the form of a decision tree and rules. Testing is done with the Tanagra software. Both the tests run in Tanagra with the ID3 algorithm and the calculation yield results in the form of a decision tree.
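
The step from a decision tree to decision rules can be illustrated with a short sketch. The nested-dictionary tree below is hand-written and entirely hypothetical (the attributes `season` and `crop` and the demand levels do not come from the paper); the point is only that each root-to-leaf path becomes one IF-THEN rule.

```python
def tree_to_rules(tree, conditions=()):
    """Yield (conditions, decision) pairs, one per root-to-leaf path."""
    if not isinstance(tree, dict):
        # A non-dict node is a leaf holding the decision.
        yield list(conditions), tree
        return
    # An internal node is a one-key dict: {attribute: {value: subtree, ...}}.
    (attribute, branches), = tree.items()
    for value, subtree in branches.items():
        yield from tree_to_rules(subtree, conditions + ((attribute, value),))

# Hypothetical tree of the kind ID3 might produce for fertilizer demand.
tree = {
    "season": {
        "rainy": {"crop": {"rice": "high", "corn": "medium"}},
        "dry": "low",
    }
}

rules = list(tree_to_rules(tree))
for conds, decision in rules:
    clause = " AND ".join(f"{a} = {v}" for a, v in conds)
    print(f"IF {clause} THEN demand = {decision}")
```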


Author(s):  
Saja Taha Ahmed ◽  
Rafah Al-Hamdani ◽  
Muayad Sadik Croock

Recently, decision trees have become one of the preeminent classification models. They owe their popularity to their efficiency in predictive analytics, their ease of interpretation, and the feature selection they perform implicitly. The last of these is of essential significance in Educational Data Mining (EDM), where selecting the most relevant features has a major impact on classification accuracy.
The main contribution is a new multi-objective decision tree that can be used for both feature selection and classification. The proposed Decisive Decision Tree (DDT) is constructed using a decisive feature value as a feature weight related to the target class label. The traditional Iterative Dichotomizer 3 (ID3) algorithm and the proposed DDT are compared on three datasets with respect to several ID3 issues, including the complexity of logarithmic calculations and the selection of multi-valued features. The results indicate that the proposed DDT outperforms ID3 in development time. Classification accuracy improves under 10-fold cross-validation for all datasets, with the highest accuracy achieved by the proposed method being 92% on the student.por dataset, and under holdout validation for two datasets, i.e. Iraqi and Student-Math. The experiments also show that the proposed DDT tends to select attributes that are important rather than multi-valued.
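
The 10-fold cross-validation protocol mentioned in this abstract can be sketched as below. This is a generic illustration under stated assumptions: folds are contiguous and unshuffled, the dataset has at least k samples, and a trivial majority-class classifier stands in for the tree learner, because the DDT construction itself is not specified in the abstract. All function names are hypothetical.

```python
from collections import Counter

def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validate(samples, labels, fit, predict, k=10):
    """Mean per-fold accuracy of a classifier given by `fit` and `predict`."""
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

# Stand-in classifier (assumption): always predict the majority training label.
def fit_majority(samples, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

samples = list(range(20))
labels = ["pass"] * 15 + ["fail"] * 5
score = cross_validate(samples, labels, fit_majority, predict_majority, k=10)
print(score)
```

In practice the data would usually be shuffled or stratified before splitting; that step is left out here to keep the fold arithmetic easy to follow.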


2016 ◽  
Vol 7 (4) ◽  
Author(s):  
Mochammad Yusa ◽  
Ema Utami ◽  
Emha T. Luthfi

Abstract. Readmission is associated with quality-of-care measures for hospital patients. The different attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, make the calculation of quality of care complicated. Classification techniques from data mining can solve this problem. In this paper, three classifiers, i.e. Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using the 10-fold cross-validation technique. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE. Keywords: k-NN, naive bayes, diabetes, readmission
Abstract (Indonesian version, translated). The readmission process is associated with the calculation of the quality of patient care in hospitals. The differing attributes related to diabetic patients, such as the medication process, ethnicity, race, lifestyle, age, and others, make the quality calculation complicated. Data mining classification techniques can be a solution for this calculation. Classification is one of the data mining techniques whose development has been quite significant. In this study, the Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes classification models are evaluated under various parameter settings on Accuracy, Mean Absolute Error (MAE), and Kappa Statistic using the 10-fold cross-validation method. The evaluated dataset has 47 attributes and 49,735 records. The results show that the best accuracy, MAE, and Kappa Statistic are obtained from the Naive Bayes model. Keywords: k-NN, naive bayes, diabetes, readmission
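
The three evaluation measures named in this abstract can be computed as in the sketch below. It uses the standard definitions of accuracy and Cohen's kappa; the MAE here is taken between 0/1 targets and predicted probabilities, which is one common convention and an assumption, since the abstract does not say how MAE was computed. The labels and scores are made up for illustration.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Undefined (division by zero) when expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)

def mean_absolute_error(targets, scores):
    """Mean absolute difference between 0/1 targets and predicted probabilities."""
    return sum(abs(t - s) for t, s in zip(targets, scores)) / len(targets)

# Hypothetical readmission labels: 5 "yes" and 5 "no", with 2 mistakes.
y_true = ["yes"] * 5 + ["no"] * 5
y_pred = ["yes", "yes", "yes", "yes", "no", "yes", "no", "no", "no", "no"]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
mae = mean_absolute_error([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
print(acc, kappa, mae)
```

Kappa discounts the agreement expected by chance, which is why a classifier can rank differently on kappa than on raw accuracy when classes are imbalanced.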


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1206
Author(s):  
Hui Xu ◽  
Krzysztof Przystupa ◽  
Ce Fang ◽  
Andrzej Marciniak ◽  
Orest Kochan ◽  
...  

With the widespread use of the Internet, network security issues have attracted more and more attention, and network intrusion detection has become one of the main security technologies. In network intrusion detection, the original data source typically has high dimensionality and a large volume, which greatly affects efficiency and accuracy. Both feature selection and the classifier therefore play a significant role in raising detection performance. This paper considers the classification optimization results of weighted K-nearest neighbor (KNN) together with those of the feature selection algorithm, and proposes a combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN to improve the performance of network intrusion detection. Experimental results show that weighted KNN can increase efficiency at the expense of a small amount of accuracy, and that the proposed combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN can improve both the efficiency and the accuracy of network intrusion detection.
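
As a rough illustration of what a weighted KNN classifier does, the sketch below lets each of the k nearest neighbors vote with a weight that shrinks with distance. Inverse-distance weighting, the function name `weighted_knn_predict`, and the two-feature toy traffic points are assumptions; the abstract does not state which weighting scheme the paper uses.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest points.

    `train` is a list of (feature_tuple, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        # Closer neighbours get larger weights; eps avoids division by zero.
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    return max(votes, key=votes.get)

# Hypothetical 2-feature traffic records.
train = [
    ((1.0, 1.1), "attack"),
    ((2.0, 1.0), "normal"),
    ((1.0, 2.5), "normal"),
    ((5.0, 5.0), "normal"),
]
near_attack = weighted_knn_predict(train, (1.0, 1.0), k=3)
near_normal = weighted_knn_predict(train, (2.0, 1.2), k=3)
print(near_attack, near_normal)
```

For the first query, two of the three nearest neighbors are labelled `normal`, yet the single very close `attack` point outweighs them; an unweighted majority vote would have answered `normal`.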


2012 ◽  
Vol 466-467 ◽  
pp. 308-313
Author(s):  
Dan Guo

The decision tree algorithm is a method for approximating discrete-valued functions with high precision; the classification models it constructs are simple and robust to noisy data. It is currently one of the most widely used inductive reasoning algorithms in data mining and has received extensive attention from researchers. This paper selects the decision tree ID3 algorithm to standardize lumber grade classification, ensuring the accuracy of lumber grading while improving grading speed.


2018 ◽  
Vol 4 (2) ◽  
pp. 83
Author(s):  
Tutus Praningki ◽  
Indra Budi

The availability of historical medical record data for cervical cancer patients at health service institutions has not been accompanied by a process for extracting knowledge or information from it. Data mining techniques have great potential to be implemented in a system that can predict cervical cancer. This research focuses on a dataset of medical diagnoses for patients who are about to undergo a Pap smear test. The algorithms used to classify cervical cancer are Classification And Regression Trees (CART), Naive Bayes, and k-Nearest Neighbor (k-NN). The CART decision tree, Naive Bayes, and k-NN algorithms were tested using a confusion matrix, with the dataset split by the holdout technique. The test results show that Naive Bayes has the best accuracy at 94.44%, while the accuracies of CART and k-NN are 88.89% and 85.04%, respectively. The performance obtained by each algorithm makes it possible to use a cervical cancer prediction system to support clinical decisions for new patients.
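
The evaluation procedure described here, a holdout split followed by a confusion matrix, can be sketched as follows. The 30% test fraction, the unshuffled split, the function names, and the toy labels are assumptions for illustration; the abstract does not give the split ratio or the raw predictions.

```python
from collections import Counter

def holdout_split(items, test_fraction=0.3):
    """Split items into (train, test), keeping the original order (no shuffling)."""
    cut = len(items) - int(round(len(items) * test_fraction))
    return items[:cut], items[cut:]

def confusion_matrix(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))

def accuracy_from_matrix(matrix):
    """Share of counts on the diagonal, i.e. where actual equals predicted."""
    correct = sum(n for (actual, predicted), n in matrix.items() if actual == predicted)
    return correct / sum(matrix.values())

train_ids, test_ids = holdout_split(list(range(10)), test_fraction=0.3)

# Hypothetical actual vs. predicted diagnoses on a held-out set.
y_true = ["pos", "pos", "neg", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "pos"]
matrix = confusion_matrix(y_true, y_pred)
acc = accuracy_from_matrix(matrix)
print(matrix[("pos", "pos")], matrix[("pos", "neg")], acc)
```

Beyond accuracy, the off-diagonal cells matter in a clinical setting: `("pos", "neg")` counts missed cases, which are usually costlier than false alarms.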


Author(s):  
Sidra Javed ◽  
Hamza Javed ◽  
Ayesha Saddique ◽  
Beenish Rafiq

Prediction of heart disease is a major concern nowadays because people are busy and, under heavy workloads, pay little attention to their health. Diagnosing a disease is a big challenge. The issue is to extract data that carry meaningful knowledge, and data mining techniques are used for this purpose. Many researchers and practitioners are familiar with the prediction of heart disease, and a wide range of techniques is available to predict it. To address this problem, Decision Tree and ID3 are used to predict heart disease: in this study the collected data are pre-processed, and the Decision Tree algorithm and ID3 are then applied to predict heart disease. Index Terms: Decision Tree, ID3 Algorithm, Data Mining, Decision Support System (DSS), Knowledge Discovery from Databases (KDD).

