scholarly journals Data Mining and Application of Decision Tree Modelling on Electrochemical Data Used for Damaged Starch Detection

Author(s):  
Nilüfer YILDIRIM ◽  
Alper TAPAN
2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. Great number of received messages making it difficult for human to classify those messages to Spam or no Spam.  One way to overcome this problem is to use Data Mining for automatic classifications. In this paper, we investigate various data mining techniques, named Support Vector Machine, Multinomial Naïve Bayes and Decision Tree for automatic spam detection. Our experimental results show that Support Vector Machine algorithm is the best algorithm over three evaluated algorithms. Support Vector Machine achieves 98.33%, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree is at 97.10 % accuracy.


2021 ◽  
Vol 12 (2) ◽  
Author(s):  
Mohammad Haekal ◽  
Henki Bayu Seta ◽  
Mayanda Mega Santoni

Untuk memprediksi kualitas air sungai Ciliwung, telah dilakukan pengolahan data-data hasil pemantauan secara Online Monitoring dengan menggunakan Metode Data Mining. Pada metode ini, pertama-tama data-data hasil pemantauan dibuat dalam bentuk tabel Microsoft Excel, kemudian diolah menjadi bentuk Pohon Keputusan yang disebut Algoritma Pohon Keputusan (Decision Tree) mengunakan aplikasi WEKA. Metode Pohon Keputusan dipilih karena lebih sederhana, mudah dipahami dan mempunyai tingkat akurasi yang sangat tinggi. Jumlah data hasil pemantauan kualitas air sungai Ciliwung yang diolah sebanyak 5.476 data. Hasil klarifikasi dengan Pohon Keputusan, dari 5.476 data ini diperoleh jumlah data yang mengindikasikan sungai Ciliwung Tidak Tercemar sebanyak 1.059 data atau sebesar 19,3242%, dan yang mengindikasikan Tercemar sebanyak 4.417 data atau 80,6758%. Selanjutnya data-data hasil pemantauan ini dievaluasi menggunakan 4 Opsi Tes (Test Option) yaitu dengan Use Training Set, Supplied Test Set, Cross-Validation folds 10, dan Percentage Split 66%. Hasil evaluasi dengan 4 opsi tes yang digunakan ini, semuanya menunjukkan tingkat akurasi yang sangat tinggi, yaitu diatas 99%. Dari data-data hasil peneltian ini dapat diprediksi bahwa sungai Ciliwung terindikasi sebagai sungai tercemar bila mereferensi kepada Peraturan Pemerintah Republik Indonesia nomor 82 tahun 2001 dan diketahui pula bahwa penggunaan aplikasi WEKA dengan Algoritma Pohon Keputusan untuk mengolah data-data hasil pemantauan dengan mengambil tiga parameter (pH, DO dan Nitrat) adalah sangat akuran dan tepat. Kata Kunci : Kualitas air sungai, Data Mining, Algoritma Pohon Keputusan, Aplikasi WEKA.


2020 ◽  
Vol 7 (2) ◽  
pp. 200
Author(s):  
Puji Santoso ◽  
Rudy Setiawan

One of the tasks in the field of marketing finance is to analyze customer data to find out which customers have the potential to do credit again. The method used to analyze customer data is by classifying all customers who have completed their credit installments into marketing targets, so this method causes high operational marketing costs. Therefore this research was conducted to help solve the above problems by designing a data mining application that serves to predict the criteria of credit customers with the potential to lend (credit) to Mega Auto Finance. The Mega Auto finance Fund Section located in Kotim Regency is a place chosen by researchers as a case study, assuming the Mega Auto finance Fund Section has experienced the same problems as described above. Data mining techniques that are applied to the application built is a classification while the classification method used is the Decision Tree (decision tree). While the algorithm used as a decision tree forming algorithm is the C4.5 Algorithm. The data processed in this study is the installment data of Mega Auto finance loan customers in July 2018 in Microsoft Excel format. The results of this study are an application that can facilitate the Mega Auto finance Funds Section in obtaining credit marketing targets in the future


2009 ◽  
Vol 147-149 ◽  
pp. 588-593 ◽  
Author(s):  
Marcin Derlatka ◽  
Jolanta Pauk

In the paper the procedure of processing biomechanical data has been proposed. It consists of selecting proper noiseless data, preprocessing data by means of model’s identification and Kernel Principal Component Analysis and next classification using decision tree. The obtained results of classification into groups (normal and two selected pathology of gait: Spina Bifida and Cerebral Palsy) were very good.


2010 ◽  
Vol 40-41 ◽  
pp. 156-161 ◽  
Author(s):  
Yang Li ◽  
Yan Qiang Li ◽  
Zhi Xue Wang

With the rapid development of automotive ECUs(Electronic Control Unit), the fault diagnosis becomes increasingly complicated. And the link between fault and symptom becomes less obvious. In order to improve the maintenance quality and efficiency, the paper proposes a fault diagnosis approach based on data mining technologies. By making full use of data stream, we firstly extract fault symptom vectors by processing data stream, and then establish a diagnosis decision tree through the ID3 decision tree algorithm, and finally store the link rules between faults and the related symptoms into historical fault database as a foundation for the fault diagnosis. The database provides the basis of trend judgments for a future fault. To verify this approach, an example of diagnosing faults of entertainment ECU is showed. The test result testifies the reliability and validity of this diagnostic method and reduces the cost of ECU diagnosis.


2021 ◽  
pp. 1-10
Author(s):  
Chao Dong ◽  
Yan Guo

The wide application of artificial intelligence technology in various fields has accelerated the pace of people exploring the hidden information behind large amounts of data. People hope to use data mining methods to conduct effective research on higher education management, and decision tree classification algorithm as a data analysis method in data mining technology, high-precision classification accuracy, intuitive decision results, and high generalization ability make it become a more ideal method of higher education management. Aiming at the sensitivity of data processing and decision tree classification to noisy data, this paper proposes corresponding improvements, and proposes a variable precision rough set attribute selection standard based on scale function, which considers both the weighted approximation accuracy and attribute value of the attribute. The number improves the anti-interference ability of noise data, reduces the bias in attribute selection, and improves the classification accuracy. At the same time, the suppression factor threshold, support and confidence are introduced in the tree pre-pruning process, which simplifies the tree structure. The comparative experiments on standard data sets show that the improved algorithm proposed in this paper is better than other decision tree algorithms and can effectively realize the differentiated classification of higher education management.


2014 ◽  
Vol 538 ◽  
pp. 460-464
Author(s):  
Xue Li

Based on inter-correlation and permeability among disciplines, the author makes an attempt to apply the information science to cognitive linguistics to provide a new perspective for the study of foreign languages. The correlation between self-efficacy and such four factors as anxiety, learning strategies, motivation and learners’ past achievement is analyzed by means of data mining and the extent to which the above factors affect self-efficacy in language learning is explored in this paper. The paper employs the decision tree algorithm in SPSS Clementine. C5.0 decision tree algorithm is adopted to analyze data in the study. The results are elicited from the researches carried out in this paper. The increased anxiety is bound to weaken learners’ motivation over time. It is obvious that learners have low self-efficacy. It is very important to employ strategies in foreign language learning. Ignorance of using learning strategies may result in unplanned learning with unsatisfactory achievements in spite of more efforts involved. Self-efficacy in foreign language learning may be weakened accordingly. Learners’ past achievement is a reference dimension in measuring self-efficacy with weaker influence.


2011 ◽  
Vol 403-408 ◽  
pp. 1804-1807
Author(s):  
Ning Zhao ◽  
Shao Hua Dong ◽  
Qing Tian

In order to optimize electric- arc welding (ERW) welded tube scheduling , the paper introduces data cleaning, data extraction and transformation in detail and defines the datasets of sample attribute, which is based on analysis of production process of ERW welded tube. Furthermore, Decision-Tree method is adopted to achieve data mining and summarize scheduling rules which are validated by an example.


Sign in / Sign up

Export Citation Format

Share Document