Data Mining and Application of Decision Tree Modelling on Electrochemical Data Used for Damaged Starch Detection

It is now common for cellphones to receive spam messages. The large number of received messages makes it difficult for a person to classify them manually as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine performs best of the three evaluated algorithms: it achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
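To make one of the three compared techniques concrete, here is a minimal pure-Python sketch of Multinomial Naïve Bayes with Laplace smoothing. It is not the authors' implementation: the function names, the whitespace tokenizer, the smoothing constant `alpha`, and the four toy messages are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(messages, labels, alpha=1.0):
    """Collect per-class word counts (alpha is the Laplace smoothing constant)."""
    word_counts = defaultdict(Counter)   # class -> word -> count
    class_counts = Counter(labels)       # class -> number of messages
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "class_counts": class_counts,
            "vocab": vocab, "alpha": alpha}

def predict_multinomial_nb(model, text):
    """Return the class with the highest log posterior for the message."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            score += math.log((counts[token] + model["alpha"]) /
                              (total_words + model["alpha"] * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy messages invented for illustration; not the paper's corpus.
TRAIN_MESSAGES = [
    "win a free prize now",
    "free cash offer claim now",
    "are we meeting for lunch today",
    "please call me when you are home",
]
TRAIN_LABELS = ["spam", "spam", "ham", "ham"]

model = train_multinomial_nb(TRAIN_MESSAGES, TRAIN_LABELS)
```

Calling `predict_multinomial_nb(model, "claim your free prize")` classifies a new message. The accuracy figures in the abstract would come from running a classifier like this on a labelled test set, which this sketch does not include.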

Adaptive Random Decision Tree: A New Approach for Data Mining with Privacy Preserving

International Journal of Innovative Research in Computer and Communication Engineering ◽

10.15680/ijircce.2015.0307004 ◽

2015 ◽

Vol 03 (07) ◽

pp. 6378-6384

Author(s):

Hemlata B. Deorukhakar, Prof. Pradnya Kasture

Keyword(s):

Data Mining ◽

Decision Tree ◽

Privacy Preserving ◽

New Approach

PREDIKSI KUALITAS AIR SUNGAI CILIWUNG DENGAN MENGGUNAKAN ALGORITMA POHON KEPUTUSAN [Prediction of Ciliwung River Water Quality Using the Decision Tree Algorithm]

Jurnal Air Indonesia ◽

10.29122/jai.v12i2.4364 ◽

2021 ◽

Vol 12 (2) ◽

Author(s):

Mohammad Haekal ◽

Henki Bayu Seta ◽

Mayanda Mega Santoni

Keyword(s):

Data Mining ◽

Decision Tree ◽

Cross Validation ◽

Online Monitoring ◽

Training Set ◽

Microsoft Excel ◽

Test Set

To predict the water quality of the Ciliwung River, data from Online Monitoring were processed using a Data Mining method. In this method, the monitoring data were first tabulated in Microsoft Excel and then processed into a decision tree with the Decision Tree algorithm using the WEKA application. The Decision Tree method was chosen because it is simpler, easy to understand, and has a very high level of accuracy. A total of 5,476 records of Ciliwung River water quality monitoring data were processed. Classification with the Decision Tree found that, of these 5,476 records, 1,059 (19.3242%) indicated that the Ciliwung River was Not Polluted and 4,417 (80.6758%) indicated that it was Polluted. The monitoring data were then evaluated using four Test Options: Use Training Set, Supplied Test Set, 10-fold Cross-Validation, and Percentage Split 66%. All four test options showed a very high level of accuracy, above 99%. From these results it can be predicted that the Ciliwung River is indicated as a polluted river with reference to Government Regulation of the Republic of Indonesia Number 82 of 2001. The results also show that using the WEKA application with the Decision Tree algorithm to process the monitoring data, based on three parameters (pH, DO, and Nitrate), is highly accurate and appropriate. Keywords: river water quality, Data Mining, Decision Tree algorithm, WEKA application.
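Two of the four test options named in this abstract, the 66% percentage split and 10-fold cross-validation, can be sketched as index partitions. This is a minimal illustration, not WEKA's own procedure, which may differ in details such as stratification and rounding. The function names and the fixed random seed are assumptions; only the record count of 5,476 comes from the abstract.

```python
import random

def percentage_split(n_records, train_fraction=0.66, seed=0):
    """Shuffle record indices and split them into train and test parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = round(n_records * train_fraction)
    return indices[:cut], indices[cut:]

def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists; every record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

N_RECORDS = 5476  # number of monitoring records reported in the abstract
train_idx, test_idx = percentage_split(N_RECORDS)
folds = list(k_fold_indices(N_RECORDS))
```

With "Use Training Set" the same records serve for both fitting and scoring, so that option tends to give optimistic accuracy. The split and cross-validation options score the model on records it was not fitted on.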

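Penerapan Metode Klasifikasi Decision Tree dan Algoritma C4.5 dalam Memprediksi Kriteria Nasabah Kredit Mega Auto Finance [Application of the Decision Tree Classification Method and the C4.5 Algorithm in Predicting the Criteria of Mega Auto Finance Credit Customers]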

JURIKOM (Jurnal Riset Komputer) ◽

10.30865/jurikom.v7i2.1762 ◽

2020 ◽

Vol 7 (2) ◽

pp. 200

Author(s):

Puji Santoso ◽

Rudy Setiawan

Keyword(s):

Data Mining ◽

Decision Tree ◽

Microsoft Excel ◽

Customer Data ◽

Data Mining Techniques ◽

C4.5 Algorithm ◽

Marketing Costs ◽

Excel Format ◽

Data Mining Application

One of the tasks in marketing finance is analyzing customer data to find out which customers are likely to take out credit again. The current method classifies all customers who have completed their credit installments as marketing targets, which leads to high operational marketing costs. This research was therefore conducted to help solve this problem by designing a data mining application that predicts the criteria of credit customers with the potential to take out credit again with Mega Auto Finance. The Mega Auto Finance Fund Section in Kotim Regency was chosen as the case study, on the assumption that it has experienced the problem described above. The data mining technique applied in the application is classification, the classification method is the decision tree, and the tree is built with the C4.5 algorithm. The data processed in this study are the installment data of Mega Auto Finance loan customers for July 2018, in Microsoft Excel format. The result of this study is an application that helps the Mega Auto Finance Fund Section identify credit marketing targets in the future.
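C4.5 chooses split attributes by gain ratio, that is, information gain divided by split information. The sketch below computes that criterion for categorical attributes only and leaves out C4.5's handling of continuous attributes, missing values, and pruning. The attribute names (`payment`, `income`, `recredit`) and the four customer rows are hypothetical and do not come from the study's data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    total = len(rows)
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Weighted entropy of the target after splitting on the attribute.
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information is the entropy of the attribute's own values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical customer rows, for illustration only.
CUSTOMERS = [
    {"payment": "on_time", "income": "high", "recredit": "yes"},
    {"payment": "on_time", "income": "low", "recredit": "yes"},
    {"payment": "late", "income": "high", "recredit": "no"},
    {"payment": "late", "income": "low", "recredit": "no"},
]

best_attribute = max(["payment", "income"],
                     key=lambda a: gain_ratio(CUSTOMERS, a, "recredit"))
```

In this toy data `payment` separates the classes perfectly while `income` carries no information, so the criterion selects `payment` as the root split.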

Exploring the Determinants of Korean Dance Recognition and Importance: Application of Decision Tree Analysis based on Data Mining

Dance Research Journal of Dance ◽

10.21317/ksd.77.1.2 ◽

2019 ◽

Vol 77 (1) ◽

pp. 17-29

Author(s):

Hye-Ryeon Kim

Keyword(s):

Data Mining ◽

Decision Tree ◽

Decision Tree Analysis ◽

Tree Analysis ◽

Korean Dance

Data Mining in Analysis of Biomechanical Signals

Solid State Phenomena ◽

10.4028/www.scientific.net/ssp.147-149.588 ◽

2009 ◽

Vol 147-149 ◽

pp. 588-593 ◽

Cited By ~ 3

Author(s):

Marcin Derlatka ◽

Jolanta Pauk

Keyword(s):

Data Mining ◽

Principal Component Analysis ◽

Cerebral Palsy ◽

Spina Bifida ◽

Decision Tree ◽

Principal Component ◽

Data Preprocessing ◽

Component Analysis ◽

Kernel Principal Component Analysis

This paper proposes a procedure for processing biomechanical data. It consists of selecting suitable noise-free data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and then classifying them with a decision tree. The classification into groups (normal gait and two selected gait pathologies, Spina Bifida and Cerebral Palsy) gave very good results.

Fault Diagnosis of Automobile ECUs with Data Mining Technologies

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.40-41.156 ◽

2010 ◽

Vol 40-41 ◽

pp. 156-161 ◽

Cited By ~ 1

Author(s):

Yang Li ◽

Yan Qiang Li ◽

Zhi Xue Wang

Keyword(s):

Data Mining ◽

Fault Diagnosis ◽

Decision Tree ◽

Data Stream ◽

Electronic Control Unit ◽

Rapid Development ◽

Reliability And Validity ◽

Control Unit ◽

Use Of Data ◽

The Cost

With the rapid development of automotive ECUs (Electronic Control Units), fault diagnosis is becoming increasingly complicated, and the link between fault and symptom is becoming less obvious. To improve maintenance quality and efficiency, the paper proposes a fault diagnosis approach based on data mining technologies. Making full use of the data stream, we first extract fault symptom vectors by processing the data stream, then build a diagnosis decision tree with the ID3 decision tree algorithm, and finally store the rules linking faults to their related symptoms in a historical fault database as a foundation for fault diagnosis. The database provides the basis for trend judgments about future faults. To verify this approach, an example of diagnosing faults in an entertainment ECU is shown. The test result confirms the reliability and validity of this diagnostic method, which reduces the cost of ECU diagnosis.
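The pipeline described here (symptom vectors, an ID3 tree, then fault-symptom rules) can be sketched in a few lines. This is an illustrative sketch and not the authors' system: the symptom names (`no_sound`, `screen_blank`), the fault labels, and the four example vectors are hypothetical, and the step that extracts symptoms from the raw data stream is not modelled.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """ID3 split criterion: entropy reduction from splitting on `attribute`."""
    total = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

def build_id3(rows, attributes, target):
    """Return a nested-dict tree; leaves are fault labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = build_id3(subset, remaining, target)
    return tree

def extract_rules(tree, conditions=()):
    """Flatten the tree into (conditions, fault) rules, one per leaf."""
    if not isinstance(tree, dict):
        return [(conditions, tree)]
    (attribute, branches), = tree.items()
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules

# Hypothetical symptom vectors for an entertainment ECU.
SYMPTOM_VECTORS = [
    {"no_sound": "yes", "screen_blank": "no", "fault": "amplifier"},
    {"no_sound": "yes", "screen_blank": "yes", "fault": "power_supply"},
    {"no_sound": "no", "screen_blank": "yes", "fault": "display"},
    {"no_sound": "no", "screen_blank": "no", "fault": "none"},
]

tree = build_id3(SYMPTOM_VECTORS, ["no_sound", "screen_blank"], "fault")
rules = extract_rules(tree)
```

Each rule pairs a set of symptom conditions with a fault label. Rules of this kind are what the abstract describes storing in the historical fault database.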

Improved differentiation classification of variable precision artificial intelligence higher education management

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219036 ◽

2021 ◽

pp. 1-10

Author(s):

Chao Dong ◽

Yan Guo

Keyword(s):

Artificial Intelligence ◽

Higher Education ◽

Data Mining ◽

Decision Tree ◽

Classification Accuracy ◽

Attribute Selection ◽

Higher Education Management ◽

Education Management ◽

Decision Tree Classification

The wide application of artificial intelligence technology in various fields has accelerated the exploration of the information hidden in large amounts of data. Data mining methods are expected to support effective research on higher education management, and the decision tree classification algorithm, with its high classification accuracy, intuitive decision results, and strong generalization ability, is well suited to this task. To address the sensitivity of data processing and decision tree classification to noisy data, this paper proposes a variable precision rough set attribute selection criterion based on a scale function, which considers both the weighted approximation accuracy of an attribute and the number of its attribute values. This improves robustness to noisy data, reduces bias in attribute selection, and improves classification accuracy. In addition, a suppression factor threshold, support, and confidence are introduced in the tree pre-pruning process, which simplifies the tree structure. Comparative experiments on standard data sets show that the improved algorithm outperforms other decision tree algorithms and can effectively realize the differentiated classification of higher education management.
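The abstract does not define its pre-pruning criteria, so the following is only a generic sketch of how support and confidence thresholds can stop a node from being split. The function name, the definitions of support and confidence used here, and the default thresholds are all assumptions. The paper's suppression factor and its rough-set attribute selection criterion are not modelled.

```python
from collections import Counter

def should_stop_splitting(node_labels, n_total,
                          min_support=0.05, min_confidence=0.9):
    """Pre-pruning test: return (stop, majority_label) for a candidate node.

    support    = share of all training records that reach this node
    confidence = share of the node's records in its majority class
    (both definitions are assumptions for this sketch)
    """
    majority_label, majority_count = Counter(node_labels).most_common(1)[0]
    support = len(node_labels) / n_total
    confidence = majority_count / len(node_labels)
    # Stop when the node is too small to split reliably or already nearly pure.
    stop = support < min_support or confidence >= min_confidence
    return stop, majority_label
```

A tree builder would call this before each split and turn the node into a leaf labelled `majority_label` whenever `stop` is true, which keeps the tree smaller.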

An Analysis of Factors Influencing Foreign Language Self-Efficacy Based on C5.0 Decision Tree Algorithm in Data Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.538.460 ◽

2014 ◽

Vol 538 ◽

pp. 460-464

Author(s):

Xue Li

Keyword(s):

Data Mining ◽

Decision Tree ◽

Foreign Language ◽

Language Learning ◽

Learning Strategies ◽

Self Efficacy ◽

Foreign Language Learning ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

C5.0 Decision Tree

Based on the inter-correlation and permeability among disciplines, the author attempts to apply information science to cognitive linguistics to provide a new perspective for the study of foreign languages. The paper uses data mining to analyze the correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners' past achievement) and to explore the extent to which these factors affect self-efficacy in language learning. The C5.0 decision tree algorithm in SPSS Clementine is used to analyze the data. The results are as follows. Increased anxiety is bound to weaken learners' motivation over time, and such learners clearly have low self-efficacy. Employing strategies is very important in foreign language learning: neglecting learning strategies may lead to unplanned learning with unsatisfactory achievement despite greater effort, and self-efficacy in foreign language learning may weaken accordingly. Learners' past achievement is a reference dimension in measuring self-efficacy, with weaker influence.

Data Mining for ERW Welded Tube Scheduling Rules Based on Decision Tree

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.403-408.1804 ◽

2011 ◽

Vol 403-408 ◽

pp. 1804-1807

Author(s):

Ning Zhao ◽

Shao Hua Dong ◽

Qing Tian

Keyword(s):

Data Mining ◽

Decision Tree ◽

Arc Welding ◽

Data Cleaning ◽

Data Extraction ◽

Electric Arc ◽

Welded Tube ◽

Electric Arc Welding ◽

Decision Tree Method ◽

Tree Method

In order to optimize electric-arc welding (ERW) welded tube scheduling, the paper describes data cleaning, data extraction, and data transformation in detail and defines the sample attribute datasets, based on an analysis of the ERW welded tube production process. Furthermore, the decision tree method is used for data mining to summarize scheduling rules, which are validated by an example.
