Analyzing Student Performance in Programming Education Using Classification Techniques

In this research, we aggregated student log data such as Class Test Score (CTS), Assignment Completed (ASC), Class Lab Work (CLW) and Class Attendance (CATT) from the Department of Mathematics, Computer Science Unit, Usmanu Danfodiyo University, Sokoto, Nigeria. We analyzed these data with two decision tree algorithms, ID3 and J48, and compared them on 239 classification instances. The experimental results show that the J48 algorithm classifies more accurately than the ID3 algorithm. We also compared two feature evaluators, Information Gain and Gain Ratio. Both were applied with a ranking search method, and the experimental results confirmed that the two evaluators derived the same set of attributes, with a slight deviation in the ranking. From the results analyzed, we found that 67.36 percent of students failed the course titled Introduction to Computer Programming, while 32.64 percent passed. Because CATT has the highest gain value in our analysis, we concluded that it is largely responsible for the success or failure of the students.
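The comparison of the two feature evaluators can be illustrated with a minimal sketch. It assumes categorical attributes and uses a small invented table; the attribute values and records below are hypothetical and are not the study's data. It computes Information Gain and Gain Ratio for each attribute and ranks the attributes under both measures.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def gain_ratio(rows, attr, target):
    """Information gain divided by the split information of attr."""
    split_info = entropy([r[attr] for r in rows])
    return information_gain(rows, attr, target) / split_info if split_info > 0 else 0.0

# Hypothetical toy records (assumption for illustration only).
rows = [
    {"CATT": "high", "CTS": "high", "result": "pass"},
    {"CATT": "high", "CTS": "low", "result": "pass"},
    {"CATT": "low", "CTS": "high", "result": "fail"},
    {"CATT": "low", "CTS": "low", "result": "fail"},
    {"CATT": "low", "CTS": "low", "result": "fail"},
    {"CATT": "high", "CTS": "high", "result": "pass"},
]

def rank(score):
    """Attributes sorted from most to least informative under the given score."""
    return sorted(["CATT", "CTS"], key=lambda a: score(rows, a, "result"), reverse=True)

print("Information Gain ranking:", rank(information_gain))
print("Gain Ratio ranking:", rank(gain_ratio))
```

On this toy table both evaluators place the attendance attribute first, because it separates the two classes completely. On real data the two rankings can differ, since Gain Ratio penalizes attributes with many distinct values.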

Predicting Students’ Performance In Basic Algorithms Programming In an E-Learning Environment Using Decision Tree Approach

Syntax Literate ; Jurnal Ilmiah Indonesia ◽

10.36418/syntax-literate.v7i1.5733 ◽

2022 ◽

Vol 7 (1) ◽

pp. 498

Author(s):

Jonas De Deus Guterres ◽

Kusuma Ayu Laksitowening ◽

Febryanti Sthevanie

Keyword(s):

Learning Environment ◽

Decision Tree ◽

Student Performance ◽

Information Gain ◽

Quality Data ◽

Basic Algorithm ◽

Id3 Algorithm ◽

E Learning ◽

Advanced Analysis ◽

Quality In Higher Education

Predicting student performance plays an important role in every institution, helping to protect students from failure and to raise quality in higher education. Algorithm and Programming is a fundamental course for students beginning their studies in Informatics. The scope of this research is therefore to identify the critical attributes that influence student performance in an e-learning environment on the Moodle LMS (Learning Management System) platform, and to measure the accuracy of the resulting predictions. Data mining supports preprocessing, turning raw data into quality data for advanced analysis. The dataset consists of student academic performance records such as grades for quizzes, mid exams, final exams, and final projects. Data from the LMS, such as the punctuality of submission of quizzes, assignments, and final projects, is also used in modeling, that is, in constructing the decision tree. The Basic Algorithm and Programming course is split into two subjects taught in the first and second semesters, so the research predicts student performance in the second-semester Basic Algorithm and Programming course from the first-semester Introduction to Programming course. Decision tree techniques are applied using information gain in the ID3 algorithm to find the most important feature: the PP index has the highest information gain, with a value of 0.44. Comparing the accuracy of the ID3 and J48 algorithms shows that ID3 has the higher modeling accuracy, 84.80% compared with 82.34% for J48.

The Application of Data Mining Technology in the Teaching Evaluation in Colleges and Universities

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2017.6115 ◽

2017 ◽

Vol 14 (1) ◽

pp. 7-12 ◽

Cited By ~ 2

Author(s):

Xiaoqi Liu

Keyword(s):

Data Mining ◽

Decision Tree ◽

Evaluation System ◽

Information Gain ◽

Original Data ◽

Teaching Evaluation ◽

Personal Factors ◽

Mining Technology ◽

Id3 Algorithm ◽

College Work

As teaching management becomes increasingly computerized, network-based teaching evaluation systems have been widely adopted, and a large volume of original evaluation data has accumulated. This research takes the last five years of teaching evaluation data from college work as its basis and analyzes teachers’ personal factors and teaching operation factors separately, using the ID3 decision tree algorithm. By calculating the information entropy and information gain of each factor, the corresponding decision tree is obtained. The teaching evaluation results are thus put to real use rather than remaining a mere formality, providing a sound basis for the effectiveness and scientific rigor of teaching evaluation.
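The entropy and information gain calculations mentioned in the abstract follow the standard ID3 definitions, sketched here for reference. The symbols are assumptions introduced for illustration: $S$ is the set of evaluation records, $p_c$ the proportion of records in class $c$, $A$ a candidate factor, and $S_v$ the subset of $S$ where $A$ takes value $v$.

```latex
H(S) = -\sum_{c} p_c \log_2 p_c
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
```

At each node, ID3 splits on the factor with the largest $\mathrm{Gain}(S, A)$ and repeats the calculation on each resulting subset.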

COMPARATIVE STUDY OF MACHINE LEARNING KNN, SVM, AND DECISION TREE ALGORITHM TO PREDICT STUDENT’S PERFORMANCE

International Journal of Research -GRANTHAALAYAH ◽

10.29121/granthaalayah.v7.i1.2019.1048 ◽

2019 ◽

Vol 7 (1) ◽

pp. 190-196

Author(s):

Slamet Wiyono ◽

Taufiq Abidin

Keyword(s):

Decision Tree ◽

Student Performance ◽

Prediction Accuracy ◽

Model Building ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Svm Algorithm ◽

Tree Algorithms ◽

Student’S Performance ◽

Predicting Student Performance

Inactive students reduce the number of students who graduate on time. Inactivity can be prevented by predicting student performance. This study compared the KNN, SVM, and Decision Tree algorithms to find the best predictive model. The modeling process followed these steps: data collection, pre-processing, model building, model comparison, and evaluation. The results show that the SVM algorithm gives the best predictions, with a precision value of 95%. The Decision Tree algorithm has a prediction accuracy of 93% and the KNN algorithm has a prediction accuracy of 92%.
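The abstract reports precision for one model and accuracy for the others. These are different metrics, as the following minimal sketch shows. The confusion-matrix counts are hypothetical, chosen only to illustrate the arithmetic, and are not taken from the study.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Share of positive predictions that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical counts: true positives, false positives, true negatives, false negatives.
tp, fp, tn, fn = 38, 2, 50, 10

print(f"accuracy  = {accuracy(tp, fp, tn, fn):.2f}")
print(f"precision = {precision(tp, fp):.2f}")
```

With these counts the same classifier has a precision of 0.95 but an accuracy of 0.88, so figures for different models are only comparable when they refer to the same metric.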

Penerapan Algoritma ID3 dalam Prediksi Kebutuhan Pupuk (Application of the ID3 Algorithm in Predicting Fertilizer Needs)

Journal of Information System Research (JOSH) ◽

10.47065/josh.v2i4.565 ◽

2021 ◽

Vol 2 (4) ◽

pp. 247-253

Author(s):

Milyani Aritonang

Keyword(s):

Data Mining ◽

Decision Tree ◽

Plant Protection ◽

Organic Fertilizer ◽

Urea Fertilizer ◽

Id3 Algorithm ◽

Development Unit ◽

Npk Fertilizer

Fertilizer needs at the Plant Protection Development Unit (UPPT) are uncertain because they depend on demand from farmers, so they need to be predicted. The UPPT predicts needs for five types of fertilizer: Urea, ZA, SP-36, NPK, and Organic. The prediction applies data mining with the ID3 algorithm, which calculates entropy and gain values to produce a final result in the form of a decision tree and rules. Testing is done using the Tanagra software. Both the tests run in Tanagra with the ID3 algorithm and the calculation itself produce a decision tree.
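How entropy and gain lead to a tree and a set of rules can be sketched in a few lines. This is a minimal illustration of textbook ID3 on categorical data, not the paper's implementation or Tanagra's; the attribute names (`season`, `demand`, `need`) and the records are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, attr, target):
    """Information gain obtained by splitting rows on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def id3(rows, attrs, target):
    """Return a nested dict tree; leaves are class labels."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # pure node or majority vote
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}

def rules(tree, path=()):
    """Flatten a tree into (conditions, label) pairs, one per leaf."""
    if not isinstance(tree, dict):
        return [(path, tree)]
    (attr, branches), = tree.items()
    out = []
    for value, subtree in branches.items():
        out += rules(subtree, path + ((attr, value),))
    return out

# Hypothetical records (assumption for illustration only).
rows = [
    {"season": "wet", "demand": "high", "need": "large"},
    {"season": "wet", "demand": "high", "need": "large"},
    {"season": "wet", "demand": "low", "need": "small"},
    {"season": "dry", "demand": "high", "need": "small"},
    {"season": "dry", "demand": "high", "need": "small"},
    {"season": "dry", "demand": "low", "need": "small"},
]

tree = id3(rows, ["season", "demand"], "need")
for conditions, label in rules(tree):
    print(" AND ".join(f"{a}={v}" for a, v in conditions), "->", label)
```

Each root-to-leaf path becomes one rule, which is how a decision tree yields the rule set the abstract refers to.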

Classification Methods in the Detection of New Suspicious Emails

Journal of Information & Knowledge Management ◽

10.1142/s0219649208002044 ◽

2008 ◽

Vol 07 (03) ◽

pp. 209-217 ◽

Cited By ~ 2

Author(s):

S. Appavu Alias Balamurugan ◽

G. Athiappan ◽

M. Muthu Pandian ◽

R. Rajaram

Keyword(s):

Neural Network ◽

Data Mining ◽

Decision Tree ◽

Binary Tree ◽

Experimental Results ◽

Classification Methods ◽

Detection Rates ◽

Naive Bayesian ◽

The Past ◽

Naïve Bayesian

Email has become one of the fastest and most economical forms of communication. However, the growth in email users has led to a dramatic increase in suspicious emails over the past few years. This paper proposes applying classification-based data mining to the task of suspicious email detection, based on deception theory. Email data was classified using four different classifiers (Neural Network, SVM, Naïve Bayesian and Decision Tree). The experiment was performed in Weka on data sets of different sizes drawn from the email corpus. Experimental results show that a simple ID3 classifier, which builds a binary tree, gives promising detection rates.

Student Performance Predictions Using Knowledge Discovery Database and Data Mining, DPU Students Records as Sample

Academic Journal of Nawroz University ◽

10.25007/ajnu.v10n3a875 ◽

2021 ◽

Vol 10 (3) ◽

pp. 121-127

Author(s):

Bareen Haval ◽

Karwan Jameel Abdulrahman ◽

Araz Rajab

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Educational Data Mining ◽

Data Sets ◽

Decision Tree Classifier ◽

Data Mining Techniques ◽

Academic History ◽

Tree Classifier ◽

Using Data

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest and Deep Learning) were developed to analyze the data sets and predict student performance. The predictions of the three classifiers were calculated and compared. The academic history and data of the students from the Office of the Registrar were used to train the models. The analysis evaluates student results using variables such as the student's grade. Data from 221 students with 9 different attributes were used. The results provide a better understanding of student success assessments and stress the importance of data mining in education. The main purpose of this study is to show how student success can be forecast using data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.

DEVELOPING A PARALLEL CLASSIFIER FOR MINING IN BIG DATA SETS

IIUM Engineering Journal ◽

10.31436/iiumej.v22i2.1541 ◽

2021 ◽

Vol 22 (2) ◽

pp. 119-134

Author(s):

Ahad Shamseen ◽

Morteza Mohammadi Zanjireh ◽

Mahdi Bahaghighat ◽

Qin Xin

Keyword(s):

Data Mining ◽

Big Data ◽

Decision Tree ◽

Main Memory ◽

Experimental Results ◽

Primary Data ◽

Data Sets ◽

Decision Tree Classifier ◽

Vast Amount ◽

Tree Classifier

Data mining is the extraction of information and its roles from vast amounts of data, and it is one of the most important topics today. Massive amounts of data are generated and stored each day. This data holds useful information in different fields, which attracts the attention of programmers and engineers. One of the primary data mining classification algorithms is the decision tree. Decision tree techniques have several advantages but also drawbacks. One of the main drawbacks is that the data must reside in main memory. SPRINT is a decision tree classifier that proposed a fix for this problem. In this paper, we developed a new parallel decision tree classifier by building on the results of SPRINT. Our experimental results show considerable improvements in runtime and memory requirements compared with the SPRINT classifier. The proposed classifier can be implemented in serial and parallel environments and can deal with big data.

Developed third iterative dichotomizer based on feature decisive values for educational data mining

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i1.pp209-217 ◽

2020 ◽

Vol 18 (1) ◽

pp. 209

Author(s):

Saja Taha Ahmed ◽

Rafah Al-Hamdani ◽

Muayad Sadik Croock

Keyword(s):

Data Mining ◽

Feature Selection ◽

Decision Tree ◽

Predictive Analytics ◽

Educational Data Mining ◽

Target Class ◽

Id3 Algorithm ◽

Feature Weight ◽

Holdout Validation ◽

Fold Cross Validation

Decision trees have recently become one of the most widely used classification models. They owe their popularity to their efficiency in predictive analytics, their ease of interpretation, and their implicit feature selection. This last property is of essential significance in Educational Data Mining (EDM), where selecting the most relevant features has a major impact on classification accuracy. The main contribution is a new multi-objective decision tree that can be used for both feature selection and classification. The proposed Decisive Decision Tree (DDT) is constructed from a decisive feature value that acts as a feature weight related to the target class label. The traditional Iterative Dichotomizer 3 (ID3) algorithm and the proposed DDT are compared on three datasets with respect to known ID3 issues, including the complexity of logarithmic calculations and the selection of multi-valued features. The results indicate that the proposed DDT outperforms ID3 in development time. Classification accuracy improves under 10-fold cross-validation for all datasets, with the highest accuracy achieved by the proposed method being 92% on the student.por dataset, and under holdout validation for two datasets, Iraqi and Student-Math. The experiment also shows that the proposed DDT tends to select attributes that are important rather than attributes that merely have many values.
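The two validation schemes named in the abstract differ in how records are assigned to training and test sets. The following minimal sketch shows the index bookkeeping only; the function names, the 30% test fraction, and the fixed seed are assumptions for illustration and are not taken from the paper.

```python
import random

def holdout_split(n, test_fraction=0.3, seed=0):
    """Single split: one training set and one test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * (1 - test_fraction))
    return idx[:cut], idx[cut:]

def k_fold_splits(n, k=10, seed=0):
    """Yield k (train, test) pairs; each record is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

train, test = holdout_split(25)
print("holdout:", len(train), "train /", len(test), "test")

for fold_no, (train, test) in enumerate(k_fold_splits(25), start=1):
    print(f"fold {fold_no}: {len(train)} train / {len(test)} test")
```

In k-fold cross-validation the reported accuracy is typically the average over the k test folds, which makes it less sensitive to one lucky or unlucky split than a single holdout estimate.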

Appraisal of the Classification Technique in Data Mining of Student Performance using J48 Decision Tree, K-Nearest Neighbor and Multilayer Perceptron Algorithms

International Journal of Computer Applications ◽

10.5120/ijca2018916751 ◽

2018 ◽

Vol 179 (33) ◽

pp. 39-46 ◽

Cited By ~ 1

Author(s):

Faiza Umar ◽

Najim Ussiph

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Multilayer Perceptron ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Classification Technique ◽

J48 Decision Tree

Entropy based C4.5-SHO algorithm with information gain optimization in data mining

PeerJ Computer Science ◽

10.7717/peerj-cs.424 ◽

2021 ◽

Vol 7 ◽

pp. e424

Author(s):

G Sekhar Reddy ◽

Suneetha Chittineni

Keyword(s):

Data Mining ◽

Decision Tree ◽

Information Gain ◽

Characteristic Curve ◽

Cuckoo Search ◽

Computer Assisted ◽

Quadratic Entropy ◽

C4.5 Decision Tree ◽

Data Investigation ◽

Gain Optimization

Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-making to understand the behavior of each attribute. Therefore, a new classification algorithm is introduced in this paper to improve information management. The classical C4.5 decision tree approach is combined with the Selfish Herd Optimization (SHO) algorithm to tune the gain of given datasets. The optimal weights for the information gain will be updated based on SHO. Further, the dataset is partitioned into two classes based on quadratic entropy calculation and information gain. Decision tree gain optimization is the main aim of our proposed C4.5-SHO method. The robustness of the proposed method is evaluated on various datasets and compared with classifiers, such as ID3 and CART. The accuracy and area under the receiver operating characteristic curve parameters are estimated and compared with existing algorithms like ant colony optimization, particle swarm optimization and cuckoo search.
