Applicability of Traditional Classification Techniques on Educational Data

Predicting and analyzing student performance is an essential task for higher educational institutions and contributes to the overall improvement of the educational system. Traditional Data Mining (DM) techniques such as regression and classification are widely used to analyze data from educational settings; the application of DM to academic data is called Educational Data Mining (EDM). This pilot study examines the applicability of five standalone classification techniques: Decision Tree, BayesNet, Nearest Neighbor, Rule-Based, and Random Forest (RF). The techniques are implemented with the WEKA tool on a standard dataset containing students' academic information and background. The paper also applies feature selection to identify the most influential features in the dataset, which reduces dimensionality and improves classifier accuracy. The classifiers are compared using standard statistical measures such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Kappa. The results show that classification algorithms are applicable to student performance prediction, which can help under-achieving and struggling students to improve. The J48 decision tree algorithm gave the best results. The comparative analysis further indicates that individual classifiers achieve different accuracy on the same dataset because of class imbalance in the multiclass dataset.
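The three measures named in the abstract can be illustrated with a minimal Python sketch. This is not the study's WEKA workflow: the function names and the toy labels and scores below are hypothetical, chosen only to show how Kappa, MAE, and RMSE are computed from actual versus predicted values.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(actual, predicted):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: both labelings pick the same class independently.
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class everywhere
    return (observed - expected) / (1.0 - expected)


def mae(actual, predicted):
    """Mean Absolute Error over paired numeric values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error over paired numeric values."""
    return sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical toy data, not taken from the paper.
y_true = ["pass", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "fail", "pass", "pass"]
kappa = cohen_kappa(y_true, y_pred)

# Assumed numeric targets (1.0 = pass) and predicted pass probabilities.
targets = [1.0, 0.0, 1.0, 0.0]
scores = [0.9, 0.2, 0.6, 0.4]
mae_value = mae(targets, scores)
rmse_value = rmse(targets, scores)

print(kappa, mae_value, rmse_value)
```

Kappa is useful alongside accuracy on imbalanced multiclass data because it discounts the agreement a classifier would reach by chance.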

Student Academic Performance Prediction using Supervised Learning Techniques

International Journal of Emerging Technologies in Learning (iJET) ◽

10.3991/ijet.v14i14.10310 ◽

2019 ◽

Vol 14 (14) ◽

pp. 92 ◽

Cited By ~ 1

Author(s):

Muhammad Imran ◽

Shahzad Latif ◽

Danish Mehmood ◽

Muhammad Saqlain Shah

Keyword(s):

Data Mining ◽

Supervised Learning ◽

Student Performance ◽

Performance Prediction ◽

Class Imbalance ◽

Ensemble Methods ◽

Fine Tuning ◽

Classification Error ◽

Decision Tree Classifier ◽

Tree Classifier

Automatic student performance prediction is a crucial task because of the large volume of data in educational databases. Educational data mining (EDM) addresses this task by developing methods for exploring data derived from educational environments; these methods are used to understand students and their learning environment. Educational institutions often want to know how many students will pass or fail so that they can make the necessary arrangements. Previous studies show that many researchers concentrate on selecting an appropriate classification algorithm and ignore problems that arise during the data mining phases, such as high dimensionality, class imbalance, and classification error. Such problems reduce the accuracy of the model. Several well-known classification algorithms have been applied in this domain; this paper proposes a student performance prediction model based on a supervised learning decision tree classifier. In addition, an ensemble method is applied to improve the performance of the classifier. Ensemble methods are designed to solve classification and prediction problems. The study demonstrates the importance of data preprocessing and algorithm fine-tuning in resolving data quality issues. The experimental dataset, obtained from the UCI Machine Learning Repository, comes from the Alentejo region of Portugal. Three supervised learning algorithms (J48, NNge, and MLP) are employed in the experiments. The results show that J48 achieved the highest accuracy, 95.78%.
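The abstract does not say which ensemble method was used. Purely as an illustration of the general idea, the sketch below combines the outputs of several classifiers by majority vote; the function name and the three lists of predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by plurality vote.

    predictions_per_model: one list of labels per model, all of the
    same length and in the same sample order.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest count;
        # on a tie, the label seen first among the votes wins.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three classifiers for four students.
model_a = ["pass", "fail", "pass", "fail"]
model_b = ["pass", "pass", "pass", "fail"]
model_c = ["fail", "fail", "pass", "pass"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of models and two classes, ties cannot occur, which is one reason voting ensembles often use three or five members.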

Appraisal of the Classification Technique in Data Mining of Student Performance using J48 Decision Tree, K-Nearest Neighbor and Multilayer Perceptron Algorithms

International Journal of Computer Applications ◽

10.5120/ijca2018916751 ◽

2018 ◽

Vol 179 (33) ◽

pp. 39-46 ◽

Cited By ~ 1

Author(s):

Faiza Umar ◽

Najim Ussiph

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Multilayer Perceptron ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Classification Technique ◽

J48 Decision Tree

Comparative Study of Data Mining Classifiers for Students’ Academic Performance

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9277 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4548-4552

Author(s):

Vikas Rattan ◽

Ruchi Mittal ◽

Varun Malik

Keyword(s):

Data Mining ◽

Random Forest ◽

Decision Tree ◽

Comparative Study ◽

Student Performance ◽

Nearest Neighbor ◽

Educational Institutions ◽

Competitive Edge ◽

Data Mining Tool ◽

Mining Tool

The rapid growth in the number of educational institutions has pushed them to adopt data mining techniques to uncover important, previously unknown facts from educational data and so gain a competitive edge over their counterparts. In this paper, a student performance dataset comprising 131 records is taken from the UCI repository, and the data mining tool Orange is used to compare the accuracy of four classifiers, namely random forest, k-nearest neighbor (KNN), decision tree, and naïve Bayes, in classifying the performance of students in graduation. The results show that the decision tree achieves the highest accuracy among the classifiers.

Modelling Student’s Performance Using Data Mining Techniques in a Higher Learning Environment in the Pacific

International Journal of Neural Networks and Advanced Applications ◽

10.46300/91016.2020.7.10 ◽

2020 ◽

Vol 7 ◽

Keyword(s):

Neural Network ◽

Data Mining ◽

Statistical Analysis ◽

Decision Tree ◽

Student Performance ◽

Demographic Data ◽

Classification Model ◽

Classification Techniques ◽

The Pacific ◽

Under Sampling

Student performance in higher education has become one of the most widely studied areas. Modelling student performance plays a pivotal role in forecasting it, and data mining applications are becoming the most widely used techniques for this purpose. Various factors determine student performance. Eight attributes, considered the most influential in determining students’ performance in the Pacific, are used as input. Statistical analysis is carried out to see which attribute has the greatest influence on student performance. In this research, different algorithms are used to build the classification model, each using a different classification technique. The techniques used include Artificial Neural Network, Decision Tree, Decision Table, and Naïve Bayes. The WEKA Explorer application and R software are used for correlation tests between variables. The dataset is imbalanced and is later transformed into a balanced set through under-sampling. Neural Network is one of the classification techniques that performed well on both the imbalanced and the balanced dataset; Decision Tree also performed well. Statistical analysis shows that internal assessment has a weak positive relationship with student performance, while demographic data does not. Further observations are reported on the two types of dataset across the different classification techniques.
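The balancing step can be sketched as random under-sampling, in which every class is reduced to the size of the smallest class. The abstract does not state which under-sampling variant was applied, so the function below, its name, and the toy data are assumptions for illustration only.

```python
import random
from collections import defaultdict


def undersample(records, labels, seed=0):
    """Randomly drop records so every class has as many as the rarest one."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(list)
    for record, label in zip(records, labels):
        by_class[label].append(record)

    # Size of the smallest class sets the target for all classes.
    target = min(len(group) for group in by_class.values())

    out_records, out_labels = [], []
    for label, group in by_class.items():
        for record in rng.sample(group, target):
            out_records.append(record)
            out_labels.append(label)
    return out_records, out_labels


# Hypothetical imbalanced data: 7 "pass" records and 3 "fail" records.
records = list(range(10))
labels = ["pass"] * 7 + ["fail"] * 3

balanced_records, balanced_labels = undersample(records, labels)
print(balanced_records, balanced_labels)
```

Under-sampling discards majority-class records, so on a small dataset it trades class balance for a reduced training set.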

Student performance prediction based on data mining classification techniques

Nigerian Journal of Technology ◽

10.4314/njt.v37i4.31 ◽

2018 ◽

Vol 37 (4) ◽

pp. 1087 ◽

Cited By ~ 1

Author(s):

Y.K. Saheed ◽

T.O. Oladele ◽

A.O. Akanni ◽

W.M. Ibrahim

Keyword(s):

Data Mining ◽

Student Performance ◽

Performance Prediction ◽

Classification Techniques

Student Performance Prediction using Online Behavior Discussion Forum with Data Mining Techniques

Proceedings of the Borneo International Conference on Education and Social Sciences ◽

10.5220/0009017000900095 ◽

2018 ◽

Author(s):

Febrianti Widyahastuti ◽

Viany Utami Tjhin

Keyword(s):

Data Mining ◽

Student Performance ◽

Performance Prediction ◽

Discussion Forum ◽

Online Behavior ◽

Data Mining Techniques

Student Performance Predictions Using Knowledge Discovery Database and Data Mining, DPU Students Records as Sample

Academic Journal of Nawroz University ◽

10.25007/ajnu.v10n3a875 ◽

2021 ◽

Vol 10 (3) ◽

pp. 121-127

Author(s):

Bareen Haval ◽

Karwan Jameel Abdulrahman ◽

Araz Rajab

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Educational Data Mining ◽

Data Sets ◽

Decision Tree Classifier ◽

Data Mining Techniques ◽

Academic History ◽

Tree Classifier ◽

Using Data

This article presents the results of applying educational data mining techniques to students' academic performance. Three classification models (Decision Tree, Random Forest, and Deep Learning) were developed to analyze data sets and predict student performance. The predictive performance of the three classifiers was calculated and compared. The students' academic history and data from the Office of the Registrar were used to train the models. The analysis aims to evaluate student results using various variables, such as the student's grade. Data from 221 students with 9 attributes were used. The results are important: they provide a better understanding of student success assessment and underline the importance of data mining in education. The main purpose of the study is to show how student success can be forecast with data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.

Role of FCBF Feature Selection in Educational Data Mining

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.2004.09 ◽

2020 ◽

Vol 39 (4) ◽

pp. 772-778

Author(s):

Maryam Zaffar ◽

Manzoor Ahmad Hashmani ◽

K.S. Savita ◽

Syed Sajjad Hussain Rizvi ◽

Mubashar Rehman

Keyword(s):

Data Mining ◽

Feature Selection ◽

Prediction Model ◽

Student Performance ◽

Performance Prediction ◽

Prediction Models ◽

Educational Data Mining ◽

Action Plans ◽

Factors Affecting ◽

Academic Organization

Educational Data Mining (EDM) is a very active area of Data Mining (DM) and is helpful in predicting student performance. Student performance prediction is important not only for students but also for academic organizations, which can use it to detect the causes of students’ success and failure. Furthermore, the features selected by student performance prediction models help in developing action plans for academic welfare. Feature selection can increase the accuracy of the prediction model. In a student performance prediction model every feature matters, because neglecting an important feature can lead to badly designed academic action plans. Feature selection is therefore a very important step in developing such models. There are different types of feature selection algorithms; in this paper, the Fast Correlation-Based Filter (FCBF) is selected. The paper is a step toward identifying the factors that affect students’ academic performance. The performance of FCBF is evaluated on three different student datasets. FCBF is found to perform well on the student dataset with a greater number of features.
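FCBF ranks features by symmetrical uncertainty (SU), a normalized form of information gain between a feature and the class. The sketch below computes only that measure for discrete values; it is not the full FCBF procedure (which also removes redundant features), and the function names and toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy H(X) in bits for a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(x, y):
    """H(X | Y): entropy of x within each group of equal y, weighted."""
    n = len(x)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X | Y) / (H(X) + H(Y)), in the range [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    info_gain = hx - conditional_entropy(x, y)
    return 2.0 * info_gain / (hx + hy)


# Hypothetical discrete feature columns and class labels.
outcome = ["pass", "pass", "fail", "fail"]
attendance = ["high", "high", "low", "low"]   # mirrors the class
shoe_size = ["big", "small", "big", "small"]  # unrelated to the class

su_attendance = symmetrical_uncertainty(attendance, outcome)
su_shoe_size = symmetrical_uncertainty(shoe_size, outcome)
print(su_attendance, su_shoe_size)
```

A feature that determines the class completely scores 1, and a feature independent of the class scores 0, so SU gives a scale on which features can be ranked.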

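Analisis Komparatif Evaluasi Performa Algoritma Klasifikasi pada Readmisi Pasien Diabetes [Comparative Analysis of Classification Algorithm Performance Evaluation for Diabetic Patient Readmission]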

Jurnal Buana Informatika ◽

10.24002/jbi.v7i4.770 ◽

2016 ◽

Vol 7 (4) ◽

Author(s):

Mochammad Yusa ◽

Ema Utami ◽

Emha T. Luthfi

Keyword(s):

Data Mining ◽

Decision Tree ◽

Cross Validation ◽

Nearest Neighbor ◽

Naive Bayes ◽

Kappa Statistic ◽

Naïve Bayes ◽

Validation Dataset ◽

K Nearest Neighbor ◽

Fold Cross Validation

Abstract. Readmission is associated with quality measures for patients in hospitals. The variety of attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, and age, tends to make the calculation of quality of care complicated. Data mining classification techniques can address this problem. In this paper, three classifiers, Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using 10-fold cross-validation. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE. Keywords: k-NN, naive bayes, diabetes, readmission

Abstrak. The readmission process is associated with measuring the quality of patient care in hospitals. Differences in the attributes related to diabetic patients, such as the medication process, ethnicity, race, lifestyle, and age, tend to make the quality calculation complicated. Data mining classification techniques can offer a solution for this calculation. Classification is one of the data mining techniques that has developed quite significantly. In this study, the Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes classification algorithms are evaluated under various parameter settings on Accuracy, Mean Absolute Error (MAE), and Kappa Statistic using 10-fold cross-validation. The evaluated dataset has 47 attributes and 49,735 records. The results show that the best accuracy, MAE, and Kappa Statistic were obtained with the Naive Bayes algorithm. Keywords: k-NN, naive bayes, diabetes, readmission
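The 10-fold cross-validation protocol mentioned in the abstract can be sketched as an index split: the records are shuffled once, divided into ten folds, and each fold serves as the test set exactly once. The helper name and the sample count of 25 are hypothetical, and stratification by class is not shown.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset size; a classifier would be fit on `train`
# and scored on `test` inside this loop.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Averaging a metric such as accuracy or Kappa over the ten test folds gives an estimate that uses every record for testing once and for training nine times.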

Student Performance Prediction Using Algorithms of Data Mining

2018 International Conference on Computing, Engineering, and Design (ICCED) ◽

10.1109/icced.2018.00055 ◽

2018 ◽

Cited By ~ 1

Author(s):

Abid Jamil ◽

Muhammad Ahsan ◽

Tahir Farooq ◽

Amir Hussain ◽

Rehan Ashraf

Keyword(s):

Data Mining ◽

Student Performance ◽

Performance Prediction
