Educational Data Mining for Student Learning Pattern Analysis using Clustering Algorithms

The exponential growth of universities' electronic data creates a need to derive useful information from it. Advances in data mining make it possible to use educational data to improve the quality of educational processes. This study therefore uses data mining methods to examine the learning behavior and performance of university students. It focuses on two aspects of student performance: first, predicting students' learning behavior at the end of a complete year of the study program, and second, predicting student performance with the data model proposed in the study. Finally, it provides course material recommendations using a data mining algorithm. Three data mining algorithms were considered: K-Means, FCM, and KFCM. KFCM achieved the maximum accuracy of 90.22%, while K-Means gave better results in terms of time and memory usage. These results create an opportunity to identify students who may graduate with poor results or not graduate at all, so that early intervention becomes possible.
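
As a rough illustration of the soft clustering that FCM performs, the sketch below implements the textbook fuzzy c-means update in plain Python. It is not the study's implementation: the two features (attendance and average grade, both scaled to 0..1), the sample points, the initial centers, and the fuzzifier m = 2 are all assumptions chosen for the example.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with at least one center: share full
            # membership among those centers and give none to the rest.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships

def fcm_centers(points, memberships, m=2.0):
    """Recompute each center as the membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)

# Hypothetical students: (attendance, average grade), both scaled to 0..1.
students = [(0.9, 0.85), (0.95, 0.9), (0.85, 0.8),
            (0.3, 0.4), (0.2, 0.35), (0.25, 0.3)]
centers, u = fuzzy_c_means(students, [(1.0, 1.0), (0.0, 0.0)])
for point, row in zip(students, u):
    print(point, [round(x, 3) for x in row])
```

Unlike K-Means, which assigns each student to exactly one cluster, every row of memberships here sums to 1 across clusters, so a borderline student can belong partly to both groups.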

An Efficient Clustering Approach for Automatic Detection of Calcification in Low Dose Chest CT

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195231 ◽

2019 ◽

pp. 163-168

Author(s):

P. Tamijiselvy ◽

N. Kavitha ◽

K. M. Keerthana ◽

D. Menakha

Keyword(s):

Data Mining ◽

Low Dose ◽

Early Stage ◽

Clustering Algorithms ◽

Automatic Detection ◽

Chest CT ◽

Data Mining Algorithm ◽

Fuzzy C Means ◽

Data Mining Algorithms ◽

Using Data

The degree of aortic calcification has been shown to be a risk indicator for vascular events, including cardiovascular events. The method developed here is a fully automated data mining algorithm that segments and measures calcification in low-dose chest CT scans of smokers aged 50 to 70. Subjects with increased cardiovascular risk can be identified using data mining algorithms. This paper presents a method for the automatic detection of coronary artery calcifications in low-dose chest CT scans using effective clustering algorithms in three phases: pre-processing, segmentation, and clustering. The Fuzzy C-Means algorithm achieves an accuracy of 80.23%, demonstrating that it can detect cardiovascular disease at an early stage.

Predicting Student Failure in University Examination using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.e2643.039520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 956-959

Keyword(s):

Machine Learning ◽

Data Mining ◽

Performance Management ◽

Student Performance ◽

Learning Algorithms ◽

Educational Data Mining ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Social Characteristics ◽

Student Failure

Student performance management is one of the key pillars of higher education institutions, since it directly affects students' career prospects and college rankings. This paper follows the path of learning analytics and educational data mining by applying machine learning techniques to student data in order to identify students who are more likely to fail university examinations, so that the interventions needed to improve their performance can be provided. The paper uses a data mining approach with 10-fold cross-validation to classify students based on predictors drawn from their demographic and social characteristics. It compares five popular machine learning algorithms (REPTree, JRip, Random Forest, Random Tree, and Naive Bayes) on overall classifier accuracy as well as on class-specific indicators, i.e. precision, recall, and F-measure. The results showed that the REPTree algorithm outperformed the other machine learning algorithms in classifying students who are more likely to fail the examinations.
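
The evaluation protocol described above can be sketched in a few lines of plain Python: a splitter that produces 10 folds in which every sample is tested exactly once, and the class-specific indicators computed for the "fail" class. The function names, the label strings, and the toy predictions are assumptions for illustration and are not taken from the paper.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def precision_recall_f1(y_true, y_pred, positive="fail"):
    """Class-specific indicators for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical labels and predictions for five students.
y_true = ["fail", "fail", "pass", "pass", "fail"]
y_pred = ["fail", "pass", "pass", "fail", "fail"]
print(precision_recall_f1(y_true, y_pred))

# Fold sizes for a hypothetical data set of 25 students.
print([len(test) for _, test in k_fold_indices(25, k=10)])
```

In practice the records are usually shuffled (and often stratified by class) before being split into folds; that step is left out here to keep the sketch short.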

Evaluation of Clustering Methods for Adaptive Learning Systems

Artificial Intelligence Applications in Distance Education - Advances in Mobile and Distance Learning ◽

10.4018/978-1-4666-6276-6.ch014 ◽

2015 ◽

pp. 237-260 ◽

Cited By ~ 1

Author(s):

Wilhelmiina Hämäläinen ◽

Ville Kumpulainen ◽

Maxim Mozgovoy

Keyword(s):

Data Mining ◽

Adaptive Learning ◽

Clustering Algorithms ◽

Educational Data Mining ◽

Optimal Choice ◽

Learning Systems ◽

Learning Tools ◽

Clustering Methods ◽

Central Task ◽

Adaptive Learning Systems

Clustering student data is a central task in educational data mining and in the design of intelligent learning tools. The problem is that there are thousands of clustering algorithms but no general guidelines about which method to choose. The optimal choice is of course problem- and data-dependent and can seldom be found without trying several methods. Still, the purposes of clustering students and the typical features of educational data make certain clustering methods more suitable or attractive. In this chapter, the authors evaluate the main clustering methods from this perspective. Based on the analysis, the authors suggest the most promising clustering methods for different situations.

Mining Students' Learning Behavior in Moodle System

Journal of Information Technology Research ◽

10.4018/jitr.2014100102 ◽

2014 ◽

Vol 7 (4) ◽

pp. 12-26 ◽

Cited By ~ 2

Author(s):

K. Touya ◽

Mohamed Fakir

Keyword(s):

Data Mining ◽

Web Application ◽

Educational Data Mining ◽

Educational Environment ◽

Learning Behavior ◽

System Database ◽

Hidden Knowledge ◽

E Learning ◽

Mining Algorithms ◽

Interesting Area

In the last few years, Educational Data Mining has become an active area for discovering and extracting hidden knowledge about students from educational environment data. This work attempts to manage the extracted information using mining techniques, which were applied to obtain groups of students with similar characteristics. Applying classification, clustering, and association rule mining algorithms to the data stored in the e-learning (Moodle system) database made it possible to extract knowledge that helps to understand students' behaviors and patterns. Additionally, a Web application was developed for educators as a tool to monitor their students' learning behavior by tracking the number of assignments taken, the number of quizzes taken, the number of forum posts made and read by students, etc. The knowledge obtained can help instructors make decisions about how their students interact with course activities in the Moodle system and create an efficient educational environment. In this research, a data mining tool called RapidMiner was used to mine the data from the Moodle system database, and a web application written in PHP was built to provide teachers with statistics.

BioMedR: an R/CRAN package for integrated data analysis pipeline in biomedical study

Briefings in Bioinformatics ◽

10.1093/bib/bbz150 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jie Dong ◽

Min-Feng Zhu ◽

Yong-Huan Yun ◽

Ai-Ping Lu ◽

Ting-Jun Hou ◽

...

Keyword(s):

Data Mining ◽

Clustering Algorithms ◽

R Package ◽

Integrated Analysis ◽

Analysis Pipeline ◽

Molecular Fingerprints ◽

Useful Knowledge ◽

Data Mining Algorithms ◽

Mining Methods ◽

Mining Algorithms

Background: With the increasing development of biotechnology and information technology, publicly available data in chemistry and biology are undergoing explosive growth. The wealth of information in these resources needs to be extracted and then transformed into useful knowledge by various data mining methods. However, a main computational challenge is how to effectively represent or encode the molecular objects under investigation, such as chemicals, proteins, DNAs, and even complicated interactions, when data mining methods are employed. To further explore these complicated data, an integrated toolkit that represents different types of molecular objects and supports various data mining algorithms is urgently needed. Results: We developed a freely available R/CRAN package, called BioMedR, for molecular representations of chemicals, proteins, DNAs, and pairwise samples of their interactions. The current version of BioMedR can calculate 293 molecular descriptors and 13 kinds of molecular fingerprints for small molecules, 9920 protein descriptors based on protein sequences and six types of generalized scale-based descriptors for proteochemometric modeling, more than 6000 DNA descriptors from nucleotide sequences, and six types of interaction descriptors using three different combining strategies. Moreover, the package implements five similarity calculation methods and four powerful clustering algorithms as well as several useful auxiliary tools, with the aim of building an integrated analysis pipeline for data acquisition, data checking, descriptor calculation, and data modeling. Conclusion: BioMedR provides a comprehensive and uniform R package that links different representations of molecular objects with each other and will benefit cheminformatics/bioinformatics and other biomedical users. It is available at: https://CRAN.R-project.org/package=BioMedR and https://github.com/wind22zhu/BioMedR/.
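
The abstract does not name the five similarity calculation methods, so the following is only a generic illustration of what a similarity measure on molecular fingerprints looks like. It uses the Tanimoto (Jaccard) coefficient, a common choice for binary fingerprints, written in Python rather than R; the function name and the example bit positions are hypothetical and are not part of the BioMedR API.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as 'on' bit positions."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(union)

# Hypothetical fingerprints: the positions of the bits that are set.
molecule_x = {1, 2, 3, 4}
molecule_y = {3, 4, 5, 6}
print(tanimoto(molecule_x, molecule_y))  # 2 shared bits out of 6 distinct bits
```

A pairwise matrix of such similarities (or the corresponding distances, 1 minus the similarity) is the usual input to the kind of clustering step the package describes.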

The k-means Algorithm: A Comprehensive Survey and Performance Evaluation

Electronics ◽

10.3390/electronics9081295 ◽

2020 ◽

Vol 9 (8) ◽

pp. 1295 ◽

Cited By ~ 4

Author(s):

Mohiuddin Ahmed ◽

Raihan Seraj ◽

Syed Mohammed Shamsul Islam

Keyword(s):

Experimental Analysis ◽

Clustering Algorithm ◽

Fundamental Problem ◽

Clustering Algorithms ◽

Data Types ◽

Data Mining Algorithms ◽

Recent Developments ◽

Comprehensive Survey ◽

And Performance ◽

Mining Algorithms

The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community. However, despite its popularity, the algorithm has certain limitations, including problems associated with random initialization of the centroids which leads to unexpected convergence. Additionally, such a clustering algorithm requires the number of clusters to be defined beforehand, which is responsible for different cluster shapes and outlier effects. A fundamental problem of the k-means algorithm is its inability to handle various data types. This paper provides a structured and synoptic overview of research conducted on the k-means algorithm to overcome such shortcomings. Variants of the k-means algorithms including their recent developments are discussed, where their effectiveness is investigated based on the experimental analysis of a variety of datasets. The detailed experimental analysis along with a thorough comparison among different k-means clustering algorithms differentiates our work compared to other existing survey papers. Furthermore, it outlines a clear and thorough understanding of the k-means algorithm along with its different research directions.
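
The sensitivity to centroid initialization mentioned above can be seen on a very small example. The sketch below is a plain-Python, one-dimensional version of the standard Lloyd iteration; the data values and the two sets of starting centroids are made up for illustration and do not come from the survey.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Run Lloyd's k-means on 1-D data from the given starting centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    # Sum of squared distances from each value to its nearest centroid.
    sse = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, sse

data = [0, 1, 2, 10, 11, 12, 20, 21, 22]  # three well-separated groups

# Starting centroids spread across the groups.
spread_centroids, spread_sse = kmeans_1d(data, [0, 10, 20])
# Starting centroids all placed inside the first group.
crowded_centroids, crowded_sse = kmeans_1d(data, [0, 1, 2])

print(spread_centroids, spread_sse)
print(crowded_centroids, crowded_sse)
```

Both runs converge, but the second one stops at a much worse local optimum because two of its centroids stay in the first group. This is the behaviour that motivates the initialization variants the survey discusses.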

Application of educational data mining on analysis of students' online learning behavior

2017 2nd International Conference on Image, Vision and Computing (ICIVC) ◽

10.1109/icivc.2017.7984707 ◽

2017 ◽

Cited By ~ 3

Author(s):

Wang Jie ◽

Lv Hai-yan ◽

Cao Biao ◽

Zhao Yuan

Keyword(s):

Data Mining ◽

Online Learning ◽

Educational Data Mining ◽

Learning Behavior

Student Performance Predictions Using Knowledge Discovery Database and Data Mining, DPU Students Records as Sample

Academic Journal of Nawroz University ◽

10.25007/ajnu.v10n3a875 ◽

2021 ◽

Vol 10 (3) ◽

pp. 121-127

Author(s):

Bareen Haval ◽

Karwan Jameel Abdulrahman ◽

Araz Rajab

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Educational Data Mining ◽

Data Sets ◽

Decision Tree Classifier ◽

Data Mining Techniques ◽

Academic History ◽

Tree Classifier ◽

Using Data

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest, and Deep Learning) were developed to analyze data sets and predict student performance. The predictive performance of the three classifiers was calculated and compared. The students' academic history and data from the Office of the Registrar were used to train the models. The analysis aims to evaluate student results using various variables, such as the student's grade. Data from 221 students with 9 different attributes were used. The results provide a better understanding of student success assessments and stress the importance of data mining in education. The main purpose of the study is to show how student success can be forecast using data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.

Effect of Academic Interest and Emotional Happiness on Academic Performance in Learning Environment.

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a3043.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 2486-2489

Keyword(s):

Data Mining ◽

Academic Performance ◽

Clustering Algorithm ◽

Reference Model ◽

Educational Data Mining ◽

Positive Association ◽

Academic Interest ◽

Education Data ◽

Learning Techniques ◽

And Performance

According to Bloom's Taxonomy, the aim of education is to develop students in knowledge, skills, and emotions under the supervision of academicians. Developments in information technology make it possible to analyse data from the educational environment and make decisions that help keep this aim on track; this is Educational Data Mining. Educational Data Mining is a research domain of data mining that converts data from the educational sector into insights for decision making. This paper analyses the effect of students' academic interest on emotional happiness and academic performance by applying supervised and unsupervised learning techniques. Students' emotional happiness and academic performance are evaluated using the Oxford Happiness Inventory and a criterion reference model. Academic interest is collected as yes or no responses from the students. The Naive Bayes classification algorithm and the K-Means clustering algorithm are applied to categorise the student participants based on their happiness scale, academic interest, and academic performance. The association between academic interest and performance is determined using predictive and descriptive mining. The research finds a positive association between academic interest, happiness, and performance. These insights will allow teachers to understand their students better and take the steps needed to enhance academic efficiency.

Comparing Performance of Data Mining Algorithms in Prediction Heart Diseases

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v5i6.pp1569-1576 ◽

2015 ◽

Vol 5 (6) ◽

pp. 1569 ◽

Cited By ~ 13

Author(s):

Moloud Abdar ◽

Sharareh R. Niakan Kalhori ◽

Tole Sutikno ◽

Imam Much Ibnu Subroto ◽

Goli Arji

Keyword(s):

Neural Network ◽

Data Mining ◽

Decision Tree ◽

Heart Diseases ◽

Support Vector ◽

Data Mining Algorithm ◽

Network Support ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

Analysis Models

Heart diseases are among the nation's leading causes of mortality and morbidity. Data mining techniques can predict the likelihood of patients developing heart disease. This study applied and compared different data mining algorithms for predicting the risk of heart disease. After feature analysis, models based on five algorithms, namely decision tree (C5.0), neural network, support vector machine (SVM), logistic regression, and k-nearest neighbors (KNN), were developed and validated. The C5.0 decision tree built the model with the greatest accuracy, 93.02%, while KNN, SVM, and the neural network achieved 88.37%, 86.05%, and 80.23%, respectively. The results produced by the decision tree are simple to interpret and apply; its rules can be easily understood by different clinical practitioners.
