scholarly journals Prediction and Analysis of Student Performance in Secondary Education Based on Data Mining and Machine Learning Techniques

Author(s):  
Meenal Joshi ◽  
Shiv Kumar

<p>According to modern era education is the key to achieve success in the future; it develops a human personality, thoughts, and social skills. The purpose of this research work is to focus on educational data mining (EDM) through machine learning algorithms. EDM means to discover hidden knowledge and pattern about student's performance. Machine learning can be useful to predict the learning outcomes of students. From last few years, several tools have been used to judge the student's performance from different points of view like the student's level, objectives, techniques, algorithms, and different methods. In this paper, predicting and analyzing student performance in secondary school is conducted using data mining techniques and machine learning algorithms such as Naive Bayes, Decision Tree algorithm J48, and Logistic Regression. For this the collection of dataset from "Secondary School" and then filtration is applying on desired values using WEKA, tool.</p>

Student Performance Management is one of the key pillars of the higher education institutions since it directly impacts the student’s career prospects and college rankings. This paper follows the path of learning analytics and educational data mining by applying machine learning techniques in student data for identifying students who are at the more likely to fail in the university examinations and thus providing needed interventions for improved student performance. The Paper uses data mining approach with 10 fold cross validation to classify students based on predictors which are demographic and social characteristics of the students. This paper compares five popular machine learning algorithms Rep Tree, Jrip, Random Forest, Random Tree, Naive Bayes algorithms based on overall classifier accuracy as well as other class specific indicators i.e. precision, recall, f-measure. Results proved that Rep tree algorithm outperformed other machine learning algorithms in classifying students who are at more likely to fail in the examinations.


Author(s):  
Baban. U. Rindhe ◽  
Nikita Ahire ◽  
Rupali Patil ◽  
Shweta Gagare ◽  
Manisha Darade

Heart-related diseases or Cardiovascular Diseases (CVDs) are the main reason for a huge number of death in the world over the last few decades and has emerged as the most life-threatening disease, not only in India but in the whole world. So, there is a need fora reliable, accurate, and feasible system to diagnose such diseases in time for proper treatment. Machine Learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. Many researchers, in recent times, have been using several machine learning techniques to help the health care industry and the professionals in the diagnosis of heart-related diseases. Heart is the next major organ comparing to the brain which has more priority in the Human body. It pumps the blood and supplies it to all organs of the whole body. Prediction of occurrences of heart diseases in the medical field is significant work. Data analytics is useful for prediction from more information and it helps the medical center to predict various diseases. A huge amount of patient-related data is maintained on monthly basis. The stored data can be useful for the source of predicting the occurrence of future diseases. Some of the data mining and machine learning techniques are used to predict heart diseases, such as Artificial Neural Network (ANN), Random Forest,and Support Vector Machine (SVM).Prediction and diagnosingof heart disease become a challenging factor faced by doctors and hospitals both in India and abroad. To reduce the large scale of deaths from heart diseases, a quick and efficient detection technique is to be discovered. Data mining techniques and machine learning algorithms play a very important role in this area. The researchers accelerating their research works to develop software with thehelp of machine learning algorithms which can help doctors to decide both prediction and diagnosing of heart disease. The main objective of this research project is to predict the heart disease of a patient using machine learning algorithms.


2017 ◽  
Vol 7 (1.2) ◽  
pp. 43 ◽  
Author(s):  
K. Sreenivasa Rao ◽  
N. Swapna ◽  
P. Praveen Kumar

Data Mining is the process of extracting useful information from large sets of data. Data mining enablesthe users to have insights into the data and make useful decisions out of the knowledge mined from databases. The purpose of higher education organizations is to offer superior opportunities to its students. As with data mining, now-a-days Education Data Mining (EDM) also is considered as a powerful tool in the field of education. It portrays an effective method for mining the student’s performance based on various parameters to predict and analyze whether a student (he/she) will be recruited or not in the campus placement. Predictions are made using the machine learning algorithms J48, Naïve Bayes, Random Forest, and Random Tree in weka tool and Multiple Linear Regression, binomial logistic regression, Recursive Partitioning and Regression Tree (rpart), conditional inference tree (ctree) and Neural Network (nnet) algorithms in R studio. The results obtained from each approaches are then compared with respect to their performance and accuracy levels by graphical analysis. Based on the result, higher education organizations can offer superior training to its students.


2019 ◽  
Vol 10 (1) ◽  
pp. 90 ◽  
Author(s):  
Maria Tsiakmaki ◽  
Georgios Kostopoulos ◽  
Sotiris Kotsiantis ◽  
Omiros Ragos

Educational Data Mining (EDM) has emerged over the last two decades, concerning with the development and implementation of data mining methods in order to facilitate the analysis of vast amounts of data originating from a wide variety of educational contexts. Predicting students’ progression and learning outcomes, such as dropout, performance and course grades, is regarded among the most important tasks of the EDM field. Therefore, applying appropriate machine learning algorithms for building accurate predictive models is of outmost importance for both educators and data scientists. Considering the high-dimensional input space and the complexity of machine learning algorithms, the process of building accurate and robust learning models requires advanced data science skills, while is time-consuming and error-prone in most cases. In addition, choosing the proper method for a given problem formulation and configuring the optimal parameters’ values for a specific model is a demanding task, whilst it is often very difficult to understand and explain the produced results. In this context, the main purpose of the present study is to examine the potential use of advanced machine learning strategies on educational settings from the perspective of hyperparameter optimization. More specifically, we investigate the effectiveness of automated Machine Learning (autoML) for the task of predicting students’ learning outcomes based on their participation in online learning platforms. At the same time, we limit the search space to tree-based and rule-based models in order to achieving transparent and interpretable results. To this end, a plethora of experiments were carried out, revealing that autoML tools achieve consistently superior results. Hopefully our work will help nonexpert users (e.g., educators and instructors) in the field of EDM to conduct experiments with appropriate automated parameter configurations, thus achieving highly accurate and comprehensible results.


2021 ◽  
Vol 9 (2) ◽  
pp. 554-564
Author(s):  
Golmei Shaheamlung, Harshpreet Kaur

In the 21st-century, the issue of liver disease has been increasing all over the world. As per the latest survey report, liver disease death toll has been rise approximately 2 million per year worldwide. The overall percentage of death by liver disease is 3.5% worldwide. Chronic Liver disease is also considered to be one of the deadly diseases, so early detection and treatment can recover the disease easily. Due to rapid advancement in Artificial intelligence (AI), like various machine learning algorithms SVM, K-mean clustering, KNN, Random forest, Logistic regression, etc., This will improve the life span of a patient suffering from Chronic Liver Disease (CLD) in early stages. The data can be obtained in a large volume due to the broad exploitation of bar codes for supreme marketable products, the mechanization of various business and government dealings, and the development in the data collection tools. This research work is based on liver disease prediction using machine learning algorithms. Liver disease prediction has various levels of steps involved, pre-processing, feature extraction, and classification. In this s research work, a hybrid classification method is proposed for liver disease prediction. And Datasets are collected from the Kaggle database of Indian liver patient records. The proposed model achieved an accuracy of 77.58%. The proposed technique is implemented in Python with the Spyder tool and results are analyzed in terms of accuracy, precision, and recall.  


2021 ◽  
Vol 11 (9) ◽  
pp. 552
Author(s):  
Balqis Albreiki ◽  
Nazar Zaki ◽  
Hany Alashwal

Educational Data Mining plays a critical role in advancing the learning environment by contributing state-of-the-art methods, techniques, and applications. The recent development provides valuable tools for understanding the student learning environment by exploring and utilizing educational data using machine learning and data mining techniques. Modern academic institutions operate in a highly competitive and complex environment. Analyzing performance, providing high-quality education, strategies for evaluating the students’ performance, and future actions are among the prevailing challenges universities face. Student intervention plans must be implemented in these universities to overcome problems experienced by the students during their studies. In this systematic review, the relevant EDM literature related to identifying student dropouts and students at risk from 2009 to 2021 is reviewed. The review results indicated that various Machine Learning (ML) techniques are used to understand and overcome the underlying challenges; predicting students at risk and students drop out prediction. Moreover, most studies use two types of datasets: data from student colleges/university databases and online learning platforms. ML methods were confirmed to play essential roles in predicting students at risk and dropout rates, thus improving the students’ performance.


The scope of this research work is to identify the efficient machine learning algorithm for predicting the behavior of a student from the student performance dataset. We applied Support Vector Machines, K-Nearest Neighbor, Decision Tree and Naïve Bayes algorithms to predict the grade of a student and compared their prediction results in terms of various performance metrics. The students who visited many resources for reference, made academic related discussions and interactions in the class room, absent for minimum days, cared by parents care have shown great improvement in the final grade. Among the machine learning techniques we have used, SVM has shown more accuracy in terms of four important attribute. The accuracy rate of SVM after tuning is 0.80. The KNN and decision tree achieves the accuracy of 0.64, 0.65 respectively whereas the Naïve Bayes achieves 0.77.


Materials ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1089
Author(s):  
Sung-Hee Kim ◽  
Chanyoung Jeong

This study aims to demonstrate the feasibility of applying eight machine learning algorithms to predict the classification of the surface characteristics of titanium oxide (TiO2) nanostructures with different anodization processes. We produced a total of 100 samples, and we assessed changes in TiO2 nanostructures’ thicknesses by performing anodization. We successfully grew TiO2 films with different thicknesses by one-step anodization in ethylene glycol containing NH4F and H2O at applied voltage differences ranging from 10 V to 100 V at various anodization durations. We found that the thicknesses of TiO2 nanostructures are dependent on anodization voltages under time differences. Therefore, we tested the feasibility of applying machine learning algorithms to predict the deformation of TiO2. As the characteristics of TiO2 changed based on the different experimental conditions, we classified its surface pore structure into two categories and four groups. For the classification based on granularity, we assessed layer creation, roughness, pore creation, and pore height. We applied eight machine learning techniques to predict classification for binary and multiclass classification. For binary classification, random forest and gradient boosting algorithm had relatively high performance. However, all eight algorithms had scores higher than 0.93, which signifies high prediction on estimating the presence of pore. In contrast, decision tree and three ensemble methods had a relatively higher performance for multiclass classification, with an accuracy rate greater than 0.79. The weakest algorithm used was k-nearest neighbors for both binary and multiclass classifications. We believe that these results show that we can apply machine learning techniques to predict surface quality improvement, leading to smart manufacturing technology to better control color appearance, super-hydrophobicity, super-hydrophilicity or batter efficiency.


2018 ◽  
Vol 7 (2.8) ◽  
pp. 684 ◽  
Author(s):  
V V. Ramalingam ◽  
Ayantan Dandapath ◽  
M Karthik Raja

Heart related diseases or Cardiovascular Diseases (CVDs) are the main reason for a huge number of death in the world over the last few decades and has emerged as the most life-threatening disease, not only in India but in the whole world. So, there is a need of reliable, accurate and feasible system to diagnose such diseases in time for proper treatment. Machine Learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. Many researchers, in recent times, have been using several machine learning techniques to help the health care industry and the professionals in the diagnosis of heart related diseases. This paper presents a survey of various models based on such algorithms and techniques andanalyze their performance. Models based on supervised learning algorithms such as Support Vector Machines (SVM), K-Nearest Neighbour (KNN), NaïveBayes, Decision Trees (DT), Random Forest (RF) and ensemble models are found very popular among the researchers.


Sign in / Sign up

Export Citation Format

Share Document