Machine Learning in Higher Education

Author(s):  
Garima Jaiswal ◽  
Arun Sharma ◽  
Reeti Sarup

Machine learning aims to give computers the ability to learn automatically from data. It can enable computers to make intelligent decisions by recognizing complex patterns in data. Through data mining, very large amounts of data can be explored and analyzed to extract useful information and find interesting patterns. Classification, a supervised learning technique, predicts class labels for test data by referring to the already labeled classes in the available training data set. In this chapter, educational data mining techniques are applied to a student dataset to analyze the many factors behind an alarmingly high number of dropouts. This work focuses on predicting students at risk of dropping out using five classification algorithms, namely k-NN, naive Bayes, decision tree, random forest, and support vector machine. This can assist in improving pedagogical practices to enhance the performance of students predicted to be at risk of dropping out, thus reducing dropout rates in higher education.
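As an illustration of the classification idea described above, the following is a minimal sketch of k-NN, one of the five algorithms named. The feature names (attendance rate, grade point average), the toy records, and the function name `knn_predict` are assumptions made for this example and are not taken from the chapter.

```python
import math
from collections import Counter

# Hypothetical toy records: (attendance rate, grade point average) -> label.
# A real pipeline would normally rescale features before measuring distances.
TRAIN = [
    ((0.95, 3.6), "retained"),
    ((0.90, 3.1), "retained"),
    ((0.85, 2.9), "retained"),
    ((0.40, 1.8), "dropout"),
    ((0.55, 2.0), "dropout"),
    ((0.30, 1.5), "dropout"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort labeled points by Euclidean distance to the query and keep the first k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Calling `knn_predict(TRAIN, (0.35, 1.7))` labels a new student by the labels of the three most similar labeled students, which is the "referring to already labeled classes" step in its simplest form.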

2017 ◽  
Vol 7 (1.2) ◽  
pp. 43 ◽  
Author(s):  
K. Sreenivasa Rao ◽  
N. Swapna ◽  
P. Praveen Kumar

Data mining is the process of extracting useful information from large sets of data. Data mining enables users to gain insights into the data and make useful decisions from the knowledge mined from databases. The purpose of higher education organizations is to offer superior opportunities to their students. As with data mining, Educational Data Mining (EDM) is nowadays also considered a powerful tool in the field of education. It provides an effective method for mining students’ performance based on various parameters to predict and analyze whether a student will be recruited in the campus placement. Predictions are made using the machine learning algorithms J48, Naïve Bayes, Random Forest, and Random Tree in the WEKA tool, and Multiple Linear Regression, binomial logistic regression, Recursive Partitioning and Regression Tree (rpart), conditional inference tree (ctree), and Neural Network (nnet) algorithms in RStudio. The results obtained from each approach are then compared with respect to their performance and accuracy levels by graphical analysis. Based on the results, higher education organizations can offer superior training to their students.


Author(s):  
Adeel Ahmed ◽  
Kamlesh Kumar ◽  
Mansoor A. Khuhro ◽  
Asif A. Wagan ◽  
Imtiaz A. Halepoto ◽  
...  

Nowadays, educational data mining is employed as an assessment tool for the study and analysis of hidden patterns in academic databases, which can be used to predict students’ academic performance. This paper implements various machine learning classification techniques on students’ academic records for result prediction. For this purpose, data on MS(CS) students were collected from a public university in Pakistan through their assignments, quizzes, and sessional marks. The WEKA data mining tool was used for all experiments, namely data pre-processing, classification, and visualization. To measure performance, classifier models were trained with 3- and 10-fold cross-validation to evaluate the classifiers' accuracy. The results show that a bagging classifier combined with support vector machines outperforms other classifiers in terms of accuracy, precision, recall, and F-measure. The outcomes confirm that this research contributes significantly to the prediction of students’ academic performance, which can ultimately be used to help faculty members focus on students with low grades and improve their academic records.
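The evaluation protocol mentioned above (k-fold cross-validation scored by precision, recall, and F-measure) can be sketched as follows. This is an assumption-laden illustration, not the WEKA procedure the authors ran: the function names `k_fold_indices` and `precision_recall_f1` are hypothetical, and the folds here are contiguous, with no shuffling or stratification.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In each round one fold is held out for testing and the remaining folds are used for training; the per-fold scores are then averaged.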


2021 ◽  
Vol 13 (22) ◽  
pp. 12461
Author(s):  
Chih-Chang Yu ◽  
Yufeng (Leon) Wu

While the use of deep neural networks is popular for predicting students’ learning outcomes, convolutional neural network (CNN)-based methods are used more often. Such methods require numerous features, training data, or multiple models to achieve week-by-week predictions. However, many current learning management systems (LMSs) operated by colleges cannot provide adequate information. To make the system more feasible, this article proposes a recurrent neural network (RNN)-based framework to identify at-risk students who might fail the course using only a few common learning features. RNN-based methods can be more effective than CNN-based methods in identifying at-risk students due to their ability to memorize time-series features. The data used in this study were collected from an online course that teaches artificial intelligence (AI) at a university in northern Taiwan. Common features, such as the number of logins, number of posts and number of homework assignments submitted, are considered to train the model. This study compares the prediction results of the RNN model with the following conventional machine learning models: logistic regression, support vector machines, decision trees and random forests. This work also compares the performance of the RNN model with two neural network-based models: the multi-layer perceptron (MLP) and a CNN-based model. The experimental results demonstrate that the RNN model used in this study is better than conventional machine learning models and the MLP in terms of F-score, while achieving similar performance to the CNN-based model with fewer parameters. Our study shows that the designed RNN model can identify at-risk students once one-third of the semester has passed. Some future directions are also discussed.
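To make the idea of "memorizing time-series features" concrete, here is a minimal sketch of a plain (Elman-style) recurrent update applied to weekly feature vectors. It is not the architecture from the article, which the abstract does not specify; the weights below are hypothetical and untrained, and the names `rnn_step` and `predict_risk` are assumptions made for this example.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One recurrent update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h_new = []
    for j in range(len(b_h)):
        total = b_h[j]
        total += sum(w_xh[j][i] * x[i] for i in range(len(x)))
        total += sum(w_hh[j][i] * h_prev[i] for i in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def predict_risk(weekly_features, w_xh, w_hh, b_h, w_out, b_out):
    """Run the recurrence over the weeks seen so far and return a score in (0, 1)."""
    h = [0.0] * len(b_h)
    for x in weekly_features:
        # The hidden state carries information from earlier weeks forward.
        h = rnn_step(x, h, w_xh, w_hh, b_h)
    logit = b_out + sum(w * v for w, v in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, untrained weights: 3 inputs per week (logins, posts,
# homework submissions) and 2 hidden units.
W_XH = [[0.1, 0.2, 0.3], [-0.2, 0.1, 0.0]]
W_HH = [[0.5, 0.0], [0.0, 0.5]]
B_H = [0.0, 0.0]
W_OUT = [-1.0, 1.0]
B_OUT = 0.0
```

Because the same function accepts any number of weeks, a score can be produced partway through the semester, for example `predict_risk(first_weeks, W_XH, W_HH, B_H, W_OUT, B_OUT)` with only the early weeks supplied.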


2019 ◽  
Vol 255 ◽  
pp. 03002
Author(s):  
Mat Yaacob Nik Nurul Hafzan ◽  
Deris Safaai ◽  
Mat Asiah ◽  
Mohamad Mohd Saberi ◽  
Safaai Siti Syuhaida

Predictive analytics includes statistical techniques, predictive modelling, machine learning, and data mining that analyse current and historical facts to make predictions about future or otherwise unknown events. Higher education institutions nowadays are under increasing pressure to respond to national and global economic, political and social changes, such as the growing need to increase the proportion of students in certain disciplines, embedding workplace graduate attributes, and ensuring that the quality of learning programs is both nationally and globally relevant. However, in higher education institutions, a significant number of students stop their studies before graduation, especially undergraduate students. Problems related to students who stop out, graduate late, or do not graduate can be addressed by applying analytics. Using analytics, administrators, instructors and students can predict what will happen in the future. Administrators and instructors can decide on suitable intervention programs for at-risk students before those students decide to leave their studies. Many different machine learning techniques have been implemented for predictive modelling in the past, including decision tree, k-nearest neighbour, random forest, neural network, support vector machine, naïve Bayesian and a few others. A few attempts have been made to use Bayesian networks and dynamic Bayesian networks as modelling techniques for predicting at-risk students, but a few challenges need to be resolved. The motivation for using a dynamic Bayesian network is that it is robust to incomplete data and provides opportunities for handling changing and dynamic environments. The trends and directions of research on predicting and identifying at-risk students are developing prediction models that can alert administrators as early as possible, predictive models that handle dynamic and changing environments, and models that provide real-time prediction.


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 485 ◽  
Author(s):  
Carlos A. Palacios ◽  
José A. Reyes-Suárez ◽  
Lorena A. Bearzotti ◽  
Víctor Leiva ◽  
Carolina Marchant

Data mining is employed to extract useful information and to detect patterns from often large data sets, closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to help the process of knowledge discovery. We predict student retention at each of three levels during their first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level when dropout occurs. Among the machine learning algorithms used in this work are: decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with similar conditions to the Chilean case, where dropout rates affect the efficiency of such institutions. Having the ability to predict dropout based on student’s data enables these institutions to take preventative measures, avoiding the dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.
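The abstract reports that balancing the majority and minority classes helps, but does not say which balancing method was used. As one common possibility, the sketch below shows random oversampling of the minority class; the function name `oversample_minority` and the toy labels are assumptions made for this example.

```python
import random
from collections import Counter


def oversample_minority(records, labels, seed=0):
    """Duplicate randomly chosen records until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_records, out_labels = list(records), list(labels)
    for label, count in counts.items():
        # Candidates for duplication are the original records of this class.
        pool = [r for r, l in zip(records, labels) if l == label]
        for _ in range(target - count):
            out_records.append(rng.choice(pool))
            out_labels.append(label)
    return out_records, out_labels
```

Balancing should be applied to the training split only, so that duplicated records do not leak into the data used for evaluation.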


2021 ◽  
Vol 11 (9) ◽  
pp. 552
Author(s):  
Balqis Albreiki ◽  
Nazar Zaki ◽  
Hany Alashwal

Educational Data Mining (EDM) plays a critical role in advancing the learning environment by contributing state-of-the-art methods, techniques, and applications. Recent developments provide valuable tools for understanding the student learning environment by exploring and utilizing educational data using machine learning and data mining techniques. Modern academic institutions operate in a highly competitive and complex environment. Analyzing performance, providing high-quality education, devising strategies for evaluating students’ performance, and planning future actions are among the prevailing challenges universities face. Student intervention plans must be implemented in these universities to overcome problems experienced by students during their studies. In this systematic review, the relevant EDM literature from 2009 to 2021 on identifying student dropouts and students at risk is reviewed. The review results indicate that various Machine Learning (ML) techniques are used to understand and overcome the underlying challenges: predicting students at risk and predicting student dropout. Moreover, most studies use two types of datasets: data from college or university student databases and data from online learning platforms. ML methods were confirmed to play essential roles in predicting students at risk and dropout rates, thus improving students’ performance.


Author(s):  
Nada Lebkiri ◽  
Mohamed Daoudi ◽  
Zakaria Abidli ◽  
Joumana Elturk ◽  
Abdelmajid Soulaymani ◽  
...  

Student failure prediction is one of the main topics in university learning contexts, as it helps to avoid failure in higher education institutions and provides a basis to make the teaching and learning process more effective, efficient and reliable. The overall aim of this study is to identify students who are likely to fail a given university course. This research paper reports the implementation of an Educational Data Mining project based on the CRISP-DM methodology. The data were collected from the APOGEE system of Ibn Tofail University, a form, and the specifications of the tested courses. The business goal of this paper is to develop a model that can identify students who are susceptible to failure in a given academic course. Such a model helps prevent failure in higher education institutions and provides a basis for making the teaching and learning process more effective, efficient and reliable. The most common machine learning algorithms in the field of Educational Data Mining were used. The results of our research showed that the proposed method achieved an overall accuracy of 97% in predicting students at risk of failure.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2503
Author(s):  
Taro Suzuki ◽  
Yoshiharu Amano

This paper proposes a method for detecting non-line-of-sight (NLOS) multipath, which causes large positioning errors in a global navigation satellite system (GNSS). We use GNSS signal correlation output, which is the most primitive GNSS signal processing output, to detect NLOS multipath based on machine learning. The shape of the multi-correlator outputs is distorted by the NLOS multipath. Features of the shape of the multi-correlator outputs are used to discriminate the NLOS multipath. We implement two supervised learning methods, a support vector machine (SVM) and a neural network (NN), and compare their performance. In addition, we propose an automated method of collecting training data on LOS and NLOS signals for machine learning. The evaluation of the proposed NLOS detection method in an urban environment confirmed that NN was better than SVM, and 97.7% of NLOS signals were correctly discriminated.


Animals ◽  
2020 ◽  
Vol 10 (5) ◽  
pp. 771
Author(s):  
Toshiya Arakawa

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is large or if the observation is conducted over a prolonged period. In this study, machine learning methods such as hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks were applied to detect and estimate whether a goat is in estrus based on the goat’s behavior, and the adequacy of each method was verified. Goat tracking data were obtained using a video tracking system and used to estimate whether goats, in “estrus” or “non-estrus”, were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, the PC value for goats other than those whose data were used in the training data sets is relatively low, which suggests that random forest tends to over-fit to the training data. Apart from random forest, the PC of HMMs and SVMs is high. Considering the calculation time and the HMM’s advantage as a time series model, the HMM is the better method. The PC of the neural network is low overall; however, if more goat data were acquired, the neural network could be an adequate method for estimation.


2020 ◽  
Vol 12 (7) ◽  
pp. 1218
Author(s):  
Laura Tuşa ◽  
Mahdi Khodadadzadeh ◽  
Cecilia Contreras ◽  
Kasra Rafiezadeh Shahi ◽  
Margret Fuchs ◽  
...  

Due to the extensive drilling performed every year in exploration campaigns for the discovery and evaluation of ore deposits, drill-core mapping is becoming an essential step. While valuable mineralogical information is extracted during core logging by on-site geologists, the process is time consuming and dependent on the observer and individual background. Hyperspectral short-wave infrared (SWIR) data is used in the mining industry as a tool to complement traditional logging techniques and to provide a rapid and non-invasive analytical method for mineralogical characterization. Additionally, Scanning Electron Microscopy-based image analyses using a Mineral Liberation Analyser (SEM-MLA) provide exhaustive high-resolution mineralogical maps, but can only be performed on small areas of the drill-cores. We propose to use machine learning algorithms to combine the two data types and upscale the quantitative SEM-MLA mineralogical data to drill-core scale. This way, quasi-quantitative maps over entire drill-core samples are obtained. Our upscaling approach increases result transparency and reproducibility by employing physical-based data acquisition (hyperspectral imaging) combined with mathematical models (machine learning). The procedure is tested on 5 drill-core samples with varying training data using random forests, support vector machines and neural network regression models. The obtained mineral abundance maps are further used for the extraction of mineralogical parameters such as mineral association.

