Evaluating Student Levelling Based on Machine Learning Model’s Performance

Author(s):  
Shatha Ghareeb ◽  
Abir Jaafar Hussain ◽  
Dhiya Al-Jumeily ◽  
Wasiq Khan ◽  
Rawaa Al-Jumeily ◽  
...  

Abstract In this paper, a novel application of machine learning algorithms is presented for student levelling. In multicultural countries such as the UAE, there are various education curriculums, and the private schools and quality assurance sector supervises private schools serving many nationalities. As there are various education curriculums in the United Arab Emirates, specifically Abu Dhabi, to meet expats’ needs, there are different requirements for registration and success. In addition, there are different age groups for starting education in each curriculum. Every curriculum follows different education methods such as assessment techniques, reassessment rules, and exam boards. Currently, students who transfer between curriculums are not correctly placed in their appropriate year group because of the start and end dates of each academic year as well as their date of birth; students who are either younger or older than the rest of that year group can develop gaps in their learning and performance. In addition, pupils’ academic journeys are not stored, which creates a gap for schools in tracking their learning process. In this paper, we propose a computational framework applicable in multicultural countries such as the United Arab Emirates, in which multiple education systems are implemented. Machine learning is used to determine the appropriate student level, aiding schools in providing a smooth transition when assigning students to their year groups and providing levelling and differentiation information on pupils for a smooth transition from one education curriculum to another, in which retrieval of their progress is possible. For classification and discriminant analysis of pupil levelling, three machine learning classifiers are utilised: a random forest classifier, an artificial neural network, and combined classifiers. The simulation results indicated that the proposed machine learning classifiers generated effective performance in terms of accuracy.
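
A minimal sketch of how such a combined classifier might be assembled with scikit-learn, assuming a tabular feature matrix (e.g., age, prior-curriculum code, assessment scores) and year-group labels; the features and data below are illustrative placeholders, not the study's:

```python
# Hypothetical sketch: random forest, neural network, and a combined (voting)
# classifier for student year-group levelling. Features and data are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))          # e.g. age, prior-curriculum code, assessment scores
y = rng.integers(0, 4, size=500)       # e.g. four candidate year groups

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
ann = make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(32, 16),
                                                    max_iter=1000, random_state=0))
combined = VotingClassifier([("rf", rf), ("ann", ann)], voting="soft")

for name, model in [("random forest", rf), ("ANN", ann), ("combined", combined)]:
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```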

2021 ◽  
Vol 11 (7) ◽  
pp. 3130
Author(s):  
Janka Kabathova ◽  
Martin Drlik

Early and precise prediction of student dropout from available educational data is a widespread research topic in the learning analytics field. Despite the amount of research already carried out, progress remains limited at all levels of educational data. Even though various features have already been researched, it is still an open question which features are appropriate for different machine learning classifiers applied to the typically scarce educational data available at the e-learning course level. Therefore, the main goals of this research are to emphasize the importance of the data understanding and data gathering phases, stress the limitations of the available educational datasets, compare the performance of several machine learning classifiers, and show that even a limited set of features available to teachers in an e-learning course can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. Data collected over four academic years were analyzed. The features selected in this study proved to be applicable in predicting course completers and non-completers. The prediction accuracy varied between 77% and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was analyzed to account for the impact of the limited dataset size on the high values of the performance metrics obtained. The results showed that several machine learning algorithms can be successfully applied to a scarce dataset of educational data. Simultaneously, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.
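
One way to reproduce the evaluation protocol described here is to train on earlier academic years and validate on the unseen following year while reporting several metrics rather than accuracy alone; the feature set and data below are hypothetical stand-ins for the limited course-level features a teacher can see, not the authors' dataset:

```python
# Hypothetical sketch: fit classifiers on earlier academic years and evaluate on
# the next (unseen) year with multiple performance metrics. Data are synthetic.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "year": rng.integers(2017, 2021, size=800),   # four academic years
    "logins": rng.poisson(30, size=800),          # limited LMS features
    "submissions": rng.poisson(8, size=800),
    "forum_posts": rng.poisson(3, size=800),
})
df["dropout"] = (rng.random(800) < 0.3).astype(int)

train, test = df[df.year < 2020], df[df.year == 2020]   # hold out the last year
features = ["logins", "submissions", "forum_posts"]

for name, clf in [("logreg", LogisticRegression(max_iter=1000)),
                  ("random forest", RandomForestClassifier(random_state=1))]:
    clf.fit(train[features], train["dropout"])
    pred = clf.predict(test[features])
    proba = clf.predict_proba(test[features])[:, 1]
    print(name, "acc=%.2f prec=%.2f rec=%.2f f1=%.2f auc=%.2f" % (
        accuracy_score(test["dropout"], pred),
        precision_score(test["dropout"], pred, zero_division=0),
        recall_score(test["dropout"], pred),
        f1_score(test["dropout"], pred),
        roc_auc_score(test["dropout"], proba)))
```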


2021 ◽  
Vol 5 (4 (113)) ◽  
pp. 55-63
Author(s):  
Beimbet Daribayev ◽  
Aksultan Mukhanbet ◽  
Yedil Nurakhov ◽  
Timur Imankulov

The problem of oil displacement was solved using neural networks and machine learning classifiers. The Buckley-Leverett model, which describes the process of oil displacement by water, was selected. It consists of the continuity equations of the oil and water phases and Darcy’s law. The challenge is to optimize the oil displacement problem. Optimization is performed at three levels: vectorization of calculations, implementation of classical algorithms, and implementation of the algorithm using neural networks. A feature of the proposed method is the identification of the approach with the highest accuracy and smallest errors by comparing the results of machine learning classifiers and different types of neural networks. This is also one of the first papers in which machine learning classifiers are compared with neural networks and recurrent neural networks. The classification was carried out with three classification algorithms: decision tree, support vector machine (SVM), and gradient boosting. As a result of the study, the gradient boosting classifier and the neural network showed high accuracy, 99.99% and 97.4% respectively. The recurrent neural network trained faster than the others. The SVM classifier had the lowest accuracy score. To achieve this goal, a dataset containing over 67,000 data points across 10 classes was created. These data are important for problems of oil displacement in porous media. The proposed methodology provides a simple and elegant way to instill oil-domain knowledge into machine learning algorithms. This addresses two of the most significant drawbacks of machine learning algorithms: the need for large datasets and the limited robustness of extrapolation. The presented principles can be generalized in countless ways in the future and should lead to a new class of algorithms for solving both forward and inverse oil problems.
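
A hedged sketch of the classifier comparison described above (decision tree, SVM, gradient boosting) on a synthetic 10-class dataset standing in for the displacement data; the data generation is illustrative only and not the Buckley-Leverett dataset of the paper:

```python
# Hypothetical sketch: compare decision tree, SVM and gradient boosting on a
# synthetic 10-class dataset standing in for the oil-displacement data.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=8, n_informative=6,
                           n_classes=10, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=2)),
                  ("SVM", SVC(kernel="rbf")),
                  ("gradient boosting", GradientBoostingClassifier(random_state=2))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```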


Author(s):  
Ritu Aggrawal ◽  
Saurabh Pal

Background: Early prediction of cardiovascular disease can help determine the lifestyle change options of high-risk patients, thereby reducing complications. We propose a coronary heart disease dataset analysis technique to predict a person’s risk based on their clinically determined history. The methods introduced may be integrated into multiple applications, such as developing decision support systems, developing a risk management network, and helping experts and clinical staff. Methods: We employed the Framingham Heart Study dataset, which is publicly available on Kaggle, to train several machine learning classifiers such as logistic regression (LR), K-nearest neighbor (KNN), Naïve Bayes (NB), decision tree (DT), random forest (RF) and gradient boosting classifier (GBC) for disease prediction. The p-value method has been used for feature elimination, and the selected features have been incorporated for further prediction. Various thresholds are used with different classifiers to make predictions. To estimate the precision of the classifiers, the ROC curve, confusion matrix and AUC value are considered for model verification. The performance of the six classifiers is compared for predicting coronary heart disease (CHD). Results: After applying the p-value backward elimination statistical method on the 10-year CHD dataset, 6 significant features were selected from 14 features with p < 0.5. Among the machine learning classifiers, GBC has the highest accuracy score, 87.61%. Conclusions: Statistical methods, such as the p-value backward elimination method, can be combined with machine learning classifiers, thereby improving the accuracy of the classifier and shortening the running time of the machine.
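
A sketch of a p-value backward-elimination step with statsmodels followed by a gradient boosting classifier evaluated with a confusion matrix and AUC; the synthetic features, the 0.05 cut-off, and the helper logic are illustrative assumptions, not the authors' exact pipeline:

```python
# Hypothetical sketch: p-value backward elimination with statsmodels, then a
# gradient boosting classifier scored with ROC-AUC. Data are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(3)
X = pd.DataFrame(rng.normal(size=(1000, 8)),
                 columns=[f"x{i}" for i in range(8)])   # stand-ins for clinical features
y = (X["x0"] + 0.5 * X["x1"] + rng.normal(scale=1.0, size=1000) > 0).astype(int)

# Backward elimination: repeatedly drop the feature with the largest p-value.
threshold = 0.05          # illustrative cut-off; the abstract reports a different value
cols = list(X.columns)
while True:
    model = sm.Logit(y, sm.add_constant(X[cols])).fit(disp=0)
    pvals = model.pvalues.drop("const")
    if pvals.max() <= threshold or len(cols) <= 1:
        break
    cols.remove(pvals.idxmax())

X_train, X_test, y_train, y_test = train_test_split(X[cols], y, test_size=0.2, random_state=3)
gbc = GradientBoostingClassifier(random_state=3).fit(X_train, y_train)
print("selected features:", cols)
print("AUC:", roc_auc_score(y_test, gbc.predict_proba(X_test)[:, 1]))
print(confusion_matrix(y_test, gbc.predict(X_test)))
```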


Author(s):  
Alexandra Renouard ◽  
Alessia Maggi ◽  
Marc Grunberg ◽  
Cécile Doubre ◽  
Clément Hibert

Abstract Small-magnitude earthquakes shed light on the spatial and magnitude distribution of natural seismicity, as well as its rate and occurrence, especially in stable continental regions where natural seismicity remains difficult to explain under slow strain-rate conditions. However, capturing them in catalogs is strongly hindered by signal-to-noise ratio issues, resulting in high rates of false and man-made events also being detected. Accurate and robust discrimination of these events is critical for optimally detecting small earthquakes. This requires uncovering recurrent salient features that can rapidly distinguish, first, false events from real events, and then earthquakes from man-made events (mainly quarry blasts), despite high signal variability and noise content. In this study, we combined the complementary strengths of human and interpretable rule-based machine-learning algorithms for solving this classification problem. We used human expert knowledge to co-create two reliable machine-learning classifiers through human-assisted selection of classification features and review of events with uncertain classifier predictions. The two classifiers are integrated into the SeisComP3 operational monitoring system. The first one discards false events from the set of events obtained with a low short-term average/long-term average threshold; the second one labels the remaining events as either earthquakes or quarry blasts. When run in an operational setting, the first classifier correctly detected more than 99% of false events and just over 93% of earthquakes; the second classifier correctly labeled 95% of quarry blasts and 96% of earthquakes. After a manual review of the second classifier’s low-confidence outputs, the final catalog contained fewer than 2% of misclassified events. These results confirm that machine learning strengthens the quality of earthquake catalogs and that the performance of machine-learning classifiers can be improved through human expertise. Our study promotes the broader implementation of hybrid intelligence monitoring within seismological observatories.
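
A schematic of the two-stage filtering described above, with low-confidence predictions flagged for human review; the features, confidence band, and data are placeholders, not the catalog features or thresholds used in the study:

```python
# Hypothetical sketch: two chained classifiers -- the first rejects false detections,
# the second labels the survivors as earthquake vs. quarry blast -- with
# low-confidence outputs flagged for manual review. Data are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(2000, 5))             # stand-ins for waveform/catalog features
is_real = (X[:, 0] > -0.5).astype(int)     # 1 = real event, 0 = false detection
is_quake = (X[:, 1] > 0).astype(int)       # 1 = earthquake, 0 = quarry blast

stage1 = RandomForestClassifier(random_state=4).fit(X[:1500], is_real[:1500])
stage2 = RandomForestClassifier(random_state=4).fit(
    X[:1500][is_real[:1500] == 1], is_quake[:1500][is_real[:1500] == 1])

X_new = X[1500:]
keep = stage1.predict(X_new) == 1                    # discard predicted false events
proba = stage2.predict_proba(X_new[keep])[:, 1]
label = np.where(proba >= 0.5, "earthquake", "quarry blast")
review = (proba > 0.3) & (proba < 0.7)               # low-confidence band sent to analysts
print(f"{keep.sum()} events kept, {review.sum()} flagged for manual review")
print(dict(zip(*np.unique(label, return_counts=True))))
```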


2021 ◽  
Vol 13 (18) ◽  
pp. 10018 ◽  
Author(s):  
Mohamed Elhag Mohamed Abo ◽  
Norisma Idris ◽  
Rohana Mahmud ◽  
Atika Qazi ◽  
Ibrahim Abaker Targio Hashem ◽  
...  

A sentiment analysis of Arabic texts is an important task in many commercial applications such as Twitter. This study introduces a multi-criteria method to empirically assess and rank classifiers for Arabic sentiment analysis. Prominent machine learning algorithms were deployed to build classification models for Arabic sentiment analysis. Moreover, an assessment of the top five machine learning classifiers’ performance measures is discussed to rank the classifiers. We integrated the top five ranking methods with evaluation metrics of machine learning classifiers such as accuracy, recall, precision, F-measure, CPU time, classification error, and area under the curve (AUC). The method was tested using Saudi Arabic product reviews to compare five popular classifiers. Our results suggest that the deep learning and support vector machine (SVM) classifiers perform best, with accuracy 85.25% and 82.30%; precision 85.30% and 83.87%; recall 88.41% and 83.89%; F-measure 86.81% and 83.87%; classification error 14.75% and 17.70%; and AUC 0.93 and 0.90, respectively. They outperform decision trees, K-nearest neighbours (K-NN), and Naïve Bayes classifiers.
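
A minimal illustration of scoring several classifiers on multiple metrics at once and aggregating the results, here by averaging per-metric ranks; the aggregation rule, data, and classifier set are stand-ins rather than the paper's multi-criteria method:

```python
# Hypothetical sketch: evaluate several classifiers on multiple metrics and rank
# them by mean per-metric rank. The aggregation rule here is a simple stand-in.
import time
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=20, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=5)

rows = {}
for name, clf in [("SVM", SVC(probability=True)), ("DT", DecisionTreeClassifier()),
                  ("KNN", KNeighborsClassifier()), ("NB", GaussianNB())]:
    t0 = time.perf_counter()
    clf.fit(X_tr, y_tr)
    cpu = time.perf_counter() - t0
    pred = clf.predict(X_te)
    rows[name] = {"accuracy": accuracy_score(y_te, pred),
                  "precision": precision_score(y_te, pred),
                  "recall": recall_score(y_te, pred),
                  "f1": f1_score(y_te, pred),
                  "auc": roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]),
                  "cpu_time": cpu}

table = pd.DataFrame(rows).T
ranks = table.rank(ascending=False)                      # higher is better...
ranks["cpu_time"] = table["cpu_time"].rank(ascending=True)  # ...except CPU time
print(table.round(3))
print(ranks.mean(axis=1).sort_values())                  # lower mean rank = better overall
```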


2021 ◽  
Vol 4 (4) ◽  
pp. 309-315
Author(s):  
Kumawuese Jennifer Kurugh ◽  
Muhammad Aminu Ahmad ◽  
Awwal Ahmad Babajo

Datasets are a major requirement in the development of breast cancer classification/detection models using machine learning algorithms. These models can provide an effective, accurate and less expensive diagnosis method and reduce loss of life. However, using the same machine learning algorithms on different datasets yields different results. This research developed several machine learning models for breast cancer classification/detection using random forest, support vector machine, K-nearest neighbors, Gaussian Naïve Bayes, perceptron and logistic regression. Three widely used test datasets were used: Wisconsin Breast Cancer (WBC) Original, Wisconsin Diagnostic Breast Cancer (WDBC) and Wisconsin Prognostic Breast Cancer (WPBC). The results show that datasets affect the performance of machine learning classifiers. Also, the machine learning classifiers perform differently on a given breast cancer dataset.
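
A compact sketch of comparing the listed classifiers on one of these datasets; scikit-learn ships the WDBC data as load_breast_cancer, while the WBC Original and WPBC sets would need to be fetched from the UCI repository separately (not shown here):

```python
# Hypothetical sketch: cross-validate several classifiers on the WDBC dataset
# (bundled with scikit-learn); WBC and WPBC would be loaded from UCI separately.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import Perceptron, LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

classifiers = {
    "random forest": RandomForestClassifier(random_state=6),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "naive Bayes": GaussianNB(),
    "perceptron": make_pipeline(StandardScaler(), Perceptron()),
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name:20s} mean accuracy = {scores.mean():.3f}")
```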

