Predicting students’ academic performance using a modified kNN algorithm

Categorical Variables ◽

Ratio Scale ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Nominal Variables ◽

Better Than

Abstract The target (dependent) variable in classification analysis is often influenced not only by ratio scale variables but also by qualitative (nominal scale) variables. The majority of machine learning techniques accept only numerical inputs, so these categorical variables must be converted into numerical values using encoding techniques. If a variable has no relation or order between its values, assigning numbers will mislead the machine learning techniques. This paper presents a modified k-nearest-neighbors algorithm that calculates the distance values of categorical (nominal) variables without encoding them. A student academic performance dataset is used to test the enhanced algorithm. The proposed algorithm outperforms the standard one, which needs nominal variables to be encoded before the distance between them can be calculated. The results show that the proposed algorithm performs 14% better than the standard one in accuracy and that it is not sensitive to outliers.
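The idea of measuring distance on nominal variables without encoding them can be illustrated with a small sketch. The abstract does not specify the paper's distance function, so the code below swaps in a simple 0/1 mismatch (overlap) measure for nominal features combined with squared differences for numeric ones. The function names, the toy records, and the assumption that numeric features are already scaled to [0, 1] are all hypothetical.

```python
from collections import Counter
from math import sqrt


def mixed_distance(a, b, nominal_idx):
    """Distance between two records with mixed feature types.

    Assumption (not from the paper): nominal features contribute 0 when
    equal and 1 when different; numeric features contribute their squared
    difference and are assumed to be pre-scaled to [0, 1].
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in nominal_idx:
            total += 0.0 if x == y else 1.0
        else:
            total += (x - y) ** 2
    return sqrt(total)


def knn_predict(train_X, train_y, query, k, nominal_idx):
    """Majority vote among the k training records closest to the query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: mixed_distance(pair[0], query, nominal_idx),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (scaled attendance, residence type) -> outcome.
train_X = [(0.9, "urban"), (0.8, "urban"), (0.2, "rural"), (0.1, "rural")]
train_y = ["pass", "pass", "fail", "fail"]

prediction = knn_predict(train_X, train_y, (0.85, "urban"), k=3, nominal_idx={1})
print(prediction)
```

Because the nominal column is compared by equality only, no artificial order such as urban = 0, rural = 1 is imposed on its values.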

Detection of Loss Zones while Drilling Using Different Machine Learning Techniques

Journal of Energy Resources Technology ◽

10.1115/1.4051553 ◽

2021 ◽

pp. 1-29

Author(s):

Ahmed Alsaihati ◽

Mahmoud Abughaban ◽

Salaheldin Elkatatny ◽

Abdulazeez Abdulraheem

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Random Forests ◽

Nearest Neighbors ◽

Support Vector ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Vector Machines ◽

Testing Set

Abstract Fluid loss into formations is a common operational issue when drilling across naturally or induced fractured formations. It can pose significant operational risks, such as well-control problems, stuck pipe, and wellbore instability, which in turn increase well time and cost. This research evaluates different machine learning techniques, namely support vector machines, random forests, and K-nearest neighbors, for detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data from seven wells that had suffered partial or severe loss circulation were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score were used to evaluate the ability of each model to detect loss circulation occurrences. The results showed that the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrences in the testing set, while the random forests classifier was second best with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 on the testing set. K-nearest neighbors outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared with previous studies is that it identifies loss events based on real-time measurements of the active pit volume.
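For readers unfamiliar with the metrics named in this abstract, the following minimal sketch computes precision, recall, and F1-score for a binary detector from their standard definitions. The label vectors are made-up illustrations, not data from the study, and the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Standard binary precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = loss circulation event, 0 = normal drilling.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so it stays low unless both missed events and false alarms are kept down, which is why it suits detection tasks where events are much rarer than normal operation.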

Real Time Efficient Accident Predictor System using Machine Learning Techniques (kNN, RF, LR, DT)

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.d6910.1210220 ◽

2020 ◽

Vol 10 (2) ◽

pp. 108-111

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

Classification Accuracy ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Classification Methods ◽

K Nearest Neighbors ◽

A real-time crash predictor system determines the frequency and the severity of crashes. Machine learning based methods are now used to predict the total number of crashes. In this project, the prediction accuracy of machine learning algorithms such as Decision Tree (DT), K-nearest neighbors (KNN), Random Forest (RF), and Logistic Regression (LR) is evaluated. The performance of these classification methods is assessed in terms of accuracy. The dataset used for this project was obtained from 49 states of the US and 27 states of India and contains 2.25 million US crash records and 1.16 million Indian crash records, respectively. Results show that Random Forest (RF) achieves a classification accuracy of 96%, higher than the other classification methods.

Machine Learning Techniques for Determining Students' Academic Performance: A Sustainable Development Case for Engineering Education

2020 International Conference on Decision Aid Sciences and Application (DASA) ◽

10.1109/dasa51403.2020.9317178 ◽

2020 ◽

Author(s):

Sujan Poudyal ◽

Morteza Nagahi ◽

Mohammad Nagahisarchoghaei ◽

Ghodsieh Ghanbari

Keyword(s):

Machine Learning ◽

Sustainable Development ◽

Academic Performance ◽

Engineering Education

A Comparison of Machine Learning Techniques for the Prediction of the Student’s Academic Performance

Emerging Trends in Computing and Expert Technology - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-3-030-32150-5_107 ◽

2019 ◽

pp. 1052-1062

Author(s):

Jyoti Kumari ◽

R. Venkatesan ◽

T. Jemima Jebaseeli ◽

V. Abisha Felsit ◽

K. Salai Selvanayaki ◽

...

Keyword(s):

Machine Learning ◽

Academic Performance ◽

Flood Early Warning Systems Using Machine Learning Techniques: The Case of the Tomebamba Catchment at the Southern Andes of Ecuador

Hydrology ◽

10.3390/hydrology8040183 ◽

2021 ◽

Vol 8 (4) ◽

pp. 183

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Worldwide, machine learning (ML) is increasingly being used for developing flood early warning systems (FEWSs). However, previous studies have not focused on establishing a methodology for determining the most efficient ML technique. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 h using the most common ML techniques, such as multi-layer perceptron (MLP), logistic regression (LR), K-nearest neighbors (KNN), naive Bayes (NB), and random forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1 h and 12 h cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. The proposed methodology for selecting the optimal ML technique for a FEWS can be extrapolated to other case studies. Future efforts are recommended to enhance the input data representation and develop communication applications to boost society's awareness of floods.
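The f1-macro and g-mean scores mentioned above can be sketched for a three-state problem as follows. The abstract does not define its g-mean, so this sketch assumes one common per-class definition, the square root of sensitivity times specificity, computed one-vs-rest. The forecasts are invented for illustration and the function names are hypothetical.

```python
from math import sqrt


def one_vs_rest_counts(y_true, y_pred, cls):
    """Confusion counts for class `cls` against all other classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    tn = sum(1 for t, p in pairs if t != cls and p != cls)
    return tp, fp, fn, tn


def class_f1(y_true, y_pred, cls):
    """F1 for one class, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = one_vs_rest_counts(y_true, y_pred, cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def class_gmean(y_true, y_pred, cls):
    """Assumed definition: sqrt(sensitivity * specificity) for one class."""
    tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, cls)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sensitivity * specificity)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    return sum(class_f1(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical forecasts for the three river states.
states = ["No-alert", "Pre-alert", "Alert"]
y_true = ["No-alert"] * 4 + ["Pre-alert"] * 2 + ["Alert"] * 2
y_pred = ["No-alert", "No-alert", "No-alert", "Pre-alert",
          "Pre-alert", "Alert", "Alert", "Alert"]

print(macro_f1(y_true, y_pred, states))
for state in states:
    print(state, class_gmean(y_true, y_pred, state))
```

Macro averaging weights each state equally regardless of how often it occurs, so rare Alert cases count as much as frequent No-alert cases.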

Assessment of Academic Performance with The E-mental Health Interventions in Virtual Learning Environment Using Machine Learning Techniques: A Hybrid Approach

Journal of Engineering Education Transformations ◽

10.16920/jeet/2021/v34i0/157109 ◽

2021 ◽

Vol 34 (0) ◽

pp. 79

Author(s):

A. Sheik Abdullah ◽

R. M. Abirami ◽

A. Gitwina ◽

C. Varthana

Keyword(s):

Mental Health ◽

Machine Learning ◽

Academic Performance ◽

Learning Environment ◽

Hybrid Approach ◽

Virtual Learning Environment ◽

Health Interventions ◽

Learning Techniques ◽

Mental Health Interventions

Deep Regressor: Cross Subject Academic Performance Prediction System for University Level Students

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1254.09811s19 ◽

2019 ◽

Vol 8 (11S) ◽

pp. 1265-1267

Keyword(s):

Machine Learning ◽

Academic Performance ◽

Evaluation Process ◽

Education Institution ◽

Drop Out ◽

Learning Techniques ◽

Current Program ◽

The Right ◽

The University

Predicting the academic performance of students has been an important research topic in the educational field. The main aim of a higher education institution is to provide quality education for students. One way to achieve a higher quality of education is to predict students' academic performance and thereby take early remedial actions to improve it. This paper presents a system that uses machine learning techniques to classify and predict the academic performance of students at the right time, before drop out occurs. The system first accepts the performance parameters of the basic-level courses that the student has already passed, as these parameters also influence further study. To predict performance in the current program, the system continuously accepts the academic performance parameters after each academic evaluation process. The system employs machine learning techniques to study the academic performance of the students after each evaluation process. The system also learns the basic rules followed by the University for assessing students. Based on the present performance of the students, the system classifies them into different levels and identifies the students at high risk. Earlier prediction can help students adopt suitable measures in advance to improve their performance. The system can also identify the factors affecting the performance of these students, which helps them take remedial measures in advance.

Machine learning in diachronic corpus phonology: mining verse data to infer trajectories in English phonotactics

Papers in Historical Phonology ◽

10.2218/pihph.3.2018.2878 ◽

2018 ◽

Vol 3 ◽

Author(s):

Andreas Baumann

Keyword(s):

Machine Learning ◽

Middle English ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Powerful Method ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Standard Techniques

Machine learning is a powerful method when working with large data sets such as diachronic corpora. However, as opposed to standard techniques from inferential statistics like regression modeling, machine learning is less commonly used among phonological corpus linguists. This paper discusses three different machine learning techniques (K nearest neighbors classifiers; Naïve Bayes classifiers; artificial neural networks) and how they can be applied to diachronic corpus data to address specific phonological questions. To illustrate the methodology, I investigate Middle English schwa deletion and when and how it potentially triggered reduction of final /mb/ clusters in English.

Prediction of Liver Diseases by Using Few Machine Learning Based Approaches

Australian Journal of Engineering and Innovative Technology ◽

10.34104/ajeit.020.085090 ◽

2020 ◽

pp. 85-90

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Liver Diseases ◽

Model Building ◽

Medical Science ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbors ◽

Advancement in medical science has always been one of the most vital concerns of the human race. As technology progresses, modern techniques and equipment are increasingly applied to treatment. Machine learning techniques are now widely used in medical science to improve accuracy. In this work, we developed computational model-building techniques for accurate liver disease prediction. We used several efficient classification algorithms, namely Random Forest, Perceptron, Decision Tree, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), to predict liver diseases. Our work provides an implementation of hybrid model construction and a comparative analysis aimed at improving prediction performance. First, the classification algorithms are applied to the original liver patient datasets collected from the UCI repository. We then analyzed and adjusted the features to improve the performance of our predictor and made a comparative analysis among the classifiers. We found that the KNN algorithm outperformed all other techniques when feature selection was applied.

Flood Early Warning Systems using Machine Learning Techniques. Case the Tomebamba Catchment at the Southern Andes of Ecuador

10.20944/preprints202111.0510.v1 ◽

2021 ◽

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Flood Early Warning Systems (FEWSs) using Machine Learning (ML) have gained worldwide popularity. However, determining the most efficient ML technique is still a bottleneck. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 hours using the most common ML techniques, such as Multi-Layer Perceptron (MLP), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1- and 12-hour cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. Future efforts are recommended to enhance the input data representation and develop communication applications to boost society's awareness of floods.