IoT System for School Dropout Prediction Using Machine Learning Techniques Based on Socioeconomic Data

School dropout affects every teaching modality and causes social, economic, political, and academic harm to everyone involved in the educational process. Dropout figures for higher education courses reveal how fragile education is, particularly in underdeveloped countries. In this context, this paper presents an Internet of Things (IoT) framework that predicts dropout from socioeconomic data using machine learning methods: Decision Tree, Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Multilayer Perceptron, and Deep Learning. Because socioeconomic information is collected on the pre-registration form, students likely to drop out can be identified at the moment of pre-registration. The paper proposes automating the prediction process with a method that obtains information that would be difficult and time-consuming for humans to gather, contributing to more accurate prediction. With the advent of IoT, it is possible to build an efficient and flexible tool for management and service-related issues that predicts dropout among new students entering higher education courses, allowing personalized follow-up to reverse a possible dropout. The approach was validated using accuracy, F1 score, recall, and precision. With Decision Tree, the developed system obtained 99.34% accuracy, a 99.34% F1 score, 100% recall, and 98.69% precision. The system is therefore a viable option for universities that want to identify students likely to leave.
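The reported F1 score can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below only does that arithmetic; the function name `f1_from_precision_recall` is illustrative and does not come from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the Decision Tree model.
precision = 0.9869
recall = 1.0

f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.9934
```

The result agrees with the 99.34% F1 score stated in the abstract.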


A review of machine learning techniques using decision tree and support vector machine

2016 International Conference on Computing Communication Control and automation (ICCUBEA), 2016. DOI: 10.1109/iccubea.2016.7860040. Cited By ~ 14

Author(s): Madan Somvanshi, Pranjali Chavan, Shital Tambade, S. V. Shinde

Keyword(s): Machine Learning, Support Vector Machine, Decision Tree, Machine Learning Techniques, Support Vector, Learning Techniques


Prediction of Liver Diseases by Using Few Machine Learning Based Approaches

Australian Journal of Engineering and Innovative Technology, 2020, pp. 85-90. DOI: 10.34104/ajeit.020.085090

Keyword(s): Machine Learning, Comparative Analysis, Liver Diseases, Model Building, Medical Science, Machine Learning Techniques, Support Vector, Classification Algorithms, K Nearest Neighbors, Learning Techniques

Advancement in medical science has always been one of the most vital concerns of the human race. As technology progresses, modern techniques and equipment are increasingly applied to treatment. Machine learning techniques are now widely used in medical science to improve accuracy. In this work, we construct computational models for accurate liver disease prediction. We use several classification algorithms to predict liver diseases: Random Forest, Perceptron, Decision Tree, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Our work provides an implementation of hybrid model construction and a comparative analysis aimed at improving prediction performance. First, the classification algorithms are applied to the original liver patient datasets collected from the UCI repository. We then analyze and adjust the features to improve the performance of our predictor and compare the classifiers. We found that the KNN algorithm with feature selection outperformed all other techniques.
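For readers unfamiliar with KNN, the following is a minimal sketch of how a k-nearest-neighbors classifier assigns a label by majority vote among the closest training points. It is not the authors' implementation; the function name `knn_predict`, the two-feature toy data, and the labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in distances[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical toy data: two numeric features per patient, not from the paper.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["healthy", "healthy", "healthy", "disease", "disease", "disease"]

print(knn_predict(train_X, train_y, (2.9, 3.1), k=3))  # disease
```

In practice the features would be scaled before computing distances, since KNN is sensitive to differences in feature ranges.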


Computational Identification of Chemical Compounds with Potential Activity against Leishmania amazonensis using Nonlinear Machine Learning Techniques

Current Topics in Medicinal Chemistry, 2019, Vol 18 (27), pp. 2347-2354. DOI: 10.2174/1568026619666181130121558. Cited By ~ 3

Author(s): Juan Alberto Castillo-Garit, Naivi Flores-Balmaseda, Orlando Álvarez, Hai Pham-The, Virginia Pérez-Doñate, ...

Keyword(s): Machine Learning, Computational Models, Neglected Tropical Diseases, Chemical Compounds, Theoretical Models, Leishmania Amazonensis, Machine Learning Techniques, Support Vector, K Nearest Neighbors, Data Set

Leishmaniasis is a poverty-related disease endemic in 98 countries worldwide, with morbidity and mortality increasing daily. All currently used first-line and second-line drugs for the treatment of leishmaniasis have drawbacks, including toxicity, high cost, and route of administration. Consequently, the development of new treatments for leishmaniasis is a priority in the field of neglected tropical diseases. The aim of this work is to develop computational models that allow the identification of new chemical compounds with potential anti-leishmanial activity. A data set of 116 organic chemicals, assayed against promastigotes of Leishmania amazonensis, is used to develop the theoretical models. A compound was considered active if IC50 ≤ 1.5 μM. For this study, we employed Dragon software to calculate the molecular descriptors and WEKA to obtain machine learning (ML) models. All ML models showed accuracy values between 82% and 91% for the training set. The models developed with k-nearest neighbors and classification trees showed sensitivity values of 97% and 100%, respectively, while the models developed with artificial neural networks and support vector machine showed specificity values of 94% and 92%, respectively. To validate our models, an external test set was evaluated, with good behavior for all models. A virtual screening was performed, and 156 compounds were identified as potential anti-leishmanial agents by all the ML models. This investigation highlights the merits of ML-based techniques as an alternative to more traditional methods of finding new chemical compounds with anti-leishmanial activity.
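The activity cutoff and the two reported metrics can be illustrated with a short sketch. It labels compounds as active using the stated IC50 ≤ 1.5 μM threshold and computes sensitivity (true positive rate) and specificity (true negative rate) from a set of predictions. The IC50 values and predictions below are invented for illustration, and the function names are hypothetical; none of this reproduces the Dragon or WEKA workflow used in the paper.

```python
ACTIVE_CUTOFF_UM = 1.5  # cutoff stated in the abstract: IC50 <= 1.5 micromolar


def is_active(ic50_um: float) -> bool:
    """Label a compound as active when its IC50 is at or below the cutoff."""
    return ic50_um <= ACTIVE_CUTOFF_UM


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels and predictions."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical IC50 values (micromolar) and hypothetical model predictions.
ic50_values = [0.4, 1.5, 2.0, 10.0, 0.9, 3.3]
y_true = [is_active(v) for v in ic50_values]
y_pred = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A model with high sensitivity misses few active compounds, while one with high specificity rarely flags inactive compounds, which is why the abstract reports both.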


Detection of Loss Zones while Drilling Using Different Machine Learning Techniques

Journal of Energy Resources Technology, 2021, pp. 1-29. DOI: 10.1115/1.4051553

Author(s): Ahmed Alsaihati, Mahmoud Abughaban, Salaheldin Elkatatny, Abdulazeez Abdulraheem

Keyword(s): Machine Learning, Support Vector Machines, Random Forests, Nearest Neighbors, Machine Learning Techniques, Support Vector, K Nearest Neighbors, Learning Techniques, Vector Machines, Testing Set

Fluid loss into formations is a common operational issue when drilling across naturally fractured or induced-fracture formations. It can pose significant operational risks, such as well control problems, stuck pipe, and wellbore instability, which in turn increase well time and cost. This research uses and evaluates three machine learning techniques (support vector machines, random forests, and K-nearest neighbors) for detecting loss circulation occurrences while drilling using only drilling surface parameters. Actual field data from seven wells that had suffered partial or severe loss circulation were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score were used to evaluate the ability of each model to detect loss circulation occurrences. The K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation in the testing set, while random forests was the second-best classifier with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 on the testing set. K-nearest neighbors outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared with previous studies is that it identifies loss events based on real-time measurements of the active pit volume.
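The evaluation design described above, building models on seven wells and comparing them on an unseen eighth well, amounts to a holdout split by well rather than by individual sample. A minimal sketch of such a split follows; the record fields (`well`, `pit_volume_change`, `loss`) and the generated values are hypothetical placeholders, not the study's data.

```python
def split_by_well(records, test_well):
    """Hold out every record from `test_well`; train on all other wells."""
    train = [r for r in records if r["well"] != test_well]
    test = [r for r in records if r["well"] == test_well]
    return train, test


# Hypothetical records: two samples for each of eight wells.
records = [
    {"well": w, "pit_volume_change": 0.1 * w, "loss": w % 2 == 0}
    for w in range(1, 9)
    for _ in range(2)
]

train, test = split_by_well(records, test_well=8)
print(len(train), len(test))  # 14 2
```

Splitting by well keeps all samples from the held-out well out of training, so the score on that well reflects performance on a well the model has never seen.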


Feature Selection from Lyme Disease Patient Survey Using Machine Learning

Algorithms, 2020, Vol 13 (12), pp. 334. DOI: 10.3390/a13120334

Author(s): Joshua Vendrow, Jamie Haddock, Deanna Needell, Lorraine Johnson

Keyword(s): Machine Learning, Lyme Disease, Large Scale, Disease Patient, Patient Survey, Machine Learning Techniques, Medical Community, Support Vector, Global Rating, K Nearest Neighbors

Lyme disease is a rapidly growing illness that remains poorly understood within the medical community. Critical questions about when and why patients respond to treatment or stay ill, what kinds of treatments are effective, and even how to properly diagnose the disease remain largely unanswered. We investigate these questions by applying machine learning techniques to a large-scale Lyme disease patient registry, MyLymeData, developed by the nonprofit LymeDisease.org. We apply various machine learning methods to measure the effect of individual features in predicting participants’ answers to the Global Rating of Change (GROC) survey questions, which assess the self-reported degree to which their condition improved, worsened, or remained unchanged following antibiotic treatment. We use basic linear regression, support vector machines, neural networks, entropy-based decision tree models, and k-nearest neighbors approaches. We first analyze the general performance of the models and then identify the most important features for predicting participant answers to GROC. After identifying these “key” features, we separate them from the dataset and demonstrate their effectiveness at identifying GROC. In doing so, we highlight possible directions for future study, both mathematical and clinical.


Spatial–Temporal Analysis of Land Cover Change at the Bento Rodrigues Dam Disaster Area Using Machine Learning Techniques

Remote Sensing, 2019, Vol 11 (21), pp. 2548. DOI: 10.3390/rs11212548

Author(s): Dong Luo, Douglas G. Goodin, Marcellus M. Caldas

Keyword(s): Machine Learning, Support Vector Machine, Land Cover, Decision Tree, Machine Learning Algorithms, Training Data, Machine Learning Techniques, Support Vector, Disaster Area, Mine Sites

Disasters change land use and land cover unpredictably. Improving the accuracy of mapping a disaster area at different times is an essential step in analyzing the relationship between human activity and the environment. The goals of this study were to test the performance of different processing procedures and to examine the effect of adding the normalized difference vegetation index (NDVI) as an additional classification feature for mapping land cover changes due to a disaster. Using Landsat ETM+ and OLI images of the Bento Rodrigues mine tailing disaster area, we created two datasets, one with six bands and the other with six bands plus the NDVI. We used support vector machine (SVM) and decision tree (DT) algorithms to build classifier models and validated model performance using 10-fold cross-validation, resulting in accuracies higher than 90%. The processed results indicated that the accuracy could reach or exceed 80%, and the support vector machine performed better than the decision tree. We also calculated each land cover type’s sensitivity (true positive rate) and found that Agriculture, Forest, and Mine sites had higher values, while Bareland and Water had lower values. We then visualized land cover maps for 2000 and 2017 and found that the Mine sites area roughly doubled in size, while Forest decreased by 12.43%. Our findings show that it is feasible to create a training data pool and use machine learning algorithms to classify a different year’s Landsat products, and that NDVI can improve the classification of vegetation-covered land. Furthermore, this approach provides a way to analyze land pattern change in a disaster area over time.
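The second dataset described above appends NDVI to the six spectral bands as a seventh feature. NDVI is conventionally defined as (NIR − Red) / (NIR + Red). The sketch below computes it for one pixel and appends it to the band vector. The abstract does not say which six bands were used, so the band order and the reflectance values here are assumptions for illustration only, and the function names are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    if denom == 0:
        return 0.0  # undefined for zero total reflectance; return 0 by convention here
    return (nir - red) / denom


def add_ndvi_feature(bands, nir_index, red_index):
    """Return the band values with NDVI appended as one extra feature."""
    return list(bands) + [ndvi(bands[nir_index], bands[red_index])]


# Assumed band order: blue, green, red, NIR, SWIR1, SWIR2 (hypothetical reflectances).
pixel = [0.05, 0.08, 0.06, 0.40, 0.20, 0.10]
features = add_ndvi_feature(pixel, nir_index=3, red_index=2)

print(len(features))            # 7
print(round(features[-1], 4))   # 0.7391
```

Dense vegetation reflects strongly in the near infrared and absorbs red light, so high NDVI values help separate vegetated classes such as Forest and Agriculture from Bareland or Water.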


Classification of Agriculture Farm Machinery Using Machine Learning and Internet of Things

Symmetry, 2021, Vol 13 (3), pp. 403. DOI: 10.3390/sym13030403

Author(s): Muhammad Waleed, Tai-Won Um, Tariq Kamal, Syed Muhammad Usman

Keyword(s): Machine Learning, Random Forest, Decision Tree, Supervised Machine Learning, Machine Learning Techniques, Gradient Boosting, Support Vector, Farm Machinery, Learning Techniques

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. Classifying farm machinery is important for automatically authenticating field activity in a remote setup. Without a sound machine recognition system, fraudulent activity is a real possibility. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB). To train the models, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator, and cultivator. Preliminary analysis of the collected data revealed that the farm machinery, when in operation, showed large variations in vibration and tilt but similar means. Vibration-based and tilt-based classifications each show good accuracy when used alone, with vibration performing slightly better than tilt. However, accuracy improves further when tilt and vibration are used together. All five machine learning algorithms achieve an accuracy of more than 82%, with random forest performing best. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.


On the Analysis of Machine Learning Classifiers to Detect Traffic Congestion in Vehicular Networks

2019. DOI: 10.5753/eniac.2019.9290

Author(s): Lucas Carvalho, Maycon Silva, Edimilson Santos, Daniel Guidoni

Keyword(s): Machine Learning, Traffic Congestion, Vehicular Networks, Naive Bayes, Machine Learning Techniques, Support Vector, K Nearest Neighbors, Applied Machine Learning, Routing Methods

Problems related to traffic congestion and management have become common in many cities. Vehicle re-routing methods have therefore been proposed to minimize congestion. Some of these methods apply machine learning techniques, specifically classifiers, to verify road conditions and detect congestion. However, better results may be obtained by applying a classifier better suited to the domain. This paper therefore presents an evaluation of different classifiers applied to identifying the level of road congestion. Our main goal is to analyze the characteristics of each classifier in this task. The classifiers involved in the experiments are Multiple Layer Neural Network (MLP), K-Nearest Neighbors (KNN), Decision Trees (J48), Support Vector Machines (SVM), Naive Bayes, and Tree Augmented Naive Bayes.


An Innovative Method for Predicting and Classifying Inadequate Accuracy in Heart Disease by Using Decision Tree with K-Nearest Neighbors Algorithm

Alinteri Journal of Agricultural Sciences, 2021, Vol 36 (1), pp. 609-615. DOI: 10.47059/alinteri/v36i1/ajas21086

Author(s): Mandhapati Rajesh, Dr.K. Malathi

Keyword(s): Machine Learning, Heart Disease, Decision Tree, Supervised Machine Learning, Machine Learning Techniques, Accuracy Rate, K Nearest Neighbors, Machine Learning Methods, Learning Techniques

Aim: To predict heart disease from the medical parameters of cardiac patients with a good accuracy rate, using machine learning methods such as an innovative Decision Tree (DT) algorithm. Materials and Methods: Supervised machine learning techniques, an innovative Decision Tree (N = 20) and K-Nearest Neighbour (KNN) (N = 20), are run on five different datasets, one at a time, to record five samples. Results: The Decision Tree predicts heart disease from various medical conditions; the accuracy achieved is 98% for DT and 72.2% for KNN. The difference between the two algorithms, Decision Tree and KNN, is reported as statistically insignificant (= .737) under an independent-samples t-test (p < 0.005) at a 95% confidence level. Conclusion: Prediction and classification of heart disease appear to be significantly better with DT than with KNN.


An Ontology Driven System to Predict Diabetes with Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue, 2019, Vol 9 (2), pp. 4005-4011. DOI: 10.35940/ijitee.b7586.129219

Keyword(s): Machine Learning, Support Vector Machine, Decision Tree, Early Stage, Machine Learning Techniques, Support Vector, Classification Algorithms, Machine Learning Classification, Diagnostic Center, Mental Trauma

Diabetes Mellitus (DM) is a chronic disease that causes an increase in blood sugar. Many complications are reported when DM remains unidentified and untreated. Identifying the disease involves considerable physical and mental trauma and effort, including visiting a doctor and taking blood and urine tests at a diagnostic center, all of which is time-consuming. These difficulties can be overcome using machine learning. The idea of the model is to predict the occurrence of diabetes with high accuracy. Therefore, two machine learning classification algorithms, Fine Decision Tree and Support Vector Machine, are used in this experiment to detect diabetes at an early stage.
