Modelling Student Employability on an Academic Basis

Student Skill ◽

Academic Scores ◽

Global Ageing

With population growth and a scarcity of employment opportunities, the placement of students has become a significant concern. Problems of global ageing and a mismatch between student skills and knowledge can be witnessed easily. Little literature is available on predicting the placement of students. This study aims to create a supervised machine learning (SML) model to predict the employability of graduates based on their academic scores and streams. The study used the decision-tree technique to create the SML model. The model can predict the placement chance based on students' academic scores and streams with 65% accuracy. Some new theoretical and practical contributions are discussed.

Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm

Applied Sciences ◽

10.3390/app11156728 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6728

Author(s):

Muhammad Asfand Hafeez ◽

Muhammad Rashid ◽

Hassan Tariq ◽

Zain Ul Abideen ◽

Saud S. Alotaibi ◽

...

Keyword(s):

Machine Learning ◽

Tabu Search ◽

Decision Tree ◽

Decision Trees ◽

Search Algorithm ◽

Learning Algorithms ◽

Performance Comparison ◽

Tabu Search Algorithm

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of the decision tree have been proposed; however, this area is still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, our proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering from the disease. The experimental results were obtained using our proposed classifier based on the built-in scikit-learn library in Python. An extensive performance comparison with conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms and the area under the receiver operating characteristic (AUROC) of 0.95 for the proposed method reveal that the proposed classifier algorithm is convenient for large datasets.
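The AUROC figure reported in the abstract above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition in plain Python. It is not the authors' implementation, and the function name `auroc` and the sample labels and scores are assumptions chosen for illustration.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    y_true holds 0/1 labels; scores holds the classifier's scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
labels = [0, 0, 1, 1]
predicted_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, predicted_scores))
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch; a rank-based formulation is the usual choice for large datasets.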

Comparison of the Performance of Machine Learning Algorithms in Predicting Heart Disease

Frontiers in Health Informatics ◽

10.30699/fhi.v10i1.349 ◽

2021 ◽

Vol 10 (1) ◽

pp. 99

Author(s):

Sajad Yousefi

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Heart Disease ◽

Decision Tree ◽

Roc Curve ◽

Learning Models ◽

Algorithm Performance ◽

Machine Learning Models

Introduction: Heart disease is often associated with conditions such as clogged arteries due to sediment accumulation, which causes chest pain and heart attack. Many people die of heart disease every year. Most countries have a shortage of cardiovascular specialists, and thus a significant percentage of misdiagnoses occurs. Hence, predicting this disease is a serious issue. Using machine learning models applied to a multidimensional dataset, this article aims to find the most efficient and accurate machine learning models for disease prediction. Material and Methods: Several algorithms were utilized to predict heart disease, among which the supervised machine learning algorithms Decision Tree, Random Forest and KNN are frequently mentioned. The algorithms are applied to a dataset of 294 samples taken from the UCI repository. The dataset includes heart disease features. To enhance algorithm performance, these features are analyzed, and feature importance scores and cross-validation are considered. Results: The performance of the algorithms is compared using the ROC curve and criteria such as accuracy, precision, sensitivity and F1 score, which were evaluated for each model. Accuracy and AUC ROC are 83% and 99%, respectively, for the Decision Tree algorithm. The Logistic Regression algorithm, with accuracy and AUC ROC of 88% and 91% respectively, performs better than the other algorithms. Therefore, these techniques can be useful for physicians to predict heart disease in patients and prescribe for them correctly. Conclusion: Machine learning techniques can be used in medicine to analyze data collections related to a disease and to predict it. The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms for predicting heart disease are compared to determine the most appropriate classifier. As a result of the evaluation, better performance was observed in both the Decision Tree and Logistic Regression models.
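The evaluation criteria named in the abstract above (accuracy, precision, sensitivity and F1 score) all derive from the four cells of a binary confusion matrix. A minimal sketch in plain Python follows. The function name `classification_metrics` and the sample labels are assumptions for illustration and do not reproduce the study's data or code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, sensitivity (recall) and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical ground truth and predictions, for illustration only.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(truth, preds))
```

Reporting sensitivity alongside accuracy matters in medical screening, because a model can score high accuracy on an imbalanced dataset while missing most positive cases.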

Analysis of Decision Tree Induction Algorithms

Research Society and Development ◽

10.33448/rsd-v8i11.1473 ◽

2019 ◽

Vol 8 (11) ◽

pp. e298111473

Author(s):

Hugo Kenji Rodrigues Okada ◽

Andre Ricardo Nascimento das Neves ◽

Ricardo Shitsuka

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Decision Trees ◽

Quantitative Study ◽

Data Structures ◽

Execution Time ◽

Decision Tree Induction ◽

Classification And Regression ◽

Cart Algorithm

Decision trees are data structures or computational methods that enable nonparametric supervised machine learning and are used in classification and regression tasks. The aim of this paper is to present a comparison between the decision tree induction algorithms C4.5 and CART. A quantitative study is performed in which the two methods are compared by analyzing the following aspects: operation and complexity. The experiments yielded practically equal hit percentages; in execution time for tree induction, however, the CART algorithm was approximately 46.24% slower than C4.5 and was considered to be more effective.
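One well-known operational difference between the two algorithms compared above is the split criterion: CART classification trees typically use Gini impurity, while C4.5 uses the gain ratio (information gain normalized by split information). The sketch below computes both for a candidate split in plain Python. The helper names and the toy labels are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def _proportions(labels):
    """Class proportions of a list of labels (empty list gives no proportions)."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini(labels):
    """Gini impurity, the usual CART classification criterion."""
    return 1.0 - sum(p * p for p in _proportions(labels))


def entropy(labels):
    """Shannon entropy in bits, the basis of the C4.5 criterion."""
    return -sum(p * math.log2(p) for p in _proportions(labels))


def gini_gain(parent, children):
    """Reduction in Gini impurity achieved by splitting parent into children."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)


def gain_ratio(parent, children):
    """C4.5-style gain ratio: information gain divided by split information."""
    n = len(parent)
    info_gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum(
        (len(ch) / n) * math.log2(len(ch) / n) for ch in children if ch
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Toy split, for illustration only: a perfectly separating binary split.
parent = ["a", "a", "b", "b"]
children = [["a", "a"], ["b", "b"]]
print(gini_gain(parent, children), gain_ratio(parent, children))
```

The normalization by split information is what lets C4.5 penalize attributes that fragment the data into many small branches.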

Encrypted DNP3 Traffic Classification Using Supervised Machine Learning Algorithms

Machine Learning and Knowledge Extraction ◽

10.3390/make1010022 ◽

2019 ◽

Vol 1 (1) ◽

pp. 384-399 ◽

Cited By ~ 2

Author(s):

Thais de Toledo ◽

Nunzio Torrisi

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Smart Grids ◽

Learning Algorithms ◽

Electric Utility ◽

Support Vector ◽

Communication Link

The Distributed Network Protocol (DNP3) is predominately used by the electric utility industry and, consequently, in smart grids. The Peekaboo attack was created to compromise DNP3 traffic, in which a man-in-the-middle on a communication link can capture and drop selected encrypted DNP3 messages by using support vector machine learning algorithms. The communication networks of smart grids are an important part of their infrastructure, so it is of critical importance to keep this communication secure and reliable. The main contribution of this paper is to compare the use of machine learning techniques to classify messages of the same protocol exchanged in encrypted tunnels. The study considers four simulated cases of encrypted DNP3 traffic scenarios and four different supervised machine learning algorithms: decision tree, nearest-neighbor, support vector machine, and naive Bayes. The results obtained show that it is possible to extend a Peekaboo attack over multiple substations, using a decision tree learning algorithm, and to gather significant information from a system that communicates using encrypted DNP3 traffic.

Supervised machine learning based liver disease prediction approach with LASSO feature selection

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i6.3242 ◽

2021 ◽

Vol 10 (6) ◽

pp. 3369-3376

Author(s):

Saima Afrin ◽

F. M. Javed Mehedi Shamrat ◽

Tafsirul Islam Nibir ◽

Mst. Fahmida Muntasim ◽

Md. Shakil Moharram ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Liver Disease ◽

Decision Tree ◽

Medical Science ◽

Gradient Boosting ◽

Support Vector ◽

Machine Learning Classification ◽

Prediction Approach

In this contemporary era, the use of machine learning techniques is increasing rapidly in the field of medical science for detecting various diseases such as liver disease (LD). Around the globe, a large number of people die because of this deadly disease. By diagnosing the disease at an early stage, early treatment can help to cure the patient. In this research paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting and support vector machine (SVM). We also applied a least absolute shrinkage and selection operator (LASSO) feature selection technique to our dataset to suggest the most highly correlated attributes of LD. The predictions made by the algorithms with 10-fold cross-validation (CV) are tested in terms of accuracy, sensitivity, precision and F1-score values to forecast the disease. It is observed that the decision tree algorithm has the best performance score, with accuracy, precision, sensitivity and F1-score values of 94.295%, 92%, 99% and 96% respectively with the inclusion of LASSO. Furthermore, a comparison with recent studies is shown to prove the significance of the proposed system.
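The 10-fold cross-validation mentioned in the abstract above partitions the samples into ten folds, each used once as the test set while the remaining nine form the training set. A minimal sketch of the index bookkeeping in plain Python follows. The function name `k_fold_indices` is hypothetical, no shuffling or stratification is applied, and the paper does not state which splitting variant it used.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV.

    Folds are contiguous and unshuffled; the first n_samples % k folds
    receive one extra sample so that every index is tested exactly once.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# Illustration with a hypothetical dataset of 25 samples.
for train, test in k_fold_indices(25, k=10):
    print(len(train), test)
```

In practice the per-fold scores are averaged, which gives a more stable estimate than a single train/test split on a small medical dataset.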

Predictive model construction for prediction of soil fertility using decision tree machine learning algorithm

Kongunadu Research Journal ◽

10.26524/krj.2021.5 ◽

2021 ◽

Vol 8 (1) ◽

pp. 30-35

Author(s):

Jayalakshmi R ◽

Savitha Devi M

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Soil Fertility ◽

Learning Algorithms ◽

Crop Productivity ◽

Support Vector ◽

Severe Problem ◽

Agriculture Sector

The agriculture sector is recognized as the backbone of the Indian economy and plays a crucial role in the growth of the nation’s economy. It depends on weather and other environmental aspects. Some of the factors on which agriculture is reliant are soil, climate, flooding, fertilizers, temperature, precipitation, crops, insecticides, and herb. Soil fertility is dependent on these factors and hence difficult to predict. However, the agriculture sector in India is facing the severe problem of increasing crop productivity. Farmers lack essential knowledge of the nutrient content of the soil and of selecting the crop best suited to the soil, and they also lack efficient methods for predicting crops well in advance so that appropriate methods can be used to improve crop productivity. This paper presents different supervised machine learning algorithms, such as Decision Tree, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), to predict the fertility of soil based on the macro-nutrient and micro-nutrient status found in the dataset. The supervised machine learning algorithms are applied to the training dataset and tested with the test dataset, and the implementation of these algorithms is done using the R tool. The performance analysis of these algorithms is done using different evaluation metrics such as mean absolute error, cross-validation, and accuracy. Result analysis shows that the decision tree produced the best accuracy of 99% with a very low mean square error (MSE) rate.
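The two error metrics named in the abstract above differ in how they weight deviations: mean absolute error averages the absolute differences, while mean square error averages their squares and so penalizes large errors more heavily. A minimal sketch in plain Python follows. The study itself used R, and the function names and sample values here are assumptions for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of (true - predicted)^2 over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical encoded fertility classes, for illustration only.
actual = [1, 2, 3]
predicted = [1, 2, 5]
print(mean_absolute_error(actual, predicted))
print(mean_squared_error(actual, predicted))
```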

E-Antenatal assistance care using decision tree analytics and cluster analytics based supervised machine learning

2017 International Conference on IoT and Application (ICIOT) ◽

10.1109/iciota.2017.8073617 ◽

2017 ◽

Cited By ~ 1

Author(s):

G. Saranya ◽

G. Geetha ◽

M. Safa

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Supervised Machine Learning

Prediction of Autism Spectrum Disorder Using Supervised Machine Learning Algorithms

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2019.8.3.2734 ◽

2019 ◽

Vol 8 (3) ◽

pp. 15-18

Author(s):

T. Lakshmi Praveena ◽

N. V. Muthu Lakshmi

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Autism Spectrum Disorder ◽

Random Forest ◽

Decision Tree ◽

Autism Spectrum ◽

Early Years ◽

Spectrum Disorder ◽

Treatment Mechanisms

Autism appears to be a neurodevelopmental disorder that is visible in the early years. It is a wide-spectrum disorder, meaning that the severity and symptoms can vary from person to person. The Centre for Disease Control found that one in 68 was diagnosed with autism spectrum disorder, with numbers increasing every year. Detection of autism in adults is a cumbersome procedure because, in adults, many symptoms can blend with those of other mental health and motor impairment disorders, so misinterpretation of the actual disease can in turn lead to a terrible life without proper diagnosis and effective treatment mechanisms. Machine learning is a powerful computer tool that supports different application domains by learning complex relationships or patterns from large datasets to draw accurate conclusions. Disease assessment can be done with predictive health data analysis and more appropriate treatment mechanisms, which are now a hot area of research. Supervised learning is an important part of machine learning that uses a rule-based approach, examining empirical data sets to build accurate predictive models. In this paper, decision tree, random forest, SVM and neural network algorithms are applied to autism spectrum data collected from the UCI repository. The results of these algorithms on the autism dataset are presented in this paper in an efficient manner. Analysis is performed on these results, which will be useful for making the right decisions in predicting autism spectrum disorder (ASD) at early stages. Thus, early autism intervention using machine learning techniques opens up a new way for autistic individuals to develop the potential to lead a better life by improving their behavioural and emotional skills.

Classification of Agriculture Farm Machinery Using Machine Learning and Internet of Things

Symmetry ◽

10.3390/sym13030403 ◽

2021 ◽

Vol 13 (3) ◽

pp. 403

Author(s):

Muhammad Waleed ◽

Tai-Won Um ◽

Tariq Kamal ◽

Syed Muhammad Usman

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Farm Machinery ◽

Learning Techniques

In this paper, we apply multi-class supervised machine learning techniques to classify agriculture farm machinery. The classification of farm machinery is important when performing the automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). For training the model, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. The preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed big variations in vibration and tilt, but similar means. Additionally, the vibration-based and tilt-based classifications of farm machinery show good accuracy when used alone (with vibration showing slightly better numbers than tilt). However, the accuracies improve further when both (tilt and vibration) are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, but random forest was the best performing. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most time.

Breast cancer prediction using an optimal machine learning technique for next generation sequences

Concurrent Engineering ◽

10.1177/1063293x21991808 ◽

2021 ◽

pp. 1063293X2199180

Author(s):

Babymol Kurian ◽

VL Jyothi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Decision Tree ◽

Classification Model ◽

Support Vector ◽

Next Generation ◽

Machine Learning Technique ◽

Cancer Prediction ◽

Learning Technique

Cancer prediction and detection using Next Generation Sequencing (NGS) with the application of artificial intelligence is highly appreciated in the current medical field. Next generation sequences were extracted from the NCBI (National Centre for Biotechnology Information) gene repository. Sequences of normal Homo sapiens (Class 1), BRCA1 (Class 2) and BRCA2 (Class 3) were extracted for Machine Learning (ML) purposes. The total volume of datasets extracted for the process was 1580 in number, under four categories of 50, 100, 150 and 200 sequences. The breast cancer prediction process was carried out in three major steps: feature extraction, machine learning classification and performance evaluation. The features were extracted with sequences as input. Ten features of DNA sequences, such as ORF (Open Reading Frame) count, individual nucleobase average count of A, T, C, G, AT and GC-content, AT/GC composition, G-quadruplex occurrence and MR (Mutation Rate), were extracted from the three types of sequences for the classification process. The sequence type was also included as a target variable in the feature set, with values 0, 1 and 2 for classes 1, 2 and 3 respectively. Nine supervised machine learning techniques, namely LR (Logistic Regression statistical model), LDA (Linear Discriminant Analysis model), k-NN (k nearest neighbours’ algorithm), DT (Decision Tree technique), NB (Naive Bayes classifier), SVM (Support-Vector Machine algorithm), RF (Random Forest learning algorithm), AdaBoost (AB) and Gradient Boosting (GB), were employed on the four categories of datasets. Of all the supervised models, the decision tree machine learning technique performed best, with a maximum classification accuracy of 94.03%. Classification model performance was evaluated using precision, recall, F1-score and support values, wherein the F1-score was most similar to the classification accuracy.
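Several of the sequence features listed in the abstract above (nucleobase counts, AT and GC content, AT/GC composition) can be derived directly from the raw DNA string. The sketch below shows one plausible way to compute them in plain Python. The exact feature definitions used by the authors are not given in the abstract, so the function name `sequence_features`, the use of per-base fractions, and the example sequence are all assumptions for illustration.

```python
from collections import Counter


def sequence_features(sequence):
    """Compute simple composition features of a DNA sequence.

    Returns the fraction of each nucleobase, the AT and GC content,
    and the AT/GC ratio (infinity when the sequence has no G or C).
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    counts = Counter(seq)
    length = len(seq)
    a, t, c, g = (counts.get(base, 0) for base in "ATCG")

    at_content = (a + t) / length
    gc_content = (g + c) / length
    return {
        "A": a / length,
        "T": t / length,
        "C": c / length,
        "G": g / length,
        "AT_content": at_content,
        "GC_content": gc_content,
        "AT_GC_ratio": at_content / gc_content if gc_content else float("inf"),
    }


# Hypothetical short sequence, for illustration only.
print(sequence_features("ATGCGC"))
```

Features of this kind form one row per sequence in the feature set, with the class label (0, 1 or 2 in the study) as the target column.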