Performance Evaluation of Machine Learning Predictive Analytical Model for Determining the Job Applicants Employment Status

2021 ◽  
Vol 6 (1) ◽  
pp. 67-79
Author(s):  
Olalekan Awujoola ◽  
Philip O Odion ◽  
Martins E Irhebhude ◽  
Halima Aminu

Several higher institutions of learning have difficulty turning out more than 90% of their graduates as candidates who can competently meet the requirements of the industry. The industry, in turn, has difficulty sourcing skilled tertiary institution graduates who match its needs. The failure or success of any organization depends mostly on how its workforce is recruited and retained. Therefore, the selection of an acceptable candidate for a job position is one of the major problems of management decision-making. This work therefore proposes a modern and accurate machine learning classification model that can be deployed to make predictions and assessments on job applicants' attributes from their academic performance datasets in order to meet the selection criteria for the industry. Both supervised and unsupervised machine learning classifiers were considered in this work. Naïve Bayes, Logistic Regression, support vector machine (SVM), Random Forest, and Decision Tree all performed well, but Logistic Regression outperformed the others with 93% accuracy.

2021 ◽  
Vol 42 (Supplement_1) ◽  
Author(s):  
M J Espinosa Pascual ◽  
P Vaquero Martinez ◽  
V Vaquero Martinez ◽  
J Lopez Pais ◽  
B Izquierdo Coronel ◽  
...  

Introduction: Of all patients admitted with myocardial infarction, 10 to 15% have Myocardial Infarction with Non-Obstructive Coronary Arteries (MINOCA). Classification algorithms based on deep learning substantially exceed traditional diagnostic algorithms. Numerous machine learning models have therefore been proposed as useful tools for detecting various pathologies, but to date no study has proposed a diagnostic algorithm for MINOCA. Purpose: The aim of this study was to estimate the diagnostic accuracy of several machine learning algorithms (Support-Vector Machine [SVM], Random Forest [RF] and Logistic Regression [LR]) in discriminating patients with MINOCA from those with Myocardial Infarction with Obstructive Coronary Artery Disease (MICAD) at the time of admission and before coronary angiography, whether invasive or not. Methods: A diagnostic test evaluation study was carried out by applying the proposed algorithms to a database of 553 consecutive patients admitted to our hospital with myocardial infarction. According to the definitions of the 2016 ESC Position Paper on MINOCA, patients were classified into two groups: MICAD and MINOCA. Of the total 553 patients, 214 were discarded because their data were incomplete. The set of machine learning algorithms was trained on 244 patients (training sample: 75%) and tested on 80 patients (test sample: 25%). A total of 64 variables were available for each patient, including demographic, clinical and laboratory features recorded before the angiographic procedure. Finally, the diagnostic performance of each algorithm was measured.
Results: The most accurate classification model was the Random Forest algorithm (Specificity [Sp] 0.88, Sensitivity [Se] 0.57, Negative Predictive Value [NPV] 0.93, Area Under the Curve [AUC] 0.85 [CI 0.83–0.88]), followed by standard Logistic Regression (Sp 0.76, Se 0.57, NPV 0.92, AUC 0.74) and Support-Vector Machine (Sp 0.84, Se 0.38, NPV 0.90, AUC 0.78) (see graph). The variables that contributed most to discriminating MINOCA from MICAD were the traditional cardiovascular risk factors, biomarkers of myocardial injury, hemoglobin and gender. Results were similar when the 19 patients with Takotsubo syndrome were excluded from the analysis. Conclusion: A prediction system for diagnosing MINOCA before coronary angiography was developed using machine learning algorithms. Results show higher accuracy in diagnosing MINOCA than conventional statistical methods. This study supports the potential of machine learning algorithms in clinical cardiology. However, further studies are required to validate our results. Funding Acknowledgement: Type of funding sources: None. Graph: ROC curves of the different algorithms.
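
The sensitivity, specificity, and negative predictive value quoted in this abstract are all ratios of confusion-matrix counts. The following is a minimal sketch of those definitions, not the authors' code; the function name `diagnostic_metrics` and the counts passed to it are made up for illustration and are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return sensitivity, specificity, and NPV from confusion-matrix counts."""
    return {
        # Share of actual positives that the classifier flags as positive.
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that the classifier flags as negative.
        "specificity": tn / (tn + fp),
        # Share of negative predictions that are truly negative.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A high NPV alongside a modest sensitivity, as in the reported results, is typical when the positive class is rare, because most negative predictions are then correct even if many positives are missed.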


2021 ◽  
pp. 089198872199355
Author(s):  
Anastasia Bougea ◽  
Efthymia Efthymiopoulou ◽  
Ioanna Spanou ◽  
Panagiotis Zikos

Objective: Our aim was to develop a machine learning algorithm, based only on predictors that can be collected non-invasively in the clinic, for the accurate diagnosis of Parkinson's disease dementia (PDD) and dementia with Lewy bodies (DLB). Methods: This is an ongoing prospective cohort study (ClinicalTrials.gov identifier NCT04448340) of 78 PDD and 62 DLB subjects whose diagnostic follow-up is available for at least 3 years after the baseline assessment. We used predictors such as clinico-demographic characteristics and 6 neuropsychological tests (Mini-Mental, PD Cognitive Rating Scale, Brief Visuospatial Memory Test, Symbol Digit written, Wechsler Adult Intelligence Scale, Trail Making A and B). We investigated logistic regression, K-Nearest Neighbors (K-NN), Support Vector Machine (SVM), Naïve Bayes classifier, and an Ensemble model for their ability to predict a PDD or DLB diagnosis. Results: The K-NN classification model had an overall accuracy of 91.2% based on the 15 best clinical and cognitive scores, achieving 96.42% sensitivity and 81% specificity in discriminating between DLB and PDD. The binomial logistic regression classification model achieved an accuracy of 87.5% based on the 15 best features, showing 93.93% sensitivity and 87% specificity. The SVM classification model had an overall accuracy of 84.6% based on the 15 best features, achieving 90.62% sensitivity and 78.58% specificity. A model built on Naïve Bayes classification had 82.05% accuracy, 93.10% sensitivity and 74.41% specificity. Finally, an Ensemble model, synthesized from the individual ones, achieved 89.74% accuracy, 93.75% sensitivity and 85.73% specificity. Conclusion: Machine learning methods predicted a PDD or DLB diagnosis with high accuracy, sensitivity and specificity, based on non-invasive predictors and neuropsychological tests that are easily collected in the clinic.
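
The abstract does not say how the Ensemble model combines the individual classifiers. One common option is majority voting, sketched below purely as an assumption; the function name `majority_vote` and the three prediction lists are hypothetical and do not come from the study.

```python
from collections import Counter


def majority_vote(predictions: list[list[str]]) -> list[str]:
    """Combine per-model label lists into one label per subject by majority vote."""
    combined = []
    # zip(*predictions) groups the votes that all models cast for one subject.
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count; on a tie it
        # returns the label seen first, i.e. the vote of the earliest model.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models for three subjects.
knn_pred = ["DLB", "PDD", "PDD"]
lr_pred = ["DLB", "DLB", "PDD"]
svm_pred = ["PDD", "DLB", "PDD"]

ensemble = majority_vote([knn_pred, lr_pred, svm_pred])
print(ensemble)
```

Using an odd number of models avoids ties in a two-class problem such as PDD versus DLB.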


2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Guangzhou An ◽  
Kazuko Omodaka ◽  
Satoru Tsuda ◽  
Yukihiro Shiga ◽  
Naoko Takada ◽  
...  

This study develops an objective machine-learning classification model for classifying glaucomatous optic discs and reveals the classificatory criteria to assist in clinical glaucoma management. In this study, 163 glaucoma eyes were labelled with four optic disc types by three glaucoma specialists and then randomly separated into training and test data. All the images of these eyes were captured using optical coherence tomography and laser speckle flowgraphy to quantify the ocular structure and blood-flow-related parameters. A total of 91 parameters were extracted from each eye along with the patients’ background information. Machine-learning classifiers, including the neural network (NN), naïve Bayes (NB), support vector machine (SVM), and gradient boosted decision trees (GBDT), were trained to build the classification models, and a hybrid feature selection method that combines minimum redundancy maximum relevance and genetic-algorithm-based feature selection was applied to find the most valid and relevant features for NN, NB, and SVM. A comparison of the performance of the three machine-learning classification models showed that the NN had the best classification performance with a validated accuracy of 87.8% using only nine ocular parameters. These selected quantified parameters enabled the trained NN to classify glaucomatous optic discs with relatively high performance without requiring color fundus images.


Water ◽  
2021 ◽  
Vol 13 (17) ◽  
pp. 2387
Author(s):  
Fernando Salazar ◽  
André Conde ◽  
Joaquín Irazábal ◽  
David J. Vicente

Dam safety assessment is typically made by comparison between the outcome of some predictive model and measured monitoring data. This is done separately for each response variable, and the results are later interpreted before decision making. In this work, three approaches based on machine learning classifiers are evaluated for the joint analysis of a set of monitoring variables: multi-class, two-class and one-class classification. Support vector machines are applied to all prediction tasks, and random forest is also used for multi-class and two-class. The results show high accuracy for multi-class classification, although the approach has limitations for practical use. The performance in two-class classification is strongly dependent on the features of the anomalies to detect and their similarity to those used for model fitting. The one-class classification model based on support vector machines showed high prediction accuracy, while avoiding the need for correctly selecting and modelling the potential anomalies. A criterion for anomaly detection based on model predictions is defined, which results in a decrease in the misclassification rate. The possibilities and limitations of all three approaches for practical use are discussed.


Medicina ◽  
2021 ◽  
Vol 57 (11) ◽  
pp. 1230
Author(s):  
Jae-Geum Shim ◽  
Kyoung-Ho Ryu ◽  
Eun-Ah Cho ◽  
Jin Hee Ahn ◽  
Hong Kyoon Kim ◽  
...  

Background and Objectives: Chronic lower back pain (LBP) is a common clinical disorder. The early identification of patients who will develop chronic LBP would help develop preventive measures and treatment. We aimed to develop machine learning models that can accurately predict the risk of chronic LBP. Materials and Methods: Data from the Sixth Korea National Health and Nutrition Examination Survey conducted in 2014 and 2015 (KNHANES VI-2, 3) were screened for selecting patients with chronic LBP. LBP lasting >30 days in the past 3 months was defined as chronic LBP in the survey. The following classification models with machine learning algorithms were developed and validated to predict chronic LBP: logistic regression (LR), k-nearest neighbors (KNN), naïve Bayes (NB), decision tree (DT), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), and artificial neural network (ANN). The performance of these models was compared with respect to the area under the receiver operating characteristic curve (AUROC). Results: A total of 6119 patients were analyzed in this study, of which 1394 had LBP. The feature selected data consisted of 13 variables. The LR, KNN, NB, DT, RF, GBM, SVM, and ANN models showed performances (in terms of AUROCs) of 0.656, 0.656, 0.712, 0.671, 0.699, 0.660, 0.707, and 0.716, respectively, with ten-fold cross-validation. Conclusions: In this study, the ANN model was identified as the best machine learning classification model for predicting the occurrence of chronic LBP. Therefore, machine learning could be effectively applied in the identification of populations at high risk of chronic LBP.
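
The models in this abstract are ranked by AUROC, which can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition; it is an illustration with made-up labels and scores, not the study's evaluation code, and the function name `auroc` is an assumption.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = chronic LBP) and predicted risk scores.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.4, 0.3]
print(auroc(example_labels, example_scores))
```

This pairwise form is quadratic in the number of samples, so it suits small illustrations; rank-based implementations are preferable for thousands of patients.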


2021 ◽  
Vol 11 (18) ◽  
pp. 8596
Author(s):  
Swetha Chittam ◽  
Balakrishna Gokaraju ◽  
Zhigang Xu ◽  
Jagannathan Sankar ◽  
Kaushik Roy

The material science community has a strong need for a big data repository of material compositions and derived analytics of metal strength. Currently, many researchers maintain their own Excel sheets, which their teams prepare manually by tabulating experimental data collected from scientific journals, and analyze the data by performing manual calculations with formulas to determine the strength of the material. In this study, we propose big data storage for material science data and its processing parameters to address the laborious process of tabulating data from scientific articles, data mining techniques to retrieve information from databases for big data analytics, and a machine learning prediction model to determine material strength insights. Three models are proposed, based on Logistic Regression, Support Vector Machine (SVM), and Random Forest algorithms. These models are trained and tested using a 10-fold cross-validation approach. The Random Forest classification model performed better on the independent dataset, with 87% accuracy, compared with 72% for Logistic Regression and 78% for SVM.
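
The 10-fold cross-validation mentioned in this abstract partitions the data into ten folds and uses each fold once as the test set while training on the other nine. The following is a minimal sketch of the index bookkeeping only, under the assumption of a simple shuffled, unstratified split; the function name `k_fold_indices` and its parameters are hypothetical, not taken from the paper.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly

    # Taking every k-th index gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]

    for i, test in enumerate(folds):
        # Training set is every fold except the one currently held out.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example with a hypothetical dataset of 25 samples.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so averaging the ten per-fold scores uses every sample for evaluation exactly once.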


2020 ◽  
Vol 4 (1) ◽  
pp. 86-96
Author(s):  
Ricky Risnantoyo ◽  
Arifin Nugroho ◽  
Kresna Mandara

Coronavirus outbreaks, which have occurred in almost all countries in the world, have had an impact not only on the health sector but also on other sectors such as tourism, finance, and transportation. This has raised a variety of public sentiments, with coronavirus emerging as a trending topic on the social media platform Twitter. Twitter is chosen by the public because it disseminates information in real time and shows market reactions quickly. This research uses public tweets related to "Corona Virus" to see how sentiment polarity arises. Text mining techniques and three machine learning classification algorithms, Naive Bayes, Support Vector Machine (SVM), and K-Nearest Neighbor (K-NN), are used to build a model that classifies tweets as having positive, negative, or neutral polarity. The highest test results were produced by the SVM algorithm, with an accuracy of 76.21%, a precision of 78.04%, and a recall of 71.42%. Keywords: Machine Learning, Corona Virus, Twitter, Sentiment Analysis.


Author(s):  
P Sai Teja

Unsolicited e-mail, also known as spam, has become a huge concern for every e-mail user. It has recently become very difficult to filter spam because these e-mails are written in a special manner so that anti-spam filters cannot detect them. This paper compares and reviews the performance metrics of several supervised machine learning techniques, namely SVM (Support Vector Machine), Random Forest, Decision Tree, CNN (Convolutional Neural Network), KNN (K-Nearest Neighbor), MLP (Multi-Layer Perceptron), AdaBoost (Adaptive Boosting), and the Naïve Bayes algorithm, for classifying e-mails as spam. The objective of this study is to consider the content of the e-mails, learn from a finite available dataset, and develop a classification model that can predict whether an e-mail is spam or not.


Malware damages computers without the user's consent and poses various threats without the user's knowledge, so detecting it is crucial. In this study, we propose to detect the presence of malware using machine learning classification. Classification requires the output variable to be categorical; it attempts to draw a conclusion from the observed values. In short, classification constructs a model from the training set and its values and predicts categorical class labels. In our work, we propose to classify the presence of malware using two chief classification algorithms, Support Vector Machine and Logistic Regression. The data set initially used was not satisfactory. Consequently, we explored a data set that met our needs and applied Logistic Regression to it; moreover, we plotted a scattergram for visualization and incorporated XGBoost to improve performance. This study assists in analyzing the presence of malware by adopting a proper dataset and ascertaining the pivotal attributes that lead to this classification.


The advent of the internet has led to the colossal development of e-learning frameworks. The efficiency of such systems, however, relies on effective and fast content-based retrieval approaches. This paper presents a methodology for efficient search and retrieval of lecture videos based on a Machine Learning (ML) text classification algorithm. The text transcript is generated exclusively from the audio content extracted from the video lectures. This content is used for summary and keyword extraction, which in turn is used to train the ML text classification model. An optimized search is achieved based on the trained ML model. The performance of the system is compared by training it with the Naive Bayes, Support Vector Machine and Logistic Regression algorithms. Performance was evaluated by the precision, recall, F-score and accuracy of the search for each of the classifiers. It is observed that the system trained on the Naive Bayes classification algorithm achieved better performance both in terms of time and with respect to the relevancy of the search results.

