Comparison of Machine-Learning Classification Models for Glaucoma Management

This study develops an objective machine-learning classification model for classifying glaucomatous optic discs and reveals the classificatory criteria to assist in clinical glaucoma management. In this study, 163 glaucoma eyes were labelled with four optic disc types by three glaucoma specialists and then randomly separated into training and test data. All the images of these eyes were captured using optical coherence tomography and laser speckle flowgraphy to quantify the ocular structure and blood-flow-related parameters. A total of 91 parameters were extracted from each eye along with the patients’ background information. Machine-learning classifiers, including the neural network (NN), naïve Bayes (NB), support vector machine (SVM), and gradient boosted decision trees (GBDT), were trained to build the classification models, and a hybrid feature selection method that combines minimum redundancy maximum relevance and genetic-algorithm-based feature selection was applied to find the most valid and relevant features for NN, NB, and SVM. A comparison of the performance of the three machine-learning classification models showed that the NN had the best classification performance with a validated accuracy of 87.8% using only nine ocular parameters. These selected quantified parameters enabled the trained NN to classify glaucomatous optic discs with relatively high performance without requiring color fundus images.

Supervised machine learning based liver disease prediction approach with LASSO feature selection

Bulletin of Electrical Engineering and Informatics ◽ 10.11591/eei.v10i6.3242 ◽ 2021 ◽ Vol 10 (6) ◽ pp. 3369-3376

Author(s): Saima Afrin ◽ F. M. Javed Mehedi Shamrat ◽ Tafsirul Islam Nibir ◽ Mst. Fahmida Muntasim ◽ Md. Shakil Moharram ◽ ...

Keyword(s): Machine Learning ◽ Feature Selection ◽ Liver Disease ◽ Decision Tree ◽ Medical Science ◽ Supervised Machine Learning ◽ Gradient Boosting ◽ Support Vector ◽ Machine Learning Classification ◽ Prediction Approach

The use of machine learning techniques is increasing rapidly in medical science for detecting various diseases such as liver disease (LD). Around the globe, a large number of people die because of this deadly disease. Diagnosing the disease at an early stage allows early treatment, which can help cure the patient. In this research paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting and support vector machine (SVM). We also applied a least absolute shrinkage and selection operator (LASSO) feature selection technique to our dataset to suggest the attributes most highly correlated with LD. The predictions made by the algorithms with 10-fold cross-validation (CV) are evaluated in terms of accuracy, sensitivity, precision and F1-score. The decision tree algorithm has the best performance with the inclusion of LASSO, with accuracy, precision, sensitivity and F1-score values of 94.295%, 92%, 99% and 96% respectively. Furthermore, a comparison with recent studies is shown to demonstrate the significance of the proposed system.
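The evaluation described above (10-fold cross-validation scored by accuracy, precision, sensitivity and F1-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_indices` and `classification_metrics` are hypothetical, the fold assignment is a simple unshuffled interleaving chosen for brevity, and the confusion counts in the example are invented for illustration.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices 0..n_samples-1 into k disjoint test folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity (recall) and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical confusion counts, not taken from the paper.
example = classification_metrics(tp=90, fp=10, fn=5, tn=45)
folds = k_fold_indices(n_samples=25, k=10)
```

In a full cross-validation loop, each fold would serve once as the test set while the remaining indices train the classifier, and the per-fold metrics would then be averaged.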

An analysis of PCOS disease prediction model using machine learning classification algorithms

Recent Patents on Engineering ◽ 10.2174/1872212115999201224130204 ◽ 2020 ◽ Vol 15

Author(s): Shivani Aggarwal ◽ Kavita Pandey

Keyword(s): Machine Learning ◽ Insulin Resistance ◽ Feature Selection ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Support Vector ◽ Classification Algorithms ◽ Metabolic Abnormalities ◽ Related Disorder ◽ Machine Learning Classification

Background: Polycystic ovary syndrome, commonly known as PCOS, affects up to 18% of women of reproductive age. PCOS is the most commonly occurring hormone-related disorder. Its symptoms include irregular periods, increased facial and body hair growth, weight gain, darkening of the skin, diabetes and trouble conceiving (infertility). Patients suffering from PCOS have also been found to have a range of metabolic abnormalities. These abnormalities can lead to disorders that increase the risk of insulin resistance, type 2 diabetes and impaired glucose tolerance (a sign of prediabetes). Family members of women suffering from PCOS are also at higher risk of developing the same metabolic abnormalities. Obesity and overweight status contribute to insulin resistance in PCOS. Objective: Several new technologies are now available to diagnose PCOS. One of them is machine learning, because its algorithms can adapt when exposed to new data. These algorithms learn from past experience to produce reliable and repeatable decisions. In this article, machine learning algorithms are used to identify the important features for diagnosing PCOS. Methods: Several classification algorithms, such as support vector machine (SVM), logistic regression, gradient boosting, random forest, decision tree and k-nearest neighbor (KNN), use well-organized test datasets to classify large numbers of records. Initially, a dataset of 541 instances and 41 attributes was taken for the prediction models, and manual feature selection was performed on it. Results: After feature selection, a set of 12 attributes was identified as playing a crucial role in diagnosing PCOS. Conclusion: Much research is progressing towards diagnosing PCOS, but until now the relevant features had not been identified.

Performance Evaluation of Machine Learning Predictive Analytical Model for Determining the Job Applicants Employment Status

Malaysian Journal of Applied Sciences ◽ 10.37231/myjas.2021.6.1.276 ◽ 2021 ◽ Vol 6 (1) ◽ pp. 67-79

Author(s): Olalekan Awujoola ◽ Philip O Odion ◽ Martins E Irhebhude ◽ Halima Aminu

Keyword(s): Machine Learning ◽ Logistic Regression ◽ Job Applicants ◽ Classification Model ◽ Management Decision ◽ Support Vector ◽ Tertiary Institution ◽ Machine Learning Classification ◽ Higher Institution ◽ Management Decision Making

Several higher institutions of learning face difficulty in turning out more than 90% of their graduates who can competently meet the requirements of the industry. At the same time, the industry is confronted with the difficulty of sourcing skilled tertiary institution graduates that match its needs. The failure or success of any organization depends mostly on how its workforce is recruited and retained. Therefore, the selection of a satisfactory candidate for a job position is one of the major problems of management decision-making. This work proposes an accurate machine learning classification model that can be deployed to make predictions and assessments on job applicants' attributes from their academic performance datasets in order to meet the selection criteria for the industry. Both supervised and unsupervised machine learning classifiers were considered in this work. Naïve Bayes, logistic regression, support vector machine (SVM), random forest and decision tree performed well, but logistic regression outperformed the others with 93% accuracy.

Feature Selection and Machine Learning Classification for Malware Detection

Jurnal Teknologi ◽ 10.11113/jt.v77.3558 ◽ 2015 ◽ Vol 77 (1) ◽ Cited By ~ 13

Author(s): Ban Mohammed Khammas ◽ Alireza Monemi ◽ Joseph Stephen Bassi ◽ Ismahani Ismail ◽ Sulaiman Mohd Nor ◽ ...

Keyword(s): Machine Learning ◽ Feature Selection ◽ Computer Security ◽ Malware Detection ◽ Principal Component ◽ Machine Learning Techniques ◽ Detection Methods ◽ Support Vector ◽ Machine Learning Classification ◽ Minimum Number

Malware is a computer security problem that can morph to evade traditional detection methods based on known signature matching. Since new malware variants contain patterns that are similar to those in observed malware, machine learning techniques can be used to identify new malware. This work presents a comparative study of several feature selection methods with four different machine learning classifiers in the context of static malware detection based on n-gram analysis. The results show that Principal Component Analysis (PCA) feature selection combined with Support Vector Machine (SVM) classification gives the best classification accuracy using a minimum number of features.
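As a rough illustration of the n-gram features that static detection of this kind starts from, the sketch below counts overlapping byte n-grams in a binary blob and turns them into a presence vector over a fixed vocabulary. The abstract does not say whether the n-grams are taken over bytes, opcodes or something else, so the byte-level choice here is an assumption, and the names `byte_ngrams` and `presence_vector` are hypothetical.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A blob shorter than n yields an empty range, hence an empty Counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def presence_vector(data: bytes, vocabulary, n: int = 2):
    """Encode a blob as 0/1 flags, one per n-gram in a fixed vocabulary."""
    grams = byte_ngrams(data, n)
    return [1 if gram in grams else 0 for gram in vocabulary]


# Hypothetical six-byte sample; bytes 0x4d 0x5a spell "MZ".
sample = bytes.fromhex("4d5a90004d5a")
counts = byte_ngrams(sample, n=2)
vector = presence_vector(sample, vocabulary=[b"MZ", b"\x00\x00"], n=2)
```

Vectors like `vector` would then go through a feature selection step and into a classifier; neither step is shown here.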

Minimum Mapping from EMG Signals at Human Elbow and Shoulder Movements into Two DoF Upper-Limb Robot with Machine Learning

Machines ◽ 10.3390/machines9030056 ◽ 2021 ◽ Vol 9 (3) ◽ pp. 56

Author(s): Pringgo Widyo Laksono ◽ Takahide Kitamura ◽ Joseph Muguro ◽ Kojiro Matsushita ◽ Minoru Sasaki ◽ ...

Keyword(s): Machine Learning ◽ Decision Tree ◽ Nearest Neighbor ◽ Energy Operator ◽ Classification Model ◽ Robotic Arm ◽ Support Vector ◽ Classification Models ◽ Elbow Extension ◽ K Nearest Neighbor

This research focuses on the minimum process for classifying three human upper-arm movements (elbow extension, shoulder extension, and combined shoulder and elbow extension) from three electromyography (EMG) signals in order to control a robotic arm with 2 degrees of freedom (DoF). The proposed minimum process consists of four parts: time division of the data, the Teager–Kaiser energy operator (TKEO), conventional EMG feature extraction (i.e., the mean absolute value (MAV), zero crossings (ZC), slope-sign changes (SSC), and waveform length (WL)), and eight major machine learning models (i.e., decision tree (medium), decision tree (fine), k-nearest neighbor (KNN) (weighted KNN and fine KNN), support vector machine (SVM) (cubic and fine Gaussian SVM), and ensemble (bagged trees and subspace KNN)). We then compare and investigate 48 classification models (47 proposed models and 1 conventional model) based on five healthy subjects. The results showed that all the classification models achieved accuracies ranging between 74% and 98%, and that the processing time was below 40 ms, indicating an acceptable controller delay for robotic arm control. Moreover, we confirmed that the classification model with no time division, with TKEO, and with ensemble (subspace KNN) had the best performance, with an accuracy rate of 96.67, a recall rate of 99.66, and a precision rate of 96.99. In short, the combination of the proposed TKEO and ensemble (subspace KNN) plays an important role in achieving the EMG classification.
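The TKEO and feature-extraction stages named above can be sketched in a few lines. The sketch uses the standard discrete form of the operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and common textbook definitions of MAV, ZC, SSC and WL. The zero thresholds, the function names and the sample window are assumptions for illustration, and the paper's exact definitions may differ.

```python
def tkeo(x):
    """Discrete Teager-Kaiser energy operator: x[n]^2 - x[n-1] * x[n+1]."""
    return [x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, len(x) - 1)]


def mean_absolute_value(x):
    return sum(abs(v) for v in x) / len(x)


def zero_crossings(x):
    # Count sign changes between consecutive samples (no amplitude threshold).
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def slope_sign_changes(x):
    # A sample counts when the slope changes sign around it (local extremum).
    return sum(
        1
        for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i] - x[i + 1]) > 0
    )


def waveform_length(x):
    # Cumulative absolute difference between consecutive samples.
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def emg_features(window):
    """Apply TKEO to one window, then extract the four time-domain features."""
    if len(window) < 3:
        raise ValueError("window needs at least 3 samples for TKEO")
    energy = tkeo(window)
    return {
        "MAV": mean_absolute_value(energy),
        "ZC": zero_crossings(energy),
        "SSC": slope_sign_changes(energy),
        "WL": waveform_length(energy),
    }


# Hypothetical six-sample window, not real EMG data.
window = [0.0, 1.0, -1.0, 2.0, -2.0, 0.5]
features = emg_features(window)
```

With three EMG channels, one such feature dictionary per channel would be concatenated into the input vector for the classifier.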

Anomaly Detection in Dam Behaviour with Machine Learning Classification Models

Water ◽ 10.3390/w13172387 ◽ 2021 ◽ Vol 13 (17) ◽ pp. 2387

Author(s): Fernando Salazar ◽ André Conde ◽ Joaquín Irazábal ◽ David J. Vicente

Keyword(s): Machine Learning ◽ Support Vector Machines ◽ Anomaly Detection ◽ Classification Model ◽ Misclassification Rate ◽ Joint Analysis ◽ Support Vector ◽ Machine Learning Classification ◽ Vector Machines ◽ One Class Classification

Dam safety assessment is typically made by comparison between the outcome of some predictive model and measured monitoring data. This is done separately for each response variable, and the results are later interpreted before decision making. In this work, three approaches based on machine learning classifiers are evaluated for the joint analysis of a set of monitoring variables: multi-class, two-class and one-class classification. Support vector machines are applied to all prediction tasks, and random forest is also used for multi-class and two-class. The results show high accuracy for multi-class classification, although the approach has limitations for practical use. The performance in two-class classification is strongly dependent on the features of the anomalies to detect and their similarity to those used for model fitting. The one-class classification model based on support vector machines showed high prediction accuracy, while avoiding the need for correctly selecting and modelling the potential anomalies. A criterion for anomaly detection based on model predictions is defined, which results in a decrease in the misclassification rate. The possibilities and limitations of all three approaches for practical use are discussed.

Capturing user sentiments for online Indian movie reviews

The Electronic Library ◽ 10.1108/el-04-2017-0075 ◽ 2018 ◽ Vol 36 (4) ◽ pp. 677-695 ◽ Cited By ~ 3

Author(s): Shrawan Kumar Trivedi ◽ Shubhamoy Dey ◽ Anil Kumar

Keyword(s): Machine Learning ◽ Feature Selection ◽ Comparative Study ◽ Sentiment Analysis ◽ Language Processing ◽ Classification Model ◽ Support Vector ◽ Content Type ◽ Machine Learning Classifiers ◽ Learning Classifiers

Purpose: Sentiment analysis and opinion mining are emerging areas of research for analyzing Web data and capturing users’ sentiments. This research aims to present sentiment analysis of an Indian movie review corpus using natural language processing and various machine learning classifiers. Design/methodology/approach: In this paper, a comparative study between three machine learning classifiers (Bayesian, naïve Bayesian and support vector machine [SVM]) was performed. All the classifiers were trained on the words/features extracted from the corpus using five different feature selection algorithms (Chi-square, info-gain, gain ratio, one-R and relief-F [RF] attributes), and a comparative study was performed between them. The classifiers and feature selection approaches were evaluated using different metrics (F-value, false-positive [FP] rate and training time). Findings: The results of this study show that, for the maximum number of features, the RF feature selection approach was found to be the best, with better F-values, a low FP rate and less time needed to train the classifiers, whereas for the least number of features, one-R was better than RF. When the evaluation was performed for machine learning classifiers, SVM was found to be superior, although the Bayesian classifier was comparable with SVM. Originality/value: This is novel research in which Indian review data were collected and a classification model for sentiment polarity (positive/negative) was then constructed.
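Of the five feature selection algorithms listed, Chi-square is the simplest to show in isolation. The sketch below scores each word by the chi-square statistic of a 2x2 table (word present or absent against positive or negative review) and keeps the top-scoring words. It is an illustrative sketch only: the function names, the binary presence encoding and the four toy reviews are assumptions, not the authors' pipeline.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a: positive docs containing the term, b: negative docs containing it,
    c: positive docs without the term,    d: negative docs without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or a class never varies
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(docs, labels, top_k):
    """Return the top_k terms by chi-square score; ties break alphabetically."""
    vocabulary = sorted({token for doc in docs for token in doc})
    n_pos = sum(1 for label in labels if label == "pos")
    n_neg = len(labels) - n_pos
    scores = {}
    for term in vocabulary:
        a = sum(1 for doc, y in zip(docs, labels) if y == "pos" and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if y != "pos" and term in doc)
        scores[term] = chi_square_2x2(a, b, n_pos - a, n_neg - b)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical tokenized reviews, invented for illustration.
docs = [["great", "plot"], ["great", "acting"], ["boring", "plot"], ["boring", "acting"]]
labels = ["pos", "pos", "neg", "neg"]
selected = rank_terms(docs, labels, top_k=2)
```

Words that appear equally often in both classes, such as "plot" here, score zero and drop out, which is the behaviour a sentiment classifier wants from its feature set.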

Machine Learning Approaches to Predict Chronic Lower Back Pain in People Aged over 50 Years

Medicina ◽ 10.3390/medicina57111230 ◽ 2021 ◽ Vol 57 (11) ◽ pp. 1230

Author(s): Jae-Geum Shim ◽ Kyoung-Ho Ryu ◽ Eun-Ah Cho ◽ Jin Hee Ahn ◽ Hong Kyoon Kim ◽ ...

Keyword(s): Machine Learning ◽ Back Pain ◽ Lower Back Pain ◽ Machine Learning Algorithms ◽ Classification Model ◽ Gradient Boosting ◽ Support Vector ◽ Chronic Lower Back Pain ◽ Machine Learning Classification ◽ Lower Back

Background and Objectives: Chronic lower back pain (LBP) is a common clinical disorder. The early identification of patients who will develop chronic LBP would help develop preventive measures and treatment. We aimed to develop machine learning models that can accurately predict the risk of chronic LBP. Materials and Methods: Data from the Sixth Korea National Health and Nutrition Examination Survey conducted in 2014 and 2015 (KNHANES VI-2, 3) were screened for selecting patients with chronic LBP. LBP lasting >30 days in the past 3 months was defined as chronic LBP in the survey. The following classification models with machine learning algorithms were developed and validated to predict chronic LBP: logistic regression (LR), k-nearest neighbors (KNN), naïve Bayes (NB), decision tree (DT), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), and artificial neural network (ANN). The performance of these models was compared with respect to the area under the receiver operating characteristic curve (AUROC). Results: A total of 6119 patients were analyzed in this study, of which 1394 had LBP. The feature selected data consisted of 13 variables. The LR, KNN, NB, DT, RF, GBM, SVM, and ANN models showed performances (in terms of AUROCs) of 0.656, 0.656, 0.712, 0.671, 0.699, 0.660, 0.707, and 0.716, respectively, with ten-fold cross-validation. Conclusions: In this study, the ANN model was identified as the best machine learning classification model for predicting the occurrence of chronic LBP. Therefore, machine learning could be effectively applied in the identification of populations at high risk of chronic LBP.
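The AUROC used for the comparison above has a simple pairwise reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition and then picks the best model from the AUROC values reported in the abstract. The function name `auroc` and the four toy scores are assumptions for illustration. The pairwise loop is quadratic in the number of cases, which is fine for a sketch but not for large cohorts.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted risks, not taken from the study.
toy_auc = auroc(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.5, 0.1])

# AUROC values as reported in the abstract (ten-fold cross-validation).
reported_auroc = {
    "LR": 0.656, "KNN": 0.656, "NB": 0.712, "DT": 0.671,
    "RF": 0.699, "GBM": 0.660, "SVM": 0.707, "ANN": 0.716,
}
best_model = max(reported_auroc, key=reported_auroc.get)
```

On the reported figures, `best_model` is the ANN, which matches the abstract's conclusion. Its margin over NB (0.712) and SVM (0.707) is small.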

Sentiment Analysis on Corona Virus Pandemic Using Machine Learning Algorithm

Journal of Informatics and Telecommunication Engineering ◽ 10.31289/jite.v4i1.3798 ◽ 2020 ◽ Vol 4 (1) ◽ pp. 86-96

Author(s): Ricky Risnantoyo ◽ Arifin Nugroho ◽ Kresna Mandara

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Sentiment Analysis ◽ Learning Algorithm ◽ Classification Model ◽ Support Vector ◽ K Nearest Neighbor ◽ The Public ◽ Machine Learning Classification ◽ Corona Virus

Corona virus outbreaks, which have occurred in almost all countries in the world, have an impact not only on the health sector but also on other sectors such as tourism, finance, and transportation. This has raised a variety of sentiments from the public, with the corona virus emerging as a trending topic on the social media platform Twitter. Twitter was chosen by the public because it can disseminate information in real time and shows market reactions quickly. This research uses public tweets related to "Corona Virus" to see how the sentiment polarity arises. Text mining techniques and three machine learning classification algorithms, namely Naive Bayes, Support Vector Machine (SVM), and K-Nearest Neighbor (K-NN), are used to build a model that classifies tweet sentiment as positive, negative, or neutral. The highest test results are generated by the SVM algorithm, with an accuracy of 76.21%, a precision of 78.04%, and a recall of 71.42%. Keywords: Machine Learning, Corona Virus, Twitter, Sentiment Analysis.

EEG based Drowsiness Prediction Using Machine Learning Approach

Webology ◽ 10.14704/web/v18i2/web18351 ◽ 2021 ◽ Vol 18 (2) ◽ pp. 740-755

Author(s): V. Vijay Priya ◽ M. Uma

Keyword(s): Machine Learning ◽ Brain Activity ◽ Classification Model ◽ Economic Losses ◽ Feature Engineering ◽ Signal Acquisition ◽ Classification Models ◽ Injury Death ◽ Machine Learning Classification ◽ Driver Drowsiness

Drowsiness is the main cause of road accidents, and it leads to severe physical injury, death, and significant economic losses. Previous research has monitored driver drowsiness with various methods, including behavioural, vehicle-based, physiological and hybrid measures. This paper focuses on physiological methods for predicting driver drowsiness. Among these methods, electroencephalography (EEG) is a non-invasive way to measure the brain activity of the subject. EEG signals recorded from the human scalp are analysed with various features and used in health applications such as predicting drowsiness and fatigue. The main objective of the proposed system is to predict driver drowsiness early and with high accuracy, so we divided our work into two steps. In the first step, we collected a publicly available EEG eye-state dataset (eye open and eye closed), in which the signals were acquired with an Emotiv EEG Neuroheadset (14 electrodes), and analysed it with various feature engineering and statistical techniques. In the second step, the K-NN machine learning classification model and performance-based prediction models were applied. Existing systems used various machine learning classification models, such as K-NN and SVM, for EEG eye-state classification and reported results of around 80% to 97%. Compared with existing systems, our proposed method produced better classification models for predicting driver drowsiness using different feature engineering processes, and the K-NN classification model achieved 98% accuracy.
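The K-NN classifier at the centre of the second step can be sketched as follows: a query is assigned the majority label among its k nearest training points. This is a minimal illustration with Euclidean distance and unweighted voting. The function name `knn_predict`, the two-dimensional toy features and the labels are assumptions, whereas the real dataset has 14 electrode channels per sample.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query.
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors standing in for engineered EEG features.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["open", "open", "open", "closed", "closed", "closed"]

near_open = knn_predict(train_x, train_y, query=(0.2, 0.1), k=3)
near_closed = knn_predict(train_x, train_y, query=(5.5, 5.2), k=3)
```

Because K-NN relies on raw distances, features on different scales are normally standardized first, which is one place where the choice of feature engineering can change accuracy.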
