Comparative Analysis of Machine Learning Algorithms for Multi-Syndrome Classification of Neurodegenerative Syndromes

Abstract Importance The entry of artificial intelligence into medicine is pending. Several methods have been used for predictions from structured neuroimaging data, yet no study has compared them in this context. Objective Multi-class prediction is key for building computational aid systems for differential diagnosis. We compared support vector machine, random forest, gradient boosting, and deep feed-forward neural networks for the classification of different neurodegenerative syndromes based on structural magnetic resonance imaging. Design, Setting, and Participants Atlas-based volumetry was performed on multi-centric T1-weighted MRI data from 940 subjects, i.e. 124 healthy controls and 816 patients with ten different neurodegenerative diseases, leading to a multi-diagnostic multi-class classification task with eleven different classes. Interventions n.a. Main Outcomes and Measures Cohen’s Kappa, Accuracy, and F1-score to assess model performance. Results Overall, the neural network produced both the best performance measures and the most robust results. The smaller classes, however, were better classified by either the ensemble learning methods or the support vector machine, while performance measures for small classes were comparatively low, as expected. Diseases with regionally specific and pronounced atrophy patterns were generally better classified than diseases with widespread and rather weak atrophy. Conclusions and Relevance Our study furthermore underlines the necessity of larger data sets but also calls for a careful consideration of different machine learning methods that can handle the type of data and the classification task best. Trial Registration n.a.
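The three outcome measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the toy labels are hypothetical, and the abstract does not say how the F1-score was averaged across classes, so macro averaging is assumed here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of marginal label counts, summed over classes.
    p_expected = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    # Undefined (division by zero) when p_expected == 1, i.e. when truth and
    # prediction both contain a single identical class.
    return (p_observed - p_expected) / (1 - p_expected)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy three-class example with hypothetical labels.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro averaging weights every class equally, it exposes weak performance on small classes that overall accuracy can hide, which matters for an imbalanced eleven-class task like the one described.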


MODIS-FIRMS and ground-truthing based wildfire likelihood mapping of Sikkim Himalaya using machine learning algorithms.

10.21203/rs.3.rs-750123/v1 ◽

2021 ◽

Author(s):

Polash Banerjee

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Tree Cover ◽

Anthropogenic Factors ◽

Gradient Boosting ◽

Support Vector ◽

Learning Methods ◽

Sikkim Himalaya ◽

Environmental Features ◽

Machine Learning Methods

Abstract Wildfires of limited extent and intensity can be a boon for the forest ecosystem. However, the 2019 wildfires in Australia and Brazil are sad reminders of their heavy ecological and economic costs. Understanding the role of environmental factors in the likelihood of wildfires in a spatial context would be instrumental in mitigating them. In this study, 14 environmental features encompassing meteorological, topographical, ecological, in situ and anthropogenic factors have been considered for preparing the wildfire likelihood map of Sikkim Himalaya. A comparative study of the efficiency of machine learning methods, namely Generalized Linear Model (GLM), Support Vector Machine (SVM), Random Forest (RF) and Gradient Boosting Model (GBM), has been performed to identify the best performing algorithm for wildfire prediction. The study indicates that all the machine learning methods are good at predicting wildfires. However, RF performed best, followed by GBM. Also, environmental features like average temperature, average wind speed, proximity to roadways and tree cover percentage are the most important determinants of wildfires in Sikkim Himalaya. This study can be considered as a decision support tool for preparedness, efficient resource allocation and sensitization of people towards mitigation of wildfires in Sikkim.


Tremor Identification Using Machine Learning in Parkinson's Disease

Early Detection of Neurological Disorders Using Machine Learning Systems - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-5225-8567-1.ch008 ◽

2019 ◽

pp. 128-151

Author(s):

Angana Saikia ◽

Vinayak Majhi ◽

Masaraf Hussain ◽

Sudip Paul ◽

Amitava Datta

Keyword(s):

Machine Learning ◽

Parkinson’S Disease ◽

Support Vector Machine ◽

Parkinson's Disease ◽

Discriminant Analysis ◽

Learning Algorithms ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector

Tremor is an involuntary quivering movement or shake. Characteristically occurring at rest, the classic slow, rhythmic tremor of Parkinson's disease (PD) typically starts in one hand, foot, or leg and can eventually affect both sides of the body. The resting tremor of PD can also occur in the jaw, chin, mouth, or tongue. Loss of dopamine leads to the symptoms of Parkinson's disease, which may include a tremor. For some people, a tremor might be the first symptom of PD. Various studies have proposed measurement technologies and analyzed the characteristics of Parkinsonian tremors using different techniques. Various machine learning algorithms such as a support vector machine (SVM) with three kernels, a discriminant analysis, a random forest, and a kNN algorithm are also used to classify and identify various kinds of tremors. This chapter presents an in-depth review of the identification and classification of various Parkinsonian tremors using machine learning algorithms.


Comparison of machine learning algorithms applied to symptoms to determine infectious causes of death in children: national survey of 18,000 verbal autopsies in the Million Death Study in India

BMC Public Health ◽

10.1186/s12889-021-11829-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Susan Idicula-Thomas ◽

Ulka Gawde ◽

Prabhat Jha

Keyword(s):

Machine Learning ◽

Verbal Autopsy ◽

Causes Of Death ◽

Prediction Models ◽

Machine Learning Algorithms ◽

Classification And Regression Tree ◽

Gradient Boosting ◽

Support Vector ◽

Diarrhoeal Diseases

Abstract Background Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research. In this study, we have explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS). Methods From MDS, 18,826 unique childhood deaths at ages 1–59 months during the period 2004–13 were selected for generating the prediction models; over 70% of these deaths were caused by six infectious diseases (pneumonia, diarrhoeal diseases, malaria, fever of unknown origin, meningitis/encephalitis, and measles). Six popular ML-based algorithms (support vector machine, gradient boosting modeling, C5.0, artificial neural network, k-nearest neighbor, and classification and regression tree) were used for building the CoD prediction models. Results The SVM algorithm was the best performer with a prediction accuracy of over 0.8. The highest accuracy was found for diarrhoeal diseases (accuracy = 0.97) and the lowest was for meningitis/encephalitis (accuracy = 0.80). The top signs/symptoms for classification of these CoDs were also extracted for each of the diseases. A combination of signs/symptoms presented by the deceased individual can effectively lead to the CoD diagnosis. Conclusions Overall, this study affirms that verbal autopsy tools are efficient in CoD diagnosis and that automated classification parameters captured through ML could be added to verbal autopsies to improve classification of causes of death.


Tremor Identification Using Machine Learning in Parkinson's Disease

Research Anthology on Diagnosing and Treating Neurocognitive Disorders ◽

10.4018/978-1-7998-3441-0.ch018 ◽

2021 ◽

pp. 341-365

Author(s):

Angana Saikia ◽

Vinayak Majhi ◽

Masaraf Hussain ◽

Sudip Paul ◽

Amitava Datta

Keyword(s):

Machine Learning ◽

Parkinson’S Disease ◽

Support Vector Machine ◽

Parkinson's Disease ◽

Discriminant Analysis ◽

Learning Algorithms ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector


Machine Learning Methods for Fear Classification Based on Physiological Features

Sensors ◽

10.3390/s21134519 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4519

Author(s):

Livia Petrescu ◽

Cătălin Petrescu ◽

Ana Oprea ◽

Oana Mitruț ◽

Gabriela Moise ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Dimensionality Reduction ◽

Data Augmentation ◽

Binary Classification ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Physiological Data ◽

Gradient Boosting ◽

Support Vector

This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We mapped between the discrete and dimensional emotional information using the participants’ ratings and extracted a substantial set of 40 types of features from the physiological data. These features were the input to various machine learning algorithms (Decision Trees, k-Nearest Neighbors, Support Vector Machine and artificial networks), accompanied by dimensionality reduction, feature selection and tuning of the most relevant hyperparameters to boost classification accuracy. Our methodology addressed several issues: resolving the problem of an imbalanced dataset through data augmentation, reducing overfitting, computing various metrics in order to obtain the most reliable classification scores, and applying the Local Interpretable Model-Agnostic Explanations method to interpret and explain predictions in a human-understandable manner. The results show that fear can be predicted very well (accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and Support Vector Machine) by extracting the most relevant features from the physiological data and by searching for the best parameters which maximize the machine learning algorithms’ classification scores.


Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets

Briefings in Bioinformatics ◽

10.1093/bib/bbaa321 ◽

2020 ◽

Author(s):

Zhenxing Wu ◽

Minfeng Zhu ◽

Yu Kang ◽

Elaine Lai-Han Leung ◽

Tailong Lei ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Gaussian Process Regression ◽

Principal Component ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Data Sets ◽

Linear Svm

Abstract Although a wide variety of machine learning (ML) algorithms have been utilized to learn quantitative structure–activity relationships (QSARs), there is no agreed single best algorithm for QSAR learning. Therefore, a comprehensive understanding of the performance characteristics of popular ML algorithms used in QSAR learning is highly desirable. In this study, five linear algorithms [linear function Gaussian process regression (linear-GPR), linear function support vector machine (linear-SVM), partial least squares regression (PLSR), multiple linear regression (MLR) and principal component regression (PCR)], three analogizers [radial basis function support vector machine (rbf-SVM), K-nearest neighbor (KNN) and radial basis function Gaussian process regression (rbf-GPR)], six symbolists [extreme gradient boosting (XGBoost), Cubist, random forest (RF), multiple adaptive regression splines (MARS), gradient boosting machine (GBM), and classification and regression tree (CART)] and two connectionists [principal component analysis artificial neural network (pca-ANN) and deep neural network (DNN)] were employed to learn the regression-based QSAR models for 14 public data sets comprising nine physicochemical properties and five toxicity endpoints. The results show that rbf-SVM, rbf-GPR, XGBoost and DNN generally perform better than the other algorithms. The overall performances of different algorithms can be ranked from the best to the worst as follows: rbf-SVM > XGBoost > rbf-GPR > Cubist > GBM > DNN > RF > pca-ANN > MARS > linear-GPR ≈ KNN > linear-SVM ≈ PLSR > CART ≈ PCR ≈ MLR. In terms of prediction accuracy and computational efficiency, SVM and XGBoost are recommended for regression learning on small data sets, and XGBoost is an excellent choice for large data sets. We then investigated the performances of the ensemble models by integrating the predictions of multiple ML algorithms. The results show that ensembles of two or three algorithms from different categories can indeed improve on the predictions of the best individual ML algorithms.
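The abstract does not state how the predictions of multiple algorithms were integrated. As one possible illustration only, the Python sketch below assumes a simple unweighted average of per-sample regression predictions; the function names and all numbers are hypothetical toy values, not results from the study.

```python
def average_ensemble(predictions):
    """Average per-sample predictions from several models.

    predictions: list of equal-length lists, one list per model.
    Unweighted averaging is an assumption made for this sketch.
    """
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return (sum(squared) / len(squared)) ** 0.5


# Toy data: one model overestimates, the other underestimates.
y_true = [1.0, 2.0, 3.0]
model_a = [1.2, 2.2, 3.2]
model_b = [0.8, 1.8, 2.8]
ensemble = average_ensemble([model_a, model_b])

print(rmse(y_true, model_a), rmse(y_true, model_b), rmse(y_true, ensemble))
```

Averaging helps most when the individual models make errors in different directions, which is one plausible reading of why combining algorithms from different categories improved predictions.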


An Empirical Analysis of Evolved Radial Basis Function Networks and Support Vector Machines with Mixture of Kernels

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821301550013x ◽

2015 ◽

Vol 24 (04) ◽

pp. 1550013 ◽

Cited By ~ 5

Author(s):

Ch. Sanjeev Kumar Dash ◽

Pulak Sahoo ◽

Satchidananda Dehuri ◽

Sung-Bae Cho

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machine ◽

Radial Basis Function ◽

Basis Function ◽

Machine Learning Algorithms ◽

Support Vector ◽

Imbalanced Dataset ◽

Radial Basis

Classification is one of the most fundamental and formidable tasks in many domains, including the biomedical domain. In most biomedical datasets, the data are distributed unevenly across the predefined classes (i.e., the classes are imbalanced). Many mathematical, statistical, and machine learning approaches have been developed for the classification of biomedical datasets with varying degrees of success. This paper analyzes the empirical performance of two leading machine learning algorithms designed for classification, each extended with a novel element to address the problem of imbalanced datasets. The evolved radial basis function network with a novel kernel and the support vector machine with a mixture of kernels are designed for the classification of imbalanced datasets. The experimental outcome shows that both algorithms are promising compared to simple radial basis function neural networks and support vector machines, respectively. However, on average, the support vector machine with a mixture of kernels is better than the evolved radial basis function neural network.


Prediction of longitudinal facial crack in steel thin slabs funnel mold using different machine learning algorithms

International Journal of Innovation Science ◽

10.1108/ijis-09-2020-0172 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Kushalkumar Thakkar ◽

Suhas Suresh Ambekar ◽

Manoj Hudnurkar

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Manufacturing Companies ◽

Content Type ◽

Steel Manufacturing

Purpose Longitudinal facial cracks (LFC) are one of the major defects occurring in the continuous-casting stage of thin slab casters using funnel molds. Longitudinal cracks occur mainly owing to non-uniform cooling, varying thermal conductivity along the mold length, use of high superheat during casting and improper casting powder characteristics. These defects are difficult to capture and are visible only in the final stages of the process or even at the customer end. In addition, the defect is seasonal: its intensity increases during the winter. To address the issue, a model based on data analytics is developed. Design/methodology/approach Around six months of data from a steel manufacturing process are taken and around 60 data collection points are analyzed. The model uses different classification machine learning algorithms such as logistic regression, decision tree, ensemble methods of decision trees, support vector machine and Naïve Bayes (at different cut-off levels) to investigate the data. Findings The proposed research framework shows that most of the models give good results at cut-off levels between 0.6 and 0.8, and that random forest, gradient boosting for decision trees and support vector machine perform better than the other models. Practical implications Based on the model's predictions, steel manufacturing companies can identify the optimal operating range in which this defect can be reduced. Originality/value An analytical approach to identifying LFC defects provides objective models for the reduction of LFC defects. By reducing LFC defects, the quality of steel can be improved.
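The role of the cut-off level can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the paper's model: it supposes each classifier outputs a defect probability, that a sample is flagged as defective when the probability reaches the cut-off, and it uses hypothetical function names and toy values.

```python
def classify_at_cutoff(probabilities, cutoff):
    """Label a sample 1 (defect) when its predicted probability reaches the cut-off."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy predicted probabilities and true labels (hypothetical values).
probabilities = [0.95, 0.75, 0.65, 0.55, 0.30]
y_true = [1, 1, 0, 1, 0]

for cutoff in (0.6, 0.7, 0.8):
    y_pred = classify_at_cutoff(probabilities, cutoff)
    print(cutoff, precision_recall(y_true, y_pred))
```

Raising the cut-off generally trades recall for precision, so comparing models across a range such as 0.6 to 0.8 shows where that trade-off is acceptable for a given process.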


Classification of Child Items in a Gold Tree using Support Vector Machine Classifier

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d8026.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 3208-3216

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Support Vector Machine Classifier ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Svm Classifier ◽

Main Application ◽

Novel Approach

Sorting images has been a challenge for machine learning algorithms over the years. Various algorithms have been proposed to sort images, but none of them sorts the images clearly. The drawback of the existing systems is that the sorted image is not clearly identified. To overcome this drawback, we propose a novel approach to sort the children of a tree and match them with the existing designs. The images are sorted on the basis of their class. The images are collected and manually binned. The images are then split for training and testing. GLCM features are extracted from the training and testing images and fed to the SVM classifier, which then classifies the images. Around 7000 images are used to train the SVM and for classification. More than 300 different classes have been created in the database for comparison. Real-time images of child items are captured and fed to the SVM for classification. The main application of this approach is distinguishing the designs in ornaments, so that the various parts of the ornaments can be differentiated clearly. Thus, the proposed method is more precise than the existing methods.


Application of machine learning in the process of classification of advertised jobs

IJEEC - INTERNATIONAL JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTING ◽

10.7251/ijeec2002093c ◽

2020 ◽

Vol 4 (2) ◽

Author(s):

Branislava Cvijetic ◽

Zaharije Radivojevic

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Data Sources ◽

Support Vector ◽

Official Statistics ◽

External Data

Institutions that provide official statistics tend to use external data sources such as administrative data sources besides regular statistical surveys. In addition to these data sources, Big Data has become recognized as a new data source for providers of official statistics. Classification of textual data is one of the elementary tasks for providers of official statistics, regardless of data source. This paper presents the application of traditional machine learning algorithms, Multinomial Naive Bayes and Support Vector Machine, to the classification of advertised jobs according to ISCO-08. The paper presents the methods of collecting data on advertised jobs from four websites and the procedures for creating a multilingual dataset. There are different types of text preprocessing, such as converting uppercase letters into lowercase letters, stopword removal, punctuation mark removal, lemmatization, correction of commonly misspelled words, and reduction of replicated characters. We hypothesized that applying different combinations of preprocessing methods influences the text classification results. Two experiments were conducted to test the hypothesis. The results of both experiments showed that using the Support Vector Machine algorithm on the created dataset gives better results than Multinomial Naive Bayes. The experiments showed that the proposed algorithms performed well, with an overall accuracy of up to 90% but with different accuracy for individual classes due to the imbalanced dataset.
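Several of the preprocessing steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the stopword list is a tiny hypothetical English sample, only ASCII punctuation is handled, and lemmatization and misspelling correction are left out because they depend on language-specific resources the abstract does not name.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list used only for illustration.
STOPWORDS = {"a", "an", "the", "and", "for", "of", "in", "to", "we", "are"}


def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, strip punctuation, reduce replicated characters, drop stopwords."""
    text = text.lower()
    # Remove ASCII punctuation marks (string.punctuation covers ASCII only).
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of three or more identical characters down to two,
    # so legitimate doubles such as "oo" in "good" are preserved.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Split on whitespace and drop stopwords.
    return [token for token in text.split() if token not in stopwords]


print(preprocess("We are HIRING!!! A Seniooor developer, for the Backend."))
```

Each step could be toggled independently to build the different preprocessing combinations that the hypothesis in the abstract refers to.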
