Performance Comparison of Machine Learning Classification Algorithms

Author(s):  
K. M. Veena ◽  
K. Manjula Shenoy ◽  
K. B. Ajitha Shenoy
2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Nasser Assery ◽  
Yuan (Dorothy) Xiaohong ◽  
Qu Xiuli ◽  
Roy Kaushik ◽  
Sultan Almalki

Purpose: This study proposes an unsupervised learning model to evaluate the credibility of disaster-related Twitter data and presents a performance comparison with commonly used supervised machine learning models. Design/methodology/approach: First, historical tweets on two recent hurricane events are collected via the Twitter API. Then, a credibility scoring system is implemented in which tweet features are analyzed to assign each tweet a credibility score and a credibility label. After that, supervised classification is performed using various classification algorithms, and their performances are compared. Findings: The proposed unsupervised learning model could enhance emergency response by providing a fast way to determine the credibility of disaster-related tweets. Additionally, the comparison of the supervised classification models reveals that the Random Forest classifier performs significantly better than the SVM and Logistic Regression classifiers in classifying the credibility of disaster-related tweets. Originality/value: In this paper, an unsupervised 10-point scoring model is proposed to evaluate tweets' credibility based on user-based and content-based features. This technique could be used to evaluate the credibility of disaster-related tweets in future hurricanes and has the potential to enhance emergency response during critical events. The comparative study of different supervised learning methods has identified effective methods for evaluating the credibility of Twitter data.
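A rule-based 10-point score of this kind can be illustrated with a minimal Python sketch. The abstract does not list the features, point values or cut-off the authors use, so every feature name, weight and the threshold of 5 below is a hypothetical placeholder. Only the 10-point scale and the split into user-based and content-based features come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # All fields are hypothetical features chosen for illustration only.
    user_verified: bool
    user_followers: int
    account_age_days: int
    has_url: bool
    has_media: bool
    retweet_count: int


def credibility_score(tweet: Tweet) -> int:
    """Return a score from 0 to 10 using made-up point values."""
    score = 0
    # User-based features (hypothetical weights, 5 points in total).
    score += 2 if tweet.user_verified else 0
    score += 2 if tweet.user_followers >= 1000 else 0
    score += 1 if tweet.account_age_days >= 365 else 0
    # Content-based features (hypothetical weights, 5 points in total).
    score += 2 if tweet.has_url else 0
    score += 1 if tweet.has_media else 0
    score += 2 if tweet.retweet_count >= 10 else 0
    return score


def credibility_label(score: int, threshold: int = 5) -> str:
    """Map a score to a label; the threshold is an assumption."""
    return "credible" if score >= threshold else "not credible"


# A made-up tweet: verified, established account, links a source, few retweets.
example = Tweet(True, 2500, 900, True, False, 3)
example_score = credibility_score(example)
example_label = credibility_label(example_score)
```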


2020 ◽  
Author(s):  
Valerio Carruba

Asteroid families are groups of asteroids produced by collisions or by the rotational fission of a parent object. These groups are mainly identified in domains of proper elements or proper frequencies. Because of robotic telescope surveys, the number of known asteroids has increased from about 10,000 in the early 1990s to more than 750,000 today. Traditional approaches for identifying new members of asteroid families, such as the hierarchical clustering method (HCM), may struggle to keep up with the growing rate of new discoveries. Here we used machine learning classification algorithms to identify new family members based on the orbital distribution in proper (a, e, sin(i)) of previously known family constituents. We compared the outcome of nine classification algorithms from stand-alone and ensemble approaches. The Extremely Randomized Trees (ExtraTree) method had the highest precision, making it possible to retrieve up to 97% of the family members identified with standard HCM.


2020 ◽  
Vol 14 ◽  

Breast cancer (BC) is among the most common and leading causes of death in women worldwide. Classification and data analysis tools are now widely used in the medical field for diagnosis, prognosis and decision making to help lower the risk of people dying or suffering from disease. Advanced machine learning methods offer hope for patients because they help doctors detect potentially fatal diseases such as breast cancer early and provide accurate outcomes. However, the results depend heavily on the feature selection and classification techniques used to build a strong machine learning model. In this paper, a performance comparison is conducted using four classifiers, Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) and Random Forest, on the Wisconsin Breast Cancer dataset to identify the most effective predictors. The main goal is to apply the best machine learning classification methods to predict breast cancer as benign or malignant, evaluated in terms of accuracy, F-measure, precision and recall. Experimental results show that Random Forest achieves the highest accuracy, 99.26%, on this dataset and feature set, while SVM and KNN reach 97.78% and 97.04% respectively. MLP shows the lowest accuracy, 94.07%. All experiments are conducted using RStudio as the data mining platform.
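The four evaluation terms named above can all be derived from a binary confusion matrix. The following Python sketch shows the standard definitions. The counts are invented for illustration and are not the paper's results, and the F-measure is assumed to be the balanced F1 score, since the abstract does not state which variant was used.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "malignant" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, NOT taken from the paper.
metrics = classification_metrics(tp=40, fp=2, fn=3, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can look strong when one class dominates, which is a common reason for reporting precision, recall and F-measure alongside it.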


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1518
Author(s):  
Lazar Z. Velimirović ◽  
Radmila Janković ◽  
Jelena D. Velimirović ◽  
Aleksandar Janjić

One way to optimize wastewater treatment system infrastructure, along with its operation, monitoring, maintenance and management, is to develop smart forecasting, monitoring and failure prediction systems using machine learning. The aim of this paper was to develop a model able to predict water pump failure from asymmetrical sensor data such as water levels, capacity, current and flow values. Several machine learning classification algorithms were used for predicting water pump failure. Using the classification algorithms, it was possible to predict future values from a simple input of current values, as well as to predict the probability of each sample belonging to each class. To build the prediction model, an asymmetrical dataset containing the aforementioned variables was used.
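Per-class probabilities of this kind can be illustrated with a small Python sketch. The abstract does not name the classifiers used, so a k-nearest-neighbours vote is substituted here purely as a stand-in. The sensor readings, the choice of three features, the labels and k = 3 are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training rows: (water_level, current, flow) and a label.
TRAIN = [
    ((1.0, 10.0, 50.0), "normal"),
    ((1.1, 10.5, 52.0), "normal"),
    ((0.9, 9.8, 49.0), "normal"),
    ((2.5, 18.0, 20.0), "failure"),
    ((2.7, 19.0, 18.0), "failure"),
    ((2.6, 17.5, 22.0), "failure"),
]


def predict_proba(sample, train=TRAIN, k=3):
    """Return {label: share of the k nearest training rows with that label}."""
    # Sort training rows by Euclidean distance to the sample, keep the k closest.
    nearest = sorted(train, key=lambda row: math.dist(row[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    labels = sorted({label for _, label in train})
    return {label: votes[label] / k for label in labels}


# A made-up reading that lies between the two groups.
proba = predict_proba((1.8, 14.0, 36.0))
predicted = max(proba, key=proba.get)
```

In practice the features would normally be scaled first, because raw Euclidean distance lets the variable with the largest numeric range (here, flow) dominate the vote.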


2019 ◽  
Vol 58 (06) ◽  
pp. 205-212
Author(s):  
Cirruse Salehnasab ◽  
Abbas Hajifathali ◽  
Farkhondeh Asadi ◽  
Elham Roshandel ◽  
Alireza Kazemi ◽  
...  

Background: Acute graft-versus-host disease (aGvHD) is the most important cause of mortality in patients receiving allogeneic hematopoietic stem cell transplantation. Because it manifests at the stage of severe tissue damage, it is diagnosed late. With the advancement of machine learning (ML), promising real-time models to predict aGvHD have emerged. Objective: This article aims to synthesize the literature on ML classification algorithms for predicting aGvHD, highlighting the algorithms and important predictor variables used. Methods: A systematic review of ML classification algorithms used to predict aGvHD was performed by searching the PubMed, Embase, Web of Science, Scopus, Springer, and IEEE Xplore databases up to April 2019, following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement. Studies focusing on the use of ML classification algorithms to predict aGvHD were considered. Results: After applying the inclusion and exclusion criteria, 14 studies were selected for evaluation. The analysis showed that the algorithms used were Artificial Neural Network (79%), Support Vector Machine (50%), Naive Bayes (43%), k-Nearest Neighbors (29%), Regression (29%), and Decision Trees (14%). Many predictor variables were used in these studies, so we grouped them into more abstract categories: biomarkers, demographics, infections, clinical, genes, transplants, drugs, and other variables. Conclusion: Each of these ML algorithms has particular characteristics and different proposed predictors. Therefore, these ML algorithms appear to have high potential for predicting aGvHD if the modeling process is performed correctly.
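The reported shares sum to well over 100%, which suggests that individual studies used more than one algorithm. The short Python check below converts each percentage back to the nearest whole number of studies out of 14. The resulting counts are an inference from the rounded percentages, not figures stated in the abstract.

```python
N_STUDIES = 14  # number of included studies, as stated in the abstract

# Share of studies using each algorithm family, as reported in the abstract.
REPORTED_PERCENT = {
    "Artificial Neural Network": 79,
    "Support Vector Machine": 50,
    "Naive Bayes": 43,
    "k-Nearest Neighbors": 29,
    "Regression": 29,
    "Decision Trees": 14,
}

# Inferred, not reported: the whole-study count closest to each percentage.
inferred_counts = {
    name: round(pct * N_STUDIES / 100) for name, pct in REPORTED_PERCENT.items()
}

# Sanity check: each inferred count should round back to the reported figure.
round_trip_ok = all(
    round(100 * count / N_STUDIES) == REPORTED_PERCENT[name]
    for name, count in inferred_counts.items()
)

# Total algorithm uses across studies; exceeds 14 if studies combine methods.
total_uses = sum(inferred_counts.values())
```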

