Performance Comparison of Machine Learning Classification Algorithms

Purpose: This study proposes an unsupervised learning model to evaluate the credibility of disaster-related Twitter data and compares its performance with commonly used supervised machine learning models. Design/methodology/approach: First, historical tweets on two recent hurricane events are collected via the Twitter API. Then a credibility scoring system is implemented in which tweet features are analyzed to assign each tweet a credibility score and a credibility label. After that, supervised classification is carried out with various classification algorithms, and their performances are compared. Findings: The proposed unsupervised learning model could enhance emergency response by providing a fast way to determine the credibility of disaster-related tweets. Additionally, the comparison of the supervised classification models reveals that the Random Forest classifier performs significantly better than the SVM and Logistic Regression classifiers in classifying the credibility of disaster-related tweets. Originality/value: In this paper, an unsupervised 10-point scoring model is proposed to evaluate the credibility of tweets based on user-based and content-based features. This technique could be used to evaluate the credibility of disaster-related tweets on future hurricanes and has the potential to enhance emergency response during critical events. The comparative study also identifies which supervised learning methods are effective for evaluating the credibility of Twitter data.
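To make the idea of a 10-point scoring model concrete, here is a minimal Python sketch. The abstract does not list the actual features, weights, or labeling threshold, so every field name, point value, and cutoff below is a hypothetical stand-in chosen only so that the maximum score is 10; it is not the authors' scoring system.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical user-based features (not taken from the paper).
    user_verified: bool
    followers: int
    account_age_days: int
    # Hypothetical content-based features (not taken from the paper).
    has_url: bool
    has_media: bool
    retweets: int


def credibility_score(t: Tweet) -> int:
    """Return an integer score from 0 to 10 using illustrative rules."""
    score = 0
    score += 2 if t.user_verified else 0
    score += 2 if t.followers >= 1000 else 0
    score += 2 if t.account_age_days >= 365 else 0
    score += 1 if t.has_url else 0
    score += 1 if t.has_media else 0
    score += 2 if t.retweets >= 10 else 0
    return score


def credibility_label(score: int, threshold: int = 5) -> str:
    """Map a score to a label; the threshold of 5 is an assumption."""
    return "credible" if score >= threshold else "not credible"


if __name__ == "__main__":
    example = Tweet(True, 5000, 800, True, False, 3)
    s = credibility_score(example)
    print(s, credibility_label(s))
```

Labels produced this way need no manual annotation, which is what allows them to serve as training targets for the supervised classifiers the abstract goes on to compare.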


An Experimental Comparison of Machine Learning Classification Algorithms for Breast Cancer Diagnosis

Information Systems - Lecture Notes in Business Information Processing ◽

10.1007/978-3-030-44322-1_2 ◽

2020 ◽

pp. 18-30

Author(s):

Markos Marios Kaklamanis ◽

Michael Ε. Filippakis ◽

Marios Touloupos ◽

Klitos Christodoulou

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Experimental Comparison ◽

Classification Algorithms ◽

Machine Learning Classification


Machine learning classification of new asteroid families members

10.5194/epsc2020-36 ◽

2020 ◽

Author(s):

Valerio Carruba

Keyword(s):

Machine Learning ◽

Family Members ◽

Classification Algorithms ◽

New Members ◽

Machine Learning Classification ◽

New Family ◽

Robotic Telescope ◽

Traditional Approaches ◽

Asteroid Families

Asteroid families are groups of asteroids that are the product of collisions or of the rotational fission of a parent object. These groups are mainly identified in the domains of proper elements or proper frequencies. Because of robotic telescope surveys, the number of known asteroids has increased from about 10,000 in the early 1990s to more than 750,000 today. Traditional approaches for identifying new members of asteroid families, like the hierarchical clustering method (HCM), may struggle to keep up with the growing rate of new discoveries. Here we used machine learning classification algorithms to identify new family members based on the orbital distribution in proper (a, e, sin(i)) of previously known family constituents. We compared the outcome of nine classification algorithms from stand-alone and ensemble approaches. The Extremely Randomized Trees (ExtraTree) method had the highest precision, making it possible to retrieve up to 97% of the family members identified with standard HCM.


Classifications of Breast Cancer Diagnosis using Machine Learning

International Journal of Computers ◽

10.46300/9108.2020.14.13 ◽

2020 ◽

Vol 14 ◽

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Random Forest ◽

Breast Cancer Diagnosis ◽

Performance Comparison ◽

Support Vector ◽

Breast Cancer Dataset ◽

K Nearest Neighbors ◽

Cancer Dataset ◽

Machine Learning Classification

Breast cancer (BC) is among the most common and leading causes of death in women worldwide. Classification and data analysis tools are now widely used in the medical field for diagnosis, prognosis and decision making, helping to lower the risk of people dying or suffering from disease. Advanced machine learning methods have given patients hope by helping doctors detect potentially fatal diseases such as breast cancer early and by providing accurate outcomes. However, the results depend heavily on the techniques used for feature selection and classification, which determine how strong the resulting machine learning model is. In this paper, a performance comparison is conducted using four classifiers, Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) and Random Forest, on the Wisconsin Breast Cancer dataset to identify the most effective predictors. The main goal is to apply the best machine learning classification methods to predict breast cancer as benign or malignant, evaluated in terms of accuracy, F-measure, precision and recall. Experimental results show that Random Forest achieves the highest accuracy, 99.26%, on this dataset and feature set, while SVM and KNN reach 97.78% and 97.04% accuracy respectively. MLP has the lowest accuracy, 94.07%. All experiments were conducted using RStudio as the data mining platform.
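The four evaluation measures named in the abstract can be computed from a classifier's predictions as in the following Python sketch. The paper's experiments were run in RStudio; this is only an illustration of the standard definitions, and the function name, the label strings, and the toy data are assumptions, not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive="malignant"):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the balanced F1: the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["malignant"] * 3 + ["benign"] * 5
    preds = ["malignant", "malignant", "benign",
             "benign", "benign", "benign", "benign", "malignant"]
    print(binary_metrics(truth, preds))
```

Accuracy alone can look strong when one class dominates, which is one reason to report precision, recall and F-measure alongside it in a diagnostic setting.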


Wastewater Plant Reliability Prediction Using the Machine Learning Classification Algorithms

Symmetry ◽

10.3390/sym13081518 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1518

Author(s):

Lazar Z. Velimirović ◽

Radmila Janković ◽

Jelena D. Velimirović ◽

Aleksandar Janjić

Keyword(s):

Machine Learning ◽

Wastewater Treatment ◽

Water Levels ◽

Reliability Prediction ◽

Classification Algorithms ◽

Water Pump ◽

Pump Failure ◽

Machine Learning Classification ◽

Capacity Current ◽

Prediction Systems

One way to optimize wastewater treatment system infrastructure, its operations, monitoring, maintenance and management is through development of smart forecasting, monitoring and failure prediction systems using machine learning modeling. The aim of this paper was to develop a model that was able to predict a water pump failure based on the asymmetrical type of data obtained from sensors such as water levels, capacity, current and flow values. Several machine learning classification algorithms were used for predicting water pump failure. Using the classification algorithms, it was possible to make predictions of future values with a simple input of current values, as well as predicting probabilities of each sample belonging to each class. In order to build a prediction model, an asymmetrical type dataset containing the aforementioned variables was used.


Energy management using multi-criteria decision making and machine learning classification algorithms for intelligent system

Electric Power Systems Research ◽

10.1016/j.epsr.2021.107645 ◽

2022 ◽

Vol 203 ◽

pp. 107645

Author(s):

Hmeda Musbah ◽

Gama Ali ◽

Hamed H. Aly ◽

Timothy A. Little

Keyword(s):

Machine Learning ◽

Decision Making ◽

Energy Management ◽

Intelligent System ◽

Classification Algorithms ◽

Multi Criteria Decision Making ◽

Machine Learning Classification


Propensity score adjustment using machine learning classification algorithms to control selection bias in online surveys

PLoS ONE ◽

10.1371/journal.pone.0231500 ◽

2020 ◽

Vol 15 (4) ◽

pp. e0231500 ◽

Cited By ~ 3

Author(s):

Ramón Ferri-García ◽

María del Mar Rueda

Keyword(s):

Machine Learning ◽

Propensity Score ◽

Selection Bias ◽

Classification Algorithms ◽

Online Surveys ◽

Machine Learning Classification ◽

Control Selection ◽

Propensity Score Adjustment


Machine Learning Classification Algorithms to Predict aGvHD following Allo-HSCT: A Systematic Review

Methods of Information in Medicine ◽

10.1055/s-0040-1709150 ◽

2019 ◽

Vol 58 (06) ◽

pp. 205-212

Author(s):

Cirruse Salehnasab ◽

Abbas Hajifathali ◽

Farkhondeh Asadi ◽

Elham Roshandel ◽

Alireza Kazemi ◽

...

Keyword(s):

Machine Learning ◽

Systemic Review ◽

Predictor Variables ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbors ◽

Hematopoietic Stem ◽

Machine Learning Classification ◽

Graft Versus Host ◽

Meta Analyses

Background: Acute graft-versus-host disease (aGvHD) is the most important cause of mortality in patients receiving allogeneic hematopoietic stem cell transplantation. Because it manifests at the stage of severe tissue damage, it is diagnosed late. With the advancement of machine learning (ML), promising real-time models to predict aGvHD have emerged. Objective: This article aims to synthesize the literature on ML classification algorithms for predicting aGvHD, highlighting the algorithms and the important predictor variables used. Methods: A systematic review of ML classification algorithms used to predict aGvHD was performed by searching the PubMed, Embase, Web of Science, Scopus, Springer, and IEEE Xplore databases up to April 2019, following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement. Studies focused on using ML classification algorithms to predict aGvHD were considered. Results: After applying the inclusion and exclusion criteria, 14 studies were selected for evaluation. The algorithms used were Artificial Neural Network (79%), Support Vector Machine (50%), Naive Bayes (43%), k-Nearest Neighbors (29%), Regression (29%), and Decision Trees (14%). Many predictor variables were used in these studies, so we grouped them into more abstract categories: biomarkers, demographics, infections, clinical, genes, transplants, drugs, and other variables. Conclusion: Each of these ML algorithms has particular characteristics and different proposed predictors. These ML algorithms therefore appear to have high potential for predicting aGvHD if the modeling process is performed correctly.
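The reported percentages sum to well over 100%, which is consistent with individual studies using more than one algorithm. The Python sketch below checks this arithmetic. The per-algorithm study counts are not given in the abstract; they are inferred here as the only whole numbers out of 14 that round to the stated percentages, and should be read as an assumption.

```python
TOTAL_STUDIES = 14  # stated in the abstract

# Inferred counts (assumption): whole numbers of studies that round to the
# percentages reported in the abstract.
inferred_counts = {
    "Artificial Neural Network": 11,
    "Support Vector Machine": 7,
    "Naive Bayes": 6,
    "k-Nearest Neighbors": 4,
    "Regression": 4,
    "Decision Trees": 2,
}


def as_percent(count: int, total: int = TOTAL_STUDIES) -> int:
    """Share of studies as a whole-number percentage."""
    return round(100 * count / total)


percentages = {name: as_percent(n) for name, n in inferred_counts.items()}
total_uses = sum(inferred_counts.values())

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
    print("algorithm uses across studies:", total_uses)
```

Under these inferred counts the algorithm uses total more than the 14 studies, so the percentages describe overlapping groups of studies and are not shares of a single whole.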
