A supervised machine learning classification algorithm for research articles

The rapid growth of network traffic drives the need for new network technologies. Artificial intelligence provides suitable tools to improve currently used network optimization methods. In this paper, we propose a procedure for network traffic prediction. Based on the characteristics of optical networks (and other network technologies), we focus on predicting fixed bitrate levels, called traffic levels. We develop and evaluate two approaches based on different supervised machine learning (ML) methods: classification and regression. We examine four different ML models with various selected features. The tested datasets are based on real traffic patterns provided by the Seattle Internet Exchange Point (SIX). The results are analyzed using a new quality metric, which allows researchers to find the best forecasting algorithm in terms of network resource usage and operational costs. Our research shows that regression provides better results than classification for all analyzed datasets. Additionally, the final choice of the most appropriate ML algorithm and model should depend on the network operator's expectations.
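To make the idea of traffic levels concrete, the following is a minimal sketch, not the paper's method. It assumes that a traffic level is the smallest multiple of a fixed step that covers the observed bitrate, and that a regression forecast is rounded up to a level afterwards. The 10 Gbps step, the sample values, and the names `to_traffic_level` and `level_errors` are hypothetical; the abstract's own quality metric is not specified and is not reproduced here.

```python
import math


def to_traffic_level(bitrate_gbps: float, step_gbps: float = 10.0) -> int:
    """Return the index of the smallest fixed level that covers the bitrate.

    Level k is assumed to provide k * step_gbps of capacity (hypothetical step).
    """
    if bitrate_gbps <= 0:
        return 0
    return math.ceil(bitrate_gbps / step_gbps)


def level_errors(actual_gbps, predicted_levels, step_gbps: float = 10.0):
    """Count intervals where the predicted level is too low or too high."""
    under = over = 0
    for bitrate, level in zip(actual_gbps, predicted_levels):
        needed = to_traffic_level(bitrate, step_gbps)
        if level < needed:
            under += 1  # not enough capacity provisioned
        elif level > needed:
            over += 1   # capacity provisioned but unused
    return under, over


# Made-up observed traffic and a made-up regression forecast, both in Gbps.
actual = [12.0, 27.5, 30.0, 41.2]
regression_forecast = [14.1, 25.0, 33.0, 39.0]

# Regression approach: forecast a continuous value, then round up to a level.
predicted = [to_traffic_level(x) for x in regression_forecast]
under, over = level_errors(actual, predicted)
print(predicted, under, over)
```

A classification model would instead predict the level index directly; either way, counting under- and over-provisioned intervals separately lets an operator weigh the two kinds of error differently.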


Source allocation of per- and polyfluoroalkyl substances (PFAS) with supervised machine learning: Classification performance and the role of feature selection in an expanded dataset

Chemosphere ◽

10.1016/j.chemosphere.2021.130124 ◽

2021 ◽

Vol 275 ◽

pp. 130124

Author(s):

Tohren C.G. Kibbey ◽

Rafal Jabrzemski ◽

Denis M. O’Carroll

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Classification Performance ◽

Supervised Machine Learning ◽

Machine Learning Classification ◽

Polyfluoroalkyl Substances ◽

Source Allocation


Evaluating disaster-related tweet credibility using content-based and user-based features

Information Discovery and Delivery ◽

10.1108/idd-04-2020-0044 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Nasser Assery ◽

Yuan (Dorothy) Xiaohong ◽

Qu Xiuli ◽

Roy Kaushik ◽

Sultan Almalki

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Emergency Response ◽

Learning Model ◽

Performance Comparison ◽

Supervised Machine Learning ◽

Learning Methods ◽

Content Type ◽

Machine Learning Classification

Purpose: This study proposes an unsupervised learning model to evaluate the credibility of disaster-related Twitter data and presents a performance comparison with commonly used supervised machine learning models. Design/methodology/approach: First, historical tweets on two recent hurricane events are collected via the Twitter API. Then a credibility scoring system is implemented in which the tweet features are analyzed to assign a credibility score and credibility label to each tweet. After that, supervised machine learning classification is implemented using various classification algorithms, and their performances are compared. Findings: The proposed unsupervised learning model could enhance emergency response by providing a fast way to determine the credibility of disaster-related tweets. Additionally, the comparison of the supervised classification models reveals that the Random Forest classifier performs significantly better than the SVM and Logistic Regression classifiers in classifying the credibility of disaster-related tweets. Originality/value: In this paper, an unsupervised 10-point scoring model is proposed to evaluate the tweets’ credibility based on user-based and content-based features. This technique could be used to evaluate the credibility of disaster-related tweets on future hurricanes and has the potential to enhance emergency response during critical events. The comparative study of different supervised learning methods has revealed effective supervised learning methods for evaluating the credibility of Twitter data.
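A rule-based scoring step of this kind might look like the sketch below. It only illustrates the general shape of a 10-point score built from user-based and content-based features. The specific features, point weights, and the label threshold are all assumptions made for illustration; the abstract does not list them.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # User-based features (hypothetical choices)
    verified: bool
    followers: int
    account_age_days: int
    # Content-based features (hypothetical choices)
    has_url: bool
    has_media: bool
    retweets: int


def credibility_score(t: Tweet) -> int:
    """Sum hypothetical points: up to 5 user-based plus up to 5 content-based."""
    score = 0
    score += 2 if t.verified else 0
    score += 2 if t.followers >= 1000 else 0
    score += 1 if t.account_age_days >= 365 else 0
    score += 2 if t.has_url else 0
    score += 1 if t.has_media else 0
    score += 2 if t.retweets >= 10 else 0
    return score


def credibility_label(score: int, threshold: int = 5) -> str:
    """Turn the score into a label; the threshold of 5 is an assumption."""
    return "credible" if score >= threshold else "not credible"


official = Tweet(True, 5000, 800, True, False, 3)
unknown = Tweet(False, 40, 20, False, True, 0)
for tweet in (official, unknown):
    s = credibility_score(tweet)
    print(s, credibility_label(s))
```

Labels produced this way need no manual annotation, which is what allows them to serve as training targets for the supervised classifiers compared in the study.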


Predictive Analysis of Genetic Disease Haemophilia-A based on Machine Learning Classification Algorithm

IJARCCE ◽

10.17148/ijarcce.2021.101210 ◽

2021 ◽

Vol 10 (12) ◽

Author(s):

Dillip Narayan Sahu ◽

Vijay Pal Singh

Keyword(s):

Machine Learning ◽

Genetic Disease ◽

Classification Algorithm ◽

Predictive Analysis ◽

Haemophilia A ◽

Machine Learning Classification


Supervised Machine Learning Classification of Human Sperm Head Based on Morphological Features

10.1007/978-3-030-75945-2_9 ◽

2021 ◽

pp. 177-191

Author(s):

Natalia V. Revollo ◽

G. Noelia Revollo Sarmiento ◽

Claudio Delrieux ◽

Marcela Herrera ◽

Rolando González-José

Keyword(s):

Machine Learning ◽

Human Sperm ◽

Sperm Head ◽

Morphological Features ◽

Supervised Machine Learning ◽

Machine Learning Classification


Machine Learning Classification of Head Impact Sensor Data

Volume 3: Biomedical and Biotechnology Engineering ◽

10.1115/imece2019-12173 ◽

2019 ◽

Author(s):

Tyler F. Rooks ◽

Andrea S. Dargie ◽

Valeta Carol Chancey

Keyword(s):

Machine Learning ◽

Decision Tree ◽

External Validation ◽

Classification Algorithm ◽

Sensor Data ◽

Environmental Sensors ◽

Head Acceleration ◽

Machine Learning Classification ◽

Environmental Sensor ◽

Validation Set

Abstract: A shortcoming of using environmental sensors for the surveillance of potentially concussive events is substantial uncertainty regarding whether the event was caused by head acceleration (“head impacts”) or sensor motion (with no head acceleration). The goal of the present study is to develop a machine learning model to classify environmental sensor data obtained in the field and evaluate the performance of the model against the performance of the proprietary classification algorithm used by the environmental sensor. Data were collected from Soldiers attending sparring sessions conducted under a U.S. Army Combatives School course. Data from one sparring session were used to train a decision tree classification algorithm to identify good and bad signals. Data from the remaining sparring sessions were kept as an external validation set. The performance of the proprietary algorithm used by the sensor was also compared to the trained algorithm performance. The trained decision tree was able to correctly classify 95% of events for internal cross-validation and 88% of events for the external validation set. Comparatively, the proprietary algorithm was only able to correctly classify 61% of the events. In general, the trained algorithm was better able to predict when a signal was good or bad compared to the proprietary algorithm. The present study shows it is possible to train a decision tree algorithm using environmental sensor data collected in the field.
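The train-on-one-session, validate-on-the-rest design can be sketched with a depth-1 decision tree (a "stump"), which is a deliberate simplification of the full decision tree in the study. The single feature (a peak acceleration value), the sample data, and the function names are hypothetical. The sketch also reports plain training accuracy rather than the internal cross-validation the authors used.

```python
def fit_stump(samples):
    """Pick the threshold that maximizes accuracy of the rule `value >= threshold`.

    samples: list of (feature_value, label), with label True for a good signal.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({value for value, _ in samples}):
        correct = sum((value >= threshold) == label for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, samples):
    """Fraction of samples the threshold rule classifies correctly."""
    hits = sum((value >= threshold) == label for value, label in samples)
    return hits / len(samples)


# Made-up data: one session for training, other sessions held out entirely.
train_session = [(5.0, False), (8.0, False), (12.0, False),
                 (20.0, True), (25.0, True), (31.0, True)]
external_sessions = [(6.0, False), (15.0, True), (18.0, False),
                     (22.0, True), (40.0, True)]

threshold = fit_stump(train_session)
print(threshold, accuracy(threshold, train_session),
      accuracy(threshold, external_sessions))
```

Holding out whole sessions, rather than random events, checks whether the learned rule transfers to data collected under different conditions, which is why external accuracy is typically lower than internal accuracy.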


Comparison of supervised machine learning classification techniques in prediction of locoregional recurrences in early oral tongue cancer

International Journal of Medical Informatics ◽

10.1016/j.ijmedinf.2019.104068 ◽

2020 ◽

Vol 136 ◽

pp. 104068 ◽

Cited By ~ 4

Author(s):

Rasheed Omobolaji Alabi ◽

Mohammed Elmusrati ◽

Iris Sawazaki‐Calone ◽

Luiz Paulo Kowalski ◽

Caj Haglund ◽

...

Keyword(s):

Machine Learning ◽

Tongue Cancer ◽

Supervised Machine Learning ◽

Oral Tongue ◽

Classification Techniques ◽

Machine Learning Classification ◽

Oral Tongue Cancer


Predictive Analysis of Coronary Heart Disease (CHD) based on Machine Learning Classification Algorithm

IJARCCE ◽

10.17148/ijarcce.2021.101202 ◽

2021 ◽

Vol 10 (12) ◽

Author(s):

Dillip Narayan Sahu ◽

Vijay Pal Singh

Keyword(s):

Machine Learning ◽

Coronary Heart Disease ◽

Heart Disease ◽

Classification Algorithm ◽

Predictive Analysis ◽

Machine Learning Classification


Using Machine Learning To Understand Suicide: A New Approach To Classifying Australian Coroner’s Court Decisions

10.21203/rs.3.rs-640308/v1 ◽

2021 ◽

Author(s):

Ravi Iyer ◽

Elizabeth Seabrook ◽

Suku Sukunesan ◽

Maja Nedeljkovic ◽

Denny Meyer

Keyword(s):

Mental Health ◽

Machine Learning ◽

Mental Health Disorder ◽

Supervised Machine Learning ◽

Feature Engineering ◽

Court Case ◽

Learning Approach ◽

Machine Learning Classification ◽

Machine Learning Approach ◽

Case Files

Abstract: We aimed to demonstrate how a large collection of publicly accessible Australian Coroner’s Court case files (n=4459) (2009-2019) can be automatically classified for determination of death by suicide, presence of mental health disorder and sex of deceased via Natural Language Processing (NLP) methods: supervised machine learning and unsupervised dictionary-based and string-search-based approaches. We achieved superior levels of accuracy in the machine learning classification (Gradient Boosting vs. Random Forest baseline) of deaths by suicide of 83.3% (sensitivity = 85.1%, specificity = 79.1%) and an accuracy of 98.3% for the dictionary-based classification of mental health disorder, as defined by the ICD-10 (sensitivity = 99.0%, specificity = 97.9%). Our machine learning approach automatically classified 24.2% (1078/4459) of the case files as referring to deaths by suicide, while 63.7% (2940/4459) were classified as exhibiting a mental health disorder. We employed a two-stage machine learning approach involving feature engineering in the first stage, followed by predictive modelling in the second. Feature engineering involved several steps, including removal of low-value text, parts-of-speech analysis, term document weighting and topic clustering. Predictive classification involved extensive hyperparameter tuning to yield the most accurate model. We validated our models against a manually pre-coded subsample of case files, and also via binary logistic regression to test the contribution of each classified mental health disorder against determinations of deaths by suicide according to extant literature. This validation step confirmed elevated odds of suicide attributed to diagnoses of Depression, Schizophrenia and Obsessive Compulsive Disorder. Finally, we offer a short case study to demonstrate the efficacy of our approach in investigating a subset of case findings referring to suicides resulting from family violence.
We offer a proof-of-concept model that demonstrates an objective and scalable approach to the analysis of legal texts. The use of NLP methods in analysing Coroner's Court case findings has important implications for the ongoing development of a real-time suicide surveillance system in Australia.
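For readers less familiar with the metrics quoted in this abstract, the sketch below shows how accuracy, sensitivity and specificity follow from a binary confusion matrix. The counts are made up for illustration and are not the study's data; the function name is hypothetical.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts."""
    return {
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that were correctly rejected.
        "specificity": tn / (tn + fp),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only: 100 positive and 100 negative cases.
metrics = classification_metrics(tp=85, fp=20, tn=80, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, since accuracy alone can look high even if one class is rarely detected.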
