A Comparative Analysis of Machine Learning Algorithms in Design Process of Adaptive Traffic Signal Control System

Abstract In recent decades machine learning technology has proved its efficiency in most sectors by making human life easier. With this popularity and efficiency, it is applied to design traffic signal control systems to mitigate traffic congestion and distribute waiting delays. Hence, many researchers around the world are working to address this issue. As a part of the solution, this article presents a comparative analysis of various machine learning models to come up with a suitable model for an isolated intersection. In this context, eight machine learning models including Linear Regression, Ridge, Lasso, Support Vector Regression, k-Nearest Neighbour, Decision Tree, Random Forest, and Gradient Boosting Regression Tree are selected. Shivakumara Swamiji Circle (SSC), one of the intersections in Tumakuru, Karnataka, India is selected as a case study area. Essential data is collected from SSC through videography. The selected models are developed to predict green time based on traffic classification and volume in Passenger Car Units (PCU) for each phase on the PyCharm platform. The models are evaluated based on various performance metrics. Results revealed that all the selected models predict green splits with 91% accuracy using traffic classification as input, whereas, models showed 85% accuracy with PCU as input. And also, Gradient Boosting Regression Tree is the best suitable model for the selected intersection, whereas, Decision Tree is not referred model for this application.

Download Full-text

COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS AND REGRESSIONS FOR CAR PRICE PREDICTION

Bulletin of V. N. Karazin Kharkiv National University Economic Series ◽

10.26565/2311-2379-2019-97-04 ◽

2019 ◽

Keyword(s):

Neural Network ◽

Machine Learning ◽

Comparative Analysis ◽

Random Forest ◽

Network Models ◽

Gradient Boosting ◽

Learning Models ◽

Neural Network Models ◽

Boosting Algorithms ◽

Machine Learning Models

The purpose of the research described in this article is a comparative analysis of the predictive qualities of some models of machine learning and regression. The factors for models are the consumer characteristics of a used car: brand, transmission type, drive type, engine type, mileage, body type, year of manufacture, seller's region in Ukraine, condition of the car, information about accident, average price for analogue in Ukraine, engine volume, quantity of doors, availability of extra equipment, quantity of passenger’s seats, the first registration of a car, car was driven from abroad or not. Qualitative variables has been encoded as binary variables or by mean target encoding. The information about more than 200 thousand cars have been used for modeling. All models have been evaluated in the Python Software using Sklearn, Catboost, StatModels and Keras libraries. The following regression models and machine learning models were considered in the course of the study: linear regression; polynomial regression; decision tree; neural network; models based on "k-nearest neighbors", "random forest", "gradient boosting" algorithms; ensemble of models. The article presents the best in terms of quality (according to the criteria R2, MAE, MAD, MAPE) options from each class of models. It has been found that the best way to predict the price of a passenger car is through non-linear models. The results of the modeling show that the dependence between the price of a car and its characteristics is best described by the ensemble of models, which includes a neural network, models using "random forest" and "gradient boosting" algorithms. The ensemble of models showed an average relative approximation error of 11.2% and an average relative forecast error of 14.34%. All nonlinear models for car price have approximately the same predictive qualities (the difference between the MAPE within 2%) in this research.

Download Full-text

A Comparative Analysis of Machine Learning Models for Prediction of Passing Bachelor Admission Test in Life-Science Faculty of a Public University in Bangladesh

2020 IEEE Electric Power and Energy Conference (EPEC) ◽

10.1109/epec48502.2020.9320119 ◽

2020 ◽

Author(s):

Md. Abul Ala Walid ◽

S.M. Masum Ahmed ◽

S M Shibly Sadique

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Life Science ◽

Public University ◽

Learning Models ◽

Science Faculty ◽

Admission Test ◽

Machine Learning Models

Download Full-text

A Comparative Analysis of Machine Learning Models developed from Homomorphic Encryption based RSA and Paillier algorithm

2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS) ◽

10.1109/iciccs51141.2021.9432348 ◽

2021 ◽

Author(s):

Harsh J. Kiratsata ◽

Mahesh Panchal

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Homomorphic Encryption ◽

Learning Models ◽

Machine Learning Models

Download Full-text

Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Abstract Background Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination Methods Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) have the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM has the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions The machine leaning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.

Download Full-text

Comparative analysis of machine learning models when solving problem of detecting abnormal errors in parameters measurements of pipeline systems’ operational modes

Automation Telemechanization and Communication in Oil Industry ◽

10.33285/0132-2222-2021-1(570)-55-60 ◽

2021 ◽

pp. 55-60

Author(s):

E.A. Golubyatnikov ◽

◽

D.G. Leonov ◽

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Learning Models ◽

Pipeline Systems ◽

Operational Modes ◽

Machine Learning Models

Download Full-text

Predicting Electric Vehicle Charging Station Availability Using Ensemble Machine Learning

Energies ◽

10.3390/en14237834 ◽

2021 ◽

Vol 14 (23) ◽

pp. 7834

Author(s):

Christopher Hecht ◽

Jan Figgener ◽

Dirk Uwe Sauer

Keyword(s):

Machine Learning ◽

Binary Data ◽

Training Data ◽

Gradient Boosting ◽

Traffic Density ◽

Learning Models ◽

Charging Infrastructure ◽

Ensemble Models ◽

Charging Station ◽

Machine Learning Models

Electric vehicles may reduce greenhouse gas emissions from individual mobility. Due to the long charging times, accurate planning is necessary, for which the availability of charging infrastructure must be known. In this paper, we show how the occupation status of charging infrastructure can be predicted for the next day using machine learning models— Gradient Boosting Classifier and Random Forest Classifier. Since both are ensemble models, binary training data (occupied vs. available) can be used to provide a certainty measure for predictions. The prediction may be used to adapt prices in a high-load scenario, predict grid stress, or forecast available power for smart or bidirectional charging. The models were chosen based on an evaluation of 13 different, typically used machine learning models. We show that it is necessary to know past charging station usage in order to predict future usage. Other features such as traffic density or weather have a limited effect. We show that a Gradient Boosting Classifier achieves 94.8% accuracy and a Matthews correlation coefficient of 0.838, making ensemble models a suitable tool. We further demonstrate how a model trained on binary data can perform non-binary predictions to give predictions in the categories “low likelihood” to “high likelihood”.

Download Full-text

Estimation of Chlorophyll-a Concentrations in Small Water Bodies: Comparison of Fused Gaofen-6 and Sentinel-2 Sensors

Remote Sensing ◽

10.3390/rs14010229 ◽

2022 ◽

Vol 14 (1) ◽

pp. 229

Author(s):

Jiarui Shi ◽

Qian Shen ◽

Yue Yao ◽

Junsheng Li ◽

Fu Chen ◽

...

Keyword(s):

Machine Learning ◽

Chlorophyll A ◽

Water Bodies ◽

Gradient Boosting ◽

Learning Models ◽

Small Water ◽

Extreme Gradient Boosting ◽

Machine Learning Models ◽

Sentinel 2 ◽

Small Water Bodies

Chlorophyll-a concentrations in water bodies are one of the most important environmental evaluation indicators in monitoring the water environment. Small water bodies include headwater streams, springs, ditches, flushes, small lakes, and ponds, which represent important freshwater resources. However, the relatively narrow and fragmented nature of small water bodies makes it difficult to monitor chlorophyll-a via medium-resolution remote sensing. In the present study, we first fused Gaofen-6 (a new Chinese satellite) images to obtain 2 m resolution images with 8 bands, which was approved as a good data source for Chlorophyll-a monitoring in small water bodies as Sentinel-2. Further, we compared five semi-empirical and four machine learning models to estimate chlorophyll-a concentrations via simulated reflectance using fused Gaofen-6 and Sentinel-2 spectral response function. The results showed that the extreme gradient boosting tree model (one of the machine learning models) is the most accurate. The mean relative error (MRE) was 9.03%, and the root-mean-square error (RMSE) was 4.5 mg/m3 for the Sentinel-2 sensor, while for the fused Gaofen-6 image, MRE was 6.73%, and RMSE was 3.26 mg/m3. Thus, both fused Gaofen-6 and Sentinel-2 could estimate the chlorophyll-a concentrations in small water bodies. Since the fused Gaofen-6 exhibited a higher spatial resolution and Sentinel-2 exhibited a higher temporal resolution.

Download Full-text

Benchmarking of Machine Learning Models to Assist the Prognosis of Tuberculosis

10.20944/preprints202103.0284.v2 ◽

2021 ◽

Author(s):

Maicon Herverton Lino Ferreira da Silva Barros ◽

Geovanne Oliveira Alves ◽

Lubnnia Morais Florêncio Souza ◽

Élisson da Silva Rocha ◽

João Fausto Lorenzato de Oliveira ◽

...

Keyword(s):

Machine Learning ◽

Clinical Symptoms ◽

Treatment Decision ◽

Gradient Boosting ◽

Original Form ◽

Learning Models ◽

Data Set ◽

Risk Of Death ◽

Increased Risk ◽

Machine Learning Models

Tuberculosis (TB) is an airborne infectious disease caused by organisms in the Mycobacterium tuberculosis (Mtb) complex. In many low and middle-income countries, TB remains a major cause of morbidity and mortality. Once a patient has been diagnosed with TB, it is critical that healthcare workers make the most appropriate treatment decision given the individual conditions of the patient and the likely course of the disease based on medical experience. Depending on the prognosis, delayed or inappropriate treatment can result in unsatisfactory results including the exacerbation of clinical symptoms, poor quality of life, and increased risk of death. This work benchmarks machine learning models to aid TB prognosis using a Brazilian health database of confirmed cases and deaths related to TB in the State of Amazonas. The goal is to predict the probability of death by TB thus aiding the prognosis of TB and associated treatment decision making process. In its original form, the data set comprised 36,228 records and 130 fields but suffered from missing, incomplete, or incorrect data. Following data cleaning and preprocessing, a revised data set was generated comprising 24,015 records and 38 fields, including 22,876 reported cured TB patients and 1,139 deaths by TB. To explore how the data imbalance impacts model performance, two controlled experiments were designed using (1) imbalanced and (2) balanced data sets. The best result is achieved by the Gradient Boosting (GB) model using the balanced data set to predict TB-mortality, and the ensemble model composed by the Random Forest (RF), GB and Multi-layer Perceptron (MLP) models is the best model to predict the cure class.

Download Full-text