Gradient Boosting Machine to Assess the Public Protest Impact on Urban Air Quality

Political and economic protests are building up as financial uncertainty and inequality spread throughout the world. In 2019, Latin America took center stage in a wave of protests. While the social side of protests is widely explored, the focus of this study is the evolution of gaseous urban air pollutants during and after one of these events. Changes in the concentrations of NO2, CO, O3 and SO2 during and after the strike were studied in Quito, Ecuador using two approaches: (i) inter-period observational analysis; and (ii) comparison of the observations against a business-as-usual (BAU) scenario developed with a machine learning (ML) gradient boosting machine (GBM). During the strike, both methods showed a large reduction in the concentrations of NO2 (31.5–32.36%) and CO (15.55–19.85%) and a slight reduction for O3 and SO2. The GBM approach showed a unique potential, especially over a longer prediction period, to estimate the strike's impact on air quality even after the strike was over. This supports the use of machine learning techniques to estimate the extended effect of changes in human activities on urban gaseous pollution.
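The BAU comparison described in this abstract can be illustrated with a toy example. The following is a minimal sketch, not the study's model: it assumes a single hypothetical predictor (wind speed), made-up NO2 values, and a bare-bones gradient boosting regressor built from one-split decision stumps with squared-error loss. The names `fit_stump`, `fit_gbm` and `predict` are introduced here for illustration only.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = sum((r - left_value) ** 2 for r in left)
        sse += sum((r - right_value) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def fit_gbm(xs, ys, n_rounds=300, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = mean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, preds)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Hypothetical pre-strike training data: wind speed -> NO2 (made-up numbers).
train_x = [1, 2, 3, 4, 5, 6, 7, 8]
train_y = [40, 38, 35, 30, 26, 22, 20, 18]
model = fit_gbm(train_x, train_y)

# Hypothetical strike-period conditions and observed concentrations.
strike_x = [2, 4, 6]
strike_obs = [25, 20, 15]
bau = [predict(model, x) for x in strike_x]

# Relative change of the observations with respect to the BAU prediction.
change_pct = 100.0 * (mean(strike_obs) - mean(bau)) / mean(bau)
print(f"Observed vs BAU change: {change_pct:.1f}%")
```

A real application would train on many meteorological and temporal predictors and validate the model on held-out pre-event data before trusting the BAU estimate.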

Prediction of probable backorder scenarios in the supply chain using Distributed Random Forest and Gradient Boosting Machine learning techniques

Journal Of Big Data ◽

10.1186/s40537-020-00345-2 ◽

2020 ◽

Vol 7 (1) ◽

Cited By ~ 1

Author(s):

Samiul Islam ◽

Saman Hassanzadeh Amin

Keyword(s):

Machine Learning ◽

Supply Chain ◽

Random Forest ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Learning Techniques ◽

Gradient Boosting Machine

A Survey on Different Machine Learning Techniques for Air Quality Forecasting for Urban Air Pollution

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2019.4395 ◽

2019 ◽

Vol 7 (4) ◽

pp. 2185-2194

Author(s):

Sayali Nemade

Keyword(s):

Machine Learning ◽

Air Pollution ◽

Air Quality ◽

Urban Air Pollution ◽

Machine Learning Techniques ◽

Urban Air ◽

Learning Techniques ◽

Air Quality Forecasting

Boosting algorithms for prediction in agriculture: an application of feature importance and feature selection boosting algorithms for prediction crop damage.

10.31220/agrirxiv.2021.00092 ◽

2021 ◽

Author(s):

Viviane Costa Silva ◽

Mateus Silva Rocha ◽

Glaucia Amorim Faria ◽

Silvio Fernando Alves Xavier Junior ◽

Tiago Almeida de Oliveira ◽

...

Keyword(s):

Machine Learning ◽

Area Under The Curve ◽

Crop Damage ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Agriculture Sector ◽

Light Gradient ◽

Learning Techniques ◽

Gradient Boosting Machine ◽

Boosting Algorithms

The agriculture sector has created and collected large amounts of data. These data can be gathered, stored, and analyzed to assist in decision making, generating competitive value, and the use of Machine Learning techniques has been very effective for this market. In this work, a Machine Learning study was carried out using supervised classification models based on boosting to predict disease in a crop, thus identifying the model with the best area under the curve metrics. Light Gradient Boosting Machine, CatBoost Classifier, Extreme Gradient Boosting, Gradient Boosting Classifier and AdaBoost models were used to classify the crop as healthy or sick. The LightGBM algorithm provided the best fit to the data, with an area under the curve of 0.76 when Boruta variable selection was used.

Ensemble Approach for Zoonotic Disease Forecasting Using Machine Learning Techniques

International Journal of Business Analytics and Intelligence ◽

10.21863/ijbai/2015.3.2.009 ◽

2015 ◽

Vol 3 (2) ◽

Author(s):

Vikash Chandra Sharma ◽

David Frankenfield ◽

Anupam Gupta ◽

Rama Krishna Singh

Keyword(s):

Machine Learning ◽

Emerging Infectious Diseases ◽

Forecast Accuracy ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Approach ◽

Mode Decomposition ◽

Learning Techniques ◽

Gradient Boosting Machine

More than two-thirds of emerging infectious diseases in recent decades are zoonotic in origin. Timely prediction of these diseases, which migrate from animals to humans, and preventive measures to limit the resulting morbidity and mortality are a requirement of the healthcare industry. Avian influenza is one of the zoonotic diseases that has created havoc in the recent past, especially in the Asian subcontinent. In the past, attempts have been made to predict influenza using traditional time-series techniques (AR, MA, ARMA, ARIMA etc.) as well as machine learning techniques to capture the cyclicity and seasonality of these virus strains. In the current research, an effort has been made to use Empirical Mode Decomposition (EMD) to extract the Intrinsic Mode Functions (IMFs) and then apply state-of-the-art Machine Learning (ML) techniques to predict the series. Several machine learning techniques, such as Random Forest (RF), Gradient Boosting Machine (GBM) and Support Vector Regression (SVR), have been applied to the decomposed series. Exogenous variables such as temperature, humidity and precipitation have been incorporated to improve the forecast. An ensemble approach of ML models showed significant improvement over the traditional models in terms of long-term forecast accuracy.

Prediction of Survivors in the Titanic Cruise

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4408.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 1268-1271

Keyword(s):

Machine Learning ◽

Data Science ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Learning Techniques ◽

Gradient Boosting Machine ◽

Individual Survival ◽

Long Time

On 15 April 1912, the Titanic met with disaster, sinking with her passengers on her maiden voyage in the North Atlantic. Although a very long time has passed since this maritime disaster, the question of what affected each individual's survival still attracts researchers' attention. The approach taken in this paper is to use the publicly available data set from the website Kaggle. Kaggle is a popular data science website that compiled information on the people aboard the Titanic into a data set for the data mining competition “Titanic: Machine Learning from Disaster”. The research and comparisons in this paper use a few machine learning techniques and algorithms to analyse the data for classification and prediction of survivors. The prediction quality and efficiency of these algorithms depend greatly on the data analysis and the model. The techniques used are Random Forest, Support Vector Machine and Gradient Boosting Machine.

Implications of COVID-19 Restriction Measures in Urban Air Quality of Thessaloniki, Greece: A Machine Learning Approach

Atmosphere ◽

10.3390/atmos12111500 ◽

2021 ◽

Vol 12 (11) ◽

pp. 1500

Author(s):

Dimitris Akritidis ◽

Prodromos Zanis ◽

Aristeidis K. Georgoulias ◽

Eleni Papakosta ◽

Paraskevi Tzoumaka ◽

...

Keyword(s):

Machine Learning ◽

Air Quality ◽

Urban Areas ◽

Learning Algorithm ◽

Urban Air Quality ◽

Gradient Boosting ◽

Urban Air ◽

Extreme Gradient Boosting ◽

The Impact

Following the rapid spread of COVID-19, a lockdown was imposed in Thessaloniki, Greece, resulting in an abrupt reduction of human activities. To unravel the impact of restrictions on the urban air quality of Thessaloniki, NO2 and O3 observations are compared against the business-as-usual (BAU) concentrations for the lockdown period. BAU conditions are modeled, applying the XGBoost (eXtreme Gradient Boosting) machine learning algorithm on air quality and meteorological surface measurements, and reanalysis data. A reduction in NO2 concentrations is found during the lockdown period due to the restriction policies at both AGSOFIA and EGNATIA stations of −24.9 [−26.6, −23.2]% and −18.4 [−19.6, −17.1]%, respectively. A reverse effect is revealed for O3 concentrations at AGSOFIA with an increase of 12.7 [10.8, 14.8]%, reflecting the reduced O3 titration by NOx. The implications of COVID-19 lockdowns in the urban air quality of Thessaloniki are in line with the results of several recent studies for other urban areas around the world, highlighting the necessity of more sophisticated emission control strategies for urban air quality management.
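The abstract reports each relative change together with a bracketed interval but does not say how the intervals were derived. As one possible illustration only, the sketch below computes the relative change of observations against BAU values and attaches a percentile bootstrap interval. The data are made up, and `relative_change` and `bootstrap_interval` are hypothetical helper names, not functions from the paper.

```python
import random
from statistics import mean


def relative_change(observed, bau):
    """Percentage change of the mean observation relative to the mean BAU value."""
    return 100.0 * (mean(observed) - mean(bau)) / mean(bau)


def bootstrap_interval(observed, bau, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling paired (observed, BAU) time steps."""
    rng = random.Random(seed)
    n = len(observed)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            relative_change([observed[i] for i in idx], [bau[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up paired values for a lockdown period (same time steps in both lists).
observed = [30, 28, 35, 25, 27, 31, 29, 26]
bau = [40, 38, 44, 36, 37, 41, 39, 35]

point = relative_change(observed, bau)
lower, upper = bootstrap_interval(observed, bau)
print(f"{point:.1f} [{lower:.1f}, {upper:.1f}]%")
```

Resampling whole time steps keeps each observation paired with its BAU value. For real hourly series with autocorrelation, a block bootstrap would likely be a better fit than this simple version.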

Predicting the 10-year risk of cataract surgery using machine learning techniques on questionnaire data: findings from the 45 and Up Study

British Journal of Ophthalmology ◽

10.1136/bjophthalmol-2020-318609 ◽

2021 ◽

pp. bjophthalmol-2020-318609

Author(s):

Wei Wang ◽

Xiaotong Han ◽

Jiaqing Zhang ◽

Xianwen Shang ◽

Jason Ha ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Cataract Surgery ◽

Logistic Model ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Questionnaire Data ◽

Gradient Boosting Machine ◽

Logistic Regression Method ◽

Baseline Information

Background/aims: To investigate the feasibility and accuracy of using machine learning (ML) techniques on self-reported questionnaire data to predict the 10-year risk of cataract surgery, and to identify meaningful predictors of cataract surgery in middle-aged and older Australians. Methods: Baseline information regarding demographic, socioeconomic, medical history and family history, lifestyle, dietary and self-rated health status were collected as risk factors. Cataract surgery events were confirmed by the Medicare Benefits Schedule Claims dataset. Three ML algorithms (random forests [RF], gradient boosting machine and deep learning) and one traditional regression algorithm (logistic model) were compared on the accuracy of their predictions for the risk of cataract surgery. The performance was assessed using 10-fold cross-validation. The main outcome measures were areas under the receiver operating characteristic curves (AUCs). Results: In total, 207 573 participants, aged 45 years and above without a history of cataract surgery at baseline, were recruited from the 45 and Up Study. The performance of gradient boosting machine (AUC 0.790, 95% CI 0.785 to 0.795), RF (AUC 0.785, 95% CI 0.780 to 0.790) and deep learning (AUC 0.781, 95% CI 0.775 to 0.786) were robust and outperformed the traditional logistic regression method (AUC 0.767, 95% CI 0.762 to 0.773, all p<0.05). Age, self-rated eye vision and health insurance were consistently identified as important predictors in all models. Conclusions: The study demonstrated that ML modelling was able to reasonably accurately predict the 10-year risk of cataract surgery based on questionnaire data alone and was marginally superior to the conventional logistic model.
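The outcome measure used here, the area under the ROC curve, equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It is an illustration with made-up labels and scores; `roc_auc` is a hypothetical helper name, and the study's own tooling is not stated in the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = event occurred, 0 = no event.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of cases, so for a cohort of this size a rank-based implementation would be used in practice. In a 10-fold cross-validation, the same function would be applied to each held-out fold.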

Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) have the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM has the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.

Feasibility of Machine Learning Algorithms for Predicting the Deformation of Anodic Titanium Films by Modulating Anodization Processes

Materials ◽

10.3390/ma14051089 ◽

2021 ◽

Vol 14 (5) ◽

pp. 1089

Author(s):

Sung-Hee Kim ◽

Chanyoung Jeong

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Multiclass Classification ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Smart Manufacturing ◽

Gradient Boosting ◽

Experimental Conditions ◽

Learning Techniques ◽

Tio2 Nanostructures

This study aims to demonstrate the feasibility of applying eight machine learning algorithms to predict the classification of the surface characteristics of titanium oxide (TiO2) nanostructures with different anodization processes. We produced a total of 100 samples, and we assessed changes in TiO2 nanostructures’ thicknesses by performing anodization. We successfully grew TiO2 films with different thicknesses by one-step anodization in ethylene glycol containing NH4F and H2O at applied voltage differences ranging from 10 V to 100 V at various anodization durations. We found that the thicknesses of TiO2 nanostructures depend on the anodization voltage for different anodization times. Therefore, we tested the feasibility of applying machine learning algorithms to predict the deformation of TiO2. As the characteristics of TiO2 changed based on the different experimental conditions, we classified its surface pore structure into two categories and four groups. For the classification based on granularity, we assessed layer creation, roughness, pore creation, and pore height. We applied eight machine learning techniques to both binary and multiclass classification. For binary classification, the random forest and gradient boosting algorithms had relatively high performance. However, all eight algorithms had scores higher than 0.93, which signifies high predictive performance in estimating the presence of pores. In contrast, decision tree and three ensemble methods had relatively higher performance for multiclass classification, with an accuracy rate greater than 0.79. The weakest algorithm was k-nearest neighbors for both binary and multiclass classification. We believe that these results show that we can apply machine learning techniques to predict surface quality improvement, leading to smart manufacturing technology to better control color appearance, super-hydrophobicity, super-hydrophilicity or battery efficiency.

Predictors of outpatients’ no-show: big data analytics using apache spark

Journal Of Big Data ◽

10.1186/s40537-020-00384-9 ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Healthcare Organizations ◽

Data Framework ◽

Learning Techniques

Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations therefore face new opportunities, one of which is to improve the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the huge volume of data generated. We propose a big data framework for identifying outpatients’ no-shows via feature engineering and machine learning (MLlib) on the Spark platform. This study evaluates the performance of five machine learning techniques using data on 2,011,813 outpatient visits. After conducting several experiments and using different validation methods, Gradient Boosting (GB) performed best, increasing accuracy and ROC to 79% and 81%, respectively. In addition, we showed that exploring and evaluating the performance of machine learning models using various evaluation methods is critical, as the accuracy of prediction can differ significantly. The aim of this paper is to explore factors that affect the no-show rate and can be used to formulate predictions using big data machine learning techniques.
