Evaluation of Boosted Regression Tree for the Prediction of the Maximum 24-Hour Concentration of Particulate Matter

Author(s):  
Wan Nur Shaziayani ◽  
Ahmad Zia Ul-Saufie ◽  
Syarifah Adilah Mohamed Yusoff ◽  
Hasfazilah Ahmat ◽  
...  

Air pollution poses a considerable danger to health and the environment. The objective of this study was to assess the characteristics of air quality and predict PM10 concentrations using boosted regression trees (BRTs). The maximum daily PM10 concentration data from 2002 to 2016 were obtained from the air quality monitoring station in Kuching, Sarawak. Eighty percent of the monitoring records were used for training and twenty percent for validation of the models. The best iteration of the BRT model was selected by optimizing the prediction performance, while the BRT model itself was constructed from multiple regression models. The two main parameters, the learning rate (lr) and tree complexity (tc), were fixed at 0.01 and 5, respectively. Meanwhile, the number of trees (nt) was determined using an independent test set (test), 5-fold cross-validation (CV) and out-of-bag (OOB) estimation. The BRT model produced using CV was a better guide than the one produced using OOB estimation for predicting PM10 concentrations. The performance indicators showed that the model was adequate for next-day prediction (PA=0.638, R2=0.427, IA=0.749, NAE=0.267, and RMSE=28.455).
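The abstract reports RMSE, NAE and IA without giving their formulas. The following minimal sketch shows one common way to compute them, assuming RMSE is averaged over n, NAE is the summed absolute error divided by the summed observations, and IA is Willmott's index of agreement. These definitions, the function names and the sample values are illustrative assumptions, not the authors' code or data. PA and R2 are omitted because their definitions vary between studies.

```python
import math


def rmse(obs, pred):
    """Root mean squared error, averaged over n (assumed convention)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nae(obs, pred):
    """Normalised absolute error: total absolute error over total observed."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (1 means perfect agreement)."""
    o_bar = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Hypothetical daily maximum PM10 values, for illustration only.
observed = [40.0, 55.0, 70.0, 90.0]
predicted = [45.0, 50.0, 65.0, 100.0]

print(rmse(observed, predicted))
print(nae(observed, predicted))
print(index_of_agreement(observed, predicted))
```

Lower RMSE and NAE indicate smaller errors, while IA closer to 1 indicates better agreement between predicted and observed values.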

Author(s):  
Sandra Ceballos-Santos ◽  
Jaime González-Pardo ◽  
David C. Carslaw ◽  
Ana Santurtún ◽  
Miguel Santibáñez ◽  
...  

The global COVID-19 pandemic that began in late December 2019 led to unprecedented lockdowns worldwide, providing a unique opportunity to investigate in detail the impacts of restricted anthropogenic emissions on air quality. A wide range of strategies and approaches exist to achieve this. In this paper, we use the “deweather” R package, based on Boosted Regression Tree (BRT) models, first to remove the influences of meteorology and emission trend patterns from NO, NO2, PM10 and O3 data series, and then to calculate the relative changes in air pollutant levels in 2020 with respect to the previous seven years (2013–2019). Data from a northern Spanish region, Cantabria, with all types of monitoring stations (traffic, urban background, industrial and rural) were used, dividing the calendar year into eight periods according to the intensity of government restrictions. The results showed mean reductions in the lockdown period above −50% for NOx, around −10% for PM10 and below −5% for O3. Small differences were found between the relative changes obtained from normalised data with respect to those from observations. These results highlight the importance of developing an integrated policy to reduce anthropogenic emissions and the need to move towards sustainable mobility to ensure safer air quality levels, as pre-existing concentrations in some cases exceed the safe threshold.
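The relative changes quoted above compare 2020 levels with the previous seven years. As a rough sketch of that last arithmetic step only, the snippet below computes a percentage change against the mean of a multi-year baseline. It does not reproduce the "deweather" normalisation, and the function name and concentration values are hypothetical.

```python
def relative_change_percent(value_2020, previous_years):
    """Percentage change of the 2020 value relative to the baseline mean."""
    baseline = sum(previous_years) / len(previous_years)
    if baseline == 0:
        raise ValueError("baseline mean must be non-zero")
    return 100.0 * (value_2020 - baseline) / baseline


# Hypothetical period means for one pollutant at one station (2013-2019).
baseline_2013_2019 = [20.0, 22.0, 18.0, 20.0, 21.0, 19.0, 20.0]
lockdown_2020 = 9.0

change = relative_change_percent(lockdown_2020, baseline_2013_2019)
print(change)  # negative values indicate a reduction
```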


2019 ◽  
Vol 70 (12) ◽  
pp. 2476-2483 ◽  
Author(s):  
Alpha Forna ◽  
Pierre Nouvellet ◽  
Ilaria Dorigatti ◽  
Christl A Donnelly

Abstract. Background: The 2013–2016 West African Ebola epidemic has been the largest to date with >11 000 deaths in the affected countries. The data collected have provided more insight into the case fatality ratio (CFR) and how it varies with age and other characteristics. However, the accuracy and precision of the naive CFR remain limited because 44% of survival outcomes were unreported. Methods: Using a boosted regression tree model, we imputed survival outcomes (ie, survival or death) when unreported, corrected for model imperfection to estimate the CFR without imputation, with imputation, and adjusted with imputation. The method allowed us to further identify and explore relevant clinical and demographic predictors of the CFR. Results: The out-of-sample performance (95% confidence interval [CI]) of our model was good: sensitivity, 69.7% (52.5–75.6%); specificity, 69.8% (54.1–75.6%); percentage correctly classified, 69.9% (53.7–75.5%); and area under the receiver operating characteristic curve, 76.0% (56.8–82.1%). The adjusted CFR estimates (95% CI) for the 2013–2016 West African epidemic were 82.8% (45.6–85.6%) overall and 89.1% (40.8–91.6%), 65.6% (61.3–69.6%), and 79.2% (45.4–84.1%) for Sierra Leone, Guinea, and Liberia, respectively. We found that district, hospitalisation status, age, case classification, and quarter (date of case reporting aggregated at three-month intervals) explained 93.6% of the variance in the naive CFR. Conclusions: The adjusted CFR estimates improved the naive CFR estimates obtained without imputation and were more representative. Used in conjunction with other resources, adjusted estimates will inform public health contingency planning for future Ebola epidemics, and help better allocate resources and evaluate the effectiveness of future interventions.
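To make the distinction between the naive CFR and the CFR with imputation concrete, here is a minimal sketch. It assumes outcomes are coded 1 for death, 0 for survival and None when unreported, and that a classifier has already produced a predicted outcome for every case. The function names and the data are hypothetical, and the paper's further adjustment for model imperfection is not shown because the abstract does not describe it.

```python
def naive_cfr(outcomes):
    """CFR over cases with a reported outcome only (1 = died, 0 = survived)."""
    known = [o for o in outcomes if o is not None]
    return sum(known) / len(known)


def imputed_cfr(outcomes, predicted):
    """CFR after filling unreported outcomes with model predictions."""
    filled = [p if o is None else o for o, p in zip(outcomes, predicted)]
    return sum(filled) / len(filled)


# Hypothetical case records; None marks an unreported survival outcome.
reported = [1, 0, None, 1, None, 0, 1, None]
# Hypothetical classifier output for the same cases.
model_prediction = [1, 0, 1, 1, 1, 0, 1, 0]

print(naive_cfr(reported))
print(imputed_cfr(reported, model_prediction))
```

The two estimates differ whenever the predicted fatality rate among unreported cases differs from the rate among reported ones, which is why missing outcomes can bias the naive figure.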


2019 ◽  
Vol 8 (4) ◽  
pp. 1565-1575 ◽  

The stochastic boosted regression trees (BRT) technique has the capability to quantify and explain the relationships between explanatory variables. We applied this machine learning modelling technique to derive the relationships between gaseous air pollutants, meteorological conditions, time system variables and particulate matter (PM10) concentrations. To obtain the lowest prediction error and to avoid over-fitting, the parameters of the BRT model need to be tuned. In this experiment, 25 BRT models were generated from 14 years’ worth of hourly data (122,736 one-hour averaged records from January 2000 to December 2013) gathered from four Continuous Automated Air Quality Monitoring Stations in peninsular Malaysia, located in Klang, Selangor (CA0011); Perai, Penang (CA0003); Kota Bharu, Kelantan (CA0022); and Kemaman, Terengganu (CA0002). Seventy percent of the data were used for training and 30 percent for validation of the models. An experiment was conducted to determine the best iteration for modelling hourly PM10 concentrations by optimizing the BRT parameters, namely the learning rate (lr), tree complexity (tc) and number of trees (nt). Five different lr values (0.001, 0.005, 0.01, 0.05 and 0.1) were tested with different tree complexities (1 to 20) in the BRT model development process. In the experiment, the combination of lr = 0.05 and tc = 5 achieved the lowest root mean squared error (RMSE) on the training set compared with the other tested combinations. It was also found that the number of trees increased with the number of samples. A high coefficient of determination (R2) value (0.90) for the linear relationship between the number of samples and nt was found for all four stations. The optimum number of trees for the model was estimated using 10-fold cross-validation. The best numbers of iterations for Klang, Perai, Kota Bharu and Kemaman were 12,327, 32,987, 16,370 and 57,634, respectively. The prediction accuracy of the model was tested using the fraction of predictions within a factor of two (FAC2), mean bias, mean gross error, RMSE, correlation coefficient (R), and index of agreement (IOA). The prediction performance of the final BRT model based on the R value was 0.81, 0.78, 0.85 and 0.81 for Perai, Kemaman, Klang and Kota Bharu, respectively, which indicates that the developed BRT model can be applied to other atmospheric environment data.
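Three of the accuracy measures named above can be sketched in a few lines. The snippet assumes the usual definitions: FAC2 is the fraction of predictions with a predicted-to-observed ratio between 0.5 and 2, mean bias is the average of predicted minus observed, and mean gross error is the average absolute difference. The function names and values are hypothetical, not taken from the study.

```python
def fac2(obs, pred):
    """Fraction of predictions within a factor of two of the observation."""
    pairs = [(o, p) for o, p in zip(obs, pred) if o > 0]  # skip zero observations
    within = sum(1 for o, p in pairs if 0.5 <= p / o <= 2.0)
    return within / len(pairs)


def mean_bias(obs, pred):
    """Average signed error; positive means over-prediction."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def mean_gross_error(obs, pred):
    """Average absolute error, ignoring the sign."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


# Hypothetical hourly PM10 values, for illustration only.
observed_pm10 = [20.0, 40.0, 60.0, 80.0]
predicted_pm10 = [50.0, 35.0, 25.0, 90.0]

print(fac2(observed_pm10, predicted_pm10))
print(mean_bias(observed_pm10, predicted_pm10))
print(mean_gross_error(observed_pm10, predicted_pm10))
```

In this example the mean bias is zero even though the mean gross error is large, because over- and under-predictions cancel out. This is one reason several indicators are reported together.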


Atmosphere ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 609
Author(s):  
Alexandra Abigail Encalada-Malca ◽  
Javier David Cochachi-Bustamante ◽  
Paulo Canas Rodrigues ◽  
Rodrigo Salas ◽  
Javier Linkolk López-Gonzales

Lima is considered one of the cities with the highest air pollution in Latin America. Institutions such as DIGESA, PROTRANSPORTE and SENAMHI are in charge of permanently monitoring air quality; therefore, the air quality visualization system must manage large amounts of data of different concentrations. In this study, a spatio-temporal visualization approach was developed for the exploration of PM10 concentration data in Metropolitan Lima, where the spatial behavior, at different time scales, of hourly PM10 concentrations is analyzed using basic and specialized charts. The results show that the stations located on the east side of the metropolitan area had the highest concentrations, in contrast to the stations located in the center and north, which reported better air quality. In terms of temporal variation, the station with the highest biannual and annual PM10 averages was the HCH station. The highest PM10 concentrations were registered in 2018, during the summer, notably in March, with daily averages that reached 435 μg/m3. During the study period, CRB was the station that recorded the lowest concentrations and the only one that met the Environmental Quality Standard for air quality. The proposed approach sets out a sequence of steps for the elaboration of charts with increasingly specific time periods according to their relevance, and a statistical analysis, such as the dynamic temporal correlation, that provides a detailed visualization of the spatio-temporal variations of PM10 concentrations. Furthermore, it was concluded that the meteorological variables do not indicate a causal relationship with respect to PM10 levels, but rather that the concentrations of particulate material are related to the urban characteristics of each district.


2020 ◽  
Author(s):  
Adrienn Varga-Balogh ◽  
Ádám Leelőssy ◽  
István Lagzi ◽  
Róbert Mészáros

Winter air pollution in Budapest is a major environmental issue, caused by an interaction of residential heating, urban traffic and large-scale transport. There is increasing public and political demand for more accurate air quality predictions to support both real-time public health measures and long-term mitigation policies. Atmospheric chemistry and transport models of the Copernicus Atmosphere Monitoring Service (CAMS) provide near-real-time air quality forecasts for Europe. The validation of these model predictions for Budapest showed that although large-scale processes are well captured, the complex interaction of large-scale plumes with significant and highly variable local residential emissions leads to the underestimation of winter PM10 concentrations. Furthermore, CAMS models are not expected to fully predict the non-representative concentrations at specific urban monitoring locations, which, on the other hand, serve as the legal basis of all public policies and measures. Therefore, obtaining a relationship between monitoring site observations and CAMS model predictions is of primary importance.

In this study, we used observed PM10 concentration data from 12 air quality monitoring sites within Budapest, as well as 24-hour predictions from 7 of the 9 CAMS models, to produce an optimal linear combination of models that best matched, in terms of RMSE, the observed time series. A zero-degree term to correct the model bias was also applied. The applied data fusion method was cross-validated on urban monitoring sites not used in fitting the model, and was found to improve PM10 forecast validation statistics compared to the pointwise model median (CAMS ensemble) as well as each of the 7 single models. The presented fusion of CAMS models can therefore provide an improved prediction of PM10 concentrations at urban monitoring sites in Budapest.
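A linear combination of model forecasts plus a constant (zero-degree) term that minimises RMSE against observations is an ordinary least-squares problem. The sketch below solves it through the normal equations in plain Python. It assumes only two hypothetical models and synthetic values, and it leaves out the cross-validation across monitoring sites described above. The function names are illustrative, not from the study.

```python
def fit_linear_fusion(model_preds, observed):
    """Least-squares weights [bias, w1, w2, ...] for combining model forecasts.

    model_preds holds one row per time step and one column per model.
    """
    rows = [[1.0] + list(r) for r in model_preds]  # leading 1.0 is the bias term
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, observed)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    w = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, k))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def fuse(weights, preds_row):
    """Apply fitted weights to one row of model forecasts."""
    return weights[0] + sum(wi * p for wi, p in zip(weights[1:], preds_row))


# Synthetic forecasts from two hypothetical models, and matching observations
# constructed as 5 + 0.6 * model_1 + 0.3 * model_2.
forecasts = [[10.0, 20.0], [30.0, 10.0], [20.0, 40.0], [50.0, 30.0]]
observations = [17.0, 26.0, 29.0, 44.0]

weights = fit_linear_fusion(forecasts, observations)
print(weights)
print(fuse(weights, [40.0, 20.0]))
```

Minimising the sum of squared errors gives the same weights as minimising RMSE on the fitting data, since RMSE is a monotonic function of that sum.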


Author(s):  
Ying Wang ◽  
Jing Tao ◽  
Rong Wang ◽  
Chuanmin Mi

The large-scale construction of subway systems, which is viewed as one of the potential measures to mitigate traffic congestion and its resulting air pollution and health impact, is taking place in major cities throughout China. However, the literature on the impact of new subway line openings on particulate matter with a diameter less than 10 µm (PM10) at the city level is scarce. Employing the Propensity Score Matching–Difference-in-differences method, this paper examines the effect of new subway line openings on air quality in terms of PM10 in China, using daily PM10 concentration data from January 2014 to December 2017. Our findings show that the short-term treatment effect on PM10 is less clear-cut. Furthermore, for different time windows, the results confirm an increase in PM10 pollution in the short term, while the subway line openings improve air quality in the longer term. In addition, we find that the treatment effect results in higher PM10 pollution for cities with 1–2 million people, while it improves air quality for cities with over 2 million people. Moreover, for cities with varying levels of GDP, there is evidence of a reduction in PM10 after the subway line openings. Mechanism analysis supports the conclusion that the PM10 reduction originated from substituting the subway for driving.
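The difference-in-differences part of the method compares the before-and-after change in treated cities with the same change in control cities. A minimal sketch of the basic two-group, two-period estimator follows. It assumes the propensity score matching step has already selected comparable control cities, which is not shown, and the function name and PM10 values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical mean daily PM10 levels before and after a subway line opening.
effect = difference_in_differences(
    treated_pre=[100.0, 110.0],
    treated_post=[90.0, 94.0],
    control_pre=[80.0, 84.0],
    control_post=[78.0, 82.0],
)
print(effect)  # negative values suggest a PM10 reduction attributable to treatment
```

Subtracting the control group's change removes trends shared by both groups, such as region-wide weather or policy effects, under the assumption that the two groups would otherwise have followed parallel trends.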


2020 ◽  
Vol 638 ◽  
pp. 149-164
Author(s):  
GM Svendsen ◽  
M Ocampo Reinaldo ◽  
MA Romero ◽  
G Williams ◽  
A Magurran ◽  
...  

With the unprecedented rate of biodiversity change in the world today, understanding how diversity gradients are maintained at mesoscales is a key challenge. Drawing on information provided by 3 comprehensive fishery surveys (conducted in different years but in the same season and with the same sampling design), we used boosted regression tree (BRT) models to relate spatial patterns of α-diversity in a demersal fish assemblage to environmental variables in the San Matias Gulf (Patagonia, Argentina). We found that, over a 4 yr period, persistent diversity gradients of species richness and probability of an interspecific encounter (PIE) were shaped by 3 main environmental gradients: bottom depth, connectivity with the open ocean, and proximity to a thermal front. The 2 main patterns we observed were: a monotonic increase in PIE with proximity to fronts, which had a stronger effect at greater depths; and an increase in PIE when closer to the open ocean (a ‘bay effect’ pattern). The originality of this work lies in the identification of high-resolution gradients in local, demersal assemblages driven by static and dynamic environmental gradients in a mesoscale seascape. The maintenance of environmental gradients, specifically those associated with shared resources and connectivity with an open system, may be key to understanding community stability.
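The abstract does not state which formula was used for PIE. The sketch below assumes Hurlbert's form, PIE = N/(N-1) × (1 - Σ pᵢ²), which is the probability that two individuals drawn at random without replacement belong to different species. The function name and the catch counts are hypothetical.

```python
def probability_interspecific_encounter(counts):
    """Hurlbert's PIE from per-species abundance counts in one sample."""
    total = sum(counts)
    if total < 2:
        raise ValueError("PIE needs at least two individuals")
    sum_p_squared = sum((n / total) ** 2 for n in counts)
    return (total / (total - 1)) * (1.0 - sum_p_squared)


# Hypothetical abundances of three species in a single trawl.
trawl_counts = [5, 3, 2]
print(probability_interspecific_encounter(trawl_counts))
```

A sample dominated by one species gives a PIE near 0, while an even spread across many species pushes it towards 1, so PIE captures evenness in a way that species richness alone does not.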


Atmosphere ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 562
Author(s):  
Jorge Moreda-Piñeiro ◽  
Joel Sánchez-Piñero ◽  
María Fernández-Amado ◽  
Paula Costa-Tomé ◽  
Nuria Gallego-Fernández ◽  
...  

Due to the exponential growth of the SARS-CoV-2 pandemic in Spain in 2020, the Spanish Government adopted lockdown measures from 14 March as mitigating strategies to reduce the spread of the pandemic. In this paper, we report the changes in air quality at two Atlantic coastal European cities (Northwest Spain) during five lockdown weeks. The temporal evolution of gaseous pollutants (nitrogen oxides, comprising NOx, NO, and NO2; sulfur dioxide, SO2; carbon monoxide, CO; and ozone, O3) and particulate matter pollutants (PM10; PM2.5; and equivalent black carbon, eBC) was recorded before (7 February to 13 March 2020) and during the first five lockdown weeks (14 March to 20 April 2020) at seven air quality monitoring stations (urban background, traffic, and industrial) in the cities of A Coruña and Vigo. The influences of the backward trajectories and meteorological parameters on air pollutant concentrations were considered during the studied period. The temporal trends indicate that the concentrations of almost all species decreased steadily and with statistical significance during the lockdown period, with respect to the pre-lockdown period. In this context, large reductions were observed for pollutants related mainly to fossil fuel combustion, road traffic, and shipping emissions (−38 to −78% for NO, −22 to −69% for NO2, −26 to −75% for NOx, −3 to −77% for SO2, −21% for CO, −25 to −49% for PM10, −10 to −38% for PM2.5, and −29 to −51% for eBC). Conversely, O3 concentrations increased by +5 to +16%. Finally, pollutant concentration data for 14 March to 20 April 2020 were compared with those of the previous two years. The results show that overall air pollutant levels were higher during 2018–2019 than during the lockdown period.


Author(s):  
Ghalia Gamaleldin ◽  
Haitham Al-Deek ◽  
Adrian Sandt ◽  
John McCombs ◽  
Alan El-Urfali

Safety performance functions (SPFs) are essential tools to help agencies predict crashes and understand influential factors. Florida Department of Transportation (FDOT) has implemented a context classification system which classifies intersections into eight context categories rather than the three classifications used in the Highway Safety Manual (HSM). Using this system, regional SPFs could be developed for 32 intersection types (unsignalized and signalized 3-leg and 4-leg for each category) rather than the 10 HSM intersection types. In this paper, eight individual intersection group SPFs were developed for the C3R-Suburban Residential and C4-Urban General categories and compared with full SPFs for these categories. These comparisons illustrate the unique and regional insights that agencies can gain by developing these individual SPFs. Poisson, negative binomial, zero-inflated, and boosted regression tree models were developed for each studied group as appropriate, with the best model selected for each group based on model interpretability and five performance measures. Additionally, a linear regression model was built to predict minor roadway traffic volumes for intersections which were missing these volumes. The full C3R and C4 SPFs contained four and six significant variables, respectively, while the individual intersection group SPFs in these categories contained six and nine variables. Factors such as major median, intersection angle, and FDOT District 7 regional variable were absent from the full SPFs. By developing individual intersection group SPFs with regional factors, agencies can better understand the factors and regional differences which affect crashes in their jurisdictions and identify effective treatments.

