Addressing Missing Environmental Data via a Machine Learning Scheme

Atmosphere ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 499
Author(s):  
Chris G. Tzanis ◽  
Anastasios Alimissis ◽  
Ioannis Koutsogiannis

An important aspect of the environmental sciences is the study of air quality using statistical methods (environmental statistics) that draw on large datasets of climatic parameters. The air-quality-monitoring networks that operate in urban areas provide data on the most important pollutants, which, via environmental statistics, can be used to develop continuous surfaces of pollutant concentrations. Generating ambient air-quality maps can help policy makers and researchers formulate measures to minimize the adverse effects of air pollution. The information needed for a mapping application can be obtained by applying spatial interpolation methods to the available data to estimate air-quality distributions. This study used point-monitoring data from the network of stations that operates in Athens, Greece. A machine-learning scheme was applied to spatially estimate pollutant concentrations, and the results can be used to fill in missing values and provide representative data for statistical analysis.
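To illustrate what spatially estimating a concentration from point-monitoring data involves, here is a minimal sketch using inverse distance weighting (IDW). IDW is a classical interpolation baseline swapped in for illustration only; it is not the machine-learning scheme of the study. The function name, the station coordinates, and the concentration values are all hypothetical.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station tuples.

    Each station is weighted by 1 / distance**power, so nearer stations
    contribute more to the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


# Hypothetical stations: (x in km, y in km, pollutant concentration)
stations = [(0.0, 0.0, 40.0), (10.0, 0.0, 60.0), (0.0, 10.0, 50.0)]
estimate = idw_estimate(stations, (5.0, 0.0))
print(estimate)
```

The same function can fill a gap at a monitored location by leaving the station with the missing record out of `stations` and passing its coordinates as `target`.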

2020 ◽  
Vol 4 (1) ◽  
pp. 11
Author(s):  
Chris G. Tzanis ◽  
Anastasios Alimissis ◽  
Ioannis Koutsogiannis

An important aspect of the environmental sciences is the study of air quality using statistical methods (environmental statistics) that draw on large datasets of climatic parameters. The air quality monitoring networks that operate in urban areas provide data on the most important pollutants, which, via environmental statistics, can be used to develop continuous surfaces of pollutant concentrations. Generating ambient air quality maps can help policy makers and researchers formulate measures to minimize the adverse effects of air pollution. The information needed for a mapping application can be obtained by applying spatial interpolation methods to the available data to estimate air quality distributions. This study used point monitoring data from the network of stations that operates in Athens. A machine learning scheme was applied to spatially estimate pollutant concentrations, and the results could be used to fill in missing values and provide representative data for statistical analysis.


2017 ◽  
Vol 68 (4) ◽  
pp. 841-846
Author(s):  
Hai-Ying Liu ◽  
Daniel Dunea ◽  
Mihaela Oprea ◽  
Tom Savu ◽  
Stefania Iordache

This paper presents the approach used to develop the information chain required to reach the objectives of the EEA Grants’ RokidAIR project in two Romanian cities i.e., Targoviste and Ploiesti. It describes the PM2.5 monitoring infrastructure and architecture to the web-based GIS platform, the early warning system and the decision support system, and finally, the linking of air pollution to health effects in children. In addition, it shows the analysis performance of the designed system to process the collected time series from various data sources using the benzene concentrations monitored in Ploiesti. Moreover, this paper suggests that biomarkers, mobile technologies, and Citizens’ Observatories are potential perspectives to improve data coverage by the provision of near-real-time air quality maps, and provide personal exposure and health assessment results, enabling the citizens’ engagement and behavioural change. This paper also addresses new fields in nature-based solutions to improve air quality, and studies on air pollution and its mental health effects in the urban areas of Romania.


Author(s):  
Ahmad R. Alsaber ◽  
Jiazhu Pan ◽  
Adeeba Al-Hurban 

In environmental research, missing data are often a challenge for statistical modeling. This paper addressed some advanced techniques to deal with missing values in a data set measuring air quality using a multiple imputation (MI) approach. MCAR, MAR, and NMAR missing data techniques are applied to the data set. Five missing data levels are considered: 5%, 10%, 20%, 30%, and 40%. The imputation method used in this paper is an iterative imputation method, missForest, which is related to the random forest approach. Air quality data sets were gathered from five monitoring stations in Kuwait, aggregated to a daily basis. Logarithm transformation was carried out for all pollutant data, in order to normalize their distributions and to minimize skewness. We found high levels of missing values for NO2 (18.4%), CO (18.5%), PM10 (57.4%), SO2 (19.0%), and O3 (18.2%) data. Climatological data (i.e., air temperature, relative humidity, wind direction, and wind speed) were used as control variables for better estimation. The results show that the MAR technique had the lowest RMSE and MAE. We conclude that MI using the missForest approach has a high level of accuracy in estimating missing values. MissForest had the lowest imputation error (RMSE and MAE) among the other imputation methods and, thus, can be considered to be appropriate for analyzing air quality data.
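The evaluation described above (hide known values, impute them, then score the imputations with RMSE and MAE) can be sketched as follows. This is a minimal illustration under stated assumptions: values are removed completely at random (MCAR only), the imputer is a plain mean fill standing in for missForest, and the concentrations and function names are hypothetical rather than taken from the paper.

```python
import math
import random


def mask_mcar(values, fraction, seed=0):
    """Hide a fraction of the values completely at random (MCAR)."""
    rng = random.Random(seed)
    n_missing = int(round(fraction * len(values)))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    masked = [None if i in missing_idx else v for i, v in enumerate(values)]
    return masked, sorted(missing_idx)


def impute_mean(masked):
    """Stand-in imputer (NOT missForest): fill gaps with the observed mean."""
    observed = [v for v in masked if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in masked]


def rmse(truth, pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def mae(truth, pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


# Hypothetical daily concentrations, log-transformed as in the abstract.
raw = [12.0, 15.0, 9.0, 30.0, 22.0, 18.0, 11.0, 25.0, 14.0, 20.0]
true_values = [math.log(v) for v in raw]

masked, idx = mask_mcar(true_values, 0.2)  # 20% missing level
filled = impute_mean(masked)

# Score only the entries that were hidden.
truth_at_gaps = [true_values[i] for i in idx]
pred_at_gaps = [filled[i] for i in idx]
errors = (rmse(truth_at_gaps, pred_at_gaps), mae(truth_at_gaps, pred_at_gaps))
print(errors)
```

Repeating the run for each missing level (5% to 40%) and each imputer gives the kind of error comparison the abstract reports; MAR and NMAR masking would need a mask that depends on other variables or on the hidden values themselves.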


Atmosphere ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 264 ◽  
Author(s):  
Giovanni Lonati ◽  
Federico Riva

The impact of the reduced atmospheric emissions due to the COVID-19 lockdown on ambient air quality in the Po Valley of Northern Italy was assessed for gaseous pollutants (NO2, benzene, ammonia) based on data collected at the monitoring stations distributed all over the area. Concentration data for each month of the first semester of 2020 were compared with those of the previous six years, on monthly, daily, and hourly bases, so that pre-, during-, and post-lockdown conditions of air quality could be separately analyzed. The results show that, as in many other areas worldwide, the Po Valley experienced better air quality during 2020 spring months for NO2 and benzene. In agreement with the reductions of nitrogen oxides and benzene emissions from road traffic, estimated to be −35% compared to the regional average, the monthly mean concentration levels for 2020 showed reductions in the −40% to −35% range compared with the previous years, but with higher reductions, close to −50%, at high-volume-traffic sites in urban areas. Conversely, NH3 ambient concentration levels, almost entirely due to the emissions of the agricultural sector, did not show any relevant change, even at high-volume-traffic sites in urban areas. These results point out the important role of traffic emissions in NO2 and benzene ambient levels in the Po Valley, and confirm that this region is a rather homogeneous air basin with urban area hot-spots, the contributions of which add up to a relatively high regional background concentration level. Additionally, the relatively slow response of the air quality levels to the sudden decrease of the emissions due to the lockdown shows that this region is characterized by a weak exchange of the air masses that favors both the build-up of atmospheric pollutants and the development of secondary formation processes. Thus, air quality control strategies should aim for structural interventions intended to reduce traffic emissions at the regional scale and not only in the largest urban areas.


2021 ◽  
Vol 21 (9) ◽  
pp. 7373-7394
Author(s):  
Jérôme Barré ◽  
Hervé Petetin ◽  
Augustin Colette ◽  
Marc Guevara ◽  
Vincent-Henri Peuch ◽  
...  

Abstract. This study provides a comprehensive assessment of NO2 changes across the main European urban areas induced by COVID-19 lockdowns using satellite retrievals from the Tropospheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5p satellite, surface site measurements, and simulations from the Copernicus Atmosphere Monitoring Service (CAMS) regional ensemble of air quality models. Some recent TROPOMI-based estimates of changes in atmospheric NO2 concentrations have neglected the influence of weather variability between the reference and lockdown periods. Here we provide weather-normalized estimates based on a machine learning method (gradient boosting) along with an assessment of the biases that can be expected from methods that omit the influence of weather. We also compare the weather-normalized satellite-estimated NO2 column changes with weather-normalized surface NO2 concentration changes and the CAMS regional ensemble, composed of 11 models, using recently published estimates of emission reductions induced by the lockdown. All estimates show similar NO2 reductions. Locations where the lockdown measures were stricter show stronger reductions, and, conversely, locations where softer measures were implemented show milder reductions in NO2 pollution levels. Average reduction estimates based on either satellite observations (−23 %), surface stations (−43 %), or models (−32 %) are presented, showing the importance of vertical sampling but also the horizontal representativeness. Surface station estimates are significantly changed when sampled to the TROPOMI overpasses (−37 %), pointing out the importance of the variability in time of such estimates. Observation-based machine learning estimates show a stronger temporal variability than model-based estimates.


Atmosphere ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 492 ◽  
Author(s):  
Petra Bauerová ◽  
Adriana Šindelářová ◽  
Štěpán Rychlík ◽  
Zbyněk Novák ◽  
Josef Keder

With attention increasing regarding the level of air pollution in different metropolitan and industrial areas worldwide, interest in expanding the monitoring networks by low-cost air quality sensors is also increasing. Although the role of these small and affordable sensors is rather supplementary, determination of the measurement uncertainty is one of the main questions of their applicability because there is no certificate for quality assurance of these non-reference technologies. This paper presents the results of almost one-year field testing measurements, when the data from different low-cost sensors (for SO2, NO2, O3, and CO: Cairclip, Envea, FR; for PM1, PM2.5, and PM10: PMS7003, Plantower, CHN, and OPC-N2, Alphasense, UK) were compared with co-located reference monitors used within the Czech national ambient air quality monitoring network. The results showed that in addition to the given reduced measurement accuracy of the sensors, the data quality depends on the early detection of defective units and changes caused by the effect of meteorological conditions (effect of air temperature and humidity on gas sensors and effect of air humidity with condensation conditions on particle counters), or by the interference of different pollutants (especially in gas sensors). Comparative measurement is necessary prior to each sensor’s field applications.


2019 ◽  
Vol 109 ◽  
pp. 277-282 ◽  
Author(s):  
Corbett Grainger ◽  
Andrew Schreiber

In the United States, ambient air quality is regulated through the National Ambient Air Quality Standards (NAAQS). Enforcement of these standards is delegated to state and sub-state regulators who are also tasked with designing their own monitoring networks for ambient pollution. Past work has found evidence consistent with strategic behavior: local regulators strategically avoid pollution hotspots when siting monitors. This paper assesses whether income and race have historically played a role in monitor siting decisions.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-23 ◽  
Author(s):  
Mauro Castelli ◽  
Fabiana Martins Clemente ◽  
Aleš Popovič ◽  
Sara Silva ◽  
Leonardo Vanneschi

Predicting air quality is a complex task due to the dynamic nature, volatility, and high variability in time and space of pollutants and particulates. At the same time, being able to model, predict, and monitor air quality is becoming more and more relevant, especially in urban areas, due to the observed critical impact of air pollution on citizens’ health and the environment. In this paper, we employ a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the various tested alternatives, radial basis function (RBF) was the type of kernel that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables revealed a more successful strategy than selecting features using principal component analysis. The presented results demonstrate that SVR with RBF kernel allows us to accurately predict hourly pollutant concentrations, like carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5, as well as the hourly AQI for the state of California. Classification into six AQI categories defined by the US Environmental Protection Agency was performed with an accuracy of 94.1% on unseen validation data.


2009 ◽  
Vol 18 (3) ◽  
pp. 336 ◽  
Author(s):  
Yongqiang Liu ◽  
Scott Goodrick ◽  
Gary Achtemeier ◽  
William A. Jackson ◽  
John J. Qu ◽  
...  

This study investigates smoke incursion into urban areas by examining a prescribed burn in central Georgia, USA, on 28 February 2007. Simulations were conducted with a regional modeling framework to understand transport, dispersion, and structure of smoke plumes, the air quality effects, sensitivity to emissions, and the roles of burn management strategy in mitigating the effects. The results indicate that smoke plumes first went west, but turned north-west at noon owing to a shift in wind direction. The smoke then invaded metropolitan Atlanta during the evening rush hour. The plumes caused severe air quality problems in Atlanta. Some hourly ground PM2.5 (particulate matter not greater than 2.5 μm in diameter) concentrations at three metropolitan Atlanta locations were three to four times as high as the daily (24-h) US National Ambient Air Quality Standard. The simulated shift in the smoke transport direction and the resultant effects on air quality are supported by the satellite and ambient air measurements. Two sensitivity simulations indicate a nearly linear relation between the emission intensities and PM2.5 concentrations. Two other simulations indicate that the impacts on air quality for the residents of Atlanta during the evening commute could have been reduced if the starting time of the burn had been altered.

