AirInsight: Visual Exploration and Interpretation of Latent Patterns and Anomalies in Air Quality Data

2019 ◽  
Vol 11 (10) ◽  
pp. 2944 ◽  
Author(s):  
Huijie Zhang ◽  
Ke Ren ◽  
Yiming Lin ◽  
Dezhan Qu ◽  
Zhenxin Li

The huge volume of air quality data now available provides unprecedented opportunities for analyzing pollution. However, because of the data's high complexity, most traditional analytical methods focus on abstracting the data, so these techniques discard the original structure and limit the understanding of the results. Visual analysis is a powerful technique for exploring unknown patterns since it retains the details of the original data and gives visual feedback to users. In this paper, we focus on air quality data and propose the AirInsight design, an interactive visual analytic system for recognizing, exploring, and summarizing regular patterns, as well as detecting, classifying, and interpreting abnormal cases. Based on the time-varying and multivariate features of air quality data, we propose a dimension reduction method, Composite Least Square Projection (CLSP), which allows users to appreciate and interpret the data patterns in the context of attributes. On the basis of the observed regular patterns, multiple kinds of abnormal cases are further detected: multivariate anomalies, by the proposed Noise Hierarchical Clustering (NHC) method; abruptly changing timestamps, by the Time diversity (TD) indicator; and cities with unique patterns, by the Geographical Surprise (GS) measure. Moreover, we combine TD and GS to group anomalies based on their underlying spatiotemporal correlations. AirInsight includes multiple coordinated views and rich interactive functions to provide contextual information from different aspects and facilitate a comprehensive understanding. In particular, a pair of glyphs is designed to provide a visual representation of the temporal variation in air quality conditions for a user-selected city. Experiments show that CLSP improves on the accuracy of Least Square Projection (LSP) and that NHC can separate noise. Meanwhile, several case studies and a task-based user evaluation demonstrate that our system is effective and practical for exploring and interpreting multivariate spatiotemporal patterns and anomalies in air quality data.

2019 ◽  
Vol 9 (1) ◽  
pp. 1-23 ◽  
Author(s):  
Fangzhou Guo ◽  
Tianlong Gu ◽  
Wei Chen ◽  
Feiran Wu ◽  
Qi Wang ◽  
...  

2020 ◽  
Vol 23 (6) ◽  
pp. 1129-1145
Author(s):  
Dezhan Qu ◽  
Xiaoli Lin ◽  
Ke Ren ◽  
Quanle Liu ◽  
Huijie Zhang

2021 ◽  
Vol 33 (9) ◽  
pp. 1326-1336
Author(s):  
Dong Tian ◽  
Guan Li ◽  
Shiyu Cheng ◽  
Lei Kong ◽  
Xiao Tang ◽  
...  

Author(s):  
Ahmad R. Alsaber ◽  
Jiazhu Pan ◽  
Adeeba Al-Hurban 

In environmental research, missing data are often a challenge for statistical modeling. This paper examines advanced techniques for dealing with missing values in an air quality data set using a multiple imputation (MI) approach. The MCAR (missing completely at random), MAR (missing at random), and NMAR (not missing at random) missing data mechanisms are applied to the data set. Five missing data levels are considered: 5%, 10%, 20%, 30%, and 40%. The imputation method used in this paper is missForest, an iterative imputation method based on the random forest approach. Air quality data sets were gathered from five monitoring stations in Kuwait and aggregated to a daily basis. A logarithmic transformation was applied to all pollutant data in order to normalize their distributions and minimize skewness. We found high levels of missing values for NO2 (18.4%), CO (18.5%), PM10 (57.4%), SO2 (19.0%), and O3 (18.2%) data. Climatological data (i.e., air temperature, relative humidity, wind direction, and wind speed) were used as control variables for better estimation. The results show that the MAR technique had the lowest RMSE and MAE. We conclude that MI using the missForest approach has a high level of accuracy in estimating missing values. missForest had the lowest imputation error (RMSE and MAE) among the imputation methods compared and, thus, can be considered appropriate for analyzing air quality data.
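
The evaluation protocol outlined in this abstract (hide known values at several missing-data levels, impute them, then score the result with RMSE and MAE) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the series is synthetic rather than the Kuwait data, only the MCAR mechanism is simulated, and simple mean imputation stands in for missForest, so the numbers say nothing about the paper's results. The helper names `mask_mcar`, `impute_mean`, and `imputation_errors` are hypothetical.

```python
import math
import random


def mask_mcar(values, rate, rng):
    """Hide a fraction `rate` of entries, chosen uniformly at random (MCAR)."""
    n_missing = round(rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def impute_mean(values):
    """Stand-in imputer (NOT missForest): fill gaps with the observed mean."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def imputation_errors(truth, masked, imputed):
    """Return (RMSE, MAE), computed only over the positions that were hidden."""
    diffs = [imputed[i] - truth[i] for i, v in enumerate(masked) if v is None]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae


rng = random.Random(0)
# Synthetic, log-transformed "pollutant" series (an assumption, not real data).
truth = [math.log(rng.uniform(10.0, 100.0)) for _ in range(200)]

results = {}
for rate in (0.05, 0.10, 0.20, 0.30, 0.40):
    masked = mask_mcar(truth, rate, rng)
    imputed = impute_mean(masked)
    results[rate] = imputation_errors(truth, masked, imputed)
    rmse, mae = results[rate]
    print(f"missing={rate:.0%}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

Scoring only the hidden positions matters: including the untouched observed values would dilute the error and make every imputer look better as the missing rate falls.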


2021 ◽  
Vol 138 ◽  
pp. 104976
Author(s):  
Juan José Díaz ◽  
Ivan Mura ◽  
Juan Felipe Franco ◽  
Raha Akhavan-Tabatabaei
