Filling the gaps of in-situ hourly PM2.5 concentration data with the aid of empirical orthogonal function constrained by diurnal cycles

Author(s):  
Kaixu Bai ◽  
Ke Li ◽  
Jianping Guo ◽  
Yuanjian Yang ◽  
Ni-Bin Chang

Abstract. Data gaps are frequently observed in the hourly PM2.5 mass concentration records measured by the China national air quality monitoring network. In this study, we propose a novel gap-filling method, the diurnal cycle constrained empirical orthogonal function (DCCEOF), to fill data gaps in hourly PM2.5 concentration records. The method mainly calibrates the diurnal cycle of PM2.5, reconstructed from discrete PM2.5 neighborhood fields in space and time, to the level of valid PM2.5 values observed at adjacent times. Prior to gap filling, the possible impacts of varying numbers of data gaps in the hourly PM2.5 time series on PM2.5 daily averages were examined via sensitivity experiments. The results showed that PM2.5 data suffered from gaps on about 40 % of days, indicating a high frequency of missing data in the hourly PM2.5 records. These gaps could introduce significant bias into daily-averaged PM2.5. In particular, given the same number of gaps, larger biases would be introduced into daily-averaged PM2.5 on clean days than on polluted days. The cross-validation results indicate that missing values predicted by the DCCEOF method, which accounts for the local diurnal phases of PM2.5, are more accurate and reasonable than those from the conventional spline interpolation approach, especially for the reconstruction of daily peaks and/or minima that cannot be restored by the latter method. As a practical application, the DCCEOF method was used to fill the gaps in the hourly PM2.5 records across China from 2014 to 2019, reducing the averaged frequency of missingness from 42.6 % to 5.7 %. In general, the present work implies that the DCCEOF method is realistic and robust in handling missingness in time series of geophysical parameters with significant diurnal variability, and that it can be expected to apply to other data sets with similar barriers because of its self-consistent capability.
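The sensitivity of a daily average to missing hours can be illustrated with a toy calculation. The sketch below is not the authors' experiment: the diurnal anomaly, the two background levels, and the function name `daily_mean_bias` are all hypothetical, and it assumes the same absolute diurnal swing on both days. Under that assumption, losing the same hours shifts the clean-day mean by a larger relative amount.

```python
import numpy as np

def daily_mean_bias(hourly, missing_hours):
    """Relative bias (%) of the daily mean when the given hours are dropped."""
    hourly = np.asarray(hourly, dtype=float)
    keep = np.ones(hourly.size, dtype=bool)
    keep[list(missing_hours)] = False
    full_mean = hourly.mean()
    gapped_mean = hourly[keep].mean()
    return 100.0 * (gapped_mean - full_mean) / full_mean

hours = np.arange(24)
# Hypothetical diurnal anomaly (ug/m3) peaking at 08:00; it averages to zero over the day.
anomaly = 10.0 * np.cos(2 * np.pi * (hours - 8) / 24)
clean_day = 20.0 + anomaly       # assumed low background level
polluted_day = 120.0 + anomaly   # assumed high background level

gap = range(6, 12)  # six consecutive hours around the peak are missing
bias_clean = daily_mean_bias(clean_day, gap)
bias_polluted = daily_mean_bias(polluted_day, gap)
print(f"clean: {bias_clean:.1f} %, polluted: {bias_polluted:.1f} %")
```

Because the missing hours cover the peak, both daily means are underestimated, and the relative error is larger on the day with the lower background.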

2020 ◽  
Vol 13 (3) ◽  
pp. 1213-1226 ◽  
Author(s):  
Kaixu Bai ◽  
Ke Li ◽  
Jianping Guo ◽  
Yuanjian Yang ◽  
Ni-Bin Chang

Abstract. Data gaps in surface air quality measurements significantly impair the data quality and the exploration of these valuable data sources. In this study, a novel yet practical method called diurnal-cycle-constrained empirical orthogonal function (DCCEOF) was developed to fill in data gaps present in data records with evident temporal variability. The hourly PM2.5 concentration data retrieved from the national ambient air quality monitoring network in China were used as a demonstration. The DCCEOF method first reconstructs the diurnal cycle of PM2.5 concentration from its discrete neighborhood field in space and time and then predicts the missing values by calibrating the reconstructed diurnal cycle to the level of valid PM2.5 concentrations observed at adjacent times. The statistical results indicate a high frequency of data gaps in our retrieved hourly PM2.5 concentration record, with PM2.5 concentration measured on about 40 % of the days suffering from data gaps. Further sensitivity analysis results reveal that data gaps in the hourly PM2.5 concentration record may introduce significant bias to its daily averages, especially during clean episodes, when PM2.5 daily averages are observed to be subject to larger uncertainties than on polluted days (even in the presence of the same amount of missingness). The cross-validation results indicate that our suggested DCCEOF method has a good prediction accuracy, particularly in predicting daily peaks and/or minima that cannot be restored by conventional interpolation approaches, thus confirming the effectiveness of the consideration of the local diurnal variation pattern in gap filling. By applying the DCCEOF method to the hourly PM2.5 concentration record measured in China from 2014 to 2019, the data completeness ratio was substantially improved while the frequency of days with gapped PM2.5 records was reduced from 42.6 % to 5.7 %.
In general, our DCCEOF method provides a practical yet effective approach to handle data gaps in time series of geophysical parameters with significant diurnal variability, and this method is also transferable to other data sets with similar barriers because of its self-consistent capability.
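The calibration step described in the abstract can be sketched as follows. This is a minimal illustration, not the published DCCEOF algorithm: it assumes a reconstructed diurnal cycle is already available (here a made-up sine shape), and it assumes the calibration is a least-squares linear fit of that cycle to the valid observations. The function name `calibrate_and_fill` is hypothetical.

```python
import numpy as np

def calibrate_and_fill(obs, cycle):
    """Fill NaNs in obs by linearly rescaling a reference diurnal cycle.

    Assumption: obs ~ slope * cycle + intercept at the valid hours.
    """
    obs = np.asarray(obs, dtype=float)
    cycle = np.asarray(cycle, dtype=float)
    valid = ~np.isnan(obs)
    if valid.sum() < 2:
        raise ValueError("need at least two valid observations to calibrate")
    # Least-squares fit of the reference cycle to the observed level.
    slope, intercept = np.polyfit(cycle[valid], obs[valid], 1)
    filled = obs.copy()
    filled[~valid] = slope * cycle[~valid] + intercept
    return filled

hours = np.arange(24)
cycle = np.sin(2 * np.pi * (hours - 3) / 24)   # hypothetical cycle shape, peak at 09:00
truth = 50.0 + 30.0 * cycle                    # synthetic "true" concentrations
obs = truth.copy()
obs[7:11] = np.nan                             # the daily peak is missing
filled = calibrate_and_fill(obs, cycle)
```

Because the fill follows the shape of the reference cycle, the missing peak is recovered, whereas interpolating between the hours on either side of the gap would cut across it.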


2012 ◽  
Vol 5 (2) ◽  
pp. 267-273 ◽  
Author(s):  
A. Devasthale ◽  
K.-G. Karlsson ◽  
J. Quaas ◽  
H. Grassl

Abstract. The Advanced Very High Resolution Radiometer (AVHRR) instruments onboard the series of National Oceanic and Atmospheric Administration (NOAA) satellites offer the longest available meteorological data records from space. These satellites have drifted in orbit, resulting in shifts in the local time sampling during the life span of the sensors onboard. Depending upon the amplitude of the diurnal cycle of the geophysical parameters derived, orbital drift may cause spurious trends in their time series. We investigate tropical deep convective clouds, which show a pronounced diurnal cycle amplitude, to estimate an upper bound of the impact of orbital drift on their time series. We carry out a rotated empirical orthogonal function analysis (REOF) and show that the REOFs are useful in delineating the orbital drift signal and, more importantly, in subtracting this signal from the time series of convective cloud amount. These results will help facilitate the derivation of homogenized data series of cloud amount from NOAA satellite sensors and ultimately the analysis of trends from them. However, we suggest detailed comparison of various methods and rigorous testing thereof before applying final orbital drift corrections.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Hongwu Zhou ◽  
Haidong Pan ◽  
Shuang Li ◽  
Xianqing Lv

Monitoring the concentration of fine particulate matter with diameters less than 2.5 μm (PM2.5) is closely related to public health, outdoor activities, environmental protection, and other fields. However, the incomplete PM2.5 observation records provided by ground-based PM2.5 concentration monitoring stations pose a challenge to the study of PM2.5 propagation and evolution models. Consequently, PM2.5 concentration data imputation has been widely studied. Based on the empirical orthogonal function (EOF), a new spatiotemporal interpolation method, EOF interpolation (EOFI), is introduced in this paper, and EOFI is then applied to reconstruct the hourly PM2.5 concentration records of two stations in the first half of the year. The main steps of EOFI are as follows. First, the spatiotemporal data matrix of the original observation sites is decomposed into mutually orthogonal temporal and spatial modes with the EOF method. Second, the spatial mode of the missing data station is estimated by inverse distance weighting interpolation of the spatial mode of the observation sites. After that, the records of the missing data station can be reconstructed by multiplying the estimated spatial mode and the corresponding temporal mode. The optimal mode number for EOFI is determined by minimizing the root mean square error (RMSE) between reconstructed records and corresponding valid records. Finally, six evaluation indices (mean absolute error (MAE), RMSE, correlation coefficient (Corr), deviation rate bias, Nash–Sutcliffe efficiency (NSE), and index of agreement (IA)) are calculated. The results show that EOFI performs better than the other three interpolation methods, namely, inverse distance weight interpolation, thin plate spline, and surface spline interpolation.
The EOFI has the advantages of less computation, fewer parameters to select, and ease of implementation. It is an alternative method when observation stations are scarce and the proportion of missing values at some stations is large. Moreover, it can also be applied to the interpolation and imputation of other spatiotemporal variables.
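The three EOFI steps listed above (decompose, interpolate the spatial mode, multiply back) can be sketched with NumPy. This is an illustrative reading of the abstract, not the authors' code: it computes the EOFs with a plain SVD of a gap-free matrix, skips any mean removal or mode-number selection, and uses hypothetical names (`eofi_reconstruct`, `coords`, `target_xy`, `power`).

```python
import numpy as np

def eofi_reconstruct(X, coords, target_xy, n_modes, power=2.0):
    """Reconstruct a record at target_xy from a (time, station) matrix X.

    Assumptions: X has no gaps; coords is (station, 2); EOFs come from an SVD.
    """
    # Step 1: decompose into orthogonal temporal and spatial modes.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    temporal = U[:, :n_modes] * s[:n_modes]   # (time, n_modes)
    spatial = Vt[:n_modes].T                  # (station, n_modes)

    # Step 2: inverse distance weighting of the spatial modes at the target.
    dist = np.linalg.norm(coords - target_xy, axis=1)
    weights = 1.0 / np.maximum(dist, 1e-12) ** power
    weights /= weights.sum()
    target_spatial = weights @ spatial        # (n_modes,)

    # Step 3: multiply the estimated spatial mode by the temporal modes.
    return temporal @ target_spatial

# Synthetic example: four stations share one temporal signal with different gains.
signal = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 13.0])
gains = np.array([1.0, 2.0, 3.0, 4.0])
X = np.outer(signal, gains)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
recon = eofi_reconstruct(X, coords, np.array([0.5, 0.5]), n_modes=1)
```

Since inverse distance weighting is linear, keeping every mode would reproduce a direct weighted average of the station records; truncating to the leading modes is what filters out small-scale variability in this sketch.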


2011 ◽  
Vol 4 (3) ◽  
pp. 3877-3890
Author(s):  
A. Devasthale ◽  
K. Karlsson ◽  
J. Quaas ◽  
H. Grassl

Abstract. The AVHRR instruments onboard the series of NOAA satellites offer the longest available meteorological data records from space. These satellites have drifted in orbit, resulting in shifts in the local time sampling during the life span of the sensors onboard. Depending on the amplitude of the diurnal cycle of the geophysical parameters derived, orbital drift may cause spurious trends in their time series. We investigate tropical deep convective clouds, which show a pronounced diurnal cycle amplitude, to bracket an upper bound of the impact of orbital drift on their time series. We carry out a rotated empirical orthogonal function (REOF) analysis and show that the REOFs are useful in delineating the orbital drift signal and, more importantly, in correcting this signal in the time series of convective cloud amount. These results will help facilitate the derivation of homogenized data series of cloud amount from NOAA satellite sensors and ultimately the analysis of trends from them. However, we suggest detailed comparison of various methods and their rigorous testing before applying final orbital drift corrections.


2010 ◽  
Vol 35 (4) ◽  
pp. 710-721 ◽  
Author(s):  
Ying-Tsong Lin ◽  
Arthur E. Newhall ◽  
Timothy F. Duda ◽  
Pierre F. J. Lermusiaux ◽  
Patrick J. Haley

2018 ◽  
Vol 75 (10) ◽  
pp. 3507-3519 ◽  
Author(s):  
Aditi Sheshadri ◽  
R. Alan Plumb ◽  
Erik A. Lindgren ◽  
Daniela I. V. Domeisen

Stratosphere–troposphere interactions are conventionally characterized using the first empirical orthogonal function (EOF) of fields such as zonal-mean zonal wind. Perpetual-winter integrations of an idealized model are used to contrast the vertical structures of EOFs with those of principal oscillation patterns (POPs; the modes of a linearized system governing the evolution of zonal flow anomalies). POP structures are shown to be insensitive to pressure weighting of the time series of interest, a factor that is particularly important for a deep system such as the stratosphere and troposphere. In contrast, EOFs change from being dominated by tropospheric variability with pressure weighting to being dominated by stratospheric variability without it. The analysis reveals separate tropospheric and stratospheric modes in model integrations that are set up to resemble midwinter variability of the troposphere and stratosphere in both hemispheres. Movies illustrating the time evolution of POP structures show the existence of a fast, propagating tropospheric mode in both integrations, and a pulsing stratospheric mode with a tropospheric extension in the Northern Hemisphere–like integration.


2019 ◽  
Author(s):  
Huiying Ren ◽  
Erol Cromwell ◽  
Ben Kravitz ◽  
Xingyuan Chen

Abstract. Long-term spatio-temporal changes in subsurface hydrological flow are usually quantified through a network of wells; however, such observations often are spatially sparse and temporal gaps exist due to poor quality or instrument failure. In this study, we explore the ability of deep neural networks to fill in gaps in spatially distributed time-series data. We selected a location at the U.S. Department of Energy's Hanford site to demonstrate and evaluate the new method, using a 10-year spatio-temporal hydrological dataset of temperature, specific conductance, and groundwater table elevation from 42 wells that monitor the dynamic and heterogeneous hydrologic exchanges between the Columbia River and its adjacent groundwater aquifer. We employ a long short-term memory (LSTM)-based architecture, which is specially designed to address both spatial and temporal variations in the property fields. The performance of gap filling using an LSTM framework is evaluated using test datasets with synthetic data gaps created by assuming the observations were missing for a given time window (i.e., gap length), such that the mean absolute percentage error can be calculated against true observations. Such test datasets also allow us to examine how well the original nonlinear dynamics are captured in gap-filled time series beyond the error statistics. The performance of the LSTM-based gap-filling method is compared to that of a traditional, popular gap-filling method: autoregressive integrated moving average (ARIMA). Although ARIMA appears to perform slightly better than LSTM on average error statistics, LSTM is better able to capture nonlinear dynamics that are present in time series. Thus, LSTMs show promising potential to outperform ARIMA for gap filling in highly dynamic time-series observations characterized by multiple dominant modes of variability. 
Capturing such dynamics is essential to generate the most valuable observations to advance our understanding of dynamic complex systems.
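The synthetic-gap evaluation described in the abstract can be sketched as a small harness: hide a window of known values, fill it, and score the fill with the mean absolute percentage error. The filler used here is plain linear interpolation, chosen only as a stand-in; it is neither the LSTM nor the ARIMA model from the study, and all function names are hypothetical.

```python
import numpy as np

def mape(true, pred):
    """Mean absolute percentage error (%); assumes no zeros in true."""
    true = np.asarray(true, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return 100.0 * np.mean(np.abs((pred - true) / true))

def linear_filler(gapped):
    """Stand-in gap filler: linear interpolation across NaNs."""
    idx = np.arange(gapped.size)
    valid = ~np.isnan(gapped)
    return np.interp(idx, idx[valid], gapped[valid])

def evaluate_gap_filler(series, start, gap_length, filler):
    """Mask a window of known values, fill it, and score only that window."""
    series = np.asarray(series, dtype=float)
    window = slice(start, start + gap_length)
    gapped = series.copy()
    gapped[window] = np.nan          # synthetic gap of the chosen length
    filled = filler(gapped)
    return mape(series[window], filled[window])

# Synthetic oscillating series (values stay well away from zero).
series = 50.0 + 10.0 * np.sin(np.linspace(0.0, 4.0 * np.pi, 96))
score = evaluate_gap_filler(series, start=20, gap_length=8, filler=linear_filler)
print(f"MAPE over the synthetic gap: {score:.2f} %")
```

Any function that takes an array with NaNs and returns a filled array can be passed as `filler`, so different gap-filling methods can be compared on identical synthetic gaps.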

