BiLSTM-I: A Deep Learning-Based Long Interval Gap-Filling Method for Meteorological Observation Data

Complete and high-resolution temperature observation data are important inputs for agrometeorological disaster monitoring and ecosystem modelling. Because field meteorological observation conditions are limited, observation data are commonly missing, and an appropriate data imputation method is necessary in meteorological data applications. In this paper, we focus on filling long gaps in meteorological observation data at field sites. A deep learning-based model, BiLSTM-I, is proposed to impute missing half-hourly temperature observations with high accuracy by considering temperature observations obtained manually at a low frequency. BiLSTM-I adopts an encoder-decoder structure, which is conducive to fully learning the underlying distribution pattern of the data. In addition, the BiLSTM-I error function incorporates the difference between the final estimates and the true observations. The error function therefore evaluates the imputation results more directly, and the model convergence error and the imputation accuracy are directly related, ensuring that the imputation error is minimized when the model converges. The experimental results show that the BiLSTM-I model designed in this paper is superior to the other methods compared. For test sets with gaps of 30 days or 60 days, the root mean square errors (RMSEs) remain stable, indicating the model’s excellent generalization ability across different gap lengths. Although the model is applied only to temperature data imputation in this study, it also has the potential to be applied to other meteorological dataset-filling scenarios.
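
As a rough illustration of how a gap-filling result can be scored, the sketch below computes the RMSE only over the positions that were hidden from the imputer, which is the quantity an imputation-focused error function targets. It is a minimal sketch under stated assumptions: the function names `rmse_on_gap` and `linear_fill`, the toy temperature series, and the linear-interpolation stand-in are all hypothetical and are not taken from the paper. The BiLSTM-I model itself is not reproduced here.

```python
import math


def rmse_on_gap(truth, estimate, gap_mask):
    """RMSE restricted to the positions flagged as missing in gap_mask."""
    squared = [
        (t - e) ** 2
        for t, e, hidden in zip(truth, estimate, gap_mask)
        if hidden
    ]
    if not squared:
        raise ValueError("gap_mask marks no positions")
    return math.sqrt(sum(squared) / len(squared))


def linear_fill(series, gap_mask):
    """Stand-in imputer (an assumption, not BiLSTM-I): join the observed
    neighbours of each gap with a straight line."""
    filled = list(series)
    n = len(series)
    i = 0
    while i < n:
        if not gap_mask[i]:
            i += 1
            continue
        start = i
        while i < n and gap_mask[i]:
            i += 1
        left = filled[start - 1] if start > 0 else None
        right = filled[i] if i < n else None
        if left is None and right is None:
            raise ValueError("series has no observed values")
        for k in range(start, i):
            if left is None:
                filled[k] = right          # gap at the start: carry backward
            elif right is None:
                filled[k] = left           # gap at the end: carry forward
            else:
                frac = (k - start + 1) / (i - start + 1)
                filled[k] = left + frac * (right - left)
    return filled


# Toy half-hourly temperatures (hypothetical values) with a three-step gap.
truth = [10.0, 12.0, 14.0, 13.0, 11.0, 9.0]
gap_mask = [False, True, True, True, False, False]
estimate = linear_fill(truth, gap_mask)
print(rmse_on_gap(truth, estimate, gap_mask))
```

Scoring only the masked positions keeps the observed points, which any imputer reproduces trivially, from diluting the error.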

Surrounding Vehicles’ Contribution to Car-Following Models: Deep-Learning-Based Analysis

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/03611981211018693 ◽

2021 ◽

pp. 036119812110186

Author(s):

Saeed Vasebi ◽

Yeganeh M. Hayeri ◽

Peter J. Jin

Keyword(s):

Deep Learning ◽

Traffic Flow ◽

Short Term Memory ◽

Data Availability ◽

Car Following ◽

Preceding Vehicle ◽

Long Short Term Memory ◽

The Right ◽

Mean Square Errors ◽

Deep Learning Model

Recent increases in computational power and the extensive availability of traffic data have provided a unique opportunity to re-investigate drivers’ car-following (CF) behavior. Classic CF models assume that drivers’ behavior is influenced only by their preceding vehicle. Recent studies have indicated that considering surrounding vehicles’ information (e.g., multiple preceding vehicles) could affect CF models’ performance. An in-depth investigation of surrounding vehicles’ contribution to CF modeling performance has not been reported in the literature. This study uses a deep-learning model with long short-term memory (LSTM) to investigate to what extent considering surrounding vehicles could improve CF models’ performance. This investigation helps to select the right inputs for traffic flow modeling. Five CF models are compared in this study (i.e., classic, multi-anticipative, adjacent-lanes, following-vehicle, and all-surrounding-vehicles CF models). The performance of the CF models is compared in relation to accuracy, stability, and smoothness of traffic flow. The CF models are trained, validated, and tested on a large publicly available dataset. The average mean square errors (MSEs) for the classic, multi-anticipative, adjacent-lanes, following-vehicle, and all-surrounding-vehicles CF models are 1.58 × 10⁻³, 1.54 × 10⁻³, 1.56 × 10⁻³, 1.61 × 10⁻³, and 1.73 × 10⁻³, respectively. The results, however, show insignificant performance differences between the classic CF model and the multi-anticipative or adjacent-lanes model in relation to accuracy, stability, or smoothness. The following-vehicle CF model shows similar performance to the multi-anticipative model. The all-surrounding-vehicles CF model underperformed all the other models.

Continuous missing data imputation with incomplete dataset by generative adversarial networks–based unsupervised learning for long-term bridge health monitoring

Structural Health Monitoring ◽

10.1177/14759217211021942 ◽

2021 ◽

pp. 147592172110219

Author(s):

Huachen Jiang ◽

Chunfeng Wan ◽

Kang Yang ◽

Youliang Ding ◽

Songtao Xue

Keyword(s):

Missing Data ◽

Health Monitoring ◽

Signal Transmission ◽

Imputation Accuracy ◽

Generative Adversarial Networks ◽

Data Imputation ◽

Sensor Failure ◽

Generative Adversarial Network ◽

Missing Data Imputation ◽

Adversarial Network

Wireless sensors are key components of structural health monitoring systems. During signal transmission, sensor failure is inevitable, and data loss is its most common form. Missing data pose a major challenge to subsequent damage detection and condition assessment and therefore deserve close attention. Conventional missing data imputation mostly adopts correlation-based methods, especially for strain monitoring data. However, such methods often require delicate model selection, and the correlations for vehicle-induced strains are much harder to capture than those for temperature-induced strains. In this article, a novel data-driven generative adversarial network (GAN) for imputing missing strain response is proposed. In contrast to traditional approaches, in which inter-strain correlations are explicitly modeled, the proposed method directly imputes the missing data by considering the spatial–temporal relationships with other strain sensors based on the remaining observed data. Furthermore, a complete dataset is not required during training, which is a further advantage over model-based imputation methods. The proposed method is implemented and verified on a real concrete bridge. To demonstrate the applicability and robustness of the GAN, imputation for single and multiple sensors is studied. The results show that the proposed method delivers excellent imputation accuracy and efficiency.

Long-Term Observations of Beach Variability at Hasaki, Japan

Journal of Marine Science and Engineering ◽

10.3390/jmse8110871 ◽

2020 ◽

Vol 8 (11) ◽

pp. 871

Author(s):

Masayuki Banno ◽

Satoshi Nakamura ◽

Taichi Kosako ◽

Yasuyuki Nakagawa ◽

Shin-ichi Yanagishima ◽

...

Keyword(s):

Morphological Change ◽

High Frequency ◽

Climate Changes ◽

Wave Climate ◽

Time Interval ◽

Observation Data ◽

Beach Profile ◽

Water Level Variations ◽

Monthly Variations

Beach observation data spanning several decades are essential to validate beach morphodynamic models that are used to predict coastal responses to sea-level rise and wave climate changes. At the Hasaki coast, Japan, the beach profile has been measured for 34 years at a daily to weekly time interval. This beach morphological dataset is one of the longest and highest-frequency records of beach morphological change worldwide. The profile data, with more than 6800 records, reflect short- to long-term beach morphological change, showing coastal dune development, foreshore morphological change and longshore bar movement. We investigated the temporal beach variability from the decadal and monthly variations in elevation. Extremely high waves and tidal anomalies from an extratropical cyclone caused a significant change in the long-term bar behavior and foreshore slope. The berm and bar variability were also affected by seasonal wave and water level variations. The variabilities identified here from the long-term observations contribute to our understanding of various coastal phenomena.

Denoising of river surface photogrammetric DEMs using deep learning

10.5194/egusphere-egu21-10266 ◽

2021 ◽

Author(s):

Radosław Szostak ◽

Przemysław Wachniew ◽

Mirosław Zimnoch ◽

Paweł Ćwiąkała ◽

Edyta Puniach ◽

...

Keyword(s):

Higher Education ◽

Deep Learning ◽

Water Level ◽

Water Surface ◽

Research University ◽

Static Characteristic ◽

Water Levels ◽

Training Dataset ◽

Observation Data ◽

Characteristic Points

Unmanned aerial vehicles (UAVs) can be an excellent tool for environmental measurements because they can reach inaccessible places and acquire data quickly over large areas. In particular, drones have a potential application in hydrology: they can be used to create photogrammetric digital elevation models (DEMs) of the terrain, from which a high-resolution spatial distribution of the water level in a river can be obtained and fed into hydrological models. Nevertheless, photogrammetric algorithms generate distortions in the DEM over water bodies. This is due to light penetration below the water surface and the lack of static characteristic points on the water surface that the photogrammetric algorithm can distinguish. These distortions could be corrected by applying deep learning methods. For this purpose, it is necessary to build a training dataset containing DEMs before and after water-surface denoising. A method has been developed to prepare such a dataset. It is divided into several stages. In the first stage, photogrammetric surveys and geodetic water level measurements are performed. The second stage comprises the generation of DEMs and orthomosaics using photogrammetric software. In the last stage, the measured water levels are interpolated to obtain a plane of the water surface, which is applied to the DEMs to correct the distortion. The resulting dataset was used to train a deep learning model based on convolutional neural networks. The proposed method has been validated on observation data representing part of the Kocinka river catchment in central Poland.

This research has been partly supported by the Ministry of Science and Higher Education Project “Initiative for Excellence – Research University” and Ministry of Science and Higher Education subsidy, project no. 16.16.220.842-B02 / 16.16.150.545.

Use of electrochemical sensors for measurement of air pollution: correcting interference response and validating measurements

Atmospheric Measurement Techniques ◽

10.5194/amt-10-3575-2017 ◽

2017 ◽

Vol 10 (9) ◽

pp. 3575-3588 ◽

Cited By ~ 83

Author(s):

Eben S. Cross ◽

Leah R. Williams ◽

David K. Lewis ◽

Gregory R. Magoon ◽

Timothy B. Onasch ◽

...

Keyword(s):

Air Pollution ◽

Air Quality ◽

Lower Cost ◽

Electrochemical Sensors ◽

Air Pollutant ◽

Sensor System ◽

Time Interval ◽

High Dimensional Model Representation ◽

Integrated Sensor ◽

Mean Square Errors

The environments in which we live, work, and play are subject to enormous variability in air pollutant concentrations. To adequately characterize air quality (AQ), measurements must be fast (real time), scalable, and reliable (with known accuracy, precision, and stability over time). Lower-cost air-quality-sensor technologies offer new opportunities for fast and distributed measurements, but a persistent characterization gap remains when it comes to evaluating sensor performance under realistic environmental sampling conditions. This limits our ability to inform the public about pollution sources and inspire policy makers to address environmental justice issues related to air quality. In this paper, initial results obtained with a recently developed lower-cost air-quality-sensor system are reported. In this project, data were acquired with the ARISense integrated sensor package over a 4.5-month time interval during which the sensor system was co-located with a state-operated (Massachusetts, USA) air quality monitoring station equipped with reference instrumentation measuring the same pollutant species. This paper focuses on validating electrochemical (EC) sensor measurements of CO, NO, NO₂, and O₃ at an urban neighborhood site with pollutant concentration ranges (parts per billion by volume, ppb; 5 min averages, ±1σ): [CO] = 231 ± 116 ppb (spanning 84–1706 ppb), [NO] = 6.1 ± 11.5 ppb (spanning 0–209 ppb), [NO₂] = 11.7 ± 8.3 ppb (spanning 0–71 ppb), and [O₃] = 23.2 ± 12.5 ppb (spanning 0–99 ppb). Through the use of high-dimensional model representation (HDMR), we show that interference effects derived from the variable ambient gas concentration mix and changing environmental conditions over three seasons (sensor flow-cell temperature = 23.4 ± 8.5 °C, spanning 4.1 to 45.2 °C; and relative humidity = 50.1 ± 15.3 %, spanning 9.8–79.9 %) can be effectively modeled for the Alphasense CO-B4, NO-B4, NO2-B43F, and Ox-B421 sensors, yielding (5 min average) root mean square errors (RMSE) of 39.2, 4.52, 4.56, and 9.71 ppb, respectively. Our results substantiate the potential for distributed air pollution measurements that could be enabled with these sensors.

Incomplete big data imputation algorithm using optimized possibilistic c-means and deep learning

Informatics, Networking and Intelligent Computing ◽

10.1201/b18413-10 ◽

2015 ◽

pp. 43-48

Author(s):

H Shen ◽

E Zhang

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Imputation

Short-term tidal variations in UT1: compliance between modelling and observation

Proceedings of the International Astronomical Union ◽

10.1017/s1743921310008847 ◽

2009 ◽

Vol 5 (H15) ◽

pp. 215-215 ◽

Cited By ~ 1

Author(s):

Sigrid Englich ◽

Harald Schuh ◽

Robert Weber

Keyword(s):

Solid Earth ◽

Earth Tides ◽

Time Interval ◽

Observation Data ◽

Length Of Day ◽

Tidal Effects ◽

Short Term ◽

Solid Earth Tides ◽

Oceanic Tides ◽

Tidal Variations

The Earth rotation rate and consequently universal time (UT1) and length of day (LOD) are periodically affected by solid Earth tides and oceanic tides. Solid Earth tides induce changes with periods from around 5 days to 18.6 years, with the largest amplitudes occurring at fortnightly, monthly, semi-annual and annual periods, and at 18.6 years. The principal variations caused by oceanic tides have diurnal and semi-diurnal periods. For the investigation of the tidal effects with periods of up to 35 days, UT1 series are estimated from VLBI observation data of the time interval 1984–2008. The amplitudes and phases of the terms of interest are calculated and the results for diurnal and sub-diurnal periods are compared and evaluated with tidal variations derived from a GNSS-based LOD time series of 8 months. The observed tidal signals are finally compared to the predicted tidal variations according to recent geophysical models.

A Novel Missing Data Imputation Algorithm for Deep Learning-Based Anomaly Detection System in IIoT Networks

10.1201/9781003156123-2 ◽

2021 ◽

pp. 27-46

Author(s):

Ancy Jose ◽

S.V. Annlin Jeba ◽

Beulah Joslyn Jose

Keyword(s):

Deep Learning ◽

Missing Data ◽

Anomaly Detection ◽

Detection System ◽

Data Imputation ◽

Missing Data Imputation ◽

Anomaly Detection System

Complex Data Imputation by Auto-Encoders and Convolutional Neural Networks—A Case Study on Genome Gap-Filling

Computers ◽

10.3390/computers9020037 ◽

2020 ◽

Vol 9 (2) ◽

pp. 37 ◽

Cited By ~ 1

Author(s):

Luca Cappelletti ◽

Tommaso Fontana ◽

Guido Walter Di Donato ◽

Lorenzo Di Tucci ◽

Elena Casiraghi ◽

...

Keyword(s):

Deep Learning ◽

Missing Data ◽

State Of The Art ◽

The State ◽

Complex Data ◽

Data Imputation ◽

Genome Sequences ◽

Missing Data Imputation ◽

The Past ◽

Learning Techniques

Missing data imputation has been a hot topic in the past decade, and many state-of-the-art works have proposed novel, interesting solutions that have been applied in a variety of fields. Over the same period, the successful results achieved by deep learning techniques have opened the way to their application to difficult problems for which human skill cannot provide a reliable solution. Not surprisingly, some deep learners, mainly exploiting encoder-decoder architectures, have also been designed and applied to the task of missing data imputation. However, most of the proposed imputation techniques have not been designed to tackle “complex data”, that is, high-dimensional data belonging to datasets with huge cardinality and describing complex problems. Specifically, they often need critical parameters to be set manually, or they rely on complex architectures and/or training phases that make their computational load impractical. In this paper, after clustering the state-of-the-art imputation techniques into three broad categories, we briefly review the most representative methods and then describe our data imputation proposals, which exploit deep learning techniques specifically designed to handle complex data. Comparative tests on genome sequences show that our deep learning imputers outperform the state-of-the-art KNN-imputation method when filling gaps in human genome sequences.

Seasonal and Diurnal Variations in the Priestley–Taylor Coefficient for a Large Ephemeral Lake

Water ◽

10.3390/w12030849 ◽

2020 ◽

Vol 12 (3) ◽

pp. 849 ◽

Cited By ~ 1

Author(s):

Guojing Gan ◽

Yuanbo Liu ◽

Xin Pan ◽

Xiaosong Zhao ◽

Mei Li ◽

...

Keyword(s):

Poyang Lake ◽

High Water ◽

Diurnal Variations ◽

Observation Data ◽

Available Energy ◽

Diurnal Patterns ◽

Ephemeral Lakes ◽

Seasonal And Diurnal Variations ◽

Daily Scale ◽

Mean Square Errors

The Priestley–Taylor equation (PTE) is widely used with its sole parameter (α) set to 1.26 for estimating the evapotranspiration (ET) of water bodies. However, variations in α may be large for ephemeral lakes. Poyang Lake, the largest freshwater lake in China, is water-covered during its high-water period and wetland-covered during its low-water period each year. This paper examines the seasonal and diurnal variations in α using eddy covariance observation data for Poyang Lake. The results show that α = 1.26 is feasible overall for both periods at daily and subdaily scales. No obvious seasonal trend was observed, although the standard deviation in α for the wetland was larger than that for the water surface. The mean bias in evaporation estimates using the PTE was less than 5 W·m⁻² during both periods, and the root mean square errors were much smaller than the average evaporation measurements at the daily scale. U-shaped diurnal patterns of α were found during both periods, due partly to the negative correlation between α and the available energy (A). Compared with the vapor pressure deficit (VPD), wind speed (u) makes a larger contribution to these variations. In addition, u is positively correlated with α during both periods, whereas VPD was positively correlated with α during the high-water period and negatively correlated during the low-water period. Subdaily α exhibited contrasting clusters in the (u, VPD) plane under the same available energy ranges. Our study highlights the seasonal and diurnal course of α and suggests the careful use of the PTE at subdaily scales.
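
For readers unfamiliar with the PTE, the sketch below shows its usual form, LE = α · Δ / (Δ + γ) · A, and the inversion that yields α from an observed flux. It is a minimal sketch under stated assumptions. The function names are hypothetical. The slope Δ uses the FAO-56 expression for the saturation vapour pressure curve. The psychrometric constant γ is fixed at 0.066 kPa/°C, an approximate near-sea-level value. The input numbers are illustrative only and are not taken from the study.

```python
import math


def saturation_slope_kpa_per_c(air_temp_c):
    """Slope of the saturation vapour pressure curve, Delta (kPa/°C).

    FAO-56 form; chosen here as an assumption, the paper may use another.
    """
    es = 0.6108 * math.exp(17.27 * air_temp_c / (air_temp_c + 237.3))
    return 4098.0 * es / (air_temp_c + 237.3) ** 2


def priestley_taylor_le(available_energy, air_temp_c, alpha=1.26, gamma=0.066):
    """Latent heat flux (W/m^2) from available energy A (W/m^2).

    gamma is the psychrometric constant in kPa/°C (assumed constant here).
    """
    delta = saturation_slope_kpa_per_c(air_temp_c)
    return alpha * delta / (delta + gamma) * available_energy


def alpha_from_observation(observed_le, available_energy, air_temp_c, gamma=0.066):
    """Invert the PTE: alpha = observed LE / equilibrium evaporation."""
    delta = saturation_slope_kpa_per_c(air_temp_c)
    equilibrium = delta / (delta + gamma) * available_energy
    if equilibrium == 0:
        raise ValueError("alpha is undefined when available energy is zero")
    return observed_le / equilibrium


# Illustrative inputs only: A = 400 W/m^2 at 20 °C air temperature.
le = priestley_taylor_le(400.0, 20.0)
print(le, alpha_from_observation(le, 400.0, 20.0))
```

In this sketch the available energy sits in the denominator of the inversion, so α becomes numerically sensitive whenever A is small.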
