scholarly journals A Missing Data Compensation Method Using LSTM Estimates and Weights in AMI System

Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 341
Author(s):  
Hyuk-Rok Kwon ◽  
Pan-Koo Kim

With the expansion of advanced metering infrastructure (AMI) installations, various additional services using AMI data have emerged. However, some data is lost in the communication process of data collection. Hence, to address this challenge, the estimation of the missing data is required. To estimate the missing values in the time-series data generated from smart meters, we investigated four methods, ranging from a conventional method to an estimation method applying long short-term memory (LSTM), which exhibits excellent performance in the time-series field, and provided the performance comparison data. Furthermore, because power usages represent estimates of data that are missing some values in the middle, rather than regular time-series estimation data, the simple estimation may lead to an error where the estimated accumulated power usage in the missing data is larger than the real accumulated power usage appearing in the data after the end of the missing data interval. Therefore, this study proposes a hybrid method that combines the advantages of the linear interpolation method and the LSTM estimation-based compensation method, rather than those of conventional methods adopted in the time-series field. The performance of the proposed method is more stable and better than that of other methods.

2021 ◽  
Vol 6 (1) ◽  
pp. 1-4
Author(s):  
Bo Yuan Chang ◽  
Mohamed A. Naiel ◽  
Steven Wardell ◽  
Stan Kleinikkink ◽  
John S. Zelek

Over the past years, researchers have proposed various methods to discover causal relationships among time-series data as well as algorithms to fill in missing entries in time-series data. Little to no work has been done in combining the two strategies for the purpose of learning causal relationships using unevenly sampled multivariate time-series data. In this paper, we examine how the causal parameters learnt from unevenly sampled data (with missing entries) deviates from the parameters learnt using the evenly sampled data (without missing entries). However, to obtain the causal relationship from a given time-series requires evenly sampled data, which suggests filling the missing data values before obtaining the causal parameters. Therefore, the proposed method is based on applying a Gaussian Process Regression (GPR) model for missing data recovery, followed by several pairwise Granger causality equations in Vector Autoregssive form to fit the recovered data and obtain the causal parameters. Experimental results show that the causal parameters generated by using GPR data filling offers much lower RMSE than the dummy model (fill with last seen entry) under all missing values percentage, suggesting that GPR data filling can better preserve the causal relationships when compared with dummy data filling, thus should be considered when dealing with unevenly sampled time-series causality learning.


2021 ◽  
Vol 5 (1) ◽  
pp. 57
Author(s):  
Sophie Castel ◽  
Wesley S. Burr

Real-world time series data often contain missing values due to human error, irregular sampling, or unforeseen equipment failure. The ability of a computational interpolation method to repair such data greatly depends on the characteristics of the time series itself, such as the number of periodic and polynomial trends and noise structure, as well as the particular configuration of the missing values themselves. The interpTools package presents a systematic framework for analyzing the statistical performance of a time series interpolator in light of such data features. Its utility and features are demonstrated through evaluation of a novel algorithm, the Hybrid Wiener Interpolator.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tuan D. Pham

AbstractAutomated analysis of physiological time series is utilized for many clinical applications in medicine and life sciences. Long short-term memory (LSTM) is a deep recurrent neural network architecture used for classification of time-series data. Here time–frequency and time–space properties of time series are introduced as a robust tool for LSTM processing of long sequential data in physiology. Based on classification results obtained from two databases of sensor-induced physiological signals, the proposed approach has the potential for (1) achieving very high classification accuracy, (2) saving tremendous time for data learning, and (3) being cost-effective and user-comfortable for clinical trials by reducing multiple wearable sensors for data recording.


Author(s):  
Andrew Q. Philips

In cross-sectional time-series data with a dichotomous dependent variable, failing to account for duration dependence when it exists can lead to faulty inferences. A common solution is to include duration dummies, polynomials, or splines to proxy for duration dependence. Because creating these is not easy for the common practitioner, I introduce a new command, mkduration, that is a straightforward way to generate a duration variable for binary cross-sectional time-series data in Stata. mkduration can handle various forms of missing data and allows the duration variable to easily be turned into common parametric and nonparametric approximations.


Hydrology ◽  
2018 ◽  
Vol 5 (4) ◽  
pp. 63 ◽  
Author(s):  
Benjamin Nelsen ◽  
D. Williams ◽  
Gustavious Williams ◽  
Candace Berrett

Complete and accurate data are necessary for analyzing and understanding trends in time-series datasets; however, many of the available time-series datasets have gaps that affect the analysis, especially in the earth sciences. As most available data have missing values, researchers use various interpolation methods or ad hoc approaches to data imputation. Since the analysis based on inaccurate data can lead to inaccurate conclusions, more accurate data imputation methods can provide accurate analysis. We present a spatial-temporal data imputation method using Empirical Mode Decomposition (EMD) based on spatial correlations. We call this method EMD-spatial data imputation or EMD-SDI. Though this method is applicable to other time-series data sets, here we demonstrate the method using temperature data. The EMD algorithm decomposes data into periodic components called intrinsic mode functions (IMF) and exactly reconstructs the original signal by summing these IMFs. EMD-SDI initially decomposes the data from the target station and other stations in the region into IMFs. EMD-SDI evaluates each IMF from the target station in turn and selects the IMF from other stations in the region with periodic behavior most correlated to target IMF. EMD-SDI then replaces a section of missing data in the target station IMF with the section from the most closely correlated IMF from the regional stations. We found that EMD-SDI selects the IMFs used for reconstruction from different stations throughout the region, not necessarily the station closest in the geographic sense. EMD-SDI accurately filled data gaps from 3 months to 5 years in length in our tests and favorably compares to a simple temporal method. EMD-SDI leverages regional correlation and the fact that different stations can be subject to different periodic behaviors. In addition to data imputation, the EMD-SDI method provides IMFs that can be used to better understand regional correlations and processes.


Author(s):  
Eun-Su Go ◽  
In-Gul Kim ◽  
Minhyeok Jeon ◽  
Hyun-Jun Cho ◽  
Jae-Sang Park ◽  
...  

Open Physics ◽  
2021 ◽  
Vol 19 (1) ◽  
pp. 618-627
Author(s):  
Weixing Song ◽  
Jingjing Wu ◽  
Jianshe Kang ◽  
Jun Zhang

Abstract The aim of this study was to improve the low accuracy of equipment spare parts requirement predicting, which affects the quality and efficiency of maintenance support, based on the summary and analysis of the existing spare parts requirement predicting research. This article introduces the current latest popular long short-term memory (LSTM) algorithm which has the best effect on time series data processing to equipment spare parts requirement predicting, according to the time series characteristics of spare parts consumption data. A method for predicting the requirement for maintenance spare parts based on the LSTM recurrent neural network is proposed, and the network structure is designed in detail, the realization of network training and network prediction is given. The advantages of particle swarm algorithm are introduced to optimize the network parameters, and actual data of three types of equipment spare parts consumption are used for experiments. The performance comparison of predictive models such as BP neural network, generalized regression neural network, wavelet neural network, and squeeze-and-excitation network prove that the new method is effective and provides an effective method for scientifically predicting the requirement for maintenance spare parts and improving the quality of equipment maintenance.


2018 ◽  
Vol 17 (02) ◽  
pp. 1850017 ◽  
Author(s):  
Mahdi Kalantari ◽  
Masoud Yarmohammadi ◽  
Hossein Hassani ◽  
Emmanuel Sirimal Silva

Missing values in time series data is a well-known and important problem which many researchers have studied extensively in various fields. In this paper, a new nonparametric approach for missing value imputation in time series is proposed. The main novelty of this research is applying the [Formula: see text] norm-based version of Singular Spectrum Analysis (SSA), namely [Formula: see text]-SSA which is robust against outliers. The performance of the new imputation method has been compared with many other established methods. The comparison is done by applying them to various real and simulated time series. The obtained results confirm that the SSA-based methods, especially [Formula: see text]-SSA can provide better imputation in comparison to other methods.


Sign in / Sign up

Export Citation Format

Share Document