Multilevel and time-series missing value imputation for combined survey and longitudinal context data

Author(s):  
David Wutchiett ◽  
Claire Durand
Author(s):  
Taesung Kim ◽  
Jinhee Kim ◽  
Wonho Yang ◽  
Hunjoo Lee ◽  
Jaegul Choo

Analyzing time-series air quality data is important for preventing severe air pollution, but it is often challenging because the data are usually partially missing, especially when collected from multiple locations simultaneously. Various deep-learning-based missing value imputation models have been proposed to address this problem. However, they are often barely interpretable, which makes the imputed data difficult to analyze. We therefore propose a novel deep-learning-based imputation model that achieves high interpretability while performing well at missing value imputation for spatio-temporal data. We verify the effectiveness of our method through quantitative and qualitative results on a publicly available air-quality dataset.


2021 ◽  
Vol 551 ◽  
pp. 67-82
Author(s):  
Ying Zhang ◽  
Baohang Zhou ◽  
Xiangrui Cai ◽  
Wenya Guo ◽  
Xiaoke Ding ◽  
...  

2021 ◽  
Vol 14 (11) ◽  
pp. 2533-2545
Author(s):  
Parikshit Bansal ◽  
Prathamesh Deshpande ◽  
Sunita Sarawagi

We present DeepMVI, a deep learning method for missing value imputation in multidimensional time-series datasets. Missing values are commonplace in decision support platforms that aggregate data over long time stretches from disparate sources, whereas reliable data analytics calls for careful handling of missing data. One strategy is to impute the missing values, and a wide variety of algorithms exist, spanning simple interpolation, matrix factorization methods like SVD, statistical models like Kalman filters, and recent deep learning methods. We show that these often provide worse results on aggregate analytics than simply excluding the missing data. DeepMVI expresses the distribution of each missing value conditioned on coarse and fine-grained signals along a time series, and on signals from correlated series at the same time. Instead of resorting to the linearity assumptions of conventional matrix factorization methods, DeepMVI harnesses a flexible deep network to extract and combine these signals in an end-to-end manner. To prevent over-fitting with high-capacity neural networks, we design a robust parameter training procedure with labeled data created using synthetic missing blocks around available indices. Our neural network uses a modular design with a novel temporal transformer with convolutional features, and kernel regression with learned embeddings. Experiments across ten real datasets and five different missing scenarios, comparing seven conventional and three deep learning methods, show that DeepMVI is significantly more accurate, reducing error by more than 50% in more than half the cases compared to the best existing method. Although DeepMVI is slower than simpler matrix factorization methods, we justify the increased time overhead by showing that it provides significantly more accurate imputation, which in turn affects the quality of downstream analytics.
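
As a rough illustration of two ideas the abstract mentions, the sketch below hides synthetic blocks of observed values to create labeled data, then scores the simplest baseline (linear interpolation) on the hidden positions. It is not the DeepMVI training procedure or network; the function names `hide_random_blocks` and `interpolate_linear`, the block sizes, and the sine test series are all assumptions made for this example.

```python
import numpy as np

def hide_random_blocks(series, block_len, n_blocks, seed=0):
    """Return a copy of `series` with synthetic NaN blocks and the hidden mask.

    Hypothetical helper: blocks are placed only over windows that are fully
    observed, and the first and last points are never hidden.
    """
    rng = np.random.default_rng(seed)
    masked = np.asarray(series, dtype=float).copy()
    hidden = np.zeros(masked.shape, dtype=bool)
    observed = ~np.isnan(masked)
    # Candidate block starts whose whole window is observed in the input.
    starts = [s for s in range(1, len(masked) - block_len)
              if observed[s:s + block_len].all()]
    # Blocks may overlap, so fewer than n_blocks * block_len points can be hidden.
    for s in rng.choice(starts, size=n_blocks, replace=False):
        hidden[s:s + block_len] = True
    masked[hidden] = np.nan
    return masked, hidden

def interpolate_linear(masked):
    """Fill NaNs by linear interpolation between the nearest observed points."""
    idx = np.arange(len(masked))
    known = ~np.isnan(masked)
    return np.interp(idx, idx[known], masked[known])

# Usage: hide blocks in a fully observed series, impute, and score the result
# only on the positions that were hidden (where the true values are known).
t = np.arange(200)
truth = np.sin(2 * np.pi * t / 50)
masked, hidden = hide_random_blocks(truth, block_len=5, n_blocks=4)
filled = interpolate_linear(masked)
mae = np.abs(filled[hidden] - truth[hidden]).mean()
print(f"hidden points: {hidden.sum()}, interpolation MAE: {mae:.4f}")
```

Because the hidden values are known, the same masks can be reused to compare any imputation method against this baseline on equal terms.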


The R Journal ◽  
2017 ◽  
Vol 9 (1) ◽  
pp. 207 ◽  
Author(s):  
Steffen Moritz ◽  
Thomas Bartz-Beielstein

2018 ◽  
Author(s):  
Stefan Bischof ◽  
Andreas Harth ◽  
Benedikt Kämpgen ◽  
Axel Polleres ◽  
Patrik Schneider
