State estimation of missing data imputation

Author(s):  
Keung Hui ◽  
Jason Mou
2019 ◽  
Vol 50 (3) ◽  
pp. 860-877 ◽  
Author(s):  
Jie Lin ◽  
NianHua Li ◽  
Md Ashraful Alam ◽  
Yuqing Ma

Abstract Due to cluster instability, not in the cluster monitoring system. This paper focuses on missing data imputation for cluster monitoring applications and proposes a new hybrid multiple imputation framework. The approach differs from conventional multiple imputation techniques in that it attempts to impute missing data under an arbitrary missing pattern using an architecture that combines model-based and data-driven components. Essentially, a deep neural network serves as the data model and extracts deep features from the data; these deep features are then further processed by regression or data-driven strategies and used to estimate the missing data under the arbitrary missing pattern. The paper gives evidence that if a deep neural network can be trained to construct deep features of the data, imputation based on those deep features is better than imputation performed directly on the original data. In the experiments, the proposed method is compared with conventional multiple imputation approaches across varying missing data patterns, missing ratios, and datasets, including real cluster data. The results illustrate that when the data have a larger missing ratio and varied missing patterns, the proposed algorithm achieves more accurate and more stable imputation performance.
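The evaluation protocol described in the abstract above (hide known values at several missing ratios, impute, then score against the hidden truth) can be sketched as follows. This is a minimal illustration and not the authors' framework: the column-mean imputer is only a placeholder baseline, the data are synthetic, the masking is completely at random, and all function names are hypothetical.

```python
import math
import random

def mask_completely_at_random(rows, ratio, rng):
    """Return a copy of rows with roughly `ratio` of the cells set to None."""
    return [[None if rng.random() < ratio else v for v in row] for row in rows]

def impute_column_mean(rows):
    """Placeholder baseline: fill each missing cell with its column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: fall back to 0.0 if a column has no observed value at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def rmse_on_masked(truth, masked, imputed):
    """Root-mean-square error computed only over the cells that were masked."""
    errs = [(t - p) ** 2
            for t_row, m_row, p_row in zip(truth, masked, imputed)
            for t, m, p in zip(t_row, m_row, p_row)
            if m is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0

# Synthetic stand-in for monitoring data: 200 samples, 2 metrics.
rng = random.Random(0)
truth = [[rng.gauss(10.0, 2.0), rng.gauss(50.0, 5.0)] for _ in range(200)]

results = {}
for ratio in (0.1, 0.3, 0.5):
    masked = mask_completely_at_random(truth, ratio, rng)
    imputed = impute_column_mean(masked)
    results[ratio] = rmse_on_masked(truth, masked, imputed)
    print(f"missing ratio {ratio:.1f}: RMSE {results[ratio]:.3f}")
```

Any other imputer with the same input and output shape could be swapped in for `impute_column_mean` to compare methods under identical masks.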


2021 ◽  
pp. 147592172110219
Author(s):  
Huachen Jiang ◽  
Chunfeng Wan ◽  
Kang Yang ◽  
Youliang Ding ◽  
Songtao Xue

Wireless sensors are key components of structural health monitoring systems. During signal transmission, sensor failure is inevitable, and data loss is its most common form. Missing data pose a major challenge to subsequent damage detection and condition assessment and therefore deserve close attention. Conventional missing data imputation mostly adopts correlation-based methods, especially for strain monitoring data. However, such methods often require delicate model selection, and the correlations for vehicle-induced strains are much harder to capture than those for temperature-induced strains. In this article, a novel data-driven generative adversarial network (GAN) for imputing missing strain response is proposed. In contrast to traditional approaches, in which inter-strain correlations are modeled explicitly, the proposed method imputes the missing data directly by considering the spatial–temporal relationships with other strain sensors, based on the remaining observed data. Furthermore, an intact, complete dataset is not even necessary during training, which is another major advantage over model-based imputation methods. The proposed method is implemented and verified on a real concrete bridge. To demonstrate the applicability and robustness of the GAN, imputation for single and multiple sensors is studied. The results show that the proposed method performs excellently in terms of imputation accuracy and efficiency.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nishith Kumar ◽  
Md. Aminul Hoque ◽  
Masahiro Sugimoto

Abstract Mass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional, large-scale matrix (samples × metabolites) of quantified data that often contains missing cells as well as outliers, which arise for several reasons, including technical and biological sources. Although several missing data imputation techniques are described in the literature, all conventional techniques address only the missing value problem and do not handle outliers. As a result, outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that resolves the problems of both missing values and outliers. We evaluated the performance of the proposed method and of other conventional and recently developed missing data imputation techniques using both artificially generated data and experimentally measured data, in both the absence and the presence of different rates of outliers. Performance on both artificial data and real metabolomics data indicates the superiority of the proposed kernel weight-based missing data imputation technique over the existing alternatives. For user convenience, an R package of the proposed kernel weight-based missing value imputation technique was developed, which is available at https://github.com/NishithPaul/tWLSA.
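To illustrate why kernel weights help when outliers are present, here is a minimal sketch of an outlier-robust fill value for a samples × metabolites matrix. It is not the tWLSA method and does not reproduce the R package's API: it simply replaces each missing cell with a Gaussian-kernel-weighted mean of its column, centred on the median and scaled by the median absolute deviation (MAD), so that extreme values receive weights near zero. All names are hypothetical, and the code assumes every column has at least one observed value.

```python
import math
import statistics

def gaussian_kernel_weight(value, center, scale):
    """Weight near 1 for values close to center, near 0 for distant values."""
    return math.exp(-0.5 * ((value - center) / scale) ** 2)

def robust_column_estimate(observed):
    """Kernel-weighted mean of the observed values, centred on their median."""
    center = statistics.median(observed)
    mad = statistics.median([abs(v - center) for v in observed])
    # 1.4826 rescales the MAD to match a standard deviation under normality;
    # fall back to 1.0 when the MAD is zero to avoid dividing by zero.
    scale = 1.4826 * mad if mad > 0 else 1.0
    weights = [gaussian_kernel_weight(v, center, scale) for v in observed]
    return sum(w * v for w, v in zip(weights, observed)) / sum(weights)

def impute_kernel_weighted(rows):
    """Fill None cells with the robust estimate of their column.

    Assumption: each column contains at least one observed value.
    """
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        fills.append(robust_column_estimate(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)] for row in rows]

# Toy matrix (samples x metabolites) with two missing cells and one outlier.
data = [
    [10.0, 5.0],
    [11.0, None],
    [9.0, 5.2],
    [10.5, 4.8],
    [None, 5.1],
    [1000.0, 4.9],  # outlier in the first metabolite
]
filled = impute_kernel_weighted(data)
observed_first = [row[0] for row in data if row[0] is not None]
print("plain mean fill: ", sum(observed_first) / len(observed_first))
print("kernel-weighted fill:", filled[4][0])
```

The plain column mean is pulled far from the bulk of the data by the single outlier, while the kernel-weighted estimate stays within the range of the typical values.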

