Missing Data Imputation: Latest Research Papers

Missing Data Imputation – A Survey

International Journal of Decision Support System Technology ◽

10.4018/ijdsst.292446 ◽

2022 ◽

Vol 14 (1) ◽

pp. 0-0

Keyword(s):

Missing Data ◽

Linear Regression ◽

Missing Values ◽

Computational Cost ◽

Machine Learning Algorithms ◽

Classification And Regression Tree ◽

High Dimensional ◽

Missing Data Imputation ◽

Real World Datasets ◽

Incomplete Datasets

Many real-world datasets contain missing values for various reasons. These incomplete datasets can pose severe problems for the underlying machine learning algorithms and decision support systems, resulting in high computational cost, skewed output and invalid deductions. Various solutions exist to mitigate this issue; the most popular strategy is to estimate the missing values by applying inferential techniques such as linear regression, decision trees or Bayesian inference. In this paper, the missing data problem is discussed in detail, with a comprehensive review of the approaches to tackle it. The paper concludes with a discussion of the effectiveness of three imputation methods, namely imputation based on Multiple Linear Regression (MLR), Predictive Mean Matching (PMM) and Classification And Regression Tree (CART), in the context of subspace clustering. The experimental results obtained on real benchmark datasets and high-dimensional synthetic datasets indicate that the MLR-based imputation method is more efficient on high-dimensional incomplete datasets.
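To make the idea of regression-based imputation in the abstract above concrete, here is a minimal sketch in Python. It is not the paper's implementation: for brevity it fits a least-squares line on a single predictor rather than the multiple regression the paper studies, and the function name `regression_impute` and the sample data are hypothetical.

```python
from statistics import mean


def regression_impute(x, y):
    """Fill None entries of y using a least-squares line fitted on complete pairs.

    Hypothetical helper for illustration; single predictor only.
    """
    # Fit only on observations where the target is present.
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two complete observations")

    x_bar = mean(xi for xi, _ in pairs)
    y_bar = mean(yi for _, yi in pairs)
    sxx = sum((xi - x_bar) ** 2 for xi, _ in pairs)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")

    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs) / sxx
    intercept = y_bar - slope * x_bar

    # Keep observed values; replace missing ones with the fitted prediction.
    return [yi if yi is not None else intercept + slope * xi
            for xi, yi in zip(x, y)]


# Made-up data: y is missing where x == 3.0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, None, 8.0, 10.0]
filled = regression_impute(x, y)
```

Because each imputed value is a deterministic prediction from the fitted line, this style of imputation tends to understate the variance of the filled variable.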

Comparison of Missing Data Imputation Methods in Time Series Forecasting

Computers Materials & Continua ◽

10.32604/cmc.2022.019369 ◽

2022 ◽

Vol 70 (1) ◽

pp. 767-779

Author(s):

Hyun Ahn ◽

Kyunghee Sun ◽

Kwanghoon Pio Kim

Keyword(s):

Time Series ◽

Missing Data ◽

Time Series Forecasting ◽

Data Imputation ◽

Missing Data Imputation ◽

Imputation Methods

Reconstruction of Missing Segments in Well Data History Using Data Analytics

10.2118/208137-ms ◽

2021 ◽

Author(s):

Yuanjun Li ◽

Roland Horne ◽

Ahmed Al Shmakhy ◽

Tania Felix Menchaca

Keyword(s):

Missing Data ◽

Data Streams ◽

History Matching ◽

Gas Flow ◽

Missing Values ◽

Imputation Accuracy ◽

Time Span ◽

Data Imputation ◽

Significant Information ◽

Missing Data Imputation

Abstract: Missing data occur frequently in well production history records. Due to network outages, facility maintenance or equipment failure, the time series production data measured from surface and downhole gauges can be intermittent. The fragmentary data are an obstacle for reservoir management. The incomplete dataset is commonly simplified by omitting all observations with missing values, which leads to significant information loss. To fill the missing data gaps, in this study we developed and tested several missing data imputation approaches using machine learning and deep learning methods. Traditional data imputation methods, such as interpolation and substituting the most frequent value, can introduce bias to the data because the correlations between features are not considered. We therefore investigated several multivariate imputation algorithms that use the entire set of available data streams to estimate the missing values. The methods use a full suite of well measurements, including wellhead and downhole pressures, oil, water and gas flow rates, surface and downhole temperatures, choke settings, etc. Any parameter that has gaps in its recorded history can be imputed from the other available data streams. The models were tested on both synthetic and real datasets from operating Norwegian and Abu Dhabi reservoirs. Based on the characteristics of the field data, we introduced different types of continuous missing distributions into the complete dataset, combining single or multiple missing sections with long or short time spans. We observed that, as the missing time span expands, the more successful methods remained stable up to a threshold of 30% of the entire dataset. In addition, for a single missing section over a shorter period, which could represent a weather perturbation, most methods we tried were able to achieve high imputation accuracy. In the case of multiple missing sections over a longer time span, which is typical of gauge failures, other methods were better candidates to capture the overall correlation in the multivariate dataset. Most missing data problems addressed in our industry focus on single-feature imputation. In this study, we developed an efficient procedure that enables fast reconstruction of the entire production dataset with multiple missing sections in different variables. Ultimately, the complete information can support the reservoir history matching process, production allocation, and the development of models for reservoir performance prediction.
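The evaluation protocol described in the abstract above (remove a continuous section from a complete series, impute it, then compare against the known values) can be sketched as follows. This is an illustrative assumption, not the authors' code: it uses plain linear interpolation, which the abstract names as a traditional baseline, and the helper names `mask_section`, `linear_interpolate` and `rmse_on_gap` and the sample series are hypothetical.

```python
import math


def mask_section(series, start, length):
    """Return a copy of series with one contiguous section set to None."""
    out = list(series)
    for i in range(start, min(start + length, len(out))):
        out[i] = None
    return out


def linear_interpolate(series):
    """Fill None entries by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = series[left] + weight * (series[right] - series[left])
    return out


def rmse_on_gap(truth, filled, gapped):
    """Root-mean-square error measured only on the positions that were masked."""
    idx = [i for i, v in enumerate(gapped) if v is None]
    return math.sqrt(sum((truth[i] - filled[i]) ** 2 for i in idx) / len(idx))


# Made-up, perfectly linear series, so interpolation should recover the gap.
truth = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
gapped = mask_section(truth, start=2, length=2)
filled = linear_interpolate(gapped)
error = rmse_on_gap(truth, filled, gapped)
```

Interpolation uses only the gapped variable itself, which is why the abstract contrasts it with multivariate methods that draw on the other data streams; on real, non-linear production data the error over a long gap would generally not be near zero as it is in this toy series.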

Adaptive Deep Incremental Learning — Assisted Missing Data Imputation for Streaming Data

Journal of Interconnection Networks ◽

10.1142/s021926592143009x ◽

2021 ◽

Author(s):

C. V. S. R. Syavasya ◽

M. A. Lakshmi

Keyword(s):

Missing Data ◽

Incremental Learning ◽

Missing Values ◽

Learning Algorithm ◽

Streaming Data ◽

Stochastic Gradient Descent ◽

Data Imputation ◽

Imputation Model ◽

Missing Data Imputation ◽

Hidden Neurons

With the rapid growth of data streams from applications, ensuring accurate data analysis is essential for effective real-time decision making. Data stream applications often encounter missing values that affect the performance of classification models. Several imputation models have adopted deep learning algorithms for estimating the missing values; however, the lack of parameter and structure tuning in classification degrades the performance of data imputation. This work presents a missing data imputation model using an adaptive deep incremental learning algorithm for streaming applications. The proposed approach incorporates two main processes: enhancing the deep incremental learning algorithm and enhancing deep incremental learning-based imputation. First, the proposed approach focuses on tuning the learning rate with both the Adaptive Moment Estimation (Adam) and Stochastic Gradient Descent (SGD) optimizers, and on tuning the hidden neurons. Second, the proposed approach applies the enhanced deep incremental learning algorithm to estimate the imputed values in two steps: (i) an imputation process that predicts the missing values based on temporal proximity and (ii) generation of a complete IoT dataset by imputing the missing values from the predicted values. The experimental outcomes illustrate that the proposed imputation model effectively transforms the incomplete dataset into a complete dataset with minimal error.

Missing data imputation on biomedical data using deeply learned clustering and L2 regularized regression based on symmetric uncertainty

Artificial Intelligence in Medicine ◽

10.1016/j.artmed.2021.102214 ◽

2021 ◽

pp. 102214

Author(s):

Gayathri Nagarajan ◽

L.D. Dhinesh Babu

Keyword(s):

Missing Data ◽

Biomedical Data ◽

Data Imputation ◽

Regularized Regression ◽

Missing Data Imputation

Missing data imputation using mixture factor analysis for building electric load data

Applied Energy ◽

10.1016/j.apenergy.2021.117655 ◽

2021 ◽

Vol 304 ◽

pp. 117655

Author(s):

Dongyeon Jeong ◽

Chiwoo Park ◽

Young Myoung Ko

Keyword(s):

Factor Analysis ◽

Missing Data ◽

Electric Load ◽

Data Imputation ◽

Missing Data Imputation

A Novel Missing Data Imputation Algorithm for Deep Learning-Based Anomaly Detection System in IIoT Networks

10.1201/9781003156123-2 ◽

2021 ◽

pp. 27-46

Author(s):

Ancy Jose ◽

S.V. Annlin Jeba ◽

Beulah Joslyn Jose

Keyword(s):

Deep Learning ◽

Missing Data ◽

Anomaly Detection ◽

Detection System ◽

Data Imputation ◽

Missing Data Imputation ◽

Anomaly Detection System

EvoImputer: An evolutionary approach for Missing Data Imputation and feature selection in the context of supervised learning

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107734 ◽

2021 ◽

pp. 107734

Author(s):

Shatha Awawdeh ◽

Hossam Faris ◽

Hazem Hiary

Keyword(s):

Feature Selection ◽

Missing Data ◽

Supervised Learning ◽

Evolutionary Approach ◽

Data Imputation ◽

Missing Data Imputation

COLI: Collaborative Clustering Missing Data Imputation

Pattern Recognition Letters ◽

10.1016/j.patrec.2021.11.011 ◽

2021 ◽

Author(s):

Daoming Wan ◽

Roozbeh Razavi-Far ◽

Mehrdad Saif ◽

Niloofar Mozafari

Keyword(s):

Missing Data ◽

Data Imputation ◽

Missing Data Imputation ◽

Collaborative Clustering

A Review of Current Publications Trend on Missing Data Imputation Over Three Decades: Direction and Future Research

10.21203/rs.3.rs-996596/v1 ◽

2021 ◽

Author(s):

Farah Adibah Adnan ◽

Khairur Rijal Jamaludin ◽

Wan Zuki Azman Wan Muhamad ◽

Suraya Miskon

Keyword(s):

Missing Data ◽

Nearest Neighbor ◽

Future Research ◽

Data Imputation ◽

Classification Problems ◽

Missing Data Imputation ◽

Imputation Methods ◽

Publish Or Perish ◽

Research Fields ◽

Scopus Database

Abstract: Missing values, sometimes referred to as missing data, are an unavoidable issue when collecting data. They are uncontrollable and occur in almost every research field. Hence, this study focused on identifying the current publication trend in missing data imputation techniques (1991-2021), specifically in classification problems, using bibliometric analysis. Most importantly, this research aims to identify promising missing data imputation methods. Two software tools were used: VOSviewer and Harzing's Publish or Perish. Based on data extracted from the Scopus database in June 2021, the findings indicate an emerging trend in missing data imputation research to date, with two imputation methods receiving the most attention: the random forest and the nearest neighbor methods.
