Imputation Methods Outperform Missing-Indicator for Data Missing Completely at Random

Multiple imputation methods were used to construct complete datasets under different missing patterns, based on the missingness observed in maritime search and rescue data and on practical needs. Probability density curves and overimputation diagnostics were used to assess the quality of the imputations. The results showed that the Data Augmentation (DA) algorithm was computationally efficient and imputed well, but was not suitable when the missing rate was high. The Expectation-Maximization with Bootstrap (EMB) algorithm effectively restored the distribution of datasets across different missing rates and was less affected by the position of the missing values; it imputed well even when the missing rate was high. However, the EMB algorithm estimated extreme values poorly and failed to reflect the dataset’s variability. Overimputation diagnostics not only reflected imputation quality but also showed the correlation between different datasets, which is important for deeper data mining and for improving imputation quality.
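
The overimputation idea mentioned in the abstract can be sketched roughly as follows: each observed value is hidden in turn, re-imputed from the remaining data, and compared with the value actually observed. This is only a minimal illustration under stated assumptions. A single deterministic least-squares line stands in for the imputation model, whereas the paper uses multiple imputation (DA and EMB), and the names `fit_line` and `overimpute` and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def overimpute(xs, ys):
    """Hide each observed y in turn, re-impute it, and pair it with the truth.

    Assumption: one regression prediction replaces the multiple draws
    that a real multiple-imputation model would produce.
    """
    pairs = []
    for i in range(len(ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(rest_x, rest_y)
        pairs.append((ys[i], intercept + slope * xs[i]))
    return pairs


# Toy data (hypothetical): y depends exactly linearly on x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
pairs = overimpute(xs, ys)
```

Plotting the observed values against the re-imputed ones gives the usual diagnostic view: points far from the diagonal mark values that the imputation model recovers poorly, such as extremes.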

Futuristic Prediction of Missing Value Imputation Methods Using Extended ANN

International Journal of Business Analytics ◽ 10.4018/ijban.292055 ◽ 2022 ◽ Vol 9 (3) ◽ pp. 0-0

Keyword(s): Data Analysis ◽ Missing Data ◽ Measurement Errors ◽ Missing Values ◽ Missing Value ◽ Hybrid Schemes ◽ Imputation Methods ◽ Research Fields ◽ Data Missing ◽ The Given

Missing data is a universal complication in most research fields and introduces uncertainty into data analysis. It can occur for many reasons, such as mishandled samples, failure to collect an observation, measurement errors, deletion of aberrant values, or simply a shortage of study. The field of nutrition is no exception. Most frequently, the problem is handled by computing means or medians from the existing data, an approach that needs improvement. The paper proposes a hybrid scheme of MICE and ANN, called extended ANN, to find and analyze missing values and impute them in a given dataset. The proposed mechanism analyzes the blank entries and fills them by examining neighboring records, in order to improve the accuracy of the dataset. To validate the scheme, the extended ANN is compared against various recent algorithms in terms of efficiency and accuracy.

Comparison of Imputation Methods for Missing Values in Longitudinal Data Under Missing Completely at Random (mcar) mechanism

African Journal of Applied Statistics ◽ 10.16929/ajas/241.213 ◽ 2017 ◽ Vol 4 (1) ◽ pp. 241-258

Author(s): Lotsi Anani

Keyword(s): Longitudinal Data ◽ Missing Values ◽ Imputation Methods ◽ Missing Completely At Random

Analyzing Longitudinal Health-Related Quality of Life Data: Missing Data and Imputation Methods

Statistical Methods for Quality of Life Studies ◽ 10.1007/978-1-4757-3625-0_9 ◽ 2002 ◽ pp. 103-112 ◽ Cited By ~ 1

Author(s): Dennis A. Revicki

Keyword(s): Quality Of Life ◽ Missing Data ◽ Health Related Quality ◽ Imputation Methods ◽ Life Data ◽ Related Quality ◽ Health Related ◽ Data Missing

The Effect of Partly Missing Covariates on Statistical Power in Randomized Controlled Trials With Discrete-Time Survival Endpoints

Methodology ◽ 10.1027/1614-2241/a000121 ◽ 2017 ◽ Vol 13 (2) ◽ pp. 41-60

Author(s): Shahab Jolani ◽ Maryam Safarkhani

Keyword(s): Randomized Controlled Trials ◽ Discrete Time ◽ Treatment Effect ◽ Survival Data ◽ Controlled Trials ◽ Missing Covariates ◽ Indicator Method ◽ Imputation Methods ◽ Randomized Controlled ◽ Baseline Covariates

Abstract. In randomized controlled trials (RCTs), a common strategy to increase power to detect a treatment effect is adjustment for baseline covariates. However, adjustment with partly missing covariates, where only complete cases are used, is inefficient. We consider different alternatives in trials with discrete-time survival data, where subjects are measured in discrete-time intervals while they may experience an event at any point in time. The results of a Monte Carlo simulation study, as well as a case study of randomized trials in smokers with attention deficit hyperactivity disorder (ADHD), indicated that single and multiple imputation methods outperform the other methods and increase precision in estimating the treatment effect. The missing-indicator method, which adds a dummy variable to the statistical model to indicate whether the value of a covariate is missing and sets all missing values of that covariate to the same value, is comparable to the imputation methods. Nevertheless, the power to detect the treatment effect with the missing-indicator method is marginally lower than with the imputation methods, particularly when the missingness depends on the outcome. In conclusion, it appears that imputation of partly missing (baseline) covariates should be preferred in the analysis of discrete-time survival data.
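
As a rough illustration of the two covariate-handling strategies the abstract contrasts, the sketch below builds the columns that each approach would pass to a regression model. It is not the authors' code. The function names `missing_indicator` and `mean_impute`, the fill value 0.0, and the toy data are assumptions made for the example, and mean imputation stands in for the single-imputation methods studied in the paper.

```python
from statistics import mean


def missing_indicator(values, fill=0.0):
    """Missing-indicator method: one constant fill value plus a dummy column.

    `values` uses None for a missing covariate. The fill value is arbitrary
    (an assumption here); the dummy column absorbs its effect in the model.
    """
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


def mean_impute(values):
    """Single imputation: replace each missing value with the observed mean."""
    observed = [v for v in values if v is not None]
    observed_mean = mean(observed)
    return [observed_mean if v is None else v for v in values]


# Toy baseline covariate with two missing entries (hypothetical data).
baseline = [4.0, None, 6.0, None, 8.0]

filled, indicator = missing_indicator(baseline)
imputed = mean_impute(baseline)
```

With the missing-indicator method the model receives both `filled` and `indicator` as covariates; with imputation it receives only `imputed`, and all subjects stay in the analysis in either case.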

Mean Empirical Likelihood Inference for Response Mean with Data Missing at Random

Discrete Dynamics in Nature and Society ◽ 10.1155/2020/8893594 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-12

Author(s): Hanji He ◽ Guangming Deng

Keyword(s): Empirical Likelihood ◽ Missing At Random ◽ Confidence Regions ◽ Likelihood Inference ◽ High Dimensional ◽ Finite Sample ◽ The Mean ◽ Data Missing ◽ Consistency And Asymptotic Normality ◽ The Impact

We extend the mean empirical likelihood inference for response mean with data missing at random. The empirical likelihood ratio confidence regions are poor when the response is missing at random, especially when the covariate is high-dimensional and the sample size is small. Hence, we develop three bias-corrected mean empirical likelihood approaches to obtain efficient inference for response mean. From the three bias-corrected estimating equations, we obtain a new set by constructing a pairwise-mean dataset. The method can increase the size of the sample for estimation and reduce the impact of the curse of dimensionality. Consistency and asymptotic normality of the maximum mean empirical likelihood estimators are established. The finite sample performance of the proposed estimators is presented through simulation, and an application to the Boston Housing dataset is shown.
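
The sample-size effect of a pairwise-mean dataset can be seen in a small sketch. It assumes the construction takes the mean of every pair of values with i ≤ j, including each value paired with itself; the abstract does not spell this out, so treat that choice, the name `pairwise_means`, and the toy sample as assumptions.

```python
from itertools import combinations_with_replacement
from statistics import mean


def pairwise_means(sample):
    """Means of all pairs (x_i, x_j) with i <= j.

    Assumption: self-pairs (i == j) are included, so n values
    yield n * (n + 1) / 2 pairwise means.
    """
    return [(a + b) / 2 for a, b in combinations_with_replacement(sample, 2)]


# Toy sample (hypothetical data).
sample = [1.0, 2.0, 4.0]
augmented = pairwise_means(sample)
```

Under this construction three values become six, and the average of the pairwise means equals the average of the original sample, so the enlarged dataset targets the same mean.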

Analysis of Imputation Methods of Small and Unbalanced Datasets in Classifications using Naïve Bayes and Particle Swarm Optimization

2020 International Seminar on Application for Technology of Information and Communication (iSemantic) ◽ 10.1109/isemantic50169.2020.9234225 ◽ 2020

Author(s): Muhammad Misdram ◽ Edi Noersasongko ◽ Abdul Syukur ◽ Purwanto ◽ Muljono Muljono ◽ ...

Keyword(s): Particle Swarm Optimization ◽ Naive Bayes ◽ Particle Swarm ◽ Naïve Bayes ◽ Swarm Optimization ◽ Imputation Methods

Comparison of five imputation methods in handling missing data in a continuous frequency table

10.1063/5.0053286 ◽ 2021

Author(s): M. B. Mohammed ◽ H. S. Zulkafli ◽ M. B. Adam ◽ N. Ali ◽ I. A. Baba

Keyword(s): Missing Data ◽ Imputation Methods ◽ Frequency Table ◽ Continuous Frequency

Improved imputation methods for missing data in two-occasion successive sampling

Communications in Statistics - Theory and Methods ◽ 10.1080/03610926.2021.1944211 ◽ 2021 ◽ pp. 1-20

Author(s): Garib Nath Singh ◽ Ashok Kumar Jaiswal ◽ Awadhesh K. Pandey

Keyword(s): Missing Data ◽ Successive Sampling ◽ Imputation Methods

A Note on Normal Theory Power Calculation in SEM With Data Missing Completely at Random

Multiple imputation of maritime search and rescue data at multiple missing patterns
