REALIZATION OF MISSING DATA TECHNIQUES WITHIN STATISTICAL PROGRAM PACKAGES AND THEIR EMPIRICAL PERFORMANCE

Author(s):  
Rainer Schnell


Author(s):
Maria Lucia Parrella ◽  
Giuseppina Albano ◽  
Cira Perna ◽  
Michele La Rocca

Missing data reconstruction is a critical step in the analysis and mining of spatio-temporal data. However, few studies comprehensively consider missing data patterns, sample selection and spatio-temporal relationships. To take into account the uncertainty in the point forecast, prediction intervals may be of interest. In particular, for (possibly long) missing sequences of consecutive time points, joint prediction regions are desirable. In this paper we propose a bootstrap resampling scheme to construct joint prediction regions that approximately contain missing paths of a time component in a spatio-temporal framework, with global probability $1-\alpha$. In many applications, requiring coverage of the whole missing sample path might appear too restrictive. To obtain more informative inference, we also derive smaller joint prediction regions that contain all elements of the missing paths, except for at most a small number k of them, with probability $1-\alpha$. A simulation experiment is performed to validate the empirical performance of the proposed joint bootstrap prediction and to compare it with some alternative procedures based on a simple nominal coverage correction, loosely inspired by the Bonferroni approach, which are expected to work well in standard scenarios.
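
The idea of a joint prediction region can be illustrated with a minimal sketch. It is not the resampling scheme proposed in the paper: it assumes that an ensemble of simulated paths is already available (here produced by a hypothetical AR(1) simulator, `simulate_ar1_paths`, standing in for bootstrap replicates) and builds a region from the quantile of the largest standardized deviation along each path. Setting `k > 0` uses the (k+1)-th largest deviation instead, so up to k points of a path may fall outside the region. A Bonferroni-style region with pointwise quantiles is included for comparison. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def joint_region(paths, alpha=0.1, k=0):
    """Joint region from an ensemble of simulated paths, shape (B, H).

    With k=0, about a fraction 1 - alpha of the paths lie entirely inside.
    With k>0, each path may leave the region at up to k time points.
    """
    paths = np.asarray(paths, dtype=float)
    center = paths.mean(axis=0)
    scale = paths.std(axis=0, ddof=1)
    # Standardized absolute deviation of every path at every time point.
    dev = np.abs(paths - center) / scale
    # (k+1)-th largest deviation along each path (the maximum when k=0).
    stat = np.sort(dev, axis=1)[:, -(k + 1)]
    d = np.quantile(stat, 1 - alpha)
    return center - d * scale, center + d * scale

def bonferroni_region(paths, alpha=0.1):
    """Pointwise quantile intervals with the level split over H points."""
    paths = np.asarray(paths, dtype=float)
    horizon = paths.shape[1]
    lower = np.quantile(paths, alpha / (2 * horizon), axis=0)
    upper = np.quantile(paths, 1 - alpha / (2 * horizon), axis=0)
    return lower, upper

def simulate_ar1_paths(n_paths, horizon, phi=0.7, seed=0):
    """Hypothetical stand-in for bootstrap replicates of a missing path."""
    rng = np.random.default_rng(seed)
    paths = np.zeros((n_paths, horizon))
    prev = np.zeros(n_paths)
    for t in range(horizon):
        prev = phi * prev + rng.standard_normal(n_paths)
        paths[:, t] = prev
    return paths

paths = simulate_ar1_paths(2000, 10)
lo, hi = joint_region(paths, alpha=0.1)
lo_k, hi_k = joint_region(paths, alpha=0.1, k=1)
lo_b, hi_b = bonferroni_region(paths, alpha=0.1)

# Share of paths that lie entirely inside the k=0 region.
full_coverage = np.all((paths >= lo) & (paths <= hi), axis=1).mean()
# Share of paths with at most one point outside the k=1 region.
violations = ((paths < lo_k) | (paths > hi_k)).sum(axis=1)
k_coverage = (violations <= 1).mean()
print(full_coverage, k_coverage)
```

In this sketch the k=1 region is never wider than the k=0 region, which mirrors the trade-off described in the abstract: tolerating a few excluded points gives a tighter region at the same nominal level.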


Author(s):  
Pedro J. García-Laencina ◽  
Juan Morales-Sánchez ◽  
Rafael Verdú-Monedero ◽  
Jorge Larrey-Ruiz ◽  
José-Luis Sancho-Gómez ◽  
...  

Many real-world classification scenarios suffer from a common drawback: missing, or incomplete, data. The ability to handle missing data has become a fundamental requirement for pattern classification because the absence of certain values for relevant data attributes can seriously affect the accuracy of classification results. This chapter focuses on incomplete pattern classification. Research on this topic continues to grow, and most solutions based on machine learning are known to be useful and efficient. This chapter analyzes the most popular and suitable missing data techniques based on machine learning for solving pattern classification tasks, highlighting their advantages and disadvantages.


1998 ◽  
Vol 24 (6) ◽  
pp. 763-779 ◽  
Author(s):  
Fred S. Switzer ◽  
Philip L. Roth ◽  
Deborah M. Switzer

The accuracy of eight missing data techniques (MDTs) under conditions of systematically missing data was tested using a Monte Carlo analysis. Data were generated from a population correlation matrix, then deleted using several patterns that might be found in a human resource management (HRM) selection validation study. The results indicated that listwise and pairwise deletion were the most accurate methods, followed closely by imputation methods such as regression and hot-deck. Mean substitution was substantially inferior to the other methods tested. Future research that examines different missing data patterns is recommended.
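
A Monte Carlo comparison of this kind can be sketched in a few lines. The example below is a simplified illustration, not the design used in the study: it assumes a single pair of standard normal variables with population correlation `rho`, deletes values of one variable completely at random (rather than with the systematic patterns the study examined), and compares only listwise deletion with mean substitution. The function name `simulate` and all parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate(rho=0.5, n=200, p_missing=0.3, reps=500, seed=0):
    """Average correlation estimate under two missing data techniques."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    listwise, mean_sub = [], []
    for _ in range(reps):
        # Draw a sample from the population correlation matrix.
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x, y = xy[:, 0], xy[:, 1]
        # Assumption: y is deleted completely at random.
        missing = rng.random(n) < p_missing
        keep = ~missing
        # Listwise deletion: use only the complete cases.
        listwise.append(np.corrcoef(x[keep], y[keep])[0, 1])
        # Mean substitution: fill missing y with the observed mean.
        y_imp = y.copy()
        y_imp[missing] = y[keep].mean()
        mean_sub.append(np.corrcoef(x, y_imp)[0, 1])
    return float(np.mean(listwise)), float(np.mean(mean_sub))

r_listwise, r_mean_sub = simulate()
print(r_listwise, r_mean_sub)
```

Under these assumptions mean substitution attenuates the correlation by roughly a factor of sqrt(1 - p_missing), because the imputed values add no covariance while still counting as observations; listwise deletion stays close to the population value.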

