A Review of Missing Data Handling Methods in Education Research

2014 ◽ Vol 84 (4) ◽ pp. 487-508
Author(s): Jehanzeb R. Cheema
Author(s): Craig K. Enders ◽ Amanda N. Baraldi

SIMULATION ◽ 2020 ◽ Vol 96 (10) ◽ pp. 825-839
Author(s): Hao Cheng

Missing data are almost inevitable in many applications, for a variety of reasons. Hierarchical latent variable models usually face two kinds of missing data problems: manifest variables with incomplete observations, and latent variables that cannot be observed directly. Missing data in manifest variables can be handled by several methods. For latent variables, several partial least squares (PLS) algorithms are widely used to estimate latent variable values. In this paper, we not only combine traditional linear-regression-type PLS algorithms with missing data handling methods, but also introduce quantile regression to improve the performance of PLS algorithms when the relationships among manifest and latent variables are not fixed but vary with the quantile of interest. This gives an overall view of the variables’ relationships at different levels. The main challenges lie in how to introduce quantile regression into PLS algorithms correctly and how well the PLS algorithms perform when manifest variables have missing values. Through simulation studies, we compare all the PLS algorithms with missing data handling methods in different settings, and finally build a hierarchical latent variable model of business sophistication based on real data.


Author(s): Seçil Ömür Sünbül

This study investigated the impact of different missing data handling methods on DINA model parameter estimation and classification accuracy. Simulated data were generated by manipulating the number of items and the sample size. In the generated data, two missing data mechanisms (missing completely at random and missing at random) were created at three different rates of missing data. The missing data were then completed by treating missing responses as incorrect, person mean imputation, two-way imputation, and expectation-maximization (EM) algorithm imputation. The results showed that both the s (slip) and g (guessing) parameter estimates and the classification accuracies were affected by the missing data rates, the missing data handling methods, and the missing data mechanisms.
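
As a rough illustration of three of the handling methods named in this abstract, the sketch below applies them to a small persons-by-items matrix of binary responses. It is not the study's implementation: the function names, the toy data, and the 0.5 rounding threshold are assumptions made here, and the two-way formula (person mean + item mean - overall mean) is the commonly used form of two-way imputation, which the abstract itself does not spell out. EM imputation is omitted because it needs a model-specific routine.

```python
import numpy as np

def impute_as_incorrect(x):
    """Score every missing response (NaN) as incorrect, i.e. 0."""
    return np.where(np.isnan(x), 0.0, x)

def impute_person_mean(x, threshold=0.5):
    """Fill each missing response with that person's mean over observed items.

    The mean is rounded to 0/1 with an assumed threshold so the result stays binary.
    """
    person_mean = np.nanmean(x, axis=1, keepdims=True)
    filled = np.where(np.isnan(x), person_mean, x)
    return (filled >= threshold).astype(float)

def impute_two_way(x, threshold=0.5):
    """Fill each missing response with person mean + item mean - overall mean.

    All means are taken over observed responses only; the 0/1 rounding
    threshold is again an assumption of this sketch.
    """
    person_mean = np.nanmean(x, axis=1, keepdims=True)  # shape (persons, 1)
    item_mean = np.nanmean(x, axis=0, keepdims=True)    # shape (1, items)
    overall_mean = np.nanmean(x)
    two_way = person_mean + item_mean - overall_mean    # broadcasts to x.shape
    filled = np.where(np.isnan(x), two_way, x)
    return (filled >= threshold).astype(float)

# Hypothetical toy data: 4 persons x 4 items, NaN marks a missing response.
# Every person and every item has at least one observed response.
x = np.array([
    [1.0, 1.0, np.nan, 1.0],
    [0.0, np.nan, 0.0, 1.0],
    [1.0, 0.0, 1.0, np.nan],
    [0.0, 0.0, 0.0, 0.0],
])

as_incorrect = impute_as_incorrect(x)
person_mean_filled = impute_person_mean(x)
two_way_filled = impute_two_way(x)
```

The three results differ only in the cells that were missing, which makes it easy to see how each method would feed different response patterns into a subsequent DINA model fit.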

