Rounding non-binary categorical variables following multivariate normal imputation: evaluation of simple methods and implications for practice

2012 ◽  
Vol 84 (4) ◽  
pp. 798-811 ◽  
Author(s):  
J. C. Galati ◽  
K. A. Seaton ◽  
K. J. Lee ◽  
J. A. Simpson ◽  
J. B. Carlin
2021 ◽  
pp. 1-9
Author(s):  
Moritz Marbach

Abstract: Imputing missing values is an important preprocessing step in data analysis, but the literature offers little guidance on how to choose between imputation models. This letter suggests adopting the imputation model that generates a density of imputed values most similar to that of the observed values for an incomplete variable after balancing all other covariates. We recommend stable balancing weights as a practical approach to balance covariates whose distribution is expected to differ if the values are not missing completely at random. After balancing, discrepancy statistics can be used to compare the density of imputed and observed values. We illustrate the application of the suggested approach using simulated and real-world survey data from the American National Election Study, comparing popular imputation approaches including random forests, hot-deck, predictive mean matching, and multivariate normal imputation. An R package implementing the suggested approach accompanies this letter.
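The comparison step described above can be illustrated with a small sketch. The abstract does not say which discrepancy statistics are used, so the weighted Kolmogorov-Smirnov distance below is an assumption, as are the function names. The balancing weights are taken as given; computing stable balancing weights is not shown.

```python
from bisect import bisect_right

def weighted_ecdf(values, weights):
    """Return a function x -> weighted share of values <= x."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equally long")
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    total = sum(w for _, w in pairs)
    cum, running = [], 0.0
    for _, w in pairs:
        running += w
        cum.append(running / total)

    def cdf(x):
        k = bisect_right(xs, x)  # number of sorted values <= x
        return cum[k - 1] if k else 0.0

    return cdf

def ks_discrepancy(observed, obs_weights, imputed):
    """Largest gap between the weighted observed ECDF and the imputed ECDF.

    obs_weights are hypothetical balancing weights for the observed cases;
    imputed values are weighted equally (an assumption of this sketch).
    """
    f_obs = weighted_ecdf(observed, obs_weights)
    f_imp = weighted_ecdf(imputed, [1.0] * len(imputed))
    grid = sorted(set(observed) | set(imputed))
    return max(abs(f_obs(x) - f_imp(x)) for x in grid)
```

Under this sketch, one would compute the discrepancy for each candidate imputation model and prefer the model with the smallest value.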


2014 ◽  
Vol 22 (4) ◽  
pp. 497-519 ◽  
Author(s):  
Jonathan Kropko ◽  
Ben Goodrich ◽  
Andrew Gelman ◽  
Jennifer Hill

We consider the relative performance of two common approaches to multiple imputation (MI): joint multivariate normal (MVN) MI, in which the data are modeled as a sample from a joint MVN distribution; and conditional MI, in which each variable is modeled conditionally on all the others. In order to use the multivariate normal distribution, implementations of joint MVN MI typically assume that categories of discrete variables are probabilistically constructed from continuous values. We use simulations to examine the implications of these assumptions. For each approach, we assess (1) the accuracy of the imputed values; and (2) the accuracy of coefficients and fitted values from a model fit to completed data sets. These simulations consider continuous, binary, ordinal, and unordered-categorical variables. One set of simulations uses multivariate normal data, and one set uses data from the 2008 American National Election Studies. We implement a less restrictive approach than is typical when evaluating methods using simulations in the missing data literature: in each case, missing values are generated by carefully following the conditions necessary for missingness to be “missing at random” (MAR). We find that in these situations conditional MI is more accurate than joint MVN MI whenever the data include categorical variables.


2002 ◽  
Vol 18 (1) ◽  
pp. 78-84 ◽  
Author(s):  
Eva Ullstadius ◽  
Jan-Eric Gustafsson ◽  
Berit Carlstedt

Summary: Vocabulary tests, part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv') were estimated under different assumptions of the nature of the data. In addition to an ordinary analysis of covariance matrices, assuming linearity of relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings for the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.


2006 ◽  
Vol 11 (1) ◽  
pp. 12-24 ◽  
Author(s):  
Alexander von Eye

At the level of manifest categorical variables, a large number of coefficients and models for the examination of rater agreement has been proposed and used. The most popular of these is Cohen's κ. In this article, a new coefficient, κ_s, is proposed as an alternative measure of rater agreement. Both κ and κ_s allow researchers to determine whether agreement in groups of two or more raters is significantly beyond chance. Stouffer's z is used to test the null hypothesis that κ_s = 0. The coefficient κ_s allows one, in addition to evaluating rater agreement in a fashion parallel to κ, to (1) examine subsets of cells in agreement tables, (2) examine cells that indicate disagreement, (3) consider alternative chance models, (4) take covariates into account, and (5) compare independent samples. Results from a simulation study are reported, which suggest that (a) the four measures of rater agreement, Cohen's κ, Brennan and Prediger's κ_n, raw agreement, and κ_s are sensitive to the same data characteristics when evaluating rater agreement and (b) both the z-statistic for Cohen's κ and Stouffer's z for κ_s are unimodally and symmetrically distributed, but slightly heavy-tailed. Examples use data from verbal processing and applicant selection.
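For orientation, the standard two-rater form of Cohen's κ, (p_o − p_e) / (1 − p_e), can be computed as in the minimal sketch below. The function name is hypothetical, and the alternative coefficient proposed in the article is not implemented because the abstract does not define it.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each rated the same items once."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters coincide.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over categories.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)
```

Perfect agreement gives a value of 1, and agreement no better than the chance model gives 0.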


Author(s):  
Jared A. Warren ◽  
John P. McLaughlin ◽  
Robert M. Molloy ◽  
Carlos A. Higuera ◽  
Jonathan L. Schaffer ◽  
...  

Abstract: Advances in perioperative blood management, anesthesia, and surgical technique have improved transfusion rates following primary total knee arthroplasty (TKA), and have driven substantial change in preoperative blood ordering protocols. Therefore, blood management in TKA has seen substantial changes with the implementation of preoperative screening, patient optimization, and intra- and postoperative advances. Thus, the purpose of this study was to examine changes in blood management in primary TKA in a nationwide sample, to assess gaps and opportunities. The American College of Surgeons National Surgical Quality Improvement Program database was used to identify TKA (n = 337,160) cases from 2011 to 2018. The following variables were examined: preoperative hematocrit (HCT), anemia (HCT <35.5% for females and <38.5% for males), platelet count, thrombocytopenia (platelet count < 150,000/µL), international normalized ratio (INR), INR > 2.0, bleeding disorders, and preoperative and postoperative transfusions. Analyses of variance were used to examine changes in continuous variables, and Chi-squared tests were used for categorical variables. There was a substantial decrease in postoperative transfusions from a high of 18.3% in 2011 to a low of 1.0% in 2018 (p < 0.001), as well as in preoperative anemia from a high of 13.3% in 2011 to a low of 9.5% in 2016 to 2017 (p < 0.001). There were statistically significant, but clinically irrelevant, changes in the other variables examined. There was an HCT high of 41.2 in 2016 and a low of 40.4 in 2011 to 2012 (p < 0.001). There was a platelet count high of 247,400 in 2018 and a low of 242,700 in 201 (p < 0.001). There was a high incidence of thrombocytopenia of 5.2% in 2017 and a low of 4.4% in 2018 (p < 0.001). There was a high INR of 1.037 in 2011 and a low of 1.021 in 2013 (p < 0.001). There was a high incidence of INR >2.0 of 1.0% in 2012 to 2015 and a low of 0.8% in 2016 to 2018 (p = 0.027). There was a high incidence of bleeding disorders of 2.9% in 2013 and a low of 1.8% in 2017 to 2018 (p < 0.001). There was a high incidence of preoperative transfusions of 0.1% in 2011 to 2014 and a low of <0.1% in 2015 to 2018 (p = 0.021). From 2011 to 2018, there have been substantial decreases in patients receiving postoperative transfusions after primary TKA. Similarly, although a decrease in patients with anemia was seen, 1 out of 10 patients still has preoperative anemia, highlighting the opportunity to further improve and address this potentially modifiable risk factor before surgery. These findings may reflect changes during TKA patient selection, optimization, or management, and emphasize the need to further advance multimodal approaches for perioperative blood management of TKA patients. This is a Level III study.
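The cutoffs quoted in this abstract can be written out as a short sketch. The thresholds come from the text above; the function names, the sex labels, and the idea of applying them to individual records are illustrative assumptions, not the study's actual code.

```python
# Hematocrit cutoffs (percent) for anemia, as stated in the abstract.
ANEMIA_HCT_THRESHOLD = {"female": 35.5, "male": 38.5}
# Platelet cutoff (per microliter) for thrombocytopenia, as stated in the abstract.
THROMBOCYTOPENIA_PLATELETS = 150_000

def is_anemic(hct_percent, sex):
    """True when preoperative hematocrit is below the sex-specific cutoff."""
    return hct_percent < ANEMIA_HCT_THRESHOLD[sex]

def is_thrombocytopenic(platelets_per_ul):
    """True when the platelet count is below 150,000/µL."""
    return platelets_per_ul < THROMBOCYTOPENIA_PLATELETS

def percent_flagged(flags):
    """Share of flagged cases in percent, e.g. yearly anemia prevalence."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags)
```

Applied to one year's cases, `percent_flagged` would give the kind of yearly percentage the abstract reports for anemia and thrombocytopenia.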


1996 ◽  
Vol 8 (3) ◽  
pp. 133-144 ◽  
Author(s):  
María del Mar del Pozo Andrés ◽  
Jacques F A Braster

In this article we propose two research techniques that can bridge the gap between quantitative and qualitative historical research. These are: (1) a multiple regression approach that gives information about general patterns between numerical variables and the selection of outliers for qualitative analysis; (2) a homogeneity analysis with alternating least squares that results in a two-dimensional picture in which the relationships between categorical variables are graphically presented.

