On the Generalized S−X²–Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization

2021 ◽  
pp. 107699862110503
Author(s):  
Jochen Ranger ◽  
Kay Brauer

The generalized S−X²–test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S−X²–test depends on how sparse cells are pooled. We propose alternative implementations of the test within the framework of limited information testing. We derive the distribution of the S−X²–residuals that can be used for post hoc analyses. We suggest a diagnostic plot that visualizes the form of the misfit. The performance of the alternative implementations is investigated in a simulation study. The simulation study suggests that the alternative implementations control the Type-I error rate well and have high power. An empirical application concludes this article.
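The comparison of observed and expected response counts across score strata can be illustrated with a generic Pearson-type sum. This is only a minimal sketch under stated assumptions: the function name `pearson_item_fit` and the example counts are hypothetical, the expected counts are taken as given rather than derived from an item response model, and neither the pooling of sparse cells nor the degrees of freedom discussed in the abstract are handled.

```python
def pearson_item_fit(observed, expected):
    """Generic Pearson-type fit statistic for one item.

    observed, expected: lists of rows, one row per test-score stratum,
    each row holding the counts for the item's response categories.
    Assumption: every expected count is strictly positive.
    """
    stat = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            # Squared deviation scaled by the expected count
            stat += (o - e) ** 2 / e
    return stat


# Hypothetical counts: two score strata, two response categories
observed = [[10, 20], [30, 40]]
expected = [[12.0, 18.0], [28.0, 42.0]]
statistic = pearson_item_fit(observed, expected)
```

Large values of such a sum indicate that the observed counts deviate from what the model predicts within the score strata.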

Methodology ◽  
2015 ◽  
Vol 11 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Jochen Ranger ◽  
Jörg-Tobias Kuhn

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982). This test can be interpreted as a test of trait stability during the measurement situation. The test statistic approximately follows a χ²-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres to the nominal Type-I error rate well, although it tends to be conservative in very short scales. The power of the test is compared to the power of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to the power of the alternative tests. Advantages and areas of application of the information matrix test are discussed.


2014 ◽  
Vol 53 (05) ◽  
pp. 343-343

We have to report marginal changes in the empirical type I error rates for the cut-offs 2/3 and 4/7 of Table 4, Table 5 and Table 6 of the paper “Influence of Selection Bias on the Test Decision – A Simulation Study” by M. Tamm, E. Cramer, L. N. Kennes, N. Heussen (Methods Inf Med 2012; 51: 138–143). In a small number of cases, the way SAS represents numeric values resulted in wrong categorization due to a numeric representation error in differences. We corrected the simulation by using the round function of SAS in the calculation process with the same seeds as before. For Table 4 the value for the cut-off 2/3 changes from 0.180323 to 0.153494. For Table 5 the value for the cut-off 4/7 changes from 0.144729 to 0.139626 and the value for the cut-off 2/3 changes from 0.114885 to 0.101773. For Table 6 the value for the cut-off 4/7 changes from 0.125528 to 0.122144 and the value for the cut-off 2/3 changes from 0.099488 to 0.090828. The sentence on p. 141 “E.g. for block size 4 and q = 2/3 the type I error rate is 18% (Table 4).” has to be replaced by “E.g. for block size 4 and q = 2/3 the type I error rate is 15.3% (Table 4).”. All changes were minor (smaller than 0.03). These changes do not affect the interpretation of the results or our recommendations.
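The kind of representation error described here can be reproduced in any language that uses binary floating-point arithmetic. The sketch below uses Python rather than SAS and does not reproduce the authors' simulation code; the function names and the choice of ten decimal places are assumptions made for illustration.

```python
def exceeds_cutoff_naive(value, cutoff):
    # Direct comparison: sensitive to binary representation error
    return value > cutoff


def exceeds_cutoff_rounded(value, cutoff, ndigits=10):
    # Round both sides before comparing, so that values differing only
    # by representation error fall into the same category
    return round(value, ndigits) > round(cutoff, ndigits)


# Mathematically, 1 - 1/3 equals 2/3, so it should not exceed the cut-off.
difference = 1 - 1 / 3
cutoff = 2 / 3

naive_result = exceeds_cutoff_naive(difference, cutoff)
rounded_result = exceeds_cutoff_rounded(difference, cutoff)
```

With double-precision arithmetic, the naive comparison classifies the difference as exceeding the cut-off, while the rounded comparison does not.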


2019 ◽  
Vol 44 (3) ◽  
pp. 167-181 ◽  
Author(s):  
Wenchao Ma

Limited-information fit measures appear to be promising in assessing the goodness-of-fit of dichotomous response cognitive diagnosis models (CDMs), but their performance has not been examined for polytomous response CDMs. This study investigates the performance of the M_ord statistic and standardized root mean square residual (SRMSR) for an ordinal response CDM, the sequential generalized deterministic inputs, noisy “and” gate model. Simulation studies showed that the M_ord statistic had well-calibrated Type I error rates, but the correct detection rates were influenced by various factors such as item quality, sample size, and the number of response categories. In addition, the SRMSR was also influenced by many factors and the common practice of comparing the SRMSR against a prespecified cut-off (e.g., .05) may not be appropriate. A set of real data was analyzed as well to illustrate the use of the M_ord statistic and SRMSR in practice.


Author(s):  
Georg Krammer

The Andersen LRT uses sample characteristics as split criteria to evaluate Rasch model fit or to conduct theory-driven hypothesis testing for a test. The power and Type I error of a random split criterion were evaluated in a simulation study. Results consistently show that a random split criterion lacks power.


2020 ◽  
Vol 7 (1) ◽  
pp. 1-6
Author(s):  
João Pedro Nunes ◽  
Giovanna F. Frigoli

The online support of IBM SPSS proposes that users alter the syntax when performing post-hoc analyses for interaction effects of ANOVA tests. Other authors also suggest altering the syntax when performing GEE analyses. Doing so also alters the number of possible comparisons (the k value) and therefore influences the results of statistical tests whose formulas include k, such as repeated measures ANOVA and the Bonferroni post-hoc tests of ANOVA and GEE. This alteration also inflates the type I error, producing erroneous results and potentially leading to misinterpretation of the data. The purpose of this paper is therefore to report the misuse and improper handling of syntax for ANOVA and GEE post-hoc analyses in SPSS and to illustrate its consequences for statistical results and data interpretation.
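The dependence of a Bonferroni-adjusted result on k can be shown with the standard adjustment, in which each p-value is multiplied by the number of comparisons and capped at 1. This is a minimal sketch in Python, not SPSS syntax; the function name and the p-values are hypothetical.

```python
def bonferroni_adjust(p_values, k=None):
    """Bonferroni-adjusted p-values: min(1, k * p).

    k defaults to the number of p-values supplied; passing a different k
    mimics a situation in which the software counts a different number
    of comparisons.
    """
    if k is None:
        k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical unadjusted p-values from three pairwise comparisons
raw = [0.02, 0.04, 0.30]

adjusted_k3 = bonferroni_adjust(raw)       # k = 3 comparisons
adjusted_k1 = bonferroni_adjust(raw, k=1)  # k understated as 1
```

With k = 3, the first comparison is no longer significant at the .05 level (adjusted p of about .06), whereas with k understated as 1 it remains at .02, which is how a changed k can inflate the type I error.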


2012 ◽  
Author(s):  
Nor Haniza Sarmin ◽  
Md Hanafiah Md Zin ◽  
Rasidah Hussin

A transformation of the mean was carried out using a bias correction estimator to produce a statistic for testing hypotheses about the mean of skewed distributions. The resulting statistic involves a modification of the variable. A simulation study of the Type I error probability for several skewed distributions, namely the exponential, chi-square and Weibull distributions, shows that the statistic t3 is suitable for the left-tailed test and a small sample size (n=5). Key words: Mean; statistic; skewed distribution; bias correction estimator; Type I Error


2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Ming-Wen An ◽  
Xin Lu ◽  
Daniel J. Sargent ◽  
Sumithra J. Mandrekar

Background. A phase II design with an option for direct assignment (stop randomization and assign all patients to experimental treatment based on interim analysis, IA) for a predefined subgroup was previously proposed. Here, we illustrate the modularity of the direct assignment option by applying it to the setting of two predefined subgroups and testing for separate subgroup main effects. Methods. We power the 2-subgroup direct assignment option design with 1 IA (DAD-1) to test for separate subgroup main effects, with assessment of power to detect an interaction in a post-hoc test. Simulations assessed the statistical properties of this design compared to the 2-subgroup balanced randomized design with 1 IA (BRD-1). Different response rates for treatment/control in subgroup 1 (0.4/0.2) and in subgroup 2 (0.1/0.2, 0.4/0.2) were considered. Results. The 2-subgroup DAD-1 preserves power and type I error rate compared to the 2-subgroup BRD-1, while exhibiting reasonable power in a post-hoc test for interaction. Conclusion. The direct assignment option is a flexible design component that can be incorporated into broader design frameworks, while maintaining desirable statistical properties, clinical appeal, and logistical simplicity.


2009 ◽  
Vol 17 (3) ◽  
pp. 236-260 ◽  
Author(s):  
Kishore Gawande ◽  
Hui Li

Endogeneity of explanatory variables is now receiving the concern it deserves in the empirical political science literature. Instrumental variables (IVs) estimators, such as two-stage least squares (2SLS), are the primary means for tackling this problem. These estimators solve the endogeneity problem by “instrumenting” the endogenous regressors using exogenous variables (the instruments). In many applications, a problem that the IV approach must overcome is that of weak instruments (WIs), where the instruments only weakly identify the regression coefficients of interest. With WIs, the infinite-sample properties (e.g., consistency) used to justify the use of estimators like 2SLS are on thin ground because these estimators have poor small-sample properties. Specifically, they may suffer from excessive bias and/or Type I error. We highlight the WI problem in the context of empirical testing of the “protection for sale” model, which predicts the cross-sectional pattern of trade protection as a function of political organization, imports and output. These variables are endogenous. Importantly, the instruments used to solve the endogeneity problem are weak. A method better suited to exact inference with WIs is the limited information maximum likelihood (LIML) estimator. Censoring in the dependent variable in the application requires a nonlinear Tobit LIML estimator.


2016 ◽  
Vol 77 (3) ◽  
pp. 415-428 ◽  
Author(s):  
David R. J. Fikis ◽  
T. C. Oshima

Purification of the test has been a well-accepted procedure in enhancing the performance of tests for differential item functioning (DIF). As defined by Lord, purification requires reestimation of ability parameters after removing DIF items before conducting the final DIF analysis. IRTPRO 3 is a recently updated program for analyses in item response theory, with built-in DIF tests but not purification procedures. A simulation study was conducted to investigate the effect of two new methods of purification. The results suggested that one of the purification procedures showed significantly improved power and Type I error. The procedure, which can be cumbersome by hand, can be easily applied by practitioners by using the web-based program developed for this study.

