Item Parameter Recovery, Standard Error Estimates, and Fit Statistics of the Winsteps Program for the Family of Rasch Models

2005 ◽  
Vol 65 (3) ◽  
pp. 376-404 ◽  
Author(s):  
Wen-Chung Wang ◽  
Cheng-Te Chen


1987 ◽
Vol 65 (3) ◽  
pp. 691-707 ◽  
Author(s):  
A. F. L. Nemec ◽  
R. O. Brinkhurst

A data matrix of 23 generic or subgeneric taxa versus 24 characters, and a shorter matrix of 15 characters, were analyzed by means of ordination, cluster analysis, parsimony, and compatibility methods (the last two being phylogenetic tree-reconstruction methods), and the results were compared with one another and with traditional classifications. Various measures of fit were employed to evaluate the parsimony methods. There were few compatible characters in the data set, and much homoplasy, but most analyses separated a group based on Stylaria from the rest of the family, which could then be divided into four groups, recognized here for the first time as tribes (Naidini, Derini, Pristinini, and Chaetogastrini). Results within these groups were less consistent. The modern methods produced results that do not conflict with traditional groupings. The Jaccard coefficient minimizes the weight given to symplesiomorphy, and complete linkage avoids the chaining effects of single linkage and corresponds more closely to actual similarities than average linkage. Ordination complements cluster analysis. The Wagner parsimony method was superior to the less flexible Camin–Sokal approach and produced better measures of fit. All of the aforementioned methods involve subjective decisions at certain points; nevertheless, they require complete disclosure of both the methods used and the assumptions made, and they facilitate objective hypothesis testing rather than the presentation of conflicting phylogenies based on the different, undisclosed premises of manual approaches.
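The two methodological choices the abstract singles out, the Jaccard coefficient and complete-linkage clustering, are standard enough to sketch. The following is a minimal illustration on a hypothetical binary character matrix (not the authors' data or code), using scipy:

```python
# Minimal sketch (hypothetical data): Jaccard distances on a binary character
# matrix, followed by complete-linkage hierarchical clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical matrix: 5 taxa (rows) x 8 binary characters (columns).
X = np.random.default_rng(0).integers(0, 2, size=(5, 8)).astype(bool)

# Jaccard distance = 1 - |A & B| / |A | B|; joint absences (0/0 matches) do
# not count, which is why the coefficient downweights symplesiomorphy.
d = pdist(X, metric="jaccard")

# Complete linkage merges clusters by their farthest pair of members,
# avoiding the chaining behaviour of single linkage noted in the abstract.
Z = linkage(d, method="complete")
print(fcluster(Z, t=2, criterion="maxclust"))  # two-group cut of the dendrogram
```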


2020 ◽  
Vol 24 (1) ◽  
Author(s):  
Bahrul Hayat ◽  
Muhammad Dwirifqi Kharisma Putra ◽  
Bambang Suryadi

The Rasch model has a long history of application in the social and behavioral sciences, including educational measurement. The Rasch model can be viewed as a special case of item response theory (IRT), and IRT models are in turn equivalent to item factor analysis (IFA) models, themselves a special case of structural equation models (SEM); there is, however, another tradition that regards Rasch measurement models as belonging to neither framework. In this study, simulated data were used to examine the interrelationships among the Rasch model as a constrained version of the two-parameter logistic (2-PL) IRT model, the Rasch model as an item factor analysis model, and the Rasch measurement model, estimated with the Mplus, IRTPRO, and WINSTEPS programs, each of which comes from its own tradition. The results indicate that the Rasch model and IFA as a special case of SEM are mathematically equivalent, as is the Rasch measurement model, but because of differing philosophical perspectives people may understand this equivalence differently. Given these findings, it is hoped that confusion and misunderstanding among the three traditions can be overcome.
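For readers coming from only one of these traditions, the constraint the abstract refers to can be written out directly. This is the standard formulation (not reproduced from the article): the 2-PL model estimates a discrimination parameter per item, and the Rasch model is the special case in which every discrimination is fixed to a common value.

```latex
% 2-PL IRT: item j has discrimination a_j and difficulty b_j
P(X_{ij} = 1 \mid \theta_i) = \frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}}

% Rasch model: the constrained case a_j = 1 for all items
P(X_{ij} = 1 \mid \theta_i) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)}
```

The IFA/SEM connection is analogous: a one-factor model for the binary items with equal loadings specifies the same response probabilities up to the choice of link function, which is why the three programs, each fitting its own parameterization, can agree numerically while their users describe the model differently.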


Econometrica ◽  
2021 ◽  
Vol 89 (4) ◽  
pp. 1963-1977 ◽  
Author(s):  
Jinyong Hahn ◽  
Zhipeng Liao

Asymptotic justification of the bootstrap often takes the form of weak convergence of the bootstrap distribution to some limit distribution. The theoretical literature recognizes that weak convergence does not imply consistency of the bootstrap second moment or the bootstrap variance as an estimator of the asymptotic variance, but this concern is not always reflected in applied practice. We bridge the gap between theory and practice by showing that the common bootstrap-based standard error in fact leads to potentially conservative inference.
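For concreteness, the practice at issue is summarizing the bootstrap distribution by its standard deviation. A minimal sketch on simulated data (our example, not the paper's setting):

```python
# Minimal sketch: estimating a standard error by the standard deviation of
# bootstrap replicates of the statistic (here, the sample mean).
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200)   # hypothetical i.i.d. sample
B = 2000                       # number of bootstrap resamples

reps = np.empty(B)
for b in range(B):
    xb = rng.choice(x, size=x.size, replace=True)  # resample with replacement
    reps[b] = xb.mean()                            # replicate of the statistic

se_boot = reps.std(ddof=1)                 # bootstrap standard error
se_asym = x.std(ddof=1) / np.sqrt(x.size)  # analytic counterpart for the mean
print(se_boot, se_asym)
```

The paper's contribution concerns the gap between weak convergence of the replicate distribution and convergence of this second-moment summary of it; the result is that the mismatch leads to inference that is at worst conservative.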


2021 ◽  
pp. 107699862199436
Author(s):  
Yue Liu ◽  
Hongyun Liu

The prevalence and serious consequences of noneffortful responses from unmotivated examinees are well known in educational measurement. In this study, we propose an iterative purification process based on a response-time residual method with fixed item parameter estimates to detect noneffortful responses. The proposed method is compared with the traditional residual method and with a noniterative method with fixed item parameters in two simulation studies, in terms of noneffort detection accuracy and parameter recovery. The results show that when the severity of noneffort is high, the proposed method leads to a much higher true positive rate with only a small increase in the false discovery rate. In addition, parameter estimation is significantly improved by the strategies of fixing item parameters and iterative cleansing. These results suggest that the proposed method is a potential solution for reducing the impact of data contamination due to severely low test-taking effort and for obtaining more accurate parameter estimates. An empirical study is also conducted to show the differences in detection rates and parameter estimates among the different approaches.
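The following is a schematic sketch of the purification loop described above, not the authors' implementation: estimate a simple log response-time model, flag responses whose times are unexpectedly short, re-estimate on the cleansed data, and iterate until the flag set stabilizes. The function name, the use of item means in place of a fitted response-time model, and the normal-residual flagging rule are all our simplifications.

```python
# Schematic sketch of iterative purification via response-time residuals.
import numpy as np

def purify(log_rt, z_cut=-1.96, max_iter=20):
    """log_rt: persons x items matrix of log response times (hypothetical data)."""
    flagged = np.zeros(log_rt.shape, dtype=bool)
    for _ in range(max_iter):
        kept = np.where(~flagged, log_rt, np.nan)
        # Item-level RT parameters estimated only from responses currently
        # judged effortful -- the "fixed item parameter" idea in miniature.
        mu = np.nanmean(kept, axis=0)
        sd = np.nanstd(kept, axis=0)
        z = (log_rt - mu) / sd           # standardized response-time residuals
        new_flagged = z < z_cut          # unexpectedly fast => flag as noneffortful
        if np.array_equal(new_flagged, flagged):
            break                        # flag set stabilized: purification done
        flagged = new_flagged
    return flagged
```

A call such as `purify(np.log(rt_matrix))` returns a boolean mask of flagged responses; in the actual study the residuals come from a fitted response-time model rather than simple item means.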


2015 ◽  
Vol 26 (4) ◽  
pp. 1802-1823 ◽  
Author(s):  
Elizabeth H Payne ◽  
James W Hardin ◽  
Leonard E Egede ◽  
Viswanathan Ramakrishnan ◽  
Anbesaw Selassie ◽  
...  

Overdispersion is a common problem in count data. It can occur due to extra population heterogeneity, omission of key predictors, or outliers. Unless properly handled, it can lead to invalid inference. Our goal is to assess the differential performance of methods for dealing with overdispersion from several sources. We considered six approaches: unadjusted Poisson regression (Poisson), deviance-scale-adjusted Poisson regression (DS-Poisson), Pearson-scale-adjusted Poisson regression (PS-Poisson), negative-binomial regression (NB), and two generalized linear mixed models (GLMM) with a random intercept, log link, and Poisson (Poisson-GLMM) or negative-binomial (NB-GLMM) distribution. To rank the models, we used Akaike and Bayesian information criterion (AIC/BIC) values, standard errors, and 95% confidence-interval coverage of the parameter values. To compare the methods, we used simulated count data with overdispersion of differing magnitudes from three different sources, with the mean of the count response associated with three predictors. Data from two real case studies were also analyzed. The simulation results showed that NB and NB-GLMM were preferred for dealing with overdispersion from any of the sources we considered. Poisson and DS-Poisson often produced smaller standard-error estimates than expected, while PS-Poisson conversely produced larger ones. It is therefore good practice to compare several model options to determine the best method of modeling count data.
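Four of the six approaches can be illustrated directly with statsmodels; the two GLMMs need a mixed-model fitter and are omitted here. A minimal sketch on simulated overdispersed counts (our example, not the study's design):

```python
# Minimal sketch: Poisson, deviance-scaled Poisson, Pearson-scaled Poisson,
# and negative-binomial GLMs fit to simulated overdispersed counts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_obs = 500
X = sm.add_constant(rng.standard_normal((n_obs, 3)))  # three predictors
mu = np.exp(X @ np.array([0.5, 0.3, -0.2, 0.1]))      # log-linear mean
y = rng.negative_binomial(2, 2.0 / (2.0 + mu))        # overdispersed counts, E[y] = mu

fits = {
    "Poisson":    sm.GLM(y, X, family=sm.families.Poisson()).fit(),
    "DS-Poisson": sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="dev"),
    "PS-Poisson": sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2"),
    "NB":         sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit(),
}
for name, m in fits.items():
    print(f"{name:>10}: SE = {np.round(m.bse, 3)}, AIC = {m.aic:.1f}")
```

Under overdispersion, the unadjusted Poisson standard errors are generally too small, and the Pearson-scale adjustment inflates them, which is the pattern the study reports.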


2014 ◽  
Vol 30 (3) ◽  
pp. 521-532 ◽  
Author(s):  
Phillip S. Kott ◽  
C. Daniel Day

This article describes a two-step calibration-weighting scheme for a stratified simple random sample of hospital emergency departments. The first step adjusts for unit nonresponse; the second increases the statistical efficiency of most estimators of interest. Both use a measure of emergency-department size and other useful auxiliary variables contained in the sampling frame. Although many survey variables are roughly a linear function of the measure of size, response is better modeled as a function of the log of that measure. Consequently, the log of size is a calibration variable in the nonresponse-adjustment step, while the measure of size itself is a calibration variable in the second calibration step. Nonlinear calibration procedures are employed in both steps. We show with 2010 DAWN data that estimating variances as if a one-step calibration-weighting routine had been used when there were in fact two steps can, after appropriately adjusting the finite-population correction, produce standard-error estimates that tend to be slightly conservative.
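The mechanics of a single calibration step can be sketched compactly. The following shows the simpler linear (GREG-type) form, not the nonlinear procedures the article actually employs, on hypothetical size data; the point is only how design weights are adjusted so that weighted auxiliary totals hit known frame totals:

```python
# Minimal sketch of linear calibration weighting: solve for lambda in the
# GREG form w_i = d_i * (1 + x_i' lambda) so that sum(w_i * x_i) = T.
import numpy as np

def calibrate(d, X, T):
    """d: design weights (n,); X: auxiliaries (n, p); T: known frame totals (p,)."""
    A = (X * d[:, None]).T @ X              # sum of d_i * x_i x_i'
    lam = np.linalg.solve(A, T - d @ X)     # calibration equations
    return d * (1.0 + X @ lam)

rng = np.random.default_rng(3)
n = 100
size = rng.lognormal(mean=2.0, sigma=1.0, size=n)  # hypothetical ED size measure
X = np.column_stack([np.ones(n), np.log(size)])    # log of size as a calibration variable
d = np.full(n, 10.0)                               # hypothetical design weights
T = np.array([1500.0, 1.1 * (d @ np.log(size))])   # hypothetical frame totals
w = calibrate(d, X, T)
print(w @ X - T)  # calibration constraints hold (approximately zero)
```

In the article's scheme this kind of adjustment happens twice, with log(size) among the calibration variables in the nonresponse step and size itself in the efficiency step, which is why treating the two steps as one distorts the variance estimate.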

