rasch models
Recently Published Documents

TOTAL DOCUMENTS: 169 (five years: 35)
H-INDEX: 27 (five years: 2)

2021 ◽  
Author(s):  
Conrad J. Harrison ◽  
Bao Sheng Loe ◽  
Inge Apon ◽  
Chris J. Sidey-Gibbons ◽  
Marc C. Swan ◽  
...  

BACKGROUND There are two philosophical approaches to contemporary psychometrics: Rasch measurement theory (RMT) and item response theory (IRT). Either measurement strategy can be applied to computerized adaptive testing (CAT). There are potential benefits of IRT over RMT with regard to measurement precision, but also potential risks to measurement generalizability. RMT CAT assessments have demonstrated good performance with the CLEFT-Q, a patient-reported outcome measure for use in orofacial clefting. OBJECTIVE To test whether the post-hoc application of IRT (graded response models, GRMs, and multidimensional GRMs) to RMT-validated CLEFT-Q appearance scales could improve CAT accuracy at given assessment lengths. METHODS Partial credit Rasch models, unidimensional GRMs and a multidimensional GRM were calibrated for each of the 7 CLEFT-Q appearance scales (which measure the appearance of the face, jaw, teeth, nose, nostrils, cleft lip scar and lips) using data from the CLEFT-Q field test. A second, simulated dataset was generated with 1000 plausible response sets to each scale. Rasch and GRM scores were calculated for each simulated response set, scaled to 0-100 scores, and compared by Pearson's correlation coefficient, root mean square error (RMSE), mean absolute error (MAE) and 95% limits of agreement. For the face, teeth and jaw scales, we repeated this in an independent, real patient dataset. We then used the simulated data to compare the performance of a range of fixed-length CAT assessments that were generated with partial credit Rasch models, unidimensional GRMs and the multidimensional GRM. Median standard error of measurement (SEM) was recorded for each assessment. CAT scores were scaled to 0-100 and compared to linear assessment Rasch scores with RMSE, MAE and 95% limits of agreement.
This was repeated in the independent, real patient dataset with the RMT and unidimensional GRM CAT assessments for the face, teeth and jaw scales to test the generalizability of our simulated data analysis. RESULTS Linear assessment scores generated by Rasch models and unidimensional GRMs showed close agreement, with RMSE ranging from 2.2 to 6.1, and MAE ranging from 1.5 to 4.9 in the simulated dataset. These findings were closely reproduced in the real patient dataset. Unidimensional GRM CAT algorithms achieved lower median SEM than Rasch counterparts, but reproduced linear assessment scores with very similar accuracy (RMSE, MAE and 95% limits of agreement). The multidimensional GRM had poorer accuracy than the unidimensional models at comparable assessment lengths. CONCLUSIONS Partial credit Rasch models and GRMs produce very similar CAT scores. GRM CAT assessments achieve a lower SEM, but this does not translate into better accuracy. Commonly used SEM heuristics for target measurement reliability should not be generalized across CAT assessments built with different psychometric models. In this study, a relatively parsimonious multidimensional GRM CAT algorithm performed more poorly than unidimensional GRM comparators.
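The agreement statistics used in this study (RMSE, MAE, and Bland-Altman 95% limits of agreement between two scorings of the same responses) can be sketched in a few lines. The scores below are synthetic placeholders, not CLEFT-Q data; the small noise term simply stands in for the Rasch-GRM scoring difference:

```python
import numpy as np

# Hypothetical 0-100 scores: the same 1000 simulated response sets scored
# two ways (e.g., by a partial credit Rasch model and by a unidimensional GRM).
rng = np.random.default_rng(0)
rasch = rng.uniform(0, 100, 1000)
grm = rasch + rng.normal(0, 3, 1000)  # close agreement with small noise

diff = grm - rasch
rmse = np.sqrt(np.mean(diff ** 2))          # root mean square error
mae = np.mean(np.abs(diff))                 # mean absolute error
# Bland-Altman 95% limits of agreement: mean difference +/- 1.96 * SD
loa = (diff.mean() - 1.96 * diff.std(ddof=1),
       diff.mean() + 1.96 * diff.std(ddof=1))
r = np.corrcoef(rasch, grm)[0, 1]           # Pearson correlation
```

MAE is never larger than RMSE, and the gap between them grows when a few response sets disagree much more than the rest.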


2021 ◽  
Vol 50 (Supplement_1) ◽  
Author(s):  
Jessica Stanhope ◽  
Philip Weinstein

Abstract Background Pain is often measured by asking people to rate their pain intensity at its worst, on average, and at its least over the last 7 days using numeric rating scales. The three ratings are summed to produce a composite measure. The validity and reliability of this composite measure have not been examined using modern psychometric methods in any population. We examined the validity and reliability of this pain intensity measure for use with professional musicians, university music and science students, and university staff, all of whom had reported experiencing musculoskeletal symptoms in the last 7 days. Methods Data were collected using a questionnaire survey. The validity and reliability of the composite pain measure were examined using Rasch analysis. Differential item functioning was examined for age, gender, student status, musician status, and socioeconomic status. Results While the data fit one of the Rasch models after several response categories were collapsed, differential item functioning was present. No solution was found that fit one of the Rasch models without differential item functioning. Conclusions Despite the recommendation that the three numeric rating scales for pain be combined, Rasch analysis showed that this was not a valid approach for our study population. Our findings highlight the importance of using Rasch analysis to examine the utility of measures. Key messages Rasch analysis is a useful method for investigating the validity and reliability of scales. Combining pain ratings cannot be assumed to produce a valid and reliable measure.
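The category collapsing mentioned in the study can be illustrated with the partial credit model, the polytomous Rasch model typically fitted to rating-scale items. This is a generic sketch with made-up threshold values, not the study's actual analysis; disordered thresholds (a later threshold lower than an earlier one, as with the second value below) are a common reason to collapse adjacent response categories:

```python
import numpy as np

def pcm_category_probs(theta, thresholds):
    # Partial credit model: probability of each response category 0..m for
    # an item with Rasch-Andrich thresholds delta_1..delta_m. Category k's
    # log-numerator is sum over j<=k of (theta - delta_j); category 0 gives 0.
    cum = np.concatenate(([0.0], np.cumsum(theta - np.asarray(thresholds))))
    num = np.exp(cum)
    return num / num.sum()

# Hypothetical 5-category item; the thresholds are disordered (-1.5 < -1.0),
# suggesting the first two categories are candidates for collapsing.
probs = pcm_category_probs(theta=0.5, thresholds=[-1.0, -1.5, 0.5, 1.0])
```

The probabilities always sum to 1; plotting them against theta shows whether each category is ever the most probable response, which is the usual diagnostic before collapsing.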


Author(s):  
Dong Gi Seo ◽  
Jae Kum Kim

Purpose: Diagnostic classification models (DCMs) were developed to identify the mastery or non-mastery of the attributes required for solving test items, but their application has been limited to very low-level attributes, and the accuracy and consistency of high-level attributes using DCMs have rarely been reported compared with classical test theory (CTT) and item response theory models. This paper compared the accuracy of high-level attribute mastery between deterministic inputs, noisy “and” gate (DINA) and Rasch models, along with sub-scores based on CTT. Methods: First, a simulation study explored the effects of attribute length (number of items per attribute) and the correlations among attributes with respect to the accuracy of mastery. Second, a real-data study examined model and item fit and investigated the consistency of mastery for each attribute among the 3 models using the 2017 Korean Medical Licensing Examination with 360 items. Results: Accuracy of mastery increased with a higher number of items measuring each attribute across all conditions. The DINA model was more accurate than the CTT and Rasch models for attributes with high correlations (>0.5) and few items. In the real-data analysis, the DINA and Rasch models generally showed better item fits and appropriate model fit. The consistency of mastery between the Rasch and DINA models ranged from 0.541 to 0.633, and the correlations of person attribute scores between the Rasch and DINA models ranged from 0.579 to 0.786. Conclusion: Although all 3 models provide a mastery decision for each examinee, the individual mastery profile using the DINA model provides more accurate decisions for attributes with high correlations than the CTT and Rasch models. The DINA model can also be directly applied to tests with complex structures, unlike the CTT and Rasch models, and it provides different diagnostic information from the CTT and Rasch models.
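The DINA response rule the study compares against Rasch scoring is compact: an examinee answers correctly with probability 1 - slip if they have mastered every attribute the item requires, and with the guessing probability otherwise. A minimal sketch, with illustrative slip and guess values rather than estimates from any real calibration:

```python
import numpy as np

def dina_prob(alpha, q_row, slip, guess):
    # P(correct) under the DINA model for one examinee and one item.
    # alpha: 0/1 mastery vector over attributes
    # q_row: 0/1 Q-matrix row (attributes the item requires)
    # slip, guess: item slip and guessing parameters
    eta = int(np.all(alpha >= q_row))  # 1 iff all required attributes mastered
    return (1 - slip) ** eta * guess ** (1 - eta)

# Hypothetical item requiring attributes 1 and 3, with slip=0.1, guess=0.2.
q_row = np.array([1, 0, 1])
master = dina_prob(np.array([1, 1, 1]), q_row, 0.1, 0.2)      # 0.9
non_master = dina_prob(np.array([1, 0, 0]), q_row, 0.1, 0.2)  # 0.2
```

The conjunctive ("and" gate) step is what distinguishes DINA from the Rasch model: missing any one required attribute drops the success probability all the way to the guessing rate.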


2021 ◽  
Vol 13 (8) ◽  
pp. 4346
Author(s):  
Luca Iseppi ◽  
Marcella Rizzo ◽  
Enrico Gori ◽  
Federico Nassivera ◽  
Ivana Bassi ◽  
...  

The Food and Agriculture Organization of the United Nations supports the production of edible insects as a promising and sustainable source of nutrients to meet the increasing demand for animal-derived products by the growing world population. Even though insects are part of the diet of more than two billion people worldwide, the practice of eating insects (entomophagy) raises challenging questions for Western countries where this is not a habit. The research applied Rasch models and showed that, in the case of hunger or need, 70.8% of the sample declared that they would be willing to eat insects. The willingness to habitually consume and pay for insect food is very low, but the percentages are higher than the share of respondents who had actually tasted insects. This demonstrates the need for communication aimed at overcoming psychological and cultural barriers. Only in this way will it be possible to increase the propensity to consume insects.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Masoud Geramipour

Abstract Rasch testlet and bifactor models are two measurement models that can deal with local item dependency (LID) when assessing the dimensionality of reading comprehension testlets. This study aimed to apply these measurement models to real item response data from Iranian EFL reading comprehension tests and to compare the validity of the bifactor models and corresponding item parameters with unidimensional and multidimensional Rasch models. The data were collected from the EFL reading comprehension section of the Iranian national university entrance examinations from 2016 to 2018. Various advanced packages of the R system were employed to fit the unidimensional, multidimensional, and testlet Rasch models and the exploratory and confirmatory bifactor models. Item parameters were then estimated and testlet effects identified; moreover, goodness-of-fit indices and the item parameter correlations for the different models were calculated. Results showed that the testlet effects were small but non-negligible for all of the EFL reading testlets. Moreover, the bifactor models were superior in terms of goodness of fit, whereas the exploratory bifactor model better explained the factor structure of the EFL reading comprehension tests. However, item difficulty parameters in the Rasch models were more consistent than those in the bifactor models. This study has substantial implications for methods of dealing with LID and dimensionality in assessing EFL reading comprehension.
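Goodness-of-fit comparisons like the one above between Rasch, testlet, and bifactor models typically rest on information criteria computed from each model's maximized log-likelihood. A minimal sketch with illustrative numbers (not values from this study):

```python
import math

def aic_bic(loglik, n_params, n_obs):
    # Akaike and Bayesian information criteria for a fitted model:
    # lower values indicate better fit after penalizing complexity.
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    return aic, bic

# Illustrative only: a log-likelihood of -1000 for a model with
# 10 free parameters fitted to 500 examinees.
aic, bic = aic_bic(-1000.0, 10, 500)
```

Because a bifactor model adds a loading per item on the general factor, BIC's log(n) penalty makes it the stricter of the two criteria when sample sizes are large.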


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Jessica Malmqvist ◽  
Volkert Siersma ◽  
Christine Winther Bang ◽  
John Brodersen

Abstract Background Harms of colorectal cancer (CRC) screening include psychosocial consequences. We have not identified studies using a participant-relevant questionnaire with adequate measurement properties to investigate these harms. However, Brodersen et al. have previously developed a core questionnaire, Consequences of Screening (COS), for use in screening for life-threatening diseases. Therefore, the objectives were: (1) to investigate the content validity of COS in a CRC screening setting; in case of gaps in content coverage, (2) to generate new items and themes; and (3) to test the possibly extended version of COS for dimensionality and differential item functioning (DIF) using Rasch models. Methods We performed two-part focus groups with CRC screenees. Screenees were recruited by strategic sampling. In the first part, 16 screenees with false-positive results (n = 7) and low-risk polyps (n = 9) were interviewed about their CRC screening experiences, and in the second part COS was examined for content validity. When new information emerged in the focus groups, new items covering this topic were generated. These new items were then tested, together with COS, in the subsequent interviews. A random subsample (n = 410) from a longitudinal questionnaire study, not yet published, was used to form the data for this paper. We analysed multidimensionality and uniform DIF with Andersen's conditional likelihood ratio test. We assessed individual item fit to the model. We also analysed local dependence (LD) and DIF by partial gamma coefficients using Rasch models. Results COS was found relevant in a CRC screening setting. However, new information was discovered in the focus groups, covered by 18 new CRC screening-specific items. The Rasch analyses revealed only minor problems in the COS scales. The 18 new items were distributed across four new CRC screening-specific dimensions and one single item.
Conclusion An extended version of COS specifically for use in a CRC screening setting has been developed. The extended part encompasses four new scales and one new single item. The original COS with the CRC screening-specific extension is called Consequences of Screening in Colorectal Cancer (COS-CRC). COS-CRC possessed reliability, unidimensionality and invariant measurement.


2021 ◽  
Vol 14 (1) ◽  
pp. 105-120
Author(s):  
Ramdani Ramdani ◽  
Fattah Hanurawan ◽  
M. Ramli ◽  
Blasius Boli Lasan ◽  
...  

Author(s):  
Daiana Colledani ◽  
Pasquale Anselmi ◽  
Egidio Robusto
