An Application of Item Response Theory to Scoring Patient Safety Culture Survey Data

Author(s):  
Heon-Jae Jeong ◽  
Hsun-Hsiang Liao ◽  
Su Ha Han ◽  
Wui-Chiang Lee

Patient safety culture is important in preventing medical errors. Thus, many instruments have been developed to measure it. Yet, few studies focus on the data processing step. This study, by analyzing the Chinese version of the Safety Attitudes Questionnaire dataset that contained 37,163 questionnaires collected in Taiwan, found critical issues related to the currently used mean scoring method: The instrument, like other popular ones, uses a 5-point Likert scale, and because it is an ordinal scale, the mean scores cannot be calculated. Instead, Item Response Theory (IRT) was applied. The construct validity was satisfactory and the item properties of the instrument were estimated from confirmatory factor analysis. The IRT-based domain scores and mean domain scores of each respondent were estimated and compared. As for resolution, the mean approach yielded only around 20 unique values on a 0 to 100 scale for each domain; the IRT method yielded at least 440 unique values. Meanwhile, IRT scores ranged widely at each unique mean score, meaning that the precision of the mean approach was less reliable. The theoretical soundness and empirical strength of IRT suggest that healthcare institutions should adopt IRT as a new scoring method, which is the core step of processing collected data.
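The resolution limit of mean scoring follows from simple counting, and a minimal sketch can make it concrete. The number of items per domain (5 here) is an assumption for illustration and is not stated in the abstract. With k items on a 5-point scale, the item sum can take only 4k + 1 distinct values, so the rescaled mean can too.

```python
from itertools import product

def unique_mean_scores(n_items, n_categories=5):
    """Return the sorted distinct mean scores, rescaled to 0-100.

    n_items is a hypothetical domain length; responses are coded 1..n_categories.
    """
    lo, hi = 1, n_categories
    scores = set()
    # Enumerate every possible response pattern for the domain.
    for responses in product(range(lo, hi + 1), repeat=n_items):
        mean = sum(responses) / n_items
        scores.add(round((mean - lo) / (hi - lo) * 100, 6))
    return sorted(scores)

for k in (4, 5, 6):
    # Prints 17, 21 and 25 distinct values, i.e. 4k + 1.
    print(k, len(unique_mean_scores(k)))
```

A 5-item domain gives 21 distinct values, which is in line with the "around 20" reported above. IRT scoring can distinguish response patterns that share the same sum, which is one plausible reason it produces many more distinct scores.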

2017 ◽  
Vol 3 (2) ◽  
pp. 152
Author(s):  
Dian Normalitasari Purnama

This study aimed to (1) describe the characteristics of the Accounting Vocational Theory trial test items using Item Response Theory and (2) determine the horizontal equating of the Accounting Vocational Theory trial exam instruments. This was explorative-descriptive research whose subjects were eleventh-grade students. The research objects were the test instruments and the responses of students from six schools selected through stratified random sampling. The data analysis employed review sheets and the BILOG program for the two-parameter logistic (2PL) Item Response Theory model. The findings were as follows. (1) The item review of test packages A and B found 37 good-quality items; the 2PL analysis showed that Package A contained 27 good items and Package B contained 24. (2) Equating with the Mean/Sigma method resulted in the equation 1.168bx + 0.270, and the Mean/Mean method resulted in the equation 0.997bx - 0.250; the Mean/Mean method stood at 0.250 and the Mean/Sigma method at 0.320.
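The two linking methods named above can be sketched as follows. This is a minimal illustration of the textbook Mean/Sigma and Mean/Mean formulas, not the study's own computation: the anchor-item parameters below are invented, and the function names are hypothetical. Both methods produce a slope A and an intercept B that place form X difficulties on the form Y scale as A * b_x + B.

```python
from statistics import mean, pstdev

def mean_sigma(b_x, b_y):
    """Mean/Sigma linking: slope from the spread of anchor-item difficulties."""
    A = pstdev(b_y) / pstdev(b_x)
    B = mean(b_y) - A * mean(b_x)
    return A, B

def mean_mean(a_x, a_y, b_x, b_y):
    """Mean/Mean linking: slope from the mean anchor-item discriminations."""
    A = mean(a_x) / mean(a_y)
    B = mean(b_y) - A * mean(b_x)
    return A, B

# Hypothetical anchor-item estimates on two forms (illustration only).
b_x = [-1.0, 0.0, 0.5, 1.5]
b_y = [-0.9, 0.3, 0.8, 2.1]
a_x = [1.1, 0.9, 1.3, 0.8]
a_y = [1.0, 0.8, 1.1, 0.7]

A_ms, B_ms = mean_sigma(b_x, b_y)
A_mm, B_mm = mean_mean(a_x, a_y, b_x, b_y)
print(f"Mean/Sigma: b* = {A_ms:.3f} * b_x + {B_ms:.3f}")
print(f"Mean/Mean:  b* = {A_mm:.3f} * b_x + {B_mm:.3f}")
```

The two methods generally give different coefficients because they take the slope from different parameters, which is consistent with the two distinct equations reported in the abstract.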


2017 ◽  
Vol 78 (5) ◽  
pp. 805-825 ◽  
Author(s):  
Dimiter M. Dimitrov

This article presents some new developments in the methodology of an approach to scoring and equating of tests with binary items, referred to as delta scoring (D-scoring), which is under piloting with large-scale assessments at the National Center for Assessment in Saudi Arabia. This presentation builds on a previous work on delta scoring and adds procedures for scaling and equating, item response function, and estimation of true values and standard errors of D scores. Also, unlike the previous work on this topic, where D-scoring involves estimates of item and person parameters in the framework of item response theory, the approach presented here does not require item response theory calibration.


2017 ◽  
Vol 2 (1) ◽  
pp. 1 ◽  
Author(s):  
Rizki Nor Amelia ◽  
Kriswantoro Kriswantoro

The first aim of this study is to describe the quality of a teacher-made chemistry test. The test was developed for 11th-grade science students in the first semester of the 2015/2016 academic year. The second aim is to describe the characteristics of the measurement results for students' ability in chemistry. This is a descriptive study based on the response patterns of 101 students to a multiple-choice test with five answer alternatives. The response patterns were collected by documentation and analyzed quantitatively using the Item Response Theory software BILOG MG V3.0 with the 1-PL, 2-PL, and 3-PL models. Differences in students' estimated ability across the 1-PL, 2-PL, and 3-PL models were analyzed using one-way repeated-measures ANOVA. The results showed that the mean item difficulty (b), item discrimination (a), and pseudo-guessing (c) parameters are good. The teacher-made test was suitable for students with ability from -1.0 to +1.7. The maximum value of the item information function is 68.83 (SEM = 0.121) at an ability of 0.2 logit. The highest ability estimates were produced by the 2-PL model. The mean ability of the 11th-grade students is -0.0185 logit, which is considered moderate.
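The parameters a, b and c and the reported SEM can be tied together with a short sketch. It assumes the standard 3-PL response function and the usual relation SEM = 1 / sqrt(information); the item parameters in the example call are hypothetical, and only the value 68.83 comes from the abstract.

```python
import math

def p_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct answer under the 3-PL model.

    a: discrimination, b: difficulty, c: pseudo-guessing, D: scaling constant
    (D = 1.0 is an assumption; some software uses 1.7).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def sem_from_information(info):
    """Standard error of measurement implied by an information value."""
    return 1.0 / math.sqrt(info)

# Hypothetical item with five answer options, so guessing is set near 1/5.
print(round(p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2), 3))  # 0.6
# Information value taken from the abstract.
print(round(sem_from_information(68.83), 3))  # 0.121
```

The second call reproduces the SEM of 0.121 quoted alongside the information maximum of 68.83.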


2021 ◽  
Vol 60 (2) ◽  
pp. 97-104
Author(s):  
Rumyana Stoyanova ◽  
Rositsa Dimova ◽  
Bianka Tornyova ◽  
Momchil Mavrov ◽  
Harieta Elkova

Abstract Introduction: A patient safety culture (PSC) is a complex phenomenon that represents an essential part of organizational culture and refers to the shared values, conceptions and beliefs which contribute to the formation and encouragement of safe behavioural models in a health organization. With this study, the authors aimed to delineate the attitude of hospital staff in Bulgaria regarding PSC and to document whether attitudes differ between physicians and other healthcare professionals (HCPs). Methods: A national cross-sectional survey among 384 HCPs was conducted using an online version of the Bulgarian version of the Hospital Survey on Patient Safety Culture (B-HSOPSC). The data were analysed with descriptive statistics and the non-parametric Mann-Whitney U and χ² tests. Results: Physicians represented 37.50% (144) of the sample and other HCPs 62.50% (240). Respondents from governmental/municipal hospitals prevailed (53.6%). The dimensions “Staffing” and “Non-punitive response to error” were the most problematic, as their positive response rates (PRRs) were the lowest. However, “Handoffs and transitions” and “Supervisor/manager expectations and actions promoting safety” showed the highest mean values among both physicians and other HCPs. Of all participants, 76.0% had never reported an adverse event or error. Conclusion: The results of the study show that all respondents demonstrate a positive attitude regarding PSC. A comparison of the mean values and of the PRRs in the dimensions did not show any group differences by type of staff position, i.e. physicians or other HCPs.
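A positive response rate of the kind reported above can be sketched as follows. The scoring rule is an assumption, since the abstract does not define it: responses are taken to be coded 1 to 5, a response of 4 or 5 counts as positive, and negatively worded items are reverse-coded first. The function name and the sample data are hypothetical.

```python
def positive_response_rate(responses, reverse_coded=False):
    """Percentage of positive answers for one item on a 1-5 scale.

    Assumes 4 and 5 are the positive categories; missing answers (None)
    are excluded from the denominator.
    """
    valid = [r for r in responses if r is not None]
    if reverse_coded:
        # Flip negatively worded items so that high always means positive.
        valid = [6 - r for r in valid]
    positive = sum(1 for r in valid if r >= 4)
    return 100.0 * positive / len(valid)

# Hypothetical answers to one positively worded item.
print(positive_response_rate([5, 4, 3, 2, 1, None]))  # 40.0
# Group shares reported in the abstract: 144 and 240 of 384 respondents.
print(144 / 384 * 100, 240 / 384 * 100)  # 37.5 62.5
```

A dimension-level PRR would then typically be an average of the item-level rates within that dimension, again as an assumption about the scoring convention.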


2021 ◽  
Author(s):  
Daniel Zarate ◽  
Joshua Marmara ◽  
Camilla Potoczny ◽  
Warwick Hosking ◽  
Vasielios Stavropoulos

Abstract Background: The present study considers a measure of positive body image, the Body Appreciation Scale-2, which assesses acceptance and/or favourable opinions towards the body (BAS-2[29]). Differential functioning of the scale across the two genders, as well as its items, has not been excluded. The present study contributes to this area of knowledge via the employment of gender Measurement Invariance (MI) and Item Response Theory (IRT) analyses. Methods: A group of 386 adults from the community were assessed (N = 394, 54.8% men, 43.1% women, M age = 27.48; SD = 5.57). Results: MI analysis observed invariance across males and females at the configural level, and non-invariance at the metric level. Further, the two-parameter logistic model employed to observe IRT properties indicated that all items demonstrated, although variable, strong discrimination capacity. Conclusions: The items showed increased reliability for latent levels of ±2 SD from the mean level of Body Appreciation. The implications and interpretations of the findings for clinical practice are discussed.


2011 ◽  
Vol 72 (3) ◽  
pp. 510-528 ◽  
Author(s):  
Louis Tay ◽  
Fritz Drasgow

Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted χ²/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted χ²/df values greater than 3 using a cross-validation data set indicate substantial misfit. The authors used simulations to examine this critical value across different test lengths (15, 30, 45) and sample sizes (500, 1,000, 1,500, 5,000). The one-, two- and three-parameter logistic models were fitted to data simulated from different logistic models, including unidimensional and multidimensional models. In general, a fixed cutoff value was insufficient to ascertain item response theory model–data fit. Consequently, the authors propose the use of the parametric bootstrap to investigate misfit and evaluated its performance. This new approach produced appropriate Type I error rates and had substantial power to detect misfit across simulated conditions. In a third study, the authors applied the parametric bootstrap approach to LSAT data to determine which dichotomous item response theory model produced the best fit. Future applications of the mean adjusted χ²/df statistic are discussed.


2019 ◽  
Vol 8 (1) ◽  
pp. 37-45
Author(s):  
Medianta Tarigan ◽  
Fadillah Fadillah

Abstract: The Wonderlic Personnel Test (WPT) is a psychological instrument that measures individual cognitive ability in terms of learning ability, understanding instructions, and solving problems. In this study, WPT items were tested using the Item Response Theory (IRT) method. There were 374 participants, and the results showed that 31 items fit the model while 19 items were misfit. According to the IRT 2PL model analysis, the mean examinee ability was -0.01 (SD=1.19). The mean difficulty (b) was 0.48 (SD=2.58) and the mean discrimination (a) was 0.62 (SD=0.38). The WPT therefore appears to contain items that do not measure a single common dimension. These statistical results are in line with the characteristics of the WPT, which is built from three abilities to measure intelligence.


Author(s):  
Ado Abdu Bichi ◽  
Hadiza Hafiz ◽  
Samira Abdullahi Bello

High-stakes testing is used to provide results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied Item Response Theory (IRT) principles to analyse the Northwest University Kano Post-UTME Economics test items. The fifty (50) economics test items developed were administered to a sample of 600 students. The data obtained were analysed using XCALIBRE 4 and SPSS v20 software to determine item parameters based on IRT models. The results indicate that the test measures a single trait, satisfying the condition of unidimensionality. Similarly, the goodness-of-fit test revealed that the two-parameter IRT model was more suitable, since no misfit item was observed, and the test reliability was 0.86. The mean examinee ability was 0.07 (SD=0.94). The mean item difficulty was -0.63 (SD=2.54) and the mean item discrimination was 0.28 (SD=0.04). 16 (33%) items were identified as “problematic” based on difficulty indices, and 35 (71%) failed to meet the set standards on the basis of discrimination parameters. It can be concluded that, using the IRT approach, the NWU Post-UTME items are not stable as far as item difficulty and discrimination indices are concerned. It is recommended that the Post-UTME items be made to pass through all processes of standardisation and validation; test development and content experts should be involved in developing and validating the test items in order to obtain valid and reliable results, which will lead to valid inferences.


2021 ◽  
Vol 11 (3) ◽  
pp. 19-35
Author(s):  
N.P. Radchikova ◽  
M.A. Odintsova

The article considers the Personal Self-Activation Inventory, which measures a psychological construct reflecting a person’s internal voluntary activity. The inventory includes three components (scales): independence, physical activation and psychological activation. Within the framework of Item Response Theory (IRT), the graded response model was applied. It is shown that all questions of the inventory have at least moderate discriminability. The graphs of the information function for each scale indicate that the measurements of the self-activation components are fairly accurate in the range from low values to values significantly higher than the mean, and only the highest values (exceeding the mean by two standard deviations or more) are not measured accurately. A moderate positive correlation between self-activation and the average grade can serve as verification of the inventory’s criterion validity. Discriminant validity estimation, which was carried out by calculating correlations with other similar constructs (self-control, personal dynamism, hardiness), showed that hardiness is the construct most similar to self-activation. Incremental validity estimation has shown that when self-activation is added to the prediction model of performance based on personal resources, the variance explained increases much more than when hardiness is added. This indicates that the self-activation construct has some incremental validity and reflects a psychological reality that is different from the construct of hardiness.
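For readers unfamiliar with the graded response model mentioned above, a minimal sketch follows. It assumes the standard logistic form in which each category boundary has its own threshold and all boundaries of an item share one discrimination; the parameter values are hypothetical and are not taken from the inventory.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: increasing boundary locations (K - 1 values for K categories).
    """
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical 5-category item evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-2.0, -0.5, 0.5, 2.0])
print([round(p, 3) for p in probs])
```

A higher discrimination value makes the category probabilities change more sharply with theta, which is the sense in which the abstract describes items as having at least moderate discriminability.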

