Comparing the normalised and 2PL IRT scoring methods on multi-form examinations

Author(s):  
Aolin Xie ◽  
Ting Wei Chiu ◽  
Keyu Chen ◽  
Gregory Camilli
Keyword(s):  
2020 ◽  
Vol 63 (4) ◽  
pp. 1240-1253
Author(s):  
Victoria S. Henbest ◽  
Lisa Fitton ◽  
Krystal L. Werfel ◽  
Kenn Apel

Purpose: Spelling is a skill that relies on an individual's linguistic awareness, the ability to overtly manipulate language. The ability to spell accurately is important for academic and career success into adulthood. The spelling skills of adults have received some attention in the literature, but there is limited information regarding which approach for analyzing adults' spelling is optimal for guiding instruction or intervention for those who struggle. Thus, we aimed to examine the concurrent validity of four different scoring methods for measuring adults' spellings (a dichotomous scoring method and three continuous methods) and to determine whether adults' linguistic awareness skills differentially predict spelling outcomes based on the scoring method employed.
Method: Sixty undergraduate college students who were determined to be average readers, as measured by word reading and contextual word reading tasks, were administered a spelling task as well as morphological, orthographic, phonemic, and syntactic awareness tasks.
Results: All four scoring methods were highly correlated, suggesting high concurrent validity among the measures. Two linguistic awareness skills, morphological awareness and syntactic awareness, predicted spelling performance on both the dichotomous and continuous scoring methods. In contrast, phonemic awareness and orthographic awareness predicted spelling performance only when spelling was scored using a continuous error-analysis measure.
Conclusions: The results of this study confirm that multiple linguistic awareness skills are important for spelling in adults who are average readers. The results also highlight the need for using continuous measures of spelling when planning intervention or instruction, particularly in the areas of orthographic and phonemic awareness.


2018 ◽  
Vol 33 ◽  
pp. 135-163
Author(s):  
Chong Jiyun ◽  
Sang-Jin Kang ◽  
강현화

2021 ◽  
Vol 80 (Suppl 1) ◽  
pp. 351.2-352
Author(s):  
L. Van de Stadt ◽  
F. Kroon ◽  
M. Reijnierse ◽  
D. Van der Heijde ◽  
F. Rosendaal ◽  
...  

Background: Ultrasound (US) is used as an outcome measure in rheumatic musculoskeletal diseases (RMDs) such as hand osteoarthritis (OA). Traditionally, scoring is performed in real time, but central reading of static US images could avoid issues of inter-rater reliability. However, agreement between real-time and static assessment has not been studied.
Objectives: To study the agreement between real-time and static scoring of US in inflammatory hand OA.
Methods: Ultrasound of 30 joints was performed in 75 patients with hand osteoarthritis who were treated with prednisolone or placebo in a randomized double-blind trial. Hand joints were assessed by ultrasound for synovial thickening, effusion, Doppler signal and osteophytes (score 0-3 per joint) at baseline and after treatment. Two ultrasonographers blinded to clinical data scored the live images together (simultaneously) in real time, and a consensus score was recorded for each joint. Representative images stored during scanning were scored by one ultrasonographer at least 6 months after real-time scoring. For each patient, the images of each visit were scored as a pair, with known chronological order. Agreement between scoring methods was studied at joint level with quadratic weighted kappa. At patient level, intra-class correlations (ICC; mixed effect model, absolute agreement, with clustering taken into account) were calculated at both timepoints. ICCs were also calculated for the delta of sum scores. Responsiveness of the scoring methods was analyzed with generalized estimating equations (GEE), with treatment as the independent variable and ultrasonography findings as the dependent variable.
Results: Thirty-nine patients (52%) were treated with prednisolone and 36 (48%) were treated with placebo.
Patient characteristics were well-balanced between treatment groups. All patients had signs of synovial thickening and osteophytes as assessed by real-time ultrasonography, and almost all had signs of effusion (99%) or a positive Doppler signal (95%) in at least one joint. Total ultrasonography sum score for osteophytes was high (mean 45 ±SD 12), whereas sum score was low for positive Doppler signal (mean 5.9 ±SD 4.4), with intermediate sum scores for synovial thickening and effusion (mean 16 ±SD 6.3 and 11 ±SD 6.0 respectively). Static sum scores were overall slightly higher (osteophytes mean 48 ±SD 10; Doppler mean 6.9 ±SD 5.0; synovial thickening mean 20 ±SD 7.0 and effusion 13 ±SD 6.5). Agreement at baseline was good to excellent at joint level (kappa 0.72-0.88) and moderate to excellent at patient level (ICC 0.59-0.86). Agreement for delta sum scores was poor to fair for synovial thickening and effusion (ICC 0.18 and 0.34 respectively), but excellent for Doppler signal (ICC 0.80) (Table 1). Real-time ultrasonography showed responsiveness to prednisolone, with a mean between-group difference in synovial thickening sum score of -2.5 (CI: -4.7 to -0.3). Static ultrasonography did not show a decrease in synovial thickening (Figure 1). No difference in ultrasonography scores was seen for the other ultrasonography features, with either real-time or static scoring.
Conclusion: While cross-sectional agreement between real-time and static ultrasonography was good, agreement of delta sum scores was not, and paired static ultrasonography measurement of synovial thickening did not show responsiveness to prednisolone therapy where real-time ultrasonography did.
Therefore, when using ultrasonography in clinical trials, real-time dynamic scoring should remain the standard.
Table 1. Agreement on patient level, ICC (95% CI)
Feature       Baseline            Week 6              Delta W6-BL
Synovitis     0.59 (0.26-0.76)    0.58 (0.24-0.77)    0.18 (0-0.40)
Effusion      0.84 (0.66-0.92)    0.84 (0.75-0.89)    0.34 (0.12-0.53)
Osteophytes   0.82 (0.50-0.92)    0.78 (0.56-0.88)    ND
Doppler       0.86 (0.75-0.92)    0.91 (0.85-0.94)    0.80 (0.70-0.87)
ICC: intra-class correlation coefficient, linear mixed model (random patient, fixed rating), absolute agreement. ND: not determined.
Disclosure of Interests: Lotte van de Stadt: None declared; Féline Kroon: None declared; Monique Reijnierse Grant/research support from: Dutch Arthritis Foundation; Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, Bayer, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead, Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Frits Rosendaal: None declared; Naghmeh Riyazi: None declared; R. de Slegte: None declared; Jende van Zeben: None declared; Cornelia Allaart: None declared; Margreet Kloppenburg Consultant of: Abbvie, Pfizer, Levicept, GlaxoSmithKline, Merck-Serono, Kiniksa, Flexion, Galapagos, Janssen, CHDR, Grant/research support from: MI-APPROACH; Marion Kortekaas: None declared
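The joint-level agreement statistic named in the Methods of the abstract above, quadratic weighted kappa, can be illustrated with a minimal sketch. The abstract does not say which software was used, so the function name `quadratic_weighted_kappa` and the example ratings below are hypothetical and are not the study's code or data. The sketch assumes ordinal scores from 0 to 3 per joint, as described in the abstract.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_categories=4):
    """Quadratic weighted kappa for two lists of ordinal scores 0..n_categories-1.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of scores")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (score_a, score_b)
    marginal_a = Counter(rater_a)              # per-category counts, rater A
    marginal_b = Counter(rater_b)              # per-category counts, rater B
    max_distance = (n_categories - 1) ** 2

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with the squared distance between scores.
            weight = (i - j) ** 2 / max_distance
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all scores fall in one category")
    return 1.0 - weighted_observed / weighted_expected


# Made-up scores for eight joints: real-time consensus vs. static reading.
real_time = [0, 1, 2, 3, 1, 2, 0, 3]
static = [0, 1, 2, 3, 2, 2, 1, 3]
print(round(quadratic_weighted_kappa(real_time, static), 3))
```

Because the weights are quadratic, a disagreement of two score points is penalised four times as heavily as a disagreement of one point, which suits an ordinal 0-3 scale where near-misses matter less than large discrepancies.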


Author(s):  
Ali Almustanyir ◽  
Reema Alduhayan ◽  
Mosaad Alhassan ◽  
Kholoud Bokhary ◽  
Balsam Alabdulkader

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Soonmyung Paik ◽  
Youngmee Kwon ◽  
Moo Hyun Lee ◽  
Ji Ye Kim ◽  
Da Kyung Lee ◽  
...  

Abstract: Although Ki67 labeling index is a potential predictive marker for chemotherapy benefit, its clinical utility has been limited by the lack of a standard scoring method, resulting in poor interobserver reproducibility. In particular, there is no consensus on the use of average versus hotspot score for reporting. In order to determine the best method for Ki67 scoring and validate the manual scoring method proposed by the International Ki67 Working Group (IKWG), we systematically compared average versus hotspot score in 240 cases with the public domain image analysis program QuPath. We used OncotypeDx Recurrence Score (RS) as a benchmark to compare the potential clinical utility of each scoring method. Both average and hotspot scores showed statistically significant but only modest correlation with OncotypeDx RS. Only hotspot score could meaningfully distinguish RS low-risk versus high-risk patients. However, hotspot score was less reproducible, limiting its clinical utility. In summary, our data demonstrate that the utility of the Ki67 labeling index is influenced by the choice of scoring method.
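The difference between the average and hotspot scores compared in the abstract above can be shown with a small sketch. The abstract does not give the exact definitions used, so this assumes a common reading: the average score pools positive and total cell counts over all scored fields, and the hotspot score is the highest labeling index in any single field. The function names and the per-field counts are hypothetical and are not study data.

```python
def labeling_index(positive_cells, total_cells):
    """Ki67 labeling index as a percentage of positively stained cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * positive_cells / total_cells


def average_score(fields):
    """Assumed definition: pool the counts over all fields, then take the index."""
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    return labeling_index(positive, total)


def hotspot_score(fields):
    """Assumed definition: the highest labeling index found in any single field."""
    return max(labeling_index(p, t) for p, t in fields)


# Made-up (positive, total) cell counts for three fields of one tumour section.
fields = [(10, 100), (30, 100), (5, 50)]
print(average_score(fields))  # 18.0
print(hotspot_score(fields))  # 30.0
```

In this made-up case the same slide scores 18% or 30% depending on the method, which shows how a case can move across a reporting threshold purely through the choice of scoring method.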

