Generalizability Theory

Author(s):  
Oleg N. Medvedev ◽  
Richard J. Siegert
2020 ◽  
Vol 63 (6) ◽  
pp. 1947-1957
Author(s):  
Alexandra Hollo ◽  
Johanna L. Staubitz ◽  
Jason C. Chow

Purpose Although sampling teachers' child-directed speech in school settings is needed to understand the influence of linguistic input on child outcomes, empirical guidance on the measurement procedures needed to obtain representative samples is lacking. To optimize the resources needed to transcribe, code, and analyze classroom samples, this exploratory study assessed the minimum number and duration of samples needed for a reliable analysis of conventional and researcher-developed measures of teacher talk in elementary classrooms. Method This study applied fully crossed, Person (teacher) × Session (samples obtained on 3 separate occasions) generalizability studies to analyze an extant data set of three 10-min language samples provided by 28 general and special education teachers recorded during large-group instruction across the school year. Subsequently, a series of decision studies estimated the number and duration of sessions needed to obtain the criterion g coefficient (g > .70). Results The most stable variables were total number of words and mazes, requiring only a single 10-min sample, two 6-min samples, or three 3-min samples to reach criterion. No measured variables related to content or complexity were adequately stable regardless of number and duration of samples. Conclusions Generalizability studies confirmed that a large proportion of variance was attributable to individuals rather than the sampling occasion when analyzing the amount and fluency of spontaneous teacher talk. In general, conventionally reported outcomes were more stable than researcher-developed codes, which suggests some categories of teacher talk are more context dependent than others and thus require more intensive data collection to measure reliably.
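
The Person × Session analysis described in this abstract can be illustrated with a minimal sketch. It assumes a fully crossed one-facet design with one score per teacher per session, and it estimates variance components from the usual ANOVA mean squares. The function names `g_study` and `d_study` and the toy scores are hypothetical and are not taken from the study; the authors' actual software and estimation method are not stated in the abstract.

```python
import numpy as np

def g_study(scores):
    """Estimate variance components for a crossed person x session design.

    scores: 2-D array, rows = persons, columns = sessions,
    one observation per cell (assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_s = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    session_means = scores.mean(axis=0)

    # Sums of squares for the two main effects and the residual
    ss_p = n_s * np.sum((person_means - grand) ** 2)
    ss_s = n_p * np.sum((session_means - grand) ** 2)
    ss_res = np.sum((scores - grand) ** 2) - ss_p - ss_s

    ms_p = ss_p / (n_p - 1)
    ms_s = ss_s / (n_s - 1)
    ms_res = ss_res / ((n_p - 1) * (n_s - 1))

    # Expected-mean-square solutions; negative estimates are set to zero
    return {
        "person": max((ms_p - ms_res) / n_s, 0.0),
        "session": max((ms_s - ms_res) / n_p, 0.0),
        "residual": ms_res,  # person x session interaction confounded with error
    }

def d_study(components, n_sessions):
    """Relative G coefficient for a mean over n_sessions sessions."""
    var_p = components["person"]
    rel_error = components["residual"] / n_sessions
    return var_p / (var_p + rel_error)

# Illustrative scores only (3 teachers x 3 sessions), not data from the study
toy = [[10, 12, 11], [20, 19, 21], [30, 31, 29]]
components = g_study(toy)
for n in (1, 2, 3):
    print(n, round(d_study(components, n), 3))
```

Running `d_study` for increasing numbers of sessions shows how the coefficient rises as sessions are added; the smallest number of sessions whose coefficient exceeds the chosen criterion (here, .70) is the D-study answer.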


2009 ◽  
Vol 31 (1) ◽  
pp. 81
Author(s):  
Takeaki Kumazawa

Classical test theory (CTT) has been widely used to estimate the reliability of measurements. Generalizability theory (G theory), an extension of CTT, is a powerful statistical procedure, particularly useful for performance testing, because it enables estimating the percentage of variance attributable to persons and to multiple sources of error. This study focuses on a generalizability study (G study) conducted to investigate such variance components for a paper-and-pencil multiple-choice vocabulary test used as a diagnostic pretest. Further, a decision study (D study) was conducted to compute the generalizability coefficient (G coefficient) for absolute decisions. The results of the G and D studies indicated that 46% of the total variance was due to the item effect; further, the G coefficient for absolute decisions (the index of dependability) was low.
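
The link between a large item effect and a low coefficient for absolute decisions can be made explicit with the standard one-facet formulas. The sketch below assumes a crossed persons × items design, which the abstract does not state explicitly; the symbols ($\sigma^2_p$, $\sigma^2_i$, $\sigma^2_{pi,e}$, $n_i$) are notation introduced here for illustration.

```latex
% Relative decisions: only the person-by-item residual counts as error
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_{pi,e}}{n_i}}

% Absolute decisions: the item main effect also counts as error
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_i}{n_i} + \dfrac{\sigma^2_{pi,e}}{n_i}}
```

Because $\sigma^2_i$ enters only the denominator of $\Phi$, a test whose items differ widely in difficulty can show a noticeably lower $\Phi$ than $E\rho^2$ for the same number of items $n_i$.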


2021 ◽  
pp. 1-11
Author(s):  
Q. C. Truong ◽  
C. Choo ◽  
K. Numbers ◽  
A. G. Merkin ◽  
H. Brodaty ◽  
...  

ABSTRACT Objectives: This study aimed to apply generalizability theory (G-theory) to investigate dynamic and enduring patterns of subjective cognitive complaints (SCC) and the reliability of two widely used SCC assessment tools. Design: G-theory was applied to assessment scales using a longitudinal measurement design with five assessments spanning 10 years of follow-up. Setting: Community-dwelling older adults aged 70–90 years and their informants, living in Sydney, Australia, participated in the longitudinal Sydney Memory and Ageing Study. Participants: The sample included 232 participants aged 70 years and older, and 232 associated informants. Participants were predominantly White Europeans (97.8%). The informant sample included 76 males (32.8%) and 153 females (65.9%), with ages ranging from 27 to 86 years and a mean age of 61.3 years (SD = 14.38). Measurements: The Memory Complaint Questionnaire (MAC-Q) and the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE). Results: The IQCODE demonstrated strong reliability in measuring enduring patterns of SCC, with G = 0.86. The marginally acceptable reliability of the 6-item MAC-Q (G = 0.77–0.80) was optimized by removing one item, resulting in G = 0.80–0.81. Most items of both assessments measured enduring SCC, with the exception of one dynamic MAC-Q item. The IQCODE significantly predicted global cognition scores and risk of incident dementia across all occasions, while MAC-Q scores were significant predictors only on some occasions. Conclusions: While both informant-reported (IQCODE) and self-reported (MAC-Q) SCC scores were generalizable across the sample population and occasions, self-reported (MAC-Q) scores may be less accurate in predicting the cognitive ability and diagnosis of each individual.


Author(s):  
Tiffany N. Anderson ◽  
James N. Lau ◽  
Robert Shi ◽  
Richard W. Sapp ◽  
Lauren R. Aalami ◽  
...  
