Revision of a Criterion-Referenced Vocabulary Test Using Generalizability Theory

2009 ◽  
Vol 31 (1) ◽  
pp. 81
Author(s):  
Takeaki Kumazawa

Classical test theory (CTT) has been widely used to estimate the reliability of measurements. Generalizability theory (G theory), an extension of CTT, is a powerful statistical framework, particularly useful for performance testing, because it allows the proportions of variance due to persons and to multiple sources of error to be estimated. This study reports a generalizability study (G study) conducted to investigate such variance components for a paper-and-pencil multiple-choice vocabulary test used as a diagnostic pretest. A decision study (D study) was then conducted to compute the generalizability coefficient (G coefficient) for absolute decisions. The results of the G and D studies indicated that 46% of the total variance was due to the items effect, and the G coefficient for absolute decisions was low.
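As a rough illustration of the kind of computation involved, the sketch below estimates a dependability coefficient for absolute decisions from variance components in a crossed persons × items design. The numbers are invented for illustration (chosen so that the items effect accounts for roughly 46% of the total variance, as reported above) and are not the study's actual estimates.

```python
# Minimal sketch of a D study for a crossed persons x items (p x i) design.
# All variance components below are hypothetical, not the study's estimates.
var_p  = 0.04   # persons (universe-score) variance
var_i  = 0.12   # items variance (about 46% of the total, as in the abstract)
var_pi = 0.10   # persons-by-items interaction / residual variance

n_items = 30    # hypothetical number of items in the D-study design

# For absolute decisions, every source except persons contributes to error.
abs_error = (var_i + var_pi) / n_items

# Dependability (phi) coefficient used for absolute decisions.
phi = var_p / (var_p + abs_error)
print(f"Dependability (phi) for absolute decisions: {phi:.2f}")
```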

2013 ◽  
Vol 93 (4) ◽  
pp. 562-569 ◽  
Author(s):  
Richard A. Preuss

Clinical assessment protocols must produce data that are reliable, with a clinically attainable minimal detectable change (MDC). In a reliability study, generalizability theory has two advantages over classical test theory, and these advantages provide information that allows assessment protocols to be adjusted to match individual patient profiles. First, generalizability theory allows the user to simultaneously consider multiple sources of measurement error variance (facets). Second, it allows the user to generalize the findings of the main study across the different study facets and to recalculate the reliability and MDC for different combinations of facet conditions. In doing so, clinical assessment protocols can be chosen to minimize the number of measures needed to achieve a realistic MDC, to use repeated measures to minimize the MDC, or simply on the basis of the combination that best allows the clinician to monitor an individual patient's progress over a specified period of time.
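A minimal sketch of the second point, under assumptions of my own (a single random facet of trials and made-up variance components, not values from the article), recomputes the dependability coefficient and the MDC for different numbers of repeated measures.

```python
import math

# Illustrative D study: recalculate reliability and the 95% minimal detectable
# change (MDC95) for different numbers of trials. Variance components are hypothetical.
var_person = 25.0   # between-person variance
var_trial  = 4.0    # trial (facet) variance
var_error  = 16.0   # person-by-trial interaction / residual variance

def d_study(n_trials):
    # Absolute error variance for a mean over n_trials measurements.
    abs_err = (var_trial + var_error) / n_trials
    phi = var_person / (var_person + abs_err)   # dependability coefficient
    sem = math.sqrt(abs_err)                    # standard error of measurement
    mdc95 = 1.96 * math.sqrt(2) * sem           # 95% minimal detectable change
    return phi, mdc95

for k in (1, 2, 3, 5):
    phi, mdc = d_study(k)
    print(f"{k} trial(s): phi = {phi:.2f}, MDC95 = {mdc:.1f}")
```

Averaging over more trials shrinks the absolute error variance, which raises the coefficient and lowers the MDC.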


Author(s):  
Hannah Bijlsma ◽  
Rikkert van der Lans ◽  
Tim Mainhard ◽  
Perry den Brok

This chapter discusses student perceptions in terms of three psychometric perspectives that dominate contemporary research on teaching quality, namely Classical Test Theory (CTT), Item Response Theory (IRT), and Generalizability Theory (GT). These perspectives serve as exemplars of the connection between psychometric theories and different views of what a perception is, as well as of how and for what purposes student perceptions should be used. The main message of the chapter is that the choice of a psychometric theory is not merely a technical matter but also has implications for how the nature of perceptions is conceptualized. After presenting and linking each psychometric theory, their strengths and weaknesses in the context of student perceptions of teaching quality, along with issues of practical implementation, are discussed.


2020 ◽  
Author(s):  
Stephen Ross Martin ◽  
Philippe Rast

Reliability is a crucial concept in psychometrics. Although it is typically estimated as a single fixed quantity, previous work suggests that reliability can vary across persons, groups, and covariates. We propose a novel method for estimating and modeling case-specific reliability without repeated measurements or parallel tests. The proposed method employs a “Reliability Factor” that models the error variance of each case across multiple indicators, thereby producing case-specific reliability estimates. Additionally, we use Gaussian process modeling to estimate a non-linear, non-monotonic function between the latent factor itself and the reliability of the measure, providing an analogue to test information functions in item response theory. The reliability factor model is a new tool for examining latent regions with poor conditional reliability, and correlates thereof, within a classical test theory framework.
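The core idea can be sketched in a few lines under the classical definition of reliability; the factor variance and per-case error variances below are invented values, not estimates from the proposed model.

```python
import numpy as np

# Illustrative only: case-specific reliability when each case has its own
# error variance (as a reliability-factor model would estimate).
factor_var = 1.0                                   # latent (true-score) variance
case_error_var = np.array([0.2, 0.5, 1.0, 2.0])   # hypothetical per-case error variances

# Classical-test-theory reliability computed per case:
# true variance / (true variance + case-specific error variance).
case_reliability = factor_var / (factor_var + case_error_var)
print(case_reliability)   # larger error variance -> lower case-specific reliability
```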


2018 ◽  
Author(s):  
Sam Parsons

The relationship between measurement reliability and statistical power is a complex one. Whereas classical test theory defines reliability as the proportion of 'true' variance to total variance (the sum of true-score and error variance), power is functionally related only to total variance. Therefore, to explore direct relationships between reliability and power, one must hold either true-score variance or error variance constant while varying the other. Here, visualisations are used to illustrate the reliability-power relationship under conditions of fixed true-score variance and fixed error variance. These visualisations raise conceptual distinctions between fixing true-score variance and fixing error variance. Namely, when true-score variance is fixed, low reliability (and low power) suggests that a true effect may be hidden by error; when error variance is fixed, by contrast, high reliability (and low power) may simply indicate a very small effect. I raise several observations that I hope will be useful in considering the utility of measurement reliability and its relationship to effect sizes and statistical power.
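One way to see the fixed-true-score-variance case numerically is sketched below (my own illustration, not the paper's code): true-score variance is held at 1, error variance is varied, and two-sample t-test power is approximated for a fixed raw mean difference. As error variance grows, both reliability and power fall.

```python
import math
from scipy import stats

# Illustration: reliability and power when true-score variance is fixed
# and error variance varies. All values are hypothetical.
true_var = 1.0    # fixed true-score variance
raw_diff = 0.5    # fixed raw mean difference between two groups
n = 50            # per-group sample size

for error_var in (0.25, 1.0, 4.0):
    total_var = true_var + error_var
    reliability = true_var / total_var        # classical test theory definition
    d = raw_diff / math.sqrt(total_var)       # standardized effect shrinks as error grows
    # Approximate two-sample t-test power via the noncentral t distribution.
    ncp = d * math.sqrt(n / 2)
    df = 2 * n - 2
    t_crit = stats.t.ppf(0.975, df)
    power = 1 - stats.nct.cdf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)
    print(f"error_var={error_var}: reliability={reliability:.2f}, power={power:.2f}")
```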


2020 ◽  
Author(s):  
Pingguang Lei ◽  
Zheng Yang ◽  
Wei Li ◽  
Jingqing Ou ◽  
Yingli Cun ◽  
...  

Background: Quality of life (QOL) has become a worldwide concern in clinical oncology, and the disease-specific instrument FACT-Hep (Functional Assessment of Cancer Therapy - Hepatobiliary questionnaire) is widely used in English-speaking countries. However, specific instruments for hepatocellular carcinoma patients in China are scarce, and no formal validation of the Simplified Chinese version of the FACT-Hep had been carried out. This study aimed to validate the Chinese FACT-Hep using a combination of classical test theory and generalizability theory. Methods: The Chinese version of the FACT-Hep and the QLICP-LI were administered three times, before and after treatment, to a sample of 114 in-patients with hepatocellular carcinoma. The scale was evaluated with validity and reliability indicators including Cronbach's α, Pearson's r, the intra-class correlation coefficient (ICC), and the standardized response mean (SRM). Generalizability theory (G theory) was also applied to assess the dependability of the measurements and to estimate multiple sources of variance. Results: The internal-consistency Cronbach's α coefficients were greater than 0.70 for all domains, and the test-retest reliability coefficients for all domains and the overall scale were greater than 0.80, ranging from 0.81 to 0.96 (with the exception of Emotional Well-being, 0.74). G-coefficients and Φ-coefficients, together with the estimated variance components, further confirmed the reliability of the scale. The PWB and FWB domains and the overall scale showed significant changes after treatment, with SRMs ranging from 0.40 to 0.69. Conclusions: The Chinese version of the FACT-Hep has good validity, reliability, and responsiveness, and can be used to measure QOL in patients with hepatocellular carcinoma in China.
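For readers unfamiliar with the indices named above, the sketch below computes a Cronbach's α and a standardized response mean on simulated data; the item count, sample size, and score distributions are invented and bear no relation to the study's data.

```python
import numpy as np

# Illustrative sketch only: Cronbach's alpha and the standardized response
# mean (SRM) on simulated, correlated item scores.
rng = np.random.default_rng(0)
n_patients, n_items = 114, 6
common = rng.normal(0, 1, size=(n_patients, 1))                   # shared "true score"
items = common + rng.normal(0, 0.7, size=(n_patients, n_items))   # hypothetical item scores

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score).
total = items.sum(axis=1)
alpha = n_items / (n_items - 1) * (1 - items.var(axis=0, ddof=1).sum() / total.var(ddof=1))

# Standardized response mean: mean change / SD of change (responsiveness).
post = total + rng.normal(2.0, 3.0, size=n_patients)              # hypothetical post-treatment scores
change = post - total
srm = change.mean() / change.std(ddof=1)
print(f"alpha = {alpha:.2f}, SRM = {srm:.2f}")
```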

