Reliability and Validity in Classical Test Theory

Author(s):
David Andrich
Ida Marais
2020, Vol. 64 (3), pp. 219-237
Author(s):
Brandon LeBeau
Susan G. Assouline
Duhita Mahatmya
Ann Lupkowski-Shoplik

This study investigated the application of item response theory (IRT) to expand the range of ability estimates for gifted (hereinafter referred to as high-achieving) students’ performance on an above-level test. Using a sample of fourth- to sixth-grade high-achieving students (N = 1,893), we compared estimates from two measurement theories: classical test theory (CTT) and IRT. CTT and IRT make different assumptions that affect the reliability and validity of the scores obtained from the test. IRT can also differentiate students across grades or within a grade by using each student’s unique string of correct and incorrect answers on the test. This differentiation may have implications for identifying or classifying students who are ready for advanced coursework. We explore this differentiation for Math, Reading, and Science tests, along with the impact the two measurement frameworks can have on the classification of students. Implications for academic talent identification with the talent search model and for the development of academic talent are discussed.
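The pattern-based differentiation described in this abstract is a property of IRT models with item-specific discriminations, such as the two-parameter logistic (2PL) model. As a minimal sketch (the item parameters and response patterns below are invented for illustration, not taken from the study), two students with the same number-correct score can receive different ability estimates when their response patterns differ:

```python
import math

def estimate_theta(responses, discriminations, difficulties, n_iter=25):
    """Maximum-likelihood ability estimate under a 2PL IRT model,
    found by Newton-Raphson (the 2PL log-likelihood is concave in theta).
    Requires a mixed response pattern (not all 0s or all 1s)."""
    theta = 0.0
    for _ in range(n_iter):
        grad, hess = 0.0, 0.0
        for x, a, b in zip(responses, discriminations, difficulties):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # P(correct | theta)
            grad += a * (x - p)          # score function
            hess -= a * a * p * (1 - p)  # second derivative (always negative)
        theta -= grad / hess
    return theta

# Hypothetical item parameters (illustration only, not from the study):
a = [0.5, 1.0, 1.5, 2.0]   # discriminations
b = [-1.0, 0.0, 0.5, 1.0]  # difficulties

student_1 = [1, 1, 0, 0]  # correct on the two easier, less discriminating items
student_2 = [0, 0, 1, 1]  # correct on the two harder, more discriminating items

theta_1 = estimate_theta(student_1, a, b)
theta_2 = estimate_theta(student_2, a, b)
# Same CTT raw score (2 of 4), but different IRT ability estimates.
```

Under CTT, both students would receive the same raw score of 2; under the 2PL model, the student who answers the harder, more discriminating items correctly receives the higher theta estimate.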


Author(s):  
David L. Streiner

This chapter discusses the two major theories underlying scale development: classical test theory, which has dominated the field for the past century, and item response theory, which is more recent. It begins by summarizing the history of measurement, first of physical and physiological parameters and later of intelligence. This is followed by the steps involved in developing a scale: creating the items, determining whether they fully span the construct of interest without including irrelevant content, and assessing the usability of the items (whether they are understood correctly, whether they are free of jargon, whether they avoid negatively worded phrases, etc.). The chapter then describes how to establish the reliability and validity of the scale—what are called the psychometric properties of the scale. It concludes by discussing some of the shortcomings of classical test theory, how item response theory attempts to address them, and the degree to which it has been successful in this regard. This chapter should be useful for those who need to evaluate existing scales as well as for those wanting to develop new scales.


Author(s):  
James Austin

This chapter focuses on classical test theory: its origins within psychological measurement, the fundamental principles of true scores and measurement error, and its psychometric and statistical assumptions. Random and systematic forms of measurement error are addressed, and the standard error of measurement is defined. Major approaches to defining and estimating test reliability and validity are reviewed, and practical applications of classical test theory to K-12 music education assessment are considered, including large-scale standardized testing as well as measurement levels, item analysis, and techniques for enhancing the reliability and validity of classroom-level assessments. Finally, the transition from classical test theory to modern test theory is explored.
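The standard error of measurement mentioned in this abstract follows directly from the CTT definition SEM = SD_x * sqrt(1 - r_xx), where SD_x is the observed-score standard deviation and r_xx the reliability. A minimal sketch with invented example values (not from the chapter):

```python
import math

def standard_error_of_measurement(sd, reliability):
    """CTT standard error of measurement: SEM = SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1.0 - reliability)

# Invented example values: score SD of 10 and reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)  # 10 * sqrt(0.09) = 3.0

# An observed score +/- 1.96 SEM gives an approximate 95% band for the true score.
observed = 75.0
ci_95 = (observed - 1.96 * sem, observed + 1.96 * sem)
```

Note how higher reliability shrinks the SEM: at r_xx = 0.91, a test with SD = 10 localizes the true score to within about plus or minus 6 points at the 95% level.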


Author(s):
Zainab Albikawi
Mohammad Abuadas

Background: Providing care for schizophrenia patients is complex and requires dealing with various psychosocial burdens.
Aim: To develop and validate a tool that measures the quality of life and self-stigma (SS) of the schizophrenia patient’s caregiver (QLSSoSPC).
Setting: Outpatient psychiatric services clinics in Saudi Arabia.
Methods: The study used a methodological cross-sectional design. A sample of 205 caregivers of schizophrenia patients was recruited using a convenience sampling method. Classical test theory and Rasch analysis approaches were used.
Results: The developed tool showed an acceptable level of reliability and validity. The analysis confirmed a seven-factor structure that accounted for 74.4% of the total variance. Cronbach’s alpha reliability statistics for the developed tool were satisfactory, ranging from 0.80 to 0.91.
Conclusion: The psychometric properties of the QLSSoSPC tool support its prospective use, allowing us to recommend the tool for clinical and research purposes.
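Cronbach's alpha values like those reported above are computed from an item-score matrix with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch with made-up ratings (not the study's data):

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for an n-respondents x k-items score matrix
    (list of rows), using population variances."""
    k = len(scores[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up ratings: five respondents on three items (illustration only).
ratings = [
    [1, 1, 1],
    [2, 2, 1],
    [3, 3, 3],
    [4, 3, 4],
    [5, 5, 5],
]
alpha = cronbach_alpha(ratings)  # high internal consistency for these rows
```

Because the three made-up items move together across respondents, the item variances are small relative to the total-score variance and alpha comes out near 1; unrelated items would drive it toward 0.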

