Testing Students with Special Educational Needs in Large-Scale Assessments – Psychometric Properties of Test Scores and Associations with Test Taking Behavior

2016 ◽ Vol 7
Author(s): Steffi Pohl, Anna Südkamp, Katinka Hardt, Claus H. Carstensen, Sabine Weinert

Author(s): Clemens M. Lechner, Nivedita Bhaktha, Katharina Groskurth, Matthias Bluemke

Measures of cognitive or socio-emotional skills from large-scale assessment surveys (LSAS) are often based on advanced statistical models and scoring techniques unfamiliar to applied researchers. Consequently, applied researchers working with data from LSAS may be uncertain about the assumptions and computational details of these statistical models and scoring techniques, and about how best to incorporate the resulting skill measures in secondary analyses. The present paper is intended as a primer for applied researchers. After a brief introduction to the key properties of skill assessments, we give an overview of the three principal methods with which secondary analysts can incorporate skill measures from LSAS in their analyses: (1) as test scores (i.e., point estimates of individual ability), (2) through structural equation modeling (SEM), and (3) in the form of plausible values (PVs). We discuss the advantages and disadvantages of each method based on three criteria: fallibility (i.e., control for measurement error and unbiasedness), usability (i.e., ease of use in secondary analyses), and immutability (i.e., consistency of test scores, PVs, or measurement model parameters across different analyses and analysts). We show that although none of the methods is optimal under all criteria, methods that result in a single point estimate of each respondent’s ability (i.e., all types of “test scores”) are rarely optimal for research purposes. Instead, approaches that avoid or correct for measurement error, especially PV methodology, stand out as the methods of choice. We conclude with practical recommendations for secondary analysts and data-producing organizations.
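The PV workflow this abstract recommends typically ends with Rubin's combining rules: run the secondary analysis once per plausible value, then pool the results. The following minimal Python sketch shows only that pooling step; the function name and the toy numbers are illustrative assumptions, not taken from any particular LSAS toolkit.

```python
import numpy as np

def combine_plausible_values(estimates, variances):
    """Pool M per-PV point estimates via Rubin's rules.

    estimates : length-M sequence, one analysis estimate per plausible value
    variances : length-M sequence, squared standard errors per PV
    Returns the pooled estimate and its total standard error.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)

    pooled = estimates.mean()               # average of per-PV estimates
    within = variances.mean()               # average sampling variance
    between = estimates.var(ddof=1)         # variance across PVs
    total = within + (1 + 1 / m) * between  # Rubin's total variance
    return pooled, np.sqrt(total)

# Toy usage: five PV-based regression coefficients and their variances.
est, se = combine_plausible_values([0.42, 0.45, 0.40, 0.44, 0.43],
                                   [0.010, 0.011, 0.009, 0.010, 0.010])
print(f"pooled estimate = {est:.3f}, SE = {se:.3f}")
```

The between-PV variance term is what carries measurement uncertainty into the pooled standard error; a single test score per respondent discards exactly that component.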


2019 ◽ Vol 44 (6) ◽ pp. 671-705
Author(s): Matthias von Davier, Lale Khorramdel, Qiwei He, Hyo Jeong Shin, Haiwen Chen

International large-scale assessments (ILSAs) have transitioned from paper-based assessments to computer-based assessments (CBAs), facilitating the use of new item types and more effective data collection tools. This transition allows the implementation of more complex test designs and the collection of process and response time (RT) data. These new data types can be used to improve data quality and the accuracy of test scores obtained through latent regression (population) models. However, the move to a CBA also poses challenges for comparability and trend measurement, one of the major goals of ILSAs. We provide an overview of current methods used in ILSAs to examine and assure the comparability of data across different assessment modes, as well as methods that improve the accuracy of test scores by making use of the new data types provided by a CBA.
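To make the term "latent regression (population) model" concrete, here is a small, hypothetical Python simulation under simplified assumptions: a single background covariate predicts latent ability at the population level, and a Rasch measurement model generates the item responses. All parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 1000, 20

# Population (latent regression) part: a background variable x
# (e.g., an SES indicator) predicts latent ability theta.
x = rng.normal(size=n_persons)
gamma, resid_sd = 0.5, 0.8          # assumed regression weight and residual SD
theta = gamma * x + rng.normal(scale=resid_sd, size=n_persons)

# Measurement part: Rasch model, P(correct) = logistic(theta - b_j).
b = rng.normal(scale=1.0, size=n_items)      # item difficulties
logits = theta[:, None] - b[None, :]
p = 1.0 / (1.0 + np.exp(-logits))
responses = rng.binomial(1, p)               # simulated item responses

print(responses.mean(axis=0))  # observed proportion correct per item
```

In operational ILSAs the direction is reversed: the regression weights and ability distribution are estimated from observed responses and background data, and plausible values are drawn from the resulting posterior.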


2017 ◽ Vol 16 (3) ◽ pp. 227-240
Author(s): Lena Nusser, Sabine Weinert

If children fail to understand test instructions, measurements of their competence may be unfair and invalid. This is especially relevant for students with special educational needs (SEN), because they face greater challenges in comprehending instructions. Two interventions were designed to facilitate the comprehension of test requirements by presenting intensified instructions and to enhance students’ attention by engaging them in physical activity before they received the test instructions. Three hundred forty-eight students with SEN aged 8–12 years were randomly assigned to an experimental condition or a control group. Even after controlling for relevant variables (reading speed, basic cognitive skills), students participating in the interventions performed better on a reading test than controls. As hypothesized, the intensified test instructions reduced the number of responses that were not compliant with instructions. In conclusion, this study shows the importance of adapting test instructions for students with SEN, and it proposes interventions that can be implemented in other assessments.


2016 ◽ Vol 13 (1) ◽ pp. 1792
Author(s): Timo Saloviita, Markku Leskinen

This study used a large-scale survey to investigate Finnish primary school teachers’ satisfaction with the current educational placement of their students with special educational needs (SEN). Teachers were asked to recommend the most suitable educational placement for each of their SEN students from a pool of six alternatives: a mainstream classroom, part-time special education, a special classroom in a mainstream school, a special school, a state special school, or an institution. Data were obtained on 980 students representing 68 schools. The results showed that, in the majority of cases, teachers recommended a placement level different from their SEN students’ current one. Teachers in mainstream classrooms mostly recommended special classrooms, teachers of special education classrooms frequently recommended special schools, and teachers of special schools often recommended state special schools or special education classrooms. A less restrictive environment was recommended for 20% of the students and a more restrictive environment for 33%. The results are interpreted in terms of organizational selection.


2016 ◽ Vol 41 (2) ◽ pp. 115-129
Author(s): Sebastian Weirich, Martin Hecht, Christiane Penk, Alexander Roppelt, Katrin Böhme

This article examines the interdependency of two context effects that are known to occur regularly in large-scale assessments: item position effects and effects of test-taking effort on the probability of correctly answering an item. A microlongitudinal design was used to measure test-taking effort over the course of a 60-minute large-scale assessment. Two components of test-taking effort were investigated: initial effort and change in effort. Both components significantly affected the probability of solving an item. In addition, participants’ current test-taking effort diminished considerably over the course of the test. Furthermore, a substantial linear position effect was found, indicating that item difficulty increased during the test. This position effect varied considerably across persons. Concerning the interplay of position effects and test-taking effort, only the change in effort moderated the position effect, and persons differed with respect to this moderation effect. The consequences of these results for the reliability and validity of large-scale assessments are discussed.
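As an illustration of the kind of model implied by a person-specific linear position effect, the following hypothetical Python sketch generates Rasch-type responses in which effective item difficulty grows linearly with item position at a rate that varies across persons. The parameter values are assumptions for illustration, not the article's estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 500, 30

theta = rng.normal(size=n_persons)      # person abilities
b = rng.normal(size=n_items)            # baseline item difficulties
pos = np.arange(n_items)                # item positions 0..29
# Person-specific position-effect slopes (assumed mean drift of 0.1 logits
# per position, varying across persons).
delta = rng.normal(loc=0.1, scale=0.05, size=n_persons)

# Effective difficulty increases linearly with position for each person:
# logit P(correct) = theta_p - (b_i + delta_p * position_i).
logits = theta[:, None] - (b[None, :] + delta[:, None] * pos[None, :])
p = 1.0 / (1.0 + np.exp(-logits))
responses = rng.binomial(1, p)

print(p.mean(axis=0))  # expected accuracy declines across item positions
```

A declining effort trajectory could be layered onto the same logit as an additional person-by-position term, which is how the moderation described above would enter the model.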

