Large-scale Assessments in Education

AbstractAnalysis of user-generated data (for example process data from logfiles, learning analytics, and data mining) in computer-based environments has gained much attention in the last decade and is considered a promising evolving field in learning sciences. In the area of educational assessment, the benefits of such data and how to exploit them are increasingly emphasised. Even though the use of process data in assessment holds significant promise, the ethical and regulatory implications associated with it have not been sufficiently considered. To address this issue and to provide an overview of how ethical and regulatory requirements interface with process data from assessments in primary and secondary education (K-12), we conducted a systematic literature review. Initial results showed that few studies considered ethical, privacy and regulatory issues in K-12 assessment, prompting a widening of the search criteria to include research in higher education also, which identified 22 studies. The literature that was relevant to our research questions represented an approximate balance in the number of theoretical and empirical studies. The studies identified as relevant interpret issues of privacy largely in terms of informed consent and the research pays little attention to ethical and privacy issues in the use of process data in assessment. The implications for the field of educational assessment and the use of process data are discussed. This includes the need to develop a specific code of ethics to govern the use of process- and logfile data in educational assessment.

Download Full-text

Financial literacy among Finnish adolescents in PISA 2018: the role of financial learning and dispositional factors

Large-scale Assessments in Education ◽

10.1186/s40536-021-00118-0 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Gintautas Silinskas ◽

Arto K. Ahonen ◽

Terhi-Anna Wilska

Keyword(s):

Learning Strategies ◽

Financial Literacy ◽

Source Credibility ◽

Financial Education ◽

Literacy Skills ◽

Effective Strategies ◽

Dispositional Characteristics ◽

Dispositional Factors ◽

Literacy Scores

AbstractThe aim or the present study was to examine the relative importance of financial education in school and families and dispositional factors (competitiveness, work mastery, meta-cognition) in predicting financial literacy among Finnish adolescents. The data on the 4328 Finnish 15-year-olds was drawn from the PISA 2018 assessment. Financial literacy was measured by tests, and financial education and dispositional factors were assessed by adolescent questionnaires. First, the results showed that financial education in school was positively associated with adolescents’ financial literacy skills, whereas parental involvement in financial matters did not relate or related negatively to financial literacy scores. Second, dispositional factors, such as competitiveness, work mastery, and meta-cognition (effective strategies to understand/remember information, to summarize information, and to evaluate source credibility) were the strongest positive predictors of the financial literacy scores. Overall, the present study emphasizes that certain social factors (schools and families) and especially dispositional characteristics (personality/motivation and critical thinking/learning strategies) may shape the development of the financial skills of adolescents.

Download Full-text

The achievement gap in reading competence: the effect of measurement non-invariance across school types

Large-scale Assessments in Education ◽

10.1186/s40536-021-00116-2 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Theresa Rohm ◽

Claus H. Carstensen ◽

Luise Fischer ◽

Timo Gnambs

Keyword(s):

Data Structure ◽

Achievement Gap ◽

Secondary School ◽

Differential Item Functioning ◽

Measurement Invariance ◽

Structural Equation Models ◽

Competence Development ◽

School Type ◽

Reading Competence ◽

Item Functioning

Abstract Background After elementary school, students in Germany are separated into different school tracks (i.e., school types) with the aim of creating homogeneous student groups in secondary school. Consequently, the development of students’ reading achievement diverges across school types. Findings on this achievement gap have been criticized as depending on the quality of the administered measure. Therefore, the present study examined to what degree differential item functioning affects estimates of the achievement gap in reading competence. Methods Using data from the German National Educational Panel Study, reading competence was investigated across three timepoints during secondary school: in grades 5, 7, and 9 (N = 7276). First, using the invariance alignment method, measurement invariance across school types was tested. Then, multilevel structural equation models were used to examine whether a lack of measurement invariance between school types affected the results regarding reading development. Results Our analyses revealed some measurement non-invariant items that did not alter the patterns of competence development found among school types in the longitudinal modeling approach. However, misleading conclusions about the development of reading competence in different school types emerged when the hierarchical data structure (i.e., students being nested in schools) was not taken into account. Conclusions We assessed the relevance of measurement invariance and accounting for clustering in the context of longitudinal competence measurement. Even though differential item functioning between school types was found for each measurement occasion, taking these differences in item estimates into account did not alter the parallel pattern of reading competence development across German secondary school types. However, ignoring the clustered data structure of students being nested within schools led to an overestimation of the statistical significance of school type effects.

Download Full-text

The relative effect of student, family and school-related factors on math achievement by location of the school

Large-scale Assessments in Education ◽

10.1186/s40536-021-00117-1 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Ekber Tomul ◽

Emine Önder ◽

Erdal Taslidere

Keyword(s):

Primary Schools ◽

Math Achievement ◽

Relative Effect ◽

Medium Size ◽

Related Factors ◽

Math Scores ◽

Characteristic Sets ◽

Before School ◽

Occupation Level ◽

Related Characteristic

AbstractThis study aims to examine the relative effects of student, family and school-related characteristics on 4th grade students’ math achievement according to location of the school in Turkey. The data of 6435 students studying at 260 primary schools were analyzed using TIMSS-2015 database. The dependent variable of the study was students’ math scores and 19 factors constituting the student, family and school-related characteristics were the independent variables. The location of the school was classified as urban, suburban, medium-size city and village. The data was analyzed via single level multiple linear regression. The results revealed that the entire models explained the largest amount of variance (52%) in the schools located in the villages and the least amount of variance (44%) in those located in the urban area. Although all of the student, family and school-related characteristic sets were found to be significantly related with the achievement, the student-related characteristics explained the largest amount of variance in achievement. Students’ confidence in math contributed almost the highest amount of variance, and the early numeracy tasks, absenteeism in school, parents’ highest education level, parents’ highest occupation level, early numeric activities before school explained small amounts of variance in students’ math achievement in the schools of all locations.

Download Full-text

RALSA: the R analyzer for large-scale assessments

Large-scale Assessments in Education ◽

10.1186/s40536-021-00114-4 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Plamen V. Mirazchiyski

Keyword(s):

Open Source ◽

Large Scale ◽

International Student ◽

Student Assessment ◽

Data Sets ◽

Large Scale Assessment ◽

Complex Sampling ◽

International Student Assessment ◽

Large Scale Assessments ◽

Pisa Data

AbstractThis paper presents the R Analyzer for Large-Scale Assessments (), a newly developed package for analyzing data from studies using complex sampling and assessment designs. Such studies are, for example, the IEA’s Trends in International Mathematics and Science Study and the OECD’s Programme for International Student Assessment. The package covers all cycles from a broad range of studies. The paper presents the architecture of the package, the overall workflow and illustrates some basic analyses using it. The package is open-source and free of charge. Other software packages for analyzing large-scale assessment data exist, some of them are proprietary, others are open-source. However, is the first comprehensive package, designed for the user experience and has some distinctive features. One innovation is that the package can convert SPSS data from large scale assessments into native data sets. It can also do so for PISA data from cycles prior to 2015, where the data is provided in tab-delimited text files along with SPSS control syntax files. Another feature is the availability of a graphical user interface, which is also written in and operates in any operating system where a full copy of can be installed. The output from any analysis function is written into an MS Excel workbook with multiple sheets for the estimates, model statistics, analysis information and the calling syntax itself for reproducing the analysis in future. The flexible design of allows for the quick addition of new studies, analysis types and features to the existing ones.

Download Full-text

From byproduct to design factor: on validating the interpretation of process indicators based on log data

Large-scale Assessments in Education ◽

10.1186/s40536-021-00113-5 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Frank Goldhammer ◽

Carolin Hahnel ◽

Ulf Kroehne ◽

Fabian Zehner

Keyword(s):

Large Scale ◽

Empirical Evaluation ◽

Worked Examples ◽

Assessment System ◽

Continuous Assessment ◽

Design Factor ◽

Log Data ◽

Process Indicators ◽

Assessment Approach ◽

Intended Interpretation

AbstractInternational large-scale assessments such as PISA or PIAAC have started to provide public or scientific use files for log data; that is, events, event-related attributes and timestamps of test-takers’ interactions with the assessment system. Log data and the process indicators derived from it can be used for many purposes. However, the intended uses and interpretations of process indicators require validation, which here means a theoretical and/or empirical justification that inferences about (latent) attributes of the test-taker’s work process are valid. This article reviews and synthesizes measurement concepts from various areas, including the standard assessment paradigm, the continuous assessment approach, the evidence-centered design (ECD) framework, and test validation. Based on this synthesis, we address the questions of how to ensure the valid interpretation of process indicators by means of an evidence-centered design of the task situation, and how to empirically challenge the intended interpretation of process indicators by developing and implementing correlational and/or experimental validation strategies. For this purpose, we explicate the process of reasoning from log data to low-level features and process indicators as the outcome of evidence identification. In this process, contextualizing information from log data is essential in order to reduce interpretative ambiguities regarding the derived process indicators. Finally, we show that empirical validation strategies can be adapted from classical approaches investigating the nomothetic span and construct representation. Two worked examples illustrate possible validation strategies for the design phase of measurements and their empirical evaluation.

Download Full-text

Exploring the association between occupational complexity and numeracy

Large-scale Assessments in Education ◽

10.1186/s40536-021-00112-6 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Mary Genevieve Billington ◽

Njål Foldnes

Keyword(s):

Task Complexity ◽

Basic Skills ◽

Well Being ◽

Three Dimensions ◽

Full Time ◽

Occupational Complexity ◽

Workplace Organization ◽

Occupational Task ◽

Full Time Employment ◽

Practical Implications

AbstractThe basic cognitive skill of numeracy is a recognized form of human capital, associated with economic and social well being for individuals and for nations. In this study, we explore how occupational complexity relates to proficiency in numeracy, among adults in full-time employment. We operationalize occupational complexity by constructing three measures of task complexity: complexity with data, complexity with people and complexity with things. Data from the international OECD survey of adult skills, 2012, is employed to investigate both the distribution of these three dimensions of occupational task complexity and how these relate to numeracy in 13 countries. The analysis indicates that data occupational complexity predicts numeracy scores, when controlling for age, gender and educational level. The findings open for a hypothesis that occupational activities may enhance basic skills in adult populations. If elaborated and supported through further studies this finding has practical implications for workplace organization and contributes to theoretical understandings of the development of basic skills in adults.

Download Full-text

Does the choice of response time threshold procedure substantially affect inferences concerning the identification and exclusion of rapid guessing responses? A meta-analysis

Large-scale Assessments in Education ◽

10.1186/s40536-021-00110-8 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Joseph A. Rios ◽

Jiayi Deng

Keyword(s):

Response Time ◽

Empirical Data ◽

Test Performance ◽

Meta Analysis ◽

Performance Outcomes ◽

Measurement Properties ◽

Individual Level ◽

Group Score ◽

Time Threshold ◽

Threshold Procedure

Abstract Background In testing contexts that are predominately concerned with power, rapid guessing (RG) has the potential to undermine the validity of inferences made from educational assessments, as such responses are unreflective of the knowledge, skills, and abilities assessed. Given this concern, practitioners/researchers have utilized a multitude of response time threshold procedures that classify RG responses in these contexts based on either the use of no empirical data (e.g., an arbitrary time limit), response time distributions, and the combination of response time and accuracy information. As there is little understanding of how these procedures compare to each other, this meta-analysis sought to investigate whether threshold typology is related to differences in descriptive, measurement property, and performance outcomes in these contexts. Methods Studies were sampled that: (a) employed two or more response time (RT) threshold procedures to identify and exclude RG responses on the same computer-administered low-stakes power test; and (b) evaluated differences between procedures on the proportion of RG responses and responders, measurement properties, and test performance. Results Based on as many as 86 effect sizes, our findings indicated non-negligible differences between RT threshold procedures in the proportion of RG responses and responders. The largest differences for these outcomes were observed between procedures using no empirical data and those relying on response time and accuracy information. However, these differences were not related to variability in aggregate-level measurement properties and test performance. Conclusions When filtering RG responses to improve inferences concerning item properties and group score outcomes, the actual threshold procedure chosen may be of less importance than the act of identifying such deleterious responses. However, given the conservative nature of RT thresholds that use no empirical data, practitioners may look to avoid the use of these procedures when making inferences at the individual-level, given their potential for underclassifying RG.

Download Full-text

Modern international large-scale assessment in education: an integrative review and mapping of the literature

Large-scale Assessments in Education ◽

10.1186/s40536-021-00109-1 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Daniel Hernández-Torrano ◽

Matthew G. R. Courtney

Keyword(s):

Interpersonal Relationships ◽

Large Scale ◽

The Public ◽

Large Scale Assessment ◽

Research Collaborations ◽

Scale Assessment ◽

Policy Measurement ◽

Recent Developments ◽

Research In Education ◽

Data Linkages

AbstractResearch in international large-scale assessment (ILSA) has become an increasingly popular field of study in education. Consequently, interest and debate in the field by practitioners, researchers, policymakers, and the public has grown over the past decades. This study adopts a descriptive bibliometric approach to map modern research on ILSA in education and provide an up-to-date picture of the recent developments and structure of the field. The analysis of 2,233 journal articles indexed in the Web of Sciences database revealed that ILSA research in education is an emerging field in a stage of exponential growth that has become increasingly international with recent substantive contributions from China, Spain, and Turkey. Research in the field is currently produced by a tupid network of scholars with diverse geographical backgrounds that engage frequently in national and international research collaborations. Also, the field is relatively interdisciplinary and has developed grounded on nine differentiated historical paths. The PISA program has received the greatest attention in the field, and a wide variety of topics have been addressed in the literature in the last decades, including equity and quality education, globalization and education policy, measurement and statistics, student motivation and self-concept, and interpersonal relationships. The paper concludes by pointing to the potential of future ILSA research to make use of new more relevant instrumentation, data linkages, and trans-regional collaborations.

Download Full-text

Defensible inferences from a nested sequence of logistic regressions: a guide for the perplexed

Large-scale Assessments in Education ◽

10.1186/s40536-021-00111-7 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Gulsah Gurkan ◽

Yoav Benjamini ◽

Henry Braun

Keyword(s):

Logistic Regression ◽

Cognitive Skills ◽

Regression Models ◽

Large Scale ◽

Family Background ◽

National Income ◽

Logistic Models ◽

Logistic Regression Models ◽

Logistic Regressions ◽

The Impact

AbstractEmploying nested sequences of models is a common practice when exploring the extent to which one set of variables mediates the impact of another set. Such an analysis in the context of logistic regression models confronts two challenges: (i) direct comparisons of coefficients across models are generally biased due to the changes in scale that accompany the changes in the set of explanatory variables, (ii) conducting a large number of tests induces a problem of multiplicity that can lead to spurious findings of significance if not heeded. This article aims to illustrate a practical strategy for conducting analyses in the face of these challenges. The challenges—and how to address them—are illustrated using a subset of the findings reported by Braun (Large-scale Assess Educ 6(4):1–52, 2018. 10.1186/s40536-018-0058-x), drawn from the Programme for the International Assessment of Adult Competencies (PIAAC), an international, large-scale assessment of adults. For each country in the dataset, a nested pair of logistic regression models was fit in order to investigate the role of Educational Attainment and Cognitive Skills in mediating the impact of family background and demographic characteristics on the location of an individual’s annual income in the national income distribution. A modified version of the Karlson–Holm–Breen (KHB) method was employed to obtain an unbiased estimate of the true differences in the coefficients between nested logistic models. In order to address the issue of multiplicity, a recent generalization of the Benjamini–Hochberg (BH) False Discovery Rate (FDR)-controlling procedure to hierarchically structured hypotheses was employed and compared to two conventional methods. The differences between the changes in coefficients calculated conventionally and with the KHB adjustment varied from negligible to very substantial. When combined with the actual magnitudes of the coefficients, we concluded that the more proximal factors indeed act as strong mediators for the background factors, but less so for Age, and hardly at all for Gender. With respect to multiplicity, applying the FDR-controlling procedure yielded results very similar to those obtained by applying a standard per-comparison procedure, but quite a few more discoveries in comparison to the Bonferroni procedure. The KHB methodology illustrated here can be applied wherever there is interest in comparing nested logistic regressions. Modifications to account for probability sampling are practicable. The categorization of variables and the order of entry should be determined by substantive considerations. On the other hand, the BH procedure is perfectly general and can be implemented to address multiplicity issues in a broad range of settings.

Download Full-text

Large-scale Assessments in Education
Latest Publications

TOTAL DOCUMENTS

H-INDEX

Published By Springer (Biomed Central Ltd.)

A call to action: a systematic review of ethical and regulatory issues in using process data in educational assessment

Financial literacy among Finnish adolescents in PISA 2018: the role of financial learning and dispositional factors

The achievement gap in reading competence: the effect of measurement non-invariance across school types

The relative effect of student, family and school-related factors on math achievement by location of the school

RALSA: the R analyzer for large-scale assessments

From byproduct to design factor: on validating the interpretation of process indicators based on log data

Exploring the association between occupational complexity and numeracy

Does the choice of response time threshold procedure substantially affect inferences concerning the identification and exclusion of rapid guessing responses? A meta-analysis

Modern international large-scale assessment in education: an integrative review and mapping of the literature

Defensible inferences from a nested sequence of logistic regressions: a guide for the perplexed

Export Citation Format

Large-scale Assessments in EducationLatest Publications

TOTAL DOCUMENTS

H-INDEX

Published By Springer (Biomed Central Ltd.)

A call to action: a systematic review of ethical and regulatory issues in using process data in educational assessment

Financial literacy among Finnish adolescents in PISA 2018: the role of financial learning and dispositional factors

The achievement gap in reading competence: the effect of measurement non-invariance across school types

The relative effect of student, family and school-related factors on math achievement by location of the school

RALSA: the R analyzer for large-scale assessments

From byproduct to design factor: on validating the interpretation of process indicators based on log data

Exploring the association between occupational complexity and numeracy

Does the choice of response time threshold procedure substantially affect inferences concerning the identification and exclusion of rapid guessing responses? A meta-analysis

Modern international large-scale assessment in education: an integrative review and mapping of the literature

Defensible inferences from a nested sequence of logistic regressions: a guide for the perplexed

Large-scale Assessments in Education
Latest Publications