Early tracking and different types of inequalities in achievement: difference-in-differences evidence from 20 years of large-scale assessments

2021 ◽  
Vol 33 (1) ◽  
pp. 139-167
Author(s):  
Andrés Strello ◽  
Rolf Strietholt ◽  
Isa Steinmann ◽  
Charlotte Siepmann

Abstract. Research to date on the effects of between-school tracking on inequalities in achievement and on performance has been inconclusive. A possible explanation is that different studies used different data, focused on different domains, and employed different measures of inequality. To address this issue, we used all accumulated data collected in the three largest international assessments, namely PISA (Programme for International Student Assessment), PIRLS (Progress in International Reading Literacy Study), and TIMSS (Trends in International Mathematics and Science Study), in the past 20 years in 75 countries and regions. Following the seminal paper by Hanushek and Wößmann (2006), we combined data from a total of 21 cycles of primary and secondary school assessments to estimate difference-in-differences models for different outcome measures. We synthesized the effects using a meta-analytical approach and found strong evidence that tracking increased social achievement gaps, that it had smaller but still significant effects on dispersion inequalities, and that it had rather weak effects on educational inadequacies. In contrast, we did not find evidence that tracking increased performance levels. Besides these substantive findings, our study illustrated that the effect estimates varied considerably across the datasets used because the low number of countries as the units of analysis was a natural limitation. This finding casts doubt on the reproducibility of findings based on single international datasets and suggests that researchers should use different data sources to replicate analyses.

2018 ◽  
Vol 21 ◽  
pp. 80-90
Author(s):  
Chandra Mani Paudel ◽  
Ram Chandra Panday

This paper presents results from a systematic review of the literature on large-scale assessment findings in the South Asian context, with a particular focus on Nepal. The main objective of the LEAP programme is to improve the quality of learning in the Asia-Pacific region by developing the capacity of the Member States to collect, analyze and utilize international and national assessment data to identify learning enablers. The review found that higher-order skills are overshadowed by rote learning. It has also employed Item Response Theory (IRT), making assessments comparable and connected with previous levels. International assessments such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS) have collected vast amounts of data on schools, students and households. The use of education-related “big data” for evidence-based policy making is limited, partly due to insufficient institutional capacity of countries to analyze such data and link results with policies.


Methodology ◽  
2007 ◽  
Vol 3 (4) ◽  
pp. 149-159 ◽  
Author(s):  
Oliver Lüdtke ◽  
Alexander Robitzsch ◽  
Ulrich Trautwein ◽  
Frauke Kreuter ◽  
Jan Marten Ihme

Abstract. In large-scale educational assessments such as the Third International Mathematics and Science Study (TIMSS) or the Program for International Student Assessment (PISA), sizeable numbers of test administrators (TAs) are needed to conduct the assessment sessions in the participating schools. TA training sessions are run and administration manuals are compiled with the aim of ensuring standardized, comparable assessment situations in all student groups. To date, however, there has been no empirical investigation of the effectiveness of these standardizing efforts. In the present article, we probe for systematic TA effects on mathematics achievement and sample attrition in a student achievement study. Multilevel analyses for cross-classified data using Markov Chain Monte Carlo (MCMC) procedures were performed to separate the variance that can be attributed to differences between schools from the variance associated with TAs. After controlling for school effects, only a very small, nonsignificant proportion of the variance in mathematics scores and response behavior was attributable to the TAs (< 1%). We discuss practical implications of these findings for the deployment of TAs in educational assessments.


2019 ◽  
Vol 44 (6) ◽  
pp. 752-781
Author(s):  
Michael O. Martin ◽  
Ina V.S. Mullis

International large-scale assessments of student achievement such as International Association for the Evaluation of Educational Achievement’s Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study and Organization for Economic Cooperation and Development’s Program for International Student Assessment that have come to prominence over the past 25 years owe a great deal in methodological terms to pioneering work by National Assessment of Educational Progress (NAEP). Using TIMSS as an example, this article describes how a number of core techniques, such as matrix sampling, student population sampling, item response theory scaling with population modeling, and resampling methods for variance estimation, have been adapted and implemented in an international context and are fundamental to the international assessment effort. In addition to the methodological contributions of NAEP, this article illustrates how the large-scale international assessments go beyond measuring student achievement by representing important aspects of community, home, school, and classroom contexts in ways that can be used to address issues of importance to researchers and policymakers.


2020 ◽  
pp. 249-263
Author(s):  
Luisa Araújo ◽  
Patrícia Costa ◽  
Nuno Crato

Abstract. This chapter provides a short description of what the Programme for International Student Assessment (PISA) measures and how it measures it. First, it details the concepts associated with the measurement of student performance and the concepts associated with capturing student and school characteristics and explains how they compare with some other International Large-Scale Assessments (ILSA). Second, it provides information on the assessment of reading, the main domain in PISA 2018. Third, it provides information on the technical aspects of the measurements in PISA. Lastly, it offers specific examples of PISA 2018 cognitive items, corresponding domains (mathematics, science, and reading), and related performance levels.


2021 ◽  
Author(s):  
Alexander Robitzsch ◽  
Oliver Lüdtke

International large-scale assessments (LSAs) such as the Programme for International Student Assessment (PISA) provide important information about the distribution of student proficiencies across a wide range of countries. The repeated assessments of these content domains offer policymakers important information for evaluating educational reforms and receive considerable attention from the media. Furthermore, the analytical strategies employed in LSAs often define methodological standards for applied researchers in the field. Hence, it is vital to critically reflect on the conceptual foundations of analytical choices in LSA studies. This article discusses methodological challenges in selecting and specifying the scaling model used to obtain proficiency estimates from the individual student responses in LSA studies. We distinguish design-based inference from model-based inference. It is argued that for the official reporting of LSA results, design-based inference should be preferred because it allows for a clear definition of the target of inference (e.g., country mean achievement) and is less sensitive to specific modeling assumptions. More specifically, we discuss five analytical choices in the specification of the scaling model: (1) the specification of the functional form of item response functions, (2) the treatment of local dependencies and multidimensionality, (3) the consideration of test-taking behavior for estimating student ability, and the role of country differential item functioning (DIF) for (4) cross-country comparisons and (5) trend estimation. This article's primary goal is to stimulate discussion about recently implemented changes and suggested refinements of the scaling models in LSA studies.


Methodology ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. 22-38
Author(s):  
Jason C. Immekus

Within large-scale international studies, the utility of survey scores to yield meaningful comparative data hinges on the degree to which their item parameters demonstrate measurement invariance (MI) across compared groups (e.g., culture). To date, methodological challenges have restricted the ability to test the measurement invariance of item parameters of these instruments in the presence of many groups (e.g., countries). This study compares multigroup confirmatory factor analysis (MGCFA) and the alignment method to investigate the MI of the schoolwork-related anxiety survey across gender groups within the 35 Organisation for Economic Co-operation and Development (OECD) countries (gender × country) of the Programme for International Student Assessment 2015 study. Subsequently, the predictive validity of MGCFA and alignment-based factor scores for subsequent mathematics achievement is examined. Considerations related to invariance testing of noncognitive instruments with many groups are discussed.


2017 ◽  
Vol 28 (68) ◽  
pp. 344 ◽  
Author(s):  
Maria de Lourdes Haywanon Santos Araújo ◽  
Robinson Moreira Tenório

Brazilian results in PISA and its (mis)uses

The objective of this study was to analyze how the results of the Program for International Student Assessment (PISA) were used in the Brazilian educational context. The literature review identified assessment as a fundamental factor in improving the quality of education, provided an overview of PISA research in Brazil, and prompted discussion of the need to use the results of large-scale assessments. Based on documentary analysis and semi-structured interviews, it was possible not only to present a study on the use of the PISA results in the country but also to establish categories of use, such as Improper Use or Non-Use, showing the possibilities and difficulties of such use and the role of administrators in this process.

Keywords: PISA; Use of Results; Educational Assessment; Public Policies.


2020 ◽  
Vol 26 (1) ◽  
pp. 20-32 ◽  
Author(s):  
Charlene Tan

This article examines a Confucian conception of competence and its corresponding response to the competencies agenda that underpins international large-scale assessments such as the Programme for International Student Assessment (PISA). It is argued that standardised transnational assessments are underpinned by a technical rationality that emphasises proficiency in discrete skills for their instrumental worth at the expense of moral cultivation and personal mastery. Challenging the competencies agenda, this paper draws upon a relational model of competence proposed by Jones and Moore (1995) that views competence as essentially communal, situated within social practices, and manifested through tacit achievement. A Confucian notion of competence is advocated where skills are premised on the virtue of ren (humanity) and demonstrated through appropriate judgement in everyday settings. A Confucian perspective offers an alternative to the behaviourist and generic notions of performance in global assessments by highlighting the social, cultural and ethical dimensions of competence.

