Reflections on Analytical Choices in the Scaling Model for Test Scores in International Large-Scale Assessment Studies

2021 ◽  
Author(s):  
Alexander Robitzsch ◽  
Oliver Lüdtke

International large-scale assessments (LSAs) such as the Programme for International Student Assessment (PISA) provide important information about the distribution of student proficiencies across a wide range of countries. The repeated assessment of content domains offers policymakers important information for evaluating educational reforms and receives considerable attention from the media. Furthermore, the analytical strategies employed in LSAs often define methodological standards for applied researchers in the field. Hence, it is vital to critically reflect on the conceptual foundations of analytical choices in LSA studies. This article discusses methodological challenges in selecting and specifying the scaling model used to obtain proficiency estimates from the individual student responses in LSA studies. We distinguish design-based inference from model-based inference. It is argued that for the official reporting of LSA results, design-based inference should be preferred because it allows for a clear definition of the target of inference (e.g., country mean achievement) and is less sensitive to specific modeling assumptions. More specifically, we discuss five analytical choices in the specification of the scaling model: (1) the specification of the functional form of item response functions, (2) the treatment of local dependencies and multidimensionality, (3) the consideration of test-taking behavior for estimating student ability, and the role of country differential item functioning (DIF) for (4) cross-country comparisons and (5) trend estimation. This article's primary goal is to stimulate discussion about recently implemented changes and suggested refinements of the scaling models in LSA studies.
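To make the first of these analytical choices concrete, here is a minimal sketch, in our own notation rather than the article's, of two common functional forms for an item response function; the choice between them is exactly the kind of modeling decision the article discusses:

```latex
% Probability that student p answers item i correctly, given ability \theta_p.
% 1PL (Rasch) model: all items share the same discrimination.
P(X_{pi} = 1 \mid \theta_p) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}

% 2PL model: each item additionally has its own discrimination parameter a_i.
P(X_{pi} = 1 \mid \theta_p) = \frac{\exp\bigl(a_i(\theta_p - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_p - b_i)\bigr)}
```

Country means estimated under these two forms can differ, which is why the functional form matters for official reporting.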

2018 ◽  
Vol 26 (2) ◽  
pp. 196-212 ◽  
Author(s):  
Kentaro Yamamoto ◽  
Mary Louise Lennon

Purpose
Fabricated data jeopardize the reliability of large-scale population surveys and reduce the comparability of such efforts by destroying the linkage between data and measurement constructs. Such data result in the loss of comparability across participating countries and, in the case of cyclical surveys, between past and present surveys. This paper aims to describe how data fabrication can be understood in the context of the complex processes involved in the collection, handling, submission and analysis of large-scale assessment data. The actors involved in those processes, and their possible motivations for data fabrication, are also elaborated.

Design/methodology/approach
Computer-based assessments produce new types of information that enable us to detect the possibility of data fabrication, and therefore the need for further investigation and analysis. The paper presents three examples that illustrate how data fabrication was identified and documented in the Programme for the International Assessment of Adult Competencies (PIAAC) and the Programme for International Student Assessment (PISA) and discusses the resulting remediation efforts.

Findings
For two countries that participated in the first round of PIAAC, the data showed a subset of interviewers who handled many more cases than others. In Case 1, the average proficiency for respondents in those interviewers' caseloads was much higher than expected and included many duplicate response patterns. In Case 2, anomalous response patterns were identified. Case 3 presents findings based on data analyses for one PISA country, where results for human-coded responses were shown to be highly inflated compared to past results.

Originality/value
This paper shows how new sources of data, such as timing information collected in computer-based assessments, can be combined with other traditional sources to detect fabrication.
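As a concrete illustration of the kind of screening described in Cases 1 and 2, here is a minimal Python sketch that flags interviewers with outlying caseloads or high shares of duplicate response patterns. The column names (`interviewer_id`, `responses`) and thresholds are hypothetical stand-ins, not PIAAC's actual procedures:

```python
import pandas as pd

def flag_suspicious_interviewers(df: pd.DataFrame,
                                 caseload_z: float = 3.0,
                                 dup_share: float = 0.20) -> pd.DataFrame:
    """Flag interviewers with outlying caseloads or many duplicate response
    patterns. Assumes one row per case, with columns 'interviewer_id' and
    'responses' (the case's full response pattern as a string)."""
    stats = df.groupby("interviewer_id").agg(
        n_cases=("responses", "size"),
        # Share of cases whose full response pattern duplicates another
        # case handled by the same interviewer.
        share_duplicates=("responses", lambda s: s.duplicated().mean()),
    )
    # Standardize caseloads to spot interviewers handling far more cases.
    z = (stats["n_cases"] - stats["n_cases"].mean()) / stats["n_cases"].std()
    stats["flagged"] = (z > caseload_z) | (stats["share_duplicates"] > dup_share)
    return stats.sort_values("share_duplicates", ascending=False)

# Example: flagged = flag_suspicious_interviewers(cases_df)
```

In practice such flags would only prompt further investigation, as the paper stresses, never a verdict on their own.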


2018 ◽  
Vol 21 ◽  
pp. 80-90
Author(s):  
Chandra Mani Paudel ◽  
Ram Chandra Panday

This paper presents results from a systematic review of the literature on large-scale assessment findings in the South Asian context, with a particular focus on Nepal. The main objective of the LEAP programme is to improve the quality of learning in the Asia-Pacific region by building Member States' capacity to collect, analyze and use international and national assessment data to identify learning enablers. The review found that higher-order skills are overshadowed by rote learning. The assessments reviewed also employed Item Response Theory (IRT) to make results comparable and connected with previous levels. International assessments such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS) have collected vast amounts of data on schools, students and households, yet the use of education-related “big data” for evidence-based policymaking remains limited, partly because countries lack the institutional capacity to analyze such data and link results to policies.
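Since the review highlights IRT as the tool for making assessments comparable and connected with previous levels, a minimal sketch of the standard linear linking step may help (our notation, not the paper's): ability and item parameters estimated on one scale are placed onto a reference scale via two linking constants.

```latex
% Linear IRT linking: scale Y is mapped onto reference scale X via
% constants A and B (estimated, e.g., by mean-mean or mean-sigma methods).
\theta^{X} = A\,\theta^{Y} + B, \qquad
b_i^{X} = A\,b_i^{Y} + B, \qquad
a_i^{X} = a_i^{Y} / A
```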


2017 ◽  
Vol 28 (68) ◽  
pp. 512
Author(s):  
Lenice Medeiros ◽  
Alexandre Jaloto ◽  
André Vitor Fernades dos Santos

The science field in international large-scale assessments

This article addresses the pedagogical aspects of international assessments that include the field of science, focusing especially on the latest edition of the Programme for International Student Assessment (PISA) and the Third Regional Comparative and Explanatory Study (TERCE). The conceptual and procedural foundations of these studies are presented and discussed, along with some results on the performance of Brazilian students. On this basis, the article examines the limits and possibilities of using these data to formulate educational policies that affect science teaching, such as Brazil's Common National Curricular Base (BNCC) and the assessments provided for in the National Education Plan (PNE).

Keywords: Large-Scale Assessment; Science Education; PISA; TERCE.


2017 ◽  
Vol 28 (68) ◽  
pp. 478
Author(s):  
Andrea Mara Vieira

Chords and dissonances of the scientific literacy proposed by PISA 2015

Our aim is to investigate whether the academic concept of scientific literacy is in tune with the concept set out in the documents of the Programme for International Student Assessment (PISA) and in Brazilian educational standards. Despite the complexity and conceptual polysemy surrounding the notion of scientific literacy, we develop a theoretical-comparative analysis of the concept as conceived by specialists, comparing it with the concept of scientific literacy underlying the PISA 2015 assessment framework and also considering how it is codified in public educational policies. In the end, we identify fewer chords and, for various reasons, more dissonances, which can contribute to a reflection on the validity and relevance of PISA as an assessment instrument, as well as on the type of learning our educational system should ensure.

Keywords: Scientific Literacy; PISA; Public Policies; Large-Scale Assessment.


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Plamen V. Mirazchiyski

This paper presents the R Analyzer for Large-Scale Assessments (RALSA), a newly developed R package for analyzing data from studies using complex sampling and assessment designs. Such studies are, for example, the IEA's Trends in International Mathematics and Science Study and the OECD's Programme for International Student Assessment. The package covers all cycles from a broad range of studies. The paper presents the architecture of the package and the overall workflow, and illustrates some basic analyses using it. The package is open-source and free of charge. Other software packages for analyzing large-scale assessment data exist, some proprietary, others open-source. However, RALSA is the first comprehensive package designed around the user experience, and it has some distinctive features. One innovation is that the package can convert SPSS data from large-scale assessments into native R data sets. It can also do so for PISA data from cycles prior to 2015, where the data is provided in tab-delimited text files along with SPSS control syntax files. Another feature is the availability of a graphical user interface, which is also written in R and operates in any operating system where a full copy of R can be installed. The output from any analysis function is written into an MS Excel workbook with multiple sheets for the estimates, model statistics, analysis information and the calling syntax itself for reproducing the analysis in the future. The flexible design of RALSA allows for the quick addition of new studies, analysis types and features to the existing ones.


2017 ◽  
Vol 16 (6) ◽  
pp. 869-884
Author(s):  
Christina E Mølstad ◽  
Daniel Pettersson ◽  
Eva Forsberg

This study investigates knowledge structures and scientific communication, using bibliometric methods to explore scientific knowledge production and dissemination. The aim is to develop knowledge about this growing field by investigating studies using international large-scale assessment (ILSA) data, with a specific focus on those using Programme for International Student Assessment (PISA) data. As international organisations use ILSA to measure, assess and compare the success of national education systems, it is important to study this specific knowledge to understand how it is organised and legitimised within research. The findings show an interchange of legitimisation, where major actors from the USA and other English-speaking and westernised countries determine the academic discourse. Important epistemic cultures for PISA research are identified, the most important of which are situated within psychology and education. These two research environments are epicentres created by patterns of the referrals to and referencing of articles that frame the formulation of PISA knowledge. Finally, it is argued that this particular PISA research is self-referential and self-authorising, which raises the question of whether research accountability leads to ‘a game of thrones’: a continuing rivalry within the scientific field, with no obvious winner, over how and on what grounds ‘facts’ and ‘truths’ are constructed.


2021 ◽  
Vol 6 ◽  
Author(s):  
Pan Tang ◽  
Hao Liu ◽  
Hongbo Wen

Collaborative problem solving (CPS) competency is critical in the twenty-first century. The Program for International Student Assessment (PISA) launched a large-scale assessment of CPS competency for the first time in 2015. Beijing, Shanghai, Jiangsu, and Guangdong in China participated in the assessment and scored an average of 496, slightly below the OECD average of 500 and ranked 25th among the 51 participating countries and economies. This research was therefore conducted to investigate the factors predicting students’ CPS competency and to help students improve it. Most research on CPS has focused on constructing CPS frameworks and evaluating the effectiveness of CPS; research on the factors predicting CPS competency is rare. Accordingly, a hierarchical linear model (HLM) was constructed to investigate the factors predicting students’ CPS competency. The model revealed large differences in students’ CPS competency among schools. In addition, among student-level variables, gender, grade, ESCS (economic, social and cultural status), ICT resources, students’ attitude toward CPS, and teacher unfairness were effective in predicting students’ CPS competency; among school-level variables, school location, schools’ ESCS and the proportion of fully certified teachers predicted students’ CPS competency positively. The findings imply that, to enhance students’ CPS competency, CPS training should be integrated across all subjects; schools should employ fully qualified teachers; teachers should treat each student fairly; and students should be provided with more ICT resources.
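For readers unfamiliar with the method, here is a minimal sketch of such a two-level model in Python using statsmodels, with students nested in schools; the file and column names are hypothetical stand-ins for the PISA 2015 variables named above, not the authors' actual code:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical student-level file: one row per student, with a school
# identifier and the predictors named in the abstract.
df = pd.read_csv("pisa2015_cps_china.csv")

# Two-level HLM: a random intercept per school captures between-school
# differences in CPS competency; the fixed effects are the student-level
# predictors (school-level predictors would enter the same formula).
model = smf.mixedlm(
    "cps_score ~ gender + grade + escs + ict_resources"
    " + attitude_cps + teacher_unfairness",
    data=df,
    groups=df["school_id"],
)
result = model.fit()
print(result.summary())

# The school-intercept variance relative to total variance gives the
# intraclass correlation: the share of variance lying between schools.
```

The random intercept is what allows the model to quantify the large between-school differences the authors report.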


2020 ◽  
Vol 84 (1) ◽  
pp. 109-133
Author(s):  
Nurullah Erylmaz ◽  
Mauricio Rivera-Gutiérrez ◽  
Andrés Sandoval-Hernández

It has been claimed that the socio-economic background scales used in International Large-Scale Assessments (ILSAs) lack theory-driven constructs and cross-country comparability. To address these issues, a new socio-economic background scale was first created based on Pierre Bourdieu’s cultural reproduction theory, which distinguishes economic, cultural and social capital. Second, the measurement invariance of this construct was tested across countries participating in the Programme for International Student Assessment (PISA). After dividing the countries that participated in PISA 2015 into three groups, i.e., Latin American, European, and Asian, a Multi-Group Confirmatory Factor Analysis was carried out to examine the measurement invariance of this new socio-economic scale. The results revealed that the questionnaire measuring socio-economic background was not fully invariant in the analysis involving all countries. However, when more homogeneous groups were analysed, measurement invariance was verified at the metric level, except for the group of Latin American countries. Finally, implications for policymakers and recommendations for future studies are discussed.
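As a compact reminder of what these invariance levels test, a sketch in standard CFA notation (ours, not the article's): for indicator x of the latent socio-economic factor in country group g,

```latex
% Configural invariance: the same factor structure holds in every group g.
x_{ig} = \tau_{ig} + \lambda_{ig}\,\xi_{g} + \delta_{ig}

% Metric invariance: factor loadings are equal across groups
% (the level verified here for the European and Asian groups).
\lambda_{ig} = \lambda_{i} \quad \text{for all } g

% Scalar invariance: intercepts are also equal across groups,
% a prerequisite for comparing latent means across countries.
\tau_{ig} = \tau_{i} \quad \text{for all } g
```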

