Mathematics assessment integrity during lockdown: experiences in running online un-invigilated exams

Author(s):  
Steven Richardson

2021 ◽  
pp. 073428292110105
Author(s):  
Semirhan Gökçe ◽  
Giray Berberoğlu ◽  
Craig S. Wells ◽  
Stephen G. Sireci

The 2015 Trends in International Mathematics and Science Study (TIMSS) involved 57 countries and 43 different languages in assessing students' achievement in mathematics and science. The purpose of this study is to evaluate whether items and test scores are affected as the differences between language families and cultures increase. Using differential item functioning (DIF) procedures, we compared the consistency of students' performance across three combinations of languages and countries: (a) same language but different countries, (b) same country but different languages, and (c) different languages and different countries. The analyses covered the number of DIF items detected in all paired comparisons within each condition, the direction of DIF, the magnitude of DIF, and the differences between test characteristic curves. The more distant the countries were with respect to culture and language family, the more DIF was present. The magnitude of DIF was greatest when both language and country differed, and smallest when the language was the same but the countries differed. The results suggest that when TIMSS results are compared across countries, language- and country-specific differences, which could reflect cultural, curricular, or other factors, should be taken into account.
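The abstract does not name the specific DIF procedure used, so as a rough illustration only, the sketch below implements the classical Mantel-Haenszel DIF statistic for a single dichotomous item; the function name and the simulated data are ours, not the study's.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel DIF statistic for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching criterion (e.g. total test score)
    group : 0 = reference group, 1 = focal group
    Returns the ETS delta-MH; negative values flag DIF against the
    focal group, and |delta| > 1.5 is conventionally "large" DIF.
    """
    num = den = 0.0
    for s in np.unique(total):                        # stratify on the score
        m = total == s
        a = np.sum((group[m] == 0) & (item[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    return float("nan") if den == 0 else -2.35 * np.log(num / den)

# Simulated example: the item is 0.3 logits harder for the focal group.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, 2000)
theta = rng.normal(0.0, 1.0, 2000)
p_item = 1 / (1 + np.exp(-(theta - 0.3 * group)))
item = (rng.random(2000) < p_item).astype(int)
total = item + rng.binomial(20, 1 / (1 + np.exp(-theta)))
print(round(mantel_haenszel_dif(item, total, group), 2))  # negative delta
```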


2010 ◽  
Vol 35 (2) ◽  
pp. 174-193 ◽  
Author(s):  
Matthias von Davier ◽  
Sandip Sinharay

This article presents an application of a stochastic approximation expectation maximization (EM) algorithm that uses a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models extend item response theory (IRT) to a latent variable model in which covariates serve as predictors of the conditional distribution of ability. Applications to estimating latent regression models for National Assessment of Educational Progress (NAEP) data from the 2000 Grade 4 mathematics assessment and the 2002 Grade 8 reading assessment are presented, and the results of the proposed method are compared with results obtained using current operational procedures.
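For readers unfamiliar with the approach, here is a minimal stochastic-EM sketch with a Metropolis-Hastings step for a latent regression, assuming a simple 2PL model with known item parameters. It is a toy illustration under our own simulated data, not the authors' operational NAEP implementation, which uses more elaborate models and estimates item parameters as well.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate responses under a 2PL model with a latent regression:
# theta_i ~ N(x_i' beta, sigma^2); item parameters treated as known.
N, J = 1000, 15
a = rng.uniform(0.8, 2.0, J)                         # discriminations
b = rng.normal(0.0, 1.0, J)                          # difficulties
X = np.column_stack([np.ones(N), rng.integers(0, 2, N)])
beta_true = np.array([0.2, 0.5])
theta = X @ beta_true + rng.normal(0.0, 1.0, N)
P = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
Y = (rng.random((N, J)) < P).astype(int)

def loglik(th):
    """2PL log-likelihood of each person's response vector at ability th."""
    p = 1 / (1 + np.exp(-a * (th[:, None] - b)))
    return np.sum(Y * np.log(p) + (1 - Y) * np.log(1 - p), axis=1)

# Stochastic EM: one MH draw of theta per person (E-step),
# then a least-squares update of the regression (M-step).
beta, sigma, th = np.zeros(2), 1.0, np.zeros(N)
for _ in range(300):
    prop = th + rng.normal(0.0, 0.5, N)              # symmetric proposal
    mu = X @ beta
    log_r = (loglik(prop) - loglik(th)
             - (prop - mu) ** 2 / (2 * sigma ** 2)
             + (th - mu) ** 2 / (2 * sigma ** 2))
    jump = np.log(rng.random(N)) < log_r             # MH accept/reject
    th = np.where(jump, prop, th)
    beta = np.linalg.lstsq(X, th, rcond=None)[0]
    sigma = np.std(th - X @ beta)

print(np.round(beta, 2), round(float(sigma), 2))     # roughly recovers beta_true
```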


2021 ◽  
pp. 000494412110374
Author(s):  
Joan Burfitt

The aim of this study was to show that some of the errors students make when responding to mathematics assessment items can indicate progress in the development of conceptual understanding. Partial credit was granted for specific incorrect responses from early-secondary students, and Rasch analysis was used to estimate the difficulty of demonstrating full and partial knowledge of skills associated with the development of proportional reasoning. The errors were confirmed as indicators of progress, and hence of partial knowledge, when the achievement thresholds followed a logical order: the greater the proficiency of the students, the more likely they were to receive a higher score. Taking this partial knowledge into account can enrich descriptions of the likely behaviours of students at the various levels of learning progressions, which can inform teachers' planning of learning activities.
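The thresholds described above arise in Rasch's partial credit model. As a minimal sketch of how partial credit for a specific error can be modelled, the code below computes the score-category probabilities for a single polytomous item; the threshold values are invented for illustration, not taken from the study.

```python
import numpy as np

def pcm_probs(theta, thresholds):
    """Score-category probabilities under Rasch's partial credit model.

    theta      : person ability (logits)
    thresholds : step difficulties delta_1..delta_m; the probability of
                 score k is proportional to exp(sum_{j<=k}(theta - delta_j)).
    """
    steps = np.concatenate([[0.0], np.cumsum(theta - np.asarray(thresholds))])
    p = np.exp(steps - steps.max())          # stabilised softmax
    return p / p.sum()

# An item scored 0/1/2, where score 1 is partial credit for a specific
# incorrect response. Ordered thresholds (-0.5 < 1.0) mean higher-ability
# students are more likely to reach the higher score, i.e. the error
# behaves as a genuine indicator of progress.
for ability in (-1.0, 0.0, 1.5):
    print(ability, np.round(pcm_probs(ability, [-0.5, 1.0]), 2))
```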


Pythagoras ◽  
2009 ◽  
Vol 0 (69) ◽  
Author(s):  
Belinda Huntley ◽  
Johann Engelbrecht ◽  
Ansie Harding

In this study we propose a taxonomy for assessment in mathematics, which we call the assessment component taxonomy, to identify those components of mathematics that can be successfully assessed using alternative assessment formats. Based on the literature on assessment models and taxonomies in mathematics, the taxonomy consists of seven mathematics assessment components, hierarchically ordered by cognitive level and by the nature of the mathematical tasks associated with each component. Using a model that we developed earlier for measuring the quality of mathematics test items, we investigate which of the assessment components can be successfully assessed in the provided response question (PRQ) format, in particular multiple choice questions (MCQs), and which are better assessed in the constructed response question (CRQ) format. The results show that MCQs can be constructed to evaluate higher-order levels of thinking and learning. The conclusion is that MCQs can be used successfully as an assessment format in undergraduate mathematics, more so in some assessment components than in others. Including the PRQ format in all seven assessment components can reduce the large marking loads associated with continuous assessment practices in undergraduate mathematics, without compromising the validity of the assessment.


2020 ◽  
Vol 3 (1) ◽  
pp. 19
Author(s):  
I Gede Wahyu Suwela Antara ◽  
I Komang Sudarma ◽  
I.Ketut Dibia

This study aims to (1) develop a mathematics assessment instrument based on Higher Order Thinking Skills (HOTS) and (2) describe the quality of the instrument. It was a research and development study adapting the 4D model of Thiagarajan, which comprises four steps: (1) define, (2) design, (3) develop, and (4) disseminate. Owing to time constraints, the research was carried out only up to the develop step. The results show that the instrument, consisting of 18 essay test items, is valid and appropriate for use. The instrument's reliability coefficient is 0.659 (high), its average item discrimination is 0.44 (very good), and its average item difficulty is 0.584 (medium). The conclusion is that the assessment instrument is feasible for measuring higher-order thinking skills on the topic of two-dimensional geometry.
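The reliability, discrimination, and difficulty figures above are classical test theory statistics. The sketch below computes the standard versions (Cronbach's alpha, difficulty as the proportion of available marks earned, and corrected item-total discrimination) on simulated essay marks; these are common conventions and may not match the exact indices the authors used.

```python
import numpy as np

def item_analysis(scores, max_mark):
    """Classical item statistics for a polytomous (essay) instrument.

    scores   : (persons x items) matrix of awarded marks
    max_mark : maximum mark per item, used for the difficulty index
    """
    k = scores.shape[1]
    total = scores.sum(axis=1)
    # Cronbach's alpha: internal-consistency reliability
    alpha = (k / (k - 1)) * (1 - scores.var(axis=0, ddof=1).sum()
                             / total.var(ddof=1))
    # Difficulty: mean proportion of the available marks earned
    difficulty = scores.mean(axis=0) / max_mark
    # Discrimination: corrected item-total correlation
    disc = np.array([np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
                     for j in range(k)])
    return alpha, difficulty, disc

# Simulated marks for 120 students on 18 essay items scored 0-4.
rng = np.random.default_rng(2)
ability = rng.normal(0.0, 1.0, 120)
marks = np.clip(np.rint(2 + 1.2 * ability[:, None]
                        + rng.normal(0.0, 1.0, (120, 18))), 0, 4)
alpha, diff, disc = item_analysis(marks, max_mark=4)
print(round(float(alpha), 3), round(float(diff.mean()), 3),
      round(float(disc.mean()), 2))
```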

