Marginal likelihood inference for a model for item responses and response times

2010 ◽  
Vol 63 (3) ◽  
pp. 603-626 ◽  
Author(s):  
Cees A. W. Glas ◽  
Wim J. van der Linden


2020 ◽  
Article 001316442096863 ◽  
Author(s):  
Kaiwen Man ◽  
Jeffrey R. Harring

Many approaches have been proposed to jointly analyze item responses and response times to understand behavioral differences between normally and aberrantly behaved test-takers. Biometric information, such as data from eye trackers, can be used alongside these more conventional data types to better identify deviant testing behaviors. In this context, this study demonstrates a new method for multiple-group analysis that concurrently models item responses, response times, and visual fixation counts collected with an eye tracker. It is hypothesized that differences in behavioral patterns between normally behaved test-takers and those with different levels of preknowledge about the test items will manifest in the latent characteristics of the different data types. A Bayesian estimation scheme is used to fit the proposed model to experimental data, and the results are discussed.
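
A minimal sketch of what such a joint likelihood could look like, assuming a 2PL model for responses, a lognormal model for response times, and a Poisson model for fixation counts, linked through separate person parameters; all names and distributional choices here are illustrative, not the authors' specification:

```python
import numpy as np
from scipy import stats

def joint_loglik(y, logt, g, theta, tau, zeta,
                 a, b, alpha, beta, gamma, delta):
    """Joint log-likelihood for one examinee across items (illustrative).

    y     : 0/1 item responses
    logt  : log response times
    g     : visual fixation counts
    theta, tau, zeta : latent ability, speed, and visual engagement
    a, b             : 2PL discrimination / difficulty
    alpha, beta      : time discrimination / time intensity
    gamma, delta     : count intercept / loading (log link)
    """
    # 2PL response model
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    ll_resp = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Lognormal response-time model: log T_j ~ N(beta_j - tau, alpha_j^-2)
    ll_time = np.sum(stats.norm.logpdf(logt, loc=beta - tau, scale=1.0 / alpha))
    # Poisson fixation-count model with a log link
    ll_fix = np.sum(stats.poisson.logpmf(g, np.exp(gamma + delta * zeta)))
    return ll_resp + ll_time + ll_fix
```

In a multiple-group Bayesian analysis along these lines, group-specific priors on (theta, tau, zeta) would carry the hypothesized behavioral differences between normally behaved examinees and those with preknowledge.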


2021 ◽  
Vol 12 ◽  
Author(s):  
Denise Reis Costa ◽  
Maria Bolsinova ◽  
Jesper Tijmstra ◽  
Björn Andersson

Log-file data from computer-based assessments can provide useful collateral information for estimating student abilities and can thereby improve traditional approaches that consider only response accuracy. Using the time students spent on 10 mathematics items from PISA 2012, this study evaluated the overall changes in ability estimates and their measurement precision, and explored country-level heterogeneity, when item responses and time-on-task measurements are combined in a joint framework. Our findings suggest a notable increase in precision when response times are incorporated and indicate differences between countries both in how respondents approached the items and in their response processes. Results also showed that additional information can be captured through the modeling structure when response times are included, although such information may not reflect the objective of the test.
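
Joint frameworks of this kind are often specified in the spirit of van der Linden's hierarchical model; a sketch of one common variant, not necessarily the exact specification used in this study:

```latex
\Pr(Y_{ij}=1 \mid \theta_i) = \frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}}, \qquad
\log T_{ij} \mid \tau_i \sim \mathcal{N}\!\left(\beta_j - \tau_i,\; \alpha_j^{-2}\right), \qquad
(\theta_i, \tau_i)^\top \sim \mathcal{N}_2(\boldsymbol{\mu}, \boldsymbol{\Sigma}).
```

The precision gain for ability \(\theta_i\) arises through the correlation between \(\theta_i\) and speed \(\tau_i\) in \(\boldsymbol{\Sigma}\): observed response times inform \(\tau_i\), which in turn informs \(\theta_i\) whenever the two are correlated, and country-level heterogeneity can be accommodated by letting \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\) vary across groups.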


2019 ◽  
Vol 79 (5) ◽  
pp. 931-961 ◽  
Author(s):  
Cengiz Zopluoglu

Machine-learning methods are now used routinely across many fields, but relatively few studies have applied them to detecting fraud in testing. This study provides a technical review of a recently developed state-of-the-art algorithm, Extreme Gradient Boosting (XGBoost), and investigates its utility for detecting examinees with potential item preknowledge using a real data set that includes examinees who engaged in fraudulent testing behavior, such as illegally obtaining live test content before the exam. Four XGBoost models were trained on different sets of input features based on (a) only dichotomous item responses, (b) only nominal item responses, (c) both dichotomous item responses and response times, and (d) both nominal item responses and response times. The predictive performance of each model was evaluated using the area under the receiver operating characteristic curve and several classification measures, such as the false-positive rate, true-positive rate, and precision. For comparison, results from two person-fit statistics on the same data set are also provided. The results indicated that XGBoost successfully classified honest test-takers and fraudulent test-takers with item preknowledge. In particular, classification performance was reasonably good when both item responses and response times were taken into account.
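
A minimal sketch of the training-and-evaluation loop described above, using the Python xgboost and scikit-learn packages; the feature matrix, labels, and hyperparameters are placeholders rather than the study's data or settings:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, recall_score, precision_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 80))    # placeholder: e.g., responses + response times
y = rng.integers(0, 2, size=1000)  # placeholder: 1 = preknowledge, 0 = honest

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=300, max_depth=4,
                      learning_rate=0.1, eval_metric="auc")
model.fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]
preds = (scores >= 0.5).astype(int)
print("AUC      :", roc_auc_score(y_te, scores))
print("TPR      :", recall_score(y_te, preds))    # true-positive rate
print("Precision:", precision_score(y_te, preds))
```

One such model would be fit per feature set (a) through (d), and the resulting AUCs and classification measures compared.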


2020 ◽  
Author(s):  
Benjamin Domingue ◽  
Klint Kanopka ◽  
Ben Stenhaug ◽  
Jim Soland ◽  
Megan Kuhfeld ◽  
...  

As our ability to collect data about respondents increases, approaches for incorporating ancillary data features such as response time are of heightened interest. Models for response time have been advanced, but relatively few large-scale empirical investigations have been conducted. We take advantage of a unique and massive dataset, roughly a quarter of a billion item responses with associated response times from computer adaptive administrations of the NWEA MAP Growth assessment in two states, to shed light on emergent features of response time behavior. We focus on two behaviors in particular. The first, response acceleration, is a reduction in response time for responses that occur relatively late in the assessment. These reductions are heterogeneous as a function of estimated ability (lower ability estimates are associated with greater acceleration), and reductions in response time on later items lead to reductions in accuracy relative to expectation. We also document variation in the interplay between speed and accuracy: in some cases, additional time spent on an item is associated with an increase in accuracy; in others, the opposite is true. This finding has potential connections to the nascent literature on within-person differences in response processes. We argue that our approach may be useful in other settings and that the behaviors observed here should be of interest in other datasets.
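
A minimal sketch of how response acceleration could be screened for in data of this kind, assuming a long-format table with hypothetical column names (person, serial position, log response time); this illustrates the idea, not the authors' analysis code:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "person":   np.repeat(np.arange(200), 40),
    "position": np.tile(np.arange(1, 41), 200),
    "log_rt":   rng.normal(3.0, 0.5, size=200 * 40),
})

# Mean log response time by serial position; a downward trend at late
# positions is the signature of response acceleration.
trend = df.groupby("position")["log_rt"].mean()
print("late - early mean log RT:",
      trend.iloc[-5:].mean() - trend.iloc[:5].mean())
```

Stratifying this contrast by estimated ability, and regressing accuracy residuals (observed minus model-expected correctness) on position, would probe the heterogeneity and accuracy effects described above.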


Psychometrika ◽  
2021 ◽  
Author(s):  
Udo Boehm ◽  
Maarten Marsman ◽  
Han L. J. van der Maas ◽  
Gunter Maris

The emergence of computer-based assessments has made response times, in addition to response accuracies, available as a source of information about test takers’ latent abilities. The development of substantively meaningful accounts of the cognitive process underlying item responses is critical to establishing the validity of psychometric tests. However, existing substantive theories such as the diffusion model have been slow to gain traction due to their unwieldy functional form and regular violations of model assumptions in psychometric contexts. In the present work, we develop an attention-based diffusion model based on process assumptions that are appropriate for psychometric applications. This model is straightforward to analyse using Gibbs sampling and can be readily extended. We demonstrate our model’s good computational and statistical properties in a comparison with two well-established psychometric models.
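
For context, the classical diffusion model referenced here treats each response as the first passage of a noisy evidence-accumulation process between two boundaries. A minimal Euler-discretized simulation of that standard Wiener process (a sketch, not the authors' attention-based variant):

```python
import numpy as np

def simulate_diffusion(v=0.8, a=1.5, z=0.75, dt=1e-3, max_t=10.0, rng=None):
    """Simulate one trial: drift v, boundary separation a, start point z."""
    rng = rng if rng is not None else np.random.default_rng()
    x, t = z, 0.0
    sqrt_dt = np.sqrt(dt)
    while 0.0 < x < a and t < max_t:
        x += v * dt + sqrt_dt * rng.standard_normal()  # unit diffusion noise
        t += dt
    return int(x >= a), t  # (upper-boundary response, response time)

rng = np.random.default_rng(42)
sims = [simulate_diffusion(rng=rng) for _ in range(2000)]
print("P(upper):", np.mean([r for r, _ in sims]))
print("mean RT :", np.mean([t for _, t in sims]))
```

The "unwieldy functional form" mentioned above refers to the infinite-series first-passage-time density of this process; the attention-based formulation is constructed so that Gibbs sampling remains tractable.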

