A generalized semi-parametric model for jointly analyzing response times and accuracy in computerized testing

Fang Liu; Jiwei Zhang; Ningzhong Shi; Ming-Hui Chen

doi:10.4310/21-sii681

A Multiprocess Item Response Model for Not-Reached Items due to Time Limits and Quitting

Educational and Psychological Measurement ◽

10.1177/0013164419878241 ◽

2019 ◽

Vol 80 (3) ◽

pp. 522-547

Author(s):

Esther Ulitzsch ◽

Matthias von Davier ◽

Steffi Pohl

Keyword(s):

Missing Data ◽

Test Performance ◽

Missing Values ◽

Response Times ◽

Computerized Testing ◽

Item Response Model ◽

Time Limits ◽

Fine Grained ◽

Test Taking ◽

The One

So far, modeling approaches for not-reached items have considered one single underlying process. However, missing values at the end of a test can occur for a variety of reasons. On the one hand, examinees may not reach the end of a test due to time limits and lack of working speed. On the other hand, examinees may not attempt all items and quit responding due to, for example, fatigue or lack of motivation. We use response times retrieved from computerized testing to distinguish missing data due to lack of speed from missingness due to quitting. On the basis of this information, we present a new model that makes it possible to disentangle and simultaneously model different missing data mechanisms underlying not-reached items. The model (a) supports a more fine-grained understanding of the processes underlying not-reached items and (b) allows different sources describing test performance to be disentangled. In a simulation study, we evaluate estimation of the proposed model. In an empirical study, we show what insights can be gained regarding test-taking behavior using this model.

A Semiparametric Model for Jointly Analyzing Response Times and Accuracy in Computerized Testing

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998612461831 ◽

2013 ◽

Vol 38 (4) ◽

pp. 381-417 ◽

Cited By ~ 32

Author(s):

Chun Wang ◽

Zhewen Fan ◽

Hua-Hua Chang ◽

Jeffrey A. Douglas

Keyword(s):

Response Times ◽

Semiparametric Model ◽

Computerized Testing

Evaluation of the validity of the Psychology Experiment Building Language tests of vigilance, auditory memory, and decision making

10.7287/peerj.preprints.1330 ◽

2015 ◽

Author(s):

Brian Piper ◽

Shane T Mueller ◽

Sara Talebzadeh ◽

Min Jung Ki

Keyword(s):

Decision Making ◽

Performance Test ◽

Iowa Gambling Task ◽

Short Term Memory ◽

Response Times ◽

Digit Span ◽

Continuous Performance Test ◽

Auditory Memory ◽

Computerized Testing ◽

Language Tests

Background. The Psychology Experiment Building Language (PEBL) test battery (http://pebl.sourceforge.net/) is a popular application for neurobehavioral investigations. This study evaluated the correspondence between the PEBL and the non-PEBL versions of four executive function tests. Methods. In one cohort, young adults (N = 44) completed both the Conners' Continuous Performance Test (CCPT) and the PEBL CPT (PCPT) with the order counter-balanced. In a second cohort, participants (N = 47) completed a non-computerized (Wechsler) and a computerized (PEBL) Digit Span (WDS or PDS) both Forward and Backward. Participants also completed the Psychological Assessment Resources or the PEBL versions of the Iowa Gambling Task (PARIGT or PEBLIGT). Results. The between-test correlations were moderately high (reaction time r = 0.78, omission errors r = 0.65, commission errors r = 0.66) on the CPT. DS Forward was significantly greater than DS Backward independent of the test modality. The total WDS score was moderately correlated with the PDS (r = 0.56). The PARIGT and the PEBLIGTs showed a very similar pattern for response times across blocks, development of preference for Advantageous over Disadvantageous Decks, and Deck selections. However, the amount of money earned (score – loan) was significantly higher in the PEBLIGT during the last Block. Conclusions. These findings are broadly supportive of the criterion validity of the PEBL measures of sustained attention, short-term memory, and decision making. Select differences between workalike versions of the same test highlight how detailed aspects of implementation may have more important consequences for computerized testing than has been previously acknowledged.

On the Reliability and Validity of a Numerical Reasoning Speed Dimension Derived From Response Times Collected in Computerized Testing

Educational and Psychological Measurement ◽

10.1177/0013164411408412 ◽

2011 ◽

Vol 72 (2) ◽

pp. 245-263 ◽

Cited By ~ 13

Author(s):

Mark L. Davison ◽

Robert Semmes ◽

Lan Huang ◽

Catherine N. Close

Keyword(s):

Response Times ◽

Reliability And Validity ◽

Computerized Testing ◽

Numerical Reasoning

Evaluation of the validity of the Psychology Experiment Building Language tests of vigilance, auditory memory, and decision making

PeerJ ◽

10.7717/peerj.1772 ◽

2016 ◽

Vol 4 ◽

pp. e1772 ◽

Cited By ~ 7

Author(s):

Brian Piper ◽

Shane T. Mueller ◽

Sara Talebzadeh ◽

Min Jung Ki

Keyword(s):

Decision Making ◽

Performance Test ◽

Iowa Gambling Task ◽

Short Term Memory ◽

Response Times ◽

Digit Span ◽

Continuous Performance Test ◽

Auditory Memory ◽

Computerized Testing ◽

Language Tests

Background. The Psychology Experiment Building Language (PEBL) test battery (http://pebl.sourceforge.net/) is a popular application for neurobehavioral investigations. This study evaluated the correspondence between the PEBL and the non-PEBL versions of four executive function tests. Methods. In one cohort, young adults (N = 44) completed both the Conners' Continuous Performance Test (CCPT) and the PEBL CPT (PCPT) with the order counter-balanced. In a second cohort, participants (N = 47) completed a non-computerized (Wechsler) and a computerized (PEBL) Digit Span (WDS or PDS) both Forward and Backward. Participants also completed the Psychological Assessment Resources or the PEBL versions of the Iowa Gambling Task (PARIGT or PEBLIGT). Results. The between-test correlations were moderately high (reaction time r = 0.78, omission errors r = 0.65, commission errors r = 0.66) on the CPT. DS Forward was significantly greater than DS Backward on the WDS (p < .0005) and the PDS (p < .0005). The total WDS score was moderately correlated with the PDS (r = 0.56). The PARIGT and the PEBLIGTs showed a very similar pattern for response times across blocks, development of preference for Advantageous over Disadvantageous Decks, and Deck selections. However, the amount of money earned (score – loan) was significantly higher in the PEBLIGT during the last Block. Conclusions. These findings are broadly supportive of the criterion validity of the PEBL measures of sustained attention, short-term memory, and decision making. Select differences between workalike versions of the same test highlight how detailed aspects of implementation may have more important consequences for computerized testing than has been previously acknowledged.

Evaluation of the validity of the Psychology Experiment Building Language tests of vigilance, auditory memory, and decision making

10.7287/peerj.preprints.1330v1 ◽

2015 ◽

Author(s):

Brian Piper ◽

Shane T Mueller ◽

Sara Talebzadeh ◽

Min Jung Ki

Keyword(s):

Decision Making ◽

Performance Test ◽

Iowa Gambling Task ◽

Short Term Memory ◽

Response Times ◽

Digit Span ◽

Continuous Performance Test ◽

Auditory Memory ◽

Computerized Testing ◽

Language Tests

Background. The Psychology Experiment Building Language (PEBL) test battery (http://pebl.sourceforge.net/) is a popular application for neurobehavioral investigations. This study evaluated the correspondence between the PEBL and the non-PEBL versions of four executive function tests. Methods. In one cohort, young adults (N = 44) completed both the Conners' Continuous Performance Test (CCPT) and the PEBL CPT (PCPT) with the order counter-balanced. In a second cohort, participants (N = 47) completed a non-computerized (Wechsler) and a computerized (PEBL) Digit Span (WDS or PDS) both Forward and Backward. Participants also completed the Psychological Assessment Resources or the PEBL versions of the Iowa Gambling Task (PARIGT or PEBLIGT). Results. The between-test correlations were moderately high (reaction time r = 0.78, omission errors r = 0.65, commission errors r = 0.66) on the CPT. DS Forward was significantly greater than DS Backward independent of the test modality. The total WDS score was moderately correlated with the PDS (r = 0.56). The PARIGT and the PEBLIGTs showed a very similar pattern for response times across blocks, development of preference for Advantageous over Disadvantageous Decks, and Deck selections. However, the amount of money earned (score – loan) was significantly higher in the PEBLIGT during the last Block. Conclusions. These findings are broadly supportive of the criterion validity of the PEBL measures of sustained attention, short-term memory, and decision making. Select differences between workalike versions of the same test highlight how detailed aspects of implementation may have more important consequences for computerized testing than has been previously acknowledged.

Nanostructure of semiconductor quantum dots in a borosilicate glass matrix by complementary use of HREM and AEM

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100176770 ◽

1990 ◽

Vol 48 (4) ◽

pp. 728-729

Author(s):

M.J. Kim ◽

L.C. Liu ◽

S.H. Risbud ◽

R.W. Carpenter

Keyword(s):

Quantum Dots ◽

Borosilicate Glass ◽

Glass Matrix ◽

Optical Switching ◽

Response Times ◽

Fast Response ◽

Processing Technique ◽

Internal Stresses ◽

Electron Hole ◽

Spatial Dimensions

When the size of a semiconductor is reduced by an appropriate materials processing technique to a dimension less than about twice the radius of an exciton in the bulk crystal, the band-like structure of the semiconductor gives way to discrete molecular orbital electronic states. Clusters of semiconductors in a size regime lower than 2R (where R is the exciton Bohr radius; e.g. 3 nm for CdS and 7.3 nm for CdTe) are called Quantum Dots (QD) because they confine optically excited electron-hole pairs (excitons) in all three spatial dimensions. Structures based on QD are of great interest because of fast response times and non-linearity in optical switching applications. In this paper we report the first HREM analysis of the size and structure of CdTe and CdS QD formed by precipitation from a modified borosilicate glass matrix. The glass melts were quenched by pouring on brass plates, and then annealed to relieve internal stresses. QD precipitate particles were formed during subsequent "striking" heat treatments above the glass crystallization temperature, which was determined by differential thermal analysis.

Compact and efficient gas diffusion electrodes based on nanoporous alumina membranes for microfuel cells and gas sensors

The Analyst ◽

10.1039/c9an01882d ◽

2020 ◽

Vol 145 (1) ◽

pp. 122-131 ◽

Cited By ~ 1

Author(s):

Wanda V. Fernandez ◽

Rocío T. Tosello ◽

José L. Fernández

Keyword(s):

Response Times ◽

Hydrogen Oxidation ◽

Fast Response ◽

Gas Diffusion ◽

Limiting Current ◽

Gas Diffusion Electrodes ◽

Nanoporous Alumina ◽

Alumina Membranes ◽

Current Densities ◽

Microfuel Cells

Gas diffusion electrodes based on nanoporous alumina membranes electrocatalyze hydrogen oxidation at high diffusion-limiting current densities with fast response times.

The S-SH Confusion Test and the Effects of Frequency Lowering

Journal of Speech Language and Hearing Research ◽

10.1044/2018_jslhr-h-18-0267 ◽

2019 ◽

Vol 62 (5) ◽

pp. 1486-1505

Author(s):

Joshua M. Alexander

Keyword(s):

Cognitive Processing ◽

Hearing Aids ◽

Response Times ◽

Nonlinear Frequency ◽

Low Frequencies ◽

Frequency Compression ◽

Negative Side ◽

Minimal Word ◽

Low Pass ◽

Negative Side Effects

Purpose. Frequency lowering in hearing aids can cause listeners to perceive [s] as [ʃ]. The S-SH Confusion Test, which consists of 66 minimal word pairs spoken by 6 female talkers, was designed to help clinicians and researchers document these negative side effects. This study's purpose was to use this new test to evaluate the hypothesis that these confusions will increase to the extent that low frequencies are altered. Method. Twenty-one listeners with normal hearing were each tested on 7 conditions. Three were control conditions that were low-pass filtered at 3.3, 5.0, and 9.1 kHz. Four conditions were processed with nonlinear frequency compression (NFC): 2 had a 3.3-kHz maximum audible output frequency (MAOF), with a start frequency (SF) of 1.6 or 2.2 kHz; 2 had a 5.0-kHz MAOF, with an SF of 1.6 or 4.0 kHz. Listeners' responses were analyzed using concepts from signal detection theory. Response times were also collected as a measure of cognitive processing. Results. Overall, [s] for [ʃ] confusions were minimal. As predicted, [ʃ] for [s] confusions increased for NFC conditions with a lower versus higher MAOF and with a lower versus higher SF. Response times for trials with correct [s] responses were shortest for the 9.1-kHz control and increased for the 5.0- and 3.3-kHz controls. NFC response times were also significantly longer as MAOF and SF decreased. The NFC condition with the highest MAOF and SF had statistically shorter response times than its control condition, indicating that, under some circumstances, NFC may ease cognitive processing. Conclusions. Large differences in the S-SH Confusion Test across frequency-lowering conditions show that it can be used to document a major negative side effect associated with frequency lowering. Smaller but significant differences in response times for correct [s] trials indicate that NFC can help or hinder cognitive processing, depending on its settings.

Causal attribution and counterfactual thinking - when does performing one facilitate performance of the other

Swiss Journal of Psychology ◽

10.1024/1421-0185.62.4.209 ◽

2003 ◽

Vol 62 (4) ◽

pp. 209-218

Author(s):

A. N’gbala ◽

N. R. Branscombe

Keyword(s):

Response Times ◽

Causal Attribution ◽

Counterfactual Thinking ◽

The Other

When do causal attribution and counterfactual thinking facilitate one another, and when do the two responses overlap? Undergraduates (N = 78) both explained and undid, in each of two orders, events that were described either with their potential causes or not. The time to perform either response was recorded. Overall, mutation response times were shorter when performed after an attribution was made than before, while attribution response times did not vary as a consequence of sequence. Depending on whether the causes of the target events were described in the scenario or not, respondents undid the actor and assigned causality to another antecedent, or pointed to the actor for both responses. These findings suggest that counterfactual mutation is most likely to be facilitated by attribution, and that mutation and attribution responses are most likely to overlap when no information about potential causes of the event is provided.
