Test Administration and Test Items

2019 ◽  
Vol 45 (2) ◽  
pp. 209-226
Author(s):  
Arnond Sakworawich ◽  
Howard Wainer

Test scoring models vary in their generality; some even adjust for examinees who answer multiple-choice items correctly by accident (guessing), but no model that we are aware of automatically adjusts an examinee’s score when there is internal evidence of cheating. In this study, we use a combination of jackknife technology with an adaptive robust estimator to reduce the bias in examinee scores due to contamination through events such as having access to some of the test items in advance of the test administration. We illustrate our methodology with a data set of test items we knew to have been divulged to a subset of the examinees.
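The abstract does not spell out the estimator, but the general idea, jackknife pseudo-scores combined with a robust, adaptive aggregate so that a block of compromised items cannot inflate the final score, can be sketched as follows. This is a minimal illustration under assumed choices (a delete-one-block jackknife and a MAD-based outlier rule); the function names, block structure, and cutoffs are illustrative and are not the authors' implementation.

```python
import numpy as np

def block_jackknife_scores(responses, n_blocks=5):
    """Delete-one-block jackknife: proportion correct with each contiguous
    block of items left out in turn."""
    responses = np.asarray(responses, dtype=float)
    blocks = np.array_split(responses, n_blocks)
    total, n = responses.sum(), responses.size
    return np.array([(total - b.sum()) / (n - b.size) for b in blocks])

def adaptive_robust_score(responses, n_blocks=5, z_cut=2.5):
    """If deleting one block drops the score far below the other pseudo-scores
    (internal evidence that the block is inflating the score, e.g. items seen
    in advance), report the score without that block; otherwise report the
    ordinary proportion correct."""
    responses = np.asarray(responses, dtype=float)
    pseudo = block_jackknife_scores(responses, n_blocks)
    med = np.median(pseudo)
    mad = np.median(np.abs(pseudo - med)) or 1e-9   # crude floor to avoid division by zero
    z = (pseudo - med) / (1.4826 * mad)
    suspect = int(np.argmin(z))
    if z[suspect] < -z_cut:
        return pseudo[suspect]        # score with the suspect block removed
    return responses.mean()

# Toy example: the first 4-item block was divulged and answered perfectly,
# while the rest of the test is near chance for this examinee.
resp = np.array([1, 1, 1, 1,  0, 1, 0, 0,  1, 0, 0, 0,  0, 0, 1, 0,  0, 1, 0, 0])
print(resp.mean())                 # naive score, pulled up by the leaked block
print(adaptive_robust_score(resp)) # adjusted score with the suspect block removed
```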


2019 ◽  
Vol 30 (08) ◽  
pp. 694-702
Author(s):  
Maria E. Pomponio ◽  
Stephanie Nagle ◽  
Jennifer L. Smart ◽  
Shannon Palmer

There is currently no widely accepted objective method for identifying (central) auditory processing disorder ([C]APD). Audiologists often rely on behavioral test methods to diagnose (C)APD, which can be highly subjective. This is problematic in light of literature reporting a lack of adequate graduate-level preparation related to (C)APD, and it is further exacerbated when test procedures are varied from those used to standardize tests of (C)APD, resulting in higher test variability. The consequences of modifying test administration and scoring methods for tests of (C)APD are not currently documented in the literature.

This study aims to examine the effect on test outcome of varying test administration and scoring procedures from those used to standardize tests of (C)APD.

This study used a repeated-measures design in which all participants were evaluated in all test conditions. The effects of varying the number of test items administered and of repeating missed test items on the test outcome score were assessed for the frequency patterns test (FPT), the competing sentences test (CST), and the low-pass filtered speech test (LPFST). For the CST only, two scoring methods (a strict and a lax criterion) were used to determine whether scoring method affected test outcome.

Thirty-three native English-speaking adults served as participants. All participants had normal hearing (defined as thresholds of 25 dB HL or better) at all octave frequencies from 500 to 4000 Hz, with thresholds of 55 dB HL or better at 8000 Hz. All participants had normal cognitive function as assessed by the Mini-Mental State Examination.

Paired-samples t-tests were used to evaluate differences in test outcome when the CST scoring method was varied. A 3 × 2 × 2 repeated-measures factorial analysis of variance (ANOVA) was used to determine the effects of test, length, and repetitions on outcome score across all three tests of auditory processing ability. Individual 2 × 2 repeated-measures ANOVAs were subsequently conducted for each test to further evaluate interactions.

There was no effect of scoring method on the CST outcome. There was a significant main effect of repetition use for the FPT and LPFST, such that test scores were higher when corrected for repetitions. An interaction between test length and repetitions was found for the LPFST only, such that the effect of repetition use was greater when a shorter test was administered than when a longer test was administered.

Test outcome may be affected when test administration procedures are varied from those used to standardize the test, raising the broader possibility that the overall diagnosis of (C)APD may be affected as well.
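For readers who want to reproduce this kind of analysis in outline, the sketch below pairs scipy's paired-samples t-test with statsmodels' repeated-measures ANOVA. The file names and column names (subject, test, length, repetition, score, strict, lax) are assumptions for illustration, not the study's actual data layout or analysis code.

```python
import pandas as pd
from scipy.stats import ttest_rel
from statsmodels.stats.anova import AnovaRM

# Long-format data assumed: one row per subject x test x length x repetition cell.
df = pd.read_csv("apd_scores_long.csv")

# Paired-samples t-test: CST scored with the strict vs. lax criterion.
cst = pd.read_csv("cst_scoring.csv")   # assumed columns: subject, strict, lax
t, p = ttest_rel(cst["strict"], cst["lax"])
print(f"CST scoring method: t = {t:.2f}, p = {p:.3f}")

# 3 x 2 x 2 repeated-measures ANOVA: test (FPT/CST/LPFST) x length x repetition use.
aov = AnovaRM(df, depvar="score", subject="subject",
              within=["test", "length", "repetition"]).fit()
print(aov.anova_table)

# Follow-up 2 x 2 repeated-measures ANOVAs within each test to unpack interactions.
for name, sub in df.groupby("test"):
    aov2 = AnovaRM(sub, depvar="score", subject="subject",
                   within=["length", "repetition"]).fit()
    print(name)
    print(aov2.anova_table)
```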


2012 ◽  
Vol 17 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Rosalind Potts ◽  
Robin Law ◽  
John F. Golding ◽  
David Groome

Retrieval-induced forgetting (RIF) refers to the finding that the retrieval of an item from memory impairs the retrieval of related items. The extent to which this impairment is found in laboratory tests varies between individuals, and recent studies have reported an association between individual differences in the strength of the RIF effect and other cognitive and clinical factors. The present study investigated the reliability of these individual differences in the RIF effect. A RIF task was administered to the same individuals on two occasions (sessions T1 and T2), one week apart. For Experiments 1 and 2 the final retrieval test at each session made use of a category-cue procedure, whereas Experiment 3 employed category-plus-letter cues, and Experiment 4 used a recognition test. In Experiment 2 the same test items that were studied, practiced, and tested at T1 were also studied, practiced, and tested at T2, but for the remaining three experiments two different item sets were used at T1 and T2. A significant RIF effect was found in all four experiments. A significant correlation was found between RIF scores at T1 and T2 in Experiment 2, but for the other three experiments the correlations between RIF scores at T1 and T2 failed to reach significance. This study therefore failed to find clear evidence for reliable individual differences in RIF performance, except where the same test materials were used for both test sessions. These findings have important implications for studies involving individual differences in RIF performance.
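The reliability question here comes down to computing each participant's RIF effect (baseline recall minus recall of unpracticed items from practiced categories) in each session and correlating the two. A minimal sketch, assuming a wide-format table with hypothetical column names rather than the study's actual data layout, might look like this:

```python
import pandas as pd
from scipy.stats import pearsonr

# Wide-format table assumed, one row per participant, with recall proportions:
#   nrp_t1, rp_minus_t1 = baseline (Nrp) and unpracticed-related (Rp-) items at T1
#   nrp_t2, rp_minus_t2 = the same quantities at T2
df = pd.read_csv("rif_recall.csv")

# RIF effect per participant and session: baseline minus Rp- recall.
rif_t1 = df["nrp_t1"] - df["rp_minus_t1"]
rif_t2 = df["nrp_t2"] - df["rp_minus_t2"]

# Test-retest reliability of the individual-difference measure.
r, p = pearsonr(rif_t1, rif_t2)
print(f"T1-T2 correlation of RIF scores: r = {r:.2f}, p = {p:.3f}")
```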


1982 ◽  
Vol 27 (12) ◽  
pp. 966-967
Author(s):  
Jason Millman
