Stabilization and Task Definition in a Performance Test Battery

1979 ◽  
Vol 23 (1) ◽  
pp. 536-540 ◽  
Author(s):  
Marshall B. Jones

Most tasks show practice effects with repeated administrations, effects that may appear in the group mean, the variance among subjects, or the correlations over subjects among trials or repeated testings. Fortunately, there comes a point in many tasks after which practice no longer produces changes in performance; as we will put it, the task stabilizes. Stabilization in this sense is a key phenomenon for performance testing, the prediction of individual behavior, and the theory of personality. It is also desirable that a task be well-defined, that is, that the average correlation among stabilized trials be high (greater than .80). The paper focuses on differential stability, that is, constancy in the positions of individual subjects relative to one another from one trial to the next. Instability or differential change over a set of consecutive trials may appear either within that set of trials (local change) or between the set and other tasks or preceding trials on the same task (general change). Of the two forms of differential stability or change, the latter, general change, is much the more important. The paper concludes with a brief summary of stabilization and task definition in ten tasks currently under consideration for inclusion in a performance test battery for environmental research.
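
As an illustration of the task-definition criterion described in this abstract (average correlation among stabilized trials greater than .80), here is a minimal Python sketch. It assumes that "average correlation" means the mean of the pairwise Pearson correlations across subjects between stabilized trials; the function names and the scores are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def task_definition(trials):
    """Mean pairwise correlation among trials.

    `trials` is a list of trials; each trial is a list of scores,
    one per subject, with subjects in the same order on every trial.
    """
    pairs = list(combinations(trials, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three stabilized trials, five subjects each.
trials = [
    [10, 12, 15, 18, 20],
    [11, 12, 16, 17, 21],
    [10, 13, 15, 19, 20],
]

avg_r = task_definition(trials)
well_defined = avg_r > 0.80  # threshold quoted in the abstract
print(f"average intertrial correlation: {avg_r:.3f}")
```

In this sketch a high value reflects that subjects keep roughly the same rank order from trial to trial, which is the sense of differential stability the abstract emphasizes.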

1980 ◽  
Vol 24 (1) ◽  
pp. 349-353 ◽  
Author(s):  
Mary M. Harbeson ◽  
Michele Krause ◽  
Robert S. Kennedy

Four memory tests were considered for inclusion in a human performance test battery. The tests were administered to 23 Navy enlisted men for 15 consecutive days. Group means, standard deviations, and cross-session correlations were examined. Two of the tests, Interference Susceptibility and Free Recall, met the initial statistical criteria for inclusion in the test battery. However, the other two tests, Running Recognition and List Differentiation, failed to show sufficient task definition and reliability in their present form. These tests are compared with each other and with previous memory research studies.


1986 ◽  
Vol 63 (2) ◽  
pp. 683-708 ◽  
Author(s):  
Alvah C. Bittner ◽  
Robert C. Carter ◽  
Robert S. Kennedy ◽  
Mary M. Harbeson ◽  
Michele Krause

The goal of the Performance Evaluation Tests for Environmental Research (PETER) Program was to identify a set of measures of human capabilities for use in the study of environmental and other time-course effects. 114 measures studied in the PETER Program were evaluated and categorized into four groups based upon task stability and task definition. The Recommended category contained 30 measures that clearly obtained total stabilization and had an acceptable level of reliability efficiency. The Acceptable-But-Redundant category contained 15 measures. The 37 measures in the Marginal category, which included an inordinate number of slope and other derived measures, usually had desirable features which were outweighed by faults. The 32 measures in the Unacceptable category had either differential instability or weak reliability efficiency. It is our opinion that the 30 measures in the Recommended category should be given first consideration for environmental research applications. Further, it is recommended that information pertaining to preexperimental practice requirements and stabilized reliabilities should be utilized in repeated-measures environmental studies.


1980 ◽  
Vol 51 (3_suppl2) ◽  
pp. 1023-1031 ◽  
Author(s):  
D. M. Seales ◽  
R. S. Kennedy ◽  
A. C. Bittner

A paper-and-pencil test of simple arithmetic ability was exceptionally well suited for inclusion in a battery of Performance Evaluation Tests for Environmental Research (PETER). Mean performance stabilized after nine days of baseline testing. Variance was constant throughout 15 days of baseline testing. “Task definition” was high, and “differential stability” was present from the outset. Subjects apparently came to this test with well established differential levels of arithmetic ability.


1979 ◽  
Vol 23 (1) ◽  
pp. 508-512 ◽  
Author(s):  
D. M. Seales ◽  
R. S. Kennedy ◽  
A. C. Bittner

A paper-and-pencil test of simple arithmetic ability was found to be exceptionally well suited for inclusion in a battery of Performance Evaluation Tests for Environmental Research (PETER). Mean performance stabilized after nine days of baseline testing. Variance was constant throughout fifteen days of baseline testing. “Task definition” was high, and “differential stability” was present from the outset. Subjects apparently came to this test with well established differential levels of arithmetic ability.


1984 ◽  
Vol 28 (1) ◽  
pp. 11-15 ◽  
Author(s):  
A. C. Bittner ◽  
R. C. Carter ◽  
R. S. Kennedy ◽  
M. M. Harbeson ◽  
M. Krause

The goal of the Performance Evaluation Tests for Environmental Research (PETER) Program was to identify a set of measures of human cognitive, perceptual, and motor capabilities for use in the study of environmental and other time-course effects. Tasks were evaluated as suitable for repeated measures applications when their intertrial means, variances and correlations were well-behaved under constant baseline conditions. This report provides an evaluation of 112 test measures studied in the program. They are categorized into four groups based upon joint consideration of task stability and task definition. Thirty test measures were categorized as Good, 15 as Good-But-Redundant, 35 as Ugly (flawed), and 32 as Bad.


Ergonomics ◽  
1996 ◽  
Vol 39 (8) ◽  
pp. 1005-1016 ◽  
Author(s):  
ROBERT S. KENNEDY ◽  
WILLIAM P. DUNLAP ◽  
ALYSIA D. RITTER ◽  
LIGIA M. CHAVEZ

1983 ◽  
Vol 27 (8) ◽  
pp. 674-678 ◽  
Author(s):  
Martin G. Smith ◽  
Michele Krause ◽  
Robert S. Kennedy ◽  
Alvah C. Bittner ◽  
Mary M. Harbeson

Microprocessors in the form of personal computers and home game systems are now widely available at affordable prices. Researchers are rapidly acquiring systems for the collection and analysis of data and recording of results. However, the use of these devices parallels the implementation of the early apparatus-based tests which began their development during World War II. Although increased speed in test administration was gained, the mechanization of traditional tests, at times, resulted in alteration of the behavioral factors studied, as well as difficulties with equipment reliability. Pitfalls to be avoided when considering a test for microprocessor mechanization include: (a) equipment factors, (b) quantitative issues, and (c) their interactions. This report outlines the procedures one should follow when implementing a microprocessor-based performance test battery.


Author(s):  
Amir Golalipour ◽  
Varun Veginati ◽  
David J. Mensching

In the asphalt materials community, the most critical research need is centered around a paradigm shift in mixture design from the volumetric process of the previous 20-plus years to an optimization procedure based on laboratory-measured mechanical properties that should lead to an increase in long-term pavement performance. This study is focused on advancing the state of understanding with respect to the value of intermediate temperature cracking tests, which may be included in a balanced mix design. The materials included are plant-mixed, laboratory-compacted specimens reheated from the 2013 Federal Highway Administration’s (FHWA’s) Accelerated Loading Facility (ALF) study on reclaimed asphalt pavement/reclaimed asphalt shingle (RAP/RAS) materials. Six commonly discussed intermediate temperature (cracking and durability) performance tests (i.e., Asphalt Mixture Performance Tester [AMPT] Cyclic Fatigue, Cantabro, Illinois Flexibility Index Test [I-FIT], Indirect Tensile Cracking [ITC, also known as IDEAL-CT], Indirect Tensile Nflex, and Texas Overlay Test) were selected for use in this study based on input from stakeholders. Test results were analyzed to compare differences between the cracking tests. In addition, statistical analyses were conducted to assess the separation among materials (lanes) for each performance test. Cyclic fatigue and IDEAL-CT tests showed the most promising results. The ranking from these two tests’ index parameters matched closely with ALF field performance. Furthermore, both showed reasonable variability of test data and were successful in differentiating between materials.

