A comparison of computerized adaptive testing and fixed-length short forms for the Prosthetic Limb Users Survey of Mobility (PLUS-M™)

2017 ◽  
Vol 42 (5) ◽  
pp. 476-482 ◽  
Author(s):  
Dagmar Amtmann ◽  
Alyssa M Bamer ◽  
Jiseon Kim ◽  
Fraser Bocell ◽  
Hyewon Chung ◽  
...  

Background: New health status instruments can be administered by computerized adaptive test or short forms. The Prosthetic Limb Users Survey of Mobility (PLUS-M™) is a self-report measure of mobility for prosthesis users with lower limb loss. This study used the PLUS-M to examine advantages and disadvantages of computerized adaptive test and short-form administration. Objectives: To compare scores obtained from the computerized adaptive test to scores obtained from fixed-length short forms (7-item and 12-item) in order to provide guidance to researchers and clinicians on how to select the best form of administration for different uses. Study design: Cross-sectional, observational study. Methods: Individuals with lower limb loss completed the PLUS-M by computerized adaptive test and by short forms. Administration time, correlations between the scores, and standard errors were compared. Results: Scores and standard errors from the computerized adaptive test, 7-item short form, and 12-item short form were highly correlated, and all forms of administration were efficient. The computerized adaptive test required less time to administer than either paper or electronic short forms; however, the time savings were minimal compared to the 7-item short form. Conclusion: Results indicate that the PLUS-M computerized adaptive test is the most efficient and that differences in scores between administration methods are minimal. The main advantage of the computerized adaptive test was more reliable scores at higher levels of mobility compared to the short forms. Clinical relevance: Health-related item banks, like the Prosthetic Limb Users Survey of Mobility (PLUS-M™), can be administered by computerized adaptive testing (CAT) or as fixed-length short forms (SFs). Results of this study will help clinicians and researchers decide whether they should invest in a CAT administration system or whether SFs are more appropriate.

2021 ◽  
Author(s):  
Bryant A Seamon ◽  
Steven A Kautz ◽  
Craig A Velozo

Abstract Objective Administrative burden often prevents clinical assessment of balance confidence in people with stroke. A computerized adaptive test (CAT) version of the Activities-specific Balance Confidence Scale (ABC CAT) can dramatically reduce this burden. The objective of this study was to test balance confidence measurement precision and efficiency in people with stroke with an ABC CAT. Methods We conducted a retrospective cross-sectional simulation study with data from 406 adults approximately 2 months post-stroke in the Locomotor Experience Applied Post-Stroke (LEAPS) trial. Item parameters for CAT calibration were estimated with the Rasch model using a random sample of participants (n = 203). Computer simulation was used with response data from the remaining 203 participants to evaluate the ABC CAT algorithm under varying stopping criteria. We compared estimated levels of balance confidence from each simulation to actual levels predicted from the Rasch model (Pearson correlations and mean standard error (SE)). Results Results from simulations with number of items as a stopping criterion strongly correlated with actual ABC scores (full item, r = 1; 12-item, r = 0.994; 8-item, r = 0.98; 4-item, r = 0.929). Mean SE increased with decreasing number of items administered (full item, SE = 0.31; 12-item, SE = 0.33; 8-item, SE = 0.38; 4-item, SE = 0.49). A precision-based stopping rule (mean SE = 0.5) also strongly correlated with actual ABC scores (r = 0.941) and optimized the trade-off between the number of items administered and precision (mean number of items 4.37, range [4–9]). Conclusions An ABC CAT can determine accurate and precise measures of balance confidence in people with stroke with as few as 4 items. Individuals with lower balance confidence may require a greater number of items (up to 9), which may be attributable to the LEAPS trial's exclusion of more functionally impaired persons.
Impact Statement Computerized adaptive testing can drastically reduce the ABC's test administration time while maintaining accuracy and precision. This should greatly enhance clinical utility, facilitating adoption of clinical practice guidelines in stroke rehabilitation. Lay Summary If you have had a stroke, your physical therapist will likely test your balance confidence. A computerized adaptive test version of the ABC scale can accurately measure balance confidence with as few as 4 questions, which takes much less time.
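The precision-based stopping rule described above — keep administering maximally informative items until the standard error of the ability estimate falls below a threshold — can be sketched in a small simulation. This is an illustrative toy example, not the LEAPS data or the actual ABC CAT algorithm: the 40-item bank, the 0.5 SE threshold on a logit scale, and the EAP estimator are all assumptions made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, b):
    """Rasch model: probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def cat_simulation(true_theta, bank, se_stop=0.5, max_items=20):
    """Adaptively administer items until the SE stopping rule is met."""
    theta = 0.0
    available = list(range(len(bank)))
    used, responses = [], []
    grid = np.linspace(-4, 4, 161)     # ability grid for EAP estimation
    prior = np.exp(-grid**2 / 2)       # standard-normal prior
    se = float("inf")
    for _ in range(min(max_items, len(bank))):
        # Under the Rasch model, the most informative remaining item is
        # the one whose difficulty is closest to the current estimate.
        nxt = min(available, key=lambda i: abs(bank[i] - theta))
        available.remove(nxt)
        used.append(nxt)
        responses.append(rng.random() < p_correct(true_theta, bank[nxt]))
        # Expected-a-posteriori (EAP) update of the ability estimate
        like = np.ones_like(grid)
        for i, r in zip(used, responses):
            p = p_correct(grid, bank[i])
            like *= p if r else (1.0 - p)
        post = like * prior
        post /= post.sum()
        theta = float((grid * post).sum())
        se = float(np.sqrt(((grid - theta) ** 2 * post).sum()))
        if se <= se_stop:              # precision-based stopping rule
            break
    return theta, se, len(used)

bank = np.linspace(-3, 3, 40)          # hypothetical 40-item bank (logits)
theta_hat, se, n_items = cat_simulation(true_theta=0.5, bank=bank)
print(f"estimate={theta_hat:.2f}, SE={se:.2f}, items used={n_items}")
```

As in the study, tightening or loosening `se_stop` trades precision against the number of items a respondent must answer.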


1995 ◽  
Vol 13 (2) ◽  
pp. 151-162 ◽  
Author(s):  
Mary E. Lunz ◽  
Betty Bergstrom

Computerized adaptive testing (CAT) uses a computer algorithm to construct and score the best possible individualized or tailored test for each candidate. The computer also provides an absolute record of all responses and changes to responses, as well as their effects on candidate performance. The detail of the data from computerized adaptive tests makes it possible to track initial responses and response alterations, and their effect on estimated candidate ability measures, as well as the statistical performance of the examination. The purpose of this study was to track the effect of candidate response patterns on a computerized adaptive test. A ninety-item certification examination was divided into nine units of ten items each to track the pattern of initial responses and response alterations on ability estimates and test precision across the nine test units. The precision of the test was affected most by response alterations during early segments of the test. While candidates generally benefited from altering responses, individual candidates showed different patterns of response alterations across test segments. Overall, test precision was minimally affected, suggesting that the tailoring of the CAT is largely robust to response alterations.
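The effect of a response alteration on an ability estimate, as tracked in this study, can be illustrated with a minimal Rasch example. The item difficulties and response pattern below are hypothetical, not from the certification examination; the sketch simply shows how changing one answer from wrong to right shifts the maximum-likelihood ability estimate.

```python
import numpy as np

def rasch_ml_theta(responses, b, iters=50):
    """Maximum-likelihood Rasch ability estimate via Newton-Raphson."""
    theta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))
        grad = np.sum(responses - p)       # score function
        hess = -np.sum(p * (1.0 - p))      # second derivative (always < 0)
        theta -= grad / hess
    return theta

b = np.linspace(-2, 2, 10)                 # hypothetical item difficulties
resp = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 0], dtype=float)
before = rasch_ml_theta(resp, b)
altered = resp.copy()
altered[3] = 1.0                           # candidate changes one answer
after = rasch_ml_theta(altered, b)
print(f"before={before:.2f}, after={after:.2f}")
```

In an operational CAT the alteration would also change which later items are selected, which is exactly the tailoring effect the study examined.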


1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated from empirical data on several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. Model fit showed that only a subset of the items complied sufficiently; these were assembled into well-fitting item banks. In several simulation studies, responses from 5000 simulees were generated according to a computerized adaptive testing procedure, along with person parameters. A general reliability of .80, or a standard error of measurement of .44, was used as a stopping rule to end CAT testing. We also recorded how often each item was used across all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For the 1PL-fitting item banks, most simulees needed more than 20 but fewer than 30 items to reach the pre-set level of measurement error. Testing based on item banks that complied with the 2PL, however, revealed that on average only 10 items were sufficient to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
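The pairing of a reliability of .80 with a standard error of measurement of .44 follows from the convention that, on a standardized latent-trait scale with unit variance, reliability ≈ 1 − SE². A quick check, assuming that convention:

```python
import math

# On a standardized latent-trait scale (variance = 1), reliability and the
# standard error of measurement are linked by rel = 1 - SE**2.
def sem_for_reliability(rel):
    return math.sqrt(1.0 - rel)

se = sem_for_reliability(0.80)
print(round(se, 3))   # → 0.447, which the abstract rounds to .44
```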


1996 ◽  
Vol 21 (4) ◽  
pp. 365-389 ◽  
Author(s):  
Martha L. Stocking

Modern applications of computerized adaptive testing are typically grounded in item response theory (IRT; Lord, 1980 ). While the IRT foundations of adaptive testing provide a number of approaches to adaptive test scoring that may seem natural and efficient to psychometricians, these approaches may be more demanding for test takers, test score users, and interested regulatory institutions to comprehend. An alternative method, based on more familiar equated number-correct scores and identical to that used to score and equate many conventional tests, is explored and compared with one that relies more directly on IRT. It is concluded that scoring adaptive tests using the familiar number-correct score, accompanied by the necessary equating to adjust for the intentional differences in adaptive test difficulty, is a statistically viable, although slightly less efficient, method of adaptive test scoring. To enhance the prospects for enlightened public debate about adaptive testing, it may be preferable to use this more familiar approach. Public attention would then likely be focused on issues more central to adaptive testing, namely, the adaptive nature of the test.
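Stocking's number-correct alternative can be sketched as follows: invert the administered adaptive form's test characteristic curve (the sum of its item response functions) at the observed raw score to recover an ability value, then evaluate the reference form's curve at that ability to obtain the equated number-correct score. The 2PL item parameters below are randomly generated stand-ins, not parameters from any operational test:

```python
import numpy as np

def tcc(theta, a, b):
    """Test characteristic curve: expected number-correct score at theta,
    i.e. the sum of the 2PL item response functions."""
    return float((1.0 / (1.0 + np.exp(-a * (theta - b)))).sum())

def equate_number_correct(raw, a_form, b_form, a_ref, b_ref):
    """Invert the administered form's TCC at the raw score, then read the
    equated number-correct score off the reference form's TCC."""
    grid = np.linspace(-4, 4, 8001)
    curve = np.array([tcc(t, a_form, b_form) for t in grid])
    theta = grid[np.argmin(np.abs(curve - raw))]   # TCC inversion
    return tcc(theta, a_ref, b_ref)

rng = np.random.default_rng(1)
a_ad, b_ad = rng.uniform(0.8, 2.0, 20), rng.uniform(0.0, 1.5, 20)   # harder adaptive form
a_rf, b_rf = rng.uniform(0.8, 2.0, 20), rng.uniform(-1.0, 1.0, 20)  # easier reference form
equated = equate_number_correct(14, a_ad, b_ad, a_rf, b_rf)
print(f"14/20 on the harder form equates to {equated:.1f}/20 on the reference form")
```

The equating step is what compensates for the intentional difficulty differences between adaptive administrations, which is why the same raw score maps to a higher equated score when the administered form was harder.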


2018 ◽  
Vol 25 (13) ◽  
pp. 1791-1799 ◽  
Author(s):  
Brian C Healy ◽  
Jonathan Zurawski ◽  
Cindy T Gonzalez ◽  
Tanuja Chitnis ◽  
Howard L Weiner ◽  
...  

Background: To date, the computerized adaptive testing (CAT) version of the Neuro-Quality of Life (Neuro-QOL) has not been assessed in a large sample of people with multiple sclerosis (MS). Objective: The aim of this study was to assess the associations between the CAT version of the Neuro-QOL and other clinical and patient-reported outcome measures. Methods: Subjects (n = 364) enrolled in SysteMS completed the CAT version of the Neuro-QOL and the 36-Item Short Form Survey (SF-36) within 4 weeks of a clinical exam that included the Multiple Sclerosis Functional Composite-4 (MSFC-4). The correlations between the Neuro-QOL domains and the MSFC-4 subscores and the SF-36 scores were calculated. The changes over time in the Neuro-QOL and the other measures were also examined. Results: The lower extremity functioning score of the Neuro-QOL showed the highest correlations with the MSFC-4 components, including the Timed 25-Foot Walk, 9-Hole Peg Test, and cognitive score. The expected domains of the Neuro-QOL showed high correlations with the SF-36 subscores, and some Neuro-QOL domains were associated with many SF-36 subscores. There was limited longitudinal change in the Neuro-QOL domains over 12 months, and the change was not associated with change on the other measures. Conclusion: The CAT version of the Neuro-QOL shows many of the expected associations with clinical and patient-reported outcome measures.


Author(s):  
Louise C. Mâsse ◽  
Teresia M. O’Connor ◽  
Yingyi Lin ◽  
Sheryl O. Hughes ◽  
Claire N. Tugault-Lafleur ◽  
...  

Abstract Purpose There has been a call to improve the measurement rigour and standardization of food parenting practices measures, and to align the measurement of food parenting practices with the parenting literature. Drawing on an expert-informed conceptual framework assessing three key domains of food parenting practices (autonomy promotion, control, and structure), this study combined factor analytic methods with Item Response Modeling (IRM) methodology to psychometrically validate responses to the Food Parenting Practice item bank. Methods A sample of 799 Canadian parents of 5–12-year-old children completed the Food Parenting Practice item bank (129 items measuring 17 constructs). The factorial structure of the responses to the item bank was assessed with confirmatory factor analysis (CFA), confirmatory bi-factor item analysis, and IRM. Following these analyses, differential item functioning (DIF) and differential response functioning (DRF) analyses were used to test invariance properties by parents' sex, income, and ethnicity. Finally, the efficiency of the item bank was examined using computerized adaptive testing simulations to identify the items to include in a short form. Results Overall, the expert-informed conceptual framework was predominantly supported by the CFA: it retained the same 17 constructs included in the conceptual framework, with the exception of the access/availability and permissive constructs, which were renamed covert control and accommodating the child, respectively, to better reflect the content of the final solution. The bi-factor item analyses and IRM analyses revealed that the solution could be simplified to 11 unidimensional constructs; the full item bank included 86 items (empirical reliability from 0.78 to 0.96, except for one construct) and the short form had 48 items. Conclusion Overall, the Food Parenting Practice item bank has excellent psychometric properties.
The item bank includes an expanded version and a short version to meet various study needs. This study provides more efficient tools for assessing how food parenting practices influence child dietary behaviours. Next steps are to use the IRM-calibrated item bank and draw on computerized adaptive testing methodology to administer the item bank and provide flexibility in item selection.


2009 ◽  
Vol 36 (9) ◽  
pp. 2061-2066 ◽  
Author(s):  
JAMES F. FRIES ◽  
DAVID CELLA ◽  
MATTHIAS ROSE ◽  
ESWAR KRISHNAN ◽  
BONNIE BRUCE

Objective. Assessing self-reported physical function/disability with the Health Assessment Questionnaire Disability Index (HAQ) and other instruments has become central in arthritis research. Item response theory (IRT) and computerized adaptive testing (CAT) techniques can increase reliability and statistical power. IRT-based instruments can improve measurement precision substantially over a wider range of disease severity. These modern methods were applied and the magnitude of improvement was estimated. Methods. A 199-item physical function/disability item bank was developed by distilling 1865 items to 124, including Legacy Health Assessment Questionnaire (HAQ) and Physical Function-10 items, and improving precision through qualitative and quantitative evaluation in over 21,000 subjects, including about 1500 patients with rheumatoid arthritis and osteoarthritis. Four new instruments, (A) the Patient-Reported Outcomes Measurement Information System (PROMIS) HAQ, which evolved from the original (Legacy) HAQ; (B) the "best" PROMIS 10; (C) 20-item static (short) forms; and (D) a simulated PROMIS CAT, which sequentially selected the most informative item, were compared with the HAQ. Results. Online and mailed administration modes yielded similar item and domain scores. The HAQ and PROMIS HAQ 20-item scales yielded greater information content than the other scales in patients with more severe disease. The "best" PROMIS 20-item scale outperformed the other 20-item static forms over a broad range of 4 standard deviations. The 10-item simulated PROMIS CAT outperformed all other forms. Conclusion. Improved items and instruments yielded better information. The PROMIS HAQ is currently available and considered validated. The new PROMIS short forms, after validation, are likely to represent a further improvement. CAT-based physical function/disability assessment offers superior performance over static forms of equal length.
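The simulated PROMIS CAT's rule of "sequentially selecting the most informative item" typically means maximizing Fisher information at the current ability estimate. A minimal sketch under the 2PL model, with made-up item parameters rather than actual PROMIS calibrations:

```python
import numpy as np

def information_2pl(theta, a, b):
    """Fisher information of 2PL items at ability theta: I = a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def most_informative_item(theta, a, b, administered):
    """Pick the not-yet-administered item with maximum information at theta."""
    info = information_2pl(theta, a, b)
    for i in administered:
        info[i] = -np.inf              # exclude items already given
    return int(np.argmax(info))

a = np.array([1.2, 0.9, 2.0, 1.5, 1.1])    # hypothetical discriminations
b = np.array([-1.0, 0.0, 0.2, 1.0, 2.0])   # hypothetical difficulties
first = most_informative_item(0.0, a, b, administered=set())
print(first)   # → 2: the highly discriminating item near the current estimate
```

Because information scales with a², high-discrimination items near the examinee's ability dominate the selection, which is why a 10-item CAT can outperform 20-item static forms.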


2009 ◽  
Vol 89 (6) ◽  
pp. 589-600 ◽  
Author(s):  
Stephen M. Haley ◽  
Maria A. Fragala-Pinkham ◽  
Helene M. Dumas ◽  
Pengsheng Ni ◽  
George E. Gorton ◽  
...  

Background: Contemporary clinical assessments of activity are needed across the age span for children with cerebral palsy (CP). Computerized adaptive testing (CAT) has the potential to efficiently administer items to children across wide age spans and functional levels. Objective: The objective of this study was to examine the psychometric properties of a new item bank and simulated computerized adaptive test for assessing activity-level abilities in children with CP. Design: This was a cross-sectional item calibration study. Methods: The convenience sample consisted of 308 children and youth with CP, aged 2 to 20 years (mean=10.7, SD=4.0), recruited from 4 pediatric hospitals. We collected parent-report data on an initial set of 45 activity items. Using an item response theory (IRT) approach, we compared estimated scores from the activity item bank with concurrent instruments, examined discriminant validity, and developed computer simulations of a CAT algorithm with multiple stop rules to evaluate scale coverage, score agreement with CAT algorithms, and discriminant and concurrent validity. Results: Confirmatory factor analysis supported scale unidimensionality, local item independence, and invariance. Scores from the computer simulations of the prototype CATs with varying stop rules were consistent with scores from the full item bank (r=.93–.98). The activity summary scores discriminated across levels of upper-extremity and gross motor severity and were correlated with the Pediatric Outcomes Data Collection Instrument (PODCI) physical function and sports subscale (r=.86), the Functional Independence Measure for Children (WeeFIM) (r=.79), and the Pediatric Quality of Life Inventory–Cerebral Palsy version (r=.74). Limitations: The sample size was small for such IRT item bank and CAT development studies. Another limitation was oversampling of children with CP at higher functioning levels.
Conclusions: The new activity item bank appears to have promise for use in a CAT application for the assessment of activity abilities in children with CP across a wide age range and different levels of motor severity.

