Ghent Developmental Balance Test: A New Tool to Evaluate Balance Performance in Toddlers and Preschool Children

2012 ◽  
Vol 92 (6) ◽  
pp. 841-852 ◽  
Author(s):  
Alexandra De Kegel ◽  
Tina Baetens ◽  
Wim Peersman ◽  
Leen Maes ◽  
Ingeborg Dhooge ◽  
...  

Background: Balance is a fundamental component of movement. Early identification of balance problems is important to plan early intervention. The Ghent Developmental Balance Test (GDBT) is a new assessment tool designed to monitor balance from the initiation of independent walking to 5 years of age. Objective: The purpose of this study was to establish the psychometric characteristics of the GDBT. Methods: To evaluate test-retest reliability, 144 children were tested twice on the GDBT by the same examiner, and to evaluate interrater reliability, videotaped GDBT sessions of 22 children were rated by 3 different raters. To evaluate the known-group validity of GDBT scores, z scores on the GDBT were compared between a clinical group (n=20) and a matched control group (n=20). Concurrent validity of GDBT scores with the subscale standardized scores of the Movement Assessment Battery for Children–Second Edition (M-ABC-2), the Peabody Developmental Motor Scales–Second Edition (PDMS-2), and the balance subscale of the Bruininks-Oseretsky Test–Second Edition (BOT-2) was evaluated in a combined group of the 20 children from the clinical group and 74 children who were developing typically. Results: Test-retest and interrater reliability were excellent for the GDBT total scores, with intraclass correlation coefficients of .99 and .98, standard error of measurement values of 0.21 and 0.78, and small minimal detectable differences of 0.58 and 2.08, respectively. The GDBT was able to distinguish between the clinical group and the control group (t38=5.456, P<.001). Pearson correlations between the z scores on the GDBT and the standardized scores of specific balance subscales of the M-ABC-2, PDMS-2, and BOT-2 were moderate to high, whereas correlations with subscales measuring constructs other than balance were low. Conclusions: The GDBT is a reliable and valid clinical assessment tool for the evaluation of balance in toddlers and preschool-aged children.
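The standard error of measurement (SEM) and minimal detectable difference (MDD) reported above are linked by conventional reliability formulas: SEM = SD × √(1 − ICC), and MDD at 95% confidence = 1.96 × √2 × SEM. A minimal sketch under that conventional formulation (the abstract does not state the authors' exact computation, so this is an assumption):

```python
import math

def sem(sd: float, icc: float) -> float:
    """Standard error of measurement from the sample SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)

def mdd(sem_value: float, z: float = 1.96) -> float:
    """Minimal detectable difference at 95% confidence: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem_value

# The reported test-retest SEM of 0.21 reproduces the reported MDD of 0.58:
print(round(mdd(0.21), 2))
```

Note that the test-retest value checks out exactly (1.96 × √2 × 0.21 ≈ 0.58), which suggests the authors used the 95% confidence formulation for that estimate.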

2013 ◽  
Vol 48 (5) ◽  
pp. 621-626 ◽  
Author(s):  
Phillip A. Gribble ◽  
Sarah E. Kelly ◽  
Kathryn M. Refshauge ◽  
Claire E. Hiller

Context: Dynamic postural control has gained popularity as a more useful assessment of function than static postural control. One measurement of dynamic postural control that has increased in frequency of use is the Star Excursion Balance Test (SEBT). Although the intrarater reliability of the SEBT is excellent, few authors have determined interrater reliability. Preliminary evidence has shown poor reliability between assessors. Objective: To determine interrater reliability using a group of investigators at 2 testing sites. A corollary purpose was to examine the interrater reliability when using normalized and nonnormalized performance scores on the SEBT. Design: Descriptive laboratory study. Setting: University research laboratory. Patients or Other Participants: A total of 29 healthy participants between 18 and 50 years of age. Intervention(s): Participants were evaluated by 5 raters at 2 testing sites. After participants performed 4 practice trials, each rater assessed 3 test trials in the anterior, posteromedial, and posterolateral reaching directions of the SEBT. Main Outcome Measure(s): Normalized and nonnormalized (leg-length) reaching distances were analyzed. Additionally, the mean and maximum values from the 3 test trials were analyzed, producing a total of 16 variables. Results: For all 16 measures, the interrater reliability was excellent. For the normalized maximum excursion distances, the intraclass correlation coefficients (1,1) ranged from 0.86 to 0.92. Reliability for the nonnormalized measurements was stronger, ranging from 0.89 to 0.94. Conclusions: When the raters have been trained by an experienced rater, the SEBT is a test with excellent reliability when used across multiple raters in different settings. This information adds to the body of knowledge that exists regarding the usefulness of the SEBT as an assessment tool in clinical and research practice. 
Establishing excellent interrater reliability with normalized and nonnormalized scores strengthens the evidence for using the SEBT, especially at multiple sites.
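The interrater coefficients reported above are ICC(1,1) values, i.e., the single-rater, one-way random-effects form. A minimal sketch of that computation from an n-subjects × k-raters score matrix; the reach distances below are hypothetical, not data from the study:

```python
import numpy as np

def icc_1_1(ratings: np.ndarray) -> float:
    """One-way random-effects ICC(1,1) for an (n subjects x k raters) matrix.

    ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the
    between-subjects mean square and MSW the within-subjects mean square.
    """
    n, k = ratings.shape
    subject_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    msb = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    msw = ((ratings - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical normalized reach distances for 4 participants, 3 raters:
scores = np.array([[70.0, 71.0, 69.5],
                   [82.0, 81.5, 83.0],
                   [65.0, 66.0, 64.5],
                   [90.0, 89.0, 90.5]])
print(round(icc_1_1(scores), 3))
```

Because ICC(1,1) estimates the reliability of a single rater's scores, it is the appropriately conservative choice when, as here, any one trained rater may administer the SEBT in practice.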


1991 ◽  
Vol 34 (5) ◽  
pp. 989-999 ◽  
Author(s):  
Stephanie Shaw ◽  
Truman E. Coggins

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.


GeroPsych ◽  
2014 ◽  
Vol 27 (1) ◽  
pp. 23-31 ◽  
Author(s):  
Anne Kuemmel (This author contributed equally) ◽  
Julia Haberstroh (This author contributed equally) ◽  
Johannes Pantel

Communication and communication behaviors in situational contexts are essential conditions for well-being and quality of life in people with dementia. Measuring methods, however, are limited. The CODEM instrument, a standardized observational communication behavior assessment tool, was developed and evaluated on the basis of the current state of research in dementia care and social-communicative behavior. Initially, interrater reliability was examined by means of video ratings (N = 10 people with dementia). Subsequently, six caregivers in six German nursing homes observed 69 residents suffering from dementia and used CODEM to rate their communication behavior. The interrater reliability of CODEM was excellent (mean κ = .79; intraclass correlation = .91). Statistical analysis indicated that CODEM had excellent internal consistency (Cronbach's α = .95). CODEM also showed excellent convergent validity (Pearson's r = .88) as well as discriminant validity (Pearson's r = .63). Confirmatory factor analysis verified the two-factor solution of verbal/content aspects and nonverbal/relationship aspects. With regard to the severity of the disease, the content and relational aspects of communication exhibited different trends. CODEM proved to be a reliable, valid, and sensitive assessment tool for examining communication behavior in the field of dementia. CODEM also provides researchers with a feasible examination tool for measuring the effects of psychosocial intervention studies that strive to improve communication behavior and well-being in dementia.


Dysphagia ◽  
2021 ◽  
Author(s):  
Sofie Albinsson ◽  
Lisa Tuomi ◽  
Christine Wennerås ◽  
Helen Larsson

Abstract The lack of a Swedish patient-reported outcome instrument for eosinophilic esophagitis (EoE) has limited the assessment of the disease. The aims of the study were to translate and validate the Eosinophilic Esophagitis Activity Index (EEsAI) to Swedish and to assess the symptom severity of patients with EoE compared to a nondysphagia control group. The EEsAI was translated and adapted to a Swedish cultural context (S-EEsAI) based on international guidelines. The S-EEsAI was validated using adult Swedish patients with EoE (n = 97) and an age- and sex-matched nondysphagia control group (n = 97). All participants completed the S-EEsAI, the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Oesophageal Module 18 (EORTC QLQ-OES18), and supplementary questions regarding feasibility and demographics. Reliability and validity of the S-EEsAI were evaluated by Cronbach's alpha and Spearman correlation coefficients between the domains of the S-EEsAI and the EORTC QLQ-OES18. A test–retest analysis of 29 patients was evaluated through intraclass correlation coefficients. The S-EEsAI had sufficient reliability with Cronbach's alpha values of 0.83 and 0.85 for the "visual dysphagia question" and the "avoidance, modification and slow eating score" domains, respectively. The test–retest reliability was sufficient, with good to excellent intraclass correlation coefficients (0.60–0.89). The S-EEsAI domains showed moderate correlation to 6/10 EORTC QLQ-OES18 domains, indicating adequate validity. The patient S-EEsAI results differed significantly from those of the nondysphagia controls (p < 0.001). The S-EEsAI appears to be a valid and reliable instrument for monitoring adult patients with EoE in Sweden.


2014 ◽  
Vol 138 (6) ◽  
pp. 809-813
Author(s):  
Carolyn R. Vitek ◽  
Jane C. Dale ◽  
Henry A. Homburger ◽  
Sandra C. Bryant ◽  
Amy K. Saenger ◽  
...  

Context.— Systems-based practice (SBP) is 1 of 6 core competencies required in all resident training programs accredited by the Accreditation Council for Graduate Medical Education. Reliable methods of assessing resident competency in SBP have not been described in the medical literature. Objective.— To develop and validate an analytic grading rubric to assess pathology residents' analyses of SBP problems in clinical chemistry. Design.— Residents were assigned an SBP project based upon unmet clinical needs in the clinical chemistry laboratories. Using an iterative method, we created an analytic grading rubric based on critical thinking principles. Four faculty raters used the SBP project evaluation rubric to independently grade 11 residents' projects during their clinical chemistry rotations. Interrater reliability and Cronbach α were calculated to determine the reliability and validity of the rubric. Project mean scores and range were also assessed to determine whether the rubric differentiated resident critical thinking skills related to the SBP projects. Results.— Overall project scores ranged from 6.56 to 16.50 out of a possible 20 points. Cronbach α ranged from 0.91 to 0.96, indicating that the 4 rubric categories were internally consistent without significant overlap. Intraclass correlation coefficients ranged from 0.63 to 0.81, indicating moderate to strong interrater reliability. Conclusions.— We report development and statistical analysis of a novel SBP project evaluation rubric. The results indicate the rubric can be used to reliably assess pathology residents' critical thinking skills in SBP.


2018 ◽  
Vol 25 (3) ◽  
pp. 286-290 ◽  
Author(s):  
Elif Bilgic ◽  
Madoka Takao ◽  
Pepa Kaneva ◽  
Satoshi Endo ◽  
Toshitatsu Takao ◽  
...  

Background. Needs assessment identified a gap regarding laparoscopic suturing skills targeted in simulation. This study collected validity evidence for an advanced laparoscopic suturing task using an Endo Stitch™ device. Methods. Experienced (ES) and novice surgeons (NS) performed continuous suturing after watching an instructional video. Scores were based on time and accuracy, and Global Operative Assessment of Laparoscopic Surgery. Data are shown as medians [25th-75th percentiles] (ES vs NS). Interrater reliability was calculated using intraclass correlation coefficients (confidence interval). Results. Seventeen participants were enrolled. Experienced surgeons had significantly greater task (980 [964-999] vs 666 [391-711], P = .0035) and Global Operative Assessment of Laparoscopic Surgery scores (25 [24-25] vs 14 [12-17], P = .0029). Interrater reliability for time and accuracy were 1.0 and 0.9 (0.74-0.96), respectively. All experienced surgeons agreed that the task was relevant to practice. Conclusion. This study provides validity evidence for the task as a measure of laparoscopic suturing skill using an automated suturing device. It could help trainees acquire the skills they need to better prepare for clinical learning.


2000 ◽  
Vol 80 (1) ◽  
pp. 8-16 ◽  
Author(s):  
Mary B King ◽  
James O Judge ◽  
Robert Whipple ◽  
Leslie Wolfson

Abstract Background and Purpose. The reliability and responsiveness of 2 physical performance measures were assessed in this nonrandomized, controlled pilot exercise intervention. Subjects. Forty-five older individuals with mobility impairment (mean age=77.9 years, SD=5.9, range=70–92) were sequentially assigned to participate in an exercise program (intervention group) or to a control group. Methods. The intervention group performed exercise 3 times a week for 12 weeks that targeted muscle force, endurance, balance, and flexibility. Outcome measures were the 8-item Physical Performance Test (PPT-8) and the 6-minute walk test. Test-retest reliability and responsiveness indexes were determined for both tests; interrater reliability was measured for the PPT-8. Results. The intraclass correlation coefficient for interrater reliability for the PPT-8 was .96. Intraclass correlation coefficients for test-retest reliability were .88 for the PPT-8 and .93 for the 6-minute walk test. The intervention group improved 2.4 points and the control group improved 0.7 point on the PPT-8, as compared with baseline measurements. There was no change in 6-minute walk test distance in the intervention group when compared with the control group. The responsiveness index was .8 for the PPT-8 and .6 for the 6-minute walk test. Conclusion and Discussion. Measurements for both the PPT-8 and the 6-minute walk test appeared to be highly reliable. The PPT-8 was more responsive than the 6-minute walk test to change in performance expected with this functional training intervention.


CJEM ◽  
2019 ◽  
Vol 21 (S1) ◽  
pp. S50
Author(s):  
Q. Paterson ◽  
R. Hartmann ◽  
R. Woods ◽  
L. Martin ◽  
B. Thoma

Innovation Concept: The fairness of the Canadian Residency Matching Service (CaRMS) selection process has been called into question by rising rates of unmatched medical students and reports of bias and subjectivity. We outline how the University of Saskatchewan Royal College emergency medicine program evaluates CaRMS applications in a standardized, rigorous, equitable and defensible manner. Methods: Our CaRMS applicant evaluation methods were first utilized in the 2017 CaRMS cycle, based on published Best Practices, and have been refined yearly to ensure validity, standardization, defensibility, rigour, and to improve the speed and flow of data processing. To determine the reliability of the total application scores for each rater, single measures intraclass correlation coefficients (ICCs) were calculated using a random effects model in 2017 and 2018. Curriculum, Tool or Material: A secure, online spreadsheet was created that includes applicant names, reviewer assignments, data entry boxes, and formulas. Each file reviewer entered data in a dedicated sheet within the document. Each application was reviewed by two staff physicians and two to four residents. File reviewers used a standardized, criterion-based scoring rubric for each application component. The file score for each reviewer-applicant pair was converted into a z-score based on each reviewer's distribution of scores. Z-scores of all reviewers for a single applicant were then combined by weighted average, with the group of staff and group of residents each being weighted to represent half of the final file score. The ICC for the total raw scores improved from 0.38 (poor) in 2017 to 0.52 (moderate) in 2018. The data from each reviewer were amalgamated into a master sheet where applicants were sorted by final file score and heat-mapped to offer a visual aid regarding differences in ratings. 
Conclusion: Our innovation uses heat-mapped and formula-populated spreadsheets, scoring rubrics, and z-scores to normalize variation in scoring trends between reviewers. We believe this approach provides a rigorous, defensible, and reproducible process by which Canadian residency programs can appraise applicants and create a rank order list.
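The per-reviewer z-score normalization and 50/50 staff/resident weighting described above can be sketched as follows; the reviewer names and raw scores are hypothetical, not data from the program:

```python
from statistics import mean, stdev

def reviewer_z_scores(scores: dict[str, float]) -> dict[str, float]:
    """Convert one reviewer's applicant scores to z-scores over that
    reviewer's own distribution, normalizing hawk/dove scoring trends."""
    mu, sd = mean(scores.values()), stdev(scores.values())
    return {applicant: (s - mu) / sd for applicant, s in scores.items()}

# Hypothetical raw file scores from one staff and one resident reviewer:
staff = reviewer_z_scores({"A": 80.0, "B": 70.0, "C": 60.0})
resident = reviewer_z_scores({"A": 55.0, "B": 75.0, "C": 65.0})

# Staff and resident groups are each weighted to half the final file score:
final = {a: 0.5 * staff[a] + 0.5 * resident[a] for a in staff}
```

Because each reviewer is standardized against their own score distribution before averaging, a systematically lenient or harsh reviewer cannot dominate an applicant's final rank.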


1997 ◽  
Vol 17 (4) ◽  
pp. 280-287 ◽  
Author(s):  
Margaret Wallen ◽  
Mary-Ann Bonney ◽  
Lyn Lennox

The Handwriting Speed Test (HST), a standardized, norm-referenced test, was developed to provide an objective evaluation of the handwriting speed of school students from approximately 8 to 18 years of age. Part of the test development involved an examination of interrater reliability. Two raters scored 165 (13%) of the total 1292 handwriting samples. Using intraclass correlation coefficients, the interrater reliability was found to be excellent (ICC=1.00, P<0.0001). The process of examining interrater reliability resulted in modification to the scoring criteria of the test. Excellent interrater reliability provides support for the HST as a valuable clinical and research tool.


2013 ◽  
Vol 5 (2) ◽  
pp. 252-256 ◽  
Author(s):  
Hans B. Kersten ◽  
John G. Frohna ◽  
Erin L. Giudice

Abstract Background Competence in evidence-based medicine (EBM) is an important clinical skill. Pediatrics residents are expected to acquire competence in EBM during their education, yet few validated tools exist to assess residents' EBM skills. Objective We sought to develop a reliable tool to evaluate residents' EBM skills in the critical appraisal of a research article, the development of a written EBM critically appraised topic (CAT) synopsis, and a presentation of the findings to colleagues. Methods Instrument development used a modified Delphi technique. We defined the skills to be assessed while reviewing (1) a written CAT synopsis and (2) a resident's EBM presentation. We defined skill levels for each item using the Dreyfus and Dreyfus model of skill development and created behavioral anchors using a frame-of-reference training technique to describe performance for each skill level. We evaluated the assessment instrument's psychometric properties, including internal consistency and interrater reliability. Results The EBM Critically Appraised Topic Presentation Evaluation Tool (EBM C-PET) is composed of 14 items that assess residents' EBM and global presentation skills. Resident presentations (N = 27) and the corresponding written CAT synopses were evaluated using the EBM C-PET. The EBM C-PET had excellent internal consistency (Cronbach α = 0.94). Intraclass correlation coefficients were used to assess interrater reliability. Intraclass correlation coefficients for individual items ranged from 0.31 to 0.74; the average intraclass correlation coefficient for the 14 items was 0.67. Conclusions We identified essential components of an assessment tool for an EBM CAT synopsis and presentation with excellent internal consistency and a good level of interrater reliability across 3 different institutions. The EBM C-PET is a reliable tool to document resident competence in higher-level EBM skills.

