Assessing Fidelity to Family-Based Treatment: An Exploratory Examination of Expert, Therapist, Parent, and Peer Ratings

Abstract Introduction: Fidelity is an essential component for evaluating the clinical and implementation outcomes related to delivery of evidence-based practices (EBPs). Effective measurement of fidelity requires clinical buy-in, and as such, requires a process that is not burdensome for clinicians and managers. As part of a larger implementation study, we examined fidelity to Family-Based Treatment (FBT) measured by several different raters including an expert, a peer, therapists themselves, and parents, with a goal of determining a pragmatic, reliable and efficient method to capture treatment fidelity to FBT. Methods: Each therapist audio-recorded at least one FBT case and submitted recordings from sessions 1, 2, and 3 from phase 1, plus one additional session from phase 1, two sessions from phase 2, and one session from phase 3. These submitted files were rated by an expert and a peer rater using a validated FBT fidelity measure. In addition, therapists and parents rated fidelity immediately following each session and submitted ratings to the research team. Inter-observer reliability was calculated for each item using the intraclass correlation coefficient (ICC), comparing the expert ratings to ratings from each of the other raters (parents, therapists, and peer). Mean scale scores were compared using repeated measures ANOVA. Results: Intraclass correlation coefficients revealed that agreement was best between expert and peer, with excellent, good, or fair agreement in 7 of 13 items from sessions 1, 2 and 3. There were only four such values when comparing expert to parent ratings, and two such values when comparing expert to therapist ratings. The rest of the ICC values indicated poor agreement. Scale-level analysis indicated that expert fidelity ratings for phase 1 treatment sessions were significantly higher than the peer ratings, and that parent fidelity ratings tended to be significantly higher than those of the other raters across all three treatment phases. There were no significant differences between expert and therapist mean scores. Conclusions: There may be challenges inherent in parents rating fidelity accurately. Peer rating or therapist self-rating may be considered pragmatic, efficient, and reliable approaches to fidelity assessment for real-world clinical settings.
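The item-level comparison described in the Methods can be illustrated with a short sketch. The abstract does not say which ICC form was used or which cut-offs define "poor", "fair", "good" and "excellent", so both are assumptions here: the function returns the two-way absolute-agreement and consistency single-rater coefficients, and `label_agreement` uses a commonly cited set of thresholds. The ratings are invented for illustration.

```python
import numpy as np


def icc_two_way(ratings):
    """Return (ICC(2,1), ICC(3,1)) for an n_sessions x n_raters array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)
    ss_total = np.sum((x - grand) ** 2)
    ss_err = ss_total - ms_rows * (n - 1) - ms_cols * (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    # Absolute agreement: rater mean differences count against agreement.
    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    # Consistency: systematic rater offsets are ignored.
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return agreement, consistency


def label_agreement(icc):
    """Assumed cut-offs (a common convention, not taken from the abstract)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Invented ratings: 6 sessions (rows) scored by an expert and a peer (columns).
example = [[4, 5], [3, 3], [5, 5], [2, 3], [4, 4], [1, 2]]
agreement, consistency = icc_two_way(example)
print(round(agreement, 2), label_agreement(agreement))
```

Reporting both coefficients is a design choice for the sketch: a rater who is systematically more generous lowers the agreement form but not the consistency form, which is relevant when one group of raters scores higher on average.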


Assessing fidelity to family-based treatment: an exploratory examination of expert, therapist, parent, and peer ratings

Journal of Eating Disorders, 2021, Vol 9 (1). DOI: 10.1186/s40337-020-00366-5

Author(s): Jennifer Couturier, Melissa Kimber, Melanie Barwick, Gail McVey, Sheri Findlay, ...

Keyword(s): Repeated Measures, Phase 1, Intraclass Correlation, Correlation Coefficients, The Other, Peer Ratings, Fidelity Assessment, Scale Scores, Family Based Treatment, Family Based



Assessing Fidelity to Family-Based Treatment: Expert, Therapist, Parent, and Peer Ratings

2020. DOI: 10.21203/rs.3.rs-49544/v1

Author(s): Jennifer Couturier, Melissa Kimber, Melanie Barwick, Gail McVey, Sheri Findlay, ...

Keyword(s): Repeated Measures, Phase 1, Intraclass Correlation, Correlation Coefficients, The Other, Peer Ratings, Fidelity Assessment, Scale Scores, Family Based Treatment, Family Based

Abstract Introduction: Fidelity is an essential component for evaluating the clinical and implementation outcomes related to delivery of evidence-based practices (EBPs). Effective measurement of fidelity requires clinical buy-in, and as such, requires a process that is not burdensome for clinicians and managers. As part of a larger implementation study, we examined fidelity to Family-Based Treatment (FBT) measured by several different raters including an expert, a peer, therapists themselves, and parents, with a goal of determining a pragmatic, reliable and efficient method to capture treatment fidelity to FBT. Methods: Each therapist audio-recorded at least one FBT case and submitted recordings from sessions 1, 2, and 3 from phase 1, plus one additional session from phase 1, two sessions from phase 2, and one session from phase 3. These submitted files were rated by an expert and a peer rater using a validated FBT fidelity measure. In addition, therapists and parents rated fidelity immediately following each session and submitted ratings to the research team. Inter-observer reliability was calculated for each item using the intraclass correlation coefficient (ICC), comparing the expert ratings to ratings from each of the other raters (parents, therapists, and peer). Mean scale scores were compared using repeated measures ANOVA. Results: Intraclass correlation coefficients revealed that agreement was best between expert and peer, with excellent, good, or fair agreement in 7 of 13 items from sessions 1, 2 and 3. There were only four such values when comparing expert to parent ratings, and two such values when comparing expert to therapist ratings. The rest of the ICC values indicated poor agreement. Scale-level analysis indicated that expert fidelity ratings for phase 1 treatment sessions were significantly higher than the peer ratings, and that parent fidelity ratings tended to be significantly higher than those of the other raters across all three treatment phases. There were no significant differences between expert and therapist mean scores. Conclusions: There may be challenges inherent in parents rating fidelity accurately. Peer rating or therapist self-rating may be considered pragmatic, efficient, and reliable approaches to fidelity assessment for real-world clinical settings.


Comparing Performance Across In-person and Videoconference-Based Administrations of Common Neuropsychological Measures in Community-Based Survivors of Stroke

Journal of the International Neuropsychological Society, 2020, pp. 1-14. DOI: 10.1017/s1355617720001174

Author(s): Jodie E. Chapman, Betina Gardner, Jennie Ponsford, Dominique A. Cadilhac, Renerus J. Stolwyk

Keyword(s): Neuropsychological Assessment, Language Impairment, Repeated Measures, Verbal Learning, Intraclass Correlation, Correlation Coefficients, Preliminary Evidence, Service Access, Community Based, Neuropsychological Measures

Abstract Objective: Neuropsychological assessment via videoconference could assist in bridging service access gaps due to geographical, mobility, or infection control barriers. We aimed to compare performances on neuropsychological measures across in-person and videoconference-based administrations in community-based survivors of stroke. Method: Participants were recruited through a stroke-specific database and community advertising. Stroke survivors were eligible if they had no upcoming neuropsychological assessment, concurrent neurological and/or major psychiatric diagnoses, and/or sensory, motor, or language impairment that would preclude standardised assessment. Thirteen neuropsychological measures were administered in-person and via videoconference in a randomised crossover design (2-week interval). Videoconference calls were established between two laptop computers, facilitated by Zoom. Repeated-measures t tests, intraclass correlation coefficients (ICCs), and Bland–Altman plots were used to compare performance across conditions. Results: Forty-eight participants (26 men; M age = 64.6, SD = 10.1; M time since stroke = 5.2 years, SD = 4.0) completed both sessions on average 15.8 (SD = 9.7) days apart. For most measures, the participants did not perform systematically better in a particular condition, indicating agreement between administration methods. However, on the Hopkins Verbal Learning Test – Revised, participants performed poorer in the videoconference condition (Total Recall Mdifference = −2.11). ICC estimates ranged from .40 to .96 across measures. Conclusions: This study provides preliminary evidence that in-person and videoconference assessment result in comparable scores for most neuropsychological tests evaluated in mildly impaired community-based survivors of stroke. 
This preliminary evidence supports teleneuropsychological assessment to address service gaps in stroke rehabilitation; however, further research is needed in more diverse stroke samples.
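The Bland–Altman comparison named in the Method can be sketched as follows. This is a minimal illustration of the usual bias and 95% limits-of-agreement calculation, assuming paired scores per participant; the function name, the sign convention (videoconference minus in-person) and the example scores are assumptions, not details from the study.

```python
import numpy as np


def bland_altman(in_person, videoconference):
    """Return (bias, lower limit, upper limit) for paired scores."""
    a = np.asarray(in_person, dtype=float)
    b = np.asarray(videoconference, dtype=float)
    diff = b - a                      # negative bias = lower videoconference scores
    bias = diff.mean()
    sd = diff.std(ddof=1)             # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented scores for five participants tested in both conditions.
bias, lower, upper = bland_altman([24, 30, 27, 19, 22], [22, 29, 27, 18, 20])
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A bias close to zero with narrow limits suggests the two administration modes can be used interchangeably for that measure; a clearly negative bias would correspond to the kind of systematic difference the abstract reports for the Hopkins Verbal Learning Test – Revised.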


Automatic 3D dense phenotyping provides reliable and accurate shape quantification of the human mandible

Scientific Reports, 2021, Vol 11 (1). DOI: 10.1038/s41598-021-88095-w

Author(s): Pieter-Jan Verhelst, H. Matthews, L. Verstraete, F. Van der Cruyssen, D. Mulier, ...

Keyword(s): Repeated Measures, Intraclass Correlation, Three Dimensional, Correlation Coefficients, Surface Registration, Anatomical Landmarks, Centroid Size, Intraclass Correlation Coefficients, Total Variability, Automatic Phenotyping

Abstract Automatic craniomaxillofacial (CMF) three-dimensional (3D) dense phenotyping promises quantification of the complete CMF shape compared to the limiting use of sparse landmarks in classical phenotyping. This study assesses the accuracy and reliability of this new approach on the human mandible. Classic and automatic phenotyping techniques were applied to 30 unaltered and 20 operated human mandibles. Seven observers indicated 26 anatomical landmarks on each mandible three times. All mandibles were subjected to three rounds of automatic phenotyping using Meshmonk. The toolbox performed non-rigid surface registration of a template mandibular mesh consisting of 17,415 quasi landmarks on each target mandible, and the quasi landmarks corresponding to the 26 anatomical locations of interest were identified. Repeated-measures reliability was assessed using root mean square (RMS) distances of repeated landmark indications to their centroid. Automatic phenotyping showed very low RMS distances, confirming excellent repeated-measures reliability. The average Euclidean distance between manual and corresponding automatic landmarks was 1.40 mm for the unaltered and 1.76 mm for the operated sample. Centroid sizes from the automatic and manual shape configurations were highly similar, with intraclass correlation coefficients (ICCs) of > 0.99. Reproducibility coefficients for centroid size were < 2 mm, accounting for < 1% of the total variability of the centroid size of the mandibles in this sample. ICCs for the multivariate set of 325 interlandmark distances were all > 0.90, again indicating high similarity between shapes quantified by classic or automatic phenotyping. Combined, these findings established high accuracy and repeated-measures reliability of the automatic approach. 3D dense CMF phenotyping of the human mandible using the Meshmonk toolbox introduces a novel improvement in quantifying CMF shape.
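The two distance measures mentioned in the abstract can be sketched in a few lines. This is an illustrative reading, assuming landmarks are stored as (x, y, z) coordinates in millimetres; the function names and the sample coordinates are hypothetical and do not come from the Meshmonk toolbox.

```python
import numpy as np


def rms_to_centroid(indications):
    """RMS distance of repeated indications of one landmark to their centroid."""
    pts = np.asarray(indications, dtype=float)      # shape: (n_repeats, 3)
    centroid = pts.mean(axis=0)
    dists = np.linalg.norm(pts - centroid, axis=1)
    return float(np.sqrt(np.mean(dists ** 2)))


def mean_landmark_error(manual, automatic):
    """Average Euclidean distance between corresponding landmark pairs."""
    m = np.asarray(manual, dtype=float)             # shape: (n_landmarks, 3)
    a = np.asarray(automatic, dtype=float)
    return float(np.linalg.norm(m - a, axis=1).mean())


# Invented coordinates: three repeated indications of a single landmark.
repeats = [[10.0, 4.0, 2.0], [10.4, 4.1, 2.0], [9.8, 3.9, 2.1]]
print(round(rms_to_centroid(repeats), 3))
```

The first function addresses repeatability (how tightly repeated indications cluster), while the second addresses accuracy relative to a manual reference, which matches the two questions the study separates.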


Infant with Clefts Observation Outcomes Instrument (iCOO): A New Outcome for Infants and Young Children with Orofacial Clefts

The Cleft Palate-Craniofacial Journal, 2021, pp. 105566562110403. DOI: 10.1177/10556656211040307

Author(s): Todd C. Edwards, Carrie L. Heike, Kathleen A. Kapp-Simon, Salene M. Jones, Brian G. Leroux, ...

Keyword(s): Cleft Lip, Intraclass Correlation, Primary Caregivers, Correlation Coefficients, Measurement Properties, Cross Sectional, Scale Scores, Health Domains, And Function, The Impact

Objective We evaluated the measurement properties for item and domain scores of the Infant with Clefts Observation Outcomes Instrument (iCOO). Design Cross-sectional (before lip surgery) and longitudinal study (preoperative baseline and 2 days and 2 months after lip surgery). Setting Three academic craniofacial centers and national online advertisements. Participants Primary caregivers with an infant with cleft lip with or without cleft palate (CL ± P) scheduled to undergo primary lip repair. There were 133 primary caregivers at baseline, 115 at 2 days postsurgery, and 112 at 2 months postsurgery. Main Outcome Measure(s) Caregiver observation items (n = 61) and global impression of health and function items (n = 8) across eight health domains. Results Mean age at surgery was 6.0 months (range 2.7-11.8 months). Five of eight iCOO domains have scale scores, with Cronbach’s alphas ranging from 0.67 to 0.87. Except for the Facial Skin and Mouth domain, iCOO scales had acceptable intraclass correlation coefficients (ICCs) ranging from 0.76 to 0.84. The internal consistency of the Global Impression items across all domains was 0.90 and had acceptable ICCs (range 0.76-0.91). Sixteen out of 20 (nonscale) items had acceptable ICCs (range 0.66-0.96). As anticipated, iCOO scores 2 days postoperatively were generally lower than baseline and scores 2 months postsurgery were consistent with baseline or higher. The iCOO took approximately 10 min to complete. Conclusions The iCOO meets measurement standards and may be used for assessing the impact of cleft-related treatments in clinical research and care. More research is needed on its use in various treatment contexts.
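Internal consistency figures such as the Cronbach's alphas reported above follow from a standard formula, sketched here under the assumption that responses are arranged as one row per caregiver and one column per item. The function name and the example responses are hypothetical.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for an n_respondents x n_items array of item scores."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_variance = x.sum(axis=1).var(ddof=1)     # variance of the scale total
    return k / (k - 1) * (1 - item_variances / total_variance)


# Invented responses: 5 caregivers answering a 3-item scale.
responses = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 2], [3, 3, 4]]
print(round(cronbach_alpha(responses), 2))
```

Alpha rises when items vary together across respondents, so a domain whose items capture unrelated observations would be expected to score lower.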


Comparing Counts of Park Users With a Wearable Video Device and an Unmanned Aerial System

Journal for the Measurement of Physical Behaviour, 2021, pp. 1-8. DOI: 10.1123/jmpb.2020-0063

Author(s): Richard R. Suminski, Gregory M. Dominick, Matthew Saponaro

Keyword(s): Repeated Measures, Intraclass Correlation, Correlation Coefficients, Future Research, Unmanned Aerial Systems, Park Use, Intraclass Correlation Coefficients, Potential Benefits, Absolute Agreement, Aerial Systems

Evidence suggests that video captured with a wearable video device (WVD) may augment or supplant traditional methods for assessing park use. Unmanned aerial systems (UASs) are used to assess human activity, but research employing them for park assessments is sparse. Therefore, this study compared park user counts between a WVD and UAS. A diverse set of 33 amenities (e.g., playground) in three parks were videoed simultaneously by one researcher wearing a WVD and another operating the UAS. Assessments were done at 12 p.m. and 7 p.m. on weekends, with one park evaluated on two occasions 7 days apart. Two investigators independently reviewed videos and reached consensus on the counts of individuals at each amenity. Intraclass correlation coefficients (ICCs) were used to determine intra- and interrater reliabilities. A total of 404 (M = 4.7; SD = 9.6) and 389 (M = 4.5; SD = 9.0) individuals were counted in the UAS and WVD videos, respectively. Absolute agreement was 86% (74/86) and 100% when no individuals were using the amenity. Whether using all 86 videos or only videos having people (48 videos), ICCs indicated excellent reliability (ICC = .99; p < .001). The totals seen for the repeated measures were UAS = 146 and WVD = 136 for Day 1 and UAS = 169 and WVD = 161 for Day 2. Intrarater reliability was excellent for the UAS (ICC = .92; p < .001) and good for the WVD (ICC = .89; p < .001). Disagreement was mainly due to obstructions—people behind or under structures. This study provides support for the use of UASs for counting park users and future research examining the potential benefits of video analysis for assessing park use.
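The absolute-agreement figure can be reproduced with simple arithmetic. The sketch below assumes agreement means identical counts for a paired video, and that the reported means are totals divided by the 86 videos; the helper function and its example counts are hypothetical.

```python
def absolute_agreement(counts_a, counts_b):
    """Percentage of paired videos where both devices give the same count."""
    if len(counts_a) != len(counts_b):
        raise ValueError("count lists must be the same length")
    matches = sum(a == b for a, b in zip(counts_a, counts_b))
    return 100.0 * matches / len(counts_a)


# Invented counts for four amenities (not study data).
print(absolute_agreement([0, 3, 12, 5], [0, 3, 11, 5]))  # 75.0

# Consistency check on the figures reported in the abstract.
print(round(100 * 74 / 86))                    # 86
print(round(404 / 86, 1), round(389 / 86, 1))  # 4.7 4.5
```

Exact-match agreement is a strict criterion: a single person hidden behind a structure turns an otherwise matching pair into a disagreement, which fits the explanation the authors give for the 12 mismatches.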


Standardising the measurement of physical activity in people receiving haemodialysis: considerations for research and practice

BMC Nephrology, 2019, Vol 20 (1). DOI: 10.1186/s12882-019-1634-1. Cited By ~ 2

Author(s): Hannah M. L. Young, Mark W. Orme, Yan Song, Maurice Dungey, James O. Burton, ...

Keyword(s): Physical Activity, Sample Size, Repeated Measures, A Priori, Intraclass Correlation, Correlation Coefficients, Future Research, Wear Time, Minimum Number, The UK

Abstract Background Physical activity (PA) is exceptionally low amongst the haemodialysis (HD) population, and physical inactivity is a powerful predictor of mortality, making it a prime focus for intervention. Objective measurement of PA using accelerometers is increasing, but standard reporting guidelines essential to effectively evaluate, compare and synthesise the effects of PA interventions are lacking. This study aims (i) to determine the measurement and processing guidance required to ensure representative PA data amongst a diverse HD population, and (ii) to assess adherence to PA monitor wear amongst HD patients. Methods Clinically stable HD patients from the UK and China wore a SenseWear Armband accelerometer for 7 days. Step counts between days (HD, Weekday, Weekend) were compared using repeated measures ANCOVA. Intraclass correlation coefficients (ICCs) determined reliability (≥ 0.80 acceptable). The Spearman-Brown prophecy formula, in conjunction with a priori ≥ 80% sample size retention, identified the minimum number of days required for representative PA data. Results Seventy-seven patients (64% men, mean ± SD age 56 ± 14 years, median (interquartile range) time on HD 40 (19–72) months, 40% Chinese, 60% British) participated. Participants took fewer steps on HD days compared with non-HD weekdays and weekend days (3402 [95% CI 2665–4140], 4914 [95% CI 3940–5887], 4633 [95% CI 3558–5707] steps/day, respectively, p < 0.001). PA on HD days was less variable than on non-HD days (ICC 0.723–0.839 versus 0.559–0.611), with ≥ 1 HD day and ≥ 3 non-HD days required to provide representative data. Using these criteria, the most stringent wear-time retaining ≥ 80% of the sample was ≥ 7 h. Conclusions At group level, a wear-time of ≥ 7 h on ≥ 1 HD day and ≥ 3 non-HD days is required to provide reliable PA data whilst retaining an acceptable sample size. PA is low across both HD and non-HD days, and future research should focus on interventions designed to increase physical activity in both the intra- and interdialytic periods.
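The Spearman-Brown prophecy formula named in the Methods relates the reliability of a single monitoring day to the reliability of an average over several days. The sketch below shows the general formula only; the single-day reliability of 0.50 is a hypothetical value, and the abstract's ICC ranges are not plugged in because it does not state how many days each range refers to.

```python
import math


def spearman_brown(single_day_reliability, n_days):
    """Predicted reliability of the mean over n_days of monitoring."""
    r = single_day_reliability
    return n_days * r / (1 + (n_days - 1) * r)


def days_needed(single_day_reliability, target=0.80):
    """Smallest whole number of days predicted to reach the target reliability."""
    r = single_day_reliability
    k = target * (1 - r) / (r * (1 - target))
    # Round before taking the ceiling to absorb floating-point noise.
    return max(1, math.ceil(round(k, 9)))


# Hypothetical single-day reliability of 0.50 (not a value from the study).
print(days_needed(0.50))                  # 4
print(round(spearman_brown(0.50, 4), 2))  # 0.8
```

The formula makes the trade-off in the study explicit: less stable days need more monitoring days to reach the same reliability, but each extra required day reduces the number of participants with enough valid data.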


Reliability of Safe Maximum Lifting Determinations of a Functional Capacity Evaluation

Physical Therapy, 2002, Vol 82 (4), pp. 364-371. DOI: 10.1093/ptj/82.4.364. Cited By ~ 73

Author(s): Douglas P Gross, Michele C Battié

Keyword(s): Functional Capacity, Repeated Measures, Interrater Reliability, Intraclass Correlation, Correlation Coefficients, Functional Capacity Evaluation, Measurement Variability, Retest Reliability, Repeated Measures Design, Test Retest Reliability

Abstract Background and Purpose. Functional capacity evaluations (FCEs) are measurement tools used in predicting readiness to return to work following injury. The interrater and test-retest reliability of determinations of maximal safe lifting during kinesiophysical FCEs were examined in a sample of people who were off work and receiving workers' compensation. Subjects. Twenty-eight subjects with low back pain who had plateaued with treatment were enrolled. Five occupational therapists, trained and experienced in kinesiophysical methods, conducted testing. Methods. A repeated-measures design was used, with raters testing subjects simultaneously, yet independently. Subjects were rated on 2 occasions, separated by 2 to 4 days. Analyses included intraclass correlation coefficients (ICCs) and 95% confidence intervals. Results. The ICC values for interrater reliability ranged from .95 to .98. Test-retest values ranged from .78 to .94. Discussion and Conclusion. Inconsistencies in subjects' performance across sessions were the greatest source of FCE measurement variability. Overall, however, test-retest reliability was good and interrater reliability was excellent.


Dynamic Footprint Measurement Collection Technique and Intrarater Reliability

Journal of the American Podiatric Medical Association, 2012, Vol 102 (2), pp. 130-138. DOI: 10.7547/1020130. Cited By ~ 17

Author(s): Jeanna M. Fascione, Ryan T. Crews, James S. Wrobel

Keyword(s): Repeated Measures, Intraclass Correlation, Correlation Coefficients, Healthy Population, Intrarater Reliability, Intraclass Correlation Coefficients, Foot Posture, Arch Index, Post Hoc, Dynamic Footprint

Background: Identifying the variability of footprint measurement collection techniques and the reliability of footprint measurements would assist with appropriate clinical foot posture appraisal. We sought to identify relationships between these measures in a healthy population. Methods: On 30 healthy participants, midgait dynamic footprint measurements were collected using an ink mat, paper pedography, and electronic pedography. The footprints were then digitized, and the following footprint indices were calculated with photo digital planimetry software: footprint index, arch index, truncated arch index, Chippaux-Smirak Index, and Staheli Index. Differences between techniques were identified with repeated-measures analysis of variance with post hoc test of Scheffe. In addition, to assess practical similarities between the different methods, intraclass correlation coefficients (ICCs) were calculated. To assess intrarater reliability, footprint indices were calculated twice on 10 randomly selected ink mat footprint measurements, and the ICC was calculated. Results: Dynamic footprint measurements collected with an ink mat significantly differed from those collected with paper pedography (ICC, 0.85–0.96) and electronic pedography (ICC, 0.29–0.79), regardless of the practical similarities noted with ICC values (P = .00). Intrarater reliability for dynamic ink mat footprint measurements was high for the footprint index, arch index, truncated arch index, Chippaux-Smirak Index, and Staheli Index (ICC, 0.74–0.99). Conclusions: Footprint measurements collected with various techniques demonstrate differences. Interchangeable use of exact values without adjustment is not advised. Intrarater reliability of a single method (ink mat) was found to be high. (J Am Podiatr Med Assoc 102(2): 130–138, 2012)


Family-based treatment for children and adolescents with eating disorders: a mixed-methods evaluation of a blended evidence-based implementation approach

Translational Behavioral Medicine, 2019. DOI: 10.1093/tbm/ibz160

Author(s): Jennifer Couturier, Melissa Kimber, Melanie Barwick, Tracy Woodford, Gail McVey, ...

Keyword(s): Eating Disorders, Qualitative Interviews, Readiness For Change, Qualitative Description, Evidence Based, Organizational Readiness For Change, Implementation Approach, Fidelity Assessment, Family Based Treatment, Family Based

Abstract In this study, we evaluated a blended implementation approach with teams learning to provide family-based treatment (FBT) to adolescents with eating disorders. Four sites participated in a sequential mixed-methods pre–post study to evaluate the implementation of FBT in their clinical settings. The implementation approach included: (a) preparatory site visits; (b) the establishment of implementation teams; (c) a training workshop; (d) monthly clinical consultation; (e) monthly implementation consultation; and (f) fidelity assessment. Quantitative measures examining attitudes toward evidence-based practice, organizational learning environment, and organizational readiness for change, as well as individual readiness for change, were delivered pre- and postimplementation. Correlational analyses were used to examine associations between baseline variables and therapist fidelity to FBT. Fundamental qualitative description guided the sampling and data collection for the qualitative interviews performed at the conclusion of the study. Seventeen individuals participated in this study (nine therapists, four medical practitioners, and four administrators). The predetermined threshold of implementation success of 80% fidelity in every FBT session was achieved by only one therapist. However, mean fidelity scores were similar to those reported in other studies. Participant attitudes, readiness, and self-efficacy were not associated with fidelity and did not change significantly from pre- to postimplementation. In qualitative interviews, all participants reported that the implementation intervention was helpful in adopting FBT. Our blended implementation approach was well received by participants. A larger trial is needed to determine which implementation factors predict FBT fidelity and impact patient outcomes.
