Tibial Tubercle Apophyseal Stage to Determine Skeletal Age in Pediatric Patients Undergoing ACL Reconstruction: A Validation and Reliability Study

Background: Anterior cruciate ligament (ACL) injuries demand individualized treatments based on an accurate estimation of the child’s skeletal age. Wrist radiographs, which have traditionally been used to determine skeletal age, have a number of limitations, including cost, radiation exposure, and inconvenience. Purpose: To evaluate the reliability and validity of a radiographic staging system using tibial apophyseal landmarks as hypothetical proxies for skeletal age to use in the preoperative management of pediatric ACL tears. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: The study included children younger than 16 years of age who underwent ACL reconstruction between July 2008 and July 2018 and received both skeletal age radiography and knee radiography within 3 months of each other. Skeletal age was calculated from hand and wrist radiographs using the Greulich and Pyle atlas. Tibial apophyseal staging was categorized into 4 stages: cartilaginous stage (stage 1), apophyseal stage (stage 2), epiphyseal stage (stage 3), and bony/fused stage (stage 4). Data were collected by 2 independent assessors. The analysis was repeated 1 month later with the same assessors. We calculated descriptive statistics, measures of agreement, and the correlation between skeletal age and apophyseal stage. Results: The mean chronological age of the 287 patients included in the analysis was 12.9 ± 1.9 years; 164 (57%) of the patients were male. The overall Spearman r between skeletal age and tibial apophyseal staging was 0.69 (0.77 in males; 0.60 in females). The interrater reliability for the tibial apophyseal staging was substantial (Cohen κ = 0.66), and the intrarater reliability was excellent (Cohen κ = 0.82). The interrater reliability for skeletal age was excellent (intraclass correlation coefficient [ICC] = 0.93), as was the intrarater reliability (ICC = 0.97). 
Conclusion: The observed correlation between skeletal age and tibial apophyseal staging as well as observed intra- and interrater reliabilities demonstrated that tibial apophyseal landmarks on knee radiographs may be used to estimate skeletal age. This study supports the validity of knee radiographs in determining skeletal age and provides early evidence in certain clinical presentations to simplify the diagnostic workup and operative management of pediatric knee injuries, including ACL tears.
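
The agreement statistic reported above, Cohen κ, can be illustrated with a minimal sketch. It assumes an unweighted kappa for two raters (the abstract does not state whether a weighted variant was used); the function name `cohen_kappa` and the stage assignments are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical apophyseal stages (1 to 4) for ten knees; not study data.
stages_rater_1 = [1, 2, 2, 3, 3, 4, 4, 1, 2, 3]
stages_rater_2 = [1, 2, 3, 3, 3, 4, 4, 1, 2, 2]
kappa = cohen_kappa(stages_rater_1, stages_rater_2)
```

Kappa discounts the agreement expected by chance, which is why it is usually lower than the raw proportion of matching ratings.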

Graft-Tunnel Mismatch in Endoscopic ACL Reconstruction: Reliability of Measuring Tunnel Lengths and Intra-articular Distance

Orthopaedic Journal of Sports Medicine, 2018, Vol 6 (12), pp. 232596711881631. DOI: 10.1177/2325967118816317

Author(s): Tim Dwyer, Lucas Bristow, Nicholas Bayley, Ujash Sheth, Jihad Abouali, ...

Keyword(s): ACL Reconstruction, Interrater Reliability, Femoral Tunnel, Cruciate Ligament, Tibial Tunnel, Intrarater Reliability, Endoscopic Techniques, CT Measurements, Anterior Cruciate, High Degree

Background: A continued technical challenge for surgeons performing bone–patellar tendon–bone anterior cruciate ligament (ACL) reconstruction with endoscopic techniques is graft-tunnel mismatch. If tibial tunnel and intra-articular distances could be reliably estimated, surgeons could adjust the length of the femoral tunnel to minimize graft-tunnel mismatch. Purpose/Hypothesis: To determine whether arthroscopic measurement of the following was reliable: femoral tunnel distance (FTD), tibial tunnel distance (TTD), intra-articular distance (IAD), and total distance (TD; sum of these 3 measurements). It was hypothesized that intraoperative measurement of these distances would be reliable. Study Design: Controlled laboratory study. Methods: Eight sports fellowship–trained orthopedic surgeons independently performed arthroscopic measurements of the FTD, TTD, IAD, and TD in 7 cadaveric knees in which femoral and tibial tunnels had been drilled. Each surgeon performed the measurements twice using an EndoButton depth gauge. Following this, each parameter was measured open with a medial parapatellar approach. Finally, a computed tomography (CT) scan of each knee was performed, with the FTD, TTD, and IAD measured by a musculoskeletal radiologist. Inter- and intrarater reliability of the arthroscopic measurements was calculated, as well as the correlation between arthroscopic measurements and open and CT measurements. Results: Interrater reliability for the arthroscopic measurements was 0.8 for FTD, 0.89 for TTD, 0.61 for IAD, and 0.76 (range, 0.54-0.93) for TD. Intrarater reliability was 0.94 for FTD, 0.97 for TTD, 0.83 for IAD, and 0.93 for TD. The correlation between arthroscopic and open measurements was 0.9 for FTD, 0.94 for TTD, 0.4 for IAD, and 0.84 for TD. The correlation between arthroscopic and CT measurements was 0.85 for FTD, 0.92 for TTD, and 0.71 for IAD. 
Conclusion: The results of this study show that arthroscopic measurement of FTD and TTD has a high degree of intra- and interrater reliability, while that of IAD and TD demonstrates high intrarater reliability but moderate interrater reliability. Clinical Relevance: Reliable measurement of the TTD and IAD can potentially allow adjustment of the FTD, minimizing graft-tunnel mismatch in endoscopic ACL reconstruction.

Reliability Assessment of Scores From Video-Recorded TGMD-3 Performances

Journal of Motor Learning and Development, 2017, Vol 5 (1), pp. 59-68. DOI: 10.1123/jmld.2016-0007. Cited By ~ 16

Author(s): Pauli Olavi Rintala, Arja Kaarina Sääkslahti, Susanna Iivonen

Keyword(s): Motor Development, Interrater Reliability, Intraclass Correlation, Kappa Statistic, Intrarater Reliability, Gross Motor, Gross Motor Development, Percent Agreement, Two Samples, Ball Skills

This study examined the intrarater and interrater reliability of the Test of Gross Motor Development—3rd Edition (TGMD-3). Participants were 60 Finnish children aged between 3 and 9 years, divided into three separate samples of 20. Two samples of 20 were used to examine the intrarater reliability of two different assessors, and the third sample of 20 was used to establish interrater reliability. Children’s TGMD-3 performances were video-recorded and later assessed using an intraclass correlation coefficient, a kappa statistic, and a percent agreement calculation. The intrarater reliability of the locomotor subtest, ball skills subtest, and gross motor total score ranged from 0.69 to 0.77, and percent agreement ranged from 87 to 91%. The interrater reliability of the locomotor subtest, ball skills subtest, and gross motor total score ranged from 0.56 to 0.64. Percent agreement of 83% was observed for locomotor skills, ball skills, and total skills, respectively. Hop, horizontal jump, and two-hand strike assessments showed the most difference between the assessors. These results show acceptable reliability for the TGMD-3 to analyze children’s gross motor skills.
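
Percent agreement, one of the three statistics named above, is the simplest to compute. The sketch below assumes item-by-item comparison of two sets of criterion scores (0 = criterion not met, 1 = met); the function name `percent_agreement` and the scores are hypothetical, and the study's exact scoring procedure may differ.

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of items on which two sets of scores are identical."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("scores must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)


# Hypothetical criterion scores from two viewings of one video; not study data.
first_viewing = [1, 0, 1, 1, 0, 1, 1, 0]
second_viewing = [1, 0, 0, 1, 0, 1, 1, 0]
agreement = percent_agreement(first_viewing, second_viewing)
```

Unlike kappa, percent agreement does not correct for chance, so the two are often reported together.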

Influence of Rater Training on Inter- and Intrarater Reliability When Using the Rat Grimace Scale

Journal of the American Association for Laboratory Animal Science, 2019, Vol 58 (2), pp. 178-183. DOI: 10.30802/aalas-jaalas-18-000044. Cited By ~ 8

Author(s): Emily Q Zhang, Vivian SY Leung, Daniel SJ Pang

Keyword(s): Acute Pain, Interrater Reliability, Intraclass Correlation, Training Group, Intrarater Reliability, Rater Training, Trainee Group, Pain Models, Ongoing Pain, And Performance

Rodent grimace scales facilitate assessment of ongoing pain. Reported rater training using these scales varies considerably and may contribute to the observed variability in interrater reliability. This study evaluated the effect of training on interrater reliability with the Rat Grimace Scale (RGS). Two training sets (42 and 150 images) were prepared from acute pain models. Four trainee raters progressed through 2 rounds of training, scoring 42 images (set 1) followed by 150 images (set 2a). After each round, trainees reviewed the RGS and any problematic images with an experienced rater. The 150 images were then rescored (set 2b). Four years later, trainees rescored the 150 images (set 2c). A second group of raters (no-training group) scored the same image sets without review with the experienced rater. Inter- and intrarater reliability were evaluated by using the intraclass correlation coefficient (ICC), and ICC values were compared by using the Feldt test. In the trainee group, interrater reliability increased from moderate to very good between sets 1 and 2b and increased between sets 2a and 2b. Action units with the highest and lowest ICC at set 2b were orbital tightening and whiskers, respectively. In comparison to an experienced rater, the ICC for all trainees improved, ranging from 0.88 to 0.91 at set 2b. Four years later, very good interrater reliability was retained, and intrarater reliability was good or very good. The interrater reliability of the no-training group was moderate and did not improve from set 1 to set 2b. Training improved interrater reliability, with an associated reduction in 95% CI. In addition, training improved interrater reliability with an experienced rater, and performance was retained.

Could Residents Adequately Assess the Severity of Hidradenitis Suppurativa? Interrater and Intrarater Reliability Assessment of Major Scoring Systems

Dermatology, 2019, Vol 236 (1), pp. 8-14. DOI: 10.1159/000501771. Cited By ~ 1

Author(s): Katarzyna Włodarek, Aleksandra Stefaniak, Łukasz Matusiak, Jacek C. Szepietowski

Keyword(s): Interrater Reliability, Hidradenitis Suppurativa, Intraclass Correlation, Scoring Systems, Staging System, Severity Index, Assessment Tools, Intrarater Reliability, Global Assessment Scale, Interrater Variability

A wide variety of assessment tools have been proposed to date for hidradenitis suppurativa (HS), but none of them meets the criteria for an ideal score. Because there is no gold standard scoring system, the choice of measurement instrument depends on the purpose of use and even on the physician’s experience with HS. The aim of this study was to assess the intrarater and interrater reliability of 6 scoring systems commonly used for grading severity of HS: the Hurley Staging System, the Refined Hurley Staging, the International Hidradenitis Suppurativa Severity Score System (IHS4), the Hidradenitis Suppurativa Severity Index (HSSI), the Sartorius Hidradenitis Suppurativa Score, and the Hidradenitis Suppurativa Physician’s Global Assessment Scale (HS-PGA). On the scoring day, 9 HS patients underwent a physical examination and disease severity assessment by a group of 16 dermatology residents using all evaluated instruments. Then, intrarater reliability was calculated using the intraclass correlation coefficient (ICC), and interrater variability was evaluated using the coefficient of variation (CV). In all 6 scorings the ICCs were >0.75, indicating high intrarater reliability of all presented scales. The study also demonstrated moderate agreement between raters for most of the evaluated measurement instruments. The most reproducible methods, according to CVs, seem to be the Hurley staging, IHS4, and HSSI. None of the 6 evaluated scoring systems showed a significant advantage over the others when comparing ICCs, and all the instruments seem to be very reliable methods. The interrater reliability was usually good, but the most repeatable results between researchers were obtained for the easiest scales, including Hurley scoring, IHS4, and HSSI.
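
The coefficient of variation used above for interrater variability can be sketched as follows. The sketch assumes the CV is the sample standard deviation (n − 1 denominator) of the raters' scores for one patient divided by their mean, expressed as a percentage; the abstract does not give the exact definition, and the function name and scores are hypothetical.

```python
import statistics


def coefficient_of_variation(scores):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("CV is undefined when the mean score is zero")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return 100.0 * statistics.stdev(scores) / mean


# Hypothetical severity scores given to one patient by five raters.
scores_for_one_patient = [10, 12, 11, 9, 13]
cv_percent = coefficient_of_variation(scores_for_one_patient)
```

A lower CV means the raters' scores for the same patient are more tightly clustered, which is the sense in which the abstract calls a scale more reproducible.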

Intrarater and Interrater Reliability of Infrared Image Analysis of Forearm Acupoints before and after Moxibustion

Evidence-based Complementary and Alternative Medicine, 2020, Vol 2020, pp. 1-8. DOI: 10.1155/2020/6328756

Author(s): Jiali Lou, Yongliang Jiang, Hantong Hu, Xiaoyu Li, Yajun Zhang, ...

Keyword(s): Image Analysis, Correlation Coefficient, Temperature Change, Intraclass Correlation Coefficient, Interrater Reliability, Intraclass Correlation, Infrared Image, Infrared Images, Intrarater Reliability, Before And After

The objective of this study was to determine the intrarater and interrater reliabilities of infrared image analysis of forearm acupoints before and after moxibustion. In this work, infrared images of acupoints in the forearm of 20 volunteers (M/F, 10/10) were collected prior to and after moxibustion by infrared thermography (IRT). Two trained raters performed the analysis of infrared images in two different periods at a one-week interval. The intraclass correlation coefficient (ICC) was calculated to determine the intrarater and interrater reliabilities. With regard to the intrarater reliability, ICC values were between 0.758 and 0.994 (substantial to excellent). For the interrater reliability, ICC values ranged from 0.707 to 0.964 (moderate to excellent). Given that the intrarater and interrater reliability levels show excellent concordance, IRT could be a reliable tool to monitor the temperature change of forearm acupoints induced by moxibustion.

Use Caution When Assessing Preoperative Leg-Length Discrepancy in Pediatric Patients With Anterior Cruciate Ligament Injuries

The American Journal of Sports Medicine, 2020, Vol 48 (12), pp. 2948-2953. DOI: 10.1177/0363546520952757

Author(s): Madison R. Heath, Alexandra H. Aitchison, Lindsay M. Schlichte, Christine Goodbody, Frank A. Cordasco, ...

Keyword(s): Anterior Cruciate Ligament, ACL Reconstruction, Cruciate Ligament, Intraclass Correlation, Case Series, Surgical Reconstruction, Radiographic Examination, Leg Length, Level Of Evidence, Anterior Cruciate

Background: Pre- and postoperative standing hip-to-ankle radiographs are critical for monitoring potential postoperative growth arrest and resultant deformities after pediatric anterior cruciate ligament (ACL) reconstruction. Purpose: To determine the prevalence of apparent preoperative leg-length discrepancies (LLDs) that resolve at the first postoperative radiographic examination in patients undergoing ACL reconstruction in order to understand what proportion of the noted preoperative deformities may have been inaccurate. Study Design: Case series; Level of evidence, 4. Methods: A retrospective review of prospectively collected preoperative and first postoperative full-length hip-to-ankle radiographs was performed in a cohort of skeletally immature patients who had an acute ACL injury and underwent subsequent surgical reconstruction. Leg length measurements for both the injured and the uninjured legs were obtained for comparison. Results: A total of 112 patients (mean age, 12.7 ± 1.7 years) were included (79 boys and 33 girls). Leg-length measurement interrater reliability among 3 raters for 25 randomly chosen images was nearly perfect (intraclass correlation coefficient, 0.996; 95% CI, 0.994-0.998). At baseline, there was no apparent preoperative LLD (<5 mm) in 48% (n = 54) of participants, while 37% (n = 41) displayed a small apparent LLD (5 to <10 mm), 12% (n = 13) displayed a moderate apparent LLD (10 to <15 mm), and 4% (n = 4) displayed a large apparent LLD (≥15 mm). Of the patients with an apparent preoperative LLD, 66% (n = 38) of them tore their ACL on the leg measuring shorter. At first postoperative radiographs, 48% (n = 28) of patients with an apparent preoperative LLD showed resolution to no LLD: 46% (n = 19) of patients with a small apparent preoperative LLD, 54% (n = 7) of patients with a moderate apparent LLD, and 50% (n = 2) of patients with a large apparent LLD. 
Conclusion: A high percentage of patients (48%) with apparent preoperative LLDs showed resolution to no LLDs by their first postoperative imaging, indicating that preoperative hip-to-ankle radiographs display some false LLDs in patients with recent ACL tears who are unable to fully extend their injured leg and bear weight.
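
The LLD categories defined in the Results can be written as a small classifier. This is only a sketch of the stated cut-offs (<5 mm, 5 to <10 mm, 10 to <15 mm, ≥15 mm); the function name `classify_lld` and the leg lengths are hypothetical, and it assumes the discrepancy is the absolute difference between the two measured leg lengths.

```python
def classify_lld(injured_mm, uninjured_mm):
    """Bin the absolute leg-length difference using the abstract's cut-offs."""
    difference = abs(injured_mm - uninjured_mm)
    if difference < 5:
        return "none"
    if difference < 10:
        return "small"
    if difference < 15:
        return "moderate"
    return "large"


# Hypothetical leg lengths in millimetres; not patient data.
category = classify_lld(injured_mm=792.0, uninjured_mm=800.0)
```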

Assessment of the Intrarater and Interrater Reliability of an Established Clinical Task Analysis Methodology

Anesthesiology, 2002, Vol 96 (5), pp. 1129-1139. DOI: 10.1097/00000542-200205000-00016. Cited By ~ 46

Author(s): Jason Slagle, Matthew B. Weinger, My-Than T. Dinh, Vanessa V. Brumer, Kevin Williams

Keyword(s): Real Time, Task Analysis, Interrater Reliability, Intraclass Correlation, Correlation Coefficients, Intrarater Reliability, Intraclass Correlation Coefficients, Percent Time, Analysis Methodology, And Task

Background: Task analysis may be useful for assessing how anesthesiologists alter their behavior in response to different clinical situations. In this study, the authors examined the intraobserver and interobserver reliability of an established task analysis methodology. Methods: During 20 routine anesthetic procedures, a trained observer sat in the operating room and categorized in real time the anesthetist's activities into 38 task categories. Two weeks later, the same observer performed task analysis from videotapes obtained intraoperatively. A different observer performed task analysis from the videotapes on two separate occasions. Data were analyzed for percent of time spent on each task category, average task duration, and number of task occurrences. Rater reliability and agreement were assessed using intraclass correlation coefficients. Results: Intrarater reliability was generally good for categorization of percent time on task and task occurrence (mean intraclass correlation coefficients of 0.84-0.97). There was a comparably high concordance between real-time and video analyses. Interrater reliability was generally good for percent time and task occurrence measurements. However, the interrater reliability of the task duration metric was unsatisfactory, primarily because of the technique used to capture multitasking. Conclusions: A task analysis technique used in anesthesia research for several decades showed good intrarater reliability. Off-line analysis of videotapes is a viable alternative to real-time data collection. Acceptable interrater reliability requires the use of strict task definitions, sophisticated software, and rigorous observer training. New techniques must be developed to more accurately capture multitasking. Substantial effort is required to conduct task analyses that will have sufficient reliability for purposes of research or clinical evaluation.

Interrater Reliability of the Berg Balance Scale When Used by Clinicians of Various Experience Levels to Assess People With Lower Limb Amputations

Physical Therapy, 2014, Vol 94 (3), pp. 371-378. DOI: 10.2522/ptj.20130182. Cited By ~ 21

Author(s): Christopher K. Wong

Keyword(s): Lower Limb, Interrater Reliability, Clinical Training, Intraclass Correlation, Berg Balance Scale, Intrarater Reliability, Rater Reliability, Study Objective, Balance Scale, Scale Scores

Background: People with lower limb amputations frequently have impaired balance ability. The Berg Balance Scale (BBS) has excellent psychometric properties for people with neurologic disorders and elderly people dwelling in the community. A Rasch analysis demonstrated the validity of the BBS for people with lower limb amputations of all ability strata, but rater reliability has not been tested. Objective: The study objective was to determine the interrater reliability and intrarater reliability of BBS scores and the differences in scores assigned by testers with various levels of experience when assessing people with lower limb amputations. Design: This reliability study of video-recorded single-session BBS assessments had a cross-sectional design. Methods: From a larger study of people with lower limb amputations, 5 consecutively recruited participants using prostheses were video recorded during an in-person BBS assessment. Sixteen testers independently rated the video-recorded assessments. Testers were 3 physical therapists, 1 occupational therapist, 3 third-year and 4 second-year doctor of physical therapy (DPT) students, and 5 first-year DPT students without clinical training. Rater reliability was calculated using intraclass correlation coefficients (ICC [2,k]). Differences in scores assigned by testers with various levels of experience were determined by use of an analysis of variance with Tukey post hoc tests. Results: The average age of the participants was 53.0 years (SD=15.7). Amputations had occurred at the ankle disarticulation, transtibial, and transfemoral levels because of vascular, trauma, and medical etiologies an average of 8.2 years earlier (SD=7.9). Berg Balance Scale scores spanned all ability strata. Interrater reliability (ICC [2,k]=.99) and intrarater reliability of scores determined in person and through video-recorded assessments by the same testers (ICC [2,k]=.99) were excellent.
For participants with the lowest levels of ability, licensed professionals assigned lower scores than did DPT students without clinical training. Limitations: Intrarater reliability calculations were based on 2 testers. Conclusions: Berg Balance Scale scores assigned to people using prostheses by testers with various levels of clinical experience had excellent interrater reliability and intrarater reliability.
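
The ICC (2,k) named in the Methods can be sketched from the mean squares of a two-way layout (subjects × raters), following the standard Shrout and Fleiss definition for a two-way random-effects model with absolute agreement and averaged ratings. This is an illustrative implementation, not the authors' analysis code; the function name `icc_2k` and the ratings are assumptions made for the example.

```python
def icc_2k(ratings):
    """ICC(2,k): two-way random effects, absolute agreement, mean of k raters.

    `ratings` is a list of rows, one per subject, each holding k rater scores.
    """
    n = len(ratings)                      # number of subjects
    k = len(ratings[0]) if ratings else 0  # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual error.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (ms_subjects + (ms_raters - ms_error) / n)


# Illustrative ratings: 6 subjects scored by 4 raters (not study data).
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2k(example_ratings)
```

Because ICC (2,k) describes the reliability of the mean of k raters, it is typically higher than the single-rater form ICC (2,1) computed from the same table.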

Validity, Reliability, and Ability to Identify Fall Status of the Berg Balance Scale, BESTest, Mini-BESTest, and Brief-BESTest in Patients With COPD

Physical Therapy, 2016, Vol 96 (11), pp. 1807-1815. DOI: 10.2522/ptj.20150391. Cited By ~ 35

Author(s): Cristina Jácome, Joana Cruz, Ana Oliveira, Alda Marques

Keyword(s): Interrater Reliability, Intraclass Correlation, Berg Balance Scale, Performance Validity, Operating Characteristics, Intrarater Reliability, Balance Test, Balance Scale, Balance Tests, ABC Scale

Background: The Berg Balance Scale (BBS), Balance Evaluation Systems Test (BESTest), Mini-BESTest, and Brief-BESTest are useful in the assessment of balance. Their psychometric properties, however, have not been tested in patients with chronic obstructive pulmonary disease (COPD). Objective: This study aimed to compare the validity, reliability, and ability to identify fall status of the BBS, BESTest, Mini-BESTest, and the Brief-BESTest in patients with COPD. Design: A cross-sectional study was conducted. Methods: Forty-six patients (24 men, 22 women; mean age=75.9 years, SD=7.1) were included. Participants were asked to report their falls during the previous 12 months and to fill in the Activity-specific Balance Confidence (ABC) Scale. The BBS and the BESTest were administered. Mini-BESTest and Brief-BESTest scores were computed based on the participants' BESTest performance. Validity was assessed by correlating balance tests with each other and with the ABC Scale. Interrater reliability (2 raters), intrarater reliability (48–72 hours), and minimal detectable changes (MDCs) were established. Receiver operating characteristics assessed the ability of each balance test to differentiate between participants with and without a history of falls. Results: Balance test scores were significantly correlated with each other (Spearman correlation rho=.73–.90) and with the ABC Scale (rho=.53–.75). Balance tests presented high interrater reliability (intraclass correlation coefficient [ICC]=.85–.97) and intrarater reliability (ICC=.52–.88) and acceptable MDCs (MDC=3.3–6.3 points). Although all balance tests were able to identify fall status (area under the curve=0.74–0.84), the BBS (sensitivity=73%, specificity=77%) and the Brief-BESTest (sensitivity=81%, specificity=73%) had the higher ability to identify fall status. Limitations: Findings are generalizable mainly to older patients with moderate COPD.
Conclusions: The 4 balance tests are valid, reliable, and valuable in identifying fall status in patients with COPD. The Brief-BESTest presented slightly higher interrater reliability and ability to differentiate participants' fall status.
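
The abstract reports minimal detectable changes without giving the formula. A commonly used approach, assumed here and not confirmed by the source, derives the standard error of measurement as SEM = SD × √(1 − ICC) and the 95% MDC as 1.96 × √2 × SEM. The function names and input values below are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC); assumes ICC lies between 0 and 1."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be >= 0 and icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: between-subject SD of 5 points, ICC of 0.75.
example_mdc = mdc95(sd=5.0, icc=0.75)
```

Under this formula a more reliable test (higher ICC) yields a smaller MDC, meaning smaller score changes can be distinguished from measurement error.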

FEASIBILITY AND RELIABILITY OF MUSCULOSKELETAL ULTRASOUND MEASUREMENT OF THE MEDIAL PATELLOFEMORAL LIGAMENT

Orthopaedic Journal of Sports Medicine, 2020, Vol 8 (4_suppl3), pp. 2325967120S0018. DOI: 10.1177/2325967120s00185

Author(s): Andrea Stracciolini, Laura Boucher, Sarah Jackson, Naomi Brown, Danielle Magrini, ...

Keyword(s): Interrater Reliability, Medial Patellofemoral Ligament, Intraclass Correlation, Musculoskeletal Ultrasound, Reliability Testing, Intrarater Reliability, Good Reliability, Medicine Physician, Mean Width, Musculoskeletal Ultrasonography

Background: The medial patellofemoral ligament (MPFL) is an important soft tissue constraint to preventing patellar dislocations in young athletes. The anatomy of the MPFL has been investigated in cadaveric studies and magnetic resonance studies. No studies to date have provided anatomical data of the MPFL on ultrasonography. Purpose: To investigate the feasibility of musculoskeletal ultrasonography for the evaluation of the MPFL, and to determine interrater and intrarater reliability for MPFL ultrasound measures. Methods: Ten control participants (20 knees) aged 20 to 50 years underwent ultrasonography performed by 3 researchers (musculoskeletal ultrasound radiologist, athletic trainer/biomechanist, primary care sports medicine physician) from 3 different institutions for interrater reliability testing. Intrarater reliability testing was performed at 2 separate institutions by 4 physicians, each performing the same knee ultrasound protocol on 20 knees in 10 study participants 2 to 3 weeks apart. In total, 180 images were created for interrater reliability, and 480 images for intrarater reliability. Examinations were performed with linear high-frequency transducers (10-18 MHz) with the participant in the supine position and the extremity flexed at 45°. Measurements included ligament length (long axis to ligament) from the patellar to the femoral attachment sites, ligament width (short axis to ligament) at the patellar attachment, and ligament thickness (long axis to ligament) midway between the patella and femur. Mean and SD were calculated for all measurements. Intraclass correlation coefficient (ICC) analysis was used to assess intrarater and interrater reliability. ICC values < 0.40 indicated poor reliability, whereas those between 0.40 and 0.75 indicated fair to good reliability, and those > 0.75 indicated excellent reliability. Results: The mean ultrasound value for MPFL length was 44.83 mm (SD 6.68), mean thickness 2.66 mm (SD 0.85), and mean width 11.76 mm (SD 2.99).
The overall ICC values for interrater reliability testing indicated fair to good reliability for length measures (0.7) and poor reliability for thickness (–0.1) and width (0.3; Table 1.1). Overall ICC values for intrarater reliability indicated fair to good reliability for length (0.5), excellent for thickness (0.9), and poor reliability for width (–0.3; Table 1.2). Conclusions: Musculoskeletal ultrasonography is a feasible and reliable office-based method of measuring MPFL length and thickness. These quantitative measures set the groundwork for establishing normative anatomical measures of the MPFL in athletes and establish a protocol for testing and measuring the MPFL using musculoskeletal ultrasonography. [Table: see text][Table: see text]
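
The interpretation bands stated in the Methods (ICC < 0.40 poor, 0.40 to 0.75 fair to good, > 0.75 excellent) can be expressed as a short helper. The function name `interpret_icc` is hypothetical; the thresholds are taken from the abstract, and the handling of values exactly at 0.40 and 0.75 follows its wording.

```python
def interpret_icc(icc):
    """Map an ICC value to the reliability bands stated in the abstract."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# ICC values reported in the abstract for interrater length, thickness, width.
labels = [interpret_icc(value) for value in (0.7, -0.1, 0.3)]
```

Applied to the reported interrater values, the helper reproduces the abstract's own labels: fair to good for length and poor for thickness and width.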
