Intra- and Interobserver Reliability Comparison of Clinical Gait Analysis Data between Two Gait Laboratories

Comparing clinical gait analysis (CGA) data between clinical centers is critical for evaluating treatment and rehabilitation progress. However, CGA protocols and system configurations, as well as the choice of marker sets and individual variability during marker attachment, may affect the comparability of data. The aim of this study was to evaluate the reliability of CGA data collected between two gait analysis laboratories. Three healthy subjects underwent a standardized CGA protocol at two separate centers. Kinematic data were captured using the same motion capture systems (two systems, same manufacturer, but different analysis software and camera configurations). The CGA data were analyzed by the same two observers for both centers. Interobserver reliability was calculated using single measure intraclass correlation coefficients (ICC). Intraobserver as well as between-laboratory intraobserver reliability were assessed using an average measure ICC. Interobserver reliability for all joints (ICCtotal = 0.79) was found to be significantly lower (p < 0.001) than intraobserver reliability (ICCtotal = 0.93), but significantly higher (p < 0.001) than between-laboratory intraobserver reliability (ICCtotal = 0.55). Data comparison between both centers revealed significant differences for 39% of investigated parameters. Different hardware and software configurations impact CGA data and influence between-laboratory comparisons. Furthermore, lower intra- and interobserver reliability were found for ankle kinematics in comparison to the hip and knee, particularly for interobserver reliability.

Spinal Instability Neoplastic Score: An Analysis of Reliability and Validity From the Spine Oncology Study Group

Journal of Clinical Oncology ◽

10.1200/jco.2010.34.3897 ◽

2011 ◽

Vol 29 (22) ◽

pp. 3072-3077 ◽

Cited By ~ 277

Author(s):

Daryl R. Fourney ◽

Evan M. Frangou ◽

Timothy C. Ryken ◽

Christian P. DiPaola ◽

Christopher I. Shaffrey ◽

...

Keyword(s):

Predictive Validity ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Intraobserver Reliability ◽

Spinal Tumors ◽

Spinal Instability ◽

Study Group ◽

Reliable Classification

Purpose: Standardized indications for treatment of tumor-related spinal instability are hampered by the lack of a valid and reliable classification system. The objective of this study was to determine the interobserver reliability, intraobserver reliability, and predictive validity of the Spinal Instability Neoplastic Score (SINS). Methods: Clinical and radiographic data from 30 patients with spinal tumors were classified as stable, potentially unstable, and unstable by members of the Spine Oncology Study Group. The median category for each patient case (consensus opinion) was used as the gold standard for predictive validity testing. On two occasions at least 6 weeks apart, each rater also scored each patient using SINS. Each total score was converted into a three-category data field, with 0 to 6 as stable, 7 to 12 as potentially unstable, and 13 to 18 as unstable. Results: The κ statistics for interobserver reliability were 0.790, 0.841, 0.244, 0.456, 0.462, and 0.492 for the fields of location, pain, bone quality, alignment, vertebral body collapse, and posterolateral involvement, respectively. The κ statistics for intraobserver reliability were 0.806, 0.859, 0.528, 0.614, 0.590, and 0.662 for the same respective fields. Intraclass correlation coefficients for inter- and intraobserver reliability of total SINS score were 0.846 (95% CI, 0.773 to 0.911) and 0.886 (95% CI, 0.868 to 0.902), respectively. The κ statistic for predictive validity was 0.712 (95% CI, 0.676 to 0.766). Conclusion: SINS demonstrated near-perfect inter- and intraobserver reliability in determining three clinically relevant categories of stability. The sensitivity and specificity of SINS for potentially unstable or unstable lesions were 95.7% and 79.5%, respectively.
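
The score-to-category conversion described in the Methods can be sketched in a few lines of Python. This is only an illustration of the cut-offs stated in the abstract (0 to 6, 7 to 12, 13 to 18); the function name `sins_category` is hypothetical, and the range check is an added assumption, not part of the published method.

```python
def sins_category(total_score: int) -> str:
    """Map a total SINS score to one of the three stability categories.

    Cut-offs follow the abstract: 0-6 stable, 7-12 potentially unstable,
    13-18 unstable. Rejecting out-of-range scores is an assumption made
    for this sketch.
    """
    if not 0 <= total_score <= 18:
        raise ValueError("total SINS score must be between 0 and 18")
    if total_score <= 6:
        return "stable"
    if total_score <= 12:
        return "potentially unstable"
    return "unstable"


if __name__ == "__main__":
    # Print the category at each boundary of the three bands.
    for score in (0, 6, 7, 12, 13, 18):
        print(score, sins_category(score))
```

Collapsing the total score into three bands in this way is what allows it to be compared against the raters' three-category consensus opinion.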

Accuracy and Reliability of Observational Gait Analysis Data: Judgments of Push-off in Gait After Stroke

Physical Therapy ◽

10.1093/ptj/83.2.146 ◽

2003 ◽

Vol 83 (2) ◽

pp. 146-160 ◽

Cited By ~ 63

Author(s):

Jennifer L McGinley ◽

Patricia A Goldie ◽

Kenneth M Greenwood ◽

Sandra J Olney

Keyword(s):

Power Generation ◽

Clinical Practice ◽

Gait Analysis ◽

Rating Scales ◽

Physical Therapists ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Analysis Data ◽

Observational Assessment ◽

Analysis System

Abstract Background and Purpose. Physical therapists routinely observe gait in clinical practice. The purpose of this study was to determine the accuracy and reliability of observational assessments of push-off in gait after stroke. Subjects. Eighteen physical therapists and 11 subjects with hemiplegia following a stroke participated in the study. Method. Measurements of ankle power generation were obtained from subjects following stroke using a gait analysis system. Concurrent videotaped gait performances were observed by the physical therapists on 2 occasions. Ankle power generation at push-off was scored as either normal or abnormal using two 11-point rating scales. These observational ratings were correlated with the measurements of peak ankle power generation. Results. A high correlation was obtained between the observational ratings and the measurements of ankle power generation (mean Pearson r=.84). Interobserver reliability was moderately high (mean intraclass correlation coefficient [ICC (2,1)]=.76). Intraobserver reliability also was high, with a mean ICC (2,1) of .89 obtained. Discussion and Conclusion. Physical therapists were able to make accurate and reliable judgments of push-off in videotaped gait of subjects following stroke using observational assessment. Further research is indicated to explore the accuracy and reliability of data obtained with observational gait analysis as it occurs in clinical practice.
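
For readers unfamiliar with the ICC (2,1) statistic reported above, the following is a minimal sketch of the two-way random-effects, absolute-agreement, single-measure coefficient in the Shrout and Fleiss formulation. It is not the authors' analysis code; the function name `icc_2_1` and the input layout (one row per subject, one column per rater) are assumptions made for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete subjects-by-raters table.

    ratings: list of rows, one row per subject, one column per rater.
    Uses the two-way ANOVA mean squares:
      ICC(2,1) = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    where MSR, MSC, MSE are the subject, rater, and residual mean squares.
    Note: the denominator is zero if every rating is identical.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Made-up scores: 4 subjects rated by 3 raters on an 11-point scale.
    demo = [[2, 3, 2], [7, 8, 6], [5, 5, 4], [9, 10, 9]]
    print(round(icc_2_1(demo), 3))
```

Because ICC (2,1) penalizes systematic differences between raters as well as random disagreement, it is generally lower than a consistency-type coefficient computed on the same ratings.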

Clinical photographs in the assessment of adult spinal deformity: a comparison to radiographic parameters

Journal of Neurosurgery Spine ◽

10.3171/2020.11.spine201732 ◽

2021 ◽

pp. 1-5

Author(s):

Devon J. Ryan ◽

Nicholas D. Stekas ◽

Ethan W. Ayres ◽

Mohamed A. Moawad ◽

Eaman Balouch ◽

...

Keyword(s):

Spinal Deformity ◽

Adult Spinal Deformity ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Intraobserver Reliability ◽

Consecutive Series ◽

Anatomical Landmarks ◽

Radiographic Parameters ◽

Clinical Measures

OBJECTIVE The goal of this study was to reliably predict sagittal and coronal spinal alignment with clinical photographs by using markers placed at easily localized anatomical landmarks. METHODS A consecutive series of patients with adult spinal deformity were enrolled from a single center. Full-length standing radiographs were obtained at the baseline visit. Clinical photographs were taken with reflective markers placed overlying C2, S1, the greater trochanter, and each posterior-superior iliac spine. Sagittal radiographic parameters were C2 pelvic angle (CPA), T1 pelvic angle (TPA), and pelvic tilt. Coronal radiographic parameters were pelvic obliquity and T1 coronal tilt. Linear regressions were performed to evaluate the relationship between radiographic parameters and their photographic “equivalents.” The data were reanalyzed after stratifying the cohort into low–body mass index (BMI) (< 30) and high-BMI (≥ 30) groups. Interobserver and intraobserver reliability was assessed for clinical measures via intraclass correlation coefficients (ICCs). RESULTS A total of 38 patients were enrolled (mean age 61 years, mean BMI 27.4 kg/m², 63% female). All regression models were significant, but sagittal parameters were more closely correlated to photographic parameters than coronal measurements. TPA and CPA had the strongest associations with their photographic equivalents (both r² = 0.59, p < 0.001). Radiographic and clinical parameters tended to be more strongly correlated in the low-BMI group. Clinical measures of TPA and CPA had high intraobserver reliability (all ICC > 0.99, p < 0.001) and interobserver reliability (both ICC > 0.99, p < 0.001). CONCLUSIONS The photographic measures of spinal deformity developed in this study were highly correlated with their radiographic counterparts and had high inter- and intraobserver reliability. Clinical photography can not only reduce radiation exposure in patients with adult spinal deformity, but also be used to assess deformity when full-spine radiographs are unavailable.

Interobserver Reliability Using the Phonetic Level Evaluation With Severely and Profoundly Hearing-Impaired Children

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3405.989 ◽

1991 ◽

Vol 34 (5) ◽

pp. 989-999 ◽

Cited By ~ 6

Author(s):

Stephanie Shaw ◽

Truman E. Coggins

Keyword(s):

Interrater Reliability ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Hearing Impaired ◽

Intraclass Correlation Coefficients ◽

Assessment Measure ◽

Impaired Children ◽

Speech Assessment ◽

Hearing Impaired Children

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.

Assessment of reliability and validity of the 5-scale grading system of the point-of-care immunoassay for tear matrix metalloproteinase-9

Scientific Reports ◽

10.1038/s41598-021-92020-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Minjeong Kim ◽

Ja Young Oh ◽

Seon Ha Bae ◽

Seung Hyeun Lee ◽

Won Jun Lee ◽

...

Keyword(s):

Matrix Metalloproteinase ◽

Calibration Curve ◽

Point Of Care ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Grading System ◽

Intraclass Correlation Coefficients ◽

The Difference

Abstract: We evaluated the reliability and validity of the 5-scale grading system to interpret the point-of-care immunoassay for tear matrix metalloproteinase (MMP)-9. Six observers graded the red bands in photographs of the readout window of an MMP-9 immunoassay kit (InflammaDry) twice, 2 weeks apart, using the 5-scale grading system (i.e., grades 0–4). Interobserver and intraobserver reliability were evaluated using intraclass correlation coefficients. The interobserver agreements were analyzed according to the severity of tear MMP-9 expression. To validate the system, a concentration calibration curve was made using MMP-9 solutions with reference concentrations, then the distribution of MMP-9 concentrations was analyzed according to the 5-scale grading system. Both intraobserver and interobserver reliability were excellent. The readout grades were significantly correlated with the quantified colorimetric densities. The interobserver variance of readout grades had no correlation with the severity of the measured densities. The band density continued to increase up to a maximal concentration (i.e., 5000 ng/mL) according to the calibration curve. The difference in grades sensitively reflected the change in MMP-9 concentrations, especially between grades 2 and 4. Together, our data indicate that the subjective 5-scale grading system in the point-of-care MMP-9 immunoassay is an easy and reliable method with acceptable accuracy.

Interobserver Reliability and Change in the Sagittal Tibial Tubercle–Trochlear Groove Distance with Increasing Knee Flexion Angles

The Journal of Knee Surgery ◽

10.1055/s-0041-1729547 ◽

2021 ◽

Author(s):

Ian S. MacLean ◽

Taylor M. Southworth ◽

Ian J. Dempsey ◽

Neal B. Naveen ◽

Hailey P. Huddleston ◽

...

Keyword(s):

Knee Flexion ◽

Sagittal Plane ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Flexion Angle ◽

Tibial Tubercle ◽

Trochlear Groove ◽

Knee Flexion Angle ◽

Intraclass Correlation Coefficients

Abstract: The tibial tubercle–trochlear groove (TT-TG) distance is currently utilized to evaluate knee alignment in patients with patellar instability. Sagittal plane pathology measured by the sagittal tibial tubercle–trochlear groove (sTT-TG) distance has been described in instability but may also be important to consider in patients with cartilage injury. This study aims to (1) describe interobserver reliability of the sTT-TG distance and (2) characterize the change in the sTT-TG distance with respect to changing knee flexion angles. In this cadaveric study, six nonpaired cadaveric knees underwent magnetic resonance imaging (MRI) studies at each of the following degrees of knee flexion: −5, 0, 5, 10, 15, and 20. The sTT-TG distance was measured on the axial T2 sequence. Four reviewers measured this distance for each cadaver at each flexion angle. Intraclass correlation coefficients were calculated to determine interobserver reliability and reproducibility of the sTT-TG measurement. Analysis of variance (ANOVA) tests and Friedman's tests with a Bonferroni's correction were performed for each cadaver to compare sTT-TG distances at each flexion angle. Significance was defined as p < 0.05. There was excellent interobserver reliability of the sTT-TG distance with all intraclass correlation coefficients >0.9. The tibial tubercle progressively becomes more posterior in relation to the trochlear groove (more negative sTT-TG distance) with increasing knee flexion. The sTT-TG distance is a measurement that is reliable between attending surgeons and across training levels. The sTT-TG distance is affected by small changes in knee flexion angle. Awareness of knee flexion angle on MRI is important when this measurement is utilized by surgeons.

3D Biometrics for Hindfoot Alignment Using Weightbearing Computed Tomography

Foot & Ankle International ◽

10.1177/1071100719835492 ◽

2019 ◽

Vol 40 (6) ◽

pp. 720-726 ◽

Cited By ~ 24

Author(s):

Jian Zhong Zhang ◽

François Lintz ◽

Alessio Bernasconi ◽

Shu Zhang ◽

Keyword(s):

Computed Tomography ◽

Comparative Study ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Level Of Evidence ◽

Hindfoot Alignment ◽

Mean Values ◽

Intraclass Correlation Coefficients ◽

Prospective Comparative Study

Background: Weightbearing computed tomography (WBCT) is a useful tool for the assessment of hindfoot alignment (HA). Foot ankle offset (FAO) is a recently introduced parameter, determined from WBCT images using semiautomatic software. The aim of this study was to determine the clinical relevance and reproducibility of FAO for the evaluation of HA. Methods: A prospective comparative study was performed on consecutive patients requiring bilateral WBCT between September 2017 and April 2018. Based on the clinical assessment of HA, patients were divided into 3 groups: (1) normal alignment group (G1), (2) valgus (G2), and (3) varus (G3). FAO and long axial view (HACT) were measured on WBCT images, and the groups were compared. The reproducibility of FAO and HACT was determined through intraclass correlation coefficients (ICCs). Regression analysis was performed to investigate the correlation between the 2 methods. Overall, 249 feet (126 patients) were included (G1 = 115, G2 = 78, and G3 = 56 feet). Results: The mean values for FAO and HACT were 1.2% ± 2.8% and 3.9 ± 3.1, respectively, in G1; 8.1% ± 3.7% and 9.7 ± 4.9 in G2; and −6.6% ± 4.8% and −8.2 ± 6.6 in G3. Intra- and interobserver reliability was 0.987 and 0.988 for FAO and 0.949 and 0.949 for HACT, respectively. There was a good linear correlation between HACT and FAO (R² = 0.744), with a regression slope of 1.064. Conclusions: WBCT was a useful method for the characterization of HA. FAO was reproducible and correlated well with physical examination. Level of Evidence: Level II, prospective comparative study.

High Reproducibility of an Automated Measurement of Mobility for Patients with Axial Spondyloarthritis

The Journal of Rheumatology ◽

10.3899/jrheum.170941 ◽

2018 ◽

Vol 45 (10) ◽

pp. 1383-1388 ◽

Cited By ~ 3

Author(s):

Juan L. Garrido-Castro ◽

Rafael Curbelo ◽

Ramón Mazzucchelli ◽

María E. Domínguez-González ◽

Cristina Gonzalez-Navas ◽

...

Keyword(s):

Ankylosing Spondylitis ◽

Repeated Measures ◽

Axial Spondyloarthritis ◽

Activity Index ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Disease Activity Index ◽

Intraobserver Reliability ◽

Spinal Mobility ◽

Video Capture

Objective. Conventional measures of spinal mobility used in the assessment of patients with axial spondyloarthritis (axSpA), such as the Bath Ankylosing Spondylitis Metrology Index and its components, are subject to interobserver variability. The University of Córdoba Ankylosing Spondylitis Metrology Index (UCOASMI) is a validated composite index based on a motion video-capture system, UCOTrack. Our objective was to assess its reproducibility in clinical practice settings. Methods. We carried out an observational study of repeated measures in 3 centers. Video-capture systems were installed and adapted to clinical rooms. Patients with axSpA and stable disease were selected by consecutive stratified sampling [disease duration, sex, and the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI)]. Intraobserver reliability of the UCOASMI and of conventional measures was tested 3–5 days apart. For interobserver reliability, 3 patients from each center were evaluated in the other centers, within 3–7 days. The intraclass correlation coefficients (ICC) were calculated. Results. Thirty patients were included (73% men, mean age 53 yrs, mean BASDAI 3.0). Interobserver and intraobserver ICC of the UCOASMI was 0.98. Conventional measurements showed lower but adequate reproducibility as well, except for interobserver reliability of lateral flexion (0.41), cervical rotation (0.61), and Schöber test (0.07), and intraobserver reliability of tragus-to-wall distance (0.30). Conclusion. Reproducibility of the UCOASMI seems very high, and apparently more reliable than conventional measures of mobility.

Comparison of Reliability of Norberg Angle and Distraction Index as Measurements for Hip Laxity in Dogs

Veterinary and Comparative Orthopaedics and Traumatology ◽

10.1055/s-0040-1709460 ◽

2020 ◽

Vol 33 (04) ◽

pp. 274-278

Author(s):

Julius Klever ◽

Andreas Brühschwein ◽

Silvia Wagner ◽

Sven Reese ◽

Andrea Meyer-Lindenberg

Keyword(s):

Correlation Coefficient ◽

Clinical Significance ◽

Intraclass Correlation Coefficient ◽

Hip Joint ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Joint Laxity ◽

Intraobserver Reliability ◽

Level Of Experience ◽

Norberg Angle

Abstract Objective: The main purpose of the study was to compare reliability of measurements for the evaluation of hip joint laxity in 59 dogs. Materials and Methods: Measurement of the distraction index (DI) of the PennHIP method and the Norberg angle (NA) of the Fédération Cynologique Internationale (FCI) scoring scheme as well as scoring according to the FCI scheme and the Swiss scoring scheme were performed by three observers at different levels of experience. For each dog, two radiographs were acquired with each method by the same operator to evaluate intraoperator reliability. Results: Intraoperator reliability was slightly better for the NA compared with the DI, with intraclass correlation coefficients (ICC) of 0.962 and 0.892, respectively. The ICC showed excellent results in intraobserver reliability and interobserver reliability for both the NA (ICC 0.975; 0.969) and the DI (ICC 0.986; 0.972). Thus, the NA as well as the DI can be considered reliable measurements. The FCI scheme and the Swiss scoring scheme provide similar reliability. While the FCI scheme seems to be slightly more reliable in experienced observers (Kappa FCI 0.687; Kappa Swiss 0.681), the Swiss scoring scheme had noticeably better reliability for the inexperienced observer (Kappa FCI 0.465; Kappa Swiss 0.514). Clinical Significance: The Swiss scoring scheme provides a structured guideline for the interpretation of hip radiographs and can thus be recommended to inexperienced observers.

The Evaluation of Trochlear Osseous Morphology: An Epidemiologic Study

Orthopaedic Journal of Sports Medicine ◽

10.1177/2325967120s00441 ◽

2020 ◽

Vol 8 (7_suppl6) ◽

pp. 2325967120S0044

Author(s):

Sercan Yalçin ◽

Gabriel Onor ◽

Scott Kaar ◽

Lee Pace ◽

Paolo Ferrua ◽

...

Keyword(s):

Correlation Coefficient ◽

Normal Population ◽

Trochlear Dysplasia ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Epidemiologic Study ◽

Intraobserver Reliability ◽

Severe Dysplasia ◽

Mild Dysplasia ◽

Study Population

Objectives: The purpose of this study is to investigate the prevalence of trochlear dysplasia in our study population. Methods: We obtained 692 skeletally mature femoral specimens from the [Blinded Institution], [Blinded Collection]. Five observers were asked to evaluate each specimen for trochlear dysplasia on a scale between 0 and 3 (0 – normal/no dysplasia; 1 – mild dysplasia; 2 – moderate dysplasia; 3 – severe dysplasia). Each observer made initial evaluations for interobserver reliability. Each observer then re-evaluated each specimen one month later to determine intraobserver reliability. We evaluated inter- and intraobserver reliability utilizing the intraclass correlation coefficient (ICC). All statistics were performed with SPSS v.25 (IBM, USA). Results: The interobserver ICCs of the first and second evaluations of all observers were found to be 0.906 [0.894-0.916] and 0.904 [0.892-0.915], respectively. The intraobserver ICCs of the observers were as follows: Reviewer 1: 0.799 [0.771-0.825]; Reviewer 2: 0.686 [0.645-0.724]; Reviewer 3: 0.808 [0.781-0.832]; Reviewer 4: 0.787 [0.757-0.814]; Reviewer 5: 0.778 [0.747-0.806]. These results show intra- and interobserver correlation was good to excellent. The percentages of normal trochlea, mild dysplasia, moderate dysplasia and severe dysplasia for the first evaluation, by reviewer, are as follows: Reviewer 1: 82.7%, 12.1%, 4.0%, 1.2%; Reviewer 2: 37.3%, 26.2%, 27.5%, 9.1%; Reviewer 3: 57.9%, 28.0%, 12.1%, 1.9%; Reviewer 4: 64.2%, 25.6%, 7.7%, 2.6%; Reviewer 5: 65.6%, 14.9%, 12.3%, 7.2%. The percentages of normal trochlea, mild dysplasia, moderate dysplasia and severe dysplasia for the second evaluation, by reviewer, are as follows: Reviewer 1: 78.8%, 16.6%, 3.6%, 1.0%; Reviewer 2: 40.3%, 26.4%, 23.3%, 10.0%; Reviewer 3: 42.2%, 35.1%, 18.8%, 3.9%; Reviewer 4: 57.4%, 31.9%, 8.2%, 2.5%; Reviewer 5: 73.7%, 8.2%, 9.7%, 8.4%. In total, the percentages of normal trochlea, mild dysplasia, moderate dysplasia and severe dysplasia were 60.00%, 22.51%, 12.72%, and 4.77%, respectively. Conclusions: This study shows that although there were no absolute criteria to grade trochlear dysplasia, observers had similar opinions on the degree of dysplasia. Also, our cohort shows that moderate to severe dysplasia is not uncommon, as it is present in around 17% of knees in our cohort. This is the first epidemiologic study evaluating the prevalence of trochlear dysplasia in the normal population.
