Inter- and intraobserver agreement of three classification systems for lateral clavicle fractures – reliability comparison between two specialist groups

Background: Although classification is of great value in the management of lateral clavicle fractures, substantial variation exists in how these fractures are classified. We performed a retrospective study to address the inter- and intraobserver reliability of three different classification systems for lateral clavicle fractures. Methods: Radiographs of 20 lateral clavicle fractures that represented a full spectrum of adult fracture patterns were graded by five experienced radiologists and five experienced trauma surgeons according to the Orthopaedic Trauma Association (OTA), the Neer, and the Jäger/Breitner classification systems. This evaluation was performed at two different time points separated by 3 months. To measure the observer agreement, the Fleiss kappa coefficient (κ) was applied and assessed according to the grading of Landis and Koch. Results: The overall interobserver reliability showed a fair agreement in all three classification systems. For the OTA classification system, the interobserver agreement showed a mean kappa value of 0.338 ranging from 0.350 (radiologists) to 0.374 (trauma surgeons). Kappa values of the interobserver agreement for the Neer classification system ranged from 0.238 (trauma surgeons) to 0.276 (radiologists) with a mean κ of 0.278. The Jäger/Breitner classification system demonstrated a mean kappa value of 0.330 ranging from 0.306 (trauma surgeons) to 0.382 (radiologists). The overall intraobserver reliability was moderate for the OTA and the Jäger/Breitner classification systems, while the overall intraobserver reliability for the Neer classification system was fair. The kappa values of the intraobserver agreement showed a wide range in all classification systems: 0.086 to 0.634 for the OTA classification system, 0.137 to 0.448 for the Neer classification system, and 0.154 to 0.625 for the Jäger/Breitner classification system. Conclusions: The low inter- and intraobserver agreement levels exhibited in all three classification systems by both specialist groups suggest that the tested lateral clavicle fracture classification systems are unreliable and, therefore, of limited value. We should recognize that there is considerable inconsistency in how physicians classify lateral clavicle fractures, and any conclusions based on these classifications should therefore be regarded as somewhat subjective.
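The abstract above reports Fleiss kappa values graded on the Landis and Koch scale. As a rough illustration of how such a value is obtained, the following is a minimal sketch in plain Python. The function names `fleiss_kappa` and `landis_koch` and the toy ratings are assumptions made for this example and are not taken from the study; values falling between the published band edges (for example 0.205) are assigned to the higher band here, which is also an assumption.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for a list of subjects, each a list with one label per rater.

    Hypothetical helper for illustration; every subject must have the same
    number of ratings (at least two).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    totals = Counter()          # how often each category was used overall
    agreement_sum = 0.0
    for subject in ratings:
        counts = Counter(subject)
        totals.update(counts)
        # Share of rater pairs that agree on this subject.
        agreement_sum += sum(c * (c - 1) for c in counts.values()) / (
            n_raters * (n_raters - 1)
        )

    p_observed = agreement_sum / n_subjects
    p_expected = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch verbal grade."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy data: 4 fractures, 3 raters, 2 fracture types (purely illustrative).
toy = [["A", "A", "A"], ["A", "A", "B"], ["B", "B", "B"], ["A", "B", "B"]]
k = fleiss_kappa(toy)
print(round(k, 3), landis_koch(k))  # 0.333 fair
```

With this grading, the mean interobserver values quoted above (0.278 to 0.338) all fall in the "fair" band, which matches the wording of the abstract.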


The Interrater and Intrarater Agreement of a Modified Neer Classification System and Associated Treatment Choice for Lateral Clavicle Fractures

The American Journal of Sports Medicine ◽

10.1177/0363546515593949 ◽

2015 ◽

Vol 43 (10) ◽

pp. 2431-2436 ◽

Cited By ~ 8

Author(s):

Chul-Hyun Cho ◽

Joo Han Oh ◽

Gu-Hee Jung ◽

Gi-Hyuk Moon ◽

In Hyeok Rhyou ◽

...

Keyword(s):

Classification System ◽

Treatment Choice ◽

Neer Classification ◽

Lateral Clavicle ◽

Clavicle Fractures


Comprehensive classification system for multirod constructs across three-column osteotomies: a reliability study

Journal of Neurosurgery Spine ◽

10.3171/2020.6.spine20678 ◽

2021 ◽

Vol 34 (1) ◽

pp. 103-109

Author(s):

Mostafa H. El Dafrawy ◽

Owoicho Adogwa ◽

Adam M. Wegner ◽

Nicholas A. Pallotta ◽

Michael P. Kelly ◽

...

Keyword(s):

Classification System ◽

Interobserver Reliability ◽

Kappa Coefficient ◽

Intraobserver Reliability ◽

Osteotomy Site ◽

New Classification ◽

Good Reliability ◽

Kappa Value ◽

Comprehensive Classification ◽

High Degree

OBJECTIVE: In this study, the authors’ goal was to determine the intra- and interobserver reliability of a new classification system that allows the description of all possible constructs used across three-column osteotomies (3COs) in terms of rod configuration and density. METHODS: Thirty-five patients with multirod constructs (MRCs) across a 3CO were classified by two spinal surgery fellows according to the new system, and then were reclassified 2 weeks later. Constructs were classified by the number of rods across the osteotomy site followed by a letter corresponding to the type of rod configuration. “M” is for a main rod configuration, defined as a single rod spanning the osteotomy. “L” is for linked rod configurations, defined as 2 rods directly connected to each other at the osteotomy site. “S” is for satellite rod configurations, defined as a short rod independent of the main rod with anchors above and below the 3CO. “A” is for accessory rods, defined as an additional rod across the 3CO attached to main rods but not attached to any anchors across the osteotomy site. “I” is for intercalary rod configurations, defined as a rod connecting 2 separate constructs across the 3CO, without the intercalary rod itself attached to any anchors across the osteotomy site. The intra- and interobserver reliability of this classification system was determined. RESULTS: A sample estimation for validation assuming two readers and 35 subjects results in a two-sided 95% confidence interval with a width of 0.19 and a kappa value of 0.8 (SD 0.3). The Fleiss kappa coefficient (κ) was used to calculate the degree of interrater and intrarater agreement. The interrater kappa coefficient was 0.3, and the intrarater kappa coefficient was 0.63 (good reliability). This scenario represents a high degree of agreement despite a low kappa coefficient. Correct observations by both observers were 34 of 35 and 33 of 35 at both time points. Misclassification was related to difficulty in determining connectors versus anchors. CONCLUSIONS: MRCs across 3COs have variable rod configurations. Currently, no classification system or agreement on nomenclature exists to define the configuration of rods across 3COs. The authors present a new, comprehensive MRC classification system with good inter- and intraobserver reliability and a high degree of agreement that allows for a standardized description of MRCs across 3COs.
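The abstract above notes a high degree of raw agreement alongside a low kappa coefficient. One common way this can arise is when almost all cases fall into a single category, so that chance-expected agreement is already very high. The sketch below illustrates that effect with Cohen's kappa for two raters on invented data; the labels, the 35-case layout and the function name `cohen_kappa` are assumptions for illustration only and are not the study's data or its exact statistic (the study reports Fleiss kappa).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    p_observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n

    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in counts_a.keys() | counts_b.keys()
    ) / n**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


# Hypothetical, heavily skewed data: 35 constructs, almost all labelled "M".
# The raters disagree on only two cases, each calling a different one "L".
rater_a = ["M"] * 33 + ["L", "M"]
rater_b = ["M"] * 33 + ["M", "L"]

p_obs, kappa = cohen_kappa(rater_a, rater_b)
print(f"observed agreement = {p_obs:.3f}, kappa = {kappa:.3f}")
```

In this invented case the raters agree on 33 of 35 constructs, yet kappa is slightly negative because the expected chance agreement is almost as high as the observed agreement.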


The Buttazzoni Classification of Distal Radial Fractures in Adults: Interobserver and Intraobserver Reliability

Hand ◽

10.1007/s11552-009-9163-1 ◽

2009 ◽

Vol 4 (3) ◽

pp. 283-288 ◽

Cited By ~ 4

Author(s):

Mats Å. Wadsten ◽

Arkan S. Sayed-Noor ◽

Göran O. Sjödén ◽

Olle Svensson ◽

Gunnar G. Buttazzoni

Keyword(s):

Classification System ◽

Kappa Coefficient ◽

Intraobserver Reliability ◽

Classification Systems ◽

Distal Radial Fracture ◽

Radial Fracture ◽

New Classification ◽

Distal Radial Fractures ◽

Orthopedic Surgeons ◽

Fracture Types

Although distal radial fracture is the commonest fracture, there is little evidence-based knowledge about the value of its classification to guide management and predict prognosis. The available classification systems are either complicated or weakly applicable in clinical practice. Older's classification is the most reliable, but does not cover all radial fracture types. We evaluated the interobserver and intraobserver reliability of a new classification system, which is a modification of Older's classification covering all radial fracture types. Two hundred and thirty-two consecutive adult patients with acute distal radial fractures were blindly evaluated according to the new classification by three orthopedic surgeons twice, with a 1-year interval. The interobserver reliability was measured using the Fleiss kappa coefficient, and the intraobserver reliability was measured using Cohen's kappa coefficient. The new classification showed fair to substantial interobserver and intraobserver reliability, i.e., results comparable to the reliability of commonly used classification systems. The reliability was better for younger patients and when evaluation was carried out by hand-surgery-interested orthopedic surgeons. The new classification system is simple, covers all radial fracture types, and has an acceptable reliability. Further studies are needed to judge its ability to direct management and predict prognosis.


Ischaemic stroke etiological classification system: the agreement analysis of CISS, SPARKLE and TOAST

Stroke and Vascular Neurology ◽

10.1136/svn-2018-000226 ◽

2019 ◽

Vol 4 (3) ◽

pp. 123-128 ◽

Cited By ~ 1

Author(s):

Haojie Zhang ◽

Zixiao Li ◽

Yunyi Dai ◽

Enhui Guo ◽

Changqing Zhang ◽

...

Keyword(s):

Ischemic Stroke ◽

Ischaemic Stroke ◽

Classification System ◽

High Reliability ◽

Classification Systems ◽

Rater Reliability ◽

Stroke Classification ◽

Subtype Classification ◽

Stroke Research ◽

Kappa Value

Background and purpose: The ideal stroke classification system needs to have validity, high reliability and applicability among different stroke research settings. The Chinese Ischemic Stroke Subclassification (CISS) and the Subtypes of Ischemic Stroke Classification System (SPARKLE) have emerged recently but have not been tested using agreement analysis. As a result, the objective of this study is to investigate the level of agreement among stroke subtype classifications using CISS, SPARKLE and Trial of Org 10172 in Acute Stroke Treatment (TOAST). We also analyse the inter-rater reliability of CISS. Methods: The data include 623 inpatients who have had an ischaemic stroke, accrued from Beijing Tiantan Hospital between 1 October 2015 and 19 April 2016. According to the diagnostic standards of the three subtype classification systems, 299 inpatients who satisfied the requirements of our study were independently classified with etiological subtypes, and we compared the three subclassifications. Results: There was substantial overall agreement among the three classification systems: CISS versus SPARKLE (kappa value=0.684, p<0.001), CISS versus TOAST (kappa value=0.615, p<0.001) and SPARKLE versus TOAST (kappa value=0.675, p<0.001). The inter-rater reliability of CISS was excellent (kappa value=0.857, p<0.001). Furthermore, among the three subtype classification systems, the variance analysis results of the etiological subtypes were not uniform. Conclusion: There were generally substantial agreements among three ischaemic stroke etiological classification systems. CISS is a valid and reliable classification system, with which different stroke research centres can apply and compare data.


The Oswestry-Bristol Classification

The Bone & Joint Journal ◽

10.1302/0301-620x.102b1.bjj-2019-0366.r3 ◽

2020 ◽

Vol 102-B (1) ◽

pp. 102-107 ◽

Cited By ~ 3

Author(s):

Nikhil Sharma ◽

Ashley Brown ◽

Theodoros Bouras ◽

Jan H. Kuiper ◽

Jonathan Eldridge ◽

...

Keyword(s):

Classification System ◽

Interobserver Agreement ◽

Trochlear Dysplasia ◽

Significant Risk ◽

Patellofemoral Instability ◽

Kappa Statistic ◽

Significant Risk Factor ◽

Classification Systems ◽

Intraobserver Agreement ◽

Dejour Classification

Aims: Trochlear dysplasia is a significant risk factor for patellofemoral instability. The Dejour classification is currently considered the standard for classifying trochlear dysplasia, but numerous studies have reported poor reliability on both plain radiography and MRI. The severity of trochlear dysplasia is important to establish in order to guide surgical management. We have developed an MRI-specific classification system to assess the severity of trochlear dysplasia, the Oswestry-Bristol Classification (OBC). This is a four-part classification system comprising normal, mild, moderate, and severe to represent a normal, shallow, flat, and convex trochlea, respectively. The purpose of this study was to assess the inter- and intraobserver reliability of the OBC and compare it with that of the Dejour classification. Methods: Four observers (two senior and two junior orthopaedic surgeons) independently assessed 32 CT and axial MRI scans for trochlear dysplasia and classified each according to the OBC and the Dejour classification systems. Assessments were repeated following a four-week interval. The inter- and intraobserver agreement was determined by using Fleiss’ generalization of Cohen’s kappa statistic and S-statistic nominal and linear weights. Results: The OBC showed fair-to-good interobserver agreement and good-to-excellent intraobserver agreement (mean kappa 0.68). The Dejour classification showed poor interobserver agreement and fair-to-good intraobserver agreement (mean kappa 0.52). Conclusion: The OBC can be used to assess the severity of trochlear dysplasia. It can be applied in clinical practice to simplify and standardize surgical decision-making in patients with recurrent patellar instability. Cite this article: Bone Joint J 2020;102-B(1):102–107


REPRODUCIBILITY OF MODIFIED WALDENSTRÖM CLASSIFICATION IN PERTHES DISEASE

Acta Ortopédica Brasileira ◽

10.1590/1413-785220212902242018 ◽

2021 ◽

Vol 29 (2) ◽

pp. 92-96

Author(s):

FELIPPI GUIZARDI CORDEIRO ◽

PATRICIA MORENO GRANGEIRO ◽

BRUNO SÉRGIO FERREIRA MASSA ◽

NEI BOTTER MONTENEGRO ◽

ROBERTO GUARNIERO

Keyword(s):

Classification System ◽

Reasonable Agreement ◽

Interobserver Agreement ◽

Interobserver Reliability ◽

Perthes Disease ◽

Intraobserver Agreement ◽

Level Of Evidence ◽

Resident Physicians ◽

Kappa Value ◽

Kendall’s W

ABSTRACT Objective: The purpose of our study is to evaluate intraobserver and interobserver reliability of modified Waldenström classification system for Legg-Calvé-Perthes disease and assess the influence of the professional’s area of expertise in the assessment. Methods: Twelve evaluators assessed 40 pairs of pelvic radiographs of patients with Legg-Calvé-Perthes disease. After two weeks, a new evaluation was performed by the same evaluators. Kappa and Kendall’s W indexes were used to evaluate both intraobserver and interobserver reliability and determine the influence of the evaluators’ experience and area of expertise. Results: The average intraobserver kappa value was 0.394, with a reasonable agreement level. The interobserver Kappa value was 0.243 in the first evaluation (95% CI, 0.227-0.259 and p < 0.0001) and 0.245 in the second evaluation (95% CI, 0.229-0.260 and p < 0.0001). The Kendall’s W values obtained for pediatric orthopedists, radiologists and resident physicians were 0.686, 0.630 and 0.529 (p < 0.0001), respectively. Conclusion: The modified Waldenström classification presented both moderate and reasonable levels of intraobserver agreement, and reasonable level of interobserver agreement. The evaluators’ degree of experience and area of expertise influenced the concordance level found. Level of Evidence II, Diagnostic Studies - Investigating a Diagnostic Test.


Novel Arthroscopic Classification of Osteochondritis Dissecans of the Knee

The American Journal of Sports Medicine ◽

10.1177/0363546516637175 ◽

2016 ◽

Vol 44 (7) ◽

pp. 1694-1698 ◽

Cited By ~ 24

Author(s):

James L. Carey ◽

Eric J. Wall ◽

Nathan L. Grimm ◽

Theodore J. Ganley ◽

Eric W. Edmonds ◽

...

Keyword(s):

Osteochondritis Dissecans ◽

Classification System ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Intraobserver Reliability ◽

Classification Systems ◽

Future Research ◽

Multicenter Studies ◽

Level Of Evidence ◽

Surgical Evaluation

Background: Several systems have been proposed for classifying osteochondritis dissecans (OCD) of the knee during surgical evaluation. No single classification includes mutually exclusive categories that capture all of the salient features of stability, chondral fissuring, and fragment detachment. Furthermore, no study has assessed the reliability of these classification systems. Purpose: To determine the intra- and interobserver reliability of a novel, comprehensive arthroscopic classification system with mutually exclusive OCD lesion types. Study Design: Cohort study (diagnosis); Level of evidence, 3. Methods: The Research in OsteoChondritis of the Knee (ROCK) study group developed a classification system for arthroscopic evaluation of OCD of the knee that includes 6 arthroscopic categories—3 immobile types and 3 mobile types. To optimize comprehensibility and applicability, each was developed with a memorable name, a brief description, a line diagram corresponding to the archetypal arthroscopic appearance, and an arthroscopic photograph depicting this archetype. Thirty representative arthroscopic videos were evaluated by 10 orthopaedic surgeon raters, who classified each lesion. After 4 weeks, the raters again classified the OCD lesions depicted in the 30 videos in a new, randomly selected order. Reliability was assessed via the intraclass correlation coefficient (ICC). Results: The interobserver reliability of this novel arthroscopy classification was estimated by an ICC of 0.94 (95% CI, 0.91-0.97) for the first round and 0.95 (95% CI, 0.93-0.98) for the second round. According to the standards for the magnitude of the reliability coefficient of Altman, these ICCs indicate that interobserver reliability was very good. The intraobserver reliability was estimated by an ICC of 0.96 (95% CI, 0.95-0.97), which indicates that the intraobserver reliability was similarly very good. 
Conclusion: The ROCK OCD knee arthroscopy classification system demonstrated excellent intra- and interobserver reliability. In light of this reliability, this classification system may be used clinically and to facilitate future research, including multicenter studies for OCD.


Interobserver agreement of the Paris and simplified classifications of superficial colonic lesions: a Western study

Endoscopy International Open ◽

10.1055/a-1352-3437 ◽

2021 ◽

Vol 09 (03) ◽

pp. E388-E394

Author(s):

Francesco Cocomazzi ◽

Marco Gentile ◽

Francesco Perri ◽

Antonio Merla ◽

Fabrizio Bossa ◽

...

Keyword(s):

Interobserver Agreement ◽

Classification Systems ◽

Sensitivity Analyses ◽

Size Estimation ◽

Rater Agreement ◽

Staff Members ◽

Video Clips ◽

Paris Classification ◽

Level Of Agreement

Background and study aims: The Paris classification of superficial colonic lesions has been widely adopted, but a simplified description that subgroups the shape into pedunculated, sessile/flat and depressed lesions has been proposed recently. The aim of this study was to evaluate the accuracy and inter-rater agreement among 13 Western endoscopists for the two classification systems. Methods: Seventy video clips of superficial colonic lesions were classified according to the two classifications, and their size estimated. The interobserver agreement for each classification was assessed using both Cohen κ and AC1 statistics. Accuracy was taken as the concordance between the standard morphology definition and that made by participants. Sensitivity analyses investigated agreement between trainees (T) and staff members (SM), simple or mixed lesions, distinct lesion phenotypes, and for laterally spreading tumors (LSTs). Results: Overall, the interobserver agreement for the Paris classification was substantial (κ = 0.61; AC1 = 0.66), with 79.3 % accuracy. Between SM and T, the values were superimposable. For size estimation, the agreement was 0.48 by the κ-value, and 0.50 by AC1. For single or mixed lesions, κ-values were 0.60 and 0.43, respectively; corresponding AC1 values were 0.68 and 0.57. Evaluating the several different polyp subtypes separately, agreement differed significantly when analyzed by the κ statistic (0.08–0.12) or the AC1 statistic (0.59–0.71). Analyses of LSTs provided a κ-value of 0.50 and an AC1 score of 0.62, with 77.6 % accuracy. The simplified classification outperformed the Paris classification: κ = 0.68, AC1 = 0.82, accuracy = 91.6 %. Conclusions: Agreement is often measured with Cohen’s κ, but we documented higher levels of agreement when analyzed with the AC1 statistic. The level of agreement was substantial for the Paris classification, and almost perfect for the simplified system.
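The gap between κ and AC1 reported for the polyp subtypes can be illustrated with a small two-rater calculation. The sketch below computes Cohen's κ and Gwet's AC1 side by side, using the standard two-rater AC1 chance term (the sum of π(1 − π) over categories, divided by the number of categories minus one, where π is the average of the two raters' marginal proportions). The data, labels and the function name `kappa_and_ac1` are invented for illustration; the study itself involved 13 endoscopists, so this is a simplified assumption rather than its actual procedure.

```python
from collections import Counter


def kappa_and_ac1(rater_a, rater_b, categories=None):
    """Return (Cohen's kappa, Gwet's AC1) for two raters.

    `categories` defaults to the labels actually observed; pass the full
    scale explicitly if some categories were never used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    cats = sorted(set(categories) if categories else set(rater_a) | set(rater_b))
    if len(cats) < 2:
        raise ValueError("at least two categories are required")

    p_observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)

    # Cohen: chance agreement from the product of each rater's marginals.
    pe_kappa = sum(counts_a[c] * counts_b[c] for c in cats) / n**2
    # Gwet: chance agreement from the averaged marginal proportion per category.
    pi = {c: (counts_a[c] + counts_b[c]) / (2 * n) for c in cats}
    pe_ac1 = sum(p * (1 - p) for p in pi.values()) / (len(cats) - 1)

    kappa = (p_observed - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_observed - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1


# Hypothetical skewed sample: 20 lesions, nearly all called "sessile".
rater_a = ["sessile"] * 18 + ["pedunculated", "sessile"]
rater_b = ["sessile"] * 18 + ["sessile", "pedunculated"]

kappa, ac1 = kappa_and_ac1(rater_a, rater_b)
print(f"kappa = {kappa:.3f}, AC1 = {ac1:.3f}")
```

On this invented sample the two raters agree on 18 of 20 lesions; κ comes out slightly below zero while AC1 is close to 0.89, which mirrors the direction of the difference described in the abstract when one category dominates.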


Toward the Development of a Comprehensive Clinically Oriented Patient Profile: A Systematic Review of the Purpose, Characteristic, and Methodological Quality of Classification Systems of Adult Spinal Deformity

Neurosurgery ◽

10.1093/neuros/nyab023 ◽

2021 ◽

Author(s):

Kenny Yat Hong Kwan ◽

J Naresh-Babu ◽

Wilco Jacobs ◽

Marinus de Kleuver ◽

David W Polly ◽

...

Keyword(s):

Systematic Review ◽

Decision Making ◽

Classification System ◽

Methodological Quality ◽

Literature Search ◽

Adult Spinal Deformity ◽

Classification Systems ◽

Systematic Literature Search ◽

Patient Profile

BACKGROUND: Existing adult spinal deformity (ASD) classification systems are based on radiological parameters, but management of ASD patients requires a holistic approach. A comprehensive clinically oriented patient profile and classification of ASD that can guide decision-making and correlate with patient outcomes is lacking. OBJECTIVE: To perform a systematic review to determine the purpose, characteristic, and methodological quality of classification systems currently used in ASD. METHODS: A systematic literature search was conducted in MEDLINE, EMBASE, CINAHL, and Web of Science for literature published between January 2000 and October 2018. From the included studies, the list of classification systems, their methodological measurement properties, and their correlation with treatment outcomes were analyzed. RESULTS: Out of 4470 screened references, 163 were included, and 54 different classification systems for ASD were identified. The most commonly used was the Scoliosis Research Society-Schwab classification system. A total of 35 classifications were based on radiological parameters, and no correlation was found between any classification system levels and patient-related outcomes. Limited evidence of limited quality was available on the methodological quality of the classification systems. For studies that reported the data, intraobserver and interobserver reliability were good (kappa = 0.8). CONCLUSION: This systematic literature search revealed that current classification systems in clinical use neither include a comprehensive set of dimensions relevant to decision-making nor correlate with outcomes. A classification system comprising a core set of patient-related, radiological, and etiological characteristics relevant to the management of ASD is needed.


CT Findings in General Practice Patients with Suspected Acute Sinusitis

Acta Radiologica ◽

10.1177/02841851960373p258 ◽

1996 ◽

Vol 37 (3P2) ◽

pp. 708-713 ◽

Cited By ~ 9

Author(s):

M. Lindbæk ◽

U. L.-H. Johnsen ◽

E. Kaastad ◽

S. Dølvik ◽

P. Møll ◽

...

Keyword(s):

General Practice ◽

Clinical Diagnosis ◽

Interobserver Agreement ◽

Acute Sinusitis ◽

Fluid Level ◽

CT Findings ◽

Kappa Value ◽

CT Evaluation ◽

Common Combination ◽

Maxillary Sinuses

Purpose: To study CT findings in general practice patients with a clinical diagnosis of acute sinusitis, and to examine the interobserver variation between 2 radiologists with regard to their CT evaluation. Material and Methods: Two hundred and one patients were examined with coronal CT images of the paranasal sinuses within 2 days of the clinical diagnosis. Patients with chronic sinusitis were excluded. Fluid level or total opacification of any sinus was used as evidence of sinusitis. Results: One hundred and twenty-seven (63%) patients had fluid level or total opacification in a sinus region, most in more than one region. One hundred and fifteen had CT signs of sinusitis in the ethmoid region, 84 in the maxillary, 18 in the frontal, and 10 in the sphenoid. Forty-nine patients had a negative CT. In the evaluation of interobserver agreement, the overall assessment of the CT yielded a kappa value of 0.70. Conclusion: The study demonstrated great variation in the CT findings in general practice patients with suspected acute sinusitis. More than one sinus region was affected in most patients in whom sinusitis was confirmed by CT imaging; the most common combination was ethmoid and maxillary sinuses. The interobserver agreement was substantial.
