Analysis of interobserver reproducibility in grading histological patterns of dysplastic nevi

BACKGROUND: Dysplastic nevi are among the most important cutaneous melanoma simulators. They are important risk markers for this neoplasia and can be its potential precursors. Some authors found a statistically significant relationship between the degree of dysplasia and the risk for developing melanoma. However, reproducibility of grading criteria ranged from poor to fair in the researched articles. OBJECTIVE: To test the reproducibility of the grading criteria proposed by Sagebiel et al. regarding dysplastic nevi. METHODS: Histological specimens of 75 dysplastic nevi were graded, independently and in a blinded fashion, according to preestablished criteria, by a panel of 10 pathologists with different levels of experience. Diagnostic agreement was calculated using weighted kappa and intraclass correlation coefficients. RESULTS: The average of weighted kappa values was 0.13 for all observers, 0.12 for dermatopathologists, 0.18 for general pathologists and 0.05 for residents. Intraclass correlation coefficient values were 0.2 for all observers, 0.18 for dermatopathologists, 0.33 for general pathologists and 0.15 for residents. CONCLUSIONS: Histopathological grading for dysplastic nevi was not reproducible in this Brazilian series, so the criteria used are not a helpful histopathological parameter for clinicopathological correlation.
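The agreement statistic reported above can be illustrated with a minimal sketch of Cohen's weighted kappa, averaged over all pairs of observers. The abstract does not say which weighting scheme was used or how the average was formed, so the linear weights, the pairwise averaging, the function names, and the example grades are all assumptions made for illustration.

```python
from itertools import combinations


def weighted_kappa(a, b, k, power=1):
    """Cohen's weighted kappa for two raters.

    a, b  : equal-length lists of integer grades in 0..k-1
    power : 1 for linear disagreement weights, 2 for quadratic (assumed options)
    Undefined (division by zero) if both raters use a single identical grade.
    """
    n = len(a)
    # Observed proportion of cases in each (grade by rater A, grade by rater B) cell.
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            observed += w * obs[i][j]
            expected += w * row[i] * col[j]      # chance disagreement from marginals
    return 1.0 - observed / expected


def mean_pairwise_kappa(ratings, k, power=1):
    """Average weighted kappa over every pair of raters (hypothetical helper)."""
    pairs = list(combinations(ratings, 2))
    return sum(weighted_kappa(a, b, k, power) for a, b in pairs) / len(pairs)


# Hypothetical grades (0 = mild, 1 = moderate, 2 = severe) from three raters.
example = [[0, 1, 2], [0, 1, 2], [2, 1, 0]]
print(mean_pairwise_kappa(example, k=3))
```

Values near zero, as in the results above, mean that observed agreement is barely better than what the raters' marginal grade frequencies would produce by chance.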

Multilevel Exploratory Factor Analysis of the Feeling Word Checklist–24

Assessment, 2016, Vol 24 (7), pp. 907-918. DOI: 10.1177/1073191116632336. Cited By ~ 4

Author(s): Karin Lindqvist, Fredrik Falkenström, Rolf Sandell, Rolf Holmqvist, Annika Ekeblad, ...

Keyword(s): Factor Analysis, Exploratory Factor Analysis, Therapeutic Relationship, Intraclass Correlation, Correlation Coefficients, Emotional Reactions, Intraclass Correlation Coefficients, Therapeutic Interaction, Random Intercept, Different Levels

Emotional reactions are a vital part of the therapeutic relationship. The Feeling Word Checklist–24 (FWC-24) is an instrument asking the clinician (or the patient) to report to what degree he or she has experienced various feelings during a therapeutic interaction. The aim of this study was to assess the factor structure of the clinician-rated FWC-24 when taking dependencies in the data into account. The sample was deliberately heterogeneous and consisted of 4,443 ratings made by 101 psychotherapists working with different psychotherapy methods in relation to 191 patients of different ages, genders, and with different primary diagnoses. A random intercept-only model revealed large intraclass correlation coefficients at the therapist level, indicating that a multilevel analysis was warranted. A two-level exploratory factor analysis with therapists as the between level and patients plus sessions as the within level was conducted. The items from FWC-24 were found to be best represented by four factors on the between level and four factors on the within level. The factor structures were largely similar on the two levels and were labeled Engaged, Inadequate, Relaxed, and Moved. The different factors explained different amounts of variance on different levels, indicating that some factors are more therapist dependent and some more patient dependent.

Agreement in Occupational Exposures Between Men and Women Using Retrospective Assessments by Expert Coders

Annals of Work Exposures and Health, 2018, Vol 62 (9), pp. 1159-1170. DOI: 10.1093/annweh/wxy074. Cited By ~ 2

Author(s): Aude Lacourt, France Labrèche, Mark S Goldberg, Jack Siemiatycki, Jérôme Lavoué

Keyword(s): Intraclass Correlation, Correlation Coefficients, Occupational Exposures, Weighted Kappa, Case Control Studies, Men And Women, Bayesian Hierarchical, Intraclass Correlation Coefficients, Occupational Agent, Exposure Metrics

Objectives: To estimate the level of agreement and identify notable differences in occupational exposures (agents) between men and women from retrospective assessments by expert coders. Methods: Lifetime occupational histories of 1657 men and 2073 women from two case–control studies were translated into exposure estimates to 243 agents, from data on 13882 jobs. Exposure estimates were summarized as proportions and frequency-weighted intensity of exposure for 59 occupational codes by sex. Agreement between metrics of exposure in men’s and women’s jobs was determined with intraclass correlation coefficients (ICC) and weighted kappa coefficients, using as the unit of analysis (‘cell’) a combination of occupational code and occupational agent. ‘Notable’ differences between men and women were identified for each cell, according to a Bayesian hierarchical model for both proportion and frequency-weighted intensity of exposure. Results: For cells common to both men and women, the ICC for continuous probability of exposure was 0.84 (95% CI: 0.83–0.84), and 7.4% of cells showed notable differences, with jobs held by men being more often exposed. A weighted kappa of 0.67 (95% CI: 0.61–0.73) was calculated for intensity of exposure, and an ICC of 0.67 (95% CI: 0.62–0.71) for frequency-weighted intensity of exposure, with a tendency toward higher values of exposure metrics in jobs held by men. Conclusions: Exposures were generally in agreement between men and women. Some notable differences were identified, most of them explained by differential sub-occupations or industries or dissimilar reported tasks within the studied occupations.

Reliability and Validity of Trunk Assessment for People With Multiple Sclerosis

Physical Therapy, 2006, Vol 86 (1), pp. 66-76. DOI: 10.1093/ptj/86.1.66. Cited By ~ 35

Author(s): Geert Verheyden, Godelieve Nuyens, Alice Nieuwboer, Pol Van Asch, Piet Ketelaer, ...

Keyword(s): Multiple Sclerosis, Construct Validity, Intraclass Correlation, Reliability And Validity, Correlation Coefficients, Weighted Kappa, Altman Analysis, Bland Altman Analysis, Intraclass Correlation Coefficients, Test Retest Reliability

Background and Purpose. Standardized scales are a prerequisite for rehabilitation and research. This study was designed to determine the reliability and validity of scores on items of the trunk assessment of the Melsbroek Disability Scoring Test (MDST) and Trunk Impairment Scale (TIS) in people with multiple sclerosis (MS). Subjects. Thirty people with MS participated in the study. Methods. Interrater and test-retest reliability and construct validity were assessed. Results. Kappa and weighted kappa values for the items of the trunk assessment of the MDST ranged from .74 to .95, and the kappa and weighted kappa values for the TIS items ranged from .46 to 1.00. Intraclass correlation coefficients for interrater and test-retest agreement were .93 and .92, respectively, for the trunk assessment of the MDST and .97 and .95, respectively, for the TIS. Bland-Altman analysis showed consistency of scores without observer bias. Construct validity was established. Discussion and Conclusion. The MDST and TIS provide reliable assessments of the trunk and are valid scales for measuring trunk performance in people with MS.

HoNOS–ABI: a reliable outcome measure of neuropsychiatric sequelae to brain injury?

Psychiatric Bulletin, 2005, Vol 29 (2), pp. 53-55. DOI: 10.1192/pb.29.2.53. Cited By ~ 12

Author(s): S. Fleminger, E. Leigh, P. Eames, L. Langrell, R. Nagraj, ...

Keyword(s): Brain Injury, Interrater Reliability, Outcome Measure, Intraclass Correlation, Correlation Coefficients, Acquired Brain Injury, Weighted Kappa, Routine Clinical Practice, Clinical Implications, Intraclass Correlation Coefficients

Aims and Method: The Health of the Nation Outcome Scale for Acquired Brain Injury (HoNOS–ABI) is a relatively new outcome measure designed to assess the neuropsychiatric sequelae of brain damage. This study investigated the interrater reliability of this scale. Fifty patients with traumatic brain injury receiving rehabilitation were each rated twice on the HoNOS–ABI, by two different raters. There were 24 raters in total. Results: Weighted kappa values ranged from 0.43 to 0.84 and intraclass correlation coefficients from 0.58 to 0.97 for the ten items assessed. This indicated that agreement was moderate to substantial for all items. Clinical Implications: The scales consistently measured the items of interest across different raters. This indicates that HoNOS–ABI is a reliable outcome measure when applied by different raters in routine clinical practice.

Evaluation of the inter and intraobserver reproducibility of the GRASP method: a goniometric method to measure the isolated glenohumeral range of motion in the shoulder joint

Journal of Experimental Orthopaedics, 2021, Vol 8 (1). DOI: 10.1186/s40634-021-00352-z

Author(s): Miguel Angel Ruiz Ibán, Susana Alonso Güemes, Raquel Ruiz Díaz, Cristina Victoria Asenjo Gismero, Alejandro Lorente Gomez, ...

Keyword(s): Range Of Motion, Internal Rotation, Glenohumeral Joint, External Rotation, Intraclass Correlation, Correlation Coefficients, Outpatient Setting, Interobserver Reproducibility, Level Of Evidence, Intraclass Correlation Coefficients

Purpose: To evaluate the intra- and interobserver reproducibility of a new goniometric method for evaluating the isolated passive range of motion of the glenohumeral joint in an outpatient setting. Methods: This is a prospective observational study on healthy subjects. The Glenohumeral ROM Assessment with Scapular Pinch (GRASP) method is a new method for assessing the isolated range of motion (ROM) of the glenohumeral joint (GH) by a single examiner with a clinical goniometer. It measures the isolated glenohumeral passive abduction (GH-AB), passive external rotation (GH-ER) and internal rotation (GH-IR) with the arm at 45° of abduction. These three GH ROM parameters were measured in both shoulders of 30 healthy volunteers (15 males/15 females, mean age: 41.6 [SD = 10.3] years). The full shoulder passive abduction, passive external rotation and internal rotation at 45° of abduction were measured by the same examiners with a goniometer for comparison. One examiner made two evaluations and a second examiner made a third one. The primary outcome was the intra- and interobserver reproducibility of the measurements, assessed with intraclass correlation coefficients (ICC) and the Bland–Altman plot. Results: The intra-observer ICC for isolated glenohumeral ROM were: 0.84 ± 0.07 for GH-ABD, 0.63 ± 0.09 for GH-ER, and 0.61 ± 0.14 for GH-IR. The inter-observer ICC for isolated glenohumeral ROM were: 0.86 ± 0.06 for GH-ABD, 0.68 ± 0.12 for GH-ER, and 0.62 ± 0.14 for GH-IR. These results were similar to those obtained for full shoulder ROM assessment with a goniometer. Conclusion: The GRASP method is reproducible for quick assessment of isolated glenohumeral ROM. Level of evidence: III

Development and Reliability of an Athlete Introductory Movement Screen for Use in Emerging Junior Athletes

Pediatric Exercise Science, 2019, Vol 31 (4), pp. 448-457. DOI: 10.1123/pes.2018-0244. Cited By ~ 2

Author(s): Simon A. Rogers, Peter Hassmén, Alexandra H. Roberts, Alison Alcock, Wendy L. Gilleard, ...

Keyword(s): Intraclass Correlation, Correlation Coefficients, Weighted Kappa, Intrarater Reliability, Training Interventions, Intraclass Correlation Coefficients, Junior Athletes, Sum Score, Movement Screening, Sum Scores

Purpose: A novel 4-task Athlete Introductory Movement Screen was developed and tested to provide an appropriate and reliable movement screening tool for youth sport practitioners. Methods: The overhead squat, lunge, push-up, and a prone brace with shoulder touches were selected based on previous assessments. A total of 28 mixed-sport junior athletes (18 boys and 10 girls; mean age = 15.7 [1.8] y) completed screening after viewing standardized demonstration videos. Athletes were filmed performing 8 repetitions of each task and assessed retrospectively by 2 independent raters using a 3-point scale. The primary rater reassessed the footage 3 weeks later. A subgroup (n = 11) repeated the screening 7 days later, and a further 8 athletes were reassessed 6 months later. Intraclass correlation coefficients (ICC), typical error (TE), coefficient of variation (CV%), and weighted kappa (k) were used in reliability analysis. Results: For the Athlete Introductory Movement Screen 4-task sum score, intrarater reliability was high (ICC = .97; CV = 2.8%), whereas interrater reliability was good (intraclass correlation coefficient = .88; CV = 5.6%). There was a range of agreement from fair to almost perfect (k = .31–.89) between raters across individual movements. A 7-day and 6-month test–retest held good reliability and acceptable CVs (≤ 10%) for sum scores. Conclusion: The 4-task Athlete Introductory Movement Screen appears to be a reliable tool for profiling emerging athletes. Reliability was strongest within the same rater; it was lower, yet acceptable, between 2 raters. Scores can provide an overview of appropriate movement competencies, helping practitioners assess training interventions in the athlete development pathway.

Reproducibility of food and nutrient intake estimates using a semi-quantitative FFQ in Australian adults

Public Health Nutrition, 2009, Vol 12 (12), pp. 2359-2365. DOI: 10.1017/s1368980009005023. Cited By ~ 49

Author(s): Torukiri I Ibiebele, Sanjoti Parekh, Kylie-ann Mallitt, Maria Celia Hughes, Peter K O’Rourke, ...

Keyword(s): Vegetable Intake, Adult Population, Intraclass Correlation, Correlation Coefficients, Weighted Kappa, Kappa Statistics, Food Groups, Intraclass Correlation Coefficients, One Year, General Community

Objective: To assess the reproducibility of a 135-item self-administered semi-quantitative FFQ. Design: Control subjects who had previously completed an FFQ relating to usual dietary intake in a nationwide case–control study of cancer between November 2003 and April 2004 were randomly selected, re-contacted, and invited to complete the same FFQ a second time approximately one year later (between January and April 2005). Agreement between the two FFQ was compared using weighted kappa statistics and intraclass correlation coefficients (ICC) for food groups and nutrients. Summary questions, included in the FFQ, were used to assess overall intakes of vegetables, fruits and meat. Setting: General community in Australia. Subjects: One hundred men and women aged 22–79 years, randomly selected from the previous control population. Results: The weighted κ and ICC measures of agreement for food groups were moderate to substantial for seventeen of the eighteen food groups. For nutrients, weighted κ ranged from 0·44 for starch to 0·83 for alcohol while ICC ranged from 0·51 to 0·91 for the same nutrients. Estimates of meat, fruit and vegetable intake using summary questions were similar for both survey periods, but were significantly lower than estimates from summed individual food items. Conclusions: The FFQ produced reproducible results and is reasonable in assessing the usual intake of various foods and nutrients among an Australian adult population.

Evaluation of previously embolized intracranial aneurysms: inter-and intra-rater reliability among neurosurgeons and interventional neuroradiologists

Journal of NeuroInterventional Surgery, 2017, Vol 10 (5), pp. 462-466. DOI: 10.1136/neurintsurg-2017-013231. Cited By ~ 3

Author(s): Scott L Zuckerman, Nikita Lakomkin, Jordan A Magarik, Jan Vargas, Marcus Stephens, ...

Keyword(s): Intracranial Aneurysms, Intraclass Correlation, Correlation Coefficients, Weighted Kappa, Years Of Experience, Rater Agreement, Rater Reliability, Intraclass Correlation Coefficients, Level Of Agreement, Strong Agreement

Background: The angiographic evaluation of previously coiled aneurysms can be difficult yet remains critical for determining re-treatment. Objective: The main objective of this study was to determine the inter-rater reliability for both the Raymond Scale and per cent embolization among a group of neurointerventionalists evaluating previously embolized aneurysms. Methods: A panel of 15 neurointerventionalists examined 92 distinct cases of immediate post-coil embolization and 1 year post-embolization angiographs. Each case was presented four times throughout the study, along with alterations in demographics in order to evaluate intra-rater reliability. All respondents were asked to provide the per cent embolization (0–100%) and Raymond Scale grade (1-3) for each aneurysm. Inter-rater reliability was evaluated by computing weighted kappa values (for the Raymond Scale) and intraclass correlation coefficients (ICC) for per cent embolization. Results: 10 neurosurgeons and 5 interventional neuroradiologists evaluated 368 simulated cases. The agreement among all readers employing the Raymond Scale was fair (κ=0.35) while concordance in per cent embolization was good (ICC=0.64). Clinicians with fewer than 10 years of experience demonstrated a significantly greater level of agreement than the group with greater than 10 years (κ=0.39 and ICC=0.70 vs κ=0.28 and ICC=0.58). When the same aneurysm was presented multiple times, clinicians demonstrated excellent consistency when assessing per cent embolization (ICC=0.82), but moderate agreement when employing the Raymond classification (κ=0.58). Conclusions: Identifying the per cent embolization in previously coiled aneurysms resulted in good inter- and intra-rater agreement, regardless of years of experience. The strong agreement among providers employing per cent embolization may make it a valuable tool for embolization assessment in this patient population.

Interobserver Reliability Using the Phonetic Level Evaluation With Severely and Profoundly Hearing-Impaired Children

Journal of Speech Language and Hearing Research, 1991, Vol 34 (5), pp. 989-999. DOI: 10.1044/jshr.3405.989. Cited By ~ 6

Author(s): Stephanie Shaw, Truman E. Coggins

Keyword(s): Interrater Reliability, Interobserver Reliability, Intraclass Correlation, Correlation Coefficients, Hearing Impaired, Intraclass Correlation Coefficients, Assessment Measure, Impaired Children, Speech Assessment, Hearing Impaired Children

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.
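As an illustration of the interrater statistic named above, here is a minimal sketch of one common intraclass correlation form, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater). The abstract does not state which ICC form the authors used, so this choice, the function name, and the example scores are assumptions.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of n subject rows, each holding one score per rater (k raters).
    Requires n >= 2 and k >= 2.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, and residual error.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical scores: three subjects, two raters, rater 2 always one point higher.
print(icc_2_1([[1, 2], [2, 3], [3, 4]]))
```

Because this form measures absolute agreement, a systematic offset between raters lowers the coefficient even when the raters rank the subjects identically; a consistency-type ICC would not penalize that offset.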

Is there a relationship between the overhead press and split jerk maximum performance? Influence of sex

International Journal of Sports Science & Coaching, 2021, pp. 174795412110204. DOI: 10.1177/17479541211020452

Author(s): Marcos A Soriano, G Gregory Haff, Paul Comfort, Francisco J Amaro-Gahete, Antonio Torres-González, ...

Keyword(s): Confidence Intervals, Body Mass, Upper Limb, High Reliability, Intraclass Correlation, Correlation Coefficients, Training Experience, Maximum Performance, Repetition Maximum, Intraclass Correlation Coefficients

The aims of this study were to (I) determine the differences and relationship between the overhead press and split jerk performance in athletes involved in weightlifting training, and (II) explore the magnitude of these differences in one-repetition maximum (1RM) performances between sexes. Sixty-one men (age: 30.4 ± 6.7 years; height: 1.8 ± 0.5 m; body mass 82.5 ± 8.5 kg; weightlifting training experience: 3.7 ± 3.5 yrs) and 21 women (age: 29.5 ± 5.2 yrs; height: 1.7 ± 0.5 m; body mass: 62.6 ± 5.7 kg; weightlifting training experience: 3.0 ± 1.5 yrs) participated. The 1RM performance of the overhead press and split jerk were assessed for all participants, with the overhead press assessed on two occasions to determine between-session reliability. The intraclass correlation coefficients (ICC) and 95% confidence intervals showed a high reliability for the overhead press ICC = 0.98 (0.97 – 0.99). A very strong correlation and significant differences were found between the overhead press and split jerk 1RM performances for all participants (r = 0.90 [0.93 – 0.85], 60.2 ± 18.3 kg, 95.7 ± 29.3 kg, p ≤ 0.001). Men demonstrated stronger correlations between the overhead press and split jerk 1RM performances (r = 0.83 [0.73-0.90], p ≤ 0.001) compared with women (r = 0.56 [0.17-0.80], p = 0.008). These results provide evidence that 1RM performance of the overhead press and split jerk performance are highly related, highlighting the importance of upper-limb strength in the split jerk maximum performance.
