Reliability of the Grading System for Voiding Cystourethrograms in the Management of Vesicoureteral Reflux: An Interrater Comparison

2016 ◽  
Vol 2016 ◽  
pp. 1-4 ◽  
Author(s):  
Süleyman Çelebi ◽  
Seyithan Özaydın ◽  
Cemile Beşik Baştaş ◽  
Özgür Kuzdan ◽  
Cankat Erdoğan ◽  
...  

Aim. Vesicoureteral reflux (VUR) is one of the most common conditions seen in pediatric urology. Fortunately, there are many treatment options for this disorder. The grading system for VUR varies among doctors, and the literature on its reliability is sparse. Here, we assessed the effectiveness of the current VUR grading system. Methods. A series of 40 voiding cystourethrogram (VCUG) studies were selected. Four pediatric urologists (PU) and four pediatric radiologists (PR) independently graded each VCUG and then agreed on a uniform interpretation. For statistical analysis the intraclass correlation coefficient (ICC) was applied to assess interrater agreement. Results. ICC values ranging from 0.82 to 0.88 reflected the strong reliability of VCUG for grading cases of VUR among pediatric urologists and radiologists as separate groups, and the reliability between the two groups was also good, as indicated by an ICC of 0.89. Despite the high ICC, disagreement existed between raters; the lowest agreement was associated with middle grades (III and IV). Conclusions. The interrater reliability of the international grading system for VUR was high but imperfect. Thus, grading differences at middle grades can profoundly influence the type of treatment pursued.

2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Jiali Lou ◽  
Yongliang Jiang ◽  
Hantong Hu ◽  
Xiaoyu Li ◽  
Yajun Zhang ◽  
...  

The objective of this study was to determine the intrarater and interrater reliabilities of infrared image analysis of forearm acupoints before and after moxibustion. In this work, infrared images of acupoints in the forearm of 20 volunteers (M/F, 10/10) were collected prior to and after moxibustion by infrared thermography (IRT). Two trained raters performed the analysis of infrared images in two different periods at a one-week interval. The intraclass correlation coefficient (ICC) was calculated to determine the intrarater and interrater reliabilities. With regard to the intrarater reliability, ICC values were between 0.758 and 0.994 (substantial to excellent). For the interrater reliability, ICC values ranged from 0.707 to 0.964 (moderate to excellent). Given that the intrarater and interrater reliability levels show excellent concordance, IRT could be a reliable tool to monitor the temperature change of forearm acupoints induced by moxibustion.


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Laszlo Kiraly ◽  
Jana Stange ◽  
Kathleen S. Kunert ◽  
Saadettin Sel

Background. To estimate repeatability and comparability of central corneal thickness (CCT) and keratometry measurements obtained by four different devices in healthy eyes. Methods. Fifty-five healthy eyes from 55 volunteers were enrolled in this study. CCT (IOLMaster 700, Pentacam HR, and Cirrus HD-OCT) and keratometry readings (IOLMaster 700, Pentacam HR, and iDesign) were measured. For statistical analysis, the corneal spherocylinder was converted into power vectors (J0, J45). Repeatability was assessed by intraclass correlation coefficient (ICC). Agreement of measurements between the devices was evaluated by the Bland-Altman method. Results. The analysis of repeatability of CCT data of IOLMaster 700, Pentacam HR, and Cirrus HD-OCT showed high ICCs (range 0.995 to 0.999). The comparison of CCT measurements revealed statistically significant differences between Pentacam HR and IOLMaster 700 (p<0.0001) and between Pentacam HR and Cirrus HD-OCT (p<0.0001). There was no difference in CCT measurements between IOLMaster 700 and Cirrus HD-OCT (p=0.519). The repeatability of keratometry readings (J0 and J45) of IOLMaster 700, Pentacam HR, and iDesign was also high, with ICCs ranging from 0.974 to 0.999. The Pentacam HR showed significantly higher J0 than IOLMaster 700 (p=0.009) and iDesign (p=0.041); however, there was no significant difference between IOLMaster 700 and iDesign (p=0.426). Comparison of J45 showed no significant difference between IOLMaster 700, Pentacam HR, and iDesign. These results were in accordance with Bland-Altman plots. Conclusion. In clinical practice, the devices analyzed should not be used interchangeably due to low agreement regarding CCT as well as keratometry readings.
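The Bland-Altman method named in this abstract summarizes agreement between two devices as the mean difference (bias) and the limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. The following is a minimal sketch of that calculation, not the authors' analysis; the function name and the CCT readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # mean difference between the two devices
    sd = stdev(diffs)    # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CCT readings in micrometres; these are NOT data from the study.
device_a = [540.0, 552.0, 531.0, 560.0, 545.0]
device_b = [536.0, 549.0, 529.0, 553.0, 543.0]

bias, lower, upper = bland_altman(device_a, device_b)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Two devices can each be highly repeatable (high ICC) and still show a non-zero bias or wide limits of agreement, which is why repeatability and between-device agreement are assessed separately.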


2007 ◽  
Vol 35 (6) ◽  
pp. 933-935 ◽  
Author(s):  
Vishal M. Mehta ◽  
Liz W. Paxton ◽  
Stefan X. Fornalski ◽  
Rick P. Csintalan ◽  
Donald C. Fithian

Background: The International Knee Documentation Committee (IKDC) forms are commonly used to measure outcomes after anterior cruciate ligament (ACL) reconstruction. The knee examination portion of the IKDC forms includes a radiographic grading system to grade degenerative changes. The interrater and intrarater reliability of this radiographic grading system remain unknown. Hypothesis: We hypothesize that the IKDC radiographic grading system will have acceptable interrater and intrarater reliability. Study Design: Case series (diagnosis); Level of evidence, 4. Methods: Radiographs of 205 ACL-reconstructed knees were obtained at 5-year follow-up. Specifically, weightbearing posteroanterior radiographs of the operative knee in 35° to 45° of flexion and a lateral radiograph in 30° of flexion were used. The radiographs were independently graded by 2 sports medicine fellowship-trained orthopaedic surgeons using the IKDC 2000 standard instructions. One surgeon graded the same radiographs 6 months apart, blinded to patient and prior IKDC grades. The percentage agreement was calculated for each of the 5 knee compartments as defined by the IKDC. Interrater reliability was evaluated using the intraclass correlation coefficient (ICC) 2-way mixed effect model with absolute agreement. The Spearman rank-order correlation coefficient (rs) was applied to evaluate intrarater reliability. Results: The interrater agreement between the 2 surgeons was 59% for the medial joint space (ICC = 0.46; 95% confidence interval [CI] = 0.35-0.56), 54% for the lateral joint space (ICC = 0.45; 95% CI = 0.27-0.58), 49% for the patellofemoral joint (ICC = 0.40; 95% CI = 0.26-0.52), 63% for the anterior joint space (ICC = 0.20; 95% CI = 0.05-0.34), and 44% for the posterior joint space (ICC = 0.28; 95% CI = 0.15-0.40). The intrarater agreement was 83% for the medial joint space (rs = .77, P < .001), 86% for the lateral joint space (rs = .76, P < .001), 81% for the patellofemoral joint (rs = .79, P < .001), 91% for the anterior joint space (rs = .48, P < .001), and 69% for the posterior joint space (rs = .64, P < .001). Conclusions: While intrarater reliability was acceptable, interrater reliability was poor. These findings suggest that multiple raters may score the same radiographs differently using the IKDC radiographic grading system. The use of a single rater to grade all radiographs when using the IKDC radiographic grading system maximizes reliability.
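The two interrater statistics reported here, percentage agreement and a two-way absolute-agreement ICC, can be sketched from a subjects-by-raters table. The code below is an illustrative sketch only: it computes the single-rater absolute-agreement coefficient, often written ICC(A,1), from two-way ANOVA mean squares, on the assumption that this is the form intended (the abstract does not say whether single or average measures were used). Function names and grades are hypothetical.

```python
def icc_absolute_single(ratings):
    """Single-rater, absolute-agreement ICC from an n-subjects x k-raters table."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


def percent_agreement(ratings):
    """Percentage of subjects on which every rater gave the same grade."""
    same = sum(1 for row in ratings if len(set(row)) == 1)
    return 100.0 * same / len(ratings)


# Hypothetical grades from 2 raters on 6 knees; NOT data from the study.
grades = [[1, 1], [2, 3], [3, 3], [2, 2], [4, 3], [1, 2]]
print(percent_agreement(grades), icc_absolute_single(grades))

# A constant one-grade offset between raters lowers absolute agreement
# even though the two raters rank every subject identically.
offset = [[1, 2], [2, 3], [3, 4]]
print(icc_absolute_single(offset))
```

Percentage agreement ignores how far apart two grades are and is not corrected for chance, which is one reason it can diverge from the ICC, as it does for the anterior joint space in these results.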


1998 ◽  
Vol 18 (4) ◽  
pp. 193-206 ◽  
Author(s):  
Lena Haglund ◽  
Lars-Hakan Thorell ◽  
Jan Walinder

A Swedish version of the Occupational Case Analysis Interview and Rating Scale (OCAIRS-S) has been tested earlier for interrater reliability. The present study, using the second version of OCAIRS-S and including a sample of 145 patients, showed interrater correlations between .88 and .96 (Intraclass Correlation Coefficient). The results indicate that OCAIRS-S predicts which patients should be included in and excluded from occupational therapy and identifies patients who should be observed more before making such decisions. The study indicates a need for further investigations regarding which components in OCAIRS-S influence the occupational therapist in judging the patient's need for occupational therapy.


2018 ◽  
Vol 10 (4) ◽  
pp. 274-284 ◽  
Author(s):  
Suzanne F van Rijn ◽  
Elisa L Zwerus ◽  
Koen LM Koenraadt ◽  
Wilco CH Jacobs ◽  
Michel PJ van den Bekerom ◽  
...  

Background: The universal goniometer is a simple measuring tool. With this review we aimed to investigate the reliability and validity of the universal goniometer in measurements of the adult elbow. Methods: Preferred Reporting Items for Systematic reviews and Meta-Analysis guidelines were followed and our study protocol was published online at PROSPERO. A literature search was conducted on relevant studies. Methodological quality was assessed using the Quality Appraisal of Diagnostic Reliability (QAREL) scoring system. Results: Of the 697 studies yielded by our literature search, 12 were included. Six studies were rated as high quality. The intraclass correlation coefficient ranged from 0.45 to 0.99 for intrarater reliability and from 0.53 to 0.97 for interrater reliability. One study providing instructions on goniometric alignment did not find a difference between expert and non-expert examiners. Another study, in which examiners were not instructed, found a higher interrater reliability in expert examiners. One study investigating the validity of the goniometer in elbow measurements found a maximum standard error of the mean of 11.5° for total range of motion. Discussion: Overall, the studies showed high intra- and interrater reliability of the universal goniometer. The reliability of the universal goniometer in non-expert examiners can be increased by clear instructions on goniometric alignment.


2001 ◽  
Vol 10 (2) ◽  
pp. 79-83 ◽  
Author(s):  
LH Hogg ◽  
MB Bobek ◽  
LC Mion ◽  
BM Legere ◽  
S Banjac ◽  
...  

BACKGROUND: Critical care nurses must assess the effectiveness of sedatives and analgesic agents in order to titrate doses. OBJECTIVES: To measure the interrater reliability of 2 sedation scales used to assess patients in medical intensive care units. METHODS: The interrater reliabilities of the Motor Activity Assessment Scale and the Luer sedation scale were compared prospectively in 31 patients receiving mechanical ventilation in an 18-bed medical intensive care unit of a tertiary care institution. Three registered nurses, 1 clinical pharmacist, and 1 physician simultaneously and independently followed a standardized procedure to rate each patient by using the 2 scales. Scales were randomly ordered to counteract ordering effect. Analysis of variance with post hoc Duncan multiple range tests was used to detect bias; a correlation coefficient matrix was used to examine degree of association among raters; and the intraclass correlation coefficient was measured to control for multiple raters. RESULTS: No significant bias was detected with either scale. The Motor Activity Assessment Scale had less variation (Pearson r = 0.75-0.92) than did the Luer scale (Pearson r = 0.37-0.94) and had a stronger intraclass correlation coefficient (0.81 vs 0.79). CONCLUSIONS: The Motor Activity Assessment Scale showed the higher consistency among raters.


Author(s):  
Jenell L. S. Wittmer ◽  
James M. LeBreton

Statistics used to index interrater similarity are prevalent in many areas of the social sciences, with multilevel research being one of the most common domains for estimating interrater similarity. Multilevel research spans multiple hierarchical levels, such as individuals, teams, departments, and the organization. There are three main research questions that multilevel researchers answer using indices of interrater agreement and interrater reliability: (a) Does the nesting of lower-level units (e.g., employees) within higher-level units (e.g., work teams) result in the non-independence of residuals, which is an assumption of the general linear model?; (b) Is there sufficient agreement between scores on measures collected from lower-level units (e.g., employees' perceptions of customer service climate) to justify aggregating data to the higher-level (e.g., team-level climate)?; and (c) Following data aggregation, how effective are the higher-level unit means at distinguishing between those higher levels (e.g., how reliably do team climate scores distinguish between the teams)? Interrater agreement and interrater reliability refer to the extent to which lower-level data nested or clustered within a higher-level unit are similar to one another. While closely related, interrater agreement and reliability differ from one another in how similarity is defined. Interrater reliability is the relative consistency in lower-level data. For example, to what degree do the scores assigned by raters tend to correlate with one another? Alternatively, interrater agreement is the consensus of the lower-level data points. For example, estimates of interrater agreement are used to determine the extent to which ratings made by judges/observers could be considered interchangeable or equivalent in terms of their values. 
Thus, while interrater agreement and reliability both estimate the similarity of ratings by judges/observers, they define interrater similarity in slightly different ways, and these statistics are suited to address different types of research questions. The first research question that these statistics address, the issue of non-independence, is typically measured using an intraclass correlation statistic that is a function of both interrater reliability and agreement. However, in the context of non-independence, the intraclass correlation is most often interpreted as an effect size. The second multilevel research question, concerning adequate agreement to aggregate lower-level data to a higher level, would require a measure of interrater agreement, as the researcher is looking for consensus among raters. Finally, the third multilevel research question, concerning the reliability of higher-level means, not only requires a different variation of the intraclass correlation, but is also a function of both interrater reliability and agreement. Multilevel research requires researchers to appropriately apply interrater agreement and/or reliability statistics to their data, as well as follow best practices for calculating and interpreting these statistics.
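The first and third research questions above are commonly addressed with two intraclass correlations derived from a one-way ANOVA, usually labeled ICC(1) and ICC(2) in the multilevel literature. The abstract does not give formulas, so the sketch below assumes the standard one-way definitions and equal group sizes; the function name and ratings are hypothetical.

```python
def one_way_iccs(groups):
    """Return (ICC(1), ICC(2)) from a one-way ANOVA with equal group sizes.

    ICC(1): share of variance attributable to group membership (non-independence).
    ICC(2): reliability of the group means.
    """
    g, k = len(groups), len(groups[0])
    if g < 2 or k < 2 or any(len(grp) != k for grp in groups):
        raise ValueError("need at least 2 groups of equal size (>= 2 members each)")

    grand = sum(sum(grp) for grp in groups) / (g * k)
    means = [sum(grp) / k for grp in groups]

    ms_between = k * sum((m - grand) ** 2 for m in means) / (g - 1)
    ms_within = sum(
        (x - m) ** 2 for grp, m in zip(groups, means) for x in grp
    ) / (g * (k - 1))
    if ms_between == 0:
        raise ValueError("group means are identical; ICCs are undefined")

    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


# Hypothetical climate ratings from 3 teams of 4 employees each.
teams = [[4, 5, 4, 5], [2, 3, 2, 2], [5, 5, 4, 5]]
icc1, icc2 = one_way_iccs(teams)
print(f"ICC(1) = {icc1:.3f}, ICC(2) = {icc2:.3f}")
```

Under these definitions ICC(2) equals the Spearman-Brown step-up of ICC(1) for groups of size k, so group means can be reliable even when ICC(1) is modest, provided the groups are large enough.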


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Laila Najjari ◽  
Julia Hennemann ◽  
Pia Larscheid ◽  
Thomas Papathemelis ◽  
Nicolai Maass

Purpose. In the present study we propose a classification system to quantify cystoceles by perineal ultrasound (PUS). Materials and Methods. 120 PUS data were analyzed measuring the distance between the lowest point of the bladder and the midpubic line (MPL) during rest and Valsalva. Results were classified into groups and compared to POP-Q using the κ-coefficient. Results for exact bladder position were checked for interrater reliability using ICC and Pearson's coefficient, and results for classification were checked using the κ-coefficient. Bladder positions at rest and Valsalva were correlated with the distance between these points. Results. Highly significant differences concerning the position at rest and the distance between rest and Valsalva were found between the groups. For the interrater agreement, the Pearson correlation coefficient was ρ = 0.98, the ICC (A-1) = 0.98, and κ = 1.00. Comparing the classification results for POP-Q and PUS, the kappa-coefficient was κ = 0.65. Conclusion. PUS using the MPL and the classification system is a highly reliable tool for the evaluation of cystoceles. PUS shows good correlation with POP-Q. Furthermore, PUS offers unambiguous identification of the descending organ. Further studies are needed to evaluate the clinical use of the classification system proposed here.
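The κ-coefficient used here for the categorical classification corrects observed agreement for the agreement expected by chance. The sketch below shows unweighted Cohen's kappa for two raters; this is an assumption, since the abstract does not state whether a weighted or unweighted form was used. The function name and grade lists are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters classifying the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)

    # Observed proportion of cases on which the raters agree.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical cystocele grades (0-3) from two classification methods.
method_1 = [0, 1, 1, 2, 2, 2, 3, 1]
method_2 = [0, 1, 2, 2, 2, 1, 3, 1]
kappa = cohen_kappa(method_1, method_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raw agreement is 75%, but kappa is lower because a quarter to a third of matches would be expected by chance alone given the category frequencies.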


2014 ◽  
Vol 49 (5) ◽  
pp. 640-646 ◽  
Author(s):  
Mark A. Kevern ◽  
Michael Beecher ◽  
Smita Rao

Context: Athletes who participate in throwing and racket sports consistently demonstrate adaptive changes in glenohumeral-joint internal and external rotation in the dominant arm. Measurements of these motions have demonstrated excellent intrarater and poor interrater reliability. Objective: To determine intrarater reliability, interrater reliability, and standard error of measurement for shoulder internal rotation, external rotation, and total arc of motion using an inclinometer in 3 testing procedures in National Collegiate Athletic Association Division I baseball and softball athletes. Design: Cross-sectional study. Setting: Athletic department. Patients or Other Participants: Thirty-eight players participated in the study. Shoulder internal rotation, external rotation, and total arc of motion were measured by 2 investigators in 3 test positions. The standard supine position was compared with a side-lying test position, as well as a supine test position without examiner overpressure. Results: Excellent intrarater reliability was noted for all 3 test positions and ranges of motion, with intraclass correlation coefficient values ranging from 0.93 to 0.99. Results for interrater reliability were less favorable. Reliability for internal rotation was highest in the side-lying position (0.68) and reliability for external rotation and total arc was highest in the supine-without-overpressure position (0.774 and 0.713, respectively). The supine-with-overpressure position yielded the lowest interrater reliability results in all positions. The side-lying position had the most consistent results, with very little variation among intraclass correlation coefficient values for the various test positions. Conclusions: The results of our study clearly indicate that the side-lying test procedure is of equal or greater value than the traditional supine-with-overpressure method.
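The standard error of measurement named in the objective is commonly derived from a reliability coefficient as SEM = SD × √(1 − ICC), where SD is the between-subject standard deviation of the measurements. The abstract does not report SEM values or standard deviations, so the sketch below uses a hypothetical SD of 10° purely to show how the reported ICC range translates into measurement error; the function name is also an assumption.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), assuming a reliability coefficient in [0, 1]."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this sketch assumes 0 <= ICC <= 1")
    return sd * math.sqrt(1.0 - icc)


# Hypothetical between-subject SD in degrees; NOT a value from the study.
sd_degrees = 10.0

# ICC values quoted in the abstract: 0.93 (intrarater) and 0.68 (interrater).
sem_intra = standard_error_of_measurement(sd_degrees, 0.93)
sem_inter = standard_error_of_measurement(sd_degrees, 0.68)
print(f"SEM intrarater = {sem_intra:.2f} deg, SEM interrater = {sem_inter:.2f} deg")
```

Because SEM is expressed in the units of the measurement, it can be compared directly with the side-to-side rotation differences a clinician is trying to detect.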

