Appraisal of a scoring instrument for training and testing neonatal intubation skills

Objective: To determine the validity, reliability, feasibility and applicability of a neonatal intubation scoring instrument. Design: Prospective observational study. Setting: Simulation-based research and training centre (Center for Advanced Pediatric and Perinatal Education), California, USA. Subjects: Forty clinicians qualified for neonatal intubation. Interventions: Videotaped elective intubations on a neonatal patient simulator were scored by two independent raters. One rater scored the intubations twice. We scored the preparation of equipment and premedication, intubation performance, tube position/fixation, communication, number of attempts, duration and successfulness of the procedure. Main outcome measures: Intraclass correlation coefficients (ICC) were calculated for intrarater and inter-rater reliability. Kappa coefficients for individual items and mean kappa coefficients for all items combined were calculated. Construct validity was assessed with one-way analysis of variance using the hypothesis that experienced clinicians score higher than less experienced clinicians. The approximate time to score one intubation and the instrument’s applicability in another setting were evaluated. Results: ICCs for intrarater and inter-rater reliability were 0.99 (95% CI 0.98 to 0.99) and 0.89 (95% CI 0.35 to 0.96), and mean kappa coefficients were 0.93 (95% CI 0.85 to 1.01) and 0.71 (95% CI 0.56 to 0.92), respectively. There were no differences between the more and less experienced clinicians regarding preparation, performance, communication and total scores. The experienced group scored higher only on tube position/fixation (p=0.02). Scoring one intubation took approximately 15 min. Our instrument, developed in The Netherlands, could be readily applied in the USA. Conclusions: Our scoring instrument for simulated neonatal intubations appears to be reliable, feasible and applicable in another centre. Construct validity could not be established.
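The abstract reports kappa coefficients for individual checklist items. As a minimal sketch of how an unweighted Cohen's kappa for one item could be computed for two raters, assuming categorical scores: the function name `cohens_kappa` and the example scores are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores (1 = step done correctly, 0 = not) for ten recordings.
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
kappa = cohens_kappa(rater_1, rater_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be noticeably lower than the simple percentage of matching scores.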


Is the location of the signal intensity weighted centroid a reliable measurement of fluid displacement within the disc?

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2016-0178 ◽

2018 ◽

Vol 63 (4) ◽

pp. 453-460 ◽

Cited By ~ 7

Author(s):

Vahid Abdollah ◽

Eric C. Parent ◽

Michele C. Battié

Keyword(s):

Signal Intensity ◽

Water Distribution ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Region Of Interest ◽

Rater Reliability ◽

Fluid Displacement ◽

Intraclass Correlation Coefficients ◽

The Mean ◽

Standard Error Of Measurement

Abstract Degenerated discs have shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted-centroid reflects the water distribution within a region-of-interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted-centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted-centroid for all regions demonstrated excellent intraclass-correlation-coefficients for intra- (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted-centroid for all ROIs was 0 at both levels and 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass-correlation-coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass-correlation-coefficients from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted-centroid is a reliable biomarker for investigating the effects of disc interventions.
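The abstract reports a standard error of measurement (SEM) alongside the ICCs but does not say how it was derived. One commonly used relation is SEM = SD × sqrt(1 − ICC); the sketch below assumes that relation, and the function name and sample coordinates are hypothetical.

```python
import math
import statistics


def standard_error_of_measurement(scores, icc):
    """SEM from the sample standard deviation and a reliability coefficient.

    Assumes the common relation SEM = SD * sqrt(1 - ICC); the study may
    have used a different estimator.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is expected to lie between 0 and 1")
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    return sd * math.sqrt(1.0 - icc)


# Hypothetical centroid X-coordinates in mm, with an ICC of 0.99.
x_coordinates_mm = [10, 12, 14, 16, 18]
sem_mm = standard_error_of_measurement(x_coordinates_mm, 0.99)
```

Under this relation the SEM shrinks toward zero as the ICC approaches 1, which is consistent with very high ICCs being reported together with very small SEM values.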


Reliability and validity of the iSense optical scanner for measuring volume of transtibial residual limb models

Prosthetics and Orthotics International ◽

10.1177/0309364618806038 ◽

2018 ◽

Vol 43 (2) ◽

pp. 213-220 ◽

Cited By ~ 1

Author(s):

Lucy Armitage ◽

Li Khim Kwah ◽

Lauren Kark

Keyword(s):

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Criterion Validity ◽

Residual Limb ◽

Limb Volume ◽

Rater Reliability ◽

Measuring Volume ◽

Intraclass Correlation Coefficients ◽

Optical Scanner

Background: Residual limb volume is often measured as part of routine care for people with amputations. These measurements assist in the timing of prosthetic fitting or replacement. In order to make well-informed decisions, clinicians need access to measurement tools that are valid and reliable. Objectives: To assess the reliability and criterion validity of the iSense optical scanner in measuring volume of transtibial residual limb models. Study Design: Three assessors performed two measurements each on 13 residual limb models with an iSense optical scanner (3D Systems, USA). Intra-rater and inter-rater reliability were calculated using intraclass correlation coefficients. Bland-Altman plots were inspected for agreement. Criterion validity was assessed using a steel rod of known dimensions. Ten repeated measurements were performed by one assessor. A t-test was used to determine differences between measured and true rod volume. Results: Intra-rater reliability was excellent (range of intraclass correlation coefficients: 0.991–0.997, all with narrow 95% confidence intervals). While the intraclass correlation coefficients suggest excellent inter-rater reliability between all three assessors (range of intraclass correlation coefficients: 0.952–0.986), the 95% confidence intervals were wide between assessor 3 and the other two assessors. Poor agreement with assessor 3 was also seen in the Bland-Altman plots. Criterion validity was very poor with a significant difference between the mean iSense measurement and the true rod volume (difference: 221.18 mL; p < 0.001). Conclusions: Although intra-rater reliability was excellent for the iSense scanner, we did not find similar results for inter-rater reliability and validity. These results suggest that further testing of the iSense scanner is required prior to use in clinical practice. Clinical relevance: The iSense offers a low cost scanning option for residual limb volume measurement. Intra-rater reliability was excellent, but inter-rater reliability and validity were such that clinical adoption is not indicated at present.
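The study inspected Bland-Altman plots for agreement between assessors. A minimal sketch of the numbers behind such a plot, assuming the conventional 95% limits of agreement (mean difference ± 1.96 × SD of the differences): the function name and the volumes below are hypothetical and do not come from the study.

```python
import statistics


def bland_altman_limits(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(measure_a) != len(measure_b) or len(measure_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = statistics.mean(diffs)      # systematic difference between assessors
    sd = statistics.stdev(diffs)       # spread of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical limb-model volumes in mL from two assessors.
assessor_1 = [100, 102, 98, 101]
assessor_2 = [99, 100, 99, 100]
bias, lower, upper = bland_altman_limits(assessor_1, assessor_2)
```

This illustrates why a high ICC and poor agreement can coexist: the ICC is driven largely by between-object variance, whereas the limits of agreement describe the size of the disagreement in measurement units.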


Inter-Rater Reliability: Intraclass Correlation Coefficients

Educational and Psychological Measurement ◽

10.1177/001316448104100127 ◽

1981 ◽

Vol 41 (1) ◽

pp. 223-226 ◽

Cited By ~ 6

Author(s):

Dong Won Cho

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Rater Reliability ◽

Intraclass Correlation Coefficients


Reliability of Autism-Tics, AD/HD, and other Comorbidities (A–TAC) Inventory in a Test-Retest Design

Psychological Reports ◽

10.2466/03.15.pr0.114k10w1 ◽

2014 ◽

Vol 114 (1) ◽

pp. 93-103 ◽

Cited By ~ 15

Author(s):

Tomas Larson ◽

Eva Norén Selinus ◽

Clara Hellner Gumpert ◽

Thomas Nilsson ◽

Nóra Kerekes ◽

...

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Population Based ◽

Autism Spectrum ◽

Good Test ◽

Rater Reliability ◽

Retest Reliability ◽

Intraclass Correlation Coefficients ◽

Intraclass Correlations ◽

Test Retest Reliability

The Autism-Tics, AD/HD, and other Comorbidities (A–TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A–TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A–TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's κ. A–TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A–TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's κ indicated acceptable reliability. The current study provides statistical evidence that the A–TAC yields good test-retest reliability in a population-based cohort of children.


Assessment of Reliability and Validity of the Cochlear Implant Skills Review: A New Measure to Evaluate Cochlear Implant Users' Device Skills and Knowledge

American Journal of Audiology ◽

10.1044/2020_aja-20-00038 ◽

2021 ◽

pp. 1-23

Author(s):

Kara Vasil ◽

Jessica Lewis ◽

Christin Ray ◽

Jodi Baxter ◽

Claire Bernstein ◽

...

Keyword(s):

Construct Validity ◽

Cochlear Implant ◽

Interrater Reliability ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Intraclass Correlation Coefficients ◽

Assessment Measure ◽

Valid Assessment ◽

Device Use

Purpose: The Cochlear Implant Skills Review (CISR) was developed as a measure of cochlear implant (CI) users' skills and knowledge regarding device use. This study aimed to determine intra- and interrater reliability and agreement and establish construct validity for the CISR. Method: In this study, the CISR was developed and administered to a cohort of 30 adult CI users. Participants included new CI users with less than 1 year of CI experience and experienced CI users with greater than 1 year of CI experience. The CISR administration required participants to demonstrate skills using the various features of their CI processors. Intra- and interrater reliability were assessed using intraclass correlation coefficients, agreement was assessed using Cohen's kappa, and construct validity was assessed by relating CISR performance to duration of CI use. Results: Overall reliability for the entire instrument was 92.7%. Inter- and intrarater agreement were generally substantial or higher. Duration of CI use was a significant predictor of CISR performance. Conclusions: The CISR is a reliable and valid assessment measure of device skills and knowledge for adult CI users. Clinicians can use this tool to evaluate areas of needed instruction and counseling and to assess users' skills over time.


Reliability and Validity of Trunk Assessment for People With Multiple Sclerosis

Physical Therapy ◽

10.1093/ptj/86.1.66 ◽

2006 ◽

Vol 86 (1) ◽

pp. 66-76 ◽

Cited By ~ 35

Author(s):

Geert Verheyden ◽

Godelieve Nuyens ◽

Alice Nieuwboer ◽

Pol Van Asch ◽

Piet Ketelaer ◽

...

Keyword(s):

Multiple Sclerosis ◽

Construct Validity ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Altman Analysis ◽

Bland Altman Analysis ◽

Intraclass Correlation Coefficients ◽

Test Retest Reliability

Abstract Background and Purpose. Standardized scales are a prerequisite for rehabilitation and research. This study was designed to determine the reliability and validity of scores on items of the trunk assessment of the Melsbroek Disability Scoring Test (MDST) and Trunk Impairment Scale (TIS) in people with multiple sclerosis (MS). Subjects. Thirty people with MS participated in the study. Methods. Interrater and test-retest reliability and construct validity were assessed. Results. Kappa and weighted kappa values for the items of the trunk assessment of the MDST ranged from .74 to .95, and the kappa and weighted kappa values for the TIS items ranged from .46 to 1.00. Intraclass correlation coefficients for interrater and test-retest agreement were .93 and .92, respectively, for the trunk assessment of the MDST and .97 and .95, respectively, for the TIS. Bland-Altman analysis showed consistency of scores without observer bias. Construct validity was established. Discussion and Conclusion. The MDST and TIS provide reliable assessments of the trunk and are valid scales for measuring trunk performance in people with MS. [Verheyden G, Nuyens G, Nieuwboer A, et al. Reliability and validity of trunk assessment for people with multiple sclerosis.


Reliability of assessment of medical students’ non-technical skills using a behavioural marker system: does clinical experience matter?

BMJ Simulation and Technology Enhanced Learning ◽

10.1136/bmjstel-2020-000705 ◽

2020 ◽

pp. bmjstel-2020-000705

Author(s):

Benjamin Clarke ◽

Samantha E Smith ◽

Emma Claire Phillips ◽

Ailsa Hamilton ◽

Joanne Kerins ◽

...

Keyword(s):

Medical Students ◽

Clinical Experience ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Technical Skills ◽

Rater Reliability ◽

Single Measure ◽

Marker System ◽

Intraclass Correlation Coefficients ◽

Reliability Coefficients

Introduction: Non-technical skills are recognised to play an integral part in safe and effective patient care. Medi-StuNTS (Medical Students’ Non-Technical Skills) is a behavioural marker system developed to enable assessment of medical students’ non-technical skills. This study aimed to assess whether newly trained raters with high levels of clinical experience could achieve reliability coefficients of >0.7 and to compare differences in inter-rater reliability of raters with varying clinical experience. Methods: Forty-four raters attended a workshop on Medi-StuNTS before independently rating three videos of medical students participating in immersive simulation scenarios. Data were grouped by raters’ levels of clinical experience. Inter-rater reliability was assessed by calculating intraclass correlation coefficients (ICC). Results: Eleven raters with more than 10 years of clinical experience achieved single-measure ICC of 0.37 and average-measures ICC of 0.87. Fourteen raters with more than or equal to 5 years and less than 10 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC of 0.59. Nineteen raters with less than 5 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC 0.65. Conclusions: Using 11 newly trained raters with high levels of clinical experience produced highly reliable ratings that surpassed the prespecified inter-rater reliability standard; however, a single rater from this group would not achieve sufficiently reliable ratings. This is consistent with previous studies using other medical behavioural marker systems. This study demonstrated a decrease in inter-rater reliability of raters with lower levels of clinical experience, suggesting caution when using this population as raters for assessment of non-technical skills.
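The gap between the single-measure and average-measures ICCs in this abstract follows from the Spearman-Brown relation, which links the reliability of one rater to that of the mean of k raters. The sketch below is an illustration under that assumption; the function name is hypothetical, and the inputs are the rounded values quoted in the abstract, so small rounding differences from the reported figures are expected.

```python
def average_measures_icc(single_measure_icc, n_raters):
    """Spearman-Brown step-up: reliability of the mean of n_raters ratings."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_measure_icc
    return n_raters * r / (1.0 + (n_raters - 1) * r)


# Rounded single-measure ICCs and group sizes as quoted in the abstract.
most_experienced = average_measures_icc(0.37, 11)
least_experienced = average_measures_icc(0.09, 19)
```

With these rounded inputs the formula gives about 0.87 for the 11 most experienced raters and about 0.65 for the 19 least experienced, in line with the reported average-measures values. It also makes the abstract's caveat concrete: pooling many raters can yield a high average-measures ICC even when any single rater is unreliable.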


Vein Measurement by Peripherally Inserted Central Catheter Nurses Using Ultrasound: A Reliability Study

Journal of the Association for Vascular Access ◽

10.1016/j.java.2013.08.001 ◽

2013 ◽

Vol 18 (4) ◽

pp. 234-238 ◽

Cited By ~ 8

Author(s):

Rebecca Sharp ◽

Andrea Gordon ◽

Antonina Mikocka-Walus ◽

Jessie Childs ◽

Carol Grech ◽

...

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Cephalic Vein ◽

Basilic Vein ◽

Rater Reliability ◽

Vein Thrombosis ◽

Intraclass Correlation Coefficients ◽

Measurement Protocol ◽

Brachial Vein ◽

Deep Vein

Abstract Background: Peripherally inserted central catheters (PICCs) are increasingly inserted by trained registered nurses, necessitating the development of specialized skills such as the use of ultrasound. The selection of an adequately sized vein is an important factor in reducing adverse events such as deep vein thrombosis. However, PICC nurses may receive minimal training in the use of ultrasound for vein measurement. Objective: We aimed to demonstrate the reliability of a vein measurement protocol using ultrasound by a PICC nurse trained in sonography. Methods: The diameters of the basilic, brachial, and cephalic veins in the left arms of healthy participants (n = 12) were measured using ultrasound by a PICC nurse and a sonographer. The PICC nurse performed the measurement twice and the sonographer once; the PICC nurse's results were compared for intra-rater reliability and compared with the sonographer's for inter-rater reliability. The results were analyzed using intraclass correlation coefficients (ICCs). Results: Inter-rater reliability between the PICC nurse and the sonographer was adequate: the ICC for the brachial vein was 0.60 (95% confidence interval [CI], 0.06–0.87), basilic vein ICC was 0.87 (95% CI, 0.58–0.96) and cephalic vein ICC was 0.77 (95% CI, 0.39–0.93). Intra-rater reliability of the PICC nurse was higher; the ICC for the brachial vein was 0.80 (95% CI, 0.44–0.94), basilic vein ICC was 0.92 (95% CI, 0.67–0.98), and cephalic vein ICC was 0.78 (95% CI, 0.40–0.93). Conclusions: Using a suitable protocol, a PICC nurse was able to measure vein diameter reliably when compared with a sonographer and consistently replicate these results.
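The abstract does not state which ICC form was used. As one possibility, the sketch below computes ICC(2,1), the two-way random-effects, absolute-agreement, single-rater form, from the mean squares of a subjects-by-raters table. The function name and the ratings are illustrative assumptions, not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: six subjects scored by four raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because this form penalises systematic differences between raters (the `ms_cols` term), it tends to be lower than consistency-type ICCs when one rater measures consistently larger or smaller than another.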


Reliability and Validity of the HIV Disability Questionnaire (HDQ) with Adults Living with HIV in the United States

Journal of the International Association of Providers of AIDS Care (JIAPAC) ◽

10.1177/2325958219888461 ◽

2019 ◽

Vol 18 ◽

pp. 232595821988846 ◽

Cited By ~ 1

Author(s):

Kelly Kathleen O’Brien ◽

David Kietrys ◽

Mary Lou Galantino ◽

James Scott Parrott ◽

Tracy Davis ◽

...

Keyword(s):

United States ◽

Construct Validity ◽

Social Inclusion ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Internal Consistency Reliability ◽

The United States ◽

Intraclass Correlation Coefficients ◽

Disability Questionnaire ◽

Test Retest Reliability

Objectives: To assess measurement properties of the HIV Disability Questionnaire (HDQ) among adults with HIV in the United States. Methods: We administered the HDQ, World Health Organization Disability Assessment Schedule II (WHODAS 2.0), and a demographic questionnaire. For internal consistency reliability, we calculated Cronbach α and Kuder-Richardson-20 (KR-20) statistics for disability and episodic scores, respectively (≥0.80 acceptable). For test–retest reliability, we calculated intraclass correlation coefficients (>0.8 acceptable). For construct validity, we tested 15 a priori hypotheses assessing correlations between HDQ and WHODAS 2.0 scores. Results: Of the 128 participants, the majority were males (68%), median age 51 years, taking antiretroviral therapy (96%). Cronbach α ranged from 0.88 (social inclusion) to 0.93 (uncertainty). The KR-20 ranged from 0.86 (cognitive) to 0.96 (uncertainty). Intraclass correlation coefficients ranged from 0.88 (physical, cognitive, social inclusion) to 0.92 (mental–emotional). Of the 15 hypotheses, 13 (87%) were confirmed. Conclusions: The HDQ demonstrates internal consistency reliability, test–retest reliability, and construct validity when administered to a sample of adults with HIV in the United States.
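Internal consistency in this study was summarised with Cronbach's α. A minimal sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), is shown below; the function name and the responses are hypothetical and do not represent HDQ data.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha from rows of respondents and columns of items."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need a complete table with at least two items")
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents answering two items.
responses = [[2, 3], [4, 4], [3, 5], [5, 6]]
alpha = cronbach_alpha(responses)
```

For dichotomous items scored 0/1, the same computation corresponds to the Kuder-Richardson-20 statistic up to the choice of variance denominator.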


Intra and Inter-rater Reliability between Ultrasound Imaging and Caliper Measures to determine Spring Ligament Dimensions in Cadavers

Scientific Reports ◽

10.1038/s41598-019-51384-6 ◽

2019 ◽

Vol 9 (1) ◽

Author(s):

Fernando Santiago-Nuño ◽

Patricia Palomo-López ◽

Ricardo Becerro-de-Bengoa-Vallejo ◽

César Calvo-Lobo ◽

Marta Elena Losa-Iglesias ◽

...

Keyword(s):

Ultrasound Imaging ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Absolute Accuracy ◽

Strong Correlations ◽

Perfect Agreement ◽

Rater Reliability ◽

Intraclass Correlation Coefficients ◽

Spring Ligament ◽

Good Repeatability

Abstract The purpose was to evaluate intra and inter-rater reliability, repeatability and absolute accuracy between ultrasound imaging (US) and caliper measures to determine Spring ligament (SL) dimensions in cadavers. SLs were identified from 62 human feet from formaldehyde-embalmed cadavers. Intra and inter-observer reliability, repeatability and absolute accuracy of SL width, thickness and length between US and caliper measurements were determined at intra and inter-session by intraclass correlation coefficients, Pearson's correlation coefficients, Student t tests, standard errors of measurement, minimum detectable changes, values of normality, 95% limits of agreement, and Bland-Altman plots. Excellent inter-session and inter-rater reliability, adequate absolute accuracy, almost perfect agreement and strong correlations were shown for caliper, US and their comparison for all SL dimensions. US measurements presented higher absolute accuracy than caliper measures for SL length and thickness dimensions, while caliper displayed greater absolute accuracy for SL width dimensions. Good repeatability (P > 0.05) was shown for all SL dimensions by US, caliper and their comparison, except for SL width dimension measured with US (P = 0.019). Both US and caliper could be recommended for all SL dimensions evaluation due to their excellent reliability and absolute accuracy in cadavers, although width dimensions should be considered with caution due to US repeatability differences.
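Among the statistics listed is the minimum detectable change (MDC). The abstract does not give the formula used; a common convention is MDC95 = 1.96 × sqrt(2) × SEM, and the sketch below assumes it. The function name and the SEM value are hypothetical.

```python
import math


def minimum_detectable_change_95(sem):
    """MDC at the 95% level, assuming MDC95 = 1.96 * sqrt(2) * SEM."""
    if sem < 0:
        raise ValueError("sem must be non-negative")
    # sqrt(2) accounts for measurement error in both of the two measurements
    # being compared.
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical SEM of 1.0 mm for a ligament dimension.
mdc_mm = minimum_detectable_change_95(1.0)
```

Under this convention, a change between two sessions smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.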
