Appraisal of a scoring instrument for training and testing neonatal intubation skills

2018 ◽  
Vol 104 (5) ◽  
pp. F521-F527 ◽  
Author(s):  
Romy N Bouwmeester ◽  
Mathijs Binkhorst ◽  
Nicole K Yamada ◽  
Rosa Geurtzen ◽  
Arno F J van Heijst ◽  
...  

Objective: To determine the validity, reliability, feasibility and applicability of a neonatal intubation scoring instrument. Design: Prospective observational study. Setting: Simulation-based research and training centre (Center for Advanced Pediatric and Perinatal Education), California, USA. Subjects: Forty clinicians qualified for neonatal intubation. Interventions: Videotaped elective intubations on a neonatal patient simulator were scored by two independent raters. One rater scored the intubations twice. We scored the preparation of equipment and premedication, intubation performance, tube position/fixation, communication, number of attempts, duration and successfulness of the procedure. Main outcome measures: Intraclass correlation coefficients (ICC) were calculated for intrarater and inter-rater reliability. Kappa coefficients for individual items and mean kappa coefficients for all items combined were calculated. Construct validity was assessed with one-way analysis of variance using the hypothesis that experienced clinicians score higher than less experienced clinicians. The approximate time to score one intubation and the instrument’s applicability in another setting were evaluated. Results: ICCs for intrarater and inter-rater reliability were 0.99 (95% CI 0.98 to 0.99) and 0.89 (95% CI 0.35 to 0.96), and mean kappa coefficients were 0.93 (95% CI 0.85 to 1.01) and 0.71 (95% CI 0.56 to 0.92), respectively. There were no differences between the more and less experienced clinicians regarding preparation, performance, communication and total scores. The experienced group scored higher only on tube position/fixation (p=0.02). Scoring one intubation took approximately 15 min. Our instrument, developed in The Netherlands, could be readily applied in the USA. Conclusions: Our scoring instrument for simulated neonatal intubations appears to be reliable, feasible and applicable in another centre. Construct validity could not be established.
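For readers unfamiliar with the item-level agreement statistic mentioned above, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The abstract does not say whether weighted or unweighted kappa was used, so the unweighted form is an assumption, and the function name `cohens_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed proportion of items on which both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical item scores (1 = step performed, 0 = step omitted).
print(cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be much lower than the percentage of matching scores.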

2018 ◽  
Vol 63 (4) ◽  
pp. 453-460 ◽  
Author(s):  
Vahid Abdollah ◽  
Eric C. Parent ◽  
Michele C. Battié

Abstract Degenerated discs have shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted-centroid reflects the water distribution within a region-of-interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted-centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted-centroid for all regions demonstrated excellent intraclass-correlation-coefficients for intra- (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted-centroid for all ROIs was 0 at both levels and ranged from 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass-correlation-coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass-correlation-coefficient from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted-centroid is a reliable biomarker for investigating the effects of disc interventions.
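A signal-intensity-weighted centroid can be illustrated with a short sketch. The abstract does not describe the authors' software, so the following is only an assumed, generic formulation: each pixel inside the ROI contributes its coordinates weighted by its intensity. The function name `weighted_centroid`, the pixel-index coordinate system and the toy image are all hypothetical.

```python
def weighted_centroid(image, mask):
    """Return the (x, y) intensity-weighted centroid of the masked pixels.

    image: list of rows of non-negative intensities.
    mask:  list of rows of booleans, True for pixels inside the ROI.
    Coordinates are pixel indices (column = x, row = y).
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, (row, mask_row) in enumerate(zip(image, mask)):
        for x, (value, inside) in enumerate(zip(row, mask_row)):
            if inside:
                total += value
                sum_x += value * x
                sum_y += value * y
    if total == 0:
        raise ValueError("ROI has zero total intensity; centroid undefined")
    return sum_x / total, sum_y / total


# Toy 2x2 "image": the brighter right-hand pixel pulls the centroid towards x = 1.
image = [[1, 3], [0, 0]]
mask = [[True, True], [True, True]]
print(weighted_centroid(image, mask))
```

Because brighter pixels pull the centroid towards them, a shift in its location indicates a change in how signal is distributed within the ROI, not just a change in mean signal.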


2018 ◽  
Vol 43 (2) ◽  
pp. 213-220 ◽  
Author(s):  
Lucy Armitage ◽  
Li Khim Kwah ◽  
Lauren Kark

Background: Residual limb volume is often measured as part of routine care for people with amputations. These measurements assist in the timing of prosthetic fitting or replacement. In order to make well-informed decisions, clinicians need access to measurement tools that are valid and reliable. Objectives: To assess the reliability and criterion validity of the iSense optical scanner in measuring volume of transtibial residual limb models. Study Design: Three assessors performed two measurements each on 13 residual limb models with an iSense optical scanner (3D Systems, USA). Intra-rater and inter-rater reliability were calculated using intraclass correlation coefficients. Bland-Altman plots were inspected for agreement. Criterion validity was assessed using a steel rod of known dimensions. Ten repeated measurements were performed by one assessor. A t-test was used to determine differences between measured and true rod volume. Results: Intra-rater reliability was excellent (range of intraclass correlation coefficients: 0.991–0.997, all with narrow 95% confidence intervals). While the intraclass correlation coefficients suggest excellent inter-rater reliability between all three assessors (range of intraclass correlation coefficients: 0.952–0.986), the 95% confidence intervals were wide between assessor 3 and the other two assessors. Poor agreement with assessor 3 was also seen in the Bland-Altman plots. Criterion validity was very poor with a significant difference between the mean iSense measurement and the true rod volume (difference: 221.18 mL; p < 0.001). Conclusions: Although intra-rater reliability was excellent for the iSense scanner, we did not find similar results for inter-rater reliability and validity. These results suggest that further testing of the iSense scanner is required prior to use in clinical practice. Clinical relevance: The iSense offers a low-cost scanning option for residual limb volume measurement. Intra-rater reliability was excellent, but inter-rater reliability and validity were such that clinical adoption is not indicated at present.
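The Bland-Altman agreement analysis mentioned in this abstract can be sketched as follows. This is a generic illustration and not the authors' analysis: it assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the paired differences), and the function name `bland_altman` and the example volumes are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)   # systematic difference between the two series
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical volumes (mL) of three models measured by two assessors.
assessor_1 = [10.0, 12.0, 14.0]
assessor_2 = [9.0, 12.0, 13.0]
bias, lower, upper = bland_altman(assessor_1, assessor_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Unlike a correlation-type coefficient, the limits of agreement are expressed in the measurement's own units. This is one reason a high ICC and poor Bland-Altman agreement can coexist, as reported for assessor 3.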


2014 ◽  
Vol 114 (1) ◽  
pp. 93-103 ◽  
Author(s):  
Tomas Larson ◽  
Eva Norén Selinus ◽  
Clara Hellner Gumpert ◽  
Thomas Nilsson ◽  
Nóra Kerekes ◽  
...  

The Autism-Tics, AD/HD, and other Comorbidities (A–TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A–TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A–TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's κ. A–TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A–TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's κ indicated acceptable reliability. The current study provides statistical evidence that the A–TAC yields good test-retest reliability in a population-based cohort of children.


2021 ◽  
pp. 1-23
Author(s):  
Kara Vasil ◽  
Jessica Lewis ◽  
Christin Ray ◽  
Jodi Baxter ◽  
Claire Bernstein ◽  
...  

Purpose: The Cochlear Implant Skills Review (CISR) was developed as a measure of cochlear implant (CI) users' skills and knowledge regarding device use. This study aimed to determine intra- and interrater reliability and agreement and establish construct validity for the CISR. Method: In this study, the CISR was developed and administered to a cohort of 30 adult CI users. Participants included new CI users with less than 1 year of CI experience and experienced CI users with greater than 1 year of CI experience. The CISR administration required participants to demonstrate skills using the various features of their CI processors. Intra- and interrater reliability were assessed using intraclass correlation coefficients, agreement was assessed using Cohen's kappa, and construct validity was assessed by relating CISR performance to duration of CI use. Results: Overall reliability for the entire instrument was 92.7%. Inter- and intrarater agreement were generally substantial or higher. Duration of CI use was a significant predictor of CISR performance. Conclusions: The CISR is a reliable and valid assessment measure of device skills and knowledge for adult CI users. Clinicians can use this tool to evaluate areas of needed instruction and counseling and to assess users' skills over time.


2006 ◽  
Vol 86 (1) ◽  
pp. 66-76 ◽  
Author(s):  
Geert Verheyden ◽  
Godelieve Nuyens ◽  
Alice Nieuwboer ◽  
Pol Van Asch ◽  
Piet Ketelaer ◽  
...  

Abstract Background and Purpose. Standardized scales are a prerequisite for rehabilitation and research. This study was designed to determine the reliability and validity of scores on items of the trunk assessment of the Melsbroek Disability Scoring Test (MDST) and Trunk Impairment Scale (TIS) in people with multiple sclerosis (MS). Subjects. Thirty people with MS participated in the study. Methods. Interrater and test-retest reliability and construct validity were assessed. Results. Kappa and weighted kappa values for the items of the trunk assessment of the MDST ranged from .74 to .95, and the kappa and weighted kappa values for the TIS items ranged from .46 to 1.00. Intraclass correlation coefficients for interrater and test-retest agreement were .93 and .92, respectively, for the trunk assessment of the MDST and .97 and .95, respectively, for the TIS. Bland-Altman analysis showed consistency of scores without observer bias. Construct validity was established. Discussion and Conclusion. The MDST and TIS provide reliable assessments of the trunk and are valid scales for measuring trunk performance in people with MS. [Verheyden G, Nuyens G, Nieuwboer A, et al. Reliability and validity of trunk assessment for people with multiple sclerosis.


2020 ◽  
pp. bmjstel-2020-000705
Author(s):  
Benjamin Clarke ◽  
Samantha E Smith ◽  
Emma Claire Phillips ◽  
Ailsa Hamilton ◽  
Joanne Kerins ◽  
...  

Introduction: Non-technical skills are recognised to play an integral part in safe and effective patient care. Medi-StuNTS (Medical Students’ Non-Technical Skills) is a behavioural marker system developed to enable assessment of medical students’ non-technical skills. This study aimed to assess whether newly trained raters with high levels of clinical experience could achieve reliability coefficients of >0.7 and to compare differences in inter-rater reliability of raters with varying clinical experience. Methods: Forty-four raters attended a workshop on Medi-StuNTS before independently rating three videos of medical students participating in immersive simulation scenarios. Data were grouped by raters’ levels of clinical experience. Inter-rater reliability was assessed by calculating intraclass correlation coefficients (ICC). Results: Eleven raters with more than 10 years of clinical experience achieved single-measure ICC of 0.37 and average-measures ICC of 0.87. Fourteen raters with more than or equal to 5 years and less than 10 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC of 0.59. Nineteen raters with less than 5 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC 0.65. Conclusions: Using 11 newly trained raters with high levels of clinical experience produced highly reliable ratings that surpassed the prespecified inter-rater reliability standard; however, a single rater from this group would not achieve sufficiently reliable ratings. This is consistent with previous studies using other medical behavioural marker systems. This study demonstrated a decrease in inter-rater reliability of raters with lower levels of clinical experience, suggesting caution when using this population as raters for assessment of non-technical skills.
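The gap between the single-measure and average-measures ICCs reported above can be illustrated with the Spearman-Brown relation, which links the reliability of one rater to the reliability of the mean of k raters under the same ICC model. The abstract does not state which ICC model was used, so applying this relation here is an assumption, and the function name `average_measures_icc` is hypothetical.

```python
def average_measures_icc(single_icc, n_raters):
    """Spearman-Brown step-up: reliability of the mean of n_raters ratings."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return n_raters * single_icc / (1 + (n_raters - 1) * single_icc)


# Single-measure ICCs and group sizes as reported in the abstract.
for label, icc, k in [(">10 years", 0.37, 11),
                      ("5 to <10 years", 0.09, 14),
                      ("<5 years", 0.09, 19)]:
    print(f"{label}: {average_measures_icc(icc, k):.2f}")
```

The computed values (roughly 0.87, 0.58 and 0.65) sit close to the reported average-measures ICCs of 0.87, 0.59 and 0.65. The small discrepancy is what one would expect from rounding of the single-measure values. The sketch also shows why a panel of raters can be reliable when any single rater from it is not.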


2013 ◽  
Vol 18 (4) ◽  
pp. 234-238 ◽  
Author(s):  
Rebecca Sharp ◽  
Andrea Gordon ◽  
Antonina Mikocka-Walus ◽  
Jessie Childs ◽  
Carol Grech ◽  
...  

Abstract Background: Peripherally inserted central catheters (PICCs) are increasingly inserted by trained registered nurses, necessitating the development of specialized skills such as the use of ultrasound. The selection of an adequately sized vein is an important factor in reducing adverse events such as deep vein thrombosis. However, PICC nurses may receive minimal training in the use of ultrasound for vein measurement. Objective: We aimed to demonstrate the reliability of a vein measurement protocol using ultrasound by a PICC nurse trained in sonography. Methods: The diameters of the basilic, brachial, and cephalic veins in the left arms of healthy participants (n = 12) were measured using ultrasound by a PICC nurse and a sonographer. The PICC nurse performed the measurement twice and the sonographer once; the PICC nurse's results were compared for intra-rater reliability and compared with the sonographer's for inter-rater reliability. The results were analyzed using intraclass correlation coefficients (ICCs). Results: Inter-rater reliability between the PICC nurse and the sonographer was adequate: the ICC for the brachial vein was 0.60 (95% confidence interval [CI], 0.06–0.87), basilic vein ICC was 0.87 (95% CI, 0.58–0.96), and cephalic vein ICC was 0.77 (95% CI, 0.39–0.93). Intra-rater reliability of the PICC nurse was higher; the ICC for the brachial vein was 0.80 (95% CI, 0.44–0.94), basilic vein ICC was 0.92 (95% CI, 0.67–0.98), and cephalic vein ICC was 0.78 (95% CI, 0.40–0.93). Conclusions: Using a suitable protocol, a PICC nurse was able to measure vein diameter reliably when compared with a sonographer and consistently replicate these results.
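As an illustration of how an ICC is obtained from a subjects-by-raters table, the sketch below computes ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater) from the two-way ANOVA mean squares. The abstract does not specify which ICC form the authors used, so this choice is an assumption, and the function name `icc_2_1` and the example data are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject and one column per rater.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table, >= 2 subjects, >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC undefined: the ratings have no variance")
    return (ms_rows - ms_error) / denominator


# Hypothetical diameters (mm): rater 2 reads consistently 1 mm higher than rater 1.
print(icc_2_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]))
```

In the example the two raters rank the subjects identically, yet the coefficient falls below 1 because an absolute-agreement ICC penalises the constant offset between raters.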


Author(s):  
Kelly Kathleen O’Brien ◽  
David Kietrys ◽  
Mary Lou Galantino ◽  
James Scott Parrott ◽  
Tracy Davis ◽  
...  

Objectives: To assess measurement properties of the HIV Disability Questionnaire (HDQ) among adults with HIV in the United States. Methods: We administered the HDQ, World Health Organization Disability Assessment Schedule II (WHODAS 2.0), and a demographic questionnaire. For internal consistency reliability, we calculated Cronbach α and Kuder-Richardson-20 (KR-20) statistics for disability and episodic scores, respectively (≥0.80 acceptable). For test–retest reliability, we calculated intraclass correlation coefficients (>0.8 acceptable). For construct validity, we tested 15 a priori hypotheses assessing correlations between HDQ and WHODAS 2.0 scores. Results: Of the 128 participants, the majority were males (68%), median age 51 years, taking antiretroviral therapy (96%). Cronbach α ranged from 0.88 (social inclusion) to 0.93 (uncertainty). The KR-20 ranged from 0.86 (cognitive) to 0.96 (uncertainty). Intraclass correlation coefficients ranged from 0.88 (physical, cognitive, social inclusion) to 0.92 (mental–emotional). Of the 15 hypotheses, 13 (87%) were confirmed. Conclusions: The HDQ demonstrates internal consistency reliability, test–retest reliability, and construct validity when administered to a sample of adults with HIV in the United States.
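Cronbach's α, used above for internal consistency, can be sketched from its standard definition: α = k/(k−1) × (1 − Σ item variances / variance of total scores). The code below is a generic illustration, not the authors' analysis; the function name `cronbach_alpha` and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: list of items, each a list with one score per respondent.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(item) != len(item_scores[0]) for item in item_scores):
        raise ValueError("every item needs one score per respondent")

    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha undefined: total scores have no variance")

    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses of three people to two items.
print(cronbach_alpha([[1, 2, 3], [2, 1, 3]]))
```

For dichotomous (0/1) items scored with population variances, the same expression reduces to the KR-20 statistic that the abstract reports for the episodic scores.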


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Fernando Santiago-Nuño ◽  
Patricia Palomo-López ◽  
Ricardo Becerro-de-Bengoa-Vallejo ◽  
César Calvo-Lobo ◽  
Marta Elena Losa-Iglesias ◽  
...  

Abstract The purpose was to evaluate intra- and inter-rater reliability, repeatability and absolute accuracy between ultrasound imaging (US) and caliper measures to determine spring ligament (SL) dimensions in cadavers. SLs were identified from 62 human feet from formaldehyde-embalmed cadavers. Intra- and inter-observer reliability, repeatability and absolute accuracy of SL width, thickness and length between US and caliper measurements were determined at intra- and inter-session by intraclass correlation coefficients, Pearson's correlation coefficients, Student t tests, standard errors of measurement, minimum detectable changes, values of normality, 95% limits of agreement, and Bland-Altman plots. Excellent inter-session and inter-rater reliability, adequate absolute accuracy, almost perfect agreement and strong correlations were shown for caliper, US and their comparison for all SL dimensions. US measurements presented higher absolute accuracy than caliper measures for SL length and thickness dimensions, while caliper displayed greater absolute accuracy for SL width dimensions. Good repeatability (P > 0.05) was shown for all SL dimensions by US, caliper and their comparison, except for SL width dimension measured with US (P = 0.019). Both US and caliper could be recommended for evaluating all SL dimensions due to their excellent reliability and absolute accuracy in cadavers, although width dimensions should be considered with caution due to US repeatability differences.
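The standard error of measurement (SEM) and minimum detectable change (MDC) listed among the statistics above are often derived from the ICC with the formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. The abstract does not state which formulas the authors applied, so the sketch below assumes these common ones. The function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the sample standard deviation and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change_95(sem):
    """Smallest change exceeding measurement error at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of 2.0 mm across specimens and an ICC of 0.75.
sem = standard_error_of_measurement(2.0, 0.75)
print(sem, minimum_detectable_change_95(sem))
```

Both quantities are in the units of the measurement itself. That makes them easier to interpret clinically than the ICC, which depends on how variable the sample is.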

