Sample size formulas for estimating intraclass correlation coefficients with precision and assurance

Background: There are challenges related to the accurate and efficient measurement of lymphedema in people with breast cancer. The LymphaTech 3D Imaging System (LymphaTech, Atlanta, GA, USA) is a mobile, noninvasive platform that provides limb geometry measurements. Objective: The objective of this study was to estimate the reliability and validity of the LymphaTech for measuring arm volume in the context of women seeking care in a specialty breast cancer rehabilitation clinic. Design: This was a cross-sectional reliability and convergent validity study. Methods: People who had stage I to IV breast cancer with lymphedema or were at risk for it were included. Arm volume was measured in 66 participants using the LymphaTech and perometer methods. Test-retest reliability for a single measure, limb volume difference, and agreement between methods was analyzed for 30 participants. A method-comparison analysis was also used to assess convergent validity between methods. Results: Both LymphaTech and perometer methods displayed intraclass correlation coefficients (ICCs) of ≥0.99. The standard errors of measurement for the LymphaTech and length-matched perometer measurements were nearly identical. Similar intraclass correlation coefficients (0.97) and standard errors of measurement (38.0–40.7 mL) were obtained for the between-limb volume difference for both methods. The convergent validity analyses demonstrated no systematic difference between methods. Limitations: The sample size was not based on a formal sample size calculation. LymphaTech measurements included interrater variance, and perometer measurements contained intrarater variance. Conclusions: The LymphaTech had excellent test-retest reliability, and convergent validity was supported. This technology is efficient and portable and has a potential role in prospective surveillance and management of lymphedema in clinical, research, and home settings.

The impact of sample size and intraclass correlation coefficients (ICC) on the bias of parameter estimation in multilevel latent variable modeling: a Monte Carlo study

Jurnal Penelitian dan Evaluasi Pendidikan ◽

10.21831/pep.v21i1.12895 ◽

2017 ◽

Vol 21 (1) ◽

pp. 34-50

Author(s):

Muhammad Dwirifqi Kharisma Putra ◽

Jahja Umar ◽

Bahrul Hayat ◽

Agung Priyo Utomo

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Sample Size ◽

Latent Variable ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Latent Variable Modeling ◽

Intraclass Correlation Coefficients ◽

The Impact

A Monte Carlo study was conducted to investigate the effect of sample size and intraclass correlation coefficients (ICC) on the bias of parameter estimates in multilevel latent variable modeling. The design factors were ICC (0.05, 0.10, 0.15, 0.20, 0.25), number of groups in the between-level model (NG: 30, 50, 100, and 150), and cluster size (CS: 10, 20, and 50), with each model estimated using five estimators: ML, MLF, MLR, WLSMV, and BAYES. The factors were fully crossed into 300 conditions (4 NG × 3 CS × 5 ICC × 5 estimators). For each condition, replications with convergence problems were excluded until at least 1,000 replications had been generated, and the data were analyzed using Mplus 7.4. An absolute percent bias below 10% was considered acceptable. The degree of bias depended on sample size and ICC, and the WLSMV and BAYES estimators performed better than the ML-based estimators across the sample size and ICC conditions.

Keywords: multilevel latent variable modeling, intraclass correlation coefficients, Markov Chain Monte Carlo method
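The crossed design and the bias criterion described in this abstract can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the definition of absolute percent bias used here (mean estimate relative to the true parameter value) is an assumption, since the abstract does not give the formula.

```python
from itertools import product

# Design factors as listed in the abstract.
ICC_LEVELS = [0.05, 0.10, 0.15, 0.20, 0.25]
N_GROUPS = [30, 50, 100, 150]
CLUSTER_SIZES = [10, 20, 50]
ESTIMATORS = ["ML", "MLF", "MLR", "WLSMV", "BAYES"]


def design_conditions():
    """Return every (n_groups, cluster_size, icc, estimator) combination."""
    return list(product(N_GROUPS, CLUSTER_SIZES, ICC_LEVELS, ESTIMATORS))


def absolute_percent_bias(estimates, true_value):
    """Absolute percent bias of the mean estimate (assumed definition)."""
    mean_estimate = sum(estimates) / len(estimates)
    return abs(mean_estimate - true_value) / abs(true_value) * 100.0


def is_acceptable(estimates, true_value, threshold=10.0):
    """Apply the < 10% acceptance criterion stated in the abstract."""
    return absolute_percent_bias(estimates, true_value) < threshold


print(len(design_conditions()))  # 4 * 3 * 5 * 5 = 300
```

In a real simulation each condition would hold the estimates from its converged replications; here the lists passed to `absolute_percent_bias` are placeholders for those values.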

Interobserver Reliability Using the Phonetic Level Evaluation With Severely and Profoundly Hearing-Impaired Children

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3405.989 ◽

1991 ◽

Vol 34 (5) ◽

pp. 989-999 ◽

Cited By ~ 6

Author(s):

Stephanie Shaw ◽

Truman E. Coggins

Keyword(s):

Interrater Reliability ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Hearing Impaired ◽

Intraclass Correlation Coefficients ◽

Assessment Measure ◽

Impaired Children ◽

Speech Assessment ◽

Hearing Impaired Children

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.
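As an illustration of how an interrater intraclass correlation coefficient can be computed from a subjects-by-raters table, here is a minimal sketch. The abstract does not say which ICC form was used; the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is assumed here, and the function name is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per subject, each holding one score
    per rater (all rows the same length, no missing values).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because this form penalizes systematic differences between raters, it yields lower values than a consistency-type ICC when one rater scores consistently higher than another.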

Is there a relationship between the overhead press and split jerk maximum performance? Influence of sex

International Journal of Sports Science & Coaching ◽

10.1177/17479541211020452 ◽

2021 ◽

pp. 174795412110204

Author(s):

Marcos A Soriano ◽

G Gregory Haff ◽

Paul Comfort ◽

Francisco J Amaro-Gahete ◽

Antonio Torres-González ◽

...

Keyword(s):

Confidence Intervals ◽

Body Mass ◽

Upper Limb ◽

High Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Training Experience ◽

Maximum Performance ◽

Repetition Maximum ◽

Intraclass Correlation Coefficients

The aims of this study were to (I) determine the differences and relationship between overhead press and split jerk performance in athletes involved in weightlifting training, and (II) explore the magnitude of these differences in one-repetition maximum (1RM) performance between sexes. Sixty-one men (age: 30.4 ± 6.7 years; height: 1.8 ± 0.5 m; body mass: 82.5 ± 8.5 kg; weightlifting training experience: 3.7 ± 3.5 years) and 21 women (age: 29.5 ± 5.2 years; height: 1.7 ± 0.5 m; body mass: 62.6 ± 5.7 kg; weightlifting training experience: 3.0 ± 1.5 years) participated. The 1RM performance of the overhead press and split jerk was assessed for all participants, with the overhead press assessed on two occasions to determine between-session reliability. The intraclass correlation coefficient (ICC) and 95% confidence interval showed high reliability for the overhead press (ICC = 0.98 [0.97–0.99]). A very strong correlation and significant differences were found between the overhead press and split jerk 1RM performances for all participants (r = 0.90 [0.85–0.93]; 60.2 ± 18.3 kg vs. 95.7 ± 29.3 kg; p ≤ 0.001). Men demonstrated stronger correlations between the overhead press and split jerk 1RM performances (r = 0.83 [0.73–0.90], p ≤ 0.001) compared with women (r = 0.56 [0.17–0.80], p = 0.008). These results provide evidence that 1RM performances of the overhead press and split jerk are highly related, highlighting the importance of upper-limb strength in split jerk maximum performance.

Patient-Reported Dysphagia in Adults with Eosinophilic Esophagitis: Translation and Validation of the Swedish Eosinophilic Esophagitis Activity Index

Dysphagia ◽

10.1007/s00455-021-10277-5 ◽

2021 ◽

Author(s):

Sofie Albinsson ◽

Lisa Tuomi ◽

Christine Wennerås ◽

Helen Larsson

Keyword(s):

Eosinophilic Esophagitis ◽

Activity Index ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Control Group ◽

Intraclass Correlation Coefficients ◽

Cronbach's Alpha ◽

Patient Reported ◽

EORTC QLQ

The lack of a Swedish patient-reported outcome instrument for eosinophilic esophagitis (EoE) has limited the assessment of the disease. The aims of the study were to translate the Eosinophilic Esophagitis Activity Index (EEsAI) into Swedish, to validate the translation, and to assess the symptom severity of patients with EoE compared to a nondysphagia control group. The EEsAI was translated and adapted to a Swedish cultural context (S-EEsAI) based on international guidelines. The S-EEsAI was validated using adult Swedish patients with EoE (n = 97) and an age- and sex-matched nondysphagia control group (n = 97). All participants completed the S-EEsAI, the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Oesophageal Module 18 (EORTC QLQ-OES18), and supplementary questions regarding feasibility and demographics. Reliability and validity of the S-EEsAI were evaluated by Cronbach’s alpha and Spearman correlation coefficients between the domains of the S-EEsAI and the EORTC QLQ-OES18. A test–retest analysis of 29 patients was evaluated through intraclass correlation coefficients. The S-EEsAI had sufficient reliability with Cronbach’s alpha values of 0.83 and 0.85 for the “visual dysphagia question” and the “avoidance, modification and slow eating score” domains, respectively. The test–retest reliability was sufficient, with good to excellent intraclass correlation coefficients (0.60–0.89). The S-EEsAI domains showed moderate correlation to 6/10 EORTC QLQ-OES18 domains, indicating adequate validity. The patient S-EEsAI results differed significantly from those of the nondysphagia controls (p < 0.001). The S-EEsAI appears to be a valid and reliable instrument for monitoring adult patients with EoE in Sweden.

Diagnosis of left ventricular hypertrophy using non-ECG-gated 15O-water PET

Journal of Nuclear Cardiology ◽

10.1007/s12350-021-02734-3 ◽

2021 ◽

Author(s):

Jens Sörensen ◽

Jonny Nordström ◽

Tomasz Baron ◽

Stellan Mörner ◽

Sven-Olof Granstam ◽

...

Keyword(s):

Method Development ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

ROC Curves ◽

Left Ventricular ◽

Intraclass Correlation Coefficients ◽

Concentric Hypertrophy ◽

2D Echocardiography ◽

Positron Emission ◽

Septal Wall Thickness

Aim: To develop a method for diagnosing left ventricular (LV) hypertrophy from cardiac perfusion 15O-water positron emission tomography (PET). Methods: We retrospectively pooled data from 139 subjects in four research cohorts. LV remodeling patterns ranged from normal to severe eccentric and concentric hypertrophy. 15O-water PET scans (n = 197) were performed with three different PET devices. A low-end scanner (66 scans) was used for method development, and the remaining scans, acquired with newer devices, were used for a blinded evaluation. Dynamic data were converted into parametric images of perfusable tissue fraction for semi-automatic delineation of the LV wall and calculation of LV mass (LVM) and septal wall thickness (WT). LVM and WT from PET were compared to cardiac magnetic resonance (CMR, n = 47) and WT to 2D-echocardiography (2DE, n = 36). PET accuracy was tested using linear regression, Bland–Altman plots, and ROC curves. Observer reproducibility was evaluated using intraclass correlation coefficients. Results: High correlations were found in the blinded analyses (r ≥ 0.87, P < 0.0001 for all). AUC for detecting increased LVM and WT (> 12 mm and > 15 mm) was ≥ 0.95 (P < 0.0001 for all). Reproducibility was excellent (ICC ≥ 0.93, P < 0.0001). Conclusion: 15O-water PET might detect LV hypertrophy with high accuracy and precision.

Intersession reliability of GPS-based and accelerometer-based physical variables in small-sided games with and without the offside rule

Proceedings of the Institution of Mechanical Engineers Part P Journal of Sports Engineering and Technology ◽

10.1177/1754337120987646 ◽

2021 ◽

pp. 175433712098764

Author(s):

Igor Junio de Oliveira Custódio ◽

Gibson Moreira Praça ◽

Leandro Vinhas de Paula ◽

Sarah da Glória Teles Bredt ◽

Fabio Yuzo Nakamura ◽

...

Keyword(s):

Root Mean Square ◽

Global Positioning System ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Mean Square ◽

Physical Demands ◽

Intraclass Correlation Coefficients ◽

Total Distance ◽

Global Positioning ◽

High Level

This study aimed to analyze the intersession reliability of global positioning system (GPS)-based distance variables and accelerometer-based acceleration variables in small-sided soccer games (SSG) with and without the offside rule, and to compare these variables between the two tasks. Twenty-four high-level U-17 soccer athletes played 3 versus 3 (plus goalkeepers) SSG in two formats (with and without the offside rule). SSG were performed over eight consecutive weeks (4 weeks for each group), twice a week. The physical demands were recorded using a GPS with an embedded triaxial accelerometer. GPS-based variables (total distance, average speed, and distances covered at different speeds) and accelerometer-based variables (Player Load™, root mean square of the acceleration recorded in each movement axis, and the root mean square of resultant acceleration) were calculated. Results showed that the inclusion of the offside rule reduced the total distance covered (large effect) and the distances covered at moderate speed zones (7–12.9 km/h, moderate effect; 13–17.9 km/h, large effect). In both SSG formats, GPS-based variables presented good to excellent reliability (intraclass correlation coefficients, ICC > 0.62) and accelerometer-based variables presented excellent reliability (ICC values > 0.89). Based on the results of this study, the offside rule decreases the physical demand of 3 versus 3 SSG, and the physical demands required in these SSG present high intersession reliability.

Assessment of reliability and validity of the 5-scale grading system of the point-of-care immunoassay for tear matrix metalloproteinase-9

Scientific Reports ◽

10.1038/s41598-021-92020-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Minjeong Kim ◽

Ja Young Oh ◽

Seon Ha Bae ◽

Seung Hyeun Lee ◽

Won Jun Lee ◽

...

Keyword(s):

Matrix Metalloproteinase ◽

Calibration Curve ◽

Point Of Care ◽

Interobserver Reliability ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Grading System ◽

Intraclass Correlation Coefficients ◽

The Difference

We evaluated the reliability and validity of the 5-scale grading system used to interpret the point-of-care immunoassay for tear matrix metalloproteinase (MMP)-9. Six observers graded the red bands in photographs of the readout window of the MMP-9 immunoassay kit (InflammaDry) twice, 2 weeks apart, using the 5-scale grading system (i.e. grade 0–4). Interobserver and intraobserver reliability were evaluated using intraclass correlation coefficients. The interobserver agreements were analyzed according to the severity of tear MMP-9 expression. To validate the system, a concentration calibration curve was made using MMP-9 solutions with reference concentrations, then the distribution of MMP-9 concentrations was analyzed according to the 5-scale grading system. Both intraobserver and interobserver reliability were excellent. The readout grades were significantly correlated with the quantified colorimetric densities. The interobserver variance of readout grades had no correlation with the severity of the measured densities. The band density continued to increase up to a maximal concentration (i.e. 5000 ng/mL) according to the calibration curve. The difference in grades reflected the change in MMP-9 concentrations sensitively, especially between grades 2 and 4. Together, our data indicate that the subjective 5-scale grading system in the point-of-care MMP-9 immunoassay is an easy and reliable method with acceptable accuracy.

Validity of Inertial Sensors for Assessing Balance Kinematics and Mobility during Treadmill-Based Perturbation and Dance Training

Sensors ◽

10.3390/s21093065 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3065

Author(s):

Ernest Kwesi Ofori ◽

Shuaijie Wang ◽

Tanvi Bhatt

Keyword(s):

Concurrent Validity ◽

Inertial Sensors ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Human Motion ◽

Intraclass Correlation Coefficients ◽

Dance Training ◽

Silver Standard ◽

Mean Square Errors ◽

Reactive Balance

Inertial sensors (IS) enable the kinematic analysis of human motion with fewer logistical limitations than the silver standard optoelectronic motion capture (MOCAP) system. However, there are no data on the validity of IS for perturbation training and during the performance of dance. The aim of the present study was to determine the concurrent validity of IS in the analysis of kinematic data during slip and trip-like perturbations and during the performance of dance. Seven IS and the MOCAP system were simultaneously used to capture the reactive response and dance movements of fifteen healthy young participants (age: 18–35 years). Bland–Altman (BA) plots, root mean square errors (RMSE), Pearson’s correlation coefficients (R), and intraclass correlation coefficients (ICC) were used to compare kinematic variables of interest between the two systems for absolute equivalency and accuracy. Limits of agreement (LOA) of the BA plots ranged from −0.23 to 0.56 and −0.21 to 0.43 for slip and trip stability variables, respectively. The RMSE for slip and trip stabilities ranged from 0.11 to 0.20 and 0.11 to 0.16, respectively. For the joint mobility in dance, LOA varied from −6.98 to 18.54, while RMSE ranged from 1.90 to 13.06. Comparison of IS and the optoelectronic MOCAP system for reactive balance and body segmental kinematics revealed that R varied from 0.59 to 0.81 and from 0.47 to 0.85, while ICC ranged from 0.50 to 0.72 and from 0.45 to 0.84, for slip–trip perturbations and dance, respectively. The results indicate moderate to high concurrent validity between the IS and MOCAP systems. These results were consistent with results from similar studies. This suggests that IS are valid tools to quantitatively analyze reactive balance and mobility kinematics during slip–trip perturbation and the performance of dance at any location outside, including the laboratory, clinical and home settings.
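Two of the agreement statistics named in this abstract, Bland–Altman limits of agreement and RMSE, can be sketched for paired measurements from two systems as follows. This is an illustrative sketch only: the function names are hypothetical, and the conventional 1.96 multiplier with the sample standard deviation of the differences is an assumption, since the abstract does not describe how the limits were computed.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_limits(a, b, z=1.96):
    """Return (lower limit, mean difference, upper limit) for paired data.

    The limits are mean difference +/- z * SD of the differences, using
    the sample standard deviation (n - 1 denominator).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias - z * sd, bias, bias + z * sd


def rmse(a, b):
    """Root mean square error between two paired measurement series."""
    return sqrt(mean((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical paired readings from two measurement systems.
system_a = [1.0, 2.0, 3.0, 4.0]
system_b = [1.0, 1.0, 4.0, 2.0]
lower, bias, upper = bland_altman_limits(system_a, system_b)
error = rmse(system_a, system_b)
```

The mean difference captures systematic offset between the systems, while RMSE combines offset and random scatter into one magnitude, which is why the two are usually reported together.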

Automatic 3D dense phenotyping provides reliable and accurate shape quantification of the human mandible

Scientific Reports ◽

10.1038/s41598-021-88095-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Pieter-Jan Verhelst ◽

H. Matthews ◽

L. Verstraete ◽

F. Van der Cruyssen ◽

D. Mulier ◽

...

Keyword(s):

Repeated Measures ◽

Intraclass Correlation ◽

Three Dimensional ◽

Correlation Coefficients ◽

Surface Registration ◽

Anatomical Landmarks ◽

Centroid Size ◽

Intraclass Correlation Coefficients ◽

Total Variability ◽

Automatic Phenotyping

Automatic craniomaxillofacial (CMF) three-dimensional (3D) dense phenotyping promises quantification of the complete CMF shape, in contrast to the limited use of sparse landmarks in classical phenotyping. This study assesses the accuracy and reliability of this new approach on the human mandible. Classic and automatic phenotyping techniques were applied to 30 unaltered and 20 operated human mandibles. Seven observers indicated 26 anatomical landmarks on each mandible three times. All mandibles were subjected to three rounds of automatic phenotyping using Meshmonk. The toolbox performed non-rigid surface registration of a template mandibular mesh consisting of 17,415 quasi landmarks on each target mandible, and the quasi landmarks corresponding to the 26 anatomical locations of interest were identified. Repeated-measures reliability was assessed using root mean square (RMS) distances of repeated landmark indications to their centroid. Automatic phenotyping showed very low RMS distances, confirming excellent repeated-measures reliability. The average Euclidean distance between manual and corresponding automatic landmarks was 1.40 mm for the unaltered and 1.76 mm for the operated sample. Centroid sizes from the automatic and manual shape configurations were highly similar, with intraclass correlation coefficients (ICC) of > 0.99. Reproducibility coefficients for centroid size were < 2 mm, accounting for < 1% of the total variability of the centroid size of the mandibles in this sample. ICCs for the multivariate set of 325 interlandmark distances were all > 0.90, again indicating high similarity between shapes quantified by classic or automatic phenotyping. Combined, these findings established high accuracy and repeated-measures reliability of the automatic approach. 3D dense CMF phenotyping of the human mandible using the Meshmonk toolbox introduces a novel improvement in quantifying CMF shape.
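The repeated-measures statistic mentioned in this abstract, the RMS distance of repeated landmark indications to their centroid, can be sketched as below. This is a minimal illustration under assumptions: the function name is hypothetical, and the abstract does not state how the distances were aggregated across landmarks or observers, so the sketch handles one landmark at a time.

```python
from math import sqrt


def rms_distance_to_centroid(points):
    """RMS Euclidean distance of repeated indications to their centroid.

    `points` holds the repeated placements of ONE landmark, each as a
    coordinate tuple such as (x, y, z).
    """
    n = len(points)
    dims = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dims)]
    squared = [
        sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in points
    ]
    return sqrt(sum(squared) / n)


# Hypothetical example: three placements of the same landmark, in mm.
placements = [(10.0, 5.0, 2.0), (10.4, 5.1, 2.2), (9.8, 4.9, 1.9)]
spread = rms_distance_to_centroid(placements)
```

A value of zero means every repetition landed on exactly the same point, which is what a fully deterministic registration would produce.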
