Reliability of the NIH toolbox cognitive battery in children and adolescents: a 3-year longitudinal examination

Background: The Cognitive Battery of the National Institutes of Health Toolbox (NIH-TB) is a collection of assessments that have been adapted and normed for administration across the lifespan and is increasingly used in large-scale population-level research. However, despite increasing adoption in longitudinal investigations of neurocognitive development, and growing recommendations that the Toolbox be used in clinical applications, little is known about the long-term temporal stability of the NIH-TB, particularly in youth. Methods: The present study examined the long-term temporal reliability of the NIH-TB in a large cohort of youth (9–15 years old) recruited across two data collection sites. Participants were invited to complete testing annually for 3 years. Results: Reliability was generally low to moderate, with intraclass correlation coefficients ranging between 0.31 and 0.76 for the full sample. There were multiple significant differences between sites, with one site generally exhibiting stronger temporal stability than the other. Conclusions: Reliability of the NIH-TB Cognitive Battery was lower than expected given early work examining shorter test-retest intervals. Moreover, very few tests met the stability requirements for use in research, and none exhibited adequate reliability for use in clinical applications. Reliability is paramount to establishing the validity of the tool; thus, the constructs assessed by the NIH-TB may vary over time in youth. We recommend further refinement of the NIH-TB Cognitive Battery and its norming procedures for children before further adoption as a neuropsychological assessment. We also urge researchers who have already employed the NIH-TB in their studies to interpret their results with caution.

Temporal Stability of Psychophysiological Stress Profiles: A Re-Analysis Using Intraclass Correlation Coefficients

Psychological Reports. DOI: 10.2466/pr0.1995.76.1.171. 1995, Vol 76 (1), pp. 171-175. Cited by ~5.

Author(s): John G. Arena, Stephen H. Hobbs

Keyword(s): Intraclass Correlation, Temporal Stability, Correlation Coefficients, Electromyographic Activity, Adaptation Period, Statistical Estimates, Intraclass Correlation Coefficients, Psychophysiological Response, Psychophysiological Stress, Simple Change

This is a re-analysis of data from a previous study which examined the temporal stability of three psychophysiological responses [frontal electromyographic activity (EMG), hand surface temperature, and heart rate]. Each response was recorded on 64 subjects over four sessions, each of which consisted of a 20-min. adaptation period, a baseline condition, and two stressors (one cognitive, the other physical). Rather than using Pearson product-moment correlations, as nearly all psychophysiological test-retest reliability studies have, we have now analyzed the data using intraclass correlation coefficients. This type of correlation allows one to incorporate more than two test-retest values on the same subjects. Analysis indicated that, with the exception of EMG during the physical stressor, the absolute values of the responses had quite significant reliability (.70 or greater). Treating the responses as relative measures (percent change from baseline or simple change scores from baseline) produced smaller and frequently less stable coefficients. It is concluded that statistical estimates of psychophysiological response reliability are functions of the design and particular reliability analysis employed.
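The point that an intraclass correlation can use more than two test-retest values per subject can be sketched as follows. This is a minimal illustration, not the analysis the authors ran: the abstract does not say which ICC form was used, so the two-way single-measure forms usually labelled ICC(2,1) (absolute agreement) and ICC(3,1) (consistency) in the Shrout and Fleiss convention are assumed here. The function name and the sample values are hypothetical.

```python
def icc_two_way(scores):
    """Return (ICC(2,1), ICC(3,1)) for a subjects-by-sessions table.

    scores: list of rows, one row per subject, each holding the same
    number k >= 2 of repeated measurements (sessions).
    Assumes the table is complete and not constant.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects and >= 2 sessions, no missing cells")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # ICC(3,1): consistency, session effects treated as fixed.
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    # ICC(2,1): absolute agreement, session effects count against reliability.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return agreement, consistency


if __name__ == "__main__":
    # Hypothetical heart-rate values for 5 subjects over 4 sessions.
    sessions = [
        [62, 64, 63, 65],
        [75, 78, 74, 77],
        [58, 57, 60, 59],
        [81, 85, 83, 84],
        [69, 70, 72, 68],
    ]
    agreement, consistency = icc_two_way(sessions)
    print(f"ICC(2,1) = {agreement:.2f}, ICC(3,1) = {consistency:.2f}")
```

A Pearson correlation would need one coefficient per pair of sessions, whereas this computation yields a single coefficient across all four sessions, which is the advantage the abstract describes.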

Temporal Stability of the Intrinsic Motivation Inventory

Perceptual and Motor Skills. DOI: 10.2466/pms.2003.97.1.271. 2003, Vol 97 (1), pp. 271-280. Cited by ~38.

Author(s): Nikolaos Tsigilis, Argiris Theodosiou

Keyword(s): Intrinsic Motivation, Undergraduate Students, Intraclass Correlation, Temporal Stability, Correlation Coefficients, Perceived Competence, Stable Measure, Intraclass Correlation Coefficients, Stable Yield, Intrinsic Motivation Inventory

To examine the temporal stability of the Intrinsic Motivation Inventory, a Greek version was administered to 144 undergraduate students after an endurance field test. The same procedure was repeated one week later. Factor analysis followed by varimax rotation showed that three factors (Perceived Competence, Interest/enjoyment, and Effort/importance) explained 65.26% of the total variance. Computed intraclass correlation coefficients (ICC) were .61 for the Perceived Competence subscale, .86 for Interest/enjoyment, .60 for Effort/importance, and .70 for the overall scale. The results changed, however, when the sample was divided into two groups: the first comprised individuals with small changes in perceived competence between the first and second measurements, and the second comprised individuals with large changes. Recalculated intraclass correlation coefficients for individuals whose Perceived Competence score remained relatively stable yielded a high value (.92), whereas those for individuals whose Perceived Competence changed yielded an extremely low value (.60). It was concluded that the Intrinsic Motivation Inventory provides a temporally stable measure, provided that perceived competence has not changed markedly.

Three-Year Reliability of MEG Visual and Somatosensory Responses

Cerebral Cortex. DOI: 10.1093/cercor/bhaa372. 2020.

Author(s): Marie C McCusker, Brandon J Lew, Tony W Wilson

Keyword(s): Intraclass Correlation, Correlation Coefficients, Gamma Activity, Evoked Responses, Good Reliability, Intraclass Correlation Coefficients, Baseline Activity, Somatosensory Responses, High Degree

A major goal of many translational neuroimaging studies is the identification of biomarkers of disease. However, a prerequisite for any such biomarker is robust reliability, which for magnetoencephalography (MEG) and many other imaging modalities has not been established. In this study, we examined the reliability of visual (Experiment 1) and somatosensory gating (Experiment 2) responses in 19 healthy adults who repeated these experiments for three visits spaced 18 months apart. Visual oscillatory and somatosensory oscillatory and evoked responses were imaged, and intraclass correlation coefficients (ICC) were computed to examine the long-term reliability of these responses. In Experiment 1, ICCs showed good reliability for visual theta and alpha responses in occipital cortices, but poor reliability for gamma responses. In Experiment 2, the time series of somatosensory gamma and evoked responses in the contralateral somatosensory cortex showed good reliability. Finally, analyses of spontaneous baseline activity indicated excellent reliability for occipital alpha, moderate reliability for occipital theta, and poor reliability for visual/somatosensory gamma activity. Overall, MEG responses to visual and somatosensory stimuli show a high degree of reliability across 3 years and therefore may be stable indicators of sensory processing long term and thereby of potential interest as biomarkers of disease.

Maternal diabetes and fetal cardiac output

Journal of Neonatal-Perinatal Medicine. DOI: 10.3233/npm-200552. 2021, pp. 1-6.

Author(s): S.L. Narasimhan, A. Eid, A. Bhatia, C. Davey, J. Steinberger

Keyword(s): Cardiac Output, Intraclass Correlation, Correlation Coefficients, Maternal Diabetes, Left Ventricular, Chi Square, Intraclass Correlation Coefficients, Cardiac Adaptation, Diabetic Mothers

BACKGROUND: The intrauterine environment is a key determinant of long-term health outcomes. Adverse fetal environments, such as maternal diabetes, obesity, and placental insufficiency, are strongly associated with long-term health risks in children. Little is known about differences in fetal cardiac output hemodynamics between fetuses of diabetic mothers (DM) and non-diabetic mothers (NDM). Our study aims to investigate the left-sided, right-sided, and combined cardiac output (CCO) in fetuses of DM vs. NDM. METHODS: Retrospective data were collected in fetuses of DM (N = 532) and NDM (N = 103) at mean gestational age 24 weeks. Examination included 2D echo and pulse wave Doppler. Wilcoxon rank sum tests and Chi-square tests were used to test for differences between DM and NDM in the distributions of continuous and categorical maternal and fetal measures, respectively. Intraclass correlation coefficients were calculated to assess intra-observer reliability of fetal cardiac measurements. RESULTS: DM had higher mean weight (89.7±22.2 kg) than NDM (76.8±19.8 kg), p < 0.0001, and higher mean BMI (33.4±7.5) than NDM (28.3±5.8), p < 0.0001. C-section delivery occurred in 66% of DM vs. 35% of NDM fetuses. Fetuses of DM had significantly larger semilunar valve diameter, higher left ventricular (LV) output, higher combined cardiac output, and lower right ventricle/left ventricle ratio compared to NDM. CONCLUSION: The greater CCO (adjusted for fetal weight) and left-sided cardiac output in fetuses of DM, compared to NDM, represent differences in cardiac adaptation to the diabetic environment.

Interobserver Reliability Using the Phonetic Level Evaluation With Severely and Profoundly Hearing-Impaired Children

Journal of Speech Language and Hearing Research. DOI: 10.1044/jshr.3405.989. 1991, Vol 34 (5), pp. 989-999. Cited by ~6.

Author(s): Stephanie Shaw, Truman E. Coggins

Keyword(s): Interrater Reliability, Interobserver Reliability, Intraclass Correlation, Correlation Coefficients, Hearing Impaired, Intraclass Correlation Coefficients, Assessment Measure, Impaired Children, Speech Assessment, Hearing Impaired Children

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.

Is there a relationship between the overhead press and split jerk maximum performance? Influence of sex

International Journal of Sports Science & Coaching. DOI: 10.1177/17479541211020452. 2021, pp. 174795412110204.

Author(s): Marcos A Soriano, G Gregory Haff, Paul Comfort, Francisco J Amaro-Gahete, Antonio Torres-González, ...

Keyword(s): Confidence Intervals, Body Mass, Upper Limb, High Reliability, Intraclass Correlation, Correlation Coefficients, Training Experience, Maximum Performance, Repetition Maximum, Intraclass Correlation Coefficients

The aims of this study were to (I) determine the differences and relationship between the overhead press and split jerk performance in athletes involved in weightlifting training, and (II) explore the magnitude of these differences in one-repetition maximum (1RM) performances between sexes. Sixty-one men (age: 30.4 ± 6.7 years; height: 1.8 ± 0.5 m; body mass: 82.5 ± 8.5 kg; weightlifting training experience: 3.7 ± 3.5 years) and 21 women (age: 29.5 ± 5.2 years; height: 1.7 ± 0.5 m; body mass: 62.6 ± 5.7 kg; weightlifting training experience: 3.0 ± 1.5 years) participated. The 1RM performances of the overhead press and split jerk were assessed for all participants, with the overhead press assessed on two occasions to determine between-session reliability. The intraclass correlation coefficient (ICC) and 95% confidence interval showed high reliability for the overhead press, ICC = 0.98 (0.97 – 0.99). A very strong correlation and significant differences were found between the overhead press and split jerk 1RM performances for all participants (r = 0.90 [0.85 – 0.93], 60.2 ± 18.3 kg, 95.7 ± 29.3 kg, p ≤ 0.001). Men demonstrated stronger correlations between the overhead press and split jerk 1RM performances (r = 0.83 [0.73-0.90], p ≤ 0.001) compared with women (r = 0.56 [0.17-0.80], p = 0.008). These results provide evidence that 1RM performances of the overhead press and split jerk are highly related, highlighting the importance of upper-limb strength in split jerk maximum performance.

Patient-Reported Dysphagia in Adults with Eosinophilic Esophagitis: Translation and Validation of the Swedish Eosinophilic Esophagitis Activity Index

Dysphagia. DOI: 10.1007/s00455-021-10277-5. 2021.

Author(s): Sofie Albinsson, Lisa Tuomi, Christine Wennerås, Helen Larsson

Keyword(s): Eosinophilic Esophagitis, Activity Index, Intraclass Correlation, Correlation Coefficients, Control Group, Cronbach's Alpha, Intraclass Correlation Coefficients, Patient Reported, EORTC QLQ

The lack of a Swedish patient-reported outcome instrument for eosinophilic esophagitis (EoE) has limited the assessment of the disease. The aims of the study were to translate and validate the Eosinophilic Esophagitis Activity Index (EEsAI) to Swedish and to assess the symptom severity of patients with EoE compared to a nondysphagia control group. The EEsAI was translated and adapted to a Swedish cultural context (S-EEsAI) based on international guidelines. The S-EEsAI was validated using adult Swedish patients with EoE (n = 97) and an age- and sex-matched nondysphagia control group (n = 97). All participants completed the S-EEsAI, the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Oesophageal Module 18 (EORTC QLQ-OES18), and supplementary questions regarding feasibility and demographics. Reliability and validity of the S-EEsAI were evaluated by Cronbach’s alpha and Spearman correlation coefficients between the domains of the S-EEsAI and the EORTC QLQ-OES18. A test–retest analysis of 29 patients was evaluated through intraclass correlation coefficients. The S-EEsAI had sufficient reliability with Cronbach’s alpha values of 0.83 and 0.85 for the “visual dysphagia question” and the “avoidance, modification and slow eating score” domains, respectively. The test–retest reliability was sufficient, with good to excellent intraclass correlation coefficients (0.60–0.89). The S-EEsAI domains showed moderate correlation to 6/10 EORTC QLQ-OES18 domains, indicating adequate validity. The patient S-EEsAI results differed significantly from those of the nondysphagia controls (p < 0.001). The S-EEsAI appears to be a valid and reliable instrument for monitoring adult patients with EoE in Sweden.
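For readers unfamiliar with the internal-consistency statistic reported here, Cronbach's alpha for a domain with k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies that standard formula; it is not the authors' code, and the function name and the response values are hypothetical, since the abstract gives no item-level data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table.

    responses: list of rows, one per respondent, each with the same
    number k >= 2 of item scores. Sample variances (n - 1) are used.
    """
    k = len(responses[0])
    if len(responses) < 2 or k < 2 or any(len(r) != k for r in responses):
        raise ValueError("need >= 2 respondents and >= 2 items, no missing cells")

    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores are constant; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical scores of 5 respondents on a 3-item domain.
    domain = [
        [2, 3, 3],
        [4, 4, 5],
        [1, 2, 1],
        [3, 3, 4],
        [5, 4, 5],
    ]
    print(f"alpha = {cronbach_alpha(domain):.2f}")
```

Alpha describes how consistently the items of one domain move together within a single administration, which is why the study reports it separately from the test–retest ICCs.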

Diagnosis of left ventricular hypertrophy using non-ECG-gated 15O-water PET

Journal of Nuclear Cardiology. DOI: 10.1007/s12350-021-02734-3. 2021.

Author(s): Jens Sörensen, Jonny Nordström, Tomasz Baron, Stellan Mörner, Sven-Olof Granstam, ...

Keyword(s): Method Development, Intraclass Correlation, Correlation Coefficients, ROC Curves, Left Ventricular, Intraclass Correlation Coefficients, Concentric Hypertrophy, 2D Echocardiography, Positron Emission, Septal Wall Thickness

Aim: To develop a method for diagnosing left ventricular (LV) hypertrophy from cardiac perfusion 15O-water positron emission tomography (PET). Methods: We retrospectively pooled data from 139 subjects in four research cohorts. LV remodeling patterns ranged from normal to severe eccentric and concentric hypertrophy. 15O-water PET scans (n = 197) were performed with three different PET devices. A low-end scanner (66 scans) was used for method development, and the remaining scans, acquired with newer devices, were used for a blinded evaluation. Dynamic data were converted into parametric images of perfusable tissue fraction for semi-automatic delineation of the LV wall and calculation of LV mass (LVM) and septal wall thickness (WT). LVM and WT from PET were compared to cardiac magnetic resonance (CMR, n = 47) and WT to 2D-echocardiography (2DE, n = 36). PET accuracy was tested using linear regression, Bland–Altman plots, and ROC curves. Observer reproducibility was evaluated using intraclass correlation coefficients. Results: High correlations were found in the blinded analyses (r ≥ 0.87, P < 0.0001 for all). AUC for detecting increased LVM and WT (> 12 mm and > 15 mm) was ≥ 0.95 (P < 0.0001 for all). Reproducibility was excellent (ICC ≥ 0.93, P < 0.0001). Conclusion: 15O-water PET might detect LV hypertrophy with high accuracy and precision.
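The Bland–Altman comparison named in the methods reduces to two numbers per pair of techniques: the mean difference (bias) and the limits of agreement, conventionally the bias plus or minus 1.96 standard deviations of the differences. The sketch below shows that conventional calculation only; the abstract does not report how the authors configured theirs, and the function name and the wall-thickness values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences a - b; the limits of
    agreement are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with >= 2 pairs")

    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical septal wall thickness in mm from two modalities.
    pet = [10.5, 12.0, 15.5, 9.0, 18.0]
    cmr = [10.0, 12.5, 15.0, 9.5, 17.0]
    bias, lower, upper = bland_altman(pet, cmr)
    print(f"bias = {bias:.2f} mm, limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

Unlike a correlation coefficient, these limits are expressed in the measurement's own units, so they show directly how far two methods can be expected to disagree for an individual subject.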

Intersession reliability of GPS-based and accelerometer-based physical variables in small-sided games with and without the offside rule

Proceedings of the Institution of Mechanical Engineers Part P Journal of Sports Engineering and Technology. DOI: 10.1177/1754337120987646. 2021, pp. 175433712098764.

Author(s): Igor Junio de Oliveira Custódio, Gibson Moreira Praça, Leandro Vinhas de Paula, Sarah da Glória Teles Bredt, Fabio Yuzo Nakamura, ...

Keyword(s): Root Mean Square, Global Positioning System, Intraclass Correlation, Correlation Coefficients, Mean Square, Physical Demands, Intraclass Correlation Coefficients, Total Distance, Global Positioning, High Level

This study aimed to analyze the intersession reliability of global positioning system (GPS)-based distance variables and accelerometer-based acceleration variables in small-sided soccer games (SSG) with and without the offside rule, and to compare these variables between the two tasks. Twenty-four high-level U-17 soccer athletes played 3 versus 3 (plus goalkeepers) SSG in two formats (with and without the offside rule). SSG were performed on eight consecutive weeks (4 weeks for each group), twice a week. The physical demands were recorded using a GPS with an embedded triaxial accelerometer. GPS-based variables (total distance, average speed, and distances covered at different speeds) and accelerometer-based variables (Player Load™, root mean square of the acceleration recorded in each movement axis, and the root mean square of resultant acceleration) were calculated. Results showed that the inclusion of the offside rule reduced the total distance covered (large effect) and the distances covered at moderate speed zones (7–12.9 km/h – moderate effect; 13–17.9 km/h – large effect). In both SSG formats, GPS-based variables presented good to excellent reliability (intraclass correlation coefficients – ICC > 0.62) and accelerometer-based variables presented excellent reliability (ICC values > 0.89). Based on the results of this study, the offside rule decreases the physical demand of 3 versus 3 SSG, and the physical demands required in these SSG present high intersession reliability.

Assessment of reliability and validity of the 5-scale grading system of the point-of-care immunoassay for tear matrix metalloproteinase-9

Scientific Reports. DOI: 10.1038/s41598-021-92020-6. 2021, Vol 11 (1).

Author(s): Minjeong Kim, Ja Young Oh, Seon Ha Bae, Seung Hyeun Lee, Won Jun Lee, ...

Keyword(s): Matrix Metalloproteinase, Calibration Curve, Point Of Care, Interobserver Reliability, Intraclass Correlation, Reliability And Validity, Correlation Coefficients, Grading System, Intraclass Correlation Coefficients, The Difference

We evaluated the reliability and validity of the 5-scale grading system used to interpret the point-of-care immunoassay for tear matrix metalloproteinase (MMP)-9. Six observers graded the red bands in photographs of the readout window of the MMP-9 immunoassay kit (InflammaDry) twice, 2 weeks apart, using the 5-scale grading system (i.e., grades 0–4). Interobserver and intraobserver reliability were evaluated using intraclass correlation coefficients. The interobserver agreements were analyzed according to the severity of tear MMP-9 expression. To validate the system, a concentration calibration curve was made using MMP-9 solutions with reference concentrations, and the distribution of MMP-9 concentrations was then analyzed according to the 5-scale grading system. Both intraobserver and interobserver reliability were excellent. The readout grades were significantly correlated with the quantified colorimetric densities. The interobserver variance of readout grades had no correlation with the severity of the measured densities. According to the calibration curve, the band density continued to increase up to the maximal concentration (i.e., 5000 ng/mL). Differences in grade sensitively reflected changes in MMP-9 concentration, especially between grades 2 and 4. Together, our data indicate that the subjective 5-scale grading system in the point-of-care MMP-9 immunoassay is an easy and reliable method with acceptable accuracy.
