Test-retest reliability and sample size estimates after MRI scanner relocation

NeuroImage ◽  
2020 ◽  
Vol 211 ◽  
pp. 116608
Author(s):  
Tracy R. Melzer ◽  
Ross J. Keenan ◽  
Gareth J. Leeper ◽  
Stephen Kingston-Smith ◽  
Simon A. Felton ◽  
...  


2017 ◽  
Author(s):  
Xiao Chen ◽  
Bin Lu ◽  
Chao-Gan Yan

Concerns regarding the reproducibility of resting-state functional magnetic resonance imaging (R-fMRI) findings have been raised. Little is known about how to operationally define R-fMRI reproducibility and to what extent it is affected by multiple comparison correction strategies and sample size. We comprehensively assessed two aspects of reproducibility, test-retest reliability and replicability, on widely used R-fMRI metrics in both between-subject contrasts of sex differences and within-subject comparisons of eyes-open and eyes-closed (EOEC) conditions. We noted that permutation testing with Threshold-Free Cluster Enhancement (TFCE), a strict multiple comparison correction strategy, reached the best balance between family-wise error rate (under 5%) and test-retest reliability/replicability (e.g., 0.68 for test-retest reliability and 0.25 for replicability of the amplitude of low-frequency fluctuations (ALFF) in between-subject sex differences, and 0.49 for replicability of ALFF in within-subject EOEC differences). Although R-fMRI indices attained moderate reliabilities, they replicated poorly in distinct datasets (replicability < 0.3 for between-subject sex differences, < 0.5 for within-subject EOEC differences). By randomly drawing different sample sizes from a single site, we found that reliability, sensitivity, and positive predictive value (PPV) rose as sample size increased. Small sample sizes (e.g., < 80 (40 per group)) not only minimized power (sensitivity < 2%) but also decreased the likelihood that significant results reflect “true” effects (PPV < 0.26) in sex differences. Our findings have implications for the selection of multiple comparison correction strategies and highlight the importance of sufficiently large sample sizes in R-fMRI studies to enhance reproducibility.
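The sample-size effect on PPV reported above follows from the standard relation PPV = (power × prior) / (power × prior + α × (1 − prior)). A minimal Python sketch of that relation follows; the effect size d, the prior, and alpha are illustrative assumptions, not values taken from the study:

    import math
    from scipy.stats import norm

    def power_two_sample(n_per_group, d=0.3, alpha=0.05):
        # Normal approximation to the power of a two-sided,
        # two-sample test with standardized effect size d.
        z_crit = norm.ppf(1 - alpha / 2)
        ncp = d * math.sqrt(n_per_group / 2)  # noncentrality parameter
        return (1 - norm.cdf(z_crit - ncp)) + norm.cdf(-z_crit - ncp)

    def ppv(power, prior=0.2, alpha=0.05):
        # Probability that a significant finding reflects a true effect.
        return power * prior / (power * prior + alpha * (1 - prior))

    for n in (20, 40, 80, 160):
        p = power_two_sample(n)
        print(f"n/group={n:3d}  power={p:.2f}  PPV={ppv(p):.2f}")

Under these assumptions, both power and PPV climb steadily with per-group sample size, mirroring the trend the authors report for underpowered designs.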


2007 ◽  
Vol 113 (3) ◽  
pp. 129-130 ◽  
Author(s):  
Gavin Sandercock

HRV (heart rate variability) is a non-invasive marker of cardiac autonomic modulation utilized in many hundreds of scientific studies each year. The reliability of HRV has been frequently investigated yet remains poorly quantified. Assessing the reliability of a measure that captures dynamic physiological processes and shows large between- and within-subject variation is a complex task. In this issue of Clinical Science, Pinna and co-workers provide excellent insight into the test–retest reliability of commonly used HRV indices and put the values obtained into context by comparing them with levels of between-subject variation and by producing sample size estimates.
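Sample size estimates of the kind mentioned here typically come from the normal-approximation formula n = 2((z₁₋α/₂ + z_power)·SD/Δ)² per group. A minimal Python sketch, where the SD and detectable difference Δ are illustrative placeholders rather than figures from Pinna and co-workers:

    import math
    from scipy.stats import norm

    def n_per_group(delta, sd, alpha=0.05, power=0.80):
        # Per-group n for a two-sided, two-sample comparison of means,
        # detecting a difference `delta` in an HRV index whose
        # between-subject standard deviation is `sd`.
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return math.ceil(2 * (z * sd / delta) ** 2)

    # e.g., a 10 ms group difference in SDNN against a 25 ms SD
    print(n_per_group(delta=10, sd=25))  # -> 99

The formula makes the paper's point concrete: the larger the between- or within-subject variation relative to the effect of interest, the larger the required sample.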


Author(s):  
Matthew L. Hall ◽  
Stephanie De Anda

Purpose: The purposes of this study were (a) to introduce “language access profiles” as a viable alternative construct to “communication mode” for describing experience with language input during early childhood for deaf and hard-of-hearing (DHH) children; (b) to describe the development of a new tool for measuring DHH children's language access profiles during infancy and toddlerhood; and (c) to evaluate the novelty, reliability, and validity of this tool. Method: We adapted an existing retrospective parent report measure of early language experience (the Language Exposure Assessment Tool) to make it suitable for use with DHH populations. We administered the adapted instrument (DHH Language Exposure Assessment Tool [D-LEAT]) to the caregivers of 105 DHH children aged 12 years and younger. To measure convergent validity, we also administered another novel instrument: the Language Access Profile Tool. To measure test–retest reliability, half of the participants were interviewed again after 1 month. We identified groups of children with similar language access profiles by using hierarchical cluster analysis. Results: The D-LEAT revealed DHH children's diverse experiences with access to language during infancy and toddlerhood. Cluster analysis groupings were markedly different from those derived from more traditional grouping rules (e.g., communication modes). Test–retest reliability was good, especially for the same-interviewer condition. Content, convergent, and face validity were strong. Conclusions: To optimize DHH children's developmental potential, stakeholders who work at the individual and population levels would benefit from replacing communication mode with language access profiles. The D-LEAT is the first tool that aims to measure this novel construct. Despite limitations that future work aims to address, the present results demonstrate that the D-LEAT represents progress over the status quo.
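The grouping step described above, hierarchical cluster analysis over language access profiles, can be sketched as follows in Python; the profile matrix, the channel labels, and the linkage settings are invented for illustration and are not the authors' data or choices:

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Rows = children; columns = hypothetical proportions of accessible
    # language input by channel (spoken, signed, cued). Values invented.
    profiles = np.array([
        [0.9, 0.1, 0.0],
        [0.8, 0.2, 0.0],
        [0.1, 0.9, 0.0],
        [0.2, 0.7, 0.1],
        [0.5, 0.4, 0.1],
    ])

    # Agglomerative clustering with Ward linkage on Euclidean distances,
    # then cutting the dendrogram into a fixed number of clusters.
    Z = linkage(profiles, method="ward")
    labels = fcluster(Z, t=2, criterion="maxclust")
    print(labels)  # cluster assignment per child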


1982 ◽  
Vol 25 (4) ◽  
pp. 521-527 ◽  
Author(s):  
David C. Shepherd

In 1977, Shepherd and colleagues reported significant correlations (–.90, –.91) between speechreading scores and the latency of a selected negative peak (the VN130 measure) on the averaged visual electroencephalic waveform. The primary purpose of the current study was to examine the stability, or repeatability, of the relation between these cognitive and neurophysiologic measures over a period of several months and thus support its test-retest reliability. Repeated speechreading word and sentence scores were gathered during three test-retest sessions from each of 20 normal-hearing adults. An average of 56 days elapsed between the end of one speechreading session and the beginning of the next. During each of four other test-retest sessions, averaged visual electroencephalic responses (AVERs) were evoked from each subject. An average of 49 days intervened between AVER sessions. Product-moment correlations computed between repeated word scores and VN130 measures ranged from –.61 to –.89. Based on these findings, it was concluded that the VN130 measure of visual neural firing time is a reliable correlate of speechreading in normal-hearing adults.
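The analysis rests on product-moment (Pearson) correlations between repeated word scores and VN130 latencies. A minimal illustration in Python, with invented placeholder values rather than Shepherd's data:

    from scipy.stats import pearsonr

    # Invented placeholders: speechreading word scores (percent correct)
    # and VN130 latencies (ms) from the same subjects.
    word_scores = [78, 65, 82, 55, 70, 60]
    vn130_latency_ms = [118, 132, 112, 141, 126, 137]

    r, p = pearsonr(word_scores, vn130_latency_ms)
    print(f"r = {r:.2f}, p = {p:.3f}")  # strong negative r, as in the study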


2000 ◽  
Vol 16 (1) ◽  
pp. 53-58 ◽  
Author(s):  
Hans Ottosson ◽  
Martin Grann ◽  
Gunnar Kullgren

Summary: The short-term stability, or test-retest reliability, of self-reported personality traits is likely to be biased if the respondent is affected by a depressive or anxiety state. However, in some studies, DSM-oriented self-report instruments have proved to be reasonably stable in the short term, regardless of co-occurring depressive or anxiety disorders. In the present study, we examined the short-term test-retest reliability of a new self-report questionnaire for personality disorder diagnosis (DIP-Q) in a clinical sample of 30 individuals with either a depressive disorder, an anxiety disorder, or no axis-I disorder. Test-retest scores from subjects with depressive disorders were mostly unstable, with a significant change in fulfilled criteria between entry and retest for three out of ten personality disorders: borderline, avoidant, and obsessive-compulsive personality disorder. Scores from subjects with anxiety disorders were unstable only for cluster C and dependent personality disorder items. In the absence of co-morbid depressive or anxiety disorders, mean dimensional scores on the DIP-Q showed no significant differences between entry and retest. Overall, the effect of state on trait scores was moderate, and we conclude that the test-retest reliability of the DIP-Q is acceptable.
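The entry-versus-retest comparison of mean dimensional scores described here maps onto a paired test. A minimal sketch in Python, with invented scores standing in for DIP-Q data:

    from scipy.stats import ttest_rel

    # Invented dimensional scores for one personality-disorder scale at
    # entry and at short-term retest, for the same ten subjects.
    entry  = [4, 6, 3, 5, 7, 2, 5, 6, 4, 5]
    retest = [4, 5, 3, 6, 7, 2, 4, 6, 5, 5]

    t, p = ttest_rel(entry, retest)
    print(f"t = {t:.2f}, p = {p:.3f}")  # non-significant p suggests stability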


2013 ◽  
Author(s):  
Kristen M. Dahlin-James ◽  
Emily J. Hennrich ◽  
E. Grace Verbeck-Priest ◽  
Jan E. Estrellado ◽  
Jessica M. Stevens ◽  
...  

2018 ◽  
Vol 30 (12) ◽  
pp. 1652-1662 ◽  
Author(s):  
Sophie J. M. Rijnen ◽  
Sophie D. van der Linden ◽  
Wilco H. M. Emons ◽  
Margriet M. Sitskoorn ◽  
Karin Gehring
