Test-Retest Reliability is not a Measure of Reliability or Stability: A Friendly Reminder

2020 ◽  
Author(s):  
Lukas Röseler ◽  
Daniel Wolf ◽  
Johannes Leder ◽  
Astrid Schütz

We argue that the test-retest reliability coefficient, which is the correlation between a measurement and a repeated measurement using the same diagnostic instrument in the same sample (sometimes referred to as repeatability or falsely referred to as stability), is by itself not an appropriate measure of the reliability of the diagnostic instrument or of the stability of the construct in question. In combination with an actual coefficient of reliability such as Cronbach’s alpha, the test-retest reliability coefficient can be used to estimate and compare the stabilities of constructs using a procedure based on the correction for attenuation. However, results from a simulation study showed that classically constructed confidence intervals for the estimator exhibit under-coverage and thus cannot be interpreted correctly.
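The correction for attenuation mentioned above can be illustrated with a short sketch. It uses the classical Spearman formula, in which the observed retest correlation is divided by the geometric mean of the reliabilities at the two occasions. The abstract does not spell out the authors' exact estimator, so the function name `estimate_stability` and the input values are assumptions made for illustration only.

```python
import math


def estimate_stability(retest_r, reliability_t1, reliability_t2):
    """Disattenuated retest correlation (classical correction for attenuation).

    retest_r        -- observed correlation between time 1 and time 2 scores
    reliability_t1  -- reliability coefficient (e.g., Cronbach's alpha) at time 1
    reliability_t2  -- reliability coefficient at time 2
    """
    for rel in (reliability_t1, reliability_t2):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    # Divide the observed correlation by the geometric mean of the reliabilities.
    return retest_r / math.sqrt(reliability_t1 * reliability_t2)


if __name__ == "__main__":
    # Hypothetical values, not taken from any study listed here.
    print(f"{estimate_stability(0.64, 0.80, 0.80):.2f}")
```

Because the estimate is a ratio of sample statistics, it can exceed 1 in small samples, which is one reason an interval around it needs care.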

1982 ◽  
Vol 25 (4) ◽  
pp. 521-527 ◽  
Author(s):  
David C. Shepherd

In 1977, Shepherd and colleagues reported significant correlations (–.90, –.91) between speechreading scores and the latency of a selected negative peak (VN 130 measure) on the averaged visual electroencephalic waveform. The primary purpose of the current study was to examine the stability, or repeatability, of this relation between cognitive and neurophysiologic measures over a period of several months and thus support its test-retest reliability. Repeated speechreading word and sentence scores were gathered during three test-retest sessions from each of 20 normal-hearing adults. An average of 56 days elapsed between the end of one speechreading session and the beginning of the next. During each of four other test-retest sessions, averaged visual electroencephalic responses (AVERs) were evoked from each subject. An average of 49 days intervened between AVER sessions. Product-moment correlations computed among repeated word scores and VN 130 measures ranged from –.61 to –.89. Based on these findings, it was concluded that the VN 130 measure of visual neural firing time is a reliable correlate of speechreading in normal-hearing adults.


2017 ◽  
Vol 32 (2) ◽  
Author(s):  
Shannon Gwin ◽  
Paul Branscum ◽  
E. Laurette Taylor

The purpose of this study was to create a valid and reliable instrument to evaluate theory-based beliefs towards physical activity among clergy members. Data were collected from 174 clergy who participated in a 15-item online and paper-based survey. Psychometric properties of the instrument included confirmatory factor analysis (construct validity) and Cronbach’s alpha (internal consistency reliability). In addition, the stability (test-retest reliability) of each subscale was evaluated with a sub-sample of 30 participants. Results show the instrument was both valid and reliable, and will be useful in future studies targeting this population. Future implications are discussed.


2021 ◽  
Author(s):  
Eranthi Weeratunga ◽  
Chandanie Senadheera ◽  
Manjula Hettiarachchi ◽  
Bilesha Perera

Background: Coping strategies are essential in the cancer management and recovery process and play an integral part for patients with cancer globally. In Sri Lanka, validated scales to measure coping are scarce. This study examined the psychometric properties of the Sinhalese version of the Brief COPE. Methods: The scale is self-administered (28 items) and covers adaptive and maladaptive coping strategies, divided into 14 subscales. Cancer patients were enrolled on a ‘first come, first served’ basis using the appointment register at the Radiotherapy Unit, Oncology Ward, Teaching Hospital, Karapitiya, Galle, Sri Lanka. They were asked to complete the Sinhalese version of the Brief COPE and to provide demographic details. Test-retest reliability was checked with the same subjects two weeks later. Factorial validity was assessed using exploratory factor analysis and principal component analysis. Results were regarded as statistically significant if p < 0.05. Results: The mean (±SD) age of the sample was 61 (±12) years. The mean (±SD) adaptive and maladaptive coping scores were 37.50 (±8.14) and 17.10 (±2.44), respectively. The internal consistency of the overall scale was good (Cronbach’s alpha = 0.819). Adaptive and maladaptive coping showed a high Cronbach’s alpha (0.861 and 0.396). The test-retest reliability was found to be 0.66. The Sinhala version of the Brief COPE was negatively correlated with the CES-D scale and positively correlated with the WHOQOL-BREF questionnaire. Seven factors were extracted. Conclusion: The Sinhala version of the Brief COPE is a valid and reliable tool to assess coping strategies among patients with cancer. The findings of this study could help health authorities understand coping strategies among patients with cancer and their impact on patients and family members, in order to relieve their suffering.


2019 ◽  
Vol 24 (3) ◽  
pp. 95-107
Author(s):  
N.A. Khokhlov ◽  
G.D. Laskov

This article focuses on the development of methods to measure personality and cognitive predisposition to monosemantic or polysemantic context generation (PCG). In accordance with the concept of V.S. Rotenberg, we assumed that PCG was connected with manual functional asymmetry. We developed four tests: one was designed to measure personality PCG, and the other three measure cognitive PCG. The tests were piloted on samples of 160 to 736 participants. Cronbach's alpha (0.67 to 0.93) and split-half coefficients (0.72 to 0.93) were calculated for all tests; for two of them, test-retest reliability (0.47 to 0.91) was also measured. The personality traits “reticence-sociability” and “concreteness-abstractness” explain 21.7% of the variance in personality PCG. Personality and cognitive PCG are interconnected, but they have a fair amount of specificity. Manual functional asymmetry is weakly connected with personality PCG (not more than 1.5% of the common variance) and is not connected with cognitive PCG.


AAOHN Journal ◽  
2006 ◽  
Vol 54 (5) ◽  
pp. 201-209 ◽  
Author(s):  
Arnold B. de Castro ◽  
Barbara Curbow ◽  
Jacqueline Agnew ◽  
Jennifer A. Haythornthwaite ◽  
Sheila T. Fitzgerald

This analysis examines the applicability of the emotional labor scale from the Emotions at Work Scale (EWS) through the assessment of its psychometric properties in a sample of young workers. Factor analysis and test-retest reliability analysis were conducted on a 13-item scale measuring emotional labor. The EWS 13-item emotional labor scale was refined to 9 items. Two subscales were delineated: 5 items measured surface acting and 4 items measured deep acting, each with a mean inter-item correlation of 0.33. Cronbach's alpha was .96 for the 9-item scale, and .71 and .67 for the surface acting and deep acting subscales, respectively. Test-retest reliability was 0.64 for surface acting and 0.51 for deep acting over a mean interval of 3 months. Emotional labor can be quantitatively measured among young workers using the derived 9-item scale, although additional studies further evaluating its use should be conducted.
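The link between the number of items, the mean inter-item correlation, and alpha can be sketched with the standardized-alpha (Spearman-Brown) formula, alpha = k·r̄ / (1 + (k − 1)·r̄). This is only a rough illustration: the abstract does not say which alpha variant was computed, and the function name `standardized_alpha` is hypothetical.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized Cronbach's alpha from the mean inter-item correlation."""
    if n_items < 2:
        raise ValueError("at least two items are required")
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)


if __name__ == "__main__":
    # Subscale sizes and the mean inter-item correlation reported in the abstract.
    print(f"surface acting (5 items): {standardized_alpha(5, 0.33):.2f}")
    print(f"deep acting (4 items):    {standardized_alpha(4, 0.33):.2f}")
```

With the rounded value of 0.33, the formula gives about .71 for five items and about .66 for four, close to the reported subscale alphas of .71 and .67; the small gap may simply reflect rounding of the reported correlation.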


2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Anggi Setyowati ◽  
Min-Huey Chung ◽  
Ah. Yusuf ◽  
Setya Haksama

Background: Curiosity is a personality characteristic associated with wellbeing and positive functioning. The objective of this study was to assess the construct validity of the Curiosity and Exploration Inventory II (CEI-II) in Indonesia. Design and Methods: The study included 256 undergraduate students living in Indonesia, with a mean age of 19.8 years. The CEI-II measures stretching and embracing using 11 items. The English version of the CEI-II was translated into Bahasa. Cronbach’s alpha and the intraclass correlation coefficient (ICC) were used to examine internal consistency reliability and test-retest reliability, respectively. To evaluate construct validity, exploratory factor analysis (EFA) was used to assess factor structure and confirmatory factor analysis (CFA) was used to evaluate the structural model fit of the CEI-II Indonesia version. Results: Cronbach’s alpha for the internal consistency of the overall CEI-II Indonesia version was 0.77. The ICC for test-retest reliability ranged from 0.753 to 0.829. EFA showed sampling adequacy, with a Kaiser-Meyer-Olkin value of 0.86, and Bartlett’s test of sphericity was statistically significant. CFA tested the second-order model with two factors and showed model fit. Conclusions: The CEI-II Indonesia version showed acceptable construct validity for evaluating curiosity in Indonesia.


2020 ◽  
Author(s):  
Laura Elizabeth Bedford ◽  
Maegan Hon Yan Yeung ◽  
Chi Ho Au ◽  
Emily Tsui Yee Tse ◽  
Wing Yee Yim ◽  
...  

Background: Patient enablement is a core tenet of patient-centred and holistic primary care. The Patient Enablement Instrument (PEI) is a transitional measure limited in its ability to measure changes over time. A modified version, PEI-2, has been developed to measure enablement at a given time-point without comparison to a recalled baseline. Objective: To assess the validity, reliability, sensitivity and responsiveness of PEI-2. Methods: PEI-2 was modified from the Chinese PEI to assess enablement over 4 weeks in a prospective cohort study nested within a community support programme [Trekkers Family Enhancement Scheme (TFES)] in Hong Kong. Construct validity was assessed by factor analysis and convergent validity by Spearman’s correlations with health-related quality of life and depressive symptoms. Internal reliability was assessed using Cronbach’s alpha. Test–retest reliability was assessed by intraclass correlation (ICC), responsiveness by 12–24-month change in PEI-2 score and sensitivity by differences in change of PEI-2 score between TFES participants and a control group. Results: PEI-2 demonstrated construct validity with all items loading on one factor (factor loadings > 0.7). Convergent validity was confirmed by significant correlations with 12-item Short Form Questionnaire, version 2 (r = 0.1089–0.1919) and Patient Health Questionnaire-9 (r = −0.2030). Internal reliability was high (Cronbach’s alpha = 0.9095) and test–retest reliability moderate (ICC = 0.520, P = 0.506). Significant improvements in PEI-2 scores among the TFES group suggested good responsiveness (P < 0.001). The difference in change of PEI-2 scores between TFES and control was significant (P = 0.008), indicating good sensitivity. Conclusions: This study supports the validity, reliability, sensitivity and responsiveness of PEI-2 in measuring changes in enablement, making it a promising tool for evaluating enablement in cohort and intervention studies.


2017 ◽  
Vol 26 (3) ◽  
Author(s):  
Richelle M. Williams ◽  
Matthew A. Corvo ◽  
Kenneth C. Lam ◽  
Travis A. Williams ◽  
Lesley K. Gilmer ◽  
...  

Context: Postural control plays an essential role in concussion evaluation. The Stability Evaluation Test (SET) aims to objectively analyze postural control by measuring sway velocity on the NeuroCom VSR portable force platform (Natus, San Carlos, CA). Objective: To assess the test-retest reliability and practice effects of the SET protocol. Design: Cohort. Setting: Research laboratory. Patients or Other Participants: 50 healthy adults (20 men, 30 women, age 25.30 ± 3.60 y, height 166.60 ± 12.80 cm, mass 68.80 ± 13.90 kg). Interventions: All participants completed 4 trials of the SET. Each trial consisted of six 20-s balance tests with eyes closed, under the following conditions: double-leg firm (DFi), single-leg firm (SFi), tandem firm (TFi), double-leg foam (DFo), single-leg foam (SFo), and tandem foam (TFo). Each trial was separated by a 5-min seated rest period. Main Outcome Measures: The dependent variable was sway velocity (deg/s), with lower values indicating better balance. Sway velocity was recorded for each of the 6 conditions as well as a composite score for each trial. Test-retest reliability was analyzed across 4 trials with intraclass correlation coefficients (ICCs). Practice effects were analyzed with repeated measures analysis of variance, followed by Tukey post hoc comparisons for any significant main effects (P < .05). Results: Sway-velocity reliability values were good to excellent: DFi (ICC = .88; 95%CI: .81, .92), SFi (ICC = .75; 95%CI: .61, .85), TFi (ICC = .84; 95%CI: .75, .90), DFo (ICC = .83; 95%CI: .74, .90), SFo (ICC = .82; 95%CI: .72, .89), TFo (ICC = .81; 95%CI: .69, .88), and composite score (ICC = .93; 95%CI: .88, .95). Significant practice effects (P < .05) were noted on the SFi, DFo, SFo, TFo conditions and composite scores. Conclusions: Our results suggest the SET has good to excellent reliability for the assessment of postural control in healthy adults. Due to the practice effects noted, a familiarization session is recommended (ie, all 6 conditions) before data are recorded. Future studies should evaluate injured patients to determine meaningful change scores during various injuries.
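To illustrate how an ICC across repeated trials relates to practice effects, here is a minimal sketch of two common single-measure forms from a two-way ANOVA: absolute agreement, often labelled ICC(2,1), and consistency, often labelled ICC(3,1). The abstract does not state which ICC form the authors used, so this is an assumption for illustration; the function name `icc_two_way` and the toy data are hypothetical.

```python
def icc_two_way(scores):
    """Return (agreement ICC, consistency ICC) for single measurements.

    scores -- list of rows, one per subject, each holding k trial scores.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, trials, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Agreement penalizes systematic shifts between trials; consistency does not.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency


if __name__ == "__main__":
    # Toy data: every subject shifts by exactly one unit on the second trial,
    # mimicking a pure practice effect with perfectly preserved rank order.
    toy = [[1, 2], [2, 3], [3, 4], [4, 5]]
    agreement, consistency = icc_two_way(toy)
    print(f"agreement={agreement:.3f}, consistency={consistency:.3f}")
```

In the toy data the consistency form equals 1 while the agreement form is lower, which shows why a systematic practice effect can coexist with a high ICC depending on the form chosen.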


2005 ◽  
Vol 16 (08) ◽  
pp. 585-595 ◽  
Author(s):  
Melissa N. Ruscetta ◽  
Catherine V. Palmer ◽  
John D. Durrant ◽  
Judith Grayhack ◽  
Carey Ryan

Psychometric evaluations were performed on a self-perceived localization disabilities and handicaps questionnaire. Twenty individuals with normal hearing bilaterally, twenty with profound unilateral hearing impairment (UHI), and ten with any degree of bilateral hearing impairment participated. Each subject completed the questionnaire. Comparisons of the responses of the subjects with normal hearing and those with UHI revealed significant differences between the groups for both disabilities and handicaps, establishing construct validity. Cronbach's alpha correlational analyses of the responses of all subjects with hearing impairment revealed correlations of .900 (disabilities) and .800 (handicaps), establishing internal consistency. Each participant with hearing impairment was asked to complete the questionnaire again after three weeks. Pearson's correlational analyses of the responses at time one versus time two revealed correlations of .900 (disabilities) and .700 (handicaps), establishing test/retest reliability. This questionnaire is an appropriate tool for investigating the self-perceived localization disabilities and handicaps of individuals with hearing impairment.
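The time-one versus time-two comparison described above amounts to a Pearson correlation between paired scores. A minimal sketch follows; the function name `pearson_r` and the scores are hypothetical and are not data from the study.

```python
import math


def pearson_r(time1, time2):
    """Pearson product-moment correlation between paired score lists."""
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean1 = sum(time1) / len(time1)
    mean2 = sum(time2) / len(time2)
    # Cross-products and squared deviations around each occasion's mean.
    cross = sum((a - mean1) * (b - mean2) for a, b in zip(time1, time2))
    ss1 = sum((a - mean1) ** 2 for a in time1)
    ss2 = sum((b - mean2) ** 2 for b in time2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cross / math.sqrt(ss1 * ss2)


if __name__ == "__main__":
    # Hypothetical questionnaire totals for five respondents at two occasions.
    first = [12, 18, 25, 31, 40]
    second = [14, 17, 27, 30, 43]
    print(f"test-retest r = {pearson_r(first, second):.2f}")
```

Because the coefficient is unaffected by adding a constant to every score at one occasion, a high value indicates preserved rank order rather than unchanged scores.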


2016 ◽  
Vol 23 (4) ◽  
pp. 16-26 ◽  
Author(s):  
Gururaghavendran Rajesh ◽  
Monica Eriksson ◽  
Keshava Pai ◽  
S Seemanthini ◽  
Dilip G. Naik ◽  
...  

Background: The importance of salutogenesis, with its focus on what creates health rather than what causes disease, has been highlighted for a long time. This has been operationalized by Aaron Antonovsky as the Sense of Coherence (SOC-13) scale. The aim of this study was to further examine the psychometric properties of the SOC-13 in India. Methods: The present study was carried out among second-year degree students at three randomly chosen institutions at Mangalore University. Investigators assessed the appropriateness, relevance, comprehensiveness and understandability of the scale. Further, the scale was assessed by five subject experts. The SOC-13 was then pretested by administering it to peers, individuals and a few of the study subjects. Internal consistency was assessed by Cronbach’s alpha and split-half reliability. Test-retest reliability was assessed by administering the instrument to the same study participants after two weeks. Confirmatory factor analysis with varimax rotation was employed. Results: The SOC-13 revealed a Cronbach’s alpha value of 0.76. Split-half reliability and Guttman split-half reliability were found to be 0.71 and 0.70, respectively. Test-retest reliability was found to be 0.71 (p < 0.01). Factor analysis revealed a three-factor solution explaining 40.53% of the variation in SOC. Conclusions: SOC-13 was found to be a reliable and valid instrument for measuring SOC in an Indian context. The present study contributes to health promotion in an Indian context, and could be useful even in other developing countries and for further research in India.
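For readers unfamiliar with how an internal-consistency coefficient such as the one above is obtained from raw item responses, here is a minimal sketch of the usual variance-based formula for Cronbach's alpha, alpha = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical; they are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents-by-items matrix of scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical answers of four respondents to three items.
    data = [
        [3, 4, 3],
        [5, 5, 4],
        [2, 3, 2],
        [4, 4, 5],
    ]
    print(f"alpha = {cronbach_alpha(data):.2f}")
```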

