Reliability of Self-Report Measures of Drug Use in Prevention Research: Evaluation of the Project Smart Questionnaire via the Test-Retest Reliability Matrix

John W. Graham; Brian R. Flay; C. Anderson Johnson; William B. Hansen; Linda Grossman; Judith L. Sobel

doi:10.2190/cyv0-7dpb-djfa-ej5u

Journal of Drug Education; doi:10.2190/cyv0-7dpb-djfa-ej5u; 1984; Vol 14 (2); pp. 175-193; Cited By ~ 117

Author(s): John W. Graham; Brian R. Flay; C. Anderson Johnson; William B. Hansen; Linda Grossman; ...

Keyword(s): Drug Use; Present Article; Research Evaluation; Self Report; Prevention Research; Good Reliability; Single Measure; Retest Reliability; Project Smart; Test Retest Reliability

The present article describes an evaluation of a self-report questionnaire administered to whole classrooms of 7th graders. Using the test-retest reliability matrix (based on concepts of Cronbach [1] and Campbell and Fiske [2]), eight of nine drug-use indices appeared to have acceptable to good reliability. The three measures included in the test-retest reliability matrix provide stronger evidence for good reliability than could any single measure.

Test–Retest Reliability of Common Measures of Eating Disorder Symptoms in Men Versus Women

Assessment; doi:10.1177/1073191117700267; 2017; Vol 26 (3); pp. 419-431; Cited By ~ 10

Author(s): Kelsie T. Forbush; Lindsay A. Hilderbrand; Brittany K. Bohrer; Danielle A. N. Chapa

Keyword(s): Binge Eating; Intraclass Correlation; Correlation Coefficients; Self Report; Good Reliability; Retest Reliability; Intraclass Correlation Coefficients; Eating Disorder Symptoms; Disinhibited Eating; Test Retest Reliability

Approximately 10% to 30% of individuals with eating disorders (EDs) are male, yet because measures often have not been tested among male participants, it is unclear whether the psychometric properties of ED measures are equivalent between sexes. The purpose of this study was to compare the test–retest reliability of common ED measures in men versus women. Participants (N = 227; 58.1% female) completed self-report measures of body dissatisfaction, restrained eating, disinhibited eating, bulimic symptoms, and desire-for-muscularity at baseline and 2 to 4 weeks later. Intraclass correlations were used to compute retest correlations. Spearman’s rho was used to compute retest correlations for skewed and kurtotic variables. We compared 95% confidence intervals for intraclass correlation coefficients to determine whether measures differed in reliability between sexes. Most ED measures had at least acceptable test–retest reliabilities. However, few measures of disinhibited and binge eating demonstrated good reliability in men. Results highlight the utility of several ED measures for assessing symptom change over time, and the need for additional research to identify and correct for sources of gender unreliability among ED self-report measures in men, particularly for assessing constructs that include binge-eating behavior.
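
As a rough illustration of the retest statistic named in this abstract, the Python sketch below computes a two-way random-effects, absolute-agreement, single-measurement intraclass correlation, often written ICC(2,1). The abstract does not say which ICC form the authors used, so that choice, the function name `icc_2_1`, and the toy scores are assumptions made purely for illustration.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` holds one row per participant; each row lists that participant's
    score at every session, e.g. [baseline, retest]. Hypothetical helper.
    """
    n = len(scores)          # participants
    k = len(scores[0])       # sessions
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy data (invented): identical scores at both sessions.
identical = [[1, 1], [2, 2], [3, 3]]
# Toy data (invented): every participant scores one point higher at retest.
shifted = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(identical), icc_2_1(shifted))
```

Because this form measures absolute agreement, a uniform one-point shift at retest lowers the coefficient even though the rank order of participants is unchanged; a consistency-type ICC would behave differently.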

Test-Retest Reliability of a Self-Report Questionnaire for DSM-IV and ICD-10 Personality Disorders

European Journal of Psychological Assessment; doi:10.1027//1015-5759.16.1.53; 2000; Vol 16 (1); pp. 53-58; Cited By ~ 11

Author(s): Hans Ottosson; Martin Grann; Gunnar Kullgren

Keyword(s): Personality Disorder; Anxiety Disorders; Personality Disorders; Clinical Sample; Self Report; Anxiety State; Short Term; Retest Reliability; Axis I; Test Retest Reliability

Summary: Short-term stability or test-retest reliability of self-reported personality traits is likely to be biased if the respondent is affected by a depressive or anxiety state. However, in some studies, DSM-oriented self-reported instruments have proved to be reasonably stable in the short term, regardless of co-occurring depressive or anxiety disorders. In the present study, we examined the short-term test-retest reliability of a new self-report questionnaire for personality disorder diagnosis (DIP-Q) on a clinical sample of 30 individuals, having either a depressive, an anxiety, or no axis-I disorder. Test-retest scorings from subjects with depressive disorders were mostly unstable, with a significant change in fulfilled criteria between entry and retest for three out of ten personality disorders: borderline, avoidant and obsessive-compulsive personality disorder. Scorings from subjects with anxiety disorders were unstable only for cluster C and dependent personality disorder items. In the absence of co-morbid depressive or anxiety disorders, mean dimensional scores of DIP-Q showed no significant differences between entry and retest. Overall, the effect from state on trait scorings was moderate, and it is concluded that test-retest reliability for DIP-Q is acceptable.

A new visual feedback-based system for the assessment of pinch force, endurance, accuracy and precision. A test-retest reliability study

Hand Therapy; doi:10.1177/17589983211002550; 2021; pp. 175899832110025

Author(s): Alberto Dottor; Eleonora Camerone; Mirko Job; Diletta Barbiani; Elisa Frisaldi; ...

Keyword(s): Visual Feedback; Intraclass Correlation; Precision Grip; Voluntary Contraction; Good Reliability; Retest Reliability; Relative Reliability; Time Accuracy; In The Beginning; Test Retest Reliability

Introduction: Given that pinch is a precision grip involved in sustained submaximal activities, a Sustained Contraction (SC) task could be associated with Maximal Voluntary Contraction (MVC). To better evaluate the thumb-index system, the test-retest reliability of pinch MVC and SC, measured by a visual feedback-based pinch gauge, was assessed. Methods: 26 healthy participants performed MVC and SC in two separate sessions. SC required participants to maintain 40% of MVC for as long as possible, and it was evaluated in terms of time, accuracy (Mean Distance between force trace and target force, MD), and precision (Coefficient of Variability of force trace, CV). MD and CV analyses were conducted by dividing the SC task into three equivalent time stages (beginning, middle, exhaustion). Relative Reliability (RR) was measured by the Intraclass Correlation Coefficient, and Absolute Reliability (AR) was measured by the Standard Error of Measurement and by Bland-Altman plot. Results: MVC and Time showed high RR and AR in both hands. RR of MD and CV in the right hand was excellent in the beginning and middle stages and fair in the exhaustion stage, showing decreasing reliability as fatigue increases. In the left hand, RR of MD and CV was generally lower. MD showed excellent reliability in the beginning stage and good reliability in the other stages. CV showed fair relative reliability at both the beginning and middle stages and excellent reliability in the last one. Conversely, high AR of MD and CV was observed in all stages in both hands. Conclusions: All indices are reliable to assess motor control of thumb-index pinch in both hands.
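
The two absolute-reliability quantities named in this abstract can be sketched in a few lines of Python. The version below assumes the commonly used formula SEM = SD × sqrt(1 − ICC) and the usual 95% Bland-Altman limits (mean difference ± 1.96 × SD of the differences); the abstract does not state which variants the authors applied, and the function names and force values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_of_measurement(pooled_scores, icc):
    """SEM = SD * sqrt(1 - ICC), using the sample SD of the pooled scores."""
    return stdev(pooled_scores) * sqrt(1 - icc)


def bland_altman_limits(session1, session2):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean retest-minus-test difference; lower and upper are the
    95% limits of agreement, bias -/+ 1.96 * SD of the differences.
    """
    diffs = [b - a for a, b in zip(session1, session2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Invented pinch-force values (arbitrary units) for four participants.
test_session = [10, 12, 14, 16]
retest_session = [11, 12, 15, 16]

bias, lower, upper = bland_altman_limits(test_session, retest_session)
sem = standard_error_of_measurement(test_session + retest_session, icc=0.75)
print(bias, lower, upper, sem)
```

Unlike the ICC, both quantities are expressed in the units of the measurement itself, which is why they are usually reported alongside relative reliability.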

Test–Retest Reliability of the Twin Cities Walking Survey

Journal of Physical Activity and Health; doi:10.1123/jpah.6.1.119; 2009; Vol 6 (1); pp. 119-131; Cited By ~ 20

Author(s): Ann Forsyth; J. Michael Oakes; Kathryn H. Schmitz

Keyword(s): Physical Activity; International Physical Activity Questionnaire; Total Activity; Twin Cities; Kappa Statistics; Good Reliability; Retest Reliability; Social Features; Test Retest Reliability; Activity Questionnaire

Background: The Twin Cities Walking Study measured the associations of built environment versus socioeconomic and psychosocial variables with total physical activity and walking for 716 adults. Methods: This article reports on the test–retest reliability of the survey portion of the study. To test the reliability of the study measures, 158 respondents completed measures twice within 1 to 4 weeks. Agreement between participants’ responses was measured using Pearson r, Spearman rho, and kappa statistics. Results: Demographic questions are highly reliable (R > .8). Questions about environmental and social features are typically less reliable (rho range = 0.42–0.91). Reliability of the International Physical Activity Questionnaire (last 7 days version) was low (rho = 0.15 for total activity). Conclusions: Much of the survey has acceptable-to-good reliability. The low test–retest reliability points to potential limitations of using a single administration of the IPAQ to characterize habitual physical activity. Implications for sound inference are accordingly complicated.
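
For readers unfamiliar with the rank-based coefficient reported here, the following Python sketch computes Spearman's rho as a Pearson correlation on average ranks, which is the standard definition when ties are present. It is not the authors' analysis code; the helper names and the example responses are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based ranks in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson r computed on the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented survey responses at first and second administration.
first = [1, 2, 3, 4]
second = [10, 20, 30, 400]
print(spearman_rho(first, second), pearson_r(first, second))
```

Because rho depends only on rank order, the extreme value at retest leaves it at its maximum while Pearson r drops, which is one reason rank correlations are often preferred for skewed survey items.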

Developing a measure to assess clinicians’ ability to reflect on key staff–patient dynamics in forensic settings

Journal of Forensic Practice; doi:10.1108/jfp-07-2021-0041; 2021; Vol ahead-of-print (ahead-of-print)

Author(s): Adam Polnay; Helen Walker; Christopher Gallacher

Keyword(s): Reflective Practice; Factor Structure; Internal Consistency; Self Report; Face Validity; Good Test; Data Set; Retest Reliability; Content Type; Test Retest Reliability

Purpose: Relational dynamics between patients and staff in forensic settings can be complicated and demanding for both sides. Reflective practice groups (RPGs) bring clinicians together to reflect on these dynamics. To date, evaluation of RPGs has lacked quantitative focus and a suitable quantitative tool. Therefore, a self-report tool was designed. This paper aims to pilot The Relational Aspects of CarE (TRACE) scale with clinicians in a high-secure hospital and investigate its psychometric properties. Design/methodology/approach: A multi-professional sample of 80 clinicians was recruited, completing TRACE and the attitudes to personality disorder questionnaire (APDQ). Exploratory factor analysis (EFA) determined factor structure and internal consistency of TRACE. A subset was selected to measure test–retest reliability. TRACE was cross-validated against the APDQ. Findings: EFA found five factors underlying the 20 TRACE items: “awareness of common responses,” “discussing and normalising feelings,” “utilising feelings,” “wish to care” and “awareness of complicated affects.” This factor structure is complex, but items clustered logically to key areas originally used to generate items. Internal consistency (α = 0.66, 95% confidence interval (CI) = 0.55–0.76) demonstrated borderline acceptability. TRACE demonstrated good test–retest reliability (intra-class correlation = 0.94, 95% CI = 0.78–0.98) and face validity. TRACE indicated a slight negative correlation with APDQ. A larger data set is needed to substantiate these preliminary findings. Practical implications: Early indications suggested TRACE was valid and reliable, suitable to measure the effectiveness of reflective practice. Originality/value: The TRACE was a distinctive measure that filled a methodological gap in the literature.
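
The internal-consistency figure reported here (α) is conventionally Cronbach's alpha, although the abstract does not name it explicitly. Under that assumption, a minimal Python sketch of the coefficient looks like this; the function name and the two-item toy responses are hypothetical and are not TRACE data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table.

    `responses` holds one row per respondent and one column per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented answers: both items rank the three respondents identically.
consistent = [[1, 1], [2, 2], [3, 3]]
# Invented answers: the items disagree for the first two respondents.
mixed = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(consistent), cronbach_alpha(mixed))
```

Alpha rises as items covary more strongly relative to their individual variances, so a multi-factor scale such as the one described can show modest alpha for the total score even when each factor is internally coherent.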

Validation of the NARCOMS Registry: pain assessment

Multiple Sclerosis Journal; doi:10.1191/1352458505ms1167oa; 2005; Vol 11 (3); pp. 338-342; Cited By ~ 28

Author(s): Ruth Ann Marrie; Gary Cutter; Tuula Tyry; Olympia Hadjimichael; Timothy Vollmer

Keyword(s): Multiple Sclerosis; Convergent Validity; Disease Duration; Self Report; Research Committee; Retest Reliability; The North; Disability Status; Pain Question; Test Retest Reliability

The North American Research Committee on Multiple Sclerosis (NARCOMS) Registry is a multiple sclerosis (MS) self-report registry with more than 24 000 participants. Participants report disability status upon enrolment, and semi-annually using Performance Scales (PS), Patient Determined Disease Steps (PDDS) and a pain question. In November 2000 and 2001, we also collected the Pain Effects Scale (PES). Our aim was to validate the NARCOMS pain question using the PES as our criterion measure. We measured correlations between the pain question and age, disease duration, various PS subscales and PDDS to assess construct validity. We correlated pain question responses in participants who reported no change in PDDS or the PS subscales between questionnaires to determine test–retest reliability. We measured responsiveness in participants who reported a substantial change in the sensory, spasticity PS subscales. The correlation between the pain question and PES was r=0.61 in November 2000, and r=0.64 in November 2001 (both P<0.0001). Correlations between the pain question and age, and disease duration were low, indicating divergent validity. Correlations between the pain question and spasticity, sensory PS subscales and PDDS were moderate, indicating convergent validity. Test–retest reliability was r=0.84 (P<0.0001). Responsiveness was 70.7%. The pain question is a valid self-report measure of pain in MS.

Psychometric properties of a self-report version of the Sexual Interest and Desire Inventory for Women (SIDI-F-SR)

doi:10.31219/osf.io/8ghda; 2020; Cited By ~ 1

Author(s): Julia Velten; Gerrit Hirschfeld; Milena Meyers; Jürgen Margraf

Keyword(s): Psychometric Properties; Internal Consistency; Sexual Interest; Intraclass Correlation; Self Report; Clinical Psychologist; Retest Reliability; Absolute Agreement; Test Retest Reliability; Restriction Of Range

Background: The Sexual Interest and Desire Inventory Female (SIDI-F) is a clinician-administered scale that allows for a comprehensive assessment of symptoms related to Hypoactive Sexual Desire Dysfunction (HSDD). As self-report questionnaires may facilitate less socially desirable responding and as time and resources are scarce in many clinical and research settings, a self-report version was developed (SIDI-F-SR). Aim: To investigate the agreement between the SIDI-F and a self-report version (SIDI-F-SR) and assess psychometric properties of the SIDI-F-SR. Methods: A total of 170 women (mean age = 36.61, SD = 10.61, range = 20-69) with HSDD provided data on the SIDI-F, administered by a clinical psychologist via telephone, and the SIDI-F-SR, delivered as an Internet-based questionnaire. A subset of 19 women answered the SIDI-F-SR twice over a period of 14 weeks. Outcomes: Intraclass correlation and predictors of absolute agreement between SIDI-F and SIDI-F-SR, as well as internal consistency, test-retest reliability, and criterion-related validity of the SIDI-F-SR, were examined. Results: There was high agreement between SIDI-F and SIDI-F-SR (ICC=.86). On average, women scored about one point higher in the self-report vs. the clinician-administered scale. Agreement was higher in young women and those with severe symptoms. Internal consistency of the SIDI-F-SR was acceptable (α=.76) and comparable to the SIDI-F (α=.74). When corrections for the restriction of range were applied, internal consistency of the SIDI-F-SR increased to .91. Test-retest reliability was good (r=.74). Criterion-related validity was low but comparable between SIDI-F and SIDI-F-SR.

Test-Retest Reliability of the Trauma and Life Events Self-Report Inventory

Psychological Reports; doi:10.2466/pr0.2000.87.3.750; 2000; Vol 87 (3); pp. 750-752; Cited By ~ 9

Author(s): J. E. Hovens; I. Bramsen; H. M. van der Ploeg; I. E. W. Reuling

Keyword(s): Medical Students; Life Events; Self Report; First Year; Total N; Male And Female; Retest Reliability; Time Periods; Test Retest Reliability

Three groups of first-year male and female medical students (total N = 90) completed the Trauma and Life Events Self-report Inventory twice. Test-retest reliability for the three different time periods was .82, .89, and .75, respectively.

Measuring Symptoms of Psychopathology in Zambian Orphans and Vulnerable Children: Scale Validation and Psychometric Evaluation

Assessment; doi:10.1177/1073191118780455; 2018; Vol 27 (6); pp. 1335-1348; Cited By ~ 1

Author(s): Sarah McIvor Murray; Paul Bolton; Jeremy C. Kane; Daniel P. Lakin; Stephanie Skavenski Van Wyk; ...

Keyword(s): Psychosocial Problems; Well Being; Self Report; Computer Assisted; Symptom Scale; Vulnerable Children; Internal Reliability; Retest Reliability; Orphans And Vulnerable Children; Test Retest Reliability

There is a paucity of validated mental health measures for assessing psychological well-being among HIV-affected youth. We sought to explore the psychometric properties and validity of the Achenbach Youth Self-Report and Child Posttraumatic Stress Disorder Symptom Scale among orphans and vulnerable children (OVC) living in Lusaka, Zambia. These scales were administered to 210 OVC aged 13 to 17 years via audio computer-assisted self-interview. Confirmatory factor analysis was used to assess scale structure, Cronbach’s alpha for internal consistency, and correlations between scales related to mental or psychosocial health for construct validity. A known-groups validation was conducted using local identifications of youth with and without significant psychosocial problems, and test–retest reliability was assessed. Scales exhibited good internal reliability (α > .80), adequate criterion validity (area under the curve > .70), and moderate test–retest reliability (.62-.68). Findings support the utility of these symptom scales for identifying OVC experiencing significant psychosocial problems in Zambia.

Measurement of Social Engagement of Nursing Home Residents During an Exercise Program

Perspectives of the ASHA Special Interest Groups; doi:10.1044/2021_persp-20-00302; 2021; pp. 1-9

Author(s): Jean Neils-Strunjas; K. Jason Crandall; Brian Weiler; Annika Gabbard; Caroline Wood; ...

Keyword(s): Health Care; Nursing Home; Social Engagement; Nursing Home Residents; Exercise Program; Self Report; Validity And Reliability; Retest Reliability; Bivariate Correlation; Test Retest Reliability

Purpose: The purpose of this article was to describe the validity and reliability of the Fun and Social Engagement Evaluation (FUSE) developed to evaluate and measure social engagement displayed by nursing home residents during Bingocize. The FUSE combines health care worker observation and a resident self-report measure to produce a score that represents a resident's total engagement. Method: To describe validity, trained health care workers who implement Bingocize were surveyed about the items on the FUSE. Visual inspection of bar graphs of responses to survey questions was used to determine content validity. To assess reliability of the FUSE, nursing home residents were evaluated by trained research assistants. Test–retest reliability of the participant scores 1 week apart was determined with the bivariate correlation (Pearson product–moment correlation coefficient). Results: For validity, the majority of survey respondents indicated that the behaviors were representative of nursing home residents during Bingocize. For reliability, there was moderate–strong test–retest reliability over 1 week (r = .60). Interrater reliability between two raters observing eight participants across two sessions was significant, κ = .68 (95% CI [.504, .848]), p < .0001. Conclusion: Results offer evidence that the FUSE is a valid and reliable method for determining social engagement during Bingocize.
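
The interrater statistic reported here (κ) corrects raw agreement for the agreement expected by chance. The Python sketch below assumes the unweighted Cohen's kappa for two raters, since the abstract does not state which kappa variant was used; the function name and the category labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance agreement implied by each rater's marginal label counts.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented engagement codes from two observers watching four residents.
observer_1 = ["high", "high", "low", "low"]
observer_2 = ["high", "low", "low", "low"]
print(cohens_kappa(observer_1, observer_2))
```

In this toy case the observers agree on three of four residents (75%), yet kappa is noticeably lower because half of that agreement would be expected by chance given how often each observer used each label.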
