scholarly journals Test-retest reliability of the HEXACO-100—And the value of multiple measurements for assessing reliability

PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262465
Author(s):  
Sam Henry ◽  
Isabel Thielmann ◽  
Tom Booth ◽  
René Mõttus

Despite the widespread use of the HEXACO model as a descriptive taxonomy of personality traits, there remains limited information on the test-retest reliability of its commonly-used inventories. Studies typically report internal consistency estimates, such as alpha or omega, but there are good reasons to believe that these do not accurately assess reliability. We report 13-day test-retest correlations of the 100- and 60-item English HEXACO Personality Inventory-Revised (HEXACO-100 and HEXACO-60) domains, facets, and items. In order to test the validity of test-retest reliability, we then compare these estimates to correlations between self- and informant-reports (i.e., cross-rater agreement), a widely-used validity criterion. Median estimates of test-retest reliability were .88, .81, and .65 (N = 416) for domains, facets, and items, respectively. Facets’ and items’ test-retest reliabilities were highly correlated with their cross-rater agreement estimates, whereas internal consistencies were not. Overall, the HEXACO Personality Inventory-Revised demonstrates test-retest reliability similar to other contemporary measures. We recommend that short-term retest reliability should be routinely calculated to assess reliability.
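As an illustration of the kind of estimate reported above, the following is a minimal sketch of a median test-retest correlation across scales. It assumes a plain Pearson correlation between first and second administrations; the function names and the example scores are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_retest_reliability(time1, time2):
    """Median test-retest correlation across scales.

    time1 and time2 map a scale name to the list of participants' scores
    at the first and second administration (same participant order).
    """
    return median(pearson_r(time1[s], time2[s]) for s in time1)


# Hypothetical scores for five participants on two scales (not study data).
t1 = {"scale_a": [1, 2, 3, 4, 5], "scale_b": [2, 4, 1, 5, 3]}
t2 = {"scale_a": [2, 4, 6, 8, 10], "scale_b": [2, 4, 1, 5, 3]}
retest = median_retest_reliability(t1, t2)
```

The same two functions can be applied at the level of domains, facets, or items by changing what each dictionary key represents.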

2021 ◽  
Author(s):  
Samuel Henry ◽  
Isabel Thielmann ◽  
Tom Booth ◽  
René Mõttus

Despite the widespread use of the HEXACO as a descriptive taxonomy of human personality, there remains very limited information on the test-retest reliability of commonly used tools to measure the six traits. We report 12-day test-retest reliability of the 100-item HEXACO-PI-R (HEXACO-100) at the level of domains, facets, and items. We compare test-retest estimates to internal consistency for domains and facets, and to cross-rater agreement for all levels of measurement. Median test-retest correlations (rTT) were r = .65, .81, and .88 (n = 416) for items, facets, and domains, respectively. Facets’ cross-rater agreement estimates (rCA) were highly correlated with rTTs but not with internal consistency estimates. We conclude that the HEXACO-100 demonstrates rTT similar to other contemporary measures, and that rTT data should be routinely collected for scales.


2000 ◽  
Vol 16 (1) ◽  
pp. 53-58 ◽  
Author(s):  
Hans Ottosson ◽  
Martin Grann ◽  
Gunnar Kullgren

Summary: Short-term stability or test-retest reliability of self-reported personality traits is likely to be biased if the respondent is affected by a depressive or anxiety state. However, in some studies, DSM-oriented self-reported instruments have proved to be reasonably stable in the short term, regardless of co-occurring depressive or anxiety disorders. In the present study, we examined the short-term test-retest reliability of a new self-report questionnaire for personality disorder diagnosis (DIP-Q) on a clinical sample of 30 individuals, having either a depressive, an anxiety, or no axis-I disorder. Test-retest scorings from subjects with depressive disorders were mostly unstable, with a significant change in fulfilled criteria between entry and retest for three out of ten personality disorders: borderline, avoidant and obsessive-compulsive personality disorder. Scorings from subjects with anxiety disorders were unstable only for cluster C and dependent personality disorder items. In the absence of co-morbid depressive or anxiety disorders, mean dimensional scores of DIP-Q showed no significant differences between entry and retest. Overall, the effect from state on trait scorings was moderate, and it is concluded that test-retest reliability for DIP-Q is acceptable.


Author(s):  
Hannah Keppler ◽  
Sofie Degeest ◽  
Bart Vinck

Purpose The objective of the current study was to investigate the short-term test–retest reliability of contralateral suppression (CS) of click-evoked otoacoustic emissions (CEOAEs) using commercially available otoacoustic emission equipment. Method Twenty-three young normal-hearing subjects were tested. An otoscopic evaluation, admittance measures, pure-tone audiometry, and measurements of CEOAEs without and with contralateral acoustic stimulation (CAS) to determine CS were performed at baseline (n = 23). An immediate retest without and with refitting of the probe (only CS of CEOAEs; n = 11) and a retest after 1 week (n = 23) were also performed. Test–retest reliability parameters were determined on CEOAE response amplitudes without and with CAS, and on raw and normalized CS indices between baseline and the other test moments. Results Repeated-measures analysis of variance indicated no random or systematic changes in CEOAE response amplitudes without and with CAS, and in raw and normalized CS indices between the test moments. Moderate-to-high intraclass correlation coefficients with mostly highly significant between-subjects variability between baseline and each consecutive test moment were found for CEOAE response amplitude without and with CAS, and for the raw and normalized CS indices. Other reliability parameters deteriorated between CEOAE response amplitudes with CAS as compared to without CAS, between baseline and retest with probe refitting, and after 1 week, as well as for frequency-specific raw and normalized CS indices as compared to global CS indices. Conclusions There was considerable variability in raw and normalized CS indices as measured using CEOAEs with CAS using commercially available otoacoustic emission equipment. More research is needed to optimize the measurement of CS of CEOAEs and to reduce influencing factors, as well as to make generalization of test–retest reliability data possible.


2019 ◽  
Vol 43 (3) ◽  
pp. 271-277
Author(s):  
Joyce J. Endendijk ◽  
Marleen G. Groeneveld ◽  
Maja Deković ◽  
Carlijn van den Boomen

The emotional availability scales (EAS), 4th edition, are widely used in research and clinical practice to assess the quality of parent–child interaction. This study examined the short-term reliability and continuity of the EAS (4th ed.) assessed in two similar observational contexts over a one-week interval. Sixty-two Dutch parents (85% mothers) and their 9- to 12-month-old infants (mean age = 10.07 months, SD = 0.47, 53% boys) were videotaped twice while they interacted with each other during several tasks (free play, structured play, book reading, toys taken away). The videotapes were coded with the EAS 4th edition by two reliable coders. Moderate to strong test–retest reliability was found for the three EA parent-dimensions: sensitivity, structuring, and nonintrusiveness. Child involvement was not reliable over a one-week period, and child responsiveness could only be reliably assessed in boys. Test–retest reliability of structuring was also higher for boys than for girls. Regarding continuity, mean levels of sensitivity, structuring, nonintrusiveness, and involvement did not change over a one-week interval, but responsiveness increased for girls only. Thus, the parenting dimensions of the 4th edition of the EAS reflect stable and consistent characteristics of the parent–child dyad in the short term, but the child measures do not.


2017 ◽  
Vol 7 (3) ◽  
pp. 208-216 ◽  
Author(s):  
Amanda M. O’Brien ◽  
Joseph E. Casey ◽  
Rachel M. Salmon

2017 ◽  
Vol 25 (3) ◽  
pp. 431-458 ◽  
Author(s):  
Janine E. Hinton ◽  
Mary Z. Mays ◽  
Debra Hagler ◽  
Pamela Randolph ◽  
Ruth Brooks ◽  
...  

Background and Purpose: There is growing evidence that simulation testing is appropriate for assessing nursing competence. We compiled evidence on the validity and reliability of the Nursing Performance Profile (NPP) method for assessing competence. Methods: Participants (N = 67) each completed 3 high-fidelity simulation tests; raters (N = 31) scored the videotaped tests using a 41-item competency rating instrument. Results: The test identified areas of practice breakdown and distinguished among subgroups differing in age, education, and simulation experience. Supervisor assessments were positively correlated, r = .31. Self-assessments were uncorrelated, r = .07. Inter-rater agreement ranged from 93% to 100%. Test–retest reliability ranged from r = .57 to .69. Conclusions: The NPP can be used to assess competence and make decisions supporting public safety.


2021 ◽  
Vol 42 (Supplement_1) ◽  
Author(s):  
M Thunestvedt Hansen ◽  
T Roemer ◽  
A Hoejgaard ◽  
K Husted ◽  
K Soerensen ◽  
...  

Abstract Introduction Low cardiorespiratory fitness expressed as a low maximal oxygen consumption (V̇O2max) is associated with cardiovascular disease and all-cause mortality (1). Thus, V̇O2max is recognised as an important clinical tool in the assessment of patients (1,2). However, assessment of V̇O2max by exercise testing is both physically demanding and methodologically challenging and hence the clinical applicability is limited. Purpose Therefore, the aim of this study was to investigate the accuracy and precision of a clinical non-exercise method for assessment of V̇O2max. Methods On three separate days 20 healthy men (n=10) and women (n=10) with varying age (22–72 years) and fitness levels performed two tests for determination of V̇O2max; (a) a non-exercise test using seismocardiography (SCG V̇O2max) and (b) a graded exercise test to voluntary exhaustion on a cycle ergometer based on indirect calorimetry (IC V̇O2max). These tests were performed in order to examine the day-to-day reliability and the validity of SCG V̇O2max, respectively. Furthermore, SCG V̇O2max was assessed twice on each test day to investigate test-retest reliability. The SCG V̇O2max was performed in prone position following a short resting period by placing the SCG recording device on the xiphisternal joint with double adhesive tape. V̇O2max was assessed during a 5-minute recording of the sternal movement using SCG in combination with demographic data of the participants (3). In addition, body composition was measured and a resting blood sample collected each test day. Results On average SCG V̇O2max was 3.3±2.4 ml/min/kg (mean ± 95% CI) lower than IC V̇O2max (p=0.013, SCG V̇O2max: 36.6±3.3 ml/min/kg, IC V̇O2max: 39.9±3.0 ml/min/kg). A significant positive correlation was found between SCG V̇O2max and IC V̇O2max (Pearson, r=0.72, p<0.001). 
Both SCG V̇O2max and IC V̇O2max were similar between test days (p=0.972) and the intra-individual coefficient of variation was 4.5±2.9% and 4.0±2.5%, respectively. Within each test day SCG V̇O2max was highly correlated (r=0.99, p<0.0001) and no difference was observed between tests (p=0.993). Conclusions The accuracy of the current non-exercise assessment of cardiorespiratory fitness based on seismocardiography is not optimal as SCG V̇O2max was systematically lower than the gold standard assessment applying indirect calorimetry during a graded exercise test. Despite the abovementioned difference, SCG V̇O2max and IC V̇O2max were highly correlated. Furthermore, the precision of SCG V̇O2max is very high as both day-to-day and test-retest reliability were high. Funding Acknowledgement Type of funding sources: Private company. Main funding source(s): VentriJect ApS, Copenhagen, Denmark
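The intra-individual coefficient of variation mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes the common definition (each participant's sample standard deviation divided by their mean, averaged over participants); the abstract does not state the exact formula used, and the function names are hypothetical.

```python
from statistics import mean, stdev


def intra_individual_cv(measurements):
    """CV (%) of one participant's repeated measurements: 100 * SD / mean."""
    if len(measurements) < 2:
        raise ValueError("need at least two repeated measurements")
    return 100.0 * stdev(measurements) / mean(measurements)


def mean_cv(by_participant):
    """Average the per-participant CVs across a sample.

    by_participant maps a participant id to that person's repeated
    measurements, for example one value per test day.
    """
    return mean(intra_individual_cv(m) for m in by_participant.values())


# Hypothetical repeated measurements (not study data).
sample = {"p1": [10.0, 10.0, 10.0], "p2": [9.0, 11.0]}
average_cv = mean_cv(sample)
```

A lower average CV indicates that repeated measurements of the same person vary less relative to that person's own mean.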


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
En-Chi Chiu ◽  
Ya-Chen Lee ◽  
Shu-Chun Lee ◽  
I-Ping Hsueh

Abstract Background The Performance-based measure of Executive Functions (PEF) with four domains is designed to assess executive functions in people with schizophrenia. The purpose of this study was to examine the test-retest reliability of the PEF administered by the same rater (intra-rater agreement) and by different raters (inter-rater agreement) in people with schizophrenia and to estimate the values of minimal detectable change (MDC) and MDC%. Methods Two convenience samples of people with schizophrenia (n = 60 each) each completed two assessments two weeks apart. The intraclass correlation coefficient (ICC) was analyzed to examine intra-rater and inter-rater agreements of the test-retest reliability of the PEF. The MDC was calculated through standard error of measurement. Results For the intra-rater agreement study, the ICC values of the four domains were 0.88–0.92. The MDC (MDC%) of the four domains (volition, planning, purposive action, and performance effective) were 13.0 (13.0%), 12.2 (16.4%), 16.2 (16.2%), and 16.3 (18.8%), respectively. For the inter-rater agreement study, the ICC values of the four domains were 0.82–0.89. The MDC (MDC%) were 15.8 (15.8%), 17.4 (20.0%), 20.9 (20.9%), and 18.6 (18.6%) for the volition, planning, purposive action, and performance effective domains, respectively. Conclusions The PEF has good test-retest reliability, including intra-rater and inter-rater agreements, for people with schizophrenia. Clinicians and researchers can use the MDC values to verify whether an individual with schizophrenia shows any real change (improvement or deterioration) between repeated PEF assessments by the same or different raters.
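The abstract above states that the MDC was derived from the standard error of measurement (SEM). A minimal sketch of that calculation follows, assuming the widely used formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. The abstract does not give the exact formulas or the denominator used for MDC%, so the `reference_score` parameter and the input values are assumptions for illustration only.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with SD the between-subject standard deviation."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    # sqrt(2) accounts for measurement error in both the test and the retest.
    return z * sqrt(2.0) * standard_error_of_measurement(sd, icc)


def mdc_percent(mdc, reference_score):
    """MDC as a percentage of a reference score.

    The choice of reference (for example the sample mean or the maximum
    possible score) is an assumption and varies between studies.
    """
    return 100.0 * mdc / reference_score


# Hypothetical inputs, not values from the study.
mdc = minimal_detectable_change(sd=10.0, icc=0.91)
mdc_pct = mdc_percent(mdc, reference_score=100.0)
```

Under this reading, a change between two assessments smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.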

