Enhancing the Psychometric Properties of the Iowa Gambling Task Using Full Generative Modeling

2021 ◽  
Author(s):  
Holly Sullivan-Toole ◽  
Nathaniel Haines ◽  
Thomas M Olino

The current study examined whether generative modeling could improve the psychometric properties of IGT metrics compared to the traditional two-stage summary approach. Across four models, we examined how different assumptions at the person-level and the group-level affected inference. More specifically, two person-level modeling approaches (summary score vs. ORL computational model) were “crossed” against two group-level modeling approaches (two-stage approach vs. full generative modeling across both testing sessions) to create four models of increasing complexity (see Fig 1). Model 1 relies on the two-stage summary approach that is conventionally applied in studies of the IGT. Model 2 is a generative version of Model 1 that jointly estimates person-level summary scores (probabilities of choosing good versus bad decks) across both testing sessions while simultaneously estimating the test-retest correlation. Thus, Model 2 accounts for uncertainty in person-level estimates that Model 1 ignores but estimates a person-level metric analogous to that of Model 1. Model 3 estimates the person-level ORL parameters independently within each testing session and then estimates the test-retest correlation for each model parameter using a two-stage approach. Model 4 estimates the person-level ORL parameters jointly across both testing sessions while simultaneously estimating the test-retest correlations for each parameter. Thus, Model 4 estimates the same person-level metrics (ORL parameters) as Model 3 but accounts for uncertainty in the person-level estimates. Our overarching hypothesis was that both the use of a more theoretically informative person-level model (i.e., going from Model 1 to Model 3, and from Model 2 to Model 4) and the use of generative models to jointly estimate person-level parameters and their test-retest correlations (i.e., going from Model 1 to Model 2, and from Model 3 to Model 4) would yield behavioral estimates with increased utility for use in individual differences research. More specifically, we predicted that the behavioral estimates from Model 4 would have the highest test-retest reliability. Further, we had a general prediction that the Model 4 estimates would show improved construct validity in relation to an a priori set of trait and state self-report measures commonly associated with IGT performance as well as measures of internalizing symptoms; however, this set of analyses was largely exploratory as no particular associations between the ORL parameters and self-report measures were specified.
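As a rough illustration of the two-stage summary approach described for Model 1, the sketch below first reduces each person's choices to a point estimate (the proportion of good-deck choices) and then correlates those point estimates across two sessions. It is not the authors' code: the simulated data, the function names, and the sample sizes are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def summary_score(choices):
    """Stage 1: proportion of good-deck choices (1 = good deck, 0 = bad deck)."""
    return sum(choices) / len(choices)


def two_stage_retest(session1, session2):
    """Stage 2: correlate the per-person point estimates across sessions."""
    s1 = [summary_score(c) for c in session1]
    s2 = [summary_score(c) for c in session2]
    return pearson(s1, s2)


def simulate(n_people=200, n_trials=100, seed=1):
    """Hypothetical data: each person has a perfectly stable true P(good deck)."""
    rng = random.Random(seed)
    true_p = [rng.uniform(0.3, 0.8) for _ in range(n_people)]

    def draw(p):
        return [1 if rng.random() < p else 0 for _ in range(n_trials)]

    return [draw(p) for p in true_p], [draw(p) for p in true_p]


session1, session2 = simulate()
r_two_stage = two_stage_retest(session1, session2)
print(f"two-stage test-retest r = {r_two_stage:.2f}")
```

In this simulation the true scores do not change between sessions, so any shortfall of the printed correlation from 1 comes only from trial-level noise that the point estimates carry forward. That is the kind of uncertainty the abstract says the two-stage approach ignores.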

2020 ◽  
Author(s):  
Nathaniel Haines ◽  
Peter D. Kvam ◽  
Louis H. Irving ◽  
Colin Smith ◽  
Theodore P. Beauchaine ◽  
...  

Behavioral tasks (e.g., Stroop task) that produce replicable group-level effects (e.g., Stroop effect) often fail to reliably capture individual differences between participants (e.g., low test-retest reliability). This “reliability paradox” has led many researchers to conclude that most behavioral tasks cannot be used to develop and advance theories of individual differences. However, these conclusions are derived from statistical models that provide only superficial summary descriptions of behavioral data, thereby ignoring theoretically relevant data-generating mechanisms that underlie individual-level behavior. More generally, such descriptive methods lack the flexibility to test and develop increasingly complex theories of individual differences. To resolve this theory-description gap, we present generative modeling approaches, which involve using background knowledge to specify how behavior is generated at the individual level, and in turn how the distributions of individual-level mechanisms are characterized at the group level, all in a single joint model. Generative modeling shifts our focus away from estimating descriptive statistical “effects” toward estimating psychologically meaningful parameters, while simultaneously accounting for measurement error that would otherwise attenuate individual difference correlations. Using simulations and empirical data from the Implicit Association Test and Stroop, Flanker, Posner Cueing, and Delay Discounting tasks, we demonstrate how generative models yield (1) higher test-retest reliability estimates, and (2) more theoretically informative parameter estimates relative to traditional statistical approaches. Our results reclaim optimism regarding the utility of behavioral paradigms for testing and advancing theories of individual differences, and emphasize the importance of formally specifying and checking model assumptions to reduce theory-description gaps and facilitate principled theory development.
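The attenuation mentioned in the abstract can be written down using the classical test theory relation attributed to Spearman. The abstract does not state this formula; it is given here as background, and the symbols are chosen for this note: x_1 and x_2 are the observed summary scores at the two sessions, θ_1 and θ_2 the underlying true scores, and rel(·) the reliability (true-score variance divided by observed variance) of each summary score.

```latex
r_{\text{obs}}(x_1, x_2)
  = \rho(\theta_1, \theta_2)\,
    \sqrt{\mathrm{rel}(x_1)\,\mathrm{rel}(x_2)}
```

Because each reliability is at most 1, the observed correlation can only understate the true-score correlation. For example, with a true-score correlation of 0.9 and reliabilities of 0.6 at both sessions, the expected observed correlation is 0.9 × 0.6 = 0.54.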


2004 ◽  
Vol 34 (1) ◽  
pp. 73-82 ◽  
Author(s):  
M. H. TRIVEDI ◽  
A. J. RUSH ◽  
H. M. IBRAHIM ◽  
T. J. CARMODY ◽  
M. M. BIGGS ◽  
...  

Background. The present study provides additional data on the psychometric properties of the 30-item Inventory of Depressive Symptomatology (IDS) and of the recently developed Quick Inventory of Depressive Symptomatology (QIDS), a brief 16-item symptom severity rating scale that was derived from the longer form. Both the IDS and QIDS are available in matched clinician-rated (IDS-C30; QIDS-C16) and self-report (IDS-SR30; QIDS-SR16) formats. Method. The patient samples included 544 out-patients with major depressive disorder (MDD) and 402 out-patients with bipolar disorder (BD) drawn from 19 regionally and ethnically diverse clinics as part of the Texas Medication Algorithm Project (TMAP). Psychometric analyses including sensitivity to change with treatment were conducted. Results. Internal consistencies (Cronbach's alpha) ranged from 0·81 to 0·94 for all four scales (QIDS-C16, QIDS-SR16, IDS-C30 and IDS-SR30) in both MDD and BD patients. Sad mood, involvement, energy, concentration and self-outlook had the highest item-total correlations among patients with MDD and BD across all four scales. QIDS-SR16 and IDS-SR30 total scores were highly correlated among patients with MDD at exit (c=0·83). QIDS-C16 and IDS-C30 total scores were also highly correlated among patients with MDD (c=0·82) and patients with BD (c=0·81). The IDS-SR30, IDS-C30, QIDS-SR16, and QIDS-C16 were equivalently sensitive to symptom change, indicating high concurrent validity for all four scales. High concurrent validity was also documented based on the SF-12 Mental Health Summary score for the population divided in quintiles based on their IDS or QIDS score. Conclusion. The QIDS-SR16 and QIDS-C16, as well as the longer 30-item versions, have highly acceptable psychometric properties and are treatment-sensitive measures of symptom severity in depression.
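For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores). The function name and the small rating matrix are made up for illustration and have nothing to do with the IDS or QIDS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 4 items, each scored 0-3.
ratings = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [1, 1, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as a measure of internal consistency.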


2016 ◽  
Vol 24 (12) ◽  
pp. 1648-1660 ◽  
Author(s):  
Will H. Canu ◽  
Cynthia M. Hartung ◽  
Anne E. Stevens ◽  
Elizabeth K. Lefler

Objective: The current study examines psychometric properties of the Weiss Functional Impairment Rating Scale (WFIRS), a measure of adult ADHD-related impairment. It is a self-report questionnaire that provides a metric of overall life impairment and domain-specific dysfunction. Method: Using data from a large (N = 2,093), multi-institution sample of college students and including a subsample of collateral informants (n = 262), a series of analyses were conducted. Results: The WFIRS demonstrated robust internal reliability, cross-informant agreement on par with or superior to other measures of ADHD symptomatology and impairment, and concurrent validity. The WFIRS was not shown to be uniquely associated with ADHD, as internalizing symptoms were also associated with the total and domain scores. Conclusion: The use of the WFIRS in identifying ADHD-related impairment in emerging adults appears to be psychometrically supported, and will prove useful to clinicians and researchers.


2012 ◽  
Vol 28 (1) ◽  
pp. 51-59 ◽  
Author(s):  
Anna Ogliari ◽  
Simona Scaini ◽  
Michael J. Kofler ◽  
Valentina Lampis ◽  
Annalisa Zanoni ◽  
...  

Reliable and valid self-report questionnaires could be useful as initial screening instruments for social phobia in both clinical settings and general populations. The present study investigates the factor structure and psychometric properties of the Social Phobia and Anxiety Inventory for Children (SPAI-C) in a sample of 228 children from the Italian general population aged 8 to 11. The children were asked to complete the Italian version of the SPAI-C and the Screen for Child Anxiety Related Emotional Disorders (SCARED) questionnaire. Confirmatory factor analyses revealed that social phobia can be conceptualized as a unitary construct consisting of five distinct but interrelated symptom clusters named Assertiveness, General Conversation, Physical/Cognitive Symptoms, Avoidance, and Public Performance. Internal consistency of the SPAI-C total scores and two subscales was good; correlations between SPAI-C total scores and SCARED total scores/subscales ranged from moderate to high (Generalized Anxiety Disorder, for social phobia), with the SCARED Social Phobia subscale as the best predictor of SPAI-C total scores. The results indicate that the SPAI-C is a reliable and sensitive instrument suitable for identifying Social Phobia in the young Italian general population.


2011 ◽  
Vol 27 (3) ◽  
pp. 164-170 ◽  
Author(s):  
Anna Sundström

This study evaluated the psychometric properties of a self-report scale for assessing perceived driver competence, labeled the Self-Efficacy Scale for Driver Competence (SSDC), using item response theory analyses. Two samples of Swedish driving-license examinees (n = 795; n = 714) completed two versions of the SSDC that were parallel in content. Prior work, using classical test theory analyses, has provided support for the validity and reliability of scores from the SSDC. This study investigated the measurement precision, item hierarchy, and differential functioning for males and females of the items in the SSDC as well as how the rating scale functions. The results confirmed the previous findings that the SSDC demonstrates sound psychometric properties. In addition, the findings showed that measurement precision could be increased by adding items that tap higher self-efficacy levels. Moreover, the rating scale can be improved by reducing the number of categories or by providing each category with a label.


2012 ◽  
Author(s):  
Peter D. Marle ◽  
Alisa J. Estey ◽  
Laura J. Finan ◽  
Karenleigh A. Overmann

2020 ◽  
Author(s):  
Lili Zhang ◽  
Himanshu Vashisht ◽  
Alekhya Nethra ◽  
Brian Slattery ◽  
Tomas Ward

BACKGROUND Chronic pain is a significant worldwide health problem. It has been reported that people with chronic pain experience decision-making impairments, but these findings have been based on conventional lab experiments to date. In such experiments researchers have extensive control of conditions and can more precisely eliminate potential confounds. In contrast, there is much less known regarding how chronic pain impacts decision-making captured via lab-in-the-field experiments. Although such settings can introduce more experimental uncertainty, it is believed that collecting data in more ecologically valid contexts can better characterize the real-world impact of chronic pain. OBJECTIVE We aim to quantify decision-making differences between chronic pain individuals and healthy controls in a lab-in-the-field environment through taking advantage of internet technologies and social media. METHODS A cross-sectional design with independent groups was employed. A convenience sample of 45 participants was recruited through social media: 20 participants who self-reported living with chronic pain, and 25 people with no pain or who were living with pain for less than 6 months acting as controls. All participants completed a self-report questionnaire assessing their pain experiences and a neuropsychological task measuring their decision-making, i.e., the Iowa Gambling Task (IGT), in their web browser at a time and location of their choice without supervision. RESULTS Standard behavioral analysis revealed no differences in learning strategies between the two groups although qualitative differences could be observed in learning curves. However, computational modelling revealed that individuals with chronic pain were quicker to update their behavior relative to healthy controls, which reflected their increased learning rate (95% HDI from 0.66 to 0.99) when fitted with the VPP model. This result was further validated and extended on the ORL model because higher differences (95% HDI from 0.16 to 0.47) between the reward and punishment learning rates were observed when fitted on this model, indicating that chronic pain individuals were more sensitive to rewards. It was also found that they were less persistent in their choices during the IGT compared to controls, a fact reflected by their decreased outcome perseverance (95% HDI from -4.38 to -0.21) when fitted using the ORL model. Moreover, correlation analysis revealed that the estimated parameters had predictive value for the self-reported pain experiences, suggesting that the altered cognitive parameters could be potential candidates for inclusion in chronic pain assessments. CONCLUSIONS We found that individuals with chronic pain were more driven by rewards and less consistent when making decisions in our lab-in-the-field experiment. In this case study, it was demonstrated that compared to standard statistical summaries of behavioral performance, computational approaches offered superior ability to resolve, understand and explain the differences in decision-making behavior in the context of chronic pain outside the lab.
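The abstract reports 95% HDIs (highest density intervals) but does not say how they were computed. One common sample-based approach, sketched below under that caveat, takes the narrowest window that contains the requested share of posterior draws. The function name and the simulated draws are assumptions for the example and do not reproduce the study's posteriors.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(mass * n)  # number of draws the interval must cover
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    widths = [(xs[i + k - 1] - xs[i], i) for i in range(n - k + 1)]
    _, start = min(widths)
    return xs[start], xs[start + k - 1]


# Hypothetical posterior draws for a group difference in some parameter.
rng = random.Random(0)
draws = [rng.gauss(0.3, 0.08) for _ in range(4000)]
lo, hi = hdi(draws)
print(f"95% HDI from {lo:.2f} to {hi:.2f}")
```

An interval that excludes zero, like those quoted in the abstract, is typically read as evidence of a credible group difference in that parameter.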


2021 ◽  
Vol 34 (1) ◽  
Author(s):  
Evandro Morais Peixoto ◽  
Daniela Sacramento Zanini ◽  
Josemberg Moura de Andrade

Abstract Background The Kessler Distress Scale (K10) is a self-report scale for the assessment of non-specific psychological distress in the general and clinical population. Because of its ease of application and good psychometric properties, the K10 has been adapted to several cultures. The present study seeks to adapt the K10 to Brazilian Portuguese and estimate its validity evidence and reliability. Methods A total of 1914 individuals from the general population participated in the study (age = 34.88, SD = 13.61, 77.7% female). The adjustment indices were compared among three different measurement models proposed for the K10 through confirmatory factor analysis (CFA). The items’ properties were analyzed by Andrich’s Rating Scale Model (RSM). Furthermore, evidence based on relations to other variables (depression, stress, anxiety, positive and negative affects, and satisfaction with life) was estimated. Results CFA indicated the adequacy of the bifactor model (CFI = 0.985; TLI = 0.973; SMR = 0.019; RMSEA = 0.050), composed of two specific factors (depression and anxiety) and one general factor (psychological distress), corresponding to the theoretical hypothesis. Additionally, multiple-group invariance by gender and age range was observed. The RSM provided an understanding of the organization of the continuum represented by the psychological distress construct (item difficulty), which varied from −0.89 to 1.00; good adjustment indexes; infit between 0.67 and 1.32; outfit between 0.68 and 1.34; and desirable reliability, α = 0.87. Lastly, theoretically coherent associations with the external variables were observed. Conclusions It is concluded that the Brazilian version of the K10 is a suitable measure of psychological distress for the Brazilian population.


Author(s):  
Marco Fabbri ◽  
Alessia Beracci ◽  
Monica Martoni ◽  
Debora Meneo ◽  
Lorenzo Tonetti ◽  
...  

Sleep quality is an important clinical construct since it is increasingly common for people to complain about poor sleep quality and its impact on daytime functioning. Moreover, poor sleep quality can be an important symptom of many sleep and medical disorders. However, objective measures of sleep quality, such as polysomnography, are not readily available to most clinicians in their daily routine, and are expensive, time-consuming, and impractical for epidemiological and research studies. Several self-report questionnaires have, however, been developed. The present review aims to address their psychometric properties, construct validity, and factorial structure while presenting, comparing, and discussing the measurement properties of these sleep quality questionnaires. A systematic literature search, from 2008 to 2020, was performed using the electronic databases PubMed and Scopus, with predefined search terms. In total, 49 articles were analyzed from the 5734 articles found. The psychometric properties and factor structure of the following are reported: Pittsburgh Sleep Quality Index (PSQI), Athens Insomnia Scale (AIS), Insomnia Severity Index (ISI), Mini-Sleep Questionnaire (MSQ), Jenkins Sleep Scale (JSS), Leeds Sleep Evaluation Questionnaire (LSEQ), SLEEP-50 Questionnaire, and Epworth Sleepiness Scale (ESS). As the most frequently used subjective measurement of sleep quality, the PSQI reported good internal reliability and validity; however, different factorial structures were found in a variety of samples, casting doubt on the usefulness of total score in detecting poor and good sleepers. The sleep disorder scales (AIS, ISI, MSQ, JSS, LSEQ and SLEEP-50) reported good psychometric properties; nevertheless, AIS and ISI reported a variety of factorial models whereas LSEQ and SLEEP-50 appeared to be less useful for epidemiological and research settings due to the length of the questionnaires and their scoring. The MSQ and JSS seemed to be inexpensive and easy to administer, complete, and score, but further validation studies are needed. Finally, the ESS had good internal consistency and construct validity, while the main challenges were in its factorial structure, known-group difference and estimation of reliable cut-offs. Overall, the self-report questionnaires assessing sleep quality from different perspectives have good psychometric properties, with high internal consistency and test-retest reliability, as well as convergent/divergent validity with sleep, psychological, and socio-demographic variables. However, a clear definition of the factor model underlying the tools is recommended and reliable cut-off values should be indicated in order for clinicians to discriminate poor and good sleepers.

