Enhancing the Psychometric Properties of the Iowa Gambling Task Using Full Generative Modeling
The current study examined whether generative modeling could improve the psychometric properties of IGT metrics relative to the traditional two-stage summary approach. Across four models, we examined how different assumptions at the person level and the group level affected inference. Specifically, two person-level modeling approaches (summary score vs. ORL computational model) were “crossed” with two group-level modeling approaches (two-stage approach vs. full generative modeling across both testing sessions) to create four models of increasing complexity (see Fig 1). Model 1 relies on the two-stage summary approach that is conventionally applied in studies of the IGT. Model 2 is a generative version of Model 1 that jointly estimates person-level summary scores (probabilities of choosing good versus bad decks) across both testing sessions while simultaneously estimating the test-retest correlation. Thus, Model 2 accounts for uncertainty in the person-level estimates that Model 1 ignores, but estimates a person-level metric analogous to that of Model 1. Model 3 estimates the person-level ORL parameters independently within each testing session and then estimates the test-retest correlation for each model parameter using a two-stage approach. Model 4 estimates the person-level ORL parameters jointly across both testing sessions while simultaneously estimating the test-retest correlation for each parameter. Thus, Model 4 estimates the same person-level metrics (ORL parameters) as Model 3 but accounts for uncertainty in the person-level estimates.
Our overarching hypothesis was that two changes would each yield behavioral estimates with greater utility for individual differences research: the use of a more theoretically informative person-level model (i.e., going from Model 1 to Model 3, and from Model 2 to Model 4) and the use of generative models to jointly estimate person-level parameters and their test-retest correlations (i.e., going from Model 1 to Model 2, and from Model 3 to Model 4). More specifically, we predicted that the behavioral estimates from Model 4 would have the highest test-retest reliability. Further, we made the general prediction that the Model 4 estimates would show improved construct validity in relation to an a priori set of trait and state self-report measures commonly associated with IGT performance, as well as measures of internalizing symptoms; however, this set of analyses was largely exploratory, as no particular associations between the ORL parameters and self-report measures were specified.
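To make the motivation for jointly modeling both sessions concrete, the following Python sketch simulates a two-stage summary-score analysis in the spirit of Model 1 and shows how trial-level sampling noise can attenuate the observed test-retest correlation. It is an illustration under stated assumptions, not the study's implementation: the logistic link, the bivariate normal latent traits, the function name `simulate_two_stage`, and all default values (sample size, trial count, true correlation) are hypothetical choices made only for this example.

```python
import numpy as np


def simulate_two_stage(n_subjects=500, n_trials=20, latent_r=0.8, seed=0):
    """Simulate a two-stage summary-score test-retest analysis.

    All defaults are hypothetical illustration values, not study settings.
    Returns (true_r, observed_r): the correlation between the true
    person-level probabilities across sessions, and the correlation
    between the noisy summary scores computed from simulated choices.
    """
    rng = np.random.default_rng(seed)

    # Assumed latent traits: one value per person per session, drawn from
    # a bivariate normal with correlation `latent_r` between sessions.
    cov = np.array([[1.0, latent_r], [latent_r, 1.0]])
    latent = rng.multivariate_normal(np.zeros(2), cov, size=n_subjects)

    # Assumed logistic link: true probability of choosing a good deck.
    p_good = 1.0 / (1.0 + np.exp(-latent))  # shape (n_subjects, 2)

    # Stage 1: summary score = observed proportion of good-deck choices.
    good_choices = rng.binomial(n_trials, p_good)
    scores = good_choices / n_trials

    # Stage 2: Pearson correlation of the summary scores across sessions,
    # treating each score as if it were measured without error.
    observed_r = np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]
    true_r = np.corrcoef(p_good[:, 0], p_good[:, 1])[0, 1]
    return true_r, observed_r


if __name__ == "__main__":
    true_r, observed_r = simulate_two_stage()
    print(f"correlation of true probabilities: {true_r:.3f}")
    print(f"two-stage summary-score correlation: {observed_r:.3f}")
```

With few trials per person, the summary scores carry substantial binomial noise, so the second-stage correlation tends to fall below the correlation of the underlying probabilities. A generative model in the spirit of Model 2 would instead treat the choices as data and estimate the person-level probabilities and their cross-session correlation in a single step, which is the uncertainty propagation that the two-stage approach omits.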