Pre-Statistical Harmonization of Behavioral Instruments Across Eight Surveys and Trials
Abstract

Data harmonization methods facilitate further use of existing studies and research resources. Most statistical harmonization methods require pooling data across studies, which is complex and requires careful scrutiny of source data. Many such methods (e.g., item response theory) also require datasets to share common items that link a common construct across studies, which necessitates the qualitative process of pre-statistical harmonization. Here, we document pre-statistical harmonization of items measuring behavioral and psychological symptoms (e.g., agitation, wandering) that represent problematic behaviors among people with dementia, drawn from a national survey (ADAMS), evaluations conducted at Alzheimer’s Disease Research Centers (NACC), and six randomized trials (COPE, TAP, ALZQOL, ACT, REACH, ADSPlus). We describe our approach to reviewing question content and scoring procedures to establish comparability across items prior to data pooling. We identified 327 items from 15 instruments across these eight studies and found considerable cross-study heterogeneity in administration and coding procedures for items measuring the same domain. For example, eight items were coded as count variables in some studies but as categorical variables in others. Moreover, 191 items were conditionally dependent on the value of another item. These issues of item response heterogeneity and conditional dependency had to be resolved before item response theory models could be estimated for statistical co-calibration. We applied several rigorous data transformation procedures to address these issues, including re-coding and winsorization. This study provides guidelines for how future research can acknowledge and address similar issues when pooling behavioral and related instruments.
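To make the two transformation procedures named above concrete, the sketch below shows one way a count item could be winsorized and then re-coded into the ordinal categories used by another study. This is a minimal illustration, not the authors' actual pipeline: the item, the cap, and the category cut-points are all hypothetical.

```python
# Hypothetical harmonization of a single count item across studies:
# (1) winsorize extreme counts, (2) re-code counts into ordinal
# categories matching another study's response scale.
# Cap and cut-points below are illustrative only.

def winsorize_counts(counts, cap):
    """Cap count responses at `cap` to limit the influence of extreme values."""
    return [min(c, cap) for c in counts]

def recode_to_categories(counts, cuts=(0, 2, 6)):
    """Map counts to ordinal levels: 0 -> 0, 1-2 -> 1, 3-6 -> 2, 7+ -> 3."""
    def category(c):
        for level, upper in enumerate(cuts):
            if c <= upper:
                return level
        return len(cuts)
    return [category(c) for c in counts]

# Example: weekly frequency of a behavior reported as a raw count in one
# study but on an ordinal scale elsewhere.
raw = [0, 1, 3, 8, 30]
capped = winsorize_counts(raw, cap=14)   # [0, 1, 3, 8, 14]
coded = recode_to_categories(capped)     # [0, 1, 2, 3, 3]
```

After such transformations, items from different studies share a common response format, which is a precondition for the item response theory linking the abstract describes.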