How Survey Scoring Decisions Can Bias Your Study’s Results: A Trip Through the IRT Looking Glass

2021 ◽  
Author(s):  
James Soland ◽  
Megan Kuhfeld

Though much effort is often put into designing psychological studies, the measurement model and scoring approach employed are often an afterthought, especially when short survey scales are used (Flake & Fried, 2020). One possible reason that measurement gets downplayed is that there is generally little understanding of how calibration/scoring approaches could impact common estimands of interest, including treatment effect estimates, beyond random noise due to measurement error. Another possible reason is that the process of scoring is complicated, involving selecting a suitable measurement model, calibrating its parameters, and then deciding how to generate a score, all steps that occur before the score is even used to examine the desired psychological phenomenon. In this study, we provide three motivating examples where surveys are used to understand individuals’ underlying social-emotional and/or personality constructs to demonstrate the potential consequences of measurement/scoring decisions. These examples also let us walk through the different measurement decision stages and, hopefully, begin to demystify them. As we show in our analyses, the decisions researchers make about how to calibrate and score the survey used have consequences that are often overlooked, with likely implications both for conclusions drawn from individual psychological studies and for replications of those studies.
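The calibration-then-scoring pipeline the abstract describes can be made concrete with a small sketch. The 2PL item parameters and response patterns below are invented for illustration (they are not from the study); the point is that two respondents with the same raw sum score can receive different model-based (EAP) scores, so the scoring decision is not innocuous:

```python
import math

# Hypothetical 2PL item parameters (a = discrimination, b = difficulty);
# illustrative values only, not estimates from any real scale.
items = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def eap_score(responses, items):
    """Expected a posteriori (EAP) trait estimate under a standard-normal
    prior, computed on a simple quadrature grid."""
    grid = [-4.0 + 0.1 * k for k in range(81)]
    num = den = 0.0
    for theta in grid:
        like = math.exp(-0.5 * theta * theta)   # unnormalized N(0, 1) prior
        for r, (a, b) in zip(responses, items):
            p = p_endorse(theta, a, b)
            like *= p if r == 1 else (1.0 - p)
        num += theta * like
        den += like
    return num / den

# Two response patterns with the SAME sum score (2 of 3 items endorsed)
# get different IRT scores, because the items carry different information.
pat1 = [1, 1, 0]   # endorses the two easier items
pat2 = [0, 1, 1]   # endorses the two harder items
print(sum(pat1) == sum(pat2))            # identical raw sum scores
print(round(eap_score(pat1, items), 3))
print(round(eap_score(pat2, items), 3))  # higher than pat1's EAP score
```

Sum scoring would treat these two respondents as identical; an IRT-based score does not.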

2018 ◽  
Vol 37 (2) ◽  
pp. 232-256 ◽  
Author(s):  
Bradley C. Smith ◽  
William Spaniel

The causes and consequences of nuclear proficiency are central to important questions in international relations. At present, researchers tend to use observable characteristics as a proxy. However, aggregation is a problem: existing measures implicitly assume that each indicator is equally informative and that measurement error is not a concern. We overcome these issues by applying a statistical measurement model to directly estimate nuclear proficiency from observed indicators. The resulting estimates form a new dataset on nuclear proficiency which we call ν-CLEAR. We demonstrate that these estimates are consistent with known patterns of nuclear proficiency while also uncovering more nuance than existing measures. Additionally, we demonstrate how scholars can use these estimates to account for measurement error by revisiting existing results with our measure.


1981 ◽  
Vol 18 (1) ◽  
pp. 39-50 ◽  
Author(s):  
Claes Fornell ◽  
David F. Larcker

The statistical tests used in the analysis of structural equation models with unobservable variables and measurement error are examined. A drawback of the commonly applied chi-square test, in addition to the known problems related to sample size and power, is that it may indicate an increasing correspondence between the hypothesized model and the observed data even as both the measurement properties and the relationships between constructs decline. Further, and contrary to common assertion, the risk of making a Type II error can be substantial even when the sample size is large. Moreover, the present testing methods are unable to assess a model's explanatory power. To overcome these problems, the authors develop and apply a testing system based on measures of shared variance within the structural model, the measurement model, and the overall model.
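The shared-variance measures underlying this testing system, average variance extracted (AVE) and composite reliability, can be computed directly from standardized factor loadings. A minimal sketch with made-up loadings (the formulas assume standardized items, so each error variance is one minus the squared loading):

```python
# Illustrative loadings for a three-indicator construct; invented values.

def ave(loadings):
    """Average variance extracted: mean squared standardized loading."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error
    variances), with error variance 1 - loading^2 for standardized items."""
    s = sum(loadings)
    err = sum(1.0 - l * l for l in loadings)
    return s * s / (s * s + err)

loadings = [0.7, 0.8, 0.75]
print(round(ave(loadings), 3))                    # 0.564
print(round(composite_reliability(loadings), 3))  # 0.795
```

The usual rule of thumb derived from this paper is that a construct's AVE should exceed 0.5 and exceed its squared correlations with other constructs.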


2017 ◽  
Author(s):  
Thomas Scheff

A Theory of War and Violence (first section). Thomas Scheff, G. Reginald Daniel, and Joseph Loe-Sterphone, Dept. of Sociology, UCSB (9,260 words total)

Abstract: It is possible that war in modern societies is largely driven by emotions, but in a way that is almost completely hidden. Modernity individualizes the self and tends to ignore emotions. As a result, conflict can be caused by sequences in which the total hiding of humiliation leads to vengeance. This essay outlines a theory of the social-emotional world implied in the work of C. H. Cooley and others. Cooley’s concept of the “looking-glass self” can be used as an antidote to the assumptions of modernity: the basic self is social and emotional; selves are based on “living in the mind” of others, with the result of feeling either pride or shame. Cooley discusses shame at some length, unlike most approaches, which tend to hide it. This essay proposes that the complete hiding of shame can lead to feedback loops (spirals) with no natural limit: shame about shame and anger is only the first step. Emotion backlogs can feed back when emotional experiences are completely hidden: avoiding all pain can lead to limitless spirals. These ideas may help explain the role of France in causing WWI, and Hitler’s rise to power in Germany. To the extent that these propositions are true, the part played by emotions, and especially shame, in causing wars needs to be further studied.

“...if a whole nation were to feel ashamed it would be like a lion recoiling in order to spring.” Karl Marx (1975, p. 200)


2012 ◽  
Vol 239-240 ◽  
pp. 735-738 ◽  
Author(s):  
Bin Feng ◽  
Xiang Zhong Meng

In order to establish a larger sensing area (10 m × 10 m) and achieve higher precision in the measurement system during barrel weapon tests of vertical target dispersion, an optical detector is added to the acoustic array to form a new measurement model, which offers not only a large sensing area but also higher precision. The paper presents the measurement formula of the model and provides an analysis of the measurement error. The simulation results show that the errors of the x and y coordinates are 5 mm and 15 mm, respectively.


2017 ◽  
Vol 28 (7) ◽  
pp. 2049-2068 ◽  
Author(s):  
Di Shu ◽  
Grace Y Yi

Inverse probability weighting estimation is widely used to consistently estimate the average treatment effect. Its validity, however, is challenged by the presence of error-prone variables. In this paper, we explore inverse probability weighting estimation with mismeasured outcome variables. We study the impact of measurement error for both continuous and discrete outcome variables and reveal interesting consequences of the naive analysis that ignores measurement error. When a continuous outcome variable is mismeasured under an additive measurement error model, the naive analysis may still yield a consistent estimator; when the outcome is binary, we derive the asymptotic bias in closed form. Furthermore, we develop consistent estimation procedures for practical scenarios where either validation data or replicates are available. With validation data, we propose an efficient method for estimating the average treatment effect; the efficiency gain is substantial relative to usual methods of using validation data. To provide protection against model misspecification, we further propose a doubly robust estimator that is consistent even when either the treatment model or the outcome model is misspecified. Simulation studies are reported to assess the performance of the proposed methods. An application to a smoking cessation dataset is presented.
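A small simulation can illustrate the binary-outcome result: under nondifferential misclassification with sensitivity se and specificity sp, the naive IPW estimate converges to (se + sp - 1) times the true average treatment effect. Everything below (the propensity model, outcome model, and error rates) is an invented toy setup, not the paper's smoking cessation data:

```python
import random

random.seed(0)

def ipw_ate(data):
    """Horvitz-Thompson inverse probability weighting estimate of the
    average treatment effect; data holds (treatment, outcome, propensity)."""
    n = len(data)
    mean1 = sum(a * y / e for a, y, e in data) / n
    mean0 = sum((1 - a) * y / (1 - e) for a, y, e in data) / n
    return mean1 - mean0

# Toy data-generating process: confounder X, known true propensity,
# binary outcome Y observed through a misclassified surrogate Y*.
n = 200_000
true_ate, se, sp = 0.3, 0.90, 0.85       # sensitivity / specificity of Y*
data_true, data_obs = [], []
for _ in range(n):
    x = random.random()                   # confounder
    e = 0.3 + 0.4 * x                     # true propensity score
    a = 1 if random.random() < e else 0   # treatment assignment
    y = 1 if random.random() < 0.2 + true_ate * a + 0.2 * x else 0
    # Misclassify: P(Y*=1 | Y=1) = se, P(Y*=1 | Y=0) = 1 - sp
    y_obs = (random.random() < se) if y else (random.random() >= sp)
    data_true.append((a, y, e))
    data_obs.append((a, int(y_obs), e))

print(round(ipw_ate(data_true), 3))  # close to the true ATE of 0.3
print(round(ipw_ate(data_obs), 3))   # attenuated toward (se + sp - 1) * 0.3 = 0.225
```

With a continuous outcome and purely additive error, the same naive estimator would remain consistent, which is the contrast the abstract draws.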


2019 ◽  
Author(s):  
Alejandra Rodríguez Sánchez

Evidence of social inequalities in cognitive abilities in early childhood has been documented in many societies; however, three characteristics of the data used to measure cognitive constructs make it difficult to quantify inequalities across groups. First, a causal understanding of validity is not compatible with the standard validation framework, which forces researchers to think critically about what it means to measure unobserved constructs. Second, test scores provide only ordinal information about individuals; they are not interval scales, and they require suitable methods for their study. Third, a lack of measurement invariance, itself a form of measurement error, may make comparisons of test scores across groups invalid. The paper explores these three data problems as they apply to standardized tests---one mathematics and two language assessments---taken by a cohort of German children. The paper proposes a comparative validation framework for researchers based on nonparametric psychometric models and the representational theory of measurement. This framework can help researchers determine whether data fit the assumptions of a measurement model, check for various forms of measurement error, and overcome potential issues. A comparison of competing statistical modeling alternatives reveals substantial differences: by conceptualizing ability as ordinal instead of interval and excluding items that do not fit the assumptions of measurement models, I find a reduction in effect sizes for typical covariates studied in social stratification research.
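The ordinal-versus-interval point can be illustrated with a toy example (the scores below are invented, not the German cohort data): a standardized mean difference treats the score scale as interval and changes under a monotone rescaling of the scores, while a rank-based effect size does not:

```python
import math

# Made-up test scores for two groups; group_b tends to score higher.
group_a = [1, 1, 2, 3, 5]
group_b = [2, 4, 5, 5, 6]

def cohens_d(x, y):
    """Standardized mean difference (pooled SD); treats scores as interval."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    vx = sum((v - mx) ** 2 for v in x) / (len(x) - 1)
    vy = sum((v - my) ** 2 for v in y) / (len(y) - 1)
    return (my - mx) / math.sqrt((vx + vy) / 2)

def p_superiority(x, y):
    """P(Y > X) + 0.5 * P(Y = X): a rank-based (ordinal) effect size."""
    wins = sum((yi > xi) + 0.5 * (yi == xi) for xi in x for yi in y)
    return wins / (len(x) * len(y))

# A monotone rescaling preserves every ordinal comparison between
# individuals but alters the interval-scale effect size.
exp_a = [math.exp(v) for v in group_a]
exp_b = [math.exp(v) for v in group_b]

print(round(cohens_d(group_a, group_b), 3))  # interval-scale effect size
print(round(cohens_d(exp_a, exp_b), 3))      # changes under the rescaling
print(p_superiority(group_a, group_b))       # 0.82
print(p_superiority(exp_a, exp_b))           # unchanged: 0.82
```

Since test scores only license ordinal comparisons, group gaps reported as standardized mean differences depend on an arbitrary choice of scale, which is the paper's motivation for rank-based methods.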

