Measuring Emotional Response: Comparing Alternative Approaches to Measurement

2015 ◽  
Vol 5 (4) ◽  
pp. 733-754 ◽  
Author(s):  
George E. Marcus ◽  
W. Russell Neuman ◽  
Michael B. MacKuen

Scholarly interest in the role of emotion in accounting for how people react to political figures, events, and messages has escalated over the past two-plus decades in political science and psychology. However, research on the validity of subjective self-report measures of emotional response is rather limited. We introduce here a new measurement approach, a “slider” format, and compare it with the long-used “radio button” item format. We assess the reliability and validity of these two approaches to the measurement of affect. The study examines self-report measures of emotion in response to three generated news stories about terrorist threats. We report that both measurement formats are able to extract the expected threefold affect structure from a ten-word affect battery. The slider format, however, is modestly more reliable and more efficient in time to complete, can limit missing data, and generates continuous data that are less truncated than data derived from the radio button format. Finally, we report on three tests of construct validity. Both approaches exhibit equivalent results on two of those tests. However, the radio button format does poorly on one test of construct validity, namely the anticipated relationship between anxiety and interest in novel information.
We present an assessment of two methods for measuring emotional reactions to stimuli such as political issues, political figures, or events. Both methods are suitable for use in online surveys or computer-driven experiments. The first, traditional method uses labeled “radio buttons” that let a study participant select, by clicking, one of an array of typically five response options, ranging from lower to higher levels of some identified affect term (e.g., how angry one might feel). The second, the slider method, lets a participant move an “arrow” up or down to indicate how much (up) or how little (down) they feel.
The goal of both measures is to ascertain the level of a targeted emotion, i.e., how little or how much of, say, anger a participant feels. The slider method has been specifically developed for participants using a computer. The slider approach falls within the category of visual analog scales. This method for measuring affective responses to stimuli of whatever sort has not hitherto been examined to determine its reliability and validity. The literature on the reliability and validity of these measurement strategies is thin, and we found no studies that include an explicit comparison.


1993 ◽  
Vol 72 (1) ◽  
pp. 271-275 ◽  
Author(s):  
Diane L. Ostrander ◽  
Cindi Penor Ceglian ◽  
David G. Fournier

The purpose of this study was to evaluate the Stressors of Clergy Children and Couples Scale, an abbreviated version of the Stressors of Clergy Children Inventory. The current self-report survey was tested using 317 ministers' family members across several denominations. Tests for internal consistency reliability and construct validity indicated this scale was adequate for use in research. Recommendations for refinement and use of the scale were presented.



1996 ◽  
Vol 169 (3) ◽  
pp. 276-281 ◽  
Author(s):  
Andrew J. Baillie ◽  
Richard P. Mattick

Background: A short self-report questionnaire was developed to assess dependence on benzodiazepines (BZDs), the Benzodiazepine Dependence Questionnaire (BDEPQ). The BDEPQ is the first scale to assess dependence on BZDs comprehensively, as all existing scales focus exclusively on withdrawal symptoms. Method: To evaluate its internal consistency and construct validity, 302 regular BZD users were recruited from media advertisements and assessed on a number of measures. The BDEPQ was compared with measures of depression, anxiety, sleep quality, BZD withdrawal symptoms and neuroticism to assess its construct validity. A 3–4 month follow-up was conducted to assess the ability of the BDEPQ to predict changes in BZD consumption and future BZD withdrawal. Results: The BDEPQ was found to have high internal consistency and to be relatively stable over the follow-up period. Three subscales were identified, each with good internal consistency and temporal stability. The BDEPQ was able to predict the severity of withdrawal symptoms. Conclusion: The BDEPQ was found to be a reliable and valid self-report instrument for the assessment of BZD dependence in samples approximating the general population of people using BZDs.



Author(s):  
Rod A. Martin

Abstract. I applaud Heintz and Ruch (this issue) for their efforts at evaluating the construct validity of the Humor Styles Questionnaire (HSQ). However, their study has several weaknesses that limit the conclusions that can be drawn. Most importantly, the measures developed by these authors to test the validity of the HSQ have questionable reliability and validity themselves, due to the challenges inherent in measuring humor styles via self-report. These limitations are discussed, and suggestions are provided for further research along these lines.



2018 ◽  
Vol 37 (1) ◽  
pp. 119-132 ◽  
Author(s):  
Oriol J. Bosch ◽  
Melanie Revilla ◽  
Anna DeCastellarnau ◽  
Wiebke Weber

Little is known about reliability and validity in web surveys, although this information is crucial for evaluating how accurate the results might be and/or for correcting for measurement errors. In particular, there are few studies based on probability-based samples for web surveys, looking at web-specific response scales and considering the impact of having smartphone respondents. In this article, we start filling these gaps by estimating the measurement quality of sliders compared to radio button scales, controlling for the device respondents used. We therefore conducted two multitrait–multimethod (MTMM) experiments in the Norwegian Citizen Panel (NCP), a probability-based online panel. Overall, we find that if smartphone respondents represent a nonnegligible part of the whole sample, offering the response options in the form of a slider or a radio button scale leads to quite similar measurement quality. This means that sliders could be used more often without harming the data quality. Besides, if there are no smartphone respondents, we find that sliders can also be used, but that the marker should be placed initially in the middle rather than on the left side. However, in practice, there is no need to shift from radio buttons to sliders since the quality is not highly improved by providing sliders.



2018 ◽  
Author(s):  
Shelby Ceh ◽  
Eric John Vanman

Social robots will soon be a part of our everyday work and home lives. Roboticists have assumed that a more human-like appearance in robots will ensure pro-sociality and cooperation between robots and humans. Indeed, people more readily attribute human-like capacities and a perception of mind to such robots. In turn, people show greater empathy for human-like robots. However, a human-like design has also been linked to beliefs about the potential of robots to be evil and cause damage to humanity. The aim of this research was to further investigate these ambivalent emotional reactions. In two studies, participants viewed images of robots of varying human likeness that were also depicted as the main character in a sad situation. Participants rated their sympathy for the robot but also their perceptions of threat. Facial electromyography (EMG) was added in the second study to measure empathic facial responses. We predicted that human likeness would have effects on both empathy and threat responses. We also tested whether these effects would be moderated by social categorization and threat information. Study 1 (n = 201) manipulated the categorization of the robot through changes in the phrasing of the scenarios, whereas Study 2 (n = 120) manipulated threat information about robots by first presenting positive or negative news stories about robots. Consistent with previous research, more human-like robots elicited greater empathic and threat responses, both on self-report and EMG measures. Manipulations of social categorization and threat failed to moderate the effects of human likeness on the measures. These results have considerable implications for the existing human-robot interaction literature and could help inform social robot designs for the future.



2018 ◽  
Vol 02 (06) ◽  
pp. E191-E199 ◽  
Author(s):  
Shaw Bronner ◽  
Igor Urbano

Abstract. The Dance Functional Outcome Survey (DFOS) was developed as a self-report questionnaire for healthy and injured ballet and modern dancers, focusing on the low back and lower extremities. Our aim was to examine the factor structure and internal consistency of the 16 items and to investigate test-retest and equivalence reliability and validity of the DFOS compared to three orthopedic outcomes instruments. Data were collected from 80 healthy and injured adult ballet and modern pre-professional and professional dancers. DFOS Likert-type and visual analog scales were completed twice within 4–9 days to study test-retest reliability. The Cincinnati Knee Rating System, Olerud and Molander Foot-Ankle Questionnaire, and Oswestry Disability Index were used to assess concurrent validity using intraclass correlation coefficients in SPSS, p<0.05. To determine instrument dimensions and internal consistency of the items, we conducted exploratory factor analysis and calculated Cronbach’s α in JASP. DFOS demonstrated single factor loading and high Cronbach’s α; high test-retest repeatability and equivalence reliability (r=0.74–0.99) and acceptable criterion validity compared to the orthopedic outcomes instruments (r≥0.67). These results support further study of a revised 14-item Likert-version DFOS for repeatability, validity and responsiveness.



2001 ◽  
Vol 16 (1) ◽  
pp. 38-48 ◽  
Author(s):  
M. Fichter ◽  
N. Quadflieg

Objective. For reliable and valid assessment and diagnostic categorization of eating disorders, self-report measures have considerable limitations. A semi-structured interview – the SIAB-EX – was developed for a more reliable and valid assessment of eating disorders. Methods. One study (videotapes of 31 inpatients, seven raters) was made to establish inter-rater reliability; in another study with 80 patients the SIAB-EX was compared to another semi-structured interview designed for comparable purposes (EDE). In a third study data was obtained on 377 eating disorder patients seeking treatment to explore discriminant and convergent (construct) validity using the following self-rating scales: EDI, TFEQ, SCL-90, BDI, and the PERI Demoralization Scale. Results. Inter-rater reliability of dichotomous ratings was good with mean kappa values of .81 (current) and .85 (past). Comparison of the SIAB-EX with the EDE generally showed quite similar results and higher intercorrelation of the total scale (.77). There are, however, a number of differences between the two scales, which are discussed in detail. Construct validity of the SIAB-EX was established. Conclusion. Inter-rater reliability was good. Convergent and discriminant (construct) validity of the SIAB-EX was demonstrated. The constructs assessed by the SIAB and its subscales and items are discussed in the context of their correlations with other well-known scales.



2018 ◽  
Vol 35 (3) ◽  
pp. 376-399 ◽  
Author(s):  
Tan-Chyuan Chin ◽  
Eduardo Coutinho ◽  
Klaus R. Scherer ◽  
Nikki S. Rickard

Music engagement is complex and is influenced by music training, capacity, preferences, and motivations. A multi-modular self-report instrument (the Music Use and Background Questionnaire, or MUSEBAQ) was developed to measure a diverse set of music engagement constructs. Based on earlier work, a hybrid approach of exploratory and confirmatory analyses was conducted across a series of three independent studies to establish reliability and validity of the modular tool. Module 1 (Musicianship) provides a brief assessment of formal and informal music knowledge and practice. Module 2 (Musical capacity) measures emotional sensitivity to music, listening sophistication, music memory and imagery, and personal commitment to music. Module 3 (Music preferences) captures preferences from six broad genres and utilizes adaptive reasoning to selectively expand subgenres when administered online. Module 4 (Motivations for music use) assesses musical transcendence, emotion regulation, social, and musical identity and expression. The MUSEBAQ offers researchers and practitioners a comprehensive, modular instrument that can be used in whole, or by module as required to capture an individual’s level of engagement with music and to serve as a background questionnaire to measure and interpret the effects of dispositional differences in emotional reactions to music.



2016 ◽  
Vol 24 (1) ◽  
pp. 108-130
Author(s):  
Carol L. Lawrence ◽  
Anne E. Norris

Background and Purpose: The purpose of this research was to evaluate the psychometric properties of a new instrument to measure mother–infant togetherness, Mother–Infant Togetherness Survey (MITS). Methods: Stage 1 examined content validity. Stage 2 pretested the readability and understandability and further examined content validity. Stage 3 examined women’s ability to accurately self-report on the Delivery Events subscale. Stages 4 and 5 examined construct validity. Results: Good content validity was obtained at the scale/subscale level (CVI = .91–1.00). Internal consistency reliability was evaluated at the scale/subscale level (α = .62–.89). Construct validity was supported with known groups testing and factor analysis. Conclusion: Study findings provide support for the reliability and validity of the MITS. Future research should be done to improve the internal consistency reliability of the Postpartum Events subscale.



2019 ◽  
Vol 35 (1) ◽  
pp. 117-125
Author(s):  
Johannes Schult ◽  
Rebecca Schneider ◽  
Jörn R. Sparfeldt

Abstract. The need for efficient personality inventories has led to the wide use of short instruments. The corresponding items often contain multiple, potentially conflicting descriptors within one item. In Study 1 (N = 198 university students), the reliability and validity of the TIPI (Ten-Item Personality Inventory) was compared with the reliability and validity of a modified TIPI based on items that rephrased each two-descriptor item into two single-descriptor items. In Study 2 (N = 268 university students), we administered the BFI-10 (Big Five Inventory short version) and a similarly modified version of the BFI-10 without two-descriptor items. In both studies, reliability and construct validity values occasionally improved for separated multi-descriptor items. The inventories with multi-descriptor items showed shortcomings in some factors of the TIPI and the BFI-10. However, the other scales worked comparably well in the original and modified inventories. The limitations of short personality inventories with multi-descriptor items are discussed.


