A robust confidence–accuracy dissociation via criterion attraction

2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Dobromir Rahnev

Abstract Many studies have shown that confidence and accuracy can be dissociated in a variety of tasks. However, most of these dissociations involve small effect sizes, occur only in a subset of participants, and include a reaction time (RT) confound. Here, I develop a new method for inducing confidence–accuracy dissociations that overcomes these limitations. The method uses an external noise manipulation and relies on the phenomenon of criterion attraction where criteria for different tasks become attracted to each other. Subjects judged the identity of stimuli generated with either low or high external noise. The results showed that the two conditions were matched on accuracy and RT but produced a large difference in confidence (effect appeared for 25 of 26 participants, effect size: Cohen’s d = 1.9). Computational modeling confirmed that these results are consistent with a mechanism of criterion attraction. These findings establish a new method for creating conditions with large differences in confidence without differences in accuracy or RT. Unlike many previous studies, however, the current method does not lead to differences in subjective experience and instead produces robust confidence–accuracy dissociations by exploiting limitations in post-perceptual, cognitive processes.
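The core mechanism can be illustrated with a toy signal-detection simulation (a minimal sketch under assumed parameters, not the paper's actual computational model): two conditions matched on the ratio of signal mean to noise SD have identical accuracy, but a confidence criterion shared across conditions (i.e., "attracted" to a common value in raw evidence units) produces far more high-confidence responses under high external noise.

```python
import random

def simulate(mu, sigma, criterion, trials, rng):
    """Simulate one condition: evidence ~ N(+/-mu, sigma); respond by the sign of
    the evidence, and report high confidence when |evidence| exceeds a fixed criterion."""
    correct = high_conf = 0
    for _ in range(trials):
        target = rng.choice((-1, 1))
        evidence = rng.gauss(target * mu, sigma)
        correct += (evidence > 0) == (target > 0)
        high_conf += abs(evidence) > criterion
    return correct / trials, high_conf / trials

rng = random.Random(0)
trials = 100_000
# Low vs. high external noise, matched on mu/sigma -> identical expected accuracy.
# The confidence criterion (1.5) is shared across conditions: criterion attraction.
acc_low, conf_low = simulate(mu=1.0, sigma=1.0, criterion=1.5, trials=trials, rng=rng)
acc_high, conf_high = simulate(mu=2.0, sigma=2.0, criterion=1.5, trials=trials, rng=rng)
print(acc_low, acc_high)    # both near 0.84
print(conf_low, conf_high)  # high-noise condition yields many more high-confidence trials
```

Because the evidence distribution is wider under high noise while the criterion stays put, confidence dissociates from accuracy in the direction the abstract reports.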

2019 ◽  
Vol 3 (4) ◽  
Author(s):  
Christopher R Brydges

Abstract Background and Objectives Researchers typically use Cohen’s guidelines of Pearson’s r = .10, .30, and .50, and Cohen’s d = 0.20, 0.50, and 0.80 to interpret observed effect sizes as small, medium, or large, respectively. However, these guidelines were not based on quantitative estimates and are only recommended if field-specific estimates are unknown. This study investigated the distribution of effect sizes in both individual differences research and group differences research in gerontology to provide estimates of effect sizes in the field. Research Design and Methods Effect sizes (Pearson’s r, Cohen’s d, and Hedges’ g) were extracted from meta-analyses published in 10 top-ranked gerontology journals. The 25th, 50th, and 75th percentile ranks were calculated for Pearson’s r (individual differences) and Cohen’s d or Hedges’ g (group differences) values as indicators of small, medium, and large effects. A priori power analyses were conducted for sample size calculations given the observed effect size estimates. Results Effect sizes of Pearson’s r = .12, .20, and .32 for individual differences research and Hedges’ g = 0.16, 0.38, and 0.76 for group differences research were interpreted as small, medium, and large effects in gerontology. Discussion and Implications Cohen’s guidelines appear to overestimate effect sizes in gerontology. Researchers are encouraged to use Pearson’s r = .10, .20, and .30, and Cohen’s d or Hedges’ g = 0.15, 0.40, and 0.75 to interpret small, medium, and large effects in gerontology, and recruit larger samples.
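The study's two computations, percentile-based benchmarks and a priori sample sizes, can be sketched as follows (the effect-size values are hypothetical, and the power calculation uses the standard normal approximation for a two-sided two-sample t-test, not necessarily the authors' exact procedure):

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical |effect size| values extracted from meta-analyses.
effect_sizes = [0.05, 0.10, 0.15, 0.16, 0.20, 0.30, 0.38, 0.45, 0.60, 0.76, 0.90]

# 25th, 50th, and 75th percentiles as small/medium/large benchmarks.
small, medium, large = statistics.quantiles(effect_sizes, n=4)

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-sample t-test (normal approximation)."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

print(small, medium, large)
print(n_per_group(0.38))  # ~109 per group for the field-specific "medium" effect
```

Note the practical implication: powering for a field-specific medium effect of g = 0.38 requires roughly twice the sample needed for Cohen's conventional 0.50.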


2019 ◽  
Author(s):  
Farid Anvari ◽  
Daniel Lakens

Effect sizes are an important outcome of quantitative research, but few guidelines exist that explain how researchers can determine which effect sizes are meaningful. Psychologists often want to study effects that are large enough to make a difference to people’s subjective experience. Thus, subjective experience is one way to gauge meaningfulness of an effect. We illustrate how to quantify the minimum subjectively experienced difference—the smallest change in an outcome measure that individuals consider to be meaningful enough in their subjective experience such that they are willing to rate themselves as feeling different—using an anchor-based method with a global rating of change question applied to the positive and negative affect scale. For researchers interested in people’s subjective experiences, this anchor-based method provides one way to specify a smallest effect size of interest, which allows researchers to interpret observed results in terms of their theoretical and practical significance.
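In code, the anchor-based estimate reduces to averaging change scores within the "slightly changed" anchor categories of a global rating of change question (the data and category labels below are hypothetical illustrations, not the authors' materials):

```python
from statistics import mean

# Hypothetical (change_in_affect_score, global_rating_of_change) pairs.
responses = [
    (0.5, "no change"), (1.0, "a little better"), (1.5, "a little better"),
    (-1.2, "a little worse"), (4.0, "much better"), (0.2, "no change"),
]

# Minimum subjectively experienced difference: mean |change| among respondents
# who rate themselves as feeling just noticeably different.
msed = mean(abs(change) for change, anchor in responses
            if anchor in ("a little better", "a little worse"))
print(msed)
```

The resulting value can then serve as the smallest effect size of interest when interpreting observed results or planning studies.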


Blood ◽  
2004 ◽  
Vol 104 (11) ◽  
pp. 5305-5305
Author(s):  
Lee S. Schwartzberg ◽  
Barry V. Fortner ◽  
Arthur C. Houts

Abstract Background. Cancer-related anemia (CRA) is associated with increased symptom burden and reduced quality of life (QoL) compared to non-anemic subjects. Standard treatment guidelines for CRA recommend treatment when hemoglobin (Hb) declines to ≤ 10 g/dl. We compared patients with mild anemia to those with normal Hb levels to assess the clinical significance of mild anemia. Methods. 3416 patients from a large community oncology database, sorted by gender and Hb, were retrospectively reviewed. Patients receiving chemotherapy (<30 days) and/or growth factor (<60 days) were excluded. Each case provided one Hb determination and same-day self-reported scores on the Cancer Care Monitor (CCM), a validated measure of symptom burden, functioning, and health-related QoL comprising 6 scales. All CCM items are rated for symptom severity on a 0–10 scale. Effect sizes for male groups were calculated relative to normal males (Hb > 14), and effect sizes for females were calculated relative to normal females (Hb > 12). Positive effect size (Cohen’s exact d) values indicate greater symptom burden, and negative effect size values indicate lower quality of life. Results. Group differences on CCM measures were not accounted for by differences in demographics, cancer diagnosis, or chemotherapy history. Table 1 shows a pattern of greater symptom burden, lower functioning, and worse QoL for males and females with mild anemia (p < .05). Compared to standards for minimal clinically important differences such as those found with the SF-36 (0.09 < Cohen’s d < 0.28), patients with mild anemia showed clinically significant differences in terms of effect size (0.11 < Cohen’s d < 0.61). Conclusions. The QoL impact of mild CRA is significant, and failure to treat mild anemia may result in unnecessary symptom burden and noteworthy decrements in health-related quality of life.
Differences in QoL for Patients with Mild Anemia

Cancer Care Monitor Measure*   M Hb 12-<14 (n = 327)   M Hb >14 (n = 471)   Effect Size (M)   F Hb 10-<12 (n = 449)   F Hb >12 (n = 2169)   Effect Size (F)
Fatigue Item                   3.10 (0.16) b           2.27 (0.13) c        0.30              3.47 (0.13) b           2.65 (0.06) d         0.30
Physical Symptoms              49.54 (0.58) b          45.49 (0.48) c       0.38              49.32 (0.49) b          46.97 (0.22) d        0.22
General Distress               49.51 (0.57) b          47.07 (0.47) d       0.19              51.23 (0.49) c          49.85 (0.22) b        0.11
Despair                        50.74 (0.52) b          48.58 (0.43) c       0.25              50.37 (0.44) b          49.13 (0.19) c        0.14
Impaired Performance           55.55 (0.87) b          49.78 (0.70) c       0.61              55.02 (0.73) b          49.90 (0.32) c        0.48
Quality of Life                49.41 (0.79) b          53.46 (0.64) c       −0.45             48.95 (0.67) b          52.39 (0.29) c        −0.35

Note. *Mean (SE) for each item or scale. Adjusted means with different subscripts across rows are significantly different (p < .05) by Bryant-Paulson comparisons.
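The tabled effect sizes can be approximately recovered from the reported adjusted means, standard errors, and group sizes (a sketch assuming SD = SE × √n and a pooled SD; the exact "Cohen's exact d" computation used by the authors may differ slightly):

```python
import math

def cohens_d(m1, se1, n1, m2, se2, n2):
    """Cohen's d from group means and standard errors,
    recovering each group SD as SE * sqrt(n) and pooling."""
    sd1, sd2 = se1 * math.sqrt(n1), se2 * math.sqrt(n2)
    pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Fatigue item, males: mild anemia (Hb 12-<14) vs. normal (Hb > 14).
d = cohens_d(3.10, 0.16, 327, 2.27, 0.13, 471)
print(round(d, 2))  # close to the reported 0.30
```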


2006 ◽  
Vol 20 (3) ◽  
pp. 186-194 ◽  
Author(s):  
Susanne Mayr ◽  
Michael Niedeggen ◽  
Axel Buchner ◽  
Guido Orgs

Responding to a stimulus that previously had to be ignored is usually slowed down (the negative priming effect). This study investigates the reaction time and ERP correlates of the negative priming phenomenon in the auditory domain. Thirty participants had to categorize sounds as musical instruments or animal voices. Reaction times were slowed in the negative priming condition relative to two control conditions. This effect was stronger for slow reactions (above the intraindividual median) than for fast reactions (below the intraindividual median). ERP analysis revealed a parietally located negativity in the negative priming condition, compared to the control conditions, between 550 and 730 ms poststimulus. This replicates the findings of Mayr, Niedeggen, Buchner, and Pietrowsky (2003). The ERP correlate was more pronounced for slow trials (above the intraindividual median) than for fast trials (below the intraindividual median). The dependency of the negative priming effect size on reaction time level, found in both the reaction time and ERP analyses, is consistent with both the inhibition and the episodic retrieval accounts of negative priming. A methodological artifact explanation of this effect-size dependency is discussed and discarded.


2010 ◽  
Vol 24 (1) ◽  
pp. 1-6 ◽  
Author(s):  
Oscar H. Hernández ◽  
Muriel Vogel-Sprott

A missing stimulus task requires an immediate response to the omission of a regularly recurring stimulus. The task evokes a subclass of event-related potential known as the omitted stimulus potential (OSP), which reflects cognitive processes such as expectancy. The behavioral response to a missing stimulus is referred to as omitted stimulus reaction time (RT). This total RT measure is known to include cognitive and motor components. The cognitive component (premotor RT) is measured as the time from the missing stimulus until the onset of motor action. The motor RT component is measured as the time from the onset of muscle action until the completion of the response. Previous research showed that RT is faster to auditory than to visual stimuli, and that the premotor RT to a missing auditory stimulus is correlated with the duration of the OSP. Although this observation suggests that similar cognitive processes might underlie these two measures, no research has tested this possibility. If similar cognitive processes are involved in premotor RT and OSP duration, these two measures should also be correlated in the visual and somatosensory modalities, and the premotor RT to missing auditory stimuli should be fastest. This hypothesis was tested in 17 young male volunteers who performed a missing stimulus task: they were presented with trains of auditory, visual, and somatosensory stimuli while the OSP and RT measures were recorded. The results showed that premotor RT and OSP duration were consistently related, and that both measures were shorter for auditory stimuli than for visual or somatosensory stimuli. This provides the first evidence that premotor RT is related to an attribute of the OSP in all three sensory modalities.


Methodology ◽  
2019 ◽  
Vol 15 (3) ◽  
pp. 97-105
Author(s):  
Rodrigo Ferrer ◽  
Antonio Pardo

Abstract. In a recent paper, Ferrer and Pardo (2014) tested several distribution-based methods designed to assess when test scores obtained before and after an intervention reflect a statistically reliable change. However, we still do not know how these methods perform from the point of view of false negatives. For this purpose, we simulated change scenarios (different effect sizes in a pre-post-test design) with distributions of different shapes and with different sample sizes. For each simulated scenario, we generated 1,000 samples. In each sample, we recorded the false-negative rate of the five distribution-based methods with the best performance from the point of view of false positives. Our results revealed unacceptable rates of false negatives even for very large effects, ranging from 31.8% in an optimistic scenario (effect size of 2.0 and a normal distribution) to 99.9% in the worst scenario (effect size of 0.2 and a highly skewed distribution). Our results therefore suggest that the widely used distribution-based methods must be applied with caution in clinical contexts, because they need huge effect sizes to detect a true change. We also offer some considerations regarding the effect sizes and the commonly used cut-off points that allow these estimates to be made more precisely.
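The false-negative question can be sketched with a minimal Monte Carlo simulation, using the Jacobson-Truax reliable change index as an example distribution-based method (the reliability value, normal distributions, and per-case framing below are assumptions for illustration, not the authors' exact simulation design):

```python
import math
import random

def false_negative_rate(effect_size, reliability=0.8, n_cases=10_000, seed=0):
    """Proportion of cases with a true change of `effect_size` (in pretest-SD units)
    that the Jacobson-Truax reliable change index fails to flag (|RCI| < 1.96)."""
    rng = random.Random(seed)
    # SE of a difference score for unit pretest SD: sqrt(2 * (1 - reliability)).
    sdiff = math.sqrt(2 * (1 - reliability))
    misses = 0
    for _ in range(n_cases):
        observed = effect_size + rng.gauss(0, sdiff)  # true change plus measurement error
        if abs(observed) / sdiff < 1.96:  # RCI not significant: true change missed
            misses += 1
    return misses / n_cases

print(false_negative_rate(2.0))  # even a huge true effect is missed in a nontrivial share of cases
print(false_negative_rate(0.2))  # a small true effect is almost always missed
```

Lower reliability widens `sdiff` and pushes the detection threshold further out, which is why these methods demand such large effects to register a reliable change.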


2021 ◽  
pp. 174077452098487
Author(s):  
Brian Freed ◽  
Brian Williams ◽  
Xiaolu Situ ◽  
Victoria Landsman ◽  
Jeehyoung Kim ◽  
...  

Background: Blinding aims to minimize biases arising from what participants and investigators know or believe. Randomized controlled trials, despite being the gold standard for evaluating treatment effects, do not generally assess the success of blinding. We investigated the extent of blinding in back pain trials and the associations between participants’ guesses and treatment effects. Methods: We conducted a review using PubMed/Ovid Medline, 2000–2019. Eligibility criteria were back pain trials with data available on treatment effect and participants’ guesses of treatment. For blinding, the blinding index was used as a chance-corrected measure of excessive correct guessing (0 for random guessing). For treatment effects, within- or between-arm effect sizes were used. Exploratory analyses of investigators’ guesses/blinding and of treatment modality were also performed. Results: Forty trials (3899 participants) were included. The active and sham treatment groups had mean blinding indexes of 0.26 (95% confidence interval: 0.12, 0.41) and 0.01 (−0.11, 0.14), respectively, meaning that 26% of participants in the active arms believed they received active treatment, whereas only 1% in the sham arms believed they received sham treatment, beyond chance (i.e., random guessing). A greater belief of receiving active treatment was associated with a larger within-arm effect size in both arms, and ideal blinding (namely, “random guessing,” and “wishful thinking,” in which both groups believe they received active treatment) showed smaller effect sizes, with a correlation between effect size and summary blinding index of 0.35 (p = 0.028) for the between-arm comparison. We observed uniformly large sham treatment effects for all modalities, and a larger correlation for investigators’ (un)blinding, 0.53 (p = 0.046). Conclusion: Participants in active treatments in back pain trials guessed their treatment identity correctly more often than chance, while those in sham treatments tended to display successful blinding. Excessive correct guesses by participants and investigators (which could reflect weaker blinding and/or noticeable treatment effects) were associated with larger effect sizes. Blinding and sham treatment effects on back pain need due consideration in individual trials and meta-analyses.
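The chance-corrected guess measure described above can be sketched as a Bang-style blinding index, computed per arm as (correct − incorrect) / total, so that purely random guessing gives 0 and universal correct guessing gives 1 (the counts below are illustrative, not trial data):

```python
def blinding_index(correct, incorrect, dont_know=0):
    """Bang-style blinding index for one treatment arm:
    (correct - incorrect) / total guesses.
    0 = random guessing; 1 = everyone guesses correctly (unblinded);
    -1 = everyone guesses wrongly ('wishful thinking')."""
    total = correct + incorrect + dont_know
    return (correct - incorrect) / total

# Illustrative counts: an active arm with excess correct guessing,
# and a sham arm guessing essentially at chance.
print(blinding_index(63, 37))      # 0.26, like the pooled active-arm index
print(blinding_index(34, 33, 33))  # ~0.01, blinding effectively preserved
```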


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Liansheng Larry Tang ◽  
Michael Caudy ◽  
Faye Taxman

Multiple meta-analyses may use similar search criteria and focus on the same topic of interest, but they may yield different or sometimes discordant results. The lack of statistical methods for synthesizing these findings makes it challenging to properly interpret the results from multiple meta-analyses, especially when their results are conflicting. In this paper, we first introduce a method to synthesize the meta-analytic results when multiple meta-analyses use the same type of summary effect estimates. When meta-analyses use different types of effect sizes, the meta-analysis results cannot be directly combined. We propose a two-step frequentist procedure to first convert the effect size estimates to the same metric and then summarize them with a weighted mean estimate. Our proposed method offers several advantages over existing methods by Hemming et al. (2012). First, different types of summary effect sizes are considered. Second, our method provides the same overall effect size as conducting a meta-analysis on all individual studies from multiple meta-analyses. We illustrate the application of the proposed methods in two examples and discuss their implications for the field of meta-analysis.
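A sketch of the two-step procedure under standard conversion formulas: convert each meta-analysis's summary estimate to a common metric (here Cohen's d), then combine with an inverse-variance-weighted mean. The inputs are hypothetical, and the r-to-d formula assumes two equal-sized groups; this is not necessarily the authors' exact estimator.

```python
import math

def r_to_d(r):
    """Convert a correlation to Cohen's d (assumes two equal-sized groups)."""
    return 2 * r / math.sqrt(1 - r**2)

def weighted_mean(estimates, variances):
    """Inverse-variance-weighted mean of effect size estimates."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Two meta-analyses on the same question: one reports d, the other r.
d1, var1 = 0.40, 0.010
d2, var2 = r_to_d(0.24), 0.020   # step 1: convert to the d metric
overall = weighted_mean([d1, d2], [var1, var2])  # step 2: pool
print(round(overall, 3))
```

Weighting by inverse variance gives more precise meta-analyses more influence, mirroring how an ordinary meta-analysis weights its individual studies.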


Author(s):  
H. S. Styn ◽  
S. M. Ellis

The determination of the significance of differences in means and of relationships between variables is important in many empirical studies. Usually only statistical significance is reported, which does not necessarily indicate an important (practically significant) difference or relationship. In studies based on probability samples, effect size indices should be reported in addition to statistical significance tests in order to comment on practical significance. Where complete populations or convenience samples are used, the determination of statistical significance is, strictly speaking, no longer relevant, while effect size indices can still serve as a basis for judging significance. This article discusses the use of effect size indices to establish practical significance. It also shows how these indices are used in a few fields of statistical application and how they receive attention in the statistical literature and in computer packages. The use of effect sizes is illustrated with a few examples from the research literature.

