A Note on Estimating Unreported Sample Statistics for Meta-Analysis

Author(s):  
Joseph G. Eisenhauer

A major challenge confronting meta-analysts seeking to synthesize existing empirical research on a given topic is the frequent failure of primary studies to fully report their sample statistics.  Because such research cannot be included in a meta-analysis unless the unreported statistics can somehow be recovered, a number of methods have been devised to estimate the sample mean and standard deviation from other quantities.  This note compares several recently proposed sets of estimators that rely on extrema and/or quartiles to estimate unreported statistics for any given sample.  The simplest method relies on an underlying model of normality, while the more complex methods are explicitly designed to accommodate non-normality.  Our empirical comparison uses a previously developed data set containing 58 samples, ranging in size from 48 to 2,528 observations, from a standard depression screening instrument, the nine-item Patient Health Questionnaire (PHQ-9).  When only the median and extrema are known, we find that the estimation method based on normality yields the most accurate estimates of both the mean and standard deviation, despite the existence of asymmetry throughout the data set; and when other information is given, the normality-based estimators have accuracy comparable to that of the other estimators reviewed here.  Additionally, if the sample size is unknown, the method based on normality is the only feasible approach.  The simplicity of the normality-based approach provides an added convenience for practitioners. 
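As a point of reference for the normality-based approach discussed above, the widely used Wan-type formulas recover the mean and standard deviation from the minimum, median, maximum, and sample size. The sketch below is illustrative only; the function name and the PHQ-9-style numbers are ours, not the paper's, and this is only one of several normality-based variants (a simpler one, range/4 for the standard deviation, does not require the sample size at all).

```python
# Illustrative normality-based estimators (Wan et al.-style); names are ours.
from scipy.stats import norm

def mean_sd_from_min_med_max(a, m, b, n):
    """Estimate mean and SD from minimum a, median m, maximum b, sample size n,
    assuming the underlying data are approximately normal."""
    mean_hat = (a + 2.0 * m + b) / 4.0
    # expected range of a normal sample of size n (Blom-type approximation)
    xi = 2.0 * norm.ppf((n - 0.375) / (n + 0.25))
    sd_hat = (b - a) / xi
    return mean_hat, sd_hat

# Hypothetical PHQ-9-style summary: median 9, extrema 0 and 27, n = 100
print(mean_sd_from_min_med_max(0, 9, 27, 100))
```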

2020 ◽  
Vol 29 (9) ◽  
pp. 2520-2537 ◽  
Author(s):  
Sean McGrath ◽  
XiaoFei Zhao ◽  
Russell Steele ◽  
Brett D. Thombs ◽  
Andrea Benedetti ◽  
...  

Researchers increasingly use meta-analysis to synthesize the results of several studies in order to estimate a common effect. When the outcome variable is continuous, standard meta-analytic approaches assume that the primary studies report the sample mean and standard deviation of the outcome. However, when the outcome is skewed, authors sometimes summarize the data by reporting the sample median and one or both of (i) the minimum and maximum values and (ii) the first and third quartiles, but do not report the mean or standard deviation. To include these studies in meta-analysis, several methods have been developed to estimate the sample mean and standard deviation from the reported summary data. A major limitation of these widely used methods is that they assume that the outcome distribution is normal, which is unlikely to be tenable for studies reporting medians. We propose two novel approaches to estimate the sample mean and standard deviation when data are suspected to be non-normal. Our simulation results and empirical assessments show that the proposed methods often perform better than the existing methods when applied to non-normal data.
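For the quartile scenario mentioned above, the existing normality-based baseline (the kind of estimator the proposed non-normal methods are compared against, not the new methods themselves) can be sketched as follows; the function name and example values are illustrative, not taken from the paper.

```python
# Normality-based baseline for the quartile scenario (q1, median, q3, n).
from scipy.stats import norm

def mean_sd_from_quartiles(q1, m, q3, n):
    mean_hat = (q1 + m + q3) / 3.0
    # expected interquartile range of a normal sample of size n
    eta = 2.0 * norm.ppf((0.75 * n - 0.125) / (n + 0.25))
    sd_hat = (q3 - q1) / eta
    return mean_hat, sd_hat

# Hypothetical reported summary: q1 = 4, median = 6, q3 = 10, n = 150
print(mean_sd_from_quartiles(4.0, 6.0, 10.0, 150))
```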


Author(s):  
Lena Golubovskaja

This chapter analyzes the tone and information content of two external policy reports of the International Monetary Fund (IMF), the IMF Article IV Staff Reports and the Executive Board Assessments, for Euro area countries. In particular, the researchers create a tone measure denoted WARNING based on the existing DICTION 5.0 Hardship dictionary. The study finds that in the run-up to the current credit crisis, average WARNING tone levels of Staff Reports for Slovenia, Luxembourg, Greece, and Malta are one standard deviation above the EMU sample mean, while those for Spain and Belgium are one standard deviation below the mean value. Furthermore, on average for Staff Reports over the period 2005-2007, there are no significant differences between the EMU sample mean and the Staff Reports' yearly averages. The researchers find a significantly increased level of WARNING tone in 2006 (compared with the previous year) for the IMF Article IV Staff Reports. There is also a systematic bias of WARNING scores for Executive Board Assessments relative to WARNING scores for the Staff Reports.


2015 ◽  
Vol 8 (4) ◽  
pp. 1799-1818 ◽  
Author(s):  
R. A. Scheepmaker ◽  
C. Frankenberg ◽  
N. M. Deutscher ◽  
M. Schneider ◽  
S. Barthlott ◽  
...  

Abstract. Measurements of the atmospheric HDO/H2O ratio help us to better understand the hydrological cycle and improve models to correctly simulate tropospheric humidity and therefore climate change. We present an updated version of the column-averaged HDO/H2O ratio data set from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY). The data set is extended with 2 additional years, now covering 2003–2007, and is validated against co-located ground-based total column δD measurements from Fourier transform spectrometers (FTS) of the Total Carbon Column Observing Network (TCCON) and the Network for the Detection of Atmospheric Composition Change (NDACC, produced within the framework of the MUSICA project). Even though the time overlap among the available data is not yet ideal, we determined a mean negative bias in SCIAMACHY δD of −35 ± 30‰ compared to TCCON and −69 ± 15‰ compared to MUSICA (the uncertainty indicating the station-to-station standard deviation). The bias shows a latitudinal dependency, being largest (∼ −60 to −80‰) at the highest latitudes and smallest (∼ −20 to −30‰) at the lowest latitudes. We have tested the impact of an offset correction to the SCIAMACHY HDO and H2O columns. This correction leads to a humidity- and latitude-dependent shift in δD and an improvement of the bias by 27‰, although it does not lead to an improved correlation with the FTS measurements nor to a strong reduction of the latitudinal dependency of the bias. The correction might be an improvement for dry, high-altitude areas, such as the Tibetan Plateau and the Andes region. For these areas, however, validation is currently impossible due to a lack of ground stations. The mean standard deviation of single-sounding SCIAMACHY–FTS differences is ∼ 115‰, which is reduced by a factor ∼ 2 when we consider monthly means. When we relax the strict matching of individual measurements and focus on the mean seasonalities using all available FTS data, we find that the correlation coefficients between SCIAMACHY and the FTS networks improve from 0.2 to 0.7–0.8. Certain ground stations show a clear asymmetry in δD during the transition from the dry to the wet season and back, which is also detected by SCIAMACHY. This asymmetry points to a transition in the source region temperature or location of the water vapour and shows the added information that HDO/H2O measurements provide when used in combination with variations in humidity.
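For reference, the δD quantity being validated is the HDO/H2O ratio expressed relative to the Vienna Standard Mean Ocean Water (VSMOW) reference ratio, in per mil:

```latex
\delta D \;=\; \left(\frac{[\mathrm{HDO}]/[\mathrm{H_2O}]}{R_{\mathrm{VSMOW}}} - 1\right)\times 1000\ \text{‰},
\qquad R_{\mathrm{VSMOW}} \approx 3.115\times 10^{-4}
\ \text{(about twice the VSMOW D/H ratio of } 1.5576\times 10^{-4}\text{)}.
```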


2016 ◽  
Vol 38 (3) ◽  
Author(s):  
Mohammad Fraiwan Al-Saleh ◽  
Adil Eltayeb Yousif

Unlike the mean, the standard deviation σ is a vague concept. In this paper, several properties of σ are highlighted. These properties include the minimum and the maximum of σ, its relationship to the mean absolute deviation and the range of the data, and its role in Chebyshev's inequality and the coefficient of variation. The hidden information in the formula itself is extracted. The confusion about the denominator of the sample variance being n − 1 is also addressed. Some properties of the sample mean and variance of normal data are carefully explained. Pointing out these and other properties in classrooms may have significant effects on the understanding and the retention of the concept.
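For instance, a few of the properties referred to above can be stated compactly (population versions, i.e. with divisor n; R denotes the range and MAD the mean absolute deviation):

```latex
\mathrm{MAD} \;=\; \frac{1}{n}\sum_{i=1}^{n}\lvert x_i-\bar{x}\rvert \;\le\; \sigma \;\le\; \frac{R}{2},
\qquad
P\bigl(\lvert X-\mu\rvert \ge k\sigma\bigr) \;\le\; \frac{1}{k^{2}} \quad (k>0),
\qquad
\mathrm{CV} \;=\; \frac{\sigma}{\mu}.
```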


2021 ◽  
pp. 57-89
Author(s):  
Charles Auerbach

In this chapter readers will learn about methodological issues to consider in analyzing the success of the intervention and how to conduct visual analysis. The chapter begins with a discussion of descriptive statistics that can aid the visual analysis of findings by summarizing patterns of data across phases. An example data set is used to illustrate the use of specific graphs, including box plots, standard deviation band graphs, and line charts showing the mean, median, and trimmed mean, any of which can be used to compare two phases. SSD for R provides three standard methods for computing effect size, which are discussed in detail. Additionally, four methods of evaluating effect size using non-overlap methods are examined. The use of the goal line is discussed. The chapter concludes with a discussion of autocorrelation in the intervention phase and how to deal with this issue.
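As a generic illustration (not the SSD for R implementation itself), one common single-case effect size is the difference between the intervention-phase and baseline-phase means standardized by the baseline standard deviation; the function and data below are hypothetical.

```python
# Minimal sketch: baseline-SD-standardized mean difference between two phases.
import statistics

def phase_effect_size(baseline, intervention):
    mean_a = statistics.mean(baseline)
    mean_b = statistics.mean(intervention)
    sd_a = statistics.stdev(baseline)      # sample SD of the baseline phase
    return (mean_b - mean_a) / sd_a

# Hypothetical weekly scores for a baseline (A) and an intervention (B) phase
print(phase_effect_size([10, 12, 11, 13, 12], [7, 6, 8, 5, 6]))
```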


2021 ◽  
Vol 11 (7) ◽  
pp. 908
Author(s):  
Spyridon Siafis ◽  
Alessandro Rodolico ◽  
Oğulcan Çıray ◽  
Declan G. Murphy ◽  
Mara Parellada ◽  
...  

Introduction: Response to treatment, according to the Clinical Global Impression-Improvement (CGI-I) scale, is an easily interpretable outcome in clinical trials of autism spectrum disorder (ASD). Yet the CGI-I rating is sometimes reported as a continuous outcome, and converting it to a dichotomous one would allow meta-analyses to incorporate more evidence. Methods: Clinical trials investigating medications for ASD and presenting both dichotomous and continuous CGI-I data were included. The number of patients with at least much improvement (CGI-I ≤ 2) was imputed from the CGI-I scale, assuming an underlying normal distribution of a latent continuous score and using a primary threshold θ = 2.5 instead of θ = 2, which is the original cut-off of the CGI-I scale. The original and imputed values were used to calculate responder rates and odds ratios. The performance of the imputation method was investigated with the concordance correlation coefficient (CCC), linear regression, Bland–Altman plots, and subgroup differences of summary estimates obtained from random-effects meta-analysis. Results: Data from 27 studies, 58 arms, and 1428 participants were used. The imputation method using the primary threshold (θ = 2.5) performed well for the responder rates (CCC = 0.93, 95% confidence interval [0.86, 0.96]; β of linear regression = 1.04 [0.95, 1.13]; bias and limits of agreement = 4.32% [−8.1%, 16.74%]; no subgroup differences, χ2 = 1.24, p-value = 0.266) and the odds ratios (CCC = 0.91 [0.86, 0.96]; β = 0.96 [0.78, 1.14]; bias = 0.09 [−0.87, 1.04]; χ2 = 0.02, p-value = 0.894). The imputation method performed worse when the secondary threshold (θ = 2) was used. Discussion: Assuming a normal distribution of the CGI-I scale, the number of responders can be imputed from the mean and standard deviation and used in meta-analysis. Due to the wide limits of agreement of the imputation method, sensitivity analyses excluding studies with imputed values should be performed.
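A minimal sketch of the imputation described above, assuming the latent CGI-I score is normally distributed with the reported arm mean and standard deviation; the function name and the example arm are hypothetical.

```python
# Expected proportion of responders (CGI-I <= 2) is the normal CDF at theta = 2.5.
from scipy.stats import norm

def imputed_responders(mean, sd, n, theta=2.5):
    p = norm.cdf((theta - mean) / sd)   # P(latent score <= theta)
    return p, round(n * p)              # responder rate and imputed count

# Hypothetical arm: mean CGI-I 2.8, SD 0.9, 60 participants
print(imputed_responders(2.8, 0.9, 60))
```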


Author(s):  
Mark J. DeBonis

One classic example of a binary classifier is one that employs the mean and standard deviation of the data set as a mechanism for classification. Indeed, principal component analysis has played a major role in this effort. In this paper, we propose that one should also include skew in order to make this method of classification somewhat more precise. What is needed is a simple probability distribution function that can easily be fitted to a data set and used to create a classifier with improved error rates, comparable to those of other classifiers.
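The paper's specific distribution and decision rule are its own; purely as an illustration of the idea, the sketch below fits a skew-normal density (location, scale, and shape/skew) to each class and classifies by the larger log-density. All data and parameter values are synthetic.

```python
# Illustration only: class-conditional skew-normal fits, likelihood-based decision.
import numpy as np
from scipy.stats import skewnorm

# Synthetic training data for two classes with different location, scale, and skew
class0 = skewnorm.rvs(4.0, loc=0.0, scale=1.0, size=500, random_state=1)
class1 = skewnorm.rvs(-3.0, loc=2.0, scale=1.5, size=500, random_state=2)

params0 = skewnorm.fit(class0)   # (shape/skew, loc, scale)
params1 = skewnorm.fit(class1)

def classify(x):
    """Return 1 if the class-1 density is larger at x, else 0 (equal priors)."""
    return int(skewnorm.logpdf(x, *params1) > skewnorm.logpdf(x, *params0))

print([classify(x) for x in (-0.5, 1.0, 2.5)])
```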


2005 ◽  
Vol 9 (3) ◽  
pp. 214-227 ◽  
Author(s):  
David S. Wallace ◽  
René M. Paulson ◽  
Charles G. Lord ◽  
Charles F. Bond

A meta-analysis of 797 studies and 1,001 effect sizes tested a theoretical hypothesis that situational constraints, such as perceived social pressure and perceived difficulty, weaken the relationship between attitudes and behavior. This hypothesis was confirmed for attitudes toward performing behaviors and for attitudes toward issues and social groups. Meta-analytic estimates of attitude-behavior correlations served to quantify these moderating effects. The present results indicated that the mean attitude-behavior correlation was .41 when people experienced a mean level of social pressure to perform a behavior of mean difficulty. The mean correlation was .30 when people experienced social pressure 1 standard deviation above the mean to perform a behavior that was 1 standard deviation more difficult than the mean. The results suggest a need for increased attention to the “behavior” side of the attitude-behavior equation. Attitudes predict some behaviors better than others.


2019 ◽  
Vol 64 (5) ◽  
pp. 48-73
Author(s):  
Fryderyk Mirota

In empirical research, considerable diversity can be observed in estimates of the speed of adjustment (SOA) of corporate cash holdings. It is possible that some of the results are affected by publication selection bias: articles whose results are clearly in line with economic theories may be preferred by authors and reviewers and, consequently, conclusions from this area may be published more frequently. The aim of this article is to verify whether there is publication selection bias in studies of corporate cash holdings adjustments and to investigate the sources of heterogeneity in cash holdings SOA estimates. The statistical method used in the study was meta-analysis, which allows a combined analysis of results from independent studies, makes it possible to test for publication selection bias, and helps explain the heterogeneity of the results reported in the literature. The study was based on data collected through a review of the literature published between 2003 and 2017. On the basis of 402 estimates from 58 different studies, it is shown that publication selection bias does not occur. Bayesian Model Averaging was used for modelling. The characteristics of the data set used in each study, the model specification, and the estimation method were found to significantly affect the heterogeneity of corporate cash holdings SOA estimates. This diversity is determined, among other things, by the choice of estimation method, the length of the period covered by the analysis, and the characteristics of the market environment of the entities concerned.
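One standard device for testing publication selection bias in a meta-analytic data set of this kind is the FAT-PET regression of reported estimates on their standard errors; the sketch below is a generic version under that assumption and is not necessarily the specification used in the article. The data are placeholders.

```python
# Generic FAT-PET sketch: regress estimates on their standard errors with
# precision weights. A significant slope on the standard error is the usual
# funnel-asymmetry (publication bias) signal; the intercept is the
# "precision-effect" estimate of the underlying parameter.
import numpy as np
import statsmodels.api as sm

def fat_pet(estimates, std_errors):
    X = sm.add_constant(np.asarray(std_errors))
    w = 1.0 / np.asarray(std_errors) ** 2          # precision weights
    fit = sm.WLS(np.asarray(estimates), X, weights=w).fit()
    return fit.params, fit.pvalues                 # [intercept, slope]

est = [0.35, 0.42, 0.28, 0.50, 0.31]   # hypothetical SOA estimates
se  = [0.05, 0.10, 0.04, 0.15, 0.06]   # hypothetical standard errors
print(fat_pet(est, se))
```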


2006 ◽  
Vol 6 (3) ◽  
pp. 831-846 ◽  
Author(s):  
X. Calbet ◽  
P. Schlüssel

Abstract. The Empirical Orthogonal Function (EOF) retrieval technique consists of calculating the eigenvectors of the spectra and then performing a linear regression between these and the atmospheric states; this first step is known as training. At a later stage, known as performing the retrievals, atmospheric profiles are derived from measured atmospheric radiances. When EOF retrievals are trained on a data set that is statistically different from the one used for the retrievals, two basic problems arise: significant biases appear in the retrievals, and differences between the covariances of the training data set and the measured data set degrade them. The retrieved profiles show a bias with respect to the real profiles which comes from the combined effect of the mean difference between the training and the real spectra, projected into the atmospheric state space, and the mean difference between the training and the real atmospheric profiles. The standard deviations of the differences between the retrieved profiles and the real ones behave differently depending on whether the covariance of the training spectra is larger than, equal to, or smaller than the covariance of the measured spectra with which the retrievals are performed. The procedure to correct for these effects is shown both analytically and with a measured example. It consists of first calculating the average and standard deviation of the difference between real observed spectra and the spectra calculated from the real atmospheric state with the radiative transfer model used to create the training spectra. The measured spectra must then be corrected with this average bias before performing the retrievals, and the linear regression of the training must be performed after adding noise to the training spectra corresponding to the aforementioned standard deviation. This procedure is optimal in the sense that, to improve the retrievals further, one must resort to a different training data set or a different algorithm.
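A minimal numerical sketch of the training, bias-correction, and noise-addition steps described above, using principal components as the EOFs; the array sizes, variable names, and random placeholder data are illustrative, not those of an operational processor.

```python
# Sketch of an EOF (principal-component) regression retrieval with bias and
# noise correction, following the procedure summarized in the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_chan, n_levels, n_eof = 400, 50, 8, 10

spectra_train = rng.normal(size=(n_train, n_chan))   # simulated training radiances
states_train = rng.normal(size=(n_train, n_levels))  # matching atmospheric profiles

# Mean and SD of (real observed - calculated) spectra from matched real cases;
# here random placeholders stand in for that comparison.
obs_minus_calc = rng.normal(0.02, 0.05, size=(60, n_chan))
bias, noise_sd = obs_minus_calc.mean(axis=0), obs_minus_calc.std(axis=0)

# Training: add noise with that SD to the spectra, take the leading EOFs
# (eigenvectors of the spectra), and regress the EOF scores onto the states.
noisy = spectra_train + rng.normal(0.0, noise_sd, size=spectra_train.shape)
mean_spec = noisy.mean(axis=0)
_, _, vt = np.linalg.svd(noisy - mean_spec, full_matrices=False)
eofs = vt[:n_eof]
coef, *_ = np.linalg.lstsq((noisy - mean_spec) @ eofs.T, states_train, rcond=None)

# Retrieval: bias-correct the measured spectra, project, apply the regression.
spectra_measured = rng.normal(size=(30, n_chan))
retrieved = ((spectra_measured - bias - mean_spec) @ eofs.T) @ coef
print(retrieved.shape)   # (30, n_levels)
```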

