The Probability That a Measurement Falls within a Range of Standard Deviations from an Estimate of the Mean

2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Louis M. Houston

We derive a general equation for the probability that a measurement falls within a range of n standard deviations from an estimate of the mean, thereby providing a format compatible with a confidence interval centered about the mean that is naturally independent of the sample size. The equation is derived by interpolating theoretical results for extreme sample sizes, and its intermediate values are confirmed with a computational test.
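For intuition, the large-sample limit of such an equation is the familiar normal-tail result; the sketch below (an illustration of that limiting case, not the paper's interpolated equation) computes the probability that a measurement falls within n standard deviations of the true mean under a normal model:

```python
import math

def prob_within_n_sd(n: float) -> float:
    # P(|X - mu| < n * sigma) for a normal measurement: erf(n / sqrt(2))
    return math.erf(n / math.sqrt(2))

# The familiar 68-95-99.7 rule falls out directly:
for n in (1, 2, 3):
    print(f"n = {n}: {prob_within_n_sd(n):.4f}")
```

The paper's contribution is precisely the sample-size-dependent correction to this limit at small n.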

1985 ◽  
Vol 31 (4) ◽  
pp. 574-580 ◽  
Author(s):  
K Linnet

Abstract The precision of estimates of the sensitivity of diagnostic tests is evaluated. "Sensitivity" is defined as the fraction of diseased subjects with test values exceeding the 0.975-fractile of the distribution of control values. An estimate of the sensitivity is subject to sampling variation arising from both the control observations and the patient observations. If Gaussian distributions are assumed, the 0.95-confidence interval for a sensitivity estimate is up to ±0.15 for a sample of 100 controls and 100 patients. For the same sample size, minimum differences of 0.08 to 0.32 between the sensitivities of two tests are established as significant with a power of 0.90. For some published diagnostic test evaluations, the median sample sizes for controls and patients were 63 and 33, respectively. I show that, to obtain reasonable precision of sensitivity estimates and reasonable power when two tests are being compared, the number of samples should in general be considerably larger.
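The binomial component of this sampling variation is easy to sketch. The following example (an approximation that counts only patient-sample variation and ignores the variation of the control-based cutoff, which the paper also accounts for) gives a Wald-type interval for a sensitivity estimate:

```python
import math

def sensitivity_ci(p_hat: float, n_patients: int, z: float = 1.96):
    # Wald 0.95 interval for a sensitivity (a proportion), counting only
    # the binomial variation of the patient sample
    half = z * math.sqrt(p_hat * (1 - p_hat) / n_patients)
    return p_hat - half, p_hat + half

lo, hi = sensitivity_ci(0.80, 100)
print(f"[{lo:.3f}, {hi:.3f}]")  # half-width ~0.078 from patient sampling alone
```

Adding the control-cutoff variation widens this interval further, which is how the paper arrives at a half-width of up to 0.15.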


2018 ◽  
Vol 10 (11) ◽  
pp. 123
Author(s):  
Alberto Cargnelutti Filho ◽  
Cleiton Antonio Wartha ◽  
Jéssica Andiara Kleinpaul ◽  
Ismael Mario Marcio Neu ◽  
Daniela Lixinski Silveira

The aim of this study was to determine the sample size (i.e., number of plants) required to estimate the mean and median of canola (Brassica napus L.) traits of the Hyola 61, Hyola 76, and Hyola 433 hybrids at specified precision levels. At 124 days after sowing, 225 plants of each hybrid were randomly collected. In each plant, morphological (plant height) and productive traits (number of siliques, fresh matter of siliques, fresh matter of aerial part without siliques, fresh matter of aerial part, dry matter of siliques, dry matter of aerial part without siliques, and dry matter of aerial part) were measured. For each trait, measures of central tendency, variability, skewness, and kurtosis were calculated. Sample size was determined by resampling with replacement, using 10,000 resamples. The sample size required for the estimation of measures of central tendency (mean and median) varies between traits and hybrids. Productive traits required larger sample sizes than the morphological trait. Larger sample sizes are required for the hybrids Hyola 433, Hyola 61, and Hyola 76, in this sequence. To estimate the mean of canola traits of the Hyola 61, Hyola 76, and Hyola 433 hybrids with a 95% confidence interval amplitude equal to 30% of the estimated mean, 208 plants are required, whereas 661 plants are necessary to estimate the median with the same precision.
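The resampling step can be illustrated as follows. This is a generic re-implementation on synthetic data, not the authors' code, using the 95% percentile-interval amplitude relative to the mean as the precision measure:

```python
import random
import statistics

def ci_amplitude_pct(data, n, resamples=2000, rng=random.Random(42)):
    # Bootstrap means of samples of size n drawn with replacement;
    # return the 95% percentile-interval amplitude as a percentage of
    # the estimated mean (the precision measure used in the study)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(resamples)
    )
    lo = means[int(0.025 * resamples)]
    hi = means[int(0.975 * resamples)]
    return 100 * (hi - lo) / statistics.fmean(data)

# Synthetic stand-in for 225 measured plants of one trait:
gen = random.Random(1)
plants = [gen.gauss(100, 25) for _ in range(225)]
print(ci_amplitude_pct(plants, n=50))  # amplitude shrinks as n grows
```

The required sample size is then the smallest n whose amplitude falls below the chosen threshold (e.g., 30% of the mean).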


Author(s):  
Jordan Anaya

GRIMMER (Granularity-Related Inconsistency of Means Mapped to Error Repeats) builds upon the GRIM test and allows for testing whether reported measures of variability are mathematically possible. GRIMMER relies upon the statistical phenomenon that variances display a simple repetitive pattern when the data is discrete, i.e. granular. This observation allows for the generation of an algorithm that can quickly identify whether a reported statistic of any size or precision is consistent with the stated sample size and granularity. My implementation of the test is available at PrePubMed (http://www.prepubmed.org/grimmer) and currently allows for testing variances, standard deviations, and standard errors for integer data. It is possible to extend the test to other measures of variability such as deviation from the mean, or apply the test to non-integer data such as data reported to halves or tenths. The ability of the test to identify inconsistent statistics relies upon four factors: (1) the sample size; (2) the granularity of the data; (3) the precision (number of decimals) of the reported statistic; and (4) the size of the standard deviation or standard error (but not the variance). The test is most powerful when the sample size is small, the granularity is large, the statistic is reported to a large number of decimal places, and the standard deviation or standard error is small (variance is immune to size considerations). This test has important implications for any field that routinely reports statistics for granular data to at least two decimal places because it can help identify errors in publications, and should be used by journals during their initial screen of new submissions. The errors detected can be the result of anything from something as innocent as a typo or rounding error to large statistical mistakes or unfortunately even fraud. In this report I describe the mathematical foundations of the GRIMMER test and the algorithm I use to implement it.
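The granularity idea is easiest to see in the simpler GRIM test for means, which GRIMMER extends to variances, standard deviations, and standard errors. A minimal sketch for integer data:

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    # Can a mean reported to `decimals` places arise from n integer values?
    total = round(reported_mean * n)          # nearest achievable integer sum
    for t in (total - 1, total, total + 1):   # guard against rounding edges
        if round(t / n, decimals) == round(reported_mean, decimals):
            return True
    return False

print(grim_consistent(3.44, 25))  # True: 86 / 25 = 3.44 exactly
print(grim_consistent(3.45, 25))  # False: no integer sum over 25 gives 3.45
```

GRIMMER applies the same "achievable grid of values" reasoning to measures of variability, where the repetitive pattern of possible variances makes the check fast for any sample size.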


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 698
Author(s):  
Chanseok Park ◽  
Min Wang

The control charts based on X̄ and S are widely used to monitor the mean and variability of process variables and can help quality engineers identify and investigate causes of process variation. The usual requirement behind these control charts is that the sample sizes from the process are all equal, but this requirement may not be satisfied in practice because of missing observations, cost constraints, etc. Several conventional methods have been proposed to deal with this situation. However, some methods based on weighted-average approaches and an average sample size often degrade the performance of the control charts because the adopted estimators are biased towards underestimating the true population parameters. These observations motivate us to investigate the existing methods with rigorous proofs, and we provide a guideline to practitioners for the best selection when constructing X̄ and S control charts with unequal sample sizes.
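One common construction (a sketch of standard textbook formulas, not necessarily the estimator the paper ends up recommending) pools the subgroup variances and widens the X̄ limits for each subgroup's own size:

```python
import math
import statistics

def pooled_sd(samples):
    # Pooled SD across subgroups of unequal size:
    # sqrt( sum (n_i - 1) s_i^2 / sum (n_i - 1) )
    num = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    den = sum(len(s) - 1 for s in samples)
    return math.sqrt(num / den)

def xbar_limits(samples, n_i):
    # 3-sigma X-bar limits for a subgroup of size n_i
    grand = statistics.fmean(x for s in samples for x in s)
    half = 3 * pooled_sd(samples) / math.sqrt(n_i)
    return grand - half, grand + half

print(pooled_sd([[1, 2, 3], [4, 6]]))  # sqrt(4/3), about 1.155
```

Note that even this pooled estimator is biased for the SD without a bias-correction constant (c4); the paper's analysis of such biases is exactly what motivates its guideline.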


1999 ◽  
Vol 45 (6) ◽  
pp. 882-894 ◽  
Author(s):  
Kristian Linnet

Abstract Background: In method comparison studies, it is of importance to assure that the presence of a difference of medical importance is detected. For a given difference, the necessary number of samples depends on the range of values and the analytical standard deviations of the methods involved. For typical examples, the present study evaluates the statistical power of least-squares and Deming regression analyses applied to the method comparison data. Methods: Theoretical calculations and simulations were used to consider the statistical power for detection of slope deviations from unity and intercept deviations from zero. For situations with proportional analytical standard deviations, weighted forms of regression analysis were evaluated. Results: In general, sample sizes of 40–100 samples conventionally used in method comparison studies often must be reconsidered. A main factor is the range of values, which should be as wide as possible for the given analyte. For a range ratio (maximum value divided by minimum value) of 2, 544 samples are required to detect one standardized slope deviation; the number of required samples decreases to 64 at a range ratio of 10 (proportional analytical error). For electrolytes having very narrow ranges of values, very large sample sizes usually are necessary. In case of proportional analytical error, application of a weighted approach is important to assure an efficient analysis; e.g., for a range ratio of 10, the weighted approach reduces the requirement of samples by >50%. Conclusions: Estimation of the necessary sample size for a method comparison study assures a valid result; either no difference is found or the existence of a relevant difference is confirmed.
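The dependence of power on sample size and range ratio can be illustrated by simulation. The sketch below uses ordinary least squares with error in y only (a simplification: the paper treats analytical error in both methods, e.g., via Deming regression):

```python
import random
import statistics

def slope_power(n, range_ratio, sd_y, slope=1.05, sims=500,
                rng=random.Random(0)):
    # Fraction of simulated comparisons in which an OLS test rejects
    # slope = 1 at the 0.05 level, given a true slope of 1.05
    hits = 0
    for _ in range(sims):
        x = [1 + (range_ratio - 1) * rng.random() for _ in range(n)]
        y = [slope * xi + rng.gauss(0, sd_y) for xi in x]
        mx, my = statistics.fmean(x), statistics.fmean(y)
        sxx = sum((xi - mx) ** 2 for xi in x)
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        resid = [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]
        se_b = (sum(r * r for r in resid) / (n - 2) / sxx) ** 0.5
        hits += abs(b - 1) > 1.96 * se_b
    return hits / sims

# Power rises sharply with the range ratio (max value / min value):
print(slope_power(100, 2, 0.1), slope_power(100, 10, 0.1))
```

This reproduces the qualitative finding: at a range ratio of 2, far more samples are needed to detect the same slope deviation than at a range ratio of 10.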


2011 ◽  
Vol 41 (5) ◽  
pp. 1130-1139 ◽  
Author(s):  
James A. Westfall ◽  
Paul L. Patterson ◽  
John W. Coulston

Post-stratification is used to reduce the variance of estimates of the mean. Because the stratification is not fixed in advance, within-strata sample sizes can be quite small. The survey statistics literature provides some guidance on minimum within-strata sample sizes; however, the recommendations and justifications are inconsistent and apply broadly for many different population structures. The impacts of minimum within-strata and total sample sizes on estimates of means and standard errors were examined for two forest inventory variables: proportion forestland and cubic net volume. Estimates of the means seem unbiased across a range of minimum within-strata sample sizes. A ratio that described the decrease in variability with increasing sample size allowed for assessment of minimum within-strata sample requirements to obtain stable estimates of means. This metric indicated that the minimum within-strata sample size should be at least 10. Estimates of standard errors were found to be biased at small total sample sizes. To obtain a bias of less than 3%, the required minimum total sample size was 25 for proportion forestland and 75 for cubic net volume. The results presented allow analysts to determine within-stratum and total sample size requirements corresponding to their criteria for acceptable levels of bias and variability.
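The post-stratified estimator of the mean itself is simple; a minimal sketch, with stratum weights W_h taken as known population shares:

```python
def poststratified_mean(strata, weights):
    # Post-stratified estimate of the mean: sum_h W_h * ybar_h, where the
    # W_h are known stratum proportions and ybar_h the within-stratum
    # sample means (strata are formed after the sample is drawn)
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * (sum(s) / len(s)) for s, w in zip(strata, weights))

# Two strata (e.g., forest / non-forest), known area shares 0.3 / 0.7:
print(poststratified_mean([[10, 12, 11], [2, 3]], [0.3, 0.7]))  # ~5.05
```

The article's concern is visible in the second stratum above: with only two observations, the within-stratum mean and variance are unstable, which is why it recommends at least 10 samples per stratum.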


1978 ◽  
Vol 100 (4) ◽  
pp. 607-612 ◽  
Author(s):  
D. Kececioglu ◽  
G. Lamarre

Charts are presented relating the lower one-sided confidence limit on the reliability, R_L1, to the effective sample size, n_e, calculated from the sample sizes used to estimate the failure-governing stress and strength distributions, f(s) and f(S) respectively, and a factor K which is a function of the estimated means and standard deviations of f(s) and f(S). These charts cover an n_e range of 5 to 1000; confidence levels of 0.80, 0.90, 0.95, and 0.99; and lower one-sided limits on the reliability of 0.85 to 0.9145. The equations used to develop the charts are derived, and two examples of their application are given.
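Under normal stress and strength distributions, the point estimate of reliability behind such charts is R = Φ(K) with K = (μ_S − μ_s)/√(σ_S² + σ_s²); a small sketch (the charts themselves then supply the lower confidence limit R_L1 for a given n_e and confidence level, which this point estimate does not):

```python
import math

def reliability_normal(mu_S, sd_S, mu_s, sd_s):
    # R = P(strength S > stress s) = Phi(K), with
    # K = (mu_S - mu_s) / sqrt(sd_S^2 + sd_s^2)
    k = (mu_S - mu_s) / math.hypot(sd_S, sd_s)
    return 0.5 * (1 + math.erf(k / math.sqrt(2)))

# Hypothetical strength N(50, 4^2) vs. stress N(38, 3^2): K = 2.4
print(reliability_normal(50, 4, 38, 3))  # about 0.992
```

The values of mu and sd in the example are hypothetical, chosen only to show how K indexes the charts.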


2017 ◽  
Author(s):  
Gjalt - Jorn Ygram Peters

The experimental method is one of the staple methodological tools of the scientific method, and as such is prevalent in the psychological literature. It relies on the process of randomization to create equivalent groups. However, this procedure requires sufficiently large samples to succeed. In the current paper, we introduce tools that are based on the sampling distribution of Cohen’s d and that enable computing the likelihood that randomization succeeded in creating equivalent groups, as well as the sample size required to achieve a desired likelihood of randomization success. The required sample sizes are considerable, and to illustrate this, we compute the likelihood of randomization failure using data from the Reproducibility Project: Psychology. It is shown that many original studies, but also many replications, likely failed to create equivalent groups. For the replications, the mean likelihood of randomization failure was 44.54% (95% confidence interval [35.03%; 54.05%]) in the most liberal scenario, and 100% in the most conservative scenario. This means that many studies were in fact not experiments: the observed effects were at best conditional upon the values of unknown confounders, and at worst biased. In any case, replication is unlikely when the randomization procedure failed to generate equivalent groups in either the original study or the replication. The consequence is that researchers in psychology, but also the funders of research in psychology, will have to get used to conducting considerably larger studies if they are to build a strong evidence base.
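The core phenomenon can be reproduced by simulation. The sketch below estimates how often pure randomization leaves a standardized group difference larger than some threshold when there is no true effect (the threshold δ = 0.2, a conventionally "small" effect, is an assumption for illustration, not the paper's criterion):

```python
import random
import statistics

def p_randomization_failure(n_per_group, delta=0.2, sims=2000,
                            rng=random.Random(7)):
    # How often does pure randomization leave |Cohen's d| > delta
    # between two groups drawn from the SAME population?
    fails = 0
    for _ in range(sims):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        sp = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
        d = (statistics.fmean(a) - statistics.fmean(b)) / sp
        fails += abs(d) > delta
    return fails / sims

print(p_randomization_failure(20))   # substantial for n = 20 per group
print(p_randomization_failure(200))  # shrinks with larger samples
```

Since d has standard error roughly √(2/n) under the null, group equivalence at this threshold requires per-group samples far larger than those typical in the literature, which is the paper's point.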


2020 ◽  
Vol 42 (1) ◽  
Author(s):  
Adroaldo Guimarães Rossetti ◽  
Francisco das Chagas Vidal Neto ◽  
Levi de Moura Barros

Abstract The aim of this work was to estimate sample sizes to assist the genetic improvement of the cashew tree (Anacardium occidentale L.). Stratified sampling, comprising five strata (S5, S4, S3, S2, and S1) of five cashew clones (BRS 274, BRS 275, BRS 226, BRS 189 and CCP 76), was effective for estimating the different sample sizes of the nut. The sample size for each clone depends on the nut-weight variance, the margin of error B permitted in the estimates, and the desired precision of the results: sample size increases with clone variance, with a smaller permitted margin of error B, and with greater desired precision. These clones required different sample sizes for a morphological study of the nuts. Larger nuts require larger samples for the same margin of error B. For an error B of 0.2 g, the sample sizes for clones S5, S4 and S3 were n5 = 84, n4 = 49 and n3 = 37 nuts. For clones BRS 274 (S5) and BRS 275 (S4), with better nut classification, the mean weights were 16.79 and 12.78 g, respectively. Clones BRS 189 (S2) and CCP 76 (S1), with smaller nuts, have smaller variances, s2² = 0.7638 and s1² = 1.0712, with mean weights of 8.29 and 7.81 g, respectively.
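The classical sample-size formula for estimating a mean with margin of error B is n = z²s²/B²; a small sketch (whether the study used exactly this form, or a t-based or finite-population variant, is not stated in the abstract):

```python
import math

def sample_size_for_mean(s2: float, B: float, z: float = 1.96) -> int:
    # n = z^2 * s^2 / B^2, rounded up: observations needed to estimate
    # a mean with margin of error B at roughly 95% confidence
    return math.ceil(z ** 2 * s2 / B ** 2)

# With the reported variances of the small-nut clones and B = 0.2 g:
print(sample_size_for_mean(0.7638, 0.2))  # BRS 189 (S2)
print(sample_size_for_mean(1.0712, 0.2))  # CCP 76 (S1)
```

The formula makes the abstract's qualitative claims concrete: n grows linearly with the variance s² and quadratically as B shrinks.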


Methodology ◽  
2014 ◽  
Vol 10 (1) ◽  
pp. 1-11 ◽  
Author(s):  
Bethany A. Bell ◽  
Grant B. Morgan ◽  
Jason A. Schoeneberger ◽  
Jeffrey D. Kromrey ◽  
John M. Ferron

Whereas general sample size guidelines have been suggested for estimating multilevel models, they generalize to only a relatively limited number of data conditions and model structures, which rarely match those faced by the applied researcher. In an effort to expand our understanding of two-level multilevel models under less-than-ideal conditions, Monte Carlo methods, through SAS/IML, were used to examine model convergence rates, parameter point estimates (statistical bias), parameter interval estimates (confidence interval accuracy and precision), and both Type I error control and statistical power of tests associated with the fixed effects from linear two-level models estimated with PROC MIXED. These outcomes were analyzed as a function of: (a) level-1 sample size, (b) level-2 sample size, (c) intercept variance, (d) slope variance, (e) collinearity, and (f) model complexity. Bias was minimal across nearly all conditions simulated. The 95% confidence interval coverage and Type I error rate tended to be slightly conservative. Statistical power was related to sample sizes and the level of the fixed effects; higher power was observed with larger sample sizes and with level-1 fixed effects.
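One replicate of such a simulation condition can be sketched as follows. Parameter names and values here are illustrative, and the study's actual design (collinearity manipulations, model complexity, estimation in PROC MIXED) is considerably richer:

```python
import random

def simulate_two_level(n2, n1, gamma00=0.0, gamma10=0.5,
                       tau0=0.25, tau1=0.10, sigma2=1.0,
                       rng=random.Random(3)):
    # One replicate of a random-intercept, random-slope two-level
    # dataset: n2 level-2 groups, n1 level-1 units per group
    rows = []
    for j in range(n2):
        u0 = rng.gauss(0, tau0 ** 0.5)   # group intercept deviation
        u1 = rng.gauss(0, tau1 ** 0.5)   # group slope deviation
        for _ in range(n1):
            x = rng.gauss(0, 1)
            y = (gamma00 + u0) + (gamma10 + u1) * x + rng.gauss(0, sigma2 ** 0.5)
            rows.append((j, x, y))
    return rows

data = simulate_two_level(n2=30, n1=10)
print(len(data))  # 300 observations = 30 groups x 10 units
```

Each replicate would then be fit with a mixed model and the fixed-effect estimates, intervals, and tests accumulated across replicates to form the outcomes the study reports.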

