Analyzing designed experiments: Should we report standard deviations or standard errors of the mean or standard errors of the difference or what?

2019 ◽  
Vol 56 (2) ◽  
pp. 312-319
Author(s):  
Marcin Kozak ◽  
Hans-Peter Piepho

Abstract: ANOVA, one of the most common statistical methods applied in agronomy, offers a variety of results we can report when analyzing designed experiments. The focus, of course, is on treatment means, but what should we report to characterize precision? Should we choose treatment standard deviations (SDs), standard errors of the mean, or standard errors of the difference (SEDs)? We discuss why treatment raw SDs should not be reported as the result of ANOVA, and point out that most of the time it is SEDs that should be provided.
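To make the distinction between these precision measures concrete, here is a minimal sketch for the simplest case only: a completely randomized design with equal replication, which is an assumption of this example and not a design stated in the abstract. The function name `anova_precision` is hypothetical. It pools the residual variance across treatments and derives the standard error of a mean and the standard error of a difference from it.

```python
import math
from statistics import mean


def anova_precision(groups):
    """Pooled precision measures for a balanced one-way layout.

    groups: list of lists, one list of observations per treatment,
            all of the same length r (assumed balanced design).
    Returns (mse, sem, sed).
    """
    r = len(groups[0])
    if r < 2 or any(len(g) != r for g in groups):
        raise ValueError("this sketch needs equal replication with r >= 2")
    t = len(groups)

    # Residual sum of squares: deviations from each treatment's own mean.
    ss_resid = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_resid = t * (r - 1)
    mse = ss_resid / df_resid          # pooled residual variance

    sem = math.sqrt(mse / r)           # standard error of a treatment mean
    sed = math.sqrt(2 * mse / r)       # standard error of a difference of two means
    return mse, sem, sed


yields = [[10, 12, 14], [20, 22, 24], [30, 31, 35]]
mse, sem, sed = anova_precision(yields)
print(f"MSE={mse:.3f}  SEM={sem:.3f}  SED={sed:.3f}")
```

In this balanced case the SED is simply the SEM multiplied by the square root of 2; with unequal replication or more complex designs the two are no longer related so simply, and the pooled quantities would have to come from the fitted model.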


1981 ◽  
Vol 41 (4) ◽  
pp. 1033-1038 ◽  
Author(s):  
Lewis R. Aiken

Formulas are given for computing, from the responses of returnees alone, the maximum and minimum values between which the mean response to a survey item must fall in the total sample. Expressions for the standard errors of these maximum and minimum mean values are provided. Both the difference between the maximum and minimum means and the magnitudes of their standard errors vary inversely with the proportion of returns. It is also shown that the extent to which the responses of returnees to a survey item are representative of the responses of the total sample is a function of sample size, proportion of returns, and proportion of returnees responding to the item in a specified direction. Formulas are derived for computing (1) the probability that the difference between the proportion of returnees who respond in a specified direction and the proportion of the total sample responding in that direction will be equal to or greater than an acceptable value, and (2) the minimum proportion of returns required to be fairly confident that the responses of returnees are representative of the responses of the total sample.
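The abstract does not reproduce the formulas themselves. As an illustration of the general idea only, the sketch below bounds the total-sample mean by assuming that the item is scored on a scale with known lower and upper limits and that every non-returnee could have answered at either extreme. The function `total_mean_bounds` and its parameters are hypothetical and are not taken from the paper.

```python
def total_mean_bounds(returnee_mean, return_rate, scale_min, scale_max):
    """Bounds on the total-sample mean of a bounded survey item.

    returnee_mean: mean response among those who returned the survey
    return_rate:   proportion of the total sample that returned it (0 < p <= 1)
    scale_min/max: lowest and highest possible item scores (assumed known)
    """
    if not 0 < return_rate <= 1:
        raise ValueError("return_rate must be in (0, 1]")
    observed_part = return_rate * returnee_mean
    missing_share = 1 - return_rate
    # Worst cases: all non-returnees at the bottom, or all at the top.
    lowest = observed_part + missing_share * scale_min
    highest = observed_part + missing_share * scale_max
    return lowest, highest


low, high = total_mean_bounds(returnee_mean=3.8, return_rate=0.6,
                              scale_min=1, scale_max=5)
print(f"total-sample mean lies in [{low:.2f}, {high:.2f}]")
```

Under these assumptions the width of the interval is (1 - return_rate) times the scale range, so it narrows as the proportion of returns rises, which is in line with the inverse relationship the abstract describes.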


2016 ◽  
Author(s):  
Jordan Anaya

GRIMMER (Granularity-Related Inconsistency of Means Mapped to Error Repeats) builds upon the GRIM test and allows for testing whether reported measures of variability are mathematically possible. GRIMMER relies upon the statistical phenomenon that variances display a simple repetitive pattern when the data is discrete, i.e. granular. This observation allows for the generation of an algorithm that can quickly identify whether a reported statistic of any size or precision is consistent with the stated sample size and granularity. My implementation of the test is available at PrePubMed (http://www.prepubmed.org/grimmer) and currently allows for testing variances, standard deviations, and standard errors for integer data. It is possible to extend the test to other measures of variability such as deviation from the mean, or apply the test to non-integer data such as data reported to halves or tenths. The ability of the test to identify inconsistent statistics relies upon four factors: (1) the sample size; (2) the granularity of the data; (3) the precision (number of decimals) of the reported statistic; and (4) the size of the standard deviation or standard error (but not the variance). The test is most powerful when the sample size is small, the granularity is large, the statistic is reported to a large number of decimal places, and the standard deviation or standard error is small (variance is immune to size considerations). This test has important implications for any field that routinely reports statistics for granular data to at least two decimal places because it can help identify errors in publications, and should be used by journals during their initial screen of new submissions. The errors detected can be the result of anything from something as innocent as a typo or rounding error to large statistical mistakes or unfortunately even fraud. In this report I describe the mathematical foundations of the GRIMMER test and the algorithm I use to implement it.
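The granularity argument can be illustrated with a simplified necessary-condition check. This is a sketch under stated assumptions, not the author's algorithm or the PrePubMed implementation: it assumes integer data, a known integer sum of the observations (for example recovered from the reported mean), and a sample standard deviation computed with n - 1 in the denominator. The function name `sd_is_possible` is hypothetical. It relies on two facts: the sum of squares of integers is itself an integer, and it has the same parity as the sum of the integers.

```python
import math


def sd_is_possible(n, total, sd_reported, decimals=2):
    """Return False if no integer sample can produce the reported SD.

    n:           sample size
    total:       integer sum of the n observations (assumed known)
    sd_reported: sample SD as printed, rounded to `decimals` places
    A True result means only that this check found no inconsistency.
    """
    half_unit = 0.5 * 10 ** (-decimals)
    sd_low = max(sd_reported - half_unit, 0.0)
    sd_high = sd_reported + half_unit

    # variance = (sum_sq - total**2 / n) / (n - 1), solved for sum_sq
    base = total ** 2 / n
    ss_low = (n - 1) * sd_low ** 2 + base
    ss_high = (n - 1) * sd_high ** 2 + base

    first = math.ceil(ss_low - 1e-9)
    last = math.floor(ss_high + 1e-9)
    for sum_sq in range(first, last + 1):
        # x*x and x have the same parity, so sum_sq and total must match.
        if sum_sq % 2 == total % 2:
            return True
    return False


# The sample 1, 2, 3, 4, 5 has n = 5, sum 15 and SD 1.5811...
print(sd_is_possible(5, 15, 1.58))
print(sd_is_possible(5, 15, 1.50))
```

The window of admissible sums of squares widens with larger n, larger SDs, and fewer reported decimals, which is consistent with the abstract's statement that the test is most powerful for small samples, small SDs, and high reporting precision.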


1986 ◽  
Vol 69 (3) ◽  
pp. 527-531 ◽  
Author(s):  
Roy E Ginn ◽  
Vernal S Packard ◽  
Terrance L Fox ◽  
◽  
E Arnold ◽  
...  

Abstract: Eleven laboratories participated in a collaborative study to compare the dry rehydratable film (Petrifilm® SM and Petrifilm® VRB) methods, respectively, to the standard plate count (SPC) and violet red bile agar (VRBA) standard methods for estimation of total bacteria and coliform counts in raw and homogenized pasteurized milk. Each laboratory analyzed 16 samples (8 different samples in blind duplicate) for total count by both the SPC and Petrifilm SM methods. A second set of 16 samples was analyzed by the VRBA and Petrifilm VRB methods. The repeatability standard deviations (the square root of the between-replicates variance) of the SPC, Petrifilm SM, VRBA, and Petrifilm VRB methods were 0.05104, 0.0444, 0.14606, and 0.13806, respectively; the reproducibility standard deviations were 0.7197, 0.06380, 0.15326, and 0.13806, respectively. The difference between the mean log10 SPC and the mean log10 Petrifilm SM results was 0.027. For the VRBA and Petrifilm VRB methods, the mean log10 difference was 0.013. These results generally indicate the suitability of the dry rehydratable film methods as alternatives to the SPC and VRBA methods for milk samples. The methods have been adopted official first action.


The test for the significance of the difference of two means, when the standard errors of one observation are unequal, has been the subject of much recent discussion (Fisher 1935; Bartlett 1936; Welch 1937; Daniels 1938), but the appropriate treatment remains in doubt. A significance test for the difference of two means, on my principles, has already been given (Jeffreys 1937a), but is not altogether satisfactory, for two reasons. The result was, for large numbers of observations,

$$K = \frac{P(q \mid \theta H)}{P(\sim q \mid \theta H)} = \left( \frac{2}{\pi} \, \frac{\sigma^2 + \tau^2}{\sigma^2/m + \tau^2/n} \right)^{1/2} \exp\left( -\frac{1}{2} \, \frac{(\bar{x} - \bar{y})^2}{\sigma^2/m + \tau^2/n} \right) \qquad (1)$$

where $\bar{x}$ and $\bar{y}$ are the means in the two series, $m$ and $n$ the numbers of observations, and $\sigma$ and $\tau$ the (estimated) standard errors of one observation. The most serious practical defect of this formula is that the numbers of observations are supposed large enough for the uncertainty of the standard errors to be neglected. This was due to a premature approximation and could be corrected easily; the resulting change would be similar to the difference between the normal law and "Student's" formula, as has already been shown in other cases. There is, however, another anomaly, less serious in practice, but of theoretical importance. We notice that if $\tau = 0$, when the observations in the second series are exact, the first factor reduces to $(2m/\pi)^{1/2}$, which is the usual form for the test of one new parameter, and is satisfactory. But if $\sigma = \tau$, and $n$ is very large, so that the uncertainty of the true value in the second series is again negligible, we should again expect the outside factor to reduce to $(2m/\pi)^{1/2}$, since we are again comparing the mean of the first series with an accurate value. Actually it reduces to $(4m/\pi)^{1/2}$. This is not of much practical importance, since if formula (1) gives $K = 1$, the correct formula would give $K = 1/\sqrt{2}$, and the result would still be indecisive, though slightly in favour of $\sim q$.
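The two limiting cases discussed above can be checked numerically. The sketch below assumes that formula (1) reads K = sqrt((2/pi) * (sigma^2 + tau^2) / (sigma^2/m + tau^2/n)) * exp(-(xbar - ybar)^2 / (2 * (sigma^2/m + tau^2/n))), a reading chosen because it reproduces both limits quoted in the text. The function name `jeffreys_k` and the argument name `tau` for the second series' standard error are choices made for this illustration.

```python
import math


def jeffreys_k(xbar, ybar, m, n, sigma, tau):
    """Large-sample K of formula (1), under the reading assumed above."""
    var_diff = sigma ** 2 / m + tau ** 2 / n       # variance of xbar - ybar
    outside = math.sqrt((2 / math.pi) * (sigma ** 2 + tau ** 2) / var_diff)
    return outside * math.exp(-0.5 * (xbar - ybar) ** 2 / var_diff)


m = 10

# Case 1: second series exact (tau = 0); outside factor should be sqrt(2m/pi).
k_exact = jeffreys_k(0.0, 0.0, m, n=5, sigma=1.0, tau=0.0)
print(k_exact, math.sqrt(2 * m / math.pi))

# Case 2: sigma = tau with n very large; outside factor tends to sqrt(4m/pi),
# larger than case 1 by a factor of sqrt(2): the anomaly noted in the text.
k_large_n = jeffreys_k(0.0, 0.0, m, n=10 ** 12, sigma=1.0, tau=1.0)
print(k_large_n, math.sqrt(4 * m / math.pi))
```

Setting the two means equal makes the exponential factor 1, so the printed values isolate the outside factor that the passage is concerned with.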


1991 ◽  
Vol 31 (3) ◽  
pp. 393
Author(s):  
NA Maier ◽  
AB Frensham ◽  
KSR Chapman ◽  
CMJ Williams

Total tuber yields were compared for inner and outer (guard) rows from 4 phosphorus (P) and 3 nitrogen (N) field experiments conducted during 1985-86 in South Australia, and from 5 N and 2 potassium (K) field experiments conducted during 1985-86 and 1987-88 in Tasmania. All fertiliser treatments were banded along the rows, either at planting or part at planting and the remainder sidedressed after emergence. The inter-row spacings were in the range 76-86 cm and the cultivars used were Kennebec, Coliban and Russet Burbank. Analysis showed that at only 1 of the 14 sites (site 6 in South Australia) was the mean total tuber yield for the inner 2 rows significantly (P<0.01) less than the mean total tuber yield for all 4 rows. However, the difference was small (0.8 t/ha or 1.9%) and of little practical importance. The relationships between mean (± s.e.) total tuber yield and rate (kg/ha) of applied nutrient (0-240 P, 0-320 N, 0-400 K) for inner and guard rows showed that differences between means were small and usually within standard error ranges at all sites. There were no consistent differences in the magnitudes of the standard errors of the means for inner and guard rows for all rates and types of nutrient applied. No significant cross-feeding occurred in these fertiliser experiments, which suggests that omission of guard rows from experiments where the fertiliser treatments are applied along the rows should not result in serious errors of interpretation of tuber yield response.


1955 ◽  
Vol 8 (1) ◽  
pp. 54 ◽  
Author(s):  
RN Bracewell

Let a two-dimensional survey with a Gaussian aerial beam establish values at intervals of standard deviations. Then the correction for aerial smoothing is simply calculated as the difference between the value to be corrected and the mean of the neighbouring four values.
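The rule stated above amounts to a five-point stencil on the survey grid. The following is a minimal sketch, assuming the survey values are already stored on a square grid sampled at the interval the abstract refers to, and handling interior points only. The function name `smoothing_correction` is hypothetical, and the sketch computes the correction term without asserting how it should then be applied.

```python
def smoothing_correction(grid, i, j):
    """Correction at interior point (i, j) of a 2-D list of survey values.

    Returns the value at (i, j) minus the mean of its four
    nearest neighbours (up, down, left, right).
    """
    rows, cols = len(grid), len(grid[0])
    if not (0 < i < rows - 1 and 0 < j < cols - 1):
        raise IndexError("this sketch only handles interior grid points")
    neighbour_mean = (grid[i - 1][j] + grid[i + 1][j]
                      + grid[i][j - 1] + grid[i][j + 1]) / 4
    return grid[i][j] - neighbour_mean


survey = [
    [0, 1, 0],
    [1, 4, 1],
    [0, 1, 0],
]
print(smoothing_correction(survey, 1, 1))
```

Edge and corner points lack a full set of four neighbours, so a practical version would need a boundary rule that the abstract does not specify.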


2016 ◽  
Vol 33 (S1) ◽  
pp. S597-S597
Author(s):  
R. Alsalman

Introduction: The Beck Scale for Suicide Ideation (BSS) has consistently been regarded as a strong tool for measuring cognitive and somatic aspects of suicide ideation symptomatology in both clinical and non-clinical populations. To date, no study has examined the BSS among Kuwaiti college students.
Objective: The present study aims to identify the impact of gender (male/female) on suicide ideation.
Methods: The sample consisted of 584 undergraduate students (284 males and 300 females). The study applied the Beck Scale for Suicide Ideation (BSS) and the Suicide Ideation Questionnaire (SIQ).
Results: Table 1 gives descriptive statistics for the two standardized self-report measures; the means and standard deviations for these measures were within the expected ranges for college samples. The mean BSS score was 5.2 for males and 7.0 for females. The mean SIQ score was 11.3 for males and 13.7 for females.
Conclusion: The BSS revealed significant gender differences in scores, indicating that females obtained higher scores than males on suicide ideation, although the magnitude of the difference was small.
Table not available.
Disclosure of interest: The author has not supplied his declaration of competing interest.


2015 ◽  
Vol 2 (3) ◽  
Author(s):  
Dr. Shashi Kala Singh ◽  
Mrs. Pushpa Singh

The aim of the present study was to examine gender and religion differences in secularism. Participants were 100 school students from Ranchi town (50 boys and 50 girls) aged 13 to 16 years, all of middle socio-economic status. Respondents were given a secularism scale. Data were analyzed using means, standard deviations and the t-test. The mean score of male students was 31.53 and that of female students was 29.97. The difference between the means was not significant: boys and girls showed a similar level of secularism. The Christian group showed a significantly higher level of secularism than the Muslim group.


Author(s):  
N.N. Kobeniak

In recent decades, the prevalence of gastrointestinal diseases has increased, creating a pressing need for both therapeutic and surgical treatment. This raises the problem of finding new approaches and techniques for correcting these diseases and improving existing ones. Preclinical studies in this area are conducted exclusively on laboratory animals, and the morphological features of their organs are of great importance when compared with human morphology. The methodology used in the study included histological, morphometric and statistical techniques; biopsy samples of the caecum taken from 5 rabbits were investigated. For each trait studied we assessed whether its distribution was normal and calculated the mean values, standard errors and standard deviations. The significance of the difference between independent micrometric values for normally distributed traits was determined by Student's t-test. The paper describes the main morphological characteristics of the caecum in rabbits and compares the findings with similar structures of the human caecum. The caecum of rabbits, like that of humans, has four layers: mucous, submucosal, muscular and serous. The mucous membrane consists of the epithelial layer located on the basement membrane and the muscular plate and contains cellular elements. The submucosa is composed of loose fibrous connective tissue, which contains collagen and reticular fibres, elements of diffuse lymphoid tissue, blood vessels, and nerve endings. The muscular and serous membranes are quite similar to those of the human caecum. Thus, light microscopy has demonstrated that the morphology of the caecum in rabbits is similar to that of the human caecum.

