On the validity of Fisher's z test when applied to an actual example of non-normal data. (With five text-figures.)

1933 ◽  
Vol 23 (1) ◽  
pp. 6-17 ◽  
Author(s):  
T. Eden ◽  
F. Yates

Summary. 1. Previous work on the validity of the t and z tests on non-normal distributions is described. The question as to whether these tests, which are all on small samples from theoretical distributions, are really apposite is discussed. 2. The necessity of making a practical test with actual data which shall comply with the usual conditions obtaining in agricultural experiments is urged. 3. A practical test has been made on a skew distribution obtained from the observation of 256 height measurements on wheat. The distribution of the values of R. A. Fisher's z from a thousand random samples has been obtained and found to agree satisfactorily with the theoretical distribution.
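The kind of sampling check described in point 3 can be sketched as follows. This is a minimal illustration, not the authors' procedure: the 256-value population is synthetic (lognormal, standing in for the wheat heights), and the one-way layout of 4 groups of 4, the seeds, and all function names are assumptions. The value 3.49 is the tabulated upper 5% point of F with 3 and 12 degrees of freedom, and Fisher's z is taken as half the natural log of the variance ratio.

```python
import math
import random
import statistics


def fisher_z(groups):
    """Fisher's z = 0.5 * ln(F) for a one-way layout with equal group sizes."""
    k = len(groups)
    m = len(groups[0])
    grand = statistics.fmean([x for g in groups for x in g])
    means = [statistics.fmean(g) for g in groups]
    ms_between = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    ms_within = sum(
        (x - mu) ** 2 for g, mu in zip(groups, means) for x in g
    ) / (k * (m - 1))
    return 0.5 * math.log(ms_between / ms_within)


def simulate_z(population, n_samples=1000, k=4, m=4, seed=1933):
    """Draw random samples from the population and return the z value of each."""
    rng = random.Random(seed)
    zs = []
    for _ in range(n_samples):
        draw = rng.sample(population, k * m)  # no real group effect exists
        groups = [draw[i * m:(i + 1) * m] for i in range(k)]
        zs.append(fisher_z(groups))
    return zs


# Hypothetical skew population of 256 values (NOT the wheat data).
pop_rng = random.Random(0)
population = [pop_rng.lognormvariate(0.0, 0.5) for _ in range(256)]

Z_CRIT = 0.5 * math.log(3.49)  # z equivalent of the 5% point of F(3, 12)
zs = simulate_z(population)
exceed_rate = sum(z > Z_CRIT for z in zs) / len(zs)
print(f"share of samples beyond the 5% point: {exceed_rate:.3f}")
```

If the z test tolerates the skewness, the printed share should lie close to 0.05, which is the sense in which an empirical distribution can "agree satisfactorily" with the theoretical one.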

2016 ◽  
Vol 41 (5) ◽  
pp. 472-505 ◽  
Author(s):  
Elizabeth Tipton ◽  
Kelly Hallberg ◽  
Larry V. Hedges ◽  
Wendy Chan

Background: Policy makers and researchers are frequently interested in understanding how effective a particular intervention may be for a specific population. One approach is to assess the degree of similarity between the sample in an experiment and the population. Another approach is to combine information from the experiment and the population to estimate the population average treatment effect (PATE). Method: Several methods for assessing the similarity between a sample and population currently exist as well as methods estimating the PATE. In this article, we investigate properties of six of these methods and statistics in the small sample sizes common in education research (i.e., 10–70 sites), evaluating the utility of rules of thumb developed from observational studies in the generalization case. Result: In small random samples, large differences between the sample and population can arise simply by chance and many of the statistics commonly used in generalization are a function of both sample size and the number of covariates being compared. The rules of thumb developed in observational studies (which are commonly applied in generalization) are much too conservative given the small sample sizes found in generalization. Conclusion: This article implies that sharp inferences to large populations from small experiments are difficult even with probability sampling. Features of random samples should be kept in mind when evaluating the extent to which results from experiments conducted on nonrandom samples might generalize.


Author(s):  
Agus Budi Santosa ◽  
Nur Iriawan ◽  
Setiawan Setiawan ◽  
Mohammad Dokhi

The assumption of normally distributed errors in regression models is often questioned, especially when outliers make the data asymmetric. One way to address this without transforming the data is to use a skew distribution. Such distributions are important and applicable in various fields of science, such as finance, economics, actuarial science, medicine, biology, and investment. The Skew Normal distribution has proven convenient for calculating bias in data with asymmetric behavior. This study models SUR with Skew Normal errors using a Bayesian approach, applied to East Java GRDP data. It compares two types of models: models with Normal distributed errors and models with Skew Normal distributed errors. The Bayesian parameter estimates show that the SUR Skew Normal model is more suitable for modeling East Java GRDP than the Normal error model, based on its smaller Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) values.


1965 ◽  
Vol 17 (1) ◽  
pp. 79-90 ◽  
Author(s):  
Wayne Lee ◽  
Mary Janke

40 Ss categorized random samples from one of two normal distributions and then received feedback. The over-all, molar probabilities of samples for the two distributions were .35 and .65. Problem-solving or Gambling instructions, and distributions on a Dot position or a Number continuum, were combined factorially with sex. There were no main instructional effects. Performance was superior on the position continuum (p < .01). Over-all performance resembled micromatching of probabilities at each continuum position, but some Ss responded independently of sample position. The molar probability of response “1” for the Dot group was less (.295) than the molar reinforcement level (p < .001). There were no main sex effects. Distance between centers was overestimated for Dot but not for the Number continuum.


1996 ◽  
Vol 7 (1) ◽  
pp. 166-168
Author(s):  
D E Uehlinger

It seems reasonable to postulate that an observed relationship between dialysis dose and protein intake reflects improved nutrition with correction of uremic symptoms. This article demonstrates a purely statistical association between two parameters in peritoneal dialysis: protein intake, expressed as normalized protein catabolic rate (NPCR), and dialysis dose, quantified as KT/V [the product of urea clearance (K) and treatment time (T) divided by the distribution volume of urea (V)]. The use of random samples from independent normal distributions of K, V, and urea generation rate (G) for the calculation of KT/V and NPCR reveals that a statistical association is introduced when both protein intake and dialysis dose are normalized proportional to a common distribution V. Not the normalizing parameter V per se, but rather the variability of V accounts for the introduction of this statistical artifact.
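The artifact described here can be reproduced with a short simulation. The sketch below rests on stated assumptions: the means and standard deviations of K, V, and G are hypothetical, T is fixed at 7, and G/V is used as a simple stand-in for the V-normalized protein intake rather than the actual NPCR formula, which the abstract does not give.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ratio_correlation(sd_v, n=5000, seed=1996):
    """Correlation of KT/V with G/V when K, V, G are drawn independently."""
    rng = random.Random(seed)
    t = 7.0  # treatment time, hypothetical units
    ktv, g_over_v = [], []
    for _ in range(n):
        k = rng.gauss(10.0, 1.5)   # urea clearance (hypothetical values)
        v = rng.gauss(40.0, sd_v)  # urea distribution volume
        g = rng.gauss(6.0, 1.0)    # urea generation rate
        ktv.append(k * t / v)
        g_over_v.append(g / v)     # stand-in for V-normalized protein intake
    return pearson(ktv, g_over_v)


r_variable_v = ratio_correlation(sd_v=8.0)  # V varies between patients
r_constant_v = ratio_correlation(sd_v=0.0)  # V identical for every patient
print(f"V variable: r = {r_variable_v:.2f}")
print(f"V constant: r = {r_constant_v:.2f}")
```

With V varying, the two ratios correlate clearly even though K and G are independent by construction. With V held constant the correlation stays near zero, which matches the claim that the variability of V, not the normalization itself, produces the association.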


1998 ◽  
Vol 38 (2) ◽  
pp. 157-162 ◽  
Author(s):  
Rod G Gullberg ◽  
Barry K Logan

Random samples from normal distributions are an important assumption for many statistical methods. The present study evaluates this assumption with regard to quantitative breath alcohol analyses. Eight individuals (six male and two female) consumed alcoholic beverages and subsequently provided replicate (n ranging from 22 to 69) breath samples to an infrared breath alcohol instrument within short time intervals. The serially collected data were treated with several descriptive and inferential methods. Descriptive results among the eight individuals included: mean 0.0420–0.1175 g/210L, SD 0.0008–0.0045 g/210L and CV: 1.9%–4.7%. Statistical tests for normality showed seven of the distributions to be reasonably normal (p ≥ 0.25) and the other marginal (p = 0.051). A test for runs about the median showed random results (p ≥ 0.10) for four individuals and non-random (p ≤ 0.01) for the other four. The results suggest an individual's breath alcohol measurement, when appropriately collected and analysed, should be considered a random sample from a normal within-subject distribution. The existing variability in breath alcohol analysis, due largely to biological and sampling considerations, is acceptably minimized to warrant forensic application.
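A test for runs about the median, as mentioned in this abstract, can be sketched as below. This is one common large-sample version (normal approximation to the number of runs) and is not necessarily the variant the authors used. The replicate readings are invented for illustration, and the function name is an assumption.

```python
import math
import statistics


def runs_test_about_median(values):
    """Return (runs, z, two-sided p) using the normal approximation.

    Values equal to the median are dropped before counting runs.
    Assumes at least one value lies on each side of the median.
    """
    med = statistics.median(values)
    above = [v > med for v in values if v != med]
    n1 = sum(above)        # observations above the median
    n2 = len(above) - n1   # observations below the median
    n = n1 + n2
    runs = 1 + sum(a != b for a, b in zip(above, above[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return runs, z, p


# Hypothetical replicate readings (g/210L) that drift steadily downward,
# as they might if elimination outpaced the sampling interval.
drifting = [0.100 - 0.0005 * i for i in range(40)]
runs, z, p = runs_test_about_median(drifting)
print(runs, round(z, 2), p)
```

A steady drift places all the high readings first and all the low readings last, so the series has only two runs and the test flags it as non-random; a series with no serial structure would show a run count near the expected value.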


2014 ◽  
Vol 22 (1) ◽  
pp. 31-44 ◽  
Author(s):  
Kai Arzheimer ◽  
Jocelyn Evans

In this article, we propose a polling accuracy measure for multi-party elections based on a generalization of Martin, Traugott, and Kennedy's two-party predictive accuracy index. Treating polls as random samples of a voting population, we first estimate an intercept only multinomial logit model to provide proportionate odds measures of each party's share of the vote, and thereby both unweighted and weighted averages of these values as a summary index for poll accuracy. We then propose measures for significance testing, and run a series of simulations to assess possible bias from the resulting folded normal distribution across different sample sizes, finding that bias is small even for polls with small samples. We apply our measure to the 2012 French presidential election polls to demonstrate its applicability in tracking overall polling performance across time and polling organizations. Finally, we demonstrate the practical value of our measure by using it as a dependent variable in an explanatory model of polling accuracy, testing the different possible sources of bias in the French data.


Crisis ◽  
2013 ◽  
Vol 34 (6) ◽  
pp. 434-437 ◽  
Author(s):  
Donald W. MacKenzie

Background: Suicide clusters at Cornell University and the Massachusetts Institute of Technology (MIT) prompted popular and expert speculation of suicide contagion. However, some clustering is to be expected in any random process. Aim: This work tested whether suicide clusters at these two universities differed significantly from those expected under a homogeneous Poisson process, in which suicides occur randomly and independently of one another. Method: Suicide dates were collected for MIT and Cornell for 1990–2012. The Anderson-Darling statistic was used to test the goodness-of-fit of the intervals between suicides to the distribution expected under the Poisson process. Results: Suicides at MIT were consistent with the homogeneous Poisson process, while those at Cornell showed clustering inconsistent with such a process (p = .05). Conclusions: The Anderson-Darling test provides a statistically powerful means to identify suicide clustering in small samples. Practitioners can use this method to test for clustering in relevant communities. The difference in clustering behavior between the two institutions suggests that more institutions should be studied to determine the prevalence of suicide clustering in universities and its causes.
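Under a homogeneous Poisson process the intervals between events follow an exponential distribution, so the method amounts to an Anderson-Darling goodness-of-fit test against that distribution. The sketch below is an illustration under stated assumptions, not the author's implementation: the intervals are invented (they are not the MIT or Cornell data), the exponential mean is estimated from the sample, the p-value comes from a Monte Carlo simulation rather than tabulated critical values, and all function names are hypothetical. Intervals are assumed to be strictly positive.

```python
import math
import random


def ad_exponential(intervals):
    """Anderson-Darling A^2 against an exponential with the sample mean."""
    x = sorted(intervals)
    n = len(x)
    mean = sum(x) / n
    log_cdf = [math.log(-math.expm1(-xi / mean)) for xi in x]  # ln F(x)
    log_sf = [-xi / mean for xi in x]                           # ln(1 - F(x))
    s = sum(
        (2 * i + 1) * (log_cdf[i] + log_sf[n - 1 - i]) for i in range(n)
    )
    return -n - s / n


def monte_carlo_p(intervals, n_sim=2000, seed=2013):
    """Share of simulated exponential samples with A^2 at least as large."""
    rng = random.Random(seed)
    n = len(intervals)
    observed = ad_exponential(intervals)
    hits = sum(
        ad_exponential([rng.expovariate(1.0) for _ in range(n)]) >= observed
        for _ in range(n_sim)
    )
    return observed, (hits + 1) / (n_sim + 1)


# Hypothetical intervals in days: tight clusters separated by long gaps.
clustered = [5, 8, 3, 6, 4, 400, 7, 2, 5, 380, 6, 4, 3, 420, 5, 6]
a2, p_value = monte_carlo_p(clustered)
print(f"A^2 = {a2:.2f}, Monte Carlo p = {p_value:.4f}")
```

Because the exponential is a scale family, the null distribution of the statistic with an estimated mean does not depend on the true rate, which is why simulating from a unit-rate exponential is sufficient here.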


2011 ◽  
Vol 27 (2) ◽  
pp. 127-132 ◽  
Author(s):  
Heide Glaesmer ◽  
Gesine Grande ◽  
Elmar Braehler ◽  
Marcus Roth

The Satisfaction with Life Scale (SWLS) is the most commonly used measure for life satisfaction. Although there are numerous studies confirming factorial validity, most studies on dimensionality are based on small samples. A controversial debate continues on the factorial invariance across different subgroups. The present study aimed to test psychometric properties, factorial structure, factorial invariance across age and gender, and to deliver population-based norms for the German general population from a large cross-sectional sample of 2519 subjects. Confirmatory factor analyses supported that the scale is one-factorial, even though indications of inhomogeneity of the scale have been detected. Both findings show invariance across the seven age groups and both genders. As indicators of the convergent validity, a positive correlation with social support and a negative correlation with depressiveness were shown. Population-based norms are provided to support the application in the context of individual diagnostics.

