Explorations in statistics: the bootstrap

2009 ◽  
Vol 33 (4) ◽  
pp. 286-292 ◽  
Author(s):  
Douglas Curran-Everett

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fourth installment of Explorations in Statistics explores the bootstrap. The bootstrap gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic such as the sample mean. The appeal of the bootstrap is that we can use it to make an inference about some experimental result when the statistical theory is uncertain or even unknown. We can also use the bootstrap to assess how well the statistical theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.
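As a rough illustration of the idea in this abstract, the sketch below resamples a small data set with replacement and uses the spread of the resampled means as an empirical estimate of the standard error. It is a minimal sketch, not the article's own procedure: the data values, the function names `bootstrap_means` and `percentile_interval`, the resample count, and the simple percentile interval are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Return the means of n_resamples bootstrap resamples of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Each resample draws n observations from the sample, with replacement.
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, level=0.95):
    """Simple percentile interval taken from a list of bootstrap statistics."""
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1.0 - alpha) * len(ordered)) - 1]
    return lower, upper


# Hypothetical observations, invented for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4, 6.2, 4.9]

means = bootstrap_means(sample)
boot_se = statistics.stdev(means)                           # empirical standard error
theory_se = statistics.stdev(sample) / len(sample) ** 0.5   # s / sqrt(n)
lo, hi = percentile_interval(means)

print(f"bootstrap SE = {boot_se:.3f}, theoretical SE = {theory_se:.3f}")
print(f"95% percentile interval for the mean: ({lo:.2f}, {hi:.2f})")
```

Comparing `boot_se` with `theory_se` is one way to see how well the theoretical formula holds for a given sample.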

2009 ◽  
Vol 33 (2) ◽  
pp. 87-90 ◽  
Author(s):  
Douglas Curran-Everett

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This third installment of Explorations in Statistics investigates confidence intervals. A confidence interval is a range that we expect, with some level of confidence, to include the true value of a population parameter such as the mean. A confidence interval provides the same statistical information as the P value from a hypothesis test, but it circumvents the drawbacks of that hypothesis test. Even more important, a confidence interval focuses our attention on the scientific importance of some experimental result.
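The link between a confidence interval and a hypothesis test can be sketched in a few lines. The example below is an assumption-laden sketch rather than the article's method: it uses a normal (z) critical value, which is a large-sample approximation (a small sample like this one would ordinarily call for a Student t critical value), and the data and the helper names `mean_confidence_interval` and `is_consistent_with` are hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = fmean(sample)
    se = stdev(sample) / n ** 0.5                  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2.0)    # about 1.96 for level=0.95
    return mean - z * se, mean + z * se


def is_consistent_with(sample, hypothesized_mean, level=0.95):
    """True if the hypothesized mean lies inside the confidence interval."""
    lower, upper = mean_confidence_interval(sample, level)
    return lower <= hypothesized_mean <= upper


# Hypothetical observations, invented for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4, 6.2, 4.9]

lo, hi = mean_confidence_interval(sample)
lo99, hi99 = mean_confidence_interval(sample, level=0.99)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
print(f"99% CI: ({lo99:.2f}, {hi99:.2f})")
print("mean of 4.0 plausible at 95%?", is_consistent_with(sample, 4.0))
```

A hypothesized mean that falls outside the 95% interval is one that the corresponding two-sided test would reject at the 0.05 level, and the interval also shows how large the effect could plausibly be.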


2012 ◽  
Vol 36 (3) ◽  
pp. 181-187 ◽  
Author(s):  
Douglas Curran-Everett

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eighth installment of Explorations in Statistics explores permutation methods, empiric procedures we can use to assess an experimental result–to test a null hypothesis–when we are reluctant to trust statistical theory alone. Permutation methods operate on the observations–the data–we get from an experiment. A permutation procedure answers this question: out of all the possible ways we can rearrange the observations we got, in what proportion of those arrangements is the sample statistic we care about at least as extreme as the one we got? The answer to that question is the P value.
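The question posed in this abstract translates almost directly into code. The sketch below enumerates every way of splitting the pooled observations into two groups of the original sizes and counts the arrangements in which the difference in means is at least as extreme as the observed one. The choice of statistic (absolute difference in means), the function name `permutation_p_value`, and the data are assumptions made for illustration; exhaustive enumeration is practical only for small samples.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation P value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(fmean(group_a) - fmean(group_b))
    total = 0
    extreme = 0
    # Every choice of n_a positions is one possible rearrangement of the data.
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical observations, invented for illustration only.
control = [1, 2, 3]
treated = [4, 5, 6]
p = permutation_p_value(control, treated)
print(p)  # 2 of the 20 arrangements are as extreme as the observed one: 0.1
```

With only three observations per group, the smallest attainable two-sided P value is 2/20 = 0.1, which shows how sample size limits what a permutation test can report.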


2012 ◽  
Vol 2012 ◽  
pp. 1-5 ◽  
Author(s):  
Gui Bing Hong ◽  
Chih Ming Ma

Titanium dioxide (TiO2) has been extensively studied with regard to its application as the physical sunblock in sunscreen or other cosmetic products, as well as in environmental remediation processes. In this study, the Taguchi method was applied to determine the optimum gaseous DCM (dichloromethane) by Pt-TiO2 with different Pt contents. An orthogonal array (L32) experimental design that allows for the simultaneous investigation of the variations of six parameters (light source, catalyst type, initial concentration, retention time, photointensity, and relative humidity) was employed to determine the optimum levels. The value of photointensity was not at a confidence interval. According to the response values and an analysis of variance (ANOVA), an experimental result of 34.7 as the optimum condition is forecast. Although the predetermination of 34.7 is not equal to the experimental value, it is contained within the 90% confidence interval (25.8, 43.6).


1999 ◽  
Vol 79 (2) ◽  
pp. 186-195 ◽  
Author(s):  
Julius Sim ◽  
Norma Reid

Abstract This article examines the role of the confidence interval (CI) in statistical inference and its advantages over conventional hypothesis testing, particularly when data are applied in the context of clinical practice. A CI provides a range of population values with which a sample statistic is consistent at a given level of confidence (usually 95%). Conventional hypothesis testing serves to either reject or retain a null hypothesis. A CI, while also functioning as a hypothesis test, provides additional information on the variability of an observed sample statistic (ie, its precision) and on its probable relationship to the value of this statistic in the population from which the sample was drawn (ie, its accuracy). Thus, the CI focuses attention on the magnitude and the probability of a treatment or other effect. It thereby assists in determining the clinical usefulness and importance of, as well as the statistical significance of, findings. The CI is appropriate for both parametric and nonparametric analyses and for both individual studies and aggregated data in meta-analyses. It is recommended that, when inferential statistical analysis is performed, CIs should accompany point estimates and conventional hypothesis tests wherever possible.


2015 ◽  
Vol 52 (04) ◽  
pp. 1115-1132 ◽  
Author(s):  
Krzysztof Bartoszek ◽  
Serik Sagitov

We consider a stochastic evolutionary model for a phenotype developing amongst n related species with unknown phylogeny. The unknown tree is modelled by a Yule process conditioned on n contemporary nodes. The trait value is assumed to evolve along lineages as an Ornstein-Uhlenbeck process. As a result, the trait values of the n species form a sample with dependent observations. We establish three limit theorems for the sample mean corresponding to three domains for the adaptation rate. In the case of fast adaptation, we show that for large n the normalized sample mean is approximately normally distributed. Using these limit theorems, we develop novel confidence interval formulae for the optimal trait value.


Author(s):  
Giuseppe Perinetti

When conducting research on a given type of patients, it is impossible to examine all the existing subjects of that type (population) to derive the true mean of the parameter of interest. More realistically, by the investigation of a small group of subjects (sample) from the whole population, researchers can estimate an interval into which the true mean of the population lies. In statistics, such interval is referred to as confidence interval (CI). The calculation of the CI from a sample mean is simple and gives important information, not only regarding the true mean of the population, but also on the statistical significance of the difference between groups being compared. For these reasons, the reporting of the CIs is preferred over the p value alone.


2014 ◽  
Vol 556-562 ◽  
pp. 2878-2881
Author(s):  
Ke Wei Li

There are some issues for the shuffle network intrusion detection, such as high loss detection rates and time-consuming procedures. This paper proposes a shuffle network intrusion detection method fusing the misuse behavior analysis and analyzes the network misuse behavior procedures. According to the damaged data flow balance features by network misuse behavior, the paper applies the hypothesis test in probability theory to evaluate whether the confidence interval exceeds 0. If the confidence interval does not contain zero, it indicates the presence of feed-forward network intrusion; otherwise, there is no feed-forward network intrusion. The experimental results show that this method can effectively solve the multi-packet collaborative intrusion problems. Compared to traditional methods, the test speed and accuracy of the method are significantly improved.


2005 ◽  
Vol 18 (17) ◽  
pp. 3699-3703 ◽  
Author(s):  
John R. Lanzante

Abstract Climate studies often involve comparisons between estimates of some parameter derived from different observed and/or model-generated datasets. It is common practice to present estimates of two or more statistical quantities with error bars about each representing a confidence interval. If the error bars do not overlap, it is presumed that there is a statistically significant difference between them. In general, such a procedure is not valid and usually results in declaring statistical significance too infrequently. Simple examples that demonstrate the nature of this pitfall, along with some formulations, are presented. It is recommended that practitioners use standard hypothesis testing techniques that have been derived from statistical theory rather than the ad hoc approach involving error bars.
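The pitfall described here can be shown with a small numerical sketch. It assumes two independent estimates with normal sampling distributions and known standard errors; the numbers and the function names `intervals_overlap` and `difference_p_value` are hypothetical, and this is not the formulation presented in the article itself.

```python
from statistics import NormalDist


def intervals_overlap(m1, se1, m2, se2, level=0.95):
    """True if the two individual confidence intervals overlap."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (m1 - z * se1) <= (m2 + z * se2) and (m2 - z * se2) <= (m1 + z * se1)


def difference_p_value(m1, se1, m2, se2):
    """Two-sided P value for m1 - m2, assuming independent normal estimates."""
    # The standard error of a difference adds in quadrature, not linearly.
    z = abs(m1 - m2) / (se1 ** 2 + se2 ** 2) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical estimates: means 0.0 and 3.0, each with standard error 1.0.
overlap = intervals_overlap(0.0, 1.0, 3.0, 1.0)
p = difference_p_value(0.0, 1.0, 3.0, 1.0)
print("95% error bars overlap:", overlap)
print(f"P value for the difference: {p:.3f}")
```

In this example the error bars overlap, yet the test on the difference gives P < 0.05. With equal standard errors, non-overlap requires a gap of about 3.92 standard errors, whereas the test on the difference needs only about 2.77, which is why the overlap rule declares significance too infrequently.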

