bootstrap distribution
Recently Published Documents

TOTAL DOCUMENTS: 14 (five years: 7)
H-INDEX: 5 (five years: 1)

Econometrica, 2021, Vol. 89 (4), pp. 1963–1977
Author(s): Jinyong Hahn, Zhipeng Liao

Asymptotic justification of the bootstrap often takes the form of weak convergence of the bootstrap distribution to some limit distribution. The theoretical literature has long recognized that weak convergence does not imply consistency of the bootstrap second moment or the bootstrap variance as an estimator of the asymptotic variance, but this concern is not always reflected in applied practice. We bridge the gap between theory and practice by showing that the common bootstrap-based standard error in fact leads to potentially conservative inference.
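To make the object of the abstract concrete, here is a minimal, hypothetical sketch (not the authors' setting) of the bootstrap-based standard error in question: the SD of a statistic across resamples. As a sanity check it is compared against the analytic standard error of the sample mean; all names and data are illustrative.

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=1):
    """Bootstrap standard error: the SD of `stat` over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]
    return statistics.stdev(reps)

# Sanity check with the sample mean, whose analytic SE is s / sqrt(n).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
se_boot = bootstrap_se(data, statistics.mean)
se_analytic = statistics.stdev(data) / len(data) ** 0.5
```

For the mean the two estimates agree closely; the paper's point concerns statistics and settings where the bootstrap variance can exceed the asymptotic variance, so inference built on it errs on the conservative side.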


2021, Vol. 4 (1), pp. 251524592091188
Author(s): Guillaume A. Rousselet, Cyril R. Pernet, Rand R. Wilcox

The percentile bootstrap is the Swiss Army knife of statistics: It is a nonparametric method based on data-driven simulations. It can be applied to many statistical problems, as a substitute for standard parametric approaches or in situations for which parametric methods do not exist. In this Tutorial, we cover R code to implement the percentile bootstrap to make inferences about central tendency (e.g., means and trimmed means) and spread in a one-sample example and in an example comparing two independent groups. For each example, we explain how to derive a bootstrap distribution and how to get a confidence interval and a p value from that distribution. We also demonstrate how to run a simulation to assess the behavior of the bootstrap. For some purposes, such as making inferences about the mean, the bootstrap performs poorly. But for other purposes, it is the only known method that works well over a broad range of situations. More broadly, combining the percentile bootstrap with robust estimators (i.e., estimators that are not overly sensitive to outliers) can help users gain a deeper understanding of their data than they would using conventional methods.
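The Tutorial's code is in R; as a rough illustration of the same recipe, here is a compact Python sketch of a percentile bootstrap confidence interval for a 20% trimmed mean. Function names and the simulated data are ours, not the Tutorial's.

```python
import random

def trimmed_mean(x, prop=0.2):
    """Mean after dropping the lowest and highest `prop` fraction of values."""
    s = sorted(x)
    g = int(prop * len(s))
    core = s[g:len(s) - g]
    return sum(core) / len(core)

def percentile_boot_ci(data, stat, n_boot=4000, alpha=0.05, seed=42):
    """Percentile bootstrap CI: quantiles of the bootstrap distribution."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(stat([data[rng.randrange(n)] for _ in range(n)])
                  for _ in range(n_boot))
    lo = reps[int(alpha / 2 * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(100)]
ci = percentile_boot_ci(data, trimmed_mean)
```

A p value follows from the same bootstrap distribution by reading off where a null value falls among the sorted replicates; the interval above simply takes the 2.5% and 97.5% quantiles.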


Biometrika, 2020
Author(s): T. A. Kuffner, S. M. S. Lee, G. A. Young

Summary: We establish a general theory of optimality for block bootstrap distribution estimation for sample quantiles under mild strong-mixing conditions. In contrast to existing results, we study the block bootstrap for varying numbers of blocks. This corresponds to a hybrid between the subsampling bootstrap and the moving block bootstrap, in which the number of blocks lies between 1 and the ratio of sample size to block length. The hybrid block bootstrap is shown to give theoretical benefits and startling improvements in accuracy for distribution estimation in important practical settings. The conclusion that bootstrap samples should be smaller than the original sample has significant implications for the computational efficiency and scalability of bootstrap methodologies with dependent data. Our main theorem determines the optimal number of blocks and block length that achieve the best possible convergence rate for the block bootstrap distribution estimator for sample quantiles. We propose an intuitive method for empirical selection of the optimal number and length of blocks, and demonstrate its value in a nontrivial example.
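A rough sketch of the hybrid regime the summary describes: overlapping blocks of length l are resampled, but only b of them, so the bootstrap sample has size b*l, possibly well below n. The AR(1)-style series, parameter choices, and function name below are illustrative assumptions, not the paper's setup.

```python
import random

def block_bootstrap_quantile(series, block_len, n_blocks, q=0.5,
                             n_boot=1000, seed=7):
    """Moving-block bootstrap distribution of a sample quantile.

    n_blocks may be far below len(series) // block_len, so the bootstrap
    sample size n_blocks * block_len is smaller than n (the hybrid regime
    between subsampling and the full moving block bootstrap)."""
    rng = random.Random(seed)
    n = len(series)
    starts = n - block_len + 1  # number of overlapping blocks
    reps = []
    for _ in range(n_boot):
        sample = []
        for _ in range(n_blocks):
            s = rng.randrange(starts)
            sample.extend(series[s:s + block_len])
        sample.sort()
        reps.append(sample[int(q * len(sample))])
    return reps

# Dependent toy series (AR(1) with coefficient 0.5), purely illustrative.
rng = random.Random(0)
x, series = 0.0, []
for _ in range(500):
    x = 0.5 * x + rng.gauss(0.0, 1.0)
    series.append(x)

# 20 blocks of length 10: bootstrap sample size 200 < n = 500.
reps = block_bootstrap_quantile(series, block_len=10, n_blocks=20)
```

The paper's contribution is the theory for choosing `n_blocks` and `block_len` optimally; the sketch only shows the mechanics of resampling fewer blocks than the full-sample count.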


Econometrica, 2020, Vol. 88 (6), pp. 2547–2574
Author(s): Giuseppe Cavaliere, Iliyan Georgiev

Asymptotic bootstrap validity is usually understood as consistency of the distribution of a bootstrap statistic, conditional on the data, for the unconditional limit distribution of a statistic of interest. From this perspective, randomness of the limit bootstrap measure is regarded as a failure of the bootstrap. We show that such limiting randomness does not necessarily invalidate bootstrap inference if validity is understood as control over the frequency of correct inferences in large samples. We first establish sufficient conditions for asymptotic bootstrap validity in cases where the unconditional limit distribution of a statistic can be obtained by averaging a (random) limiting bootstrap distribution. Further, we provide results ensuring the asymptotic validity of the bootstrap as a tool for conditional inference, the leading case being that in which a bootstrap distribution consistently estimates a conditional (and thus random) limit distribution of a statistic. We apply our framework to several inference problems in econometrics, including linear models with possibly nonstationary regressors, CUSUM statistics, conditional Kolmogorov–Smirnov specification tests, and tests for parameter constancy in dynamic econometric models.


Author(s): Wenzhen Huang, Junge Zhang, Kaiqi Huang

Model-based reinforcement learning (RL) methods attempt to learn a dynamics model that simulates the real environment and to use that model to make better decisions. However, the learned environment simulator typically carries some model error, which can disturb decision making and reduce performance. We propose a bootstrapped model-based RL method that bootstraps the modules at each depth of the planning tree. This method can quantify the uncertainty of the environment model on different state-action pairs and lead the agent to explore the pairs with higher uncertainty, reducing potential model errors. Moreover, we sample target values from their bootstrap distribution to connect the uncertainties at current and subsequent time steps, and we introduce a prior mechanism to improve exploration efficiency. Experimental results demonstrate that our method efficiently decreases model error and outperforms TreeQN and other state-of-the-art methods on multiple Atari games.
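A toy sketch of the bootstrapped-ensemble idea, not the paper's TreeQN-style architecture: K tabular value heads, each trained on its own Bernoulli bootstrap mask of the updates. Disagreement across heads serves as an uncertainty signal, and targets are sampled from the heads' (bootstrap) distribution. All class and method names are our own illustrations.

```python
import random
import statistics

class BootstrappedValueEnsemble:
    """K value heads, each seeing a Bernoulli(0.5) bootstrap mask of updates.

    Head disagreement on a state-action pair is an uncertainty signal;
    targets can be sampled from the heads' empirical distribution."""

    def __init__(self, k, seed=0):
        self.rng = random.Random(seed)
        self.heads = [dict() for _ in range(k)]  # (state, action) -> value

    def update(self, sa, target, lr=0.5):
        for head in self.heads:
            if self.rng.random() < 0.5:  # bootstrap mask: head may skip this sample
                v = head.get(sa, 0.0)
                head[sa] = v + lr * (target - v)

    def uncertainty(self, sa):
        """SD of the heads' value estimates for this state-action pair."""
        return statistics.pstdev(h.get(sa, 0.0) for h in self.heads)

    def sample_target(self, sa):
        """Sample a target value from the bootstrap distribution of heads."""
        return self.rng.choice(self.heads).get(sa, 0.0)

ens = BootstrappedValueEnsemble(k=10)
for _ in range(50):
    ens.update(("s0", "a0"), 1.0)   # well-explored pair: heads agree
for _ in range(3):
    ens.update(("s0", "a1"), 1.0)   # rarely-seen pair: heads disagree more
u_seen = ens.uncertainty(("s0", "a0"))
u_rare = ens.uncertainty(("s0", "a1"))
```

After many masked updates the heads converge to the same value, so `u_seen` shrinks toward zero, while a pair seen only a few times keeps the heads spread out; an agent can direct exploration toward the latter.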


2019
Author(s): Guillaume A. Rousselet, Cyril R. Pernet, Rand R. Wilcox

The percentile bootstrap is the Swiss Army knife of statistics: it is a non-parametric method based on data-driven simulations. It can be applied to many statistical problems, as a substitute for standard parametric approaches or in situations where parametric methods do not exist. In this tutorial, we cover R code to implement the percentile bootstrap in a few situations: one-sample estimation and the comparison of two independent groups for measures of central tendency (means and trimmed means) and spread. For each example, we explain how to derive a bootstrap distribution, and how to get a confidence interval and a p value from that distribution. We also demonstrate how to run a simulation to assess the behaviour of the bootstrap. In some situations the bootstrap performs poorly, such as when making inferences about the mean. But for other purposes it is the only known method that works well over a broad range of situations, such as when comparing medians in the presence of tied (duplicated) values. More broadly, combining the percentile bootstrap with robust estimators (i.e., estimators that are not overly sensitive to outliers) can help users gain a deeper understanding of their data than conventional methods would.
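The two-group comparison with tied values can be sketched as a two-sided percentile-bootstrap p value for a difference in medians: each group is resampled independently, and the p value reflects how much of the bootstrap distribution of the difference falls on either side of zero. The data (rounded to create ties) and function names are illustrative, not the tutorial's R code.

```python
import random

def median(x):
    s = sorted(x)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def boot_pvalue_median_diff(g1, g2, n_boot=4000, seed=3):
    """Two-sided percentile-bootstrap p value for median(g1) - median(g2)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        b1 = [g1[rng.randrange(len(g1))] for _ in g1]
        b2 = [g2[rng.randrange(len(g2))] for _ in g2]
        diffs.append(median(b1) - median(b2))
    # Proportion of the bootstrap distribution below zero, ties split in half.
    p_below = sum(d < 0 for d in diffs) / n_boot
    p_tie = sum(d == 0 for d in diffs) / n_boot
    phat = p_below + 0.5 * p_tie
    return 2 * min(phat, 1 - phat)

rng = random.Random(0)
g1 = [round(rng.gauss(0.0, 1.0), 1) for _ in range(60)]  # rounding creates ties
g2 = [round(rng.gauss(1.5, 1.0), 1) for _ in range(60)]
p = boot_pvalue_median_diff(g1, g2)
```

With a 1.5-SD shift between the groups, nearly all bootstrap differences fall on one side of zero and the p value is small; with identical groups it hovers around uniform, which is what the simulation checks in the tutorial assess.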


2011, Vol. 26 (4), pp. 157–164
Author(s): Hans-Erik Andersen, Jacob Strunk, Hailemariam Temesgen

Abstract Airborne laser scanning, collected in a sampling mode, has the potential to be a valuable tool for estimating the biomass resources available to support bioenergy production in rural communities of interior Alaska. In this study, we present a methodology for estimating forest biomass over a 201,226-ha area (of which 163,913 ha are forested) in the upper Tanana valley of interior Alaska using a combination of 79 field plots and high-density airborne light detection and ranging (LiDAR) collected in a sampling mode along 27 single strips (swaths) spaced approximately 2.5 km apart. A model-based approach to estimating total aboveground biomass for the area is presented. Although a design-based sampling approach (based on a probability sample of field plots) would allow for stronger inference, a model-based approach is justified when the cost of obtaining a probability sample is prohibitive. Using a simulation-based approach, the proportion of the variability associated with sampling error and modeling error was assessed. Results indicate that LiDAR sampling can be used to obtain estimates of total biomass with an acceptable level of precision (8.1 ± 0.7 [8%] teragrams [total ± SD]), with sampling error accounting for 58% of the SD of the bootstrap distribution. In addition, we investigated the influence of plot location (i.e., GPS) error, plot size, and field-measured diameter threshold on the variability of the total biomass estimate. We found that using a larger plot (1/30 ha versus 1/59 ha) and a lower diameter threshold (7.6 versus 12.5 cm) significantly reduced the SD of the bootstrap distribution (by approximately 20%), whereas larger plot location error (over a range from 0 to 20 m root mean square error) steadily increased variability at both plot sizes.


2009, Vol. 12 (2), pp. 169–174
Author(s): Marij Gielen, Catherine Derom, Robert Derom, Robert Vlietinck, Maurice P. Zeegers

Abstract: Both zygosity and chorionicity provide important information in twin research. The East Flanders Prospective Twin Survey (EFPTS) determines zygosity and chorionicity at birth and therefore provides a gold standard for testing diagnostic parameters that can be used to determine chorionicity. The aim of the present study was to investigate whether birthweight discordancy can be used as an indicator of chorionicity. The study sample consisted of 4,060 live-born twin pairs from the EFPTS. We studied MZ twins, using univariate and multivariate logistic regression analyses to calculate odds ratios (OR) and 95% confidence intervals (CI) of being MC in relation to discordancy level. Diagnostic parameters, including sensitivity and specificity, were calculated. A two-fold cross-validation was carried out, and a bootstrap distribution with 10,000 samples was created to estimate the standard deviations. For discordancy levels of below 10%, 10–15%, 15–20%, 20–25% and above 25%, the ORs (95% CI) were 1.16 (0.91–1.47), 1.38 (1.05–1.80), 2.13 (1.51–3.01), 2.73 (1.73–4.29) and 2.81 (2.81–4.35), respectively. There were no gender differences. Sensitivity was 42.2% (SD 5.6%), specificity was 72.8% (SD 6.3%), the positive predictive value was 72.8% (SD 1.5%) and the negative predictive value was 39.2% (SD 0.7%). In conclusion, although a higher discordancy level resulted in higher ORs of being an MC twin, birthweight discordancy can be used only as a weak proxy for chorionicity, highlighting the need to assess and record chorionicity in obstetrical units.
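The bootstrap SDs reported for the diagnostic parameters can be sketched as follows: resample the (truth, test) pairs with replacement, recompute sensitivity and specificity on each resample, and take the SD of the replicates. The data below are simulated and hypothetical, not the EFPTS sample; the sketch also assumes every resample keeps both classes present, which is safe for balanced data of this size.

```python
import random
import statistics

def diagnostics(pairs):
    """Sensitivity and specificity from (truth, test_positive) pairs."""
    tp = sum(t and p for t, p in pairs)
    fn = sum(t and not p for t, p in pairs)
    tn = sum(not t and not p for t, p in pairs)
    fp = sum(not t and p for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

def bootstrap_sds(pairs, n_boot=2000, seed=11):
    """Bootstrap SDs of sensitivity and specificity."""
    rng = random.Random(seed)
    n = len(pairs)
    sens_reps, spec_reps = [], []
    for _ in range(n_boot):
        boot = [pairs[rng.randrange(n)] for _ in range(n)]
        se, sp = diagnostics(boot)
        sens_reps.append(se)
        spec_reps.append(sp)
    return statistics.stdev(sens_reps), statistics.stdev(spec_reps)

# Hypothetical data: truth = MC status, test = a discordancy-threshold flag.
rng = random.Random(0)
pairs = [(True, rng.random() < 0.6) for _ in range(100)] + \
        [(False, rng.random() < 0.3) for _ in range(100)]
sens, spec = diagnostics(pairs)
sd_sens, sd_spec = bootstrap_sds(pairs)
```

The study used 10,000 bootstrap samples; 2,000 suffices for a sketch, since the SD of a proportion stabilizes quickly as replicates accumulate.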

