NONPARAMETRIC ESTIMATION OF REGRESSION FUNCTIONS WITH DISCRETE REGRESSORS

2009 ◽  
Vol 25 (1) ◽  
pp. 1-42 ◽  
Author(s):  
Desheng Ouyang ◽  
Qi Li ◽  
Jeffrey S. Racine

We consider the problem of estimating a nonparametric regression model containing categorical regressors only. We investigate the theoretical properties of least squares cross-validated smoothing parameter selection, establish the rate of convergence (to zero) of the smoothing parameters for relevant regressors, and show that there is a high probability that the smoothing parameters for irrelevant regressors converge to their upper bound values, thereby automatically smoothing out the irrelevant regressors. A small-scale simulation study shows that the proposed cross-validation-based estimator performs well in finite-sample settings.
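The cross-validated smoothing described in this abstract can be illustrated with a minimal sketch. The abstract does not state the kernel, so the code assumes a simple product kernel that gives weight 1 to a matching category and a smoothing parameter in [0, 1] to a mismatch. The grid search, the simulated data, and all function names are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def discrete_kernel_weights(X, x, lam):
    """Product kernel: regressor s contributes 1 on a match and lam[s] on a mismatch."""
    match = (X == x)                                  # boolean array, shape (n, q)
    return np.prod(np.where(match, 1.0, lam), axis=1)

def loo_cv_score(X, y, lam):
    """Least squares cross-validation: mean squared leave-one-out prediction error."""
    n = len(y)
    err = 0.0
    for i in range(n):
        w = discrete_kernel_weights(X, X[i], lam)
        w[i] = 0.0                                    # leave observation i out
        denom = w.sum()
        pred = (w @ y) / denom if denom > 0 else y.mean()
        err += (y[i] - pred) ** 2
    return err / n

def select_lambda(X, y, grid):
    """Grid search for one smoothing parameter per regressor (hypothetical helper)."""
    best_score, best_lam = np.inf, None
    for lam in itertools.product(grid, repeat=X.shape[1]):
        score = loo_cv_score(X, y, np.array(lam))
        if score < best_score:
            best_score, best_lam = score, np.array(lam)
    return best_lam

# Simulated data: the first regressor drives y, the second is irrelevant.
rng = np.random.default_rng(0)
n = 300
X = rng.integers(0, 3, size=(n, 2))
y = X[:, 0].astype(float) + rng.normal(scale=0.5, size=n)

grid = np.linspace(0.0, 1.0, 6)
lam_hat = select_lambda(X, y, grid)
```

Under this kernel, a smoothing parameter of 1 gives matching and non-matching categories equal weight, so the regressor drops out of the estimate. That is the sense in which an irrelevant regressor is smoothed out when its parameter reaches the upper bound.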

Sankhya A ◽  
2019 ◽  
Vol 82 (2) ◽  
pp. 386-425
Author(s):  
Suzanne Sniekers ◽  
Aad van der Vaart

Abstract A credible band is the set of all functions between a lower and an upper bound that are constructed so that the set has prescribed mass under the posterior distribution. In a Bayesian analysis such a band is used to quantify the remaining uncertainty on the unknown function in a similar manner as a confidence band. We investigate the validity of a credible band in the nonparametric regression model with the prior distribution on the function given by a Gaussian process. We show that there are many true regression functions for which the credible band has the correct order of magnitude to be used as a confidence set. We also exhibit functions for which the credible band is misleading.


2009 ◽  
Vol 10 (4) ◽  
pp. 223-233 ◽  
Author(s):  
Seoung Bum Kim ◽  
Xiaoming Huo ◽  
Kwok-Leung Tsui

2021 ◽  
pp. 263208432199622
Author(s):  
Tim Mathes ◽  
Oliver Kuss

Background: Meta-analysis of systematically reviewed studies on interventions is the cornerstone of evidence-based medicine. In the following, we introduce the common-beta beta-binomial (BB) model for meta-analysis with binary outcomes and elucidate its equivalence to panel count data models. Methods: We present a variation of the standard “common-rho” BB (BBST model) for meta-analysis, namely a “common-beta” BB model. This model has an interesting connection to fixed-effect negative binomial regression models (FE-NegBin) for panel count data. Using this equivalence, it is possible to estimate an extension of the FE-NegBin with an additional multiplicative overdispersion term (RE-NegBin), while preserving a closed-form likelihood. An advantage of the connection to econometric models is that the models can be easily implemented, because “standard” statistical software for panel count data can be used. We illustrate the methods with two real-world example datasets. Furthermore, we show the results of a small-scale simulation study that compares the new models to the BBST. The input parameters of the simulation were informed by actually performed meta-analyses. Results: In both example datasets, the NegBin models, in particular the RE-NegBin, showed a smaller effect and had narrower 95% confidence intervals. In our simulation study, median bias was negligible for all methods, but the upper quartile for median bias suggested that BBST is most affected by positive bias. Regarding coverage probability, BBST and the RE-NegBin model outperformed the FE-NegBin model. Conclusion: For meta-analyses with binary outcomes, the considered common-beta BB models may be valuable extensions to the family of BB models.


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Weixin Cai ◽  
Mark van der Laan

Abstract The Highly-Adaptive least absolute shrinkage and selection operator (LASSO) Targeted Minimum Loss Estimator (HAL-TMLE) is an efficient plug-in estimator of a pathwise differentiable parameter in a statistical model that at minimum (and possibly only) assumes that the sectional variation norm of the true nuisance functions (i.e., the relevant part of the data distribution) is finite. It relies on an initial estimator (HAL-MLE) of the nuisance functions, obtained by minimizing the empirical risk over the parameter space under the constraint that the sectional variation norm of the candidate functions is bounded by a constant, where this constant can be selected with cross-validation. In this article we establish that the nonparametric bootstrap for the HAL-TMLE, fixing the value of the sectional variation norm at a value larger than or equal to the cross-validation selector, provides a consistent method for estimating the normal limit distribution of the HAL-TMLE. In order to optimize the finite sample coverage of the nonparametric bootstrap confidence intervals, we propose a selection method for this sectional variation norm that is based on running the nonparametric bootstrap for all values of the sectional variation norm larger than the one selected by cross-validation, and subsequently determining a value at which the width of the resulting confidence intervals reaches a plateau. We demonstrate our method for 1) nonparametric estimation of the average treatment effect when observing a covariate vector, binary treatment, and outcome, and for 2) nonparametric estimation of the integral of the square of the multivariate density of the data distribution. In addition, we present simulation results for these two examples demonstrating the excellent finite sample coverage of bootstrap-based confidence intervals.


2019 ◽  
Vol 76 (7) ◽  
pp. 2349-2361
Author(s):  
Benjamin Misiuk ◽  
Trevor Bell ◽  
Alec Aitken ◽  
Craig J Brown ◽  
Evan N Edinger

Abstract Species distribution models are commonly used in the marine environment as management tools. The high cost of collecting marine data for modelling makes them finite, especially in remote locations. Underwater image datasets from multiple surveys were leveraged to model the presence–absence and abundance of Arctic soft-shell clam (Mya spp.) to support the management of a local small-scale fishery in Qikiqtarjuaq, Nunavut, Canada. These models were combined to predict Mya abundance, conditional on presence throughout the study area. Results suggested that water depth was the primary environmental factor limiting Mya habitat suitability, yet seabed topography and substrate characteristics influence their abundance within suitable habitat. Ten-fold cross-validation and spatial leave-one-out cross-validation (LOO CV) were used to assess the accuracy of combined predictions and to test whether this was inflated by the spatial autocorrelation of transect sample data. Results demonstrated that four different measures of predictive accuracy were substantially inflated due to spatial autocorrelation, and the spatial LOO CV results were therefore adopted as the best estimates of performance.
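The spatial leave-one-out idea mentioned in this abstract can be sketched as follows. The abstract does not describe the exact procedure, so the code assumes a common buffered variant in which training points within a fixed distance of the held-out point are excluded. The function name and the `buffer` parameter are hypothetical.

```python
import numpy as np

def spatial_loo_splits(coords, buffer):
    """Yield (train_idx, test_idx) pairs for buffered spatial leave-one-out CV.

    Training points lying within `buffer` of the held-out point are dropped,
    which also excludes the held-out point itself (distance 0).
    """
    coords = np.asarray(coords, dtype=float)
    for i in range(len(coords)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        train_idx = np.flatnonzero(dist > buffer)
        yield train_idx, i

# Three sample locations on a line; the first two are close together.
coords = [[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]]
splits = [(train.tolist(), test) for train, test in spatial_loo_splits(coords, buffer=1.0)]
```

Because neighbouring samples never appear in the training set of a held-out point, accuracy estimated this way is less likely to be inflated by spatial autocorrelation than accuracy from random ten-fold splits.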


2013 ◽  
Vol 805-806 ◽  
pp. 1948-1951
Author(s):  
Tian Jin

The non-homogeneous Poisson model has been applied to various situations, including air pollution data. In this paper, we propose a kernel-based nonparametric estimator for fitting non-homogeneous Poisson process data. We show that our proposed estimator is consistent and asymptotically normally distributed. We also study its finite-sample properties in a simulation study.
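A kernel-based intensity estimate for a non-homogeneous Poisson process might look like the sketch below. The abstract does not give the estimator, so the code assumes a standard Gaussian-kernel smoother of the event times from a single realization, with no edge correction. The thinning simulator, the example intensity, and the bandwidth are illustrative assumptions.

```python
import numpy as np

def simulate_nhpp(intensity, t_max, lam_max, rng):
    """Simulate event times on [0, t_max] by thinning a homogeneous Poisson process.

    Requires intensity(t) <= lam_max for all t in [0, t_max].
    """
    n = rng.poisson(lam_max * t_max)
    candidates = np.sort(rng.uniform(0.0, t_max, size=n))
    keep = rng.uniform(0.0, lam_max, size=n) < intensity(candidates)
    return candidates[keep]

def kernel_intensity(events, t, h):
    """Gaussian-kernel estimate of the intensity at times t with bandwidth h."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    u = (t[:, None] - events[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (h * np.sqrt(2.0 * np.pi))

# Hypothetical example: a sinusoidal intensity bounded above by 9.
rng = np.random.default_rng(1)
true_intensity = lambda t: 5.0 + 4.0 * np.sin(t)
events = simulate_nhpp(true_intensity, t_max=20.0, lam_max=9.0, rng=rng)

grid = np.linspace(0.0, 20.0, 201)
estimate = kernel_intensity(events, grid, h=0.5)
```

The estimate integrates to the number of observed events, so it is an intensity rather than a density. The bandwidth `h` trades bias against variance in the same way as in kernel density estimation.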


Energy ◽  
2018 ◽  
Vol 161 ◽  
pp. 776-791 ◽  
Author(s):  
Yonghong Xu ◽  
Liang Tong ◽  
Hongguang Zhang ◽  
Xiaochen Hou ◽  
Fubin Yang ◽  
...  
