Bootstrap‐Based Inference for Cube Root Asymptotics

Econometrica ◽  
2020 ◽  
Vol 88 (5) ◽  
pp. 2203-2219 ◽  
Author(s):  
Matias D. Cattaneo ◽  
Michael Jansson ◽  
Kenichi Nagasawa

This paper proposes a valid bootstrap‐based distributional approximation for M‐estimators exhibiting a Chernoff (1964)‐type limiting distribution. For estimators of this kind, the standard nonparametric bootstrap is inconsistent. The method proposed herein is based on the nonparametric bootstrap, but restores consistency by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy‐to‐implement resampling method for inference that is conceptually distinct from other available distributional approximations. We illustrate the applicability of our results with four examples in econometrics and machine learning.
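For context, the standard nonparametric bootstrap referred to here resamples the data with replacement and recomputes the estimator. The following is a minimal sketch on hypothetical data, using the sample median as a stand-in M-estimator; it is this generic recipe that fails under cube-root (Chernoff-type) asymptotics, motivating the paper's reshaped criterion function:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=200)              # hypothetical sample
n = data.size

# Standard nonparametric bootstrap: resample with replacement,
# recompute the estimator, and use the empirical distribution of
# the bootstrap replicates as a distributional approximation.
replicates = np.empty(1000)
for b in range(1000):
    resample = rng.choice(data, size=n, replace=True)
    replicates[b] = np.median(resample)  # the M-estimator

ci = np.percentile(replicates, [2.5, 97.5])  # percentile 95% interval
print(ci)
```

For root-n-consistent, asymptotically normal estimators such as the median this recipe is consistent; the paper's point is that it breaks down for cube-root-consistent estimators.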

2021 ◽  
Vol 9 ◽  
Author(s):  
Daniel Lowell Weller ◽  
Tanzy M. T. Love ◽  
Martin Wiedmann

Recent studies have shown that predictive models can supplement or provide alternatives to E. coli testing for assessing the potential presence of food safety hazards in water used for produce production. However, these studies used balanced training data and focused on enteric pathogens. As such, research is needed to determine 1) whether predictive models can be used to assess Listeria contamination of agricultural water, and 2) how resampling (to deal with imbalanced data) affects the performance of these models. To address these knowledge gaps, this study developed models that predict nonpathogenic Listeria spp. (excluding L. monocytogenes) and L. monocytogenes presence in agricultural water using various combinations of learner (e.g., random forest, regression), feature type, and resampling method (none, oversampling, SMOTE). Four feature types were used in model training: microbial, physicochemical, spatial, and weather. “Full models” were trained using all four feature types, while “nested models” used between one and three types. In total, 45 full (15 learners × 3 resampling approaches) and 108 nested (5 learners × 9 feature sets × 3 resampling approaches) models were trained per outcome. Model performance was compared against baseline models where E. coli concentration was the sole predictor. Overall, the machine learning models outperformed the baseline E. coli models, with random forests outperforming models built using other learners (e.g., rule-based learners). Resampling produced more accurate models than not resampling, with SMOTE models outperforming, on average, oversampling models. Regardless of resampling method, spatial and physicochemical water quality features drove accurate predictions for the nonpathogenic Listeria spp. and L. monocytogenes models, respectively. Overall, these findings 1) illustrate the need for alternatives to existing E. coli-based monitoring programs for assessing agricultural water for the presence of potential food safety hazards, and 2) suggest that predictive models may be one such alternative. Moreover, these findings provide a conceptual framework for how such models can be developed in the future, with the ultimate aim of producing models that can be integrated into on-farm risk management programs. For example, future studies should consider using random forest learners, SMOTE resampling, and spatial features to develop models to predict the presence of foodborne pathogens, such as L. monocytogenes, in agricultural water when the training data are imbalanced.
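The SMOTE step the abstract recommends can be illustrated with a minimal, hand-rolled sketch of SMOTE's core interpolation idea on hypothetical data; real pipelines would typically use a library implementation such as imbalanced-learn rather than this sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

def smote(X_min, n_new, k=5):
    """Generate n_new synthetic minority samples (minimal SMOTE sketch).

    For each synthetic point: pick a minority sample at random, pick one
    of its k nearest minority neighbours, and interpolate between them.
    """
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]           # k nearest-neighbour indices
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(n)                     # random minority sample
        nb = X_min[rng.choice(nn[j])]           # one of its neighbours
        gap = rng.random()                      # interpolation weight in [0, 1)
        synthetic[i] = X_min[j] + gap * (nb - X_min[j])
    return synthetic

# hypothetical imbalanced data: 100 majority vs 10 minority samples
X_maj = rng.normal(0.0, 1.0, size=(100, 4))
X_min = rng.normal(2.0, 1.0, size=(10, 4))
X_new = smote(X_min, n_new=90)                  # balance the classes
print(X_new.shape)                              # (90, 4)
```

Each synthetic point lies on a segment between two real minority samples, which is why SMOTE adds variety compared with plain duplication (oversampling).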


2020 ◽  
Vol 41 (2) ◽  
pp. 133
Author(s):  
Ariani Indrawati ◽  
Hendro Subagyo ◽  
Andre Sihombing ◽  
Wagiyah Wagiyah ◽  
Sjaeful Afandi

Extremely skewed data in artificial intelligence, machine learning, and data mining applications often yield misleading results, because machine learning algorithms are designed to work best with balanced data. In real situations, however, we often encounter imbalanced data. The most popular technique for handling imbalanced data is resampling the dataset to adjust the number of instances in the majority and minority classes toward a balanced distribution. Many resampling techniques, based on oversampling, undersampling, or a combination of both, have been proposed and continue to be developed. Resampling techniques may increase or decrease classifier performance. Comparative research on resampling methods for structured data has been widely carried out, but studies comparing resampling methods on unstructured data are rare. This raises the question of whether these methods are applicable to unstructured data such as text, which has high dimensionality and very diverse characteristics. To understand how different resampling techniques affect the learning of classifiers on imbalanced text data, we perform an experimental analysis using various resampling methods with several classification algorithms to classify articles in the Indonesian Scientific Journal Database (ISJD). The experiment shows that resampling techniques on imbalanced text data generally improve classifier performance, but the improvement is not significant because text data are high-dimensional and very diverse.
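One of the simplest resampling techniques compared in such studies, random oversampling, just duplicates minority-class documents until every class matches the largest one. A stdlib-only sketch on a hypothetical corpus:

```python
import random
from collections import Counter

random.seed(0)

# hypothetical imbalanced text corpus as (label, document) pairs
corpus = [("sports", f"sports doc {i}") for i in range(50)]
corpus += [("poetry", f"poetry doc {i}") for i in range(5)]

def oversample(corpus):
    """Random oversampling: duplicate minority-class documents until
    every class matches the size of the largest class."""
    by_label = {}
    for label, doc in corpus:
        by_label.setdefault(label, []).append(doc)
    target = max(len(docs) for docs in by_label.values())
    balanced = []
    for label, docs in by_label.items():
        extra = random.choices(docs, k=target - len(docs))  # duplicates
        balanced += [(label, d) for d in docs + extra]
    return balanced

balanced = oversample(corpus)
print(Counter(label for label, _ in balanced))  # both classes at 50
```

Duplication adds no new information, which is one reason interpolation-based methods such as SMOTE are often preferred; on high-dimensional text features, as the abstract notes, the gains from either can be modest.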


Foods ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2472
Author(s):  
Shogo Okamoto

In the last decade, temporal dominance of sensations (TDS) methods have proven to be potent approaches in the field of food sciences. Accordingly, methods for analyzing TDS curves, the major outputs of TDS methods, have been developed. This study proposes a bootstrap resampling method for TDS tasks. The proposed method enables the production of random TDS curves to estimate the uncertainties of the curves, that is, their 95% confidence intervals and standard errors. Based on Monte Carlo simulation studies, the estimated uncertainties are valid and match those estimated by approximated normal distributions when the number of independent TDS tasks or samples is 50–100 or greater. The proposed resampling method enables researchers to apply statistical analyses and machine-learning approaches that require a large sample size of TDS curves.
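The general shape of such a task-level bootstrap (a generic sketch, not necessarily the paper's exact algorithm) can be shown on hypothetical dominance data: resample whole TDS tasks with replacement, recompute the dominance-rate curve, and read off pointwise confidence bands and standard errors:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical TDS data for one attribute: entry (i, t) is 1 if the
# attribute was dominant in task i at time point t, else 0
n_tasks, n_time = 80, 30
tasks = (rng.random((n_tasks, n_time)) < 0.4).astype(float)

curve = tasks.mean(axis=0)              # observed dominance-rate curve

# bootstrap over tasks: resample whole tasks with replacement and
# recompute the curve, giving an ensemble of random TDS curves
B = 2000
boot = np.empty((B, n_time))
for b in range(B):
    idx = rng.integers(0, n_tasks, size=n_tasks)
    boot[b] = tasks[idx].mean(axis=0)

lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)  # pointwise 95% band
se = boot.std(axis=0, ddof=1)                      # bootstrap standard error
```

Resampling at the task level (rather than at individual time points) preserves the within-task temporal structure of each curve, which is the point of bootstrapping whole TDS tasks.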


Author(s):  
Zheng Zukang ◽  
Wu Lipeng

Abstract A recursive resampling method is discussed in this paper. Let X1, X2, …, Xn be i.i.d. random variables with distribution function F, and construct the empirical distribution function Fn. A new sample Xn+1 is drawn from Fn, and the new empirical distribution function F̃1, in the wide sense, is computed from X1, X2, …, Xn, Xn+1. Then Xn+2 is drawn from F̃1 and F̃2 is obtained. In this way, Xn+m and F̃m are found. It will be proved that F̃m converges to a random variable almost surely as m goes to infinity and that the limiting distribution is a compound beta distribution. In comparison with the usual non-recursive bootstrap, the main advantage of this procedure is a reduction in unconditional variance.
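A minimal sketch of this recursion, on hypothetical data with illustrative sizes n = 20 and m = 1000, might look like:

```python
import numpy as np

rng = np.random.default_rng(2)

# start from an i.i.d. sample X1, ..., Xn and its empirical distribution Fn
X = list(rng.normal(size=20))

# recursive resampling: each new draw comes from the empirical
# distribution of ALL points generated so far (not just the original n),
# in contrast to the usual bootstrap, which always resamples from Fn
m = 1000
for _ in range(m):
    X.append(X[rng.integers(len(X))])   # draw X_{n+k+1} from the current EDF

# the final empirical distribution function, evaluated at a point t
def F_m(t):
    return sum(x <= t for x in X) / len(X)

print(F_m(0.0))
```

Because later draws reinforce earlier ones, the sequence of empirical distributions forms a Polya-urn-like scheme, which is consistent with the compound beta limit stated in the abstract.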


2012 ◽  
Vol 29 (3) ◽  
pp. 482-516 ◽  
Author(s):  
Dong Li ◽  
Shiqing Ling ◽  
Wai Keung Li

This paper studies the asymptotic theory of least squares estimation in a threshold moving average model. Under some mild conditions, it is shown that the estimator of the threshold is n-consistent and that its limiting distribution is related to a two-sided compound Poisson process, whereas the estimators of the other coefficients are strongly consistent and asymptotically normal. This paper also provides a resampling method to tabulate the limiting distribution of the estimated threshold in practice, the first successful effort in this direction and a contribution to the threshold literature. Simulation studies are also carried out to assess the performance of least squares estimation in finite samples.
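The profile least-squares idea behind threshold estimation can be sketched in a simpler threshold-regression setting (a hypothetical analogue, not the paper's moving average model): for each candidate threshold, fit each regime by least squares and pick the threshold that minimizes the total sum of squared residuals:

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical two-regime model with true threshold r0 = 0.5:
# y = 1.0 * x if x <= r0, else -1.0 * x, plus noise
n, r0 = 500, 0.5
x = rng.uniform(-1, 2, size=n)
y = np.where(x <= r0, 1.0, -1.0) * x + 0.1 * rng.normal(size=n)

def ssr(r):
    """Profiled sum of squared residuals: fit each regime by OLS given r."""
    total = 0.0
    for mask in (x <= r, x > r):
        if not mask.any():
            return np.inf               # reject thresholds emptying a regime
        xi, yi = x[mask], y[mask]
        beta = (xi @ yi) / (xi @ xi)    # OLS slope through the origin
        total += np.sum((yi - beta * xi) ** 2)
    return total

# grid search: the least squares threshold estimate minimizes the
# profiled SSR over candidate thresholds
grid = np.linspace(-0.5, 1.5, 201)
r_hat = grid[np.argmin([ssr(r) for r in grid])]
print(r_hat)
```

The n-consistency result in the abstract reflects how sharply this SSR profile kinks at the true threshold, which is much faster than the usual root-n rate for smooth parameters.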


2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Abstract Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.


2020 ◽  
Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr

2020 ◽  
Author(s):  
Marc Peter Deisenroth ◽  
A. Aldo Faisal ◽  
Cheng Soon Ong
