Comparison of the Frequentist MATA Confidence Interval with Bayesian Model-Averaged Confidence Intervals

2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Daniel Turek

Model averaging is a technique used to account for model uncertainty in both Bayesian and frequentist multimodel inference. In this paper, we compare the performance of model-averaged Bayesian credible intervals and frequentist confidence intervals. Frequentist intervals are constructed according to the model-averaged tail area (MATA) methodology. Differences between the Bayesian and frequentist methods are illustrated through an example involving cloud seeding. The coverage performance and interval width of each technique are then studied using simulation. A frequentist MATA interval performs best in the normal linear setting, while Bayesian credible intervals yield the best coverage performance in a lognormal setting. The use of a data-dependent prior probability for models improved the coverage of the model-averaged Bayesian interval, relative to that using uniform model prior probabilities. Data-dependent model prior probabilities are philosophically controversial in Bayesian statistics, and our results suggest that their use is beneficial in model averaging.
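As a rough numerical illustration of the MATA construction, each endpoint solves an equation in which the weighted average of the single-model tail areas equals α/2. The sketch below assumes normal pivots for simplicity and uses made-up estimates, standard errors, and model weights; it is not the authors' code.

```python
# Model-averaged tail area (MATA) interval endpoints: solve for theta where the
# weighted average of the models' tail probabilities equals alpha/2. Normal
# pivots assumed; estimates, standard errors, and weights are illustrative.
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mata_interval(estimates, ses, weights, alpha=0.05):
    total = sum(weights)
    weights = [w / total for w in weights]

    def avg_upper_tail(theta):
        # model-averaged P(estimator exceeds its observed value | theta)
        return sum(w * (1.0 - norm_cdf((e - theta) / s))
                   for w, e, s in zip(weights, estimates, ses))

    def solve(target, lo, hi):
        for _ in range(200):  # bisection; avg_upper_tail increases in theta
            mid = 0.5 * (lo + hi)
            if avg_upper_tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    span = 10.0 * max(ses)
    lo_b, hi_b = min(estimates) - span, max(estimates) + span
    return solve(alpha / 2, lo_b, hi_b), solve(1.0 - alpha / 2, lo_b, hi_b)

lower, upper = mata_interval([1.2, 1.5], [0.3, 0.4], [0.6, 0.4])
```

With a single model, the construction collapses to the usual Wald interval, which is a convenient sanity check.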

Marketing ZFP ◽  
2019 ◽  
Vol 41 (4) ◽  
pp. 33-42 ◽  
Author(s):  
Thomas Otter

Empirical research in marketing is often, at least in part, exploratory. The goal of exploratory research, by definition, extends beyond the empirical calibration of parameters in well-established models and includes the empirical assessment of different model specifications. In this context, researchers often rely on the statistical information about parameters in a given model to learn about likely model structures. An example is the search for the 'true' set of covariates in a regression model based on confidence intervals of regression coefficients. The purpose of this paper is to illustrate and compare different measures of statistical information about model parameters in the context of a generalized linear model: classical confidence intervals, bootstrapped confidence intervals, and Bayesian posterior credible intervals from a model that adapts its dimensionality as a function of the information in the data. I find that inference from the adaptive Bayesian model dominates that based on classical and bootstrapped intervals in a given model.
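A minimal sketch of the percentile bootstrap behind such bootstrapped intervals, with illustrative data and the sample mean standing in for a regression coefficient:

```python
# Percentile-bootstrap confidence interval; data and statistic are illustrative.
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=1):
    rng = random.Random(seed)
    # Resample with replacement, recompute the statistic, take percentiles.
    boots = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n_boot))
    return boots[int(alpha / 2 * n_boot)], boots[int((1 - alpha / 2) * n_boot) - 1]

data = [2.1, 2.5, 1.9, 3.0, 2.7, 2.2, 2.8, 2.4]
lo, hi = bootstrap_ci(data)
```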


2017 ◽  
Author(s):  
Jose D. Perezgonzalez

‘The fallacy of placing confidence in confidence intervals’ (Morey et al., 2016, Psychonomic Bulletin & Review, doi: 10.3758/s13423-015-0947-8) delved into a much-needed technical and philosophical dissertation regarding the differences between typical (mis)interpretations of frequentist confidence intervals and the typical correct interpretation of Bayesian credible intervals. My contribution here partly strengthens the authors’ argument, partly closes some gaps they left open, and concludes with a note of attention to the possibility that there may be distinctions without real practical differences in the ultimate use of estimation by intervals, namely when assuming a common ground of uninformative priors and intervals as ranges of values instead of as posterior distributions per se.


2015 ◽  
Vol 15 (5) ◽  
pp. 1079-1087 ◽  
Author(s):  
Robert H. McArthur ◽  
Robert C. Andrews

Effective coagulation is essential to achieving drinking water treatment objectives when treating surface water. To minimize settled water turbidity, artificial neural networks (ANNs) have been adopted to predict optimum alum and carbon dioxide dosages at the Elgin Area Water Treatment Plant. ANNs were applied to predict both optimum carbon dioxide and alum dosages with correlation (R2) values of 0.68 and 0.90, respectively. ANNs were also used to develop surface response plots to ease selection of optimum dosages. Trained ANNs were used to predict turbidity outcomes for a range of alum and carbon dioxide dosages, and these were compared to historical data. Point-wise confidence intervals were obtained based on error and squared error values during the training process. The probability of the true value falling within the predicted interval ranged from 0.25 to 0.81, and the average interval width ranged from 0.15 to 0.62 NTU. Training an ANN using the squared error produced a larger average interval width but a higher probability of the true value falling within the predicted interval.
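One naive way to form point-wise intervals from training errors, not necessarily the authors' exact construction, is prediction ± z times the RMSE of the residuals:

```python
# Naive point-wise interval: prediction +/- z * RMSE of training residuals.
import math
import statistics

def pointwise_interval(prediction, residuals, z=1.96):
    rmse = math.sqrt(statistics.fmean(r * r for r in residuals))
    return prediction - z * rmse, prediction + z * rmse

# Hypothetical settled-water turbidity prediction (NTU) and residuals (NTU).
lo, hi = pointwise_interval(0.50, [0.1, -0.1, 0.2, -0.2])
```

A larger RMSE (e.g. from training on squared error) widens the interval, which mirrors the width/coverage trade-off the abstract reports.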


2021 ◽  
Author(s):  
Aho Glele Ludwig Serge ◽  
Emmanuel Simon ◽  
Camille Bouit ◽  
Maeva Serrand ◽  
Laurence Filipuzzi ◽  
...  

Background: Wei et al. have published a meta-analysis (MA) aiming to evaluate the association between SARS-CoV-2 infection during pregnancy and adverse pregnancy outcomes. Using a classical random-effects model, they found that SARS-CoV-2 infection was associated with preeclampsia, preterm birth and stillbirth. Performing MA with low event rates or with few studies may be challenging, as MA relies on several within- and between-study distributional assumptions. Methods: To assess the robustness of the results provided by Wei et al., we performed a sensitivity analysis using several frequentist and Bayesian meta-analysis methods. We also estimated fragility indexes. Results: For preeclampsia (patients with Covid-19 vs without), the confidence intervals of most frequentist models contain 1. All beta-binomial models (Bayesian) lead to credible intervals containing 1. The prediction interval, based on the DerSimonian-Laird (DL) method, ranges from 0.75 to 2.38. The fragility index is 2 for the DL method. For preterm birth, the confidence (credible) intervals exclude 1. The prediction interval is broad, ranging from 0.84 to 20.61. The fragility index ranges from 10 to 27. For stillbirth, the confidence intervals of most frequentist models contain 1. Six Bayesian MA models lead to credible intervals containing 1. The prediction interval ranges from 0.52 to 8.49. The fragility index is 3. Interpretation: Given the available data and the results of our broad sensitivity analysis, we can only suggest that SARS-CoV-2 infection during pregnancy is associated with preterm birth, and may be associated with preeclampsia. For stillbirth, more data are needed, as none of the Bayesian analyses are conclusive.
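A compact sketch of the classical DerSimonian-Laird random-effects model referenced above, applied to hypothetical log risk ratios and within-study variances (not the data from Wei et al.):

```python
# DerSimonian-Laird random-effects meta-analysis on the log scale.
# The log risk ratios and within-study variances below are hypothetical.
import math

def dersimonian_laird(log_rr, variances, z=1.96):
    w = [1.0 / v for v in variances]                     # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(log_rr) - 1)) / c)         # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]       # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (math.exp(pooled),
            math.exp(pooled - z * se), math.exp(pooled + z * se))

rr, lo, hi = dersimonian_laird([0.10, 0.30, 0.20], [0.04, 0.05, 0.06])
```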


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 407 ◽  
Author(s):  
Michael Duggan ◽  
Patrizio Tressoldi

Background: This is an update of Mossbridge et al.'s meta-analysis of the physiological anticipation preceding seemingly unpredictable stimuli, whose overall effect size was 0.21; 95% Confidence Intervals: 0.13-0.29. Methods: Nineteen new peer and non-peer reviewed studies completed from January 2008 to June 2018 were retrieved, describing a total of 27 experiments and 36 associated effect sizes. Results: The overall weighted effect size, estimated with a frequentist multilevel random model, was 0.28; 95% Confidence Intervals: 0.18-0.38; the overall weighted effect size, estimated with a multilevel Bayesian model, was 0.28; 95% Credible Intervals: 0.18-0.38. The weighted mean estimate of the effect size of peer reviewed studies was higher than that of non-peer reviewed studies, but with overlapping confidence intervals: Peer reviewed: 0.36; 95% Confidence Intervals: 0.26-0.47; Non-peer reviewed: 0.22; 95% Confidence Intervals: 0.05-0.39. Similarly, the weighted mean estimate of the effect size of preregistered studies was higher than that of non-preregistered studies: Preregistered: 0.31; 95% Confidence Intervals: 0.18-0.45; Non-preregistered: 0.24; 95% Confidence Intervals: 0.08-0.41. The statistical estimation of publication bias using the Copas selection model suggests that the main findings are not contaminated by publication bias. Conclusions: In summary, with this update, the main findings reported in Mossbridge et al.'s meta-analysis are confirmed.


1983 ◽  
Vol 22 (01) ◽  
pp. 25-28 ◽  
Author(s):  
J. I. Balla ◽  
A. Elstein ◽  
P. Gates

The process of revising diagnostic probabilities was studied to examine the relative influence of prior probability and test diagnosticity at various levels of clinical experience. The aims were to show changes with increasing seniority, to explore the effects of perceived seriousness of a disease, and to demonstrate systematic biases in handling probabilistic information. To these ends, 4 case vignettes were presented to 169 medical students in the three final years of the medical course and 25 residents. The data presented included the prior probabilities of two diseases and the true positive fraction of a single clinical manifestation of the rarer disease. Results showed a slightly increasing reliance on prior probability with increasing experience. The less experienced seemed most influenced by test results and by perceived seriousness of the disease. In some vignettes judgment seemed to depend on representativeness; in others, the most plausible explanation of the diagnostic choice would have been availability. Marked case-to-case variation was noted for individuals and there was a general lack of systematic biases. Revision of diagnostic opinion often depended on preconceived notions, and prior probabilities tended to be ignored. These are clear indications for teaching the basics of decision theory to medical students and early post-graduates.
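The normative revision the vignettes call for is Bayes' rule applied to the two candidate diseases; the numbers below are hypothetical, not taken from the study's vignettes:

```python
# Bayes' rule for two mutually exclusive diseases given one positive finding.
# Prior and true positive fractions are hypothetical illustrations.

def posterior_rare_disease(prior_rare, tpf_rare, tpf_common):
    joint_rare = prior_rare * tpf_rare            # P(rare) * P(sign | rare)
    joint_common = (1.0 - prior_rare) * tpf_common
    return joint_rare / (joint_rare + joint_common)

# Rarer disease prior 0.10; the sign occurs in 80% of rare-disease cases
# but also in 20% of common-disease cases.
p = posterior_rare_disease(0.10, 0.80, 0.20)
```

Even with a strongly diagnostic sign, the low prior keeps the posterior for the rarer disease well under one half, which is exactly the base-rate effect the less experienced respondents tended to ignore.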


Author(s):  
Rashad M. EL-Sagheer ◽  
Mustafa M. Hasaballah

In this paper, we discuss the estimation of the index Cpy for a 3-Burr-XII distribution based on Progressive Type-II censoring. Maximum likelihood and Bayesian methods are used to estimate the index Cpy. The Fisher information matrix is used to construct approximate confidence intervals, and bootstrap confidence intervals (CIs) of the estimators are also obtained. The Bayesian estimates of the index Cpy are obtained by the Markov Chain Monte Carlo (MCMC) method, and credible intervals are constructed using MCMC samples. Two real data sets are analyzed using the proposed index.


2021 ◽  
Author(s):  
Andrzej Kotarba ◽  
Mateusz Solecki

Vertically-resolved cloud amount is essential for understanding the Earth’s radiation budget. The joint CloudSat-CALIPSO lidar-radar cloud climatology remains the only dataset providing such information globally. However, its specific sampling scheme (pencil-like swath, 16-day revisit) introduces an uncertainty into CloudSat-CALIPSO cloud amounts. In this research we assess those uncertainties in terms of bootstrap confidence intervals. Five years (2006-2011) of the 2B-GEOPROF-LIDAR (version P2_R05) cloud product were examined, accounting for typical spatial resolutions of global grids (1.0°, 2.5°, 5.0°, 10.0°), four confidence levels (0.85, 0.90, 0.95, 0.99), and three time scales of mean cloud amount (annual, seasonal, monthly). Results proved that a cloud amount accuracy of 1%, or 5%, is not achievable with the dataset, assuming a 5-year mean cloud amount, a high (>0.95) confidence level, and fine spatial resolution (1º–2.5º). The 1% requirement was met by only ~6.5% of atmospheric volumes at 1º and 2.5º, while the more tolerant criterion (5%) was met by 22.5% of volumes at 1º, or 48.9% at 2.5º resolution. In order to have at least 99% of volumes meeting an accuracy criterion, the criterion itself would have to be relaxed to ~20% for 1º data, or to ~8% for 2.5º data. The study also quantified the relation between confidence interval width and spatial resolution, confidence level, and number of observations. Cloud regime (mean and standard deviation of cloud amount) was found to be the most important factor impacting the width of the confidence interval. The research has been funded by the National Science Centre of Poland, grant no. UMO-2017/25/B/ST10/01787. This research has been supported in part by PL-Grid Infrastructure (computing resources).
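The dependence of interval width on the number of observations and the confidence level follows familiar scaling; a normal-approximation simplification (the study itself uses bootstrap intervals) makes the 1/√n behaviour explicit:

```python
# Width of a normal-approximation interval for a mean cloud amount (fraction),
# showing the 1/sqrt(n) dependence on the number of satellite observations.
import math

def cloud_amount_ci(p_hat, n_obs, z=1.96):
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n_obs)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)

def width(ci):
    return ci[1] - ci[0]

# Quadrupling the number of overpasses roughly halves the interval width.
w_coarse = width(cloud_amount_ci(0.6, 100))
w_fine = width(cloud_amount_ci(0.6, 400))
```

This is why coarser grid cells (which aggregate more overpasses) and lower confidence levels can meet an accuracy criterion that fine-resolution, high-confidence estimates cannot.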


2005 ◽  
Vol 19 (4) ◽  
pp. 455-475 ◽  
Author(s):  
Karl Halvor Teigen ◽  
Magne Jørgensen

2016 ◽  
Vol 28 (8) ◽  
pp. 1694-1722 ◽  
Author(s):  
Yu Wang ◽  
Jihong Li

In typical machine learning applications such as information retrieval, precision and recall are two commonly used measures for assessing an algorithm's performance. Symmetrical confidence intervals based on K-fold cross-validated t distributions are widely used for the inference of precision and recall measures. As we confirmed through simulated experiments, however, these confidence intervals often exhibit lower degrees of confidence, which may easily lead to liberal inference results. Thus, it is crucial to construct faithful confidence (credible) intervals for precision and recall with a high degree of confidence and a short interval length. In this study, we propose two posterior credible intervals for precision and recall based on K-fold cross-validated beta distributions. The first credible interval for precision (or recall) is constructed based on the beta posterior distribution inferred by all K data sets corresponding to K confusion matrices from a K-fold cross-validation. Second, considering that each data set corresponding to a confusion matrix from a K-fold cross-validation can be used to infer a beta posterior distribution of precision (or recall), the second proposed credible interval for precision (or recall) is constructed based on the average of K beta posterior distributions. Experimental results on simulated and real data sets demonstrate that the first credible interval proposed in this study almost always resulted in degrees of confidence greater than 95%. With an acceptable degree of confidence, both of our two proposed credible intervals have shorter interval lengths than those based on a corrected K-fold cross-validated t distribution. 
Meanwhile, the average ranks of these two credible intervals are superior to that of the confidence interval based on a K-fold cross-validated t distribution for the degree of confidence, and superior to that of the confidence interval based on a corrected K-fold cross-validated t distribution for the interval length, in all 27 cases of simulated and real data experiments. However, the confidence intervals based on the K-fold and corrected K-fold cross-validated t distributions fall at the two extremes. Thus, when the reliability of inference for precision and recall is the focus, the proposed methods are preferable, especially the first credible interval.
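A minimal sketch of a beta posterior credible interval for precision from a single confusion matrix, using Monte Carlo draws rather than exact beta quantiles; a Beta(1, 1) prior is assumed, and the paper's constructions additionally combine information across the K folds:

```python
# Monte Carlo equal-tailed credible interval for precision = TP / (TP + FP)
# under a Beta(1, 1) prior, so the posterior is Beta(TP + 1, FP + 1).
import random

def beta_credible_interval(tp, fp, alpha=0.05, n_draws=20000, seed=7):
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(tp + 1, fp + 1) for _ in range(n_draws))
    return (draws[int(alpha / 2 * n_draws)],
            draws[int((1 - alpha / 2) * n_draws) - 1])

lo, hi = beta_credible_interval(tp=90, fp=10)  # hypothetical confusion counts
```

The same construction applies to recall by replacing false positives with false negatives.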

