Model selection in covariance structures analysis and the "problem" of sample size: A clarification.

1991, Vol 109 (3), pp. 512-519. Author(s): Robert Cudeck, Susan J. Henly

2017, Vol 17 (3), pp. 196-222. Author(s): Rasim M. Musal, Tahir Ekin

Overpayment estimation from a sample of audited medical claims is a commonly used method for determining recoupment amounts. The current practice, which relies on the central limit theorem, may not be efficient for certain kinds of claims data, including skewed payment populations with partial overpayments. As an alternative, we propose a novel Bayesian inflated mixture model. We analyze the validity and efficiency of the model estimates for a number of payment populations and overpayment scenarios. In addition, learning about the parameters of the overpayment distribution as the sample size increases may provide insights for medical investigators. We conclude with a discussion of model selection and potential modelling extensions.
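The abstract does not give the model specification, but a minimal sketch of the inflated-mixture idea, assuming a zero/one-inflated Beta mixture over per-claim overpayment fractions, might look as follows. All names and values here (p_zero, p_full, the Beta shape parameters) are illustrative assumptions, and a coarse likelihood grid stands in for a full Bayesian fit.

```python
# Illustrative sketch of a zero/one-inflated mixture for per-claim
# overpayment fractions. Parameter names and values are assumptions
# for demonstration, not the authors' specification.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_overpayment_fractions(n, p_zero=0.5, p_full=0.1, a=2.0, b=5.0):
    """Draw fractions from an inflated mixture: 0 w.p. p_zero,
    1 w.p. p_full, and Beta(a, b) (partial overpayment) otherwise."""
    u = rng.uniform(size=n)
    return np.where(u < p_zero, 0.0,
           np.where(u < p_zero + p_full, 1.0,
                    rng.beta(a, b, size=n)))

def log_likelihood(frac, p_zero, p_full, a, b):
    """Log-likelihood of the inflated mixture for observed fractions."""
    p_partial = 1.0 - p_zero - p_full
    partial = frac[(frac > 0.0) & (frac < 1.0)]
    return (np.sum(frac == 0.0) * np.log(p_zero)
            + np.sum(frac == 1.0) * np.log(p_full)
            + np.sum(np.log(p_partial) + stats.beta.logpdf(partial, a, b)))

# A coarse grid search over the mixture weights; with a flat prior the
# posterior is proportional to this likelihood, so the maximizer is a
# MAP-like estimate.
frac = simulate_overpayment_fractions(200)
grid = [(pz, pf) for pz in np.arange(0.3, 0.7, 0.05)
                 for pf in np.arange(0.05, 0.25, 0.05)]
best = max(grid, key=lambda g: log_likelihood(frac, g[0], g[1], 2.0, 5.0))
print("MAP-like estimate of (p_zero, p_full):", best)
```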


2020, Vol 10 (7), pp. 2448. Author(s): Liye Lv, Xueguan Song, Wei Sun

Leave-one-out cross validation (LOO-CV), a model-independent evaluation method, cannot always select the best of several models when the sample size is small. We modify the LOO-CV method by moving a validation point around random normal distributions, rather than leaving it out, and name the result move-one-away cross validation (MOA-CV), a model-dependent method. The key aim of this method is to improve the model-selection accuracy rate, which is unreliable for LOO-CV when samples are scarce. The errors from LOO-CV and MOA-CV, denoted LOO-CVerror and MOA-CVerror respectively, are used to select the best of four typical surrogate models on four standard mathematical test functions and one engineering problem. The coefficient of determination (R-square, R2) serves as a calibration for MOA-CVerror and LOO-CVerror. The results show that: (i) in terms of selecting the best model, both MOA-CV and LOO-CV improve as the sample size increases; (ii) MOA-CV performs better at selecting the best model than LOO-CV; and (iii) in the engineering problem, both MOA-CV and LOO-CV can choose the worst model, but in most cases MOA-CV has a higher probability of selecting the best model than LOO-CV.
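As a rough sketch of how the two criteria differ, the code below implements classic LOO-CV alongside one plausible reading of MOA-CV, in which the validation point's input location is jittered by normal noise instead of the point being deleted. The polynomial surrogates, test function, and noise scale sigma are all assumptions for illustration, not the authors' setup.

```python
# Hedged sketch contrasting leave-one-out CV with a "move-one-away" CV
# variant in which the validation point is perturbed by Gaussian noise
# rather than removed. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(3 * x) + 0.5 * x            # toy test function
x = np.sort(rng.uniform(-2, 2, size=12))         # small sample
y = f(x)

def fit_poly(x, y, deg):
    return np.poly1d(np.polyfit(x, y, deg))      # polynomial "surrogate"

def loo_cv_error(x, y, deg):
    """Classic LOO-CV: refit without point i, predict at x[i]."""
    errs = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        model = fit_poly(x[mask], y[mask], deg)
        errs.append((model(x[i]) - y[i]) ** 2)
    return np.mean(errs)

def moa_cv_error(x, y, deg, sigma=0.1, reps=10):
    """One plausible reading of MOA-CV: instead of deleting point i,
    move its input location by normal noise, refit, and score the
    prediction at the original location against the observed y[i]."""
    errs = []
    for i in range(len(x)):
        for _ in range(reps):
            x_moved = x.copy()
            x_moved[i] = x[i] + sigma * rng.standard_normal()
            model = fit_poly(x_moved, y, deg)
            errs.append((model(x[i]) - y[i]) ** 2)
    return np.mean(errs)

for deg in (1, 3, 5):                            # candidate surrogates
    print(f"deg={deg}: LOO-CVerror={loo_cv_error(x, y, deg):.4f}, "
          f"MOA-CVerror={moa_cv_error(x, y, deg):.4f}")
```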


2021, Vol 5 (1), pp. 55. Author(s): Adusei Jumah, Robert M. Kunst

Simulation-based forecast model selection considers two candidate forecast model classes, simulates from both models fitted to the data, applies both forecast models to the simulated structures, and evaluates the relative benefit of each candidate prediction tool. This approach can, for example, determine a sample size beyond which one candidate predicts best. In an application, aggregate household consumption and disposable income provide an example of error correction. Using panel data for European countries, we explore whether, and to what degree, the cointegration properties benefit forecasting. It emerges that statistical evidence of cointegration is not equivalent to better forecasting properties of the implied cointegrating structure.
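The general recipe, fit both candidates, simulate from each fitted model, forecast every simulated series with both candidates, and compare losses, can be sketched as below. An AR(1)-versus-random-walk pair stands in for the paper's cointegration/error-correction candidates; every model and parameter choice here is an illustrative assumption.

```python
# Sketch of simulation-based forecast model selection with two simple
# univariate candidates. Illustrative stand-in, not the paper's models.
import numpy as np

rng = np.random.default_rng(2)

def fit_ar1(y):
    """OLS estimate of phi in y_t = phi * y_{t-1} + e_t (no intercept)."""
    return np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])

def simulate_ar1(phi, n, y0=0.0):
    y = np.empty(n)
    y[0] = y0
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.standard_normal()
    return y

def one_step_mse(y, phi):
    """One-step-ahead squared error for the forecast phi * y_{t-1}."""
    return np.mean((y[1:] - phi * y[:-1]) ** 2)

# "Observed" data (itself simulated here, to keep the sketch self-contained).
y_obs = simulate_ar1(0.95, 120)
phi_hat = fit_ar1(y_obs)   # candidate A: estimated AR(1)
phi_rw = 1.0               # candidate B: random walk (phi fixed at 1)

# Simulate from both fitted candidates and score both forecast rules
# on every simulated path; the lower mean MSE wins.
results = {"A": [], "B": []}
for gen_phi in (phi_hat, phi_rw):
    for _ in range(200):
        y_sim = simulate_ar1(gen_phi, 120)
        results["A"].append(one_step_mse(y_sim, fit_ar1(y_sim)))
        results["B"].append(one_step_mse(y_sim, phi_rw))
print("mean MSE, candidate A (AR fit):     ", np.mean(results["A"]))
print("mean MSE, candidate B (random walk):", np.mean(results["B"]))
```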


2015. Author(s): John J. Dziak, Donna L. Coffman, Stephanie T. Lanza, Runze Li

Choosing a model with too few parameters can involve making unrealistically simple assumptions and lead to high bias, poor prediction, and missed opportunities for insight. Such models are not flexible enough to describe the sample or the population well. A model with too many parameters can fit the observed data very well, but be too closely tailored to it; such models may generalize poorly. Penalized-likelihood information criteria, such as Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Consistent AIC, and the Adjusted BIC, are widely used for model selection. However, different criteria sometimes support different models, leading to uncertainty about which criterion is the most trustworthy. In some simple cases the comparison of two models using information criteria can be viewed as equivalent to a likelihood ratio test, with the different criteria representing different alpha levels (i.e., different emphases on sensitivity or specificity; Lin & Dayton, 1997). This perspective may lead to insights about how to interpret the criteria in less simple situations. For example, AIC or BIC could be preferable, depending on sample size and on the relative importance one assigns to sensitivity versus specificity. Understanding the differences among the criteria may make it easier to compare their results and to use them to make informed decisions.
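The four criteria mentioned above follow standard penalized-likelihood formulas, and the likelihood-ratio-test equivalence falls out of comparing penalty differences. The sketch below uses hypothetical log-likelihoods, chosen only to show how the criteria can disagree.

```python
# Standard penalized-likelihood information criteria for a fit with
# log-likelihood ll, k free parameters, and sample size n.
import numpy as np
from scipy import stats

def aic(ll, k, n):   return -2 * ll + 2 * k                     # n unused; uniform signature
def bic(ll, k, n):   return -2 * ll + k * np.log(n)
def caic(ll, k, n):  return -2 * ll + k * (np.log(n) + 1)       # Consistent AIC
def abic(ll, k, n):  return -2 * ll + k * np.log((n + 2) / 24)  # Adjusted (sample-size-corrected) BIC

# Hypothetical nested fits: model 2 adds one parameter and gains two
# log-likelihood points, so LR = 2 * (ll2 - ll1) = 4. AIC and aBIC then
# prefer model 2, while BIC and CAIC prefer model 1.
ll1, ll2, k1, k2, n = -520.0, -518.0, 3, 4, 500
for name, ic in [("AIC", aic), ("BIC", bic), ("CAIC", caic), ("aBIC", abic)]:
    pick = 1 if ic(ll1, k1, n) < ic(ll2, k2, n) else 2
    print(f"{name} prefers model {pick}")

# The IC comparison is a likelihood ratio test whose critical value is
# the penalty difference, so each criterion implies an alpha level.
def implied_alpha(penalty_per_param, df):
    return 1 - stats.chi2.cdf(penalty_per_param * df, df)

print("alpha implied by AIC:", implied_alpha(2.0, 1))         # ~0.157
print("alpha implied by BIC:", implied_alpha(np.log(n), 1))   # shrinks as n grows
```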

