Cross-Validation Selection of Regularisation Parameter(s) for Semiparametric Transformation Models

Author(s):  
Senay Sokullu ◽  
Sami Stouli


2002 ◽
Vol 14 (10) ◽  
pp. 2439-2468 ◽  
Author(s):  
Aki Vehtari ◽  
Jouko Lampinen

In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate because it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use multilayer perceptron neural networks and Gaussian processes with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
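As a hedged illustration (not code from the paper), the Python sketch below estimates an expected utility, here the predictive log-density of a plug-in Gaussian model standing in for a full hierarchical Bayesian model, by k-fold cross-validation, and then uses the Bayesian bootstrap (Dirichlet weights over the per-observation utilities) to approximate the distribution of the estimate. All data and the toy model are placeholders.

```python
# Minimal sketch (not the paper's code): k-fold cross-validation estimate of an
# expected utility (predictive log-density under a plug-in Gaussian model) and a
# Bayesian bootstrap sample of the estimate's distribution. The Gaussian "model"
# is a stand-in for a hierarchical Bayesian model fitted by MCMC.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=2.0, size=100)   # toy data

def log_pred_density(y_test, y_train):
    # Plug-in Gaussian predictive density fitted on the training fold.
    mu, sigma = y_train.mean(), y_train.std(ddof=1)
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y_test - mu) ** 2 / (2 * sigma**2)

# k-fold cross-validation: one utility value per held-out observation.
k = 10
folds = np.array_split(rng.permutation(len(y)), k)
utilities = np.empty(len(y))
for fold in folds:
    train = np.setdiff1d(np.arange(len(y)), fold)
    utilities[fold] = log_pred_density(y[fold], y[train])

# Bayesian bootstrap: Dirichlet(1,...,1) weights over observations give
# draws from the distribution of the expected utility estimate.
weights = rng.dirichlet(np.ones(len(y)), size=4000)
bb_samples = weights @ utilities

print("expected utility estimate:", utilities.mean())
print("95% interval:", np.percentile(bb_samples, [2.5, 97.5]))
```

With two models evaluated on the same folds, the paired bootstrap draws of their expected utilities give a direct estimate of the probability that one model outperforms the other, which is the comparison described in the abstract.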


2019 ◽  
Vol 17 (03) ◽  
pp. 1950017 ◽  
Author(s):  
Matthew Stephenson ◽  
Gerarda A. Darlington ◽  
Flavio S. Schenkel ◽  
E. James Squires ◽  
R. Ayesha Ali

Genetic selection of farm animals plays an important role in genetic improvement programs. Regularized regression methods on single nucleotide polymorphism (SNP) data from a set of candidate genes can help to identify genes that are associated with the trait of interest. This complex task must also consider the relative effect sizes on the desired trait and account for the relationships among the candidate SNPs, so that selection of a SNP does not promote other undesirable traits through breeding. We present the Doubly Sparse Regression Incorporating Graphical structure (DSRIG), a novel regularized method for genetic selection that exploits the relationships among candidate SNPs to improve prediction. DSRIG was applied in the prediction of skatole and androstenone levels, two compounds known to be associated with boar taint. DSRIG was shown to provide a predictive benefit when compared to ordinary least squares (OLS) and the least absolute shrinkage and selection operator (LASSO) in a cross-validation procedure. The relative sizes of the coefficient estimates over the cross-validation procedure were compared to determine which SNPs may have the greatest impact on expression of the boar taint compounds, and a consensus graph was used to infer the relationships among the SNPs.
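Since DSRIG is introduced in the paper and not reproduced here, the sketch below only illustrates the baseline cross-validation comparison mentioned above, OLS versus LASSO on simulated SNP-coded (0/1/2) predictors; a fitted DSRIG model would be evaluated in the same loop. All data, effect sizes, and names are hypothetical placeholders, not the boar taint data.

```python
# Illustrative sketch only: cross-validated comparison of OLS and LASSO on
# SNP-style (0/1/2) genotype codes, the baselines against which DSRIG was judged.
# DSRIG itself is not implemented here. Data are simulated placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression, LassoCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)     # SNP genotype codes 0/1/2
beta = np.zeros(p)
beta[:5] = [0.8, -0.6, 0.5, 0.4, -0.3]                 # a few causal SNPs (hypothetical)
y = X @ beta + rng.normal(scale=1.0, size=n)           # stand-in for, e.g., skatole level

mse = {"OLS": [], "LASSO": []}
for train, test in KFold(n_splits=5, shuffle=True, random_state=1).split(X):
    for name, model in [("OLS", LinearRegression()), ("LASSO", LassoCV(cv=5))]:
        model.fit(X[train], y[train])
        mse[name].append(np.mean((model.predict(X[test]) - y[test]) ** 2))

for name, errs in mse.items():
    print(f"{name}: mean CV MSE = {np.mean(errs):.3f}")
```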


1988 ◽  
Vol 110 (1) ◽  
pp. 37-41 ◽  
Author(s):  
C. R. Dohrmann ◽  
H. R. Busby ◽  
D. M. Trujillo

Smoothing and differentiation of noisy data using spline functions requires the selection of an unknown smoothing parameter. The method of generalized cross-validation provides an excellent estimate of the smoothing parameter from the data itself even when the amount of noise associated with the data is unknown. In the present model only a single smoothing parameter must be obtained, but in a more general context the number may be larger. In an earlier work, smoothing of the data was accomplished by solving a minimization problem using the technique of dynamic programming. This paper shows how the computations required by generalized cross-validation can be performed as a simple extension of the dynamic programming formulas. The results of numerical experiments are also included.
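The sketch below is not the paper's dynamic-programming formulation; it only illustrates generalized cross-validation for a single smoothing parameter, using a discrete second-difference (Whittaker-style) penalty whose smoother matrix can be written down explicitly. The paper's contribution is computing the same GCV quantities recursively rather than by forming this matrix.

```python
# Minimal sketch (assumed setup, not the paper's algorithm): choose the smoothing
# parameter of a discrete smoothing spline by generalized cross-validation.
# The hat matrix S(lam) = (I + lam * D'D)^{-1} is formed explicitly for clarity.
import numpy as np

rng = np.random.default_rng(2)
n = 200
t = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * t) + rng.normal(scale=0.2, size=n)   # noisy signal

D = np.diff(np.eye(n), n=2, axis=0)          # second-difference penalty operator

def gcv(lam):
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)   # smoother ("hat") matrix
    resid = y - S @ y
    # GCV(lam) = n * ||(I - S) y||^2 / (trace(I - S))^2
    return n * np.sum(resid**2) / (n - np.trace(S)) ** 2

lams = np.logspace(-2, 4, 60)
best = lams[np.argmin([gcv(l) for l in lams])]
print("GCV-selected smoothing parameter:", best)
```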


2015 ◽  
Vol 57 (5) ◽  
pp. 808-833 ◽  
Author(s):  
Eunhee Kim ◽  
Donglin Zeng ◽  
Xiao-Hua Zhou
