Distributional Robustness of K-class Estimators and the PULSE

Econometrics Journal ◽

10.1093/ectj/utab031 ◽

2021 ◽

Author(s):

Martin Emil Jakobsen ◽

Jonas Peters

Keyword(s):

Instrumental Variable ◽

Optimization Problem ◽

Confidence Region ◽

Real Data ◽

Data Driven ◽

Simulation Experiments ◽

Distributional Robustness ◽

Mean Squared Prediction Error ◽

Squared Prediction Error ◽

The Mean

Abstract While causal models are robust in that they are prediction optimal under arbitrarily strong interventions, they may not be optimal when the interventions are bounded. We prove that the classical K-class estimator satisfies such optimality by establishing a connection between K-class estimators and anchor regression. This connection further motivates a novel estimator in instrumental variable settings that minimizes the mean squared prediction error subject to the constraint that the estimator lies in an asymptotically valid confidence region of the causal coefficient. We call this estimator PULSE (p-uncorrelated least squares estimator), relate it to work on invariance, show that it can be computed efficiently as a data-driven K-class estimator, even though the underlying optimization problem is non-convex, and prove consistency. We evaluate the estimators on real data and perform simulation experiments illustrating that PULSE suffers from less variability. There are several settings including weak instrument settings, where it outperforms other estimators.

Download Full-text

A simulated annealing-based algorithm for selecting balanced samples

Computational Statistics ◽

10.1007/s00180-021-01113-3 ◽

2021 ◽

Author(s):

Roberto Benedetti ◽

Maria Michela Dickson ◽

Giuseppe Espa ◽

Francesco Pantalone ◽

Federica Piersimoni

Keyword(s):

Simulated Annealing ◽

Optimization Problem ◽

Sample Selection ◽

Auxiliary Information ◽

Real Data ◽

Simulation Experiments ◽

Balanced Sampling ◽

Inclusion Probabilities ◽

Random Method ◽

Annealing Algorithms

AbstractBalanced sampling is a random method for sample selection, the use of which is preferable when auxiliary information is available for all units of a population. However, implementing balanced sampling can be a challenging task, and this is due in part to the computational efforts required and the necessity to respect balancing constraints and inclusion probabilities. In the present paper, a new algorithm for selecting balanced samples is proposed. This method is inspired by simulated annealing algorithms, as a balanced sample selection can be interpreted as an optimization problem. A set of simulation experiments and an example using real data shows the efficiency and the accuracy of the proposed algorithm.

Download Full-text

Summary of the Paper Entitled ‘The Mean Squared Prediction Error Paradox’

SSRN Electronic Journal ◽

10.2139/ssrn.3756747 ◽

2020 ◽

Author(s):

Pablo M. Pincheira ◽

Nicolas Hardy

Keyword(s):

Prediction Error ◽

Mean Squared Prediction Error ◽

Squared Prediction Error ◽

The Mean

Download Full-text

The L-Curve Criterion as a Model Selection Tool in PLS Regression

Journal of Probability and Statistics ◽

10.1155/2019/3129769 ◽

2019 ◽

Vol 2019 ◽

pp. 1-7

Author(s):

Abdelmounaim Kerkri ◽

Jelloul Allal ◽

Zoubir Zarrouk

Keyword(s):

Model Selection ◽

Least Squares ◽

Cross Validation ◽

Real Data ◽

Ordinary Least Squares ◽

Pls Regression ◽

Selection Tool ◽

Reliable Model ◽

Mean Squared Prediction Error ◽

Squared Prediction Error

Partial least squares (PLS) regression is an alternative to the ordinary least squares (OLS) regression, used in the presence of multicollinearity. As with any other modelling method, PLS regression requires a reliable model selection tool. Cross validation (CV) is the most commonly used tool with many advantages in both preciseness and accuracy, but it also has some drawbacks; therefore, we will use L-curve criterion as an alternative, given that it takes into consideration the shrinking nature of PLS. A theoretical justification for the use of L-curve criterion is presented as well as an application on both simulated and real data. The application shows how this criterion generally outperforms cross validation and generalized cross validation (GCV) in mean squared prediction error and computational efficiency.

Download Full-text

Subpopulations and accuracy of prediction in pig carcass classification

Animal Science ◽

10.1017/s1357729800053820 ◽

2004 ◽

Vol 78 (1) ◽

pp. 37-52 ◽

Cited By ~ 4

Author(s):

B. Engel ◽

W. G. Buist ◽

M. Font i Furnols ◽

E. Lambooij

Keyword(s):

Prediction Error ◽

Real Data ◽

Prediction Formula ◽

Lean Meat ◽

Genetic Types ◽

Mean Squared Prediction Error ◽

Squared Prediction Error ◽

Run Up ◽

Pig Carcasses ◽

Accuracy Of Prediction

AbstractClassification of pig carcasses in the European Community (EC) is based on the lean meat percentage of the carcasses. The lean meat percentage is predicted from instrumental carcass measurements, such as fat and muscle depth measurements, obtained in the slaughterline. The prediction formula for an instrument is derived from the data of a dissection experiment. When the relationship between percentage lean and instrumental carcass measurements differs between subpopulations, such as sexes or breeds, accuracy of prediction may differ between these subpopulations. In particular for some subpopulations predicted lean meat percentages may be systematically too low and for other subpopulations systematically too high. Producers or buyers that largely specialize in subpopulations where the percentage lean is underestimated, are put at a financial disadvantage.The aim of this paper is to gain insight, on the basis of real data, into the effects of differences between subpopulations on the accuracy of the predicted percentage lean meat of pig carcasses. A simulation study was performed based on data from dissection trials in The Netherlands, comprising gilts and castrated males, and trials in Spain, comprising different genetic types. The possible gain in accuracy, i.e. reduction of prediction bias and mean squared prediction error, by the use of separate prediction formulae for (some of) the subpopulations was determined.We concluded that marked bias in the predicted percentage lean meat may occur between subpopulations when a single overall prediction formula is employed. Systematic differences in predicted percentage lean between subpopulations that are overestimated and underestimated may exceed 4% and for selected values of instrumental measurements may run up to 6%. Bias between subpopulations may be eliminated, and prediction accuracy may be markedly improved, when separate prediction formulae are used. With the use of separate formulae the root mean squared prediction error may be reduced by 13 to 26% of the expected value when a single prediction formula is used for all pig carcasses.These are substantial reductions on a national scale. This suggests that there will be a commercial interest in the use of separate prediction formulae for different subpopulations. In the near future, when the use of implants becomes more reliable, subpopulations will be recognized automatically in the slaughterline and use of different prediction formulae will become practically feasible. Some possible consequences for the EC regulations and national safeguards for quality of prediction formulae are discussed.

Download Full-text

Validation of the use of Bayesian Analysis in the Optimization of Gentamicin Therapy from the Commencement of Dosing

Drug Intelligence & Clinical Pharmacy ◽

10.1177/106002808802200112 ◽

1988 ◽

Vol 22 (1) ◽

pp. 49-53 ◽

Cited By ~ 11

Author(s):

Henry Chrystyn

Keyword(s):

Bayesian Analysis ◽

Prediction Error ◽

Statistical Technique ◽

Predictive Algorithm ◽

Mean Squared Prediction Error ◽

Squared Prediction Error ◽

Gentamicin Concentration ◽

The Mean ◽

Gentamicin Therapy ◽

Trough Levels

A computer program based on the statistical technique of Bayesian analysis has been adapted to run on several microcomputers. The clinical application of this method for gentamicin has been validated in 13 patients with varying degrees of renal function by a comparison of the accuracy of this method to a predictive algorithm method and one using standard pharmacokinetic principles. Blood samples for serum gentamicin analysis were taken after the administraiton of an intravenous loading dose of gentamicin. The results produced by each method were used to predict the peak and trough values measured on day 3 of therapy. Of the three methods studied, Bayesian analysis, using a serum gentamicin concentration drawn four hours after the initial dose, was the least biased and the most precise method for predicting the observed levels. The mean prediction error of the Bayesian analysis method, using the four-hour sample, was −0.03 mg/L for the peak serum concentration and −0.07 mg/L for the trough level on day 3. Using this method the corresponding root mean squared prediction error was 0.60 mg/L and 0.36 mg/L for the peak and trough levels, respectively.

Download Full-text

Exergy and sensibility analysis of each individual effect in a kraft multiple effect evaporator

TAPPI Journal ◽

10.32964/tj18.10.607 ◽

2019 ◽

Vol 18 (10) ◽

pp. 607-618

Author(s):

JÉSSICA MOREIRA ◽

BRUNO LACERDA DE OLIVEIRA CAMPOS ◽

ESLY FERREIRA DA COSTA JUNIOR ◽

ANDRÉA OLIVEIRA SOUZA DA COSTA

Keyword(s):

Optimization Problem ◽

Real Data ◽

Kraft Pulping ◽

Steam Temperature ◽

Input Flow ◽

Transfer Coefficients ◽

Heat Transfer Coefficients ◽

Multiple Effect ◽

Sensibility Analysis ◽

Exergetic Analysis

The multiple effect evaporator (MEE) is an energy intensive step in the kraft pulping process. The exergetic analysis can be useful for locating irreversibilities in the process and pointing out which equipment is less efficient, and it could also be the object of optimization studies. In the present work, each evaporator of a real kraft system has been individually described using mass balance and thermodynamics principles (the first and the second laws). Real data from a kraft MEE were collected from a Brazilian plant and were used for the estimation of heat transfer coefficients in a nonlinear optimization problem, as well as for the validation of the model. An exergetic analysis was made for each effect individually, which resulted in effects 1A and 1B being the least efficient, and therefore having the greatest potential for improvement. A sensibility analysis was also performed, showing that steam temperature and liquor input flow rate are sensible parameters.

Download Full-text

Mendelian randomisation for mediation analysis: current methods and challenges for implementation

European Journal of Epidemiology ◽

10.1007/s10654-021-00757-1 ◽

2021 ◽

Author(s):

Alice R. Carter ◽

Eleanor Sanderson ◽

Gemma Hammerton ◽

Rebecca C. Richmond ◽

George Davey Smith ◽

...

Keyword(s):

Measurement Error ◽

Causal Inference ◽

Mediation Analysis ◽

Instrumental Variable ◽

Real Data ◽

Mendelian Randomisation ◽

Differential Measurement ◽

Individual Level ◽

Differential Measurement Error ◽

Summary Data

AbstractMediation analysis seeks to explain the pathway(s) through which an exposure affects an outcome. Traditional, non-instrumental variable methods for mediation analysis experience a number of methodological difficulties, including bias due to confounding between an exposure, mediator and outcome and measurement error. Mendelian randomisation (MR) can be used to improve causal inference for mediation analysis. We describe two approaches that can be used for estimating mediation analysis with MR: multivariable MR (MVMR) and two-step MR. We outline the approaches and provide code to demonstrate how they can be used in mediation analysis. We review issues that can affect analyses, including confounding, measurement error, weak instrument bias, interactions between exposures and mediators and analysis of multiple mediators. Description of the methods is supplemented by simulated and real data examples. Although MR relies on large sample sizes and strong assumptions, such as having strong instruments and no horizontally pleiotropic pathways, our simulations demonstrate that these methods are unaffected by confounders of the exposure or mediator and the outcome and non-differential measurement error of the exposure or mediator. Both MVMR and two-step MR can be implemented in both individual-level MR and summary data MR. MR mediation methods require different assumptions to be made, compared with non-instrumental variable mediation methods. Where these assumptions are more plausible, MR can be used to improve causal inference in mediation analysis.

Download Full-text

Estimation of Site Amplification from Geotechnical Array Data Using Neural Networks

Bulletin of the Seismological Society of America ◽

10.1785/0120200346 ◽

2021 ◽

Author(s):

Daniel Roten ◽

Kim B. Olsen

Keyword(s):

Neural Network ◽

Neural Networks ◽

Site Response ◽

Site Amplification ◽

Data Driven ◽

Vertical Array ◽

The Neural Network ◽

Simplifying Assumptions ◽

The Mean ◽

Fully Connected

ABSTRACT We use deep learning to predict surface-to-borehole Fourier amplification functions (AFs) from discretized shear-wave velocity profiles. Specifically, we train a fully connected neural network and a convolutional neural network using mean AFs observed at ∼600 KiK-net vertical array sites. Compared with predictions based on theoretical SH 1D amplifications, the neural network (NN) results in up to 50% reduction of the mean squared log error between predictions and observations at sites not used for training. In the future, NNs may lead to a purely data-driven prediction of site response that is independent of proxies or simplifying assumptions.

Download Full-text

Multi-Instance Dimensionality Reduction via Sparsity and Orthogonality

Neural Computation ◽

10.1162/neco_a_01140 ◽

2018 ◽

Vol 30 (12) ◽

pp. 3281-3308

Author(s):

Hong Zhu ◽

Li-Zhi Liao ◽

Michael K. Ng

Keyword(s):

Dimensionality Reduction ◽

Optimization Problem ◽

Augmented Lagrangian ◽

Main Idea ◽

Real Data ◽

Learning Performance ◽

High Dimensional ◽

Data Sets ◽

Outer Loop ◽

Orthogonality Constraints

We study a multi-instance (MI) learning dimensionality-reduction algorithm through sparsity and orthogonality, which is especially useful for high-dimensional MI data sets. We develop a novel algorithm to handle both sparsity and orthogonality constraints that existing methods do not handle well simultaneously. Our main idea is to formulate an optimization problem where the sparse term appears in the objective function and the orthogonality term is formed as a constraint. The resulting optimization problem can be solved by using approximate augmented Lagrangian iterations as the outer loop and inertial proximal alternating linearized minimization (iPALM) iterations as the inner loop. The main advantage of this method is that both sparsity and orthogonality can be satisfied in the proposed algorithm. We show the global convergence of the proposed iterative algorithm. We also demonstrate that the proposed algorithm can achieve high sparsity and orthogonality requirements, which are very important for dimensionality reduction. Experimental results on both synthetic and real data sets show that the proposed algorithm can obtain learning performance comparable to that of other tested MI learning algorithms.

Download Full-text

Velocity inversion by global optimization using finite-offset common-reflection-surface stacking applied to synthetic and Tacutu Basin seismic data

Geophysics ◽

10.1190/geo2017-0117.1 ◽

2019 ◽

Vol 84 (2) ◽

pp. R165-R174 ◽

Cited By ~ 1

Author(s):

Marcelo Jorge Luz Mesquita ◽

João Carlos Ribeiro Cruz ◽

German Garabito Callapino

Keyword(s):

Objective Function ◽

Real Data ◽

Velocity Model ◽

Data Sets ◽

Layer By Layer ◽

Velocity Inversion ◽

Very Fast Simulated Annealing ◽

Target Layer ◽

Common Reflection Surface ◽

The Mean

Estimation of an accurate velocity macromodel is an important step in seismic imaging. We have developed an approach based on coherence measurements and finite-offset (FO) beam stacking. The algorithm is an FO common-reflection-surface tomography, which aims to determine the best layered depth-velocity model by finding the model that maximizes a semblance objective function calculated from the amplitudes in common-midpoint (CMP) gathers stacked over a predetermined aperture. We develop the subsurface velocity model with a stack of layers separated by smooth interfaces. The algorithm is applied layer by layer from the top downward in four steps per layer. First, by automatic or manual picking, we estimate the reflection times of events that describe the interfaces in a time-migrated section. Second, we convert these times to depth using the velocity model via application of Dix’s formula and the image rays to the events. Third, by using ray tracing, we calculate kinematic parameters along the central ray and build a paraxial FO traveltime approximation for the FO common-reflection-surface method. Finally, starting from CMP gathers, we calculate the semblance of the selected events using this paraxial traveltime approximation. After repeating this algorithm for all selected CMP gathers, we use the mean semblance values as an objective function for the target layer. When this coherence measure is maximized, the model is accepted and the process is completed. Otherwise, the process restarts from step two with the updated velocity model. Because the inverse problem we are solving is nonlinear, we use very fast simulated annealing to search the velocity parameters in the target layers. We test the method on synthetic and real data sets to study its use and advantages.

Download Full-text