Statistics and Computing
Latest Publications


TOTAL DOCUMENTS: 1347 (five years: 218)

H-INDEX: 64 (five years: 5)

Published by Springer-Verlag

ISSN: 0960-3174 (print), 1573-1375 (online)

2021, Vol 32 (1)
Author(s): Simon Godsill, Yaman Kındap

Abstract: In this paper, novel simulation methods are provided for the generalised inverse Gaussian (GIG) Lévy process. Such processes are intractable to simulate except in certain special edge cases, since the Lévy density associated with the GIG process is expressed as an integral involving certain Bessel functions, known as the Jaeger integral in diffusive transport applications. Here we show, for the first time, how to solve the problem indirectly, using generalised shot-noise methods to simulate the underlying point processes and constructing an auxiliary-variables approach that avoids any direct calculation of the integrals involved. The resulting augmented bivariate process is still intractable, so we propose a novel thinning method based on upper bounds on the intractable integrand. Moreover, our approach leads to lower and upper bounds on the Jaeger integral itself, which may be compared with other approximation methods. The shot-noise method involves a truncated infinite series of decreasing random variables and is therefore approximate, although the series is found to converge rapidly in most cases. We note that the GIG process is the required Brownian-motion subordinator for the generalised hyperbolic (GH) Lévy process, so our simulation approach extends straightforwardly to these intractable processes as well. Our new methods will find application in forward simulation of processes of GIG and GH type, for example in financial and engineering data, as well as in inference for the states and parameters of stochastic processes driven by GIG and GH Lévy processes.
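
The generalised shot-noise construction described in the abstract can be sketched generically: jumps of a dominating process are generated from the epochs of a unit-rate Poisson process, then thinned using the ratio of the target Lévy density to its tractable upper bound. Below is a minimal illustration in Python; `dominating_inverse` and `accept_prob` are hypothetical stand-ins for the paper's GIG-specific bounds, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def shot_noise_subordinator(T, n_terms, dominating_inverse, accept_prob):
    """Truncated generalised shot-noise simulation of a subordinator on
    [0, T] via a dominating point process plus thinning.

    dominating_inverse: maps unit-rate Poisson epochs to decreasing jump
        sizes of the dominating process (assumed available in closed form).
    accept_prob: thinning probability, the ratio of the target Levy
        density to its upper bound, valued in [0, 1].
    """
    epochs = np.cumsum(rng.exponential(size=n_terms))   # Gamma_1 < Gamma_2 < ...
    jumps = dominating_inverse(epochs / T)              # decreasing jump sizes
    keep = rng.uniform(size=n_terms) < accept_prob(jumps)  # thinning step
    jumps = jumps[keep]
    times = rng.uniform(0.0, T, size=jumps.size)        # i.i.d. jump locations
    order = np.argsort(times)
    return times[order], np.cumsum(jumps[order])        # sample-path skeleton

# Toy example with illustrative (not GIG-specific) choices:
path_t, path_x = shot_noise_subordinator(
    T=1.0, n_terms=5000,
    dominating_inverse=lambda g: 1.0 / (1.0 + g),       # toy inverse tail mass
    accept_prob=lambda x: np.exp(-x),                   # toy thinning ratio
)
```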


2021, Vol 32 (1)
Author(s): Juan Kuntz, Francesca R. Crucinio, Adam M. Johansen

Abstract: We introduce a class of Monte Carlo estimators that aim to overcome the rapid growth of variance with dimension often observed for standard estimators, by exploiting the target's independence structure. We identify the most basic incarnations of these estimators with a class of generalized U-statistics, and thus establish their unbiasedness, consistency, and asymptotic normality. Moreover, we show that they achieve the minimum possible variance among a broad class of estimators, and we investigate their computational cost and delineate the settings in which they are most efficient. We exemplify the merger of these estimators with other well-known Monte Carlo estimators so as to better adapt the latter to the target's independence structure and improve their performance. We do this via three simple mergers: one with importance sampling, another with importance sampling squared, and a final one with pseudo-marginal Metropolis–Hastings. In all cases, we show that the resulting estimators are well founded and achieve lower variances than their standard counterparts. Lastly, we illustrate the various variance reductions through several examples.
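
To illustrate the basic idea: suppose the target factorises as pi(x, y) = pi1(x) pi2(y) and we want E[phi(X, Y)]. A standard estimator averages phi over N matched sample pairs; a generalised-U-statistic estimator averages over all N^2 cross pairs, reusing every sample. The sketch below shows only this simplest incarnation, not the paper's general construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def paired_estimator(xs, ys, phi):
    # Standard Monte Carlo: average phi over N matched pairs.
    return np.mean([phi(x, y) for x, y in zip(xs, ys)])

def all_pairs_estimator(xs, ys, phi):
    # Generalised-U-statistic style: average over all N^2 cross pairs,
    # valid when the target factorises as pi(x, y) = pi1(x) * pi2(y).
    return np.mean([phi(x, y) for x in xs for y in ys])

# Toy check with phi(x, y) = x * y under independent standard normals,
# where the true expectation is 0.
xs, ys = rng.normal(size=100), rng.normal(size=100)
phi = lambda x, y: x * y
print(paired_estimator(xs, ys, phi), all_pairs_estimator(xs, ys, phi))
```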


2021, Vol 32 (1)
Author(s): Umberto Amato, Anestis Antoniadis, Italia De Feis, Irène Gijbels

Abstract: This article studies M-type estimators for fitting robust additive models in the presence of anomalous data. The components in the additive model are allowed to have different degrees of smoothness. We introduce a new class of wavelet-based robust M-type estimators for performing simultaneous additive component estimation and variable selection in such inhomogeneous additive models. Each additive component is approximated by a truncated series expansion of wavelet bases, making it feasible to apply the method to nonequispaced data and to sample sizes that are not necessarily a power of 2. Sparsity of the additive components, together with sparsity of the wavelet coefficients within each component (group), results in a bi-level group variable selection problem. In this framework, we discuss robust estimation and variable selection. A two-stage computational algorithm is proposed, consisting of a fast accelerated proximal gradient algorithm of coordinate descent type followed by thresholding. When using nonconvex redescending loss functions and appropriate nonconvex penalty functions at the group level, we establish optimal convergence rates of the estimates. We prove variable selection consistency under a weak compatibility condition for sparse additive models. The theoretical results are complemented with simulations, real data analyses, and a comparison to other existing methods.
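
The bi-level selection can be shown schematically: a whole additive component is dropped when the norm of its wavelet coefficient group is small, and surviving groups are thresholded coefficient by coefficient. A toy sketch follows, with hypothetical thresholds `lam_group` and `lam_within` standing in for the paper's calibrated penalties.

```python
import numpy as np

def bilevel_threshold(group_coeffs, lam_group, lam_within):
    """Illustrative bi-level selection on one additive component:
    drop the whole group when its norm is small (component selection),
    otherwise soft-threshold individual wavelet coefficients within it.
    lam_group and lam_within are hypothetical tuning parameters, not
    the paper's calibrated nonconvex penalties."""
    if np.linalg.norm(group_coeffs) <= lam_group:
        return np.zeros_like(group_coeffs)        # component deselected
    shrunk = np.abs(group_coeffs) - lam_within    # within-group shrinkage
    return np.sign(group_coeffs) * np.maximum(shrunk, 0.0)
```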


2021, Vol 32 (1)
Author(s): Marc Lambert, Silvère Bonnabel, Francis Bach

2021, Vol 32 (1)
Author(s): Gabriel Frisch, Jean-Benoist Leger, Yves Grandvalet

2021, Vol 32 (1)
Author(s): Lena Sembach, Jan Pablo Burgard, Volker Schulz

Abstract: Gaussian mixture models are a powerful tool in data science and statistics, used mainly for clustering and density approximation. The task of estimating the model parameters is in practice often solved by the expectation maximization (EM) algorithm, which has the benefits of simplicity and low per-iteration cost. However, EM converges slowly when there is a large share of hidden information or overlapping clusters. Manifold optimization for Gaussian mixture models has recently gained increasing interest. We introduce an explicit formula for the Riemannian Hessian of Gaussian mixture models and, building on it, propose a new Riemannian Newton trust-region method that outperforms current approaches in both runtime and number of iterations. We apply our method to clustering problems and density approximation tasks. Compared to existing methods, it is particularly powerful for data with a large share of hidden information.
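
For reference, the EM baseline the abstract compares against has simple closed-form updates; one iteration is sketched below. The paper's contribution, the Riemannian Newton trust-region steps built on an explicit Riemannian Hessian, is not reproduced here.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, weights, means, covs):
    """One EM iteration for a Gaussian mixture: the first-order,
    fixed-point baseline that the paper's second-order method replaces."""
    K = len(weights)
    # E-step: posterior responsibility of each component for each point.
    resp = np.column_stack([
        w * multivariate_normal.pdf(X, mean=m, cov=c)
        for w, m, c in zip(weights, means, covs)
    ])
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: closed-form updates of weights, means, and covariances.
    Nk = resp.sum(axis=0)
    weights = Nk / X.shape[0]
    means = [resp[:, k] @ X / Nk[k] for k in range(K)]
    covs = [
        ((resp[:, k, None] * (X - means[k])).T @ (X - means[k])) / Nk[k]
        for k in range(K)
    ]
    return weights, means, covs
```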


2021, Vol 32 (1)
Author(s): Yang Liu, Robert J. B. Goudie

Abstract: Bayesian modelling enables us to accommodate complex forms of data and make comprehensive inferences, but the effect of partial misspecification of the model is a concern. One approach in this setting is to modularize the model and prevent feedback from suspect modules, using a cut model. After observing data, this leads to the cut distribution, which normally does not have a closed form. Previous studies have proposed algorithms to sample from this distribution, but these algorithms have unclear theoretical convergence properties. To address this, we propose a new algorithm called the stochastic approximation cut (SACut) algorithm as an alternative. The algorithm is divided into two parallel chains: the main chain targets an approximation to the cut distribution, while the auxiliary chain is used to form an adaptive proposal distribution for the main chain. We prove convergence of the samples drawn by the proposed algorithm and present the exact limit. Although SACut is biased, since the main chain does not target the exact cut distribution, we prove that this bias can be reduced geometrically by increasing a user-chosen tuning parameter. In addition, parallel computing can easily be adopted for SACut, which greatly reduces computation time.
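
The cut construction itself can be sketched with a naive nested sampler: draws from the first module's posterior are made with feedback from the suspect module blocked, and the second module's parameters are then refreshed conditionally. The functions passed in below are hypothetical, and the scheme is illustrative only; SACut's contribution is precisely to replace this kind of inner kernel with an adaptive proposal that has provable convergence.

```python
def naive_cut_sampler(n_iter, draw_theta1, mh_kernel_theta2, theta2_init):
    """Naive nested sampler targeting a cut distribution
    p_cut(t1, t2) = p(t1 | Y1) * p(t2 | t1, Y2): theta1 is drawn without
    feedback from module 2, then theta2 is refreshed by a
    Metropolis-Hastings kernel conditioned on the new theta1."""
    t2 = theta2_init
    draws = []
    for _ in range(n_iter):
        t1 = draw_theta1()             # module-1 posterior draw (no feedback)
        t2 = mh_kernel_theta2(t1, t2)  # module-2 MH update given t1
        draws.append((t1, t2))
    return draws
```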


2021, Vol 32 (1)
Author(s): L. Mihaela Paun, Dirk Husmeier

Abstract: We propose to accelerate Hamiltonian and Lagrangian Monte Carlo algorithms by coupling them with Gaussian processes for emulation of the log unnormalised posterior distribution. We provide proofs of detailed balance with respect to the exact posterior distribution for these algorithms and validate the correctness of the samplers' implementation by Geweke consistency tests. We implement these algorithms in a delayed acceptance (DA) framework and investigate whether the DA scheme can offer computational gains over the standard algorithms. A comparative evaluation study is carried out to assess the performance of the methods on a series of models described by differential equations (DEs), including a real-world application of a 1D fluid-dynamics model of the pulmonary blood circulation. The aim is to identify the algorithm that gives the best trade-off between accuracy and computational efficiency for nonlinear DE models, which are computationally onerous due to the repeated numerical integrations required in a Bayesian analysis. Results showed no advantage of the DA scheme over the standard algorithms with respect to several efficiency measures based on the effective sample size, for most methods and DE models considered: these gradient-driven algorithms register a high acceptance rate, so the number of expensive forward-model evaluations is not significantly reduced by the first, emulator-based stage of DA. Additionally, Lagrangian Dynamical Monte Carlo and Riemann Manifold Hamiltonian Monte Carlo tended to register the highest efficiency (in terms of effective sample size normalised by the number of forward-model evaluations), followed by Hamiltonian Monte Carlo, while the No-U-Turn sampler tended to be the least efficient.
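
The delayed acceptance mechanism can be sketched generically: a cheap surrogate (here, `log_post_cheap`, a hypothetical stand-in for the GP emulator of the log unnormalised posterior) screens proposals first, and only survivors pay for an exact posterior evaluation, with a second-stage correction preserving detailed balance. A minimal sketch assuming a symmetric proposal:

```python
import numpy as np

rng = np.random.default_rng(3)

def da_mh_step(x, propose, log_post_cheap, log_post_exact):
    """One delayed-acceptance Metropolis-Hastings step with a symmetric
    proposal. Stage 1 screens with the cheap surrogate; stage 2 corrects
    with the exact, expensive posterior only for promising proposals."""
    y = propose(x)
    # Stage 1: cheap surrogate screening.
    a1 = min(1.0, np.exp(log_post_cheap(y) - log_post_cheap(x)))
    if rng.uniform() >= a1:
        return x  # rejected early: no expensive evaluation performed
    # Stage 2: exact correction that restores detailed balance.
    a2 = min(1.0, np.exp(
        (log_post_exact(y) - log_post_exact(x))
        - (log_post_cheap(y) - log_post_cheap(x))
    ))
    return y if rng.uniform() < a2 else x
```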


2021, Vol 32 (1)
Author(s): Luis A. García-Escudero, Agustín Mayo-Iscar, Marco Riani

Abstract: A new methodology for constrained parsimonious model-based clustering is introduced, in which a tuning parameter controls the strength of the constraints. The methodology includes, as limit cases, the 14 parsimonious models often applied in model-based clustering with normal components; it does so in a natural way, filling the gaps between these models and providing a smooth transition among them. The methodology yields mathematically well-defined problems and also helps prevent spurious solutions. Novel information criteria are proposed to help the user choose the tuning parameters. The interest of the proposed methodology is illustrated through simulation studies and a real-data application to COVID-19 data.
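
One common way to impose such constraints, sketched below, is an eigenvalue-ratio bound on the cluster scatter matrices, with the constant c acting as the kind of tuning parameter the abstract describes. This per-matrix clipping is a simplification for illustration; the paper's methodology instead calibrates the constraints jointly across clusters and models.

```python
import numpy as np

def constrain_scatter(cov, c):
    """Sketch of an eigenvalue-ratio constraint on one cluster scatter
    matrix: clip eigenvalues so that max/min <= c. Here c = 1 forces a
    spherical cluster, while large c leaves the matrix unconstrained."""
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, vals.max() / c, None)  # enforce the ratio bound
    return (vecs * vals) @ vecs.T               # reassemble the matrix
```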

