Principle of Duality in Cubic Smoothing Spline

Mathematics, 2020, Vol 8 (10), pp. 1839
Author(s): Ruixue Du, Hiroshi Yamada

Fitting a cubic smoothing spline is a typical smoothing method. This paper reveals a principle of duality in the penalized least squares regressions underlying the method. We also provide a number of results derived from this duality, some of which are illustrated by a real data example.
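For illustration (not the authors' code), here is a minimal Python sketch of the penalized least squares form of smoothing. It uses a discrete second-difference penalty, which approximates the cubic smoothing spline's integrated squared second derivative on an equally spaced grid; the function name and smoothing parameter value are illustrative.

```python
import numpy as np

def penalized_ls_smoother(y, lam):
    """Discrete penalized least squares smoother (Whittaker-style).

    Minimizes ||y - f||^2 + lam * ||D2 f||^2, where D2 is the
    second-difference operator -- a discrete analogue of the cubic
    smoothing spline objective on an equally spaced grid.
    """
    n = len(y)
    D2 = np.diff(np.eye(n), 2, axis=0)  # (n-2) x n second-difference matrix
    # Normal equations: (I + lam * D2'D2) f = y
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)

# Example: smooth a noisy sine curve
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + 0.3 * np.random.default_rng(0).standard_normal(100)
f_hat = penalized_ls_smoother(y, lam=50.0)
```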

2020, Vol 2020, pp. 1-10
Author(s): Ni Putu Ayu Mirah Mariati, I. Nyoman Budiantara, Vita Ratnasari

So far, most researchers have developed a single type of estimator in nonparametric regression. In practice, however, data with mixed patterns are often encountered, especially patterns that change over certain subintervals while others follow a recurring trend. For such data we use a mixed estimator combining a smoothing spline and a Fourier series. The regression model is approached through a smoothing spline component and a Fourier series component, and the mixed estimator is obtained in two estimation stages: the first stage uses penalized least squares (PLS), and the second uses least squares (LS). The estimators were then applied to simulated data, generated from two different functions, a polynomial and a trigonometric function, with a sample size of 100; the whole process was repeated 50 times. The two functions were modeled using a mixture of the smoothing spline and Fourier series estimators with various smoothing and oscillation parameters, and the model minimizing the generalized cross validation (GCV) criterion was selected as the best. The simulation results showed that the mixed estimator gave a minimum GCV value of 11.98, with a corresponding mean square error (MSE) of 0.71 and an R² of 99.48%. These results indicate that the mixed smoothing spline and Fourier series estimator fits the data well.
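To make the GCV selection step concrete, here is a minimal, generic sketch (not the authors' code) for a linear smoother with hat matrix H, reusing the second-difference penalized fit sketched above; the candidate grid of smoothing parameters is illustrative.

```python
import numpy as np

def gcv_score(y, lam):
    """GCV(lam) = n * RSS / (n - tr(H))^2 for the linear smoother f = H y."""
    n = len(y)
    D2 = np.diff(np.eye(n), 2, axis=0)
    H = np.linalg.inv(np.eye(n) + lam * D2.T @ D2)  # hat matrix
    resid = y - H @ y
    return n * np.sum(resid ** 2) / (n - np.trace(H)) ** 2

# Select the smoothing parameter minimizing GCV over a candidate grid
rng = np.random.default_rng(1)
y = np.sin(np.linspace(0, 4, 100)) + 0.2 * rng.standard_normal(100)
lams = np.logspace(-2, 4, 30)
best_lam = min(lams, key=lambda lam: gcv_score(y, lam))
```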


Author(s): Parisa Torkaman

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. In this paper, the estimation of the probability density function and the cumulative distribution function of this distribution is considered using five different estimation methods: the uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS), and percentile (PC) estimators. The performance of these estimation procedures is compared by numerical simulations based on the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others, and that when the sample size is large enough the ML and UMVU estimators are almost equivalent and more efficient than the LS, WLS, and PC estimators. Finally, results using a real data set are analyzed.
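For reference, the generalized inverted exponential distribution with shape α and scale λ has cdf F(x) = 1 − (1 − e^(−λ/x))^α for x > 0. Below is a minimal sketch of the ML step via numerical optimization, assuming this parameterization; the starting values and optimizer choice are illustrative, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def gie_logpdf(x, alpha, lam):
    """Log pdf of the generalized inverted exponential distribution:
    f(x) = (alpha * lam / x^2) * exp(-lam/x) * (1 - exp(-lam/x))^(alpha - 1)
    """
    e = np.exp(-lam / x)
    return (np.log(alpha) + np.log(lam) - 2 * np.log(x)
            - lam / x + (alpha - 1) * np.log1p(-e))

def fit_gie_ml(x):
    """Maximize the log-likelihood over (alpha, lam), both kept positive
    via an exp reparameterization."""
    nll = lambda p: -np.sum(gie_logpdf(x, np.exp(p[0]), np.exp(p[1])))
    res = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
    return np.exp(res.x)  # (alpha_hat, lam_hat)
```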


2021, Vol 5 (1), pp. 59
Author(s): Gaël Kermarrec, Niklas Schild, Jan Hartmann

Terrestrial laser scanners (TLS) capture a large number of 3D points rapidly, with high precision and spatial resolution. These scanners are used for applications as diverse as modeling architectural or engineering structures and high-resolution mapping of terrain. The noise of the observations cannot be assumed to be strictly white: besides being heteroscedastic, correlations between observations are likely to appear due to the high scanning rate. Unfortunately, while the variance can sometimes be modeled based on physical or empirical considerations, the correlations are more often neglected. Trustworthy knowledge of the noise structure is, however, mandatory to avoid overestimating the precision of the point cloud and, potentially, failing to detect deformation between scans recorded at different epochs using statistical testing strategies. TLS point clouds can be approximated with parametric surfaces, such as planes, using the Gauss–Helmert model, or with the newly introduced T-spline surfaces. In both cases, the goal is to minimize the squared distance between the observations and the approximating surface in order to estimate parameters such as normal vectors or control points. In this contribution, we show how the residuals of the surface approximation can be used to derive the correlation structure of the noise of the observations. We estimate the correlation parameters using the Whittle maximum likelihood and use simulations and real data to validate our methodology. Using the least-squares adjustment as a "filter of the geometry" paves the way for the determination of a correlation model for many sensors recording 3D point clouds.
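A minimal, generic sketch of the Whittle step (not the authors' implementation): assuming an AR(1) correlation model for the residuals, the Whittle likelihood matches their periodogram to the model spectral density. The innovation variance is held fixed for brevity; a full fit would profile it out.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def whittle_negloglik(phi, resid, sigma2=1.0):
    """Whittle approximation: sum over Fourier frequencies of
    log f(w_k; phi) + I(w_k) / f(w_k; phi), with an AR(1) spectrum
    f(w) = sigma2 / (2*pi * |1 - phi * exp(-i*w)|^2).
    sigma2 is held fixed here for simplicity.
    """
    n = len(resid)
    k = np.arange(1, (n - 1) // 2 + 1)                        # positive frequencies
    w = 2 * np.pi * k / n
    I = np.abs(np.fft.fft(resid)[k]) ** 2 / (2 * np.pi * n)   # periodogram
    f = sigma2 / (2 * np.pi * np.abs(1 - phi * np.exp(-1j * w)) ** 2)
    return np.sum(np.log(f) + I / f)

def fit_ar1_whittle(resid):
    """Estimate the AR(1) correlation parameter of the residuals."""
    res = minimize_scalar(whittle_negloglik, args=(resid,),
                          bounds=(-0.99, 0.99), method="bounded")
    return res.x
```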


2021, Vol 22 (1)
Author(s): Camilo Broc, Therese Truong, Benoit Liquet

Background The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) that takes into account groups of variables, with a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at the variable level and at the group level. Results Our method has the advantage of proposing a globally readable model while coping with the architecture of the data. It can outperform traditional methods and provides broader insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and give an example of application on real data with the aim of highlighting common susceptibility variants to breast and thyroid cancers. Conclusion The joint-sgPLS shows interesting properties for detecting a signal. As an extension of the PLS, the method is suited for data with a large number of variables. The choice of Lasso penalization copes with architectures involving groups of variables and sets of observations. Furthermore, although the method has been applied to a genetic study, its formulation is adapted to any data with a large number of variables and a known a priori architecture in other application fields.
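Sparse group penalties of this kind are typically applied through a group soft-thresholding (proximal) operator. Here is a generic sketch of that operator, not the joint-sgPLS code itself; the penalty parameters are illustrative.

```python
import numpy as np

def group_soft_threshold(w, groups, lam_group, lam_lasso):
    """Sparse group penalization of a weight vector w.

    Applies elementwise (lasso) soft-thresholding, then shrinks each
    group toward zero, zeroing out whole groups with small joint norm.
    `groups` is an array mapping each coefficient index to a group label.
    """
    # Elementwise lasso step
    w = np.sign(w) * np.maximum(np.abs(w) - lam_lasso, 0.0)
    # Groupwise shrinkage step
    out = np.zeros_like(w)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(w[idx])
        if norm > 0:
            out[idx] = np.maximum(1 - lam_group / norm, 0.0) * w[idx]
    return out
```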


Geophysics, 2006, Vol 71 (5), pp. U67-U76
Author(s): Robert J. Ferguson

The possibility of improving regularization/datuming of seismic data is investigated by treating wavefield extrapolation as an inversion problem. Weighted, damped least squares is then used to produce the regularized/datumed wavefield. Regularization/datuming is extremely costly because it requires computing the Hessian, so an efficient approximation is introduced, achieved by computing only a limited number of diagonals in the operators involved. Real and synthetic data examples demonstrate the utility of this approach. For synthetic data, regularization/datuming is demonstrated for large extrapolation distances using a highly irregular recording array. Without approximation, regularization/datuming returns a regularized wavefield with reduced operator artifacts compared to a nonregularizing method such as generalized phase shift plus interpolation (PSPI). Approximate regularization/datuming returns a regularized wavefield at approximately two orders of magnitude less cost, but it is dip limited, though in a controllable way, compared to the full method. The Foothills structural data set, a freely available data set from the Rocky Mountains of Canada, demonstrates application to real data. The data have highly irregular sampling along the shot coordinate and suffer from significant near-surface effects. Approximate regularization/datuming returns common-receiver data that are superior in appearance to those from conventional datuming.
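The weighted, damped least squares solve at the core of this formulation has the closed form m = (AᵀWA + εI)⁻¹AᵀWd. A minimal sketch follows, where the operator A, data d, and weight matrix W are placeholders for the wavefield-extrapolation quantities, not the paper's actual operators.

```python
import numpy as np

def damped_weighted_ls(A, d, W, eps):
    """Weighted, damped least squares:
    minimizes (d - A m)^H W (d - A m) + eps * ||m||^2,
    i.e. m = (A^H W A + eps I)^{-1} A^H W d.
    A: forward (extrapolation) operator, d: recorded data,
    W: data-weighting matrix, eps: damping parameter.
    """
    n = A.shape[1]
    return np.linalg.solve(A.conj().T @ W @ A + eps * np.eye(n),
                           A.conj().T @ W @ d)
```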


Author(s): Hervé Cardot, Pascal Sarda

This article presents a selected bibliography on functional linear regression (FLR) and highlights the key contributions from both applied and theoretical points of view. It first defines FLR in the case of a scalar response and shows how the model can also be extended to the case of a functional response. It then considers two kinds of estimation procedures for the slope parameter: projection-based estimators, in which regularization is performed through dimension reduction, such as functional principal component regression, and penalized least squares estimators, obtained by solving a penalized least squares minimization problem. The article proceeds by discussing the main asymptotic properties, separating results on mean square prediction error from results on L2 estimation error. It also describes some related models, including generalized functional linear models and FLR on quantiles, and concludes with a complementary bibliography and some open problems.
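As an illustration of the projection-based approach, here is a minimal sketch of functional principal component regression on curves discretized over a common grid; quadrature weights and intercept handling are simplified, and all names are illustrative.

```python
import numpy as np

def fpcr(X, y, n_components):
    """Functional principal component regression (a sketch).

    X : (n, p) matrix of n curves discretized on a common grid of p points.
    Regresses y on the leading principal component scores of X and
    returns the estimated slope function evaluated on the grid
    (quadrature weights omitted for brevity).
    """
    Xc = X - X.mean(axis=0)
    # Principal directions from the SVD of the centered curves
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T                       # PC scores
    coef = np.linalg.lstsq(scores, y - y.mean(), rcond=None)[0]
    beta_hat = Vt[:n_components].T @ coef                   # slope on the grid
    return beta_hat
```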


Geophysics, 2018, Vol 83 (6), pp. V345-V357
Author(s): Nasser Kazemi

Given noise-corrupted seismic recordings, blind deconvolution simultaneously solves for the reflectivity series and the wavelet. Blind deconvolution can be formulated as a fully perturbed linear regression model and solved by the total least-squares (TLS) algorithm. However, this algorithm performs poorly when the data matrix is structured and ill-conditioned; in blind deconvolution, the data matrix has a Toeplitz structure and is ill-conditioned. Accordingly, we develop a fully automatic single-channel blind-deconvolution algorithm to improve the performance of the TLS method. The proposed algorithm, called Toeplitz-structured sparse TLS, makes no assumptions about the phase of the wavelet but assumes that the reflectivity series is sparse. In addition, to reduce the model space and the number of unknowns, the algorithm benefits from the structural constraints on the data matrix. Our algorithm is an alternating minimization method and uses a generalized cross validation function to define the optimum regularization parameter automatically. Because the generalized cross validation function does not require any prior information about the noise level of the data, our approach is suitable for real-world applications. We validate the proposed technique using synthetic examples. In noise-free data, we achieve a near-optimal recovery of the wavelet and the reflectivity series. For noise-corrupted data with a moderate signal-to-noise ratio (S/N), we find that the algorithm successfully accounts for the noise in its model, resulting in satisfactory performance. However, the results deteriorate as the S/N and the sparsity level of the data decrease. We also successfully apply the algorithm to real data from 2D and 3D data sets of the Teapot Dome seismic survey.
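A simplified sketch of the alternating-minimization skeleton follows. This is not the paper's Toeplitz-structured sparse TLS: a ridge penalty stands in for the sparsity prior, the regularization parameter is fixed rather than chosen by GCV, and no TLS-style correction of the data matrix is applied.

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_matrix(w, n):
    """Toeplitz matrix M such that M @ r == np.convolve(w, r) for len(r) == n."""
    col = np.r_[w, np.zeros(n - 1)]
    row = np.r_[w[0], np.zeros(n - 1)]
    return toeplitz(col, row)

def blind_deconv_als(d, wavelet_len, n_iter=20, lam=0.1):
    """Alternate between solving for the reflectivity (regularized least
    squares) and the wavelet (plain least squares), exploiting the
    Toeplitz structure of the convolution."""
    n = len(d) - wavelet_len + 1
    w = np.zeros(wavelet_len)
    w[0] = 1.0                                   # spike initialization
    for _ in range(n_iter):
        # Reflectivity step (ridge proxy for the sparsity penalty)
        W = conv_matrix(w, n)
        r = np.linalg.solve(W.T @ W + lam * np.eye(n), W.T @ d)
        # Wavelet step with reflectivity fixed
        R = conv_matrix(r, wavelet_len)
        w = np.linalg.lstsq(R, d, rcond=None)[0]
        nrm = np.linalg.norm(w)
        if nrm > 0:
            w /= nrm                             # fix the scale ambiguity
    return r, w
```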


Author(s): Jianqing Fan, Runze Li, Cun-Hui Zhang, Hui Zou

T-Comm, 2020, Vol 14 (12), pp. 18-25
Author(s): Alina A. Sherstneva

The article considers the least squares approach for solving problems in queuing systems theory. It shows the possibility of predicting the behavior of an infocommunication system and proposes choosing an optimal model of its functioning. Statistical data were formed from monitoring-system metrics. The article proposes forecasting data trends and estimating the parameters of random processes over time. To obtain functioning data for infocommunication systems that are as close as possible to real values, polynomial and sine models are considered, and regression analysis is proposed to determine model parameter values from a set of observational data. In the theoretical part, linear and nonlinear least squares methods are applied, including fitting a circle. The task of the experimental analysis is to estimate the parameters of the sine and polynomial models and the center of the circle. The experimental analysis was performed in the mathematical modeling program Matlab. A uniformly distributed random sequence and a normally distributed random sequence are generated, and sequences of experimental data for the polynomial and sine models, respectively, are calculated. The correspondence of each model to the generated data is shown in graphical form, and the estimated parameters are summarized in tables. The polynomial order is estimated, and the estimated dispersion curve and calculated variance values of the polynomial model are presented. Trend forecasting for the measurement data is carried out; the estimated values are extremely close to the real data, and the results are shown in graphs. Finally, an approximate circle model of the measurement data is presented in graphical form: starting from a center estimated as the arithmetic mean, a few iterations yield a new circle center, and quite close values for the center and radius of the circle are obtained.
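A minimal sketch of the circle-fitting step: the Kåsa linearization turns circle fitting into a linear least squares problem. This is a standard generic method, not necessarily the article's exact iterative procedure.

```python
import numpy as np

def fit_circle_kasa(x, y):
    """Least squares circle fit (Kasa linearization).

    Writing x^2 + y^2 = 2*a*x + 2*b*y + c turns the circle fit into a
    linear least squares problem for (a, b, c); the center is (a, b)
    and the radius is sqrt(c + a^2 + b^2).
    """
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x ** 2 + y ** 2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return (a, b), np.sqrt(c + a ** 2 + b ** 2)

# Example: noisy points on a circle of radius 2 centered at (1, -1)
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
x = 1 + 2 * np.cos(t) + 0.05 * rng.standard_normal(200)
y = -1 + 2 * np.sin(t) + 0.05 * rng.standard_normal(200)
center, radius = fit_circle_kasa(x, y)
```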

