Least Squares Model Averaging Based on Generalized Cross Validation

2021 ◽  
Vol 37 (3) ◽  
pp. 495-509
Author(s):  
Xin-min Li ◽  
Guo-hua Zou ◽  
Xin-yu Zhang ◽  
Shang-wei Zhao
Geophysics ◽  
2018 ◽  
Vol 83 (6) ◽  
pp. V345-V357 ◽  
Author(s):  
Nasser Kazemi

Given the noise-corrupted seismic recordings, blind deconvolution simultaneously solves for the reflectivity series and the wavelet. Blind deconvolution can be formulated as a fully perturbed linear regression model and solved by the total least-squares (TLS) algorithm. However, this algorithm performs poorly when the data matrix is a structured matrix and ill-conditioned. In blind deconvolution, the data matrix has a Toeplitz structure and is ill-conditioned. Accordingly, we develop a fully automatic single-channel blind-deconvolution algorithm to improve the performance of the TLS method. The proposed algorithm, called Toeplitz-structured sparse TLS, has no assumptions about the phase of the wavelet. However, it assumes that the reflectivity series is sparse. In addition, to reduce the model space and the number of unknowns, the algorithm benefits from the structural constraints on the data matrix. Our algorithm is an alternating minimization method and uses a generalized cross validation function to define the optimum regularization parameter automatically. Because the generalized cross validation function does not require any prior information about the noise level of the data, our approach is suitable for real-world applications. We validate the proposed technique using synthetic examples. In noise-free data, we achieve a near-optimal recovery of the wavelet and the reflectivity series. For noise-corrupted data with a moderate signal-to-noise ratio (S/N), we found that the algorithm successfully accounts for the noise in its model, resulting in a satisfactory performance. However, the results deteriorate as the S/N and the sparsity level of the data are decreased. We also successfully apply the algorithm to real data. The real-data examples come from 2D and 3D data sets of the Teapot Dome seismic survey.
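
As a rough illustration of how a generalized cross validation (GCV) function can choose a regularization parameter without knowing the noise level, the sketch below scores a grid of candidate values. It deliberately swaps in plain ridge (Tikhonov) regression for the Toeplitz-structured sparse TLS algorithm described above, so it shows only the parameter-selection idea, not the author's method. The names `gcv_score` and `select_lambda`, the synthetic data, and the grid are all assumptions made for illustration.

```python
import numpy as np

def gcv_score(X, y, lam):
    """GCV(lam) = n * ||(I - H) y||^2 / trace(I - H)^2 for ridge regression."""
    n, p = X.shape
    # Ridge hat matrix: H = X (X^T X + lam I)^{-1} X^T
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2

def select_lambda(X, y, grid):
    """Return the grid value with the smallest GCV score."""
    scores = [gcv_score(X, y, lam) for lam in grid]
    return grid[int(np.argmin(scores))]

# Synthetic example (hypothetical data): sparse coefficients plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
beta = np.zeros(20)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + 0.5 * rng.normal(size=60)

grid = np.logspace(-4, 3, 30)
best_lam = select_lambda(X, y, grid)
print("lambda selected by GCV:", best_lam)
```

The noise standard deviation (0.5 here) is never passed to `gcv_score`; the criterion uses only the residuals and the trace of the hat matrix, which is the property the abstract relies on.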


Author(s):  
Andrea Tri Rian Dani ◽  
Ludia Ni'matuzzahroh

The truncated spline estimator is one approach in nonparametric regression that can be used when the form of the relationship between the response variable and the predictor variables is not known with certainty. The truncated spline estimator is highly flexible in the modeling process. This study aims to model the percentage of poor people in the regencies and cities of West Java Province using a nonparametric regression model with the truncated spline estimator. The estimation method is ordinary least squares (OLS). The goodness-of-fit criterion for the nonparametric regression model is generalized cross-validation (GCV). Based on the analysis, the best truncated spline nonparametric regression model is the one with 3 knots, which attains the minimum GCV value of 2.14. Hypothesis tests, both simultaneous and partial, show that the predictor variables used in this study have a significant effect on the percentage of poor people, with a coefficient of determination of 95.33%.
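
The knot-selection step described above can be sketched as follows: build a truncated power basis, fit it by OLS, and compare candidate knot sets by their GCV values. This is a minimal sketch under assumptions, not the authors' code; the function names `truncated_basis` and `gcv_ols`, the single-predictor setting, and the synthetic data are hypothetical, and the poverty data from the study are not reproduced.

```python
import numpy as np

def truncated_basis(x, knots, degree=1):
    """Columns 1, x, ..., x^degree, then (x - k)_+^degree for each knot k."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

def gcv_ols(x, y, knots, degree=1):
    """GCV = MSE / (1 - df/n)^2, where df = trace of the OLS hat matrix."""
    B = truncated_basis(x, knots, degree)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    n = len(y)
    df = np.linalg.matrix_rank(B)  # for OLS, trace(H) equals rank(B)
    mse = float(resid @ resid) / n
    return mse / (1.0 - df / n) ** 2

# Hypothetical data: a kinked curve with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.abs(x - 0.5) + 0.02 * rng.normal(size=x.size)

candidates = [[], [0.5], [0.25, 0.5, 0.75]]
scores = [gcv_ols(x, y, k) for k in candidates]
best_knots = candidates[int(np.argmin(scores))]
print("knot set with minimum GCV:", best_knots)
```

In a multi-predictor study such as the one summarized here, the basis would be built per predictor and the search would run over combinations of knot locations, but the selection criterion has the same form.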


Agronomy ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 666
Author(s):  
Rafael Font ◽  
Mercedes del Río-Celestino ◽  
Diego Luna ◽  
Juan Gil ◽  
Antonio de Haro-Bailón

Near-infrared spectroscopy (NIRS) combined with modified partial least squares (modified PLS) regression was used to determine the neutral detergent fiber (NDF) and acid detergent fiber (ADF) fractions of the chickpea (Cicer arietinum L.) seed. Fifty chickpea accessions (24 desi and 26 kabuli types) and fifty F5:6 recombinant inbred lines derived from a kabuli × desi cross were evaluated for NDF and ADF and scanned by NIRS. NDF and ADF values were regressed against different spectral transformations by modified partial least squares regression. The coefficient of determination in cross-validation and the ratio of the standard deviation to the standard error of cross-validation were 0.91 and 3.37 for NDF, and 0.98 and 6.73 for ADF, showing the high potential of NIRS to assess these components in chickpea for screening (NDF) or quality control (ADF) purposes. The spectral information provided by the different chromophores in the chickpea seed correlated highly with the NDF and ADF composition of the seed, and thus those electronic transitions strongly influence model fitting for fiber.


1992 ◽  
Vol 14 (4) ◽  
pp. 283-287 ◽  
Author(s):  
Chong Gu ◽  
Nancy Heckman ◽  
Grace Wahba

Agriculture ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1129
Author(s):  
Yiping Peng ◽  
Lu Wang ◽  
Li Zhao ◽  
Zhenhua Liu ◽  
Chenjie Lin ◽  
...  

Soil nutrients play a vital role in plant growth, and thus the rapid acquisition of soil nutrient content is of great significance for sustainable agricultural development. Hyperspectral remote-sensing techniques allow for the quick monitoring of soil nutrients. However, obtaining accurate estimates is currently difficult because of the weak spectral features of soil nutrients and the low accuracy of soil nutrient estimation models. This study proposed a new method to improve soil nutrient estimation. Firstly, to obtain characteristic variables, we used the partial least squares regression (PLSR) fit degree to select an optimal screening algorithm from three candidates (Pearson correlation coefficient, PCC; least absolute shrinkage and selection operator, LASSO; and gradient boosting decision tree, GBDT). Secondly, linear (multi-linear regression, MLR; ridge regression, RR) and nonlinear (support vector machine, SVM; and back propagation neural network with genetic algorithm optimization, GABP) algorithms with 10-fold cross-validation were implemented to determine the most accurate model for estimating soil total nitrogen (TN), total phosphorus (TP), and total potassium (TK) contents. Finally, the new method was used to map the soil TK content at a regional scale using the soil component spectral variables retrieved by the fully constrained least squares (FCLS) method from a HuanJing-1A Hyperspectral Imager (HJ-1A HSI) image of the Conghua District of Guangzhou, China. The results identified GBDT-GABP as the most accurate estimation method for soil TN (R2 of 0.69, root mean square error of cross-validation (RMSECV) of 0.35 g kg−1, and ratio of performance to interquartile range (RPIQ) of 2.03) and TP (R2 of 0.73, RMSECV of 0.30 g kg−1, and RPIQ of 2.10), while LASSO-GABP proved optimal for soil TK (R2 of 0.82, RMSECV of 3.39 g kg−1, and RPIQ of 3.57). Additionally, the highly accurate LASSO-GABP-estimated soil TK (R2 = 0.79) shows that the LASSO-GABP method is feasible for retrieving soil TK content at the regional scale.
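
To make the reported metrics concrete, the sketch below runs 10-fold cross-validation and computes R2, RMSECV, and RPIQ from the out-of-fold predictions. It uses ridge regression without an intercept on synthetic data as a simple stand-in; it does not implement GBDT, LASSO screening, or GABP. RPIQ is taken here as the interquartile range of the observed values divided by RMSECV, which is the commonly used definition and is assumed rather than confirmed by the abstract. The names `kfold_predictions` and `cv_metrics` are hypothetical.

```python
import numpy as np

def kfold_predictions(X, y, k=10, lam=1.0, seed=0):
    """Out-of-fold ridge predictions from k-fold cross-validation."""
    n, p = X.shape
    idx = np.random.default_rng(seed).permutation(n)
    pred = np.empty(n)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # every sample not in this fold
        A = X[train].T @ X[train] + lam * np.eye(p)
        coef = np.linalg.solve(A, X[train].T @ y[train])
        pred[fold] = X[fold] @ coef
    return pred

def cv_metrics(y, pred):
    """R2, RMSECV, and RPIQ (= IQR of y / RMSECV) from CV predictions."""
    sse = float(np.sum((y - pred) ** 2))
    rmsecv = float(np.sqrt(sse / len(y)))
    r2 = 1.0 - sse / float(np.sum((y - y.mean()) ** 2))
    q1, q3 = np.percentile(y, [25, 75])
    return {"R2": r2, "RMSECV": rmsecv, "RPIQ": float(q3 - q1) / rmsecv}

# Hypothetical data: five zero-mean "spectral variables" and a linear response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=100)

metrics = cv_metrics(y, kfold_predictions(X, y, k=10))
print(metrics)
```

Because every prediction comes from a model that never saw that sample, RMSECV and RPIQ computed this way reflect out-of-sample error rather than the calibration fit.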


Author(s):  
Wahyu Kurniasari, Dadan Kusnandar, Evy Sulistianingsih

Spline regression is an approach to fitting data that still takes the smoothness of the curve into account. One form of spline regression estimator is the penalized spline. The aim of this study is to estimate spline regression parameters with the penalized spline method for data that do not follow a particular pattern. The study uses secondary data obtained from Statistics Indonesia (Badan Pusat Statistik) in 2015, namely the human development index, the Gini ratio, expected years of schooling, the number of poor people, and population density. The best spline regression model obtained is the linear spline model for each variable, with the minimum generalized cross validation (GCV) value. The results show that spline regression with the penalized spline method yields significant parameter estimates, with an adjusted coefficient of determination of 76.66% and a MAPE of 1.415% for the spline regression model. Keywords: nonparametric regression, spline regression, penalized spline.
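
A linear penalized spline with GCV-based smoothing can be sketched as below. The penalty shrinks only the knot coefficients, and the smoothing parameter is chosen from a grid by minimum GCV. This is an illustrative sketch for a single predictor on synthetic data; the function name `pspline_fit`, the knot placement, and the grid are assumptions, and the study's actual variables and settings are not reproduced.

```python
import numpy as np

def pspline_fit(x, y, knots, lam):
    """Linear penalized spline fit; returns fitted values and the GCV score."""
    # Basis: intercept, x, and one truncated line (x - k)_+ per knot.
    B = np.column_stack(
        [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    )
    # Penalize only the knot coefficients, not the intercept or slope.
    D = np.diag([0.0, 0.0] + [1.0] * len(knots))
    H = B @ np.linalg.solve(B.T @ B + lam * D, B.T)
    fitted = H @ y
    n = len(y)
    gcv = n * float(np.sum((y - fitted) ** 2)) / (n - np.trace(H)) ** 2
    return fitted, gcv

# Hypothetical data with no simple parametric pattern.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=x.size)

knots = np.linspace(0.1, 0.9, 9)
lam_grid = np.logspace(-6, 2, 25)
results = [pspline_fit(x, y, knots, lam) for lam in lam_grid]
best = int(np.argmin([gcv for _, gcv in results]))
best_lam = lam_grid[best]
fitted, best_gcv = results[best]
print("smoothing parameter selected by GCV:", best_lam)
```

Small values of `lam` let the knot coefficients move freely and give a wiggly fit, while large values pull the fit toward a straight line; GCV trades the two off through the residual sum of squares and the effective degrees of freedom `trace(H)`.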


Author(s):  
Manabu Kimura ◽  
Masashi Sugiyama

Recently, statistical dependence measures such as mutual information and kernelized covariance have been successfully applied to clustering. In this paper, we follow this line of research and propose a novel dependence-maximization clustering method based on least-squares mutual information, which is an estimator of a squared-loss variant of mutual information. A notable advantage of the proposed method over existing approaches is that hyperparameters such as kernel parameters and regularization parameters can be objectively optimized based on cross-validation. Thus, subjective manual-tuning of hyperparameters is not necessary in the proposed method, which is a highly useful property in unsupervised clustering scenarios. Through experiments, we illustrate the usefulness of the proposed approach.

