Locality-Preserving Partial Least Squares Regression

Author(s):  
Jing Wang ◽  
Jinglin Zhou ◽  
Xiaolu Chen

Abstract: This chapter proposes another nonlinear PLS method, named locality-preserving partial least squares (LPPLS), which embeds the nonlinear degenerative and structure-preserving properties of LPP into the PLS model. The core of LPPLS is to replace the role of PCA in PLS with LPP. When extracting the principal components $$\boldsymbol{t}_i$$ and $$\boldsymbol{u}_i$$, two conditions must be satisfied: (1) $$\boldsymbol{t}_i$$ and $$\boldsymbol{u}_i$$ retain the most information about the local nonlinear structure of their respective data sets; (2) the correlation between $$\boldsymbol{t}_i$$ and $$\boldsymbol{u}_i$$ is maximized. Finally, a quality-related monitoring strategy is established based on LPPLS.

2019 ◽  
Vol 5 (1) ◽  
pp. 10 ◽  
Author(s):  
Ahmed Rady ◽  
Daniel Guyer ◽  
William Kirk ◽  
Irwin R Donis-González

The sprouting of potato tubers during storage is a significant problem that hinders the production of high-quality seed or fried products. In this study, the potential of fusing data from visible (VIS)/near-infrared (NIR) spectroscopic and hyperspectral imaging systems was investigated to improve the prediction of primordial leaf count, a significant indicator of tuber sprouting. Electronic and laboratory measurements were conducted on whole tubers of the Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars. The interval partial least squares (IPLS) technique was adopted to extract the most effective wavelengths for both systems. Linear regression was performed using partial least squares regression (PLSR), and the best calibration model was chosen using four-fold cross-validation. The prediction models were then obtained using separate test data sets. Prediction results were enhanced compared with those obtained from the individual systems' models. The values of the correlation coefficient and, in parentheses, the ratio of performance to deviation, r(RPD), were 0.95(3.01) and 0.96(3.55) for FL1879 and R.Norkotah, respectively, an improvement of 6.7%(35.6%) and 24.7%(136.7%), respectively. The study shows the possibility of building a rapid, noninvasive, and accurate system or device that requires minimal or no sample preparation to track the sprouting activity of stored potato tubers.
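
The two figures of merit reported above can be computed from a held-out test set roughly as follows. This is a minimal sketch, assuming RPD is taken as the standard deviation of the reference values divided by the root mean squared error of prediction (definitions vary between papers). The function names and the sample numbers are illustrative and do not come from the study.

```python
import math
import statistics


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSEP (assumed definition)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmsep = math.sqrt(statistics.fmean(squared_errors))
    return statistics.stdev(observed) / rmsep


# Made-up reference and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r_value = pearson_r(observed, predicted)
rpd_value = rpd(observed, predicted)
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is often reported alongside r.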


1996 ◽  
Vol 26 (4) ◽  
pp. 590-600 ◽  
Author(s):  
Katherine L. Bolster ◽  
Mary E. Martin ◽  
John D. Aber

Further evaluation of near infrared reflectance spectroscopy as a method for the determination of nitrogen, lignin, and cellulose concentrations in dry, ground, temperate forest woody foliage is presented. A comparison is made between two regression methods, stepwise multiple linear regression and partial least squares regression. The partial least squares method showed consistently lower standard error of calibration and higher R² values with first and second difference equations. The first difference partial least squares regression equation resulted in standard errors of calibration of 0.106%, with an R² of 0.97 for nitrogen, 1.613% with an R² of 0.88 for lignin, and 2.103% with an R² of 0.89 for cellulose. The four most highly correlated wavelengths in the near infrared region, and the chemical bonds represented, are shown for each constituent and both regression methods. Generalizability of both methods for prediction of protein, lignin, and cellulose concentrations on independent data sets is discussed. Prediction accuracy for independent data sets and species from other sites was increased using partial least squares regression, but was poor for sample sets containing tissue types or laboratory-measured concentration ranges beyond those of the calibration set.


2019 ◽  
Author(s):  
Derek Beaton ◽  
Gilbert Saporta ◽  
Hervé Abdi

Abstract: Current large-scale studies of brain and behavior typically involve multiple populations and diverse types of data (e.g., genetics, brain structure, behavior, demographics, or “multi-omics” and “deep phenotyping”) measured on various scales of measurement. To analyze these heterogeneous data sets, we need simple but flexible methods able to integrate the inherent properties of these complex data sets. Here we introduce partial least squares-correspondence analysis-regression (PLS-CA-R), a method designed to address these constraints. PLS-CA-R generalizes PLS regression to most data types (e.g., continuous, ordinal, categorical, non-negative values). We also show that PLS-CA-R generalizes many “two-table” multivariate techniques and their respective algorithms, such as various PLS approaches, canonical correlation analysis, and redundancy analysis (a.k.a. reduced rank regression).


2002 ◽  
Vol 56 (7) ◽  
pp. 887-896 ◽  
Author(s):  
Henrik Öjelund ◽  
Henrik Madsen ◽  
Poul Thyregod

In this article a new calibration method called empirically weighted mean subset (EMS) is presented. The method is illustrated using spectral data. Using several near-infrared (NIR) benchmark data sets, EMS is compared to partial least-squares regression (PLS) and interval partial least-squares regression (iPLS). It is found that EMS improves on the prediction performance over PLS in terms of the mean squared errors and is more robust than iPLS. Furthermore, by investigating the estimated coefficient vector of EMS, knowledge about the important spectral regions can be gained. The EMS solution is obtained by calculating the weighted mean of all coefficient vectors for subsets of the same size. The weighting is proportional to $$SS_{\gamma}^{-\omega}$$, where $$SS_{\gamma}$$ is the residual sum of squares from a linear regression with subset $$\gamma$$ and $$\omega$$ is a weighting parameter estimated using cross-validation. This construction of the weighting implies that even if some coefficients will become numerically small, none will become exactly zero. An efficient algorithm has been implemented in MATLAB to calculate the EMS solution and the source code has been made available on the Internet.
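
The weighted-mean construction described above can be illustrated with a brute-force sketch. It is not the authors' efficient MATLAB algorithm: it simply enumerates every subset of a given size, fits ordinary least squares on each, and averages the zero-padded coefficient vectors with weights proportional to the residual sum of squares raised to the power of minus omega. It assumes centered data (no intercept), non-singular subsets, and non-zero residuals. The names `solve` and `ems_coefficients` are hypothetical, and omega is passed in directly rather than estimated by cross-validation.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def ems_coefficients(X, y, subset_size, omega):
    """Weighted mean of least-squares coefficient vectors over all subsets of one size."""
    n_vars = len(X[0])
    weighted_sum = [0.0] * n_vars
    weight_total = 0.0
    for subset in combinations(range(n_vars), subset_size):
        # Normal equations restricted to the variables in this subset.
        A = [[sum(row[i] * row[j] for row in X) for j in subset] for i in subset]
        b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in subset]
        beta = solve(A, b)
        # Residual sum of squares for this subset's fit.
        ss = sum(
            (yi - sum(bk * row[k] for bk, k in zip(beta, subset))) ** 2
            for row, yi in zip(X, y)
        )
        weight = ss ** (-omega)
        # Variables outside the subset contribute a coefficient of zero.
        for bk, k in zip(beta, subset):
            weighted_sum[k] += weight * bk
        weight_total += weight
    return [s / weight_total for s in weighted_sum]


# Tiny made-up data set: 3 samples, 2 predictors.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.5]
single_variable_mean = ems_coefficients(X, y, subset_size=1, omega=1.0)
full_model = ems_coefficients(X, y, subset_size=2, omega=1.0)
```

Because every variable appears in at least one subset and all weights are positive, each averaged coefficient is a mix of fitted values rather than a hard zero, which matches the abstract's remark that no coefficient becomes exactly zero. Enumerating all subsets grows combinatorially, so this form is only practical for very small problems.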


2006 ◽  
Vol 82 (4) ◽  
pp. 463-468 ◽  
Author(s):  
N.P.P. Macciotta ◽  
C. Dimauro ◽  
N. Bacciu ◽  
P. Fresi ◽  
A. Cappio-Borlino

Abstract: A model to predict missing test-day data for milk, fat and protein yields from a few recorded tests was proposed, based on the partial least squares (PLS) regression technique, a multivariate method that can handle high collinearity among predictors. A data set of 1731 lactations of Sarda breed dairy goats was split into two data sets, one for model estimation and the other for evaluating PLS prediction capability. Eight scenarios of simplified recording schemes for fat and protein yields were simulated. Correlations between predicted and observed test-day yields were quite high (from 0.50 to 0.88 and from 0.53 to 0.96 for fat and protein yields, respectively, across the scenarios). The results highlight the great flexibility and accuracy of this multivariate technique.


2012 ◽  
Vol 61 (2) ◽  
pp. 277-290 ◽  
Author(s):  
Ádám Csorba ◽  
Vince Láng ◽  
László Fenyvesi ◽  
Erika Michéli

Nowadays there is a growing demand for developing and applying technologies and methods that enable rapid, cost-effective, and environmentally friendly soil data collection and evaluation. Reflectance spectroscopy, which is based on reflectance measurements in the visible (VIS) and near-infrared (NIR) range (350–2500 nm) of the electromagnetic spectrum, meets these demands. Because the reflectance spectrum recorded from soils is very rich in information, and many soil constituents have a characteristic spectral “fingerprint” in the studied range, a single curve makes it possible to determine a large number of key soil parameters simultaneously. In this paper we present the first steps of a methodological development, built on the fundamentals of reflectance spectroscopy, that aims to determine soil composition. We built and tested predictive models based on multivariate mathematical-statistical methods (partial least squares regression, PLSR) that allow the organic carbon and CaCO3 content of soils to be estimated. Testing of the models showed that the procedure gave high R² values for both soil parameters [R²(organic carbon) = 0.815; R²(CaCO3) = 0.907]. The root mean squared error (RMSE), which indicates the accuracy of the estimate, can be considered moderate for both parameters [RMSE(organic carbon) = 0.467; RMSE(CaCO3) = 3.508], and it can be improved considerably by standardizing the reflectance measurement protocols. Based on our investigations, we concluded that combining reflectance spectroscopy with multivariate chemometric procedures yields a rapid and cost-effective method for data collection and evaluation.


2013 ◽  
Vol 38 (4) ◽  
pp. 465-470 ◽  
Author(s):  
Jingjie Yan ◽  
Xiaolan Wang ◽  
Weiyi Gu ◽  
LiLi Ma

Abstract: Speech emotion recognition is regarded as a meaningful but difficult problem across a number of domains, including sentiment analysis, computer science, and pedagogy. In this study, we investigate in depth speech emotion recognition based on the sparse partial least squares regression (SPLSR) approach. We use SPLSR to perform feature selection and dimensionality reduction on the full set of acquired speech emotion features. With SPLSR, the components of redundant and uninformative speech emotion features are shrunk to zero, while useful and informative features are retained and passed to the subsequent classification step. Tests on the Berlin database show that the recognition rate of the SPLSR method reaches up to 79.23% and is superior to that of the other dimensionality reduction methods compared.
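
The abstract does not say how feature components are driven to zero. One common mechanism in sparse PLS variants is soft-thresholding of the weight (direction) vector, sketched below as an assumption rather than as the authors' procedure. The function name `soft_threshold`, the parameter `eta`, and the example weights are all hypothetical.

```python
import math


def soft_threshold(weights, eta):
    """Shrink weights toward zero; entries at or below eta * max|w| become exactly 0.

    eta is assumed to lie in [0, 1): larger values zero out more features.
    """
    cutoff = eta * max(abs(w) for w in weights)
    shrunk = []
    for w in weights:
        magnitude = abs(w) - cutoff
        # Keep the sign of the original weight, reduce its size by the cutoff.
        shrunk.append(math.copysign(magnitude, w) if magnitude > 0 else 0.0)
    return shrunk


# Illustrative weights for four features; the two small ones are removed.
sparse_weights = soft_threshold([0.9, -0.1, 0.05, -0.6], eta=0.5)
selected = [i for i, w in enumerate(sparse_weights) if w != 0.0]
```

Features whose weight is exactly zero drop out of the latent components, so the same step performs both selection and dimensionality reduction before a classifier is applied.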

