Highly-overlapped, recursive partial least squares soft sensor with state partitioning via local variable selection

2018 ◽  
Vol 175 ◽  
pp. 104-115 ◽  
Author(s):  
Dominic V. Poerio ◽  
Steven D. Brown
2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Lianqing Zhu ◽  
Haitao Chang ◽  
Qun Zhou ◽  
Zhongyu Wang

To improve the classification accuracy of Chinese Salvia miltiorrhiza using near-infrared spectroscopy, a novel local variable selection strategy is proposed. Combining the strengths of the local algorithm and interval partial least squares, the spectral data are first divided into several pairs of classes in the sample direction and into equidistant subintervals in the variable direction. Then, for each pair of classes, a local classification model is built and the most suitable spectral region is selected using a new evaluation criterion that considers both the classification error rate and the best predictive ability under leave-one-out cross-validation. Finally, each observation is assigned to a class according to a statistical analysis of the classification results of the local classification models built on the selected variables. The performance of the proposed method was demonstrated on near-infrared spectra of cultivated or wild Salvia miltiorrhiza collected from 8 geographical origins in 5 provinces of China. For comparison, soft independent modelling of class analogy and partial least squares discriminant analysis were each employed as the classification model. Experimental results showed that the classification model with local variable selection performed clearly better than the model without variable selection.
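The interval-selection step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid classifier stands in for the local classification model, the selection criterion is the leave-one-out error rate alone (the abstract's criterion also considers predictive ability), and the function names `loo_error_nearest_centroid` and `select_interval_for_pair` are hypothetical.

```python
import numpy as np


def loo_error_nearest_centroid(X, y):
    """Leave-one-out error rate of a nearest-centroid classifier.

    Assumption: nearest-centroid is only a simple stand-in for the
    local classification model used in the paper.
    """
    n = len(y)
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i          # hold out sample i
        X_train, y_train = X[keep], y[keep]
        labels = np.unique(y_train)
        centroids = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
        distances = np.linalg.norm(centroids - X[i], axis=1)
        if labels[np.argmin(distances)] != y[i]:
            errors += 1
    return errors / n


def select_interval_for_pair(X, y, class_a, class_b, n_intervals):
    """Pick the equidistant spectral interval that best separates one class pair.

    Returns (interval index, leave-one-out error rate, column indices).
    """
    rows = (y == class_a) | (y == class_b)   # sample direction: one pair of classes
    X_pair, y_pair = X[rows], y[rows]
    # Variable direction: split the columns into (near-)equal contiguous blocks.
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    errors = [loo_error_nearest_centroid(X_pair[:, cols], y_pair) for cols in intervals]
    best = int(np.argmin(errors))
    return best, errors[best], intervals[best]
```

Calling `select_interval_for_pair` once per pair of classes gives one selected region per pair; a new observation could then be classified by each pairwise model on its own region and the results combined, for example by majority vote, which is one possible reading of the "statistical analysis" step.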


2002 ◽  
Vol 56 (3) ◽  
pp. 337-345 ◽  
Author(s):  
S. Kamaledin Setarehdan ◽  
John J. Soraghan ◽  
David Littlejohn ◽  
Daran A. Sadler

Circulation ◽  
2021 ◽  
Vol 143 (Suppl_1) ◽  
Author(s):  
Natalie Gasca ◽  
Robyn McClelland

Most nutritional epidemiology studies investigating trends between diet and heart disease use outcome-independent dimension reduction methods, like principal component analysis, to create dietary patterns. While these methods construct patterns that describe important aspects of food consumption, these patterns are not inherently related to heart disease. Incorporating disease data into the pattern construction offers the possibility of more concisely summarizing the most disease-related foods. Sparse partial least squares (SPLS), one such method, was found to have favorable interpretation and prediction properties in the continuous outcome setting; while selecting a subset of relevant foods, it constructed a few dietary patterns that were correlated with BMI while also capturing variation in diet composition. These results were validated with simulated data. We propose incorporating SPLS into the Cox proportional hazards model to analyze a right-censored survival outcome. We hypothesized that this method would inherit the beneficial parsimony properties seen in the continuous setting, and we assessed whether this proposed method could use the most relevant covariates to create a few patterns that were associated with a survival outcome. While the proposed method targets covariate-level sparsity (i.e. variable selection), one competitor method exists that integrates pattern-level parsimony and partial least squares (PLS) in the Cox model, but it imposes more model parameters than the proposed method. We compared the variable selection, pattern selection, and predictive performance of four survival methods (Lasso, PLS, competitor sparse PLS, and proposed SPLS) via a simulation study. 
Simulation settings were informed in part by the Multi-Ethnic Study of Atherosclerosis (MESA), which has detailed food frequency questionnaire data on a large multi-ethnic population-based sample (6814 participants aged 45-84), as well as subsequent cardiovascular disease follow-up for over 15 years. In most studied simulation settings, the proposed method selected all 9 relevant predictors and the fewest number of irrelevant predictors (of 15) while creating a similar number of patterns and maintaining predictive ability of the outcome. In the setting most comparable to MESA, PLS chose all 24 predictors (by default) and 3.4 patterns (C-statistic=0.90), the competitor SPLS selected 21.1 predictors and 4.4 patterns (C-statistic=0.91), Lasso chose 16.4 predictors (C-statistic=0.91), and the proposed SPLS selected 11.7 predictors and 4.3 patterns (C-statistic=0.91), on average. We will also present an analysis of a coronary event in MESA using these four survival methods. In conclusion, we propose that using methods like SPLS to summarize food intake can create more heart disease-tailored dietary patterns that can complement the current nutritional epidemiology literature.
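The C-statistics quoted above measure how well a model's risk scores order subjects by survival time. The abstract does not say which estimator was used; the sketch below assumes Harrell's concordance index for right-censored data, and the function name `concordance_index` and its arguments are illustrative only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (assumed definition).

    times:       follow-up time per subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0      # earlier event, higher risk: concordant
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5      # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. This simple version ignores pairs with tied times.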


2014 ◽  
Vol 70 (5) ◽  
Author(s):  
Nor Fazila Rasaruddin ◽  
Mas Ezatul Nadia Mohd Ruah ◽  
Mohamed Noor Hasan ◽  
Mohd Zuli Jaafar

This paper reports the determination of the iodine value (IV) of pure and frying palm oils using Partial Least Squares (PLS) regression with variable selection. A total of 28 samples of pure and frying palm oils were acquired from markets. Seven of them were considered high-priced palm oils, while the remainder were low-priced. PLS regression models were developed for the determination of IV using Fourier Transform Infrared (FTIR) spectra in absorbance mode in the range from 650 cm⁻¹ to 4000 cm⁻¹. A Savitzky-Golay derivative was applied before developing the prediction models. The models were constructed using wavelengths selected in the FTIR region by means of the selectivity ratio (SR) plot and the correlation coefficient with the IV parameter. Each model was validated using the root mean square error of cross-validation (RMSECV) and the cross-validation correlation coefficient (R²cv). With the SR plot, the best model used mean centring for the pure samples and a combination of row scaling and standardization for the frying samples. With correlation-coefficient variable selection, the best model used a combination of row scaling and standardization for the pure samples and mean centring for the frying samples. It is not necessary to apply row scaling when developing the model, since the effect of row scaling on model quality is insignificant.
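As an illustration of the validation metrics named in the abstract above, the following sketch computes RMSECV and R²cv for a single-response PLS model. Several points are assumptions not stated in the abstract: leave-one-out is used as the cross-validation scheme, R²cv is taken as the squared Pearson correlation between measured and cross-validated predicted values, and the PLS1 model is fitted with a textbook NIPALS routine using mean centring only. The names `pls1_fit` and `rmsecv_loo` are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a mean-centred PLS1 model with NIPALS; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    X_res, y_res = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X_res.T @ y_res
        w = w / np.linalg.norm(w)          # weight vector
        t = X_res @ w                      # scores
        tt = t @ t
        p = X_res.T @ t / tt               # X loadings
        q_a = (y_res @ t) / tt             # y loading
        X_res = X_res - np.outer(t, p)     # deflate X
        y_res = y_res - q_a * t            # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return coef, x_mean, y_mean


def rmsecv_loo(X, y, n_components):
    """Leave-one-out RMSECV and squared correlation of measured vs. predicted."""
    n = len(y)
    predictions = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i           # refit without sample i
        coef, x_mean, y_mean = pls1_fit(X[keep], y[keep], n_components)
        predictions[i] = (X[i] - x_mean) @ coef + y_mean
    rmsecv = float(np.sqrt(np.mean((y - predictions) ** 2)))
    r2cv = float(np.corrcoef(y, predictions)[0, 1] ** 2)
    return rmsecv, r2cv
```

Running `rmsecv_loo` on the full spectrum and again on a subset of selected columns is one way to compare models with and without variable selection; the preprocessing steps mentioned in the abstract (derivative, row scaling, standardization) would be applied to `X` beforehand and are not part of this sketch.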

