A Calibration Tutorial for Spectral Data. Part 1: Data Pretreatment and Principal Component Regression Using Matlab

1996 ◽  
Vol 4 (1) ◽  
pp. 225-242 ◽  
Author(s):  
Paul Geladi ◽  
Harald Martens

Regression and calibration play an important role in analytical chemistry. All analytical instrumentation depends on a calibration that uses some regression model for a set of calibration samples. The ordinary least squares (OLS) method of building a multivariate linear regression (MLR) model has strict limitations. Therefore, biased or regularised regression models have been introduced. Selected examples are ridge regression (RR), principal component regression (PCR) and partial least squares regression (PLS or PLSR). Artificial neural networks (ANN) based on back-propagation can also be used as regression models. Understanding regression models requires more than a set of statistical parameters. A deeper understanding of the underlying chemistry and physics is always equally important. For spectral data this means that a basic understanding of spectra and their errors is useful and that spectral representation should be included in judging the usefulness of the data treatment. A “constructed” spectrometric example is introduced. It consists of real spectrometric measurements in the range 408–1176 nm for 26 calibration samples and 10 test samples. The main response variable is litmus concentration, but other constituents such as bromocresol green and ZnO are added as interferents, and the pH is also varied. The example is introduced as a tutorial. All calculations are shown in detail in Matlab. This makes it easy for the reader to follow and understand the calculations. It also makes the calculations completely traceable. The raw data are available as a file. In Part 1, the emphasis is on pretreatment of the data and on visualisation in different stages of the calculations. Part 1 ends with principal component regression calculations. Partial least squares calculations and some ANN results are presented in Part 2.
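As an illustration of the PCR step named in this abstract, the following is a minimal sketch in plain Python, not the tutorial's Matlab code. It mean-centres the data, extracts principal components with a NIPALS-style iteration (an assumption; the abstract does not say how the components are computed) and regresses the response on the scores. The function names and the tiny data set are hypothetical and synthetic. The three predictor columns are exactly collinear, a case where OLS on the raw variables has no unique solution but PCR with one component still works.

```python
# Principal component regression (PCR) sketch in plain Python.
# All names and data here are illustrative assumptions, not from the tutorial.

def column_means(X):
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def nipals_component(E, iters=500, tol=1e-12):
    """Return (scores t, unit-length loading p) of the leading component of E."""
    n, m = len(E), len(E[0])
    # Start from the column with the largest sum of squares.
    j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in E))
    t = [row[j0] for row in E]
    for _ in range(iters):
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        norm = sum(v * v for v in p) ** 0.5
        p = [v / norm for v in p]
        t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change <= tol * sum(v * v for v in t):
            break
    return t, p

def fit_pcr(X, y, n_components):
    """Fit PCR; n_components must not exceed the rank of the centred X."""
    x_mean = column_means(X)
    y_mean = sum(y) / len(y)
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centred X
    f = [v - y_mean for v in y]                              # centred y
    loadings, coefs = [], []
    for _ in range(n_components):
        t, p = nipals_component(E)
        # Scores are mutually orthogonal, so each coefficient is a 1-D fit.
        coefs.append(sum(a * b for a, b in zip(t, f)) / sum(v * v for v in t))
        loadings.append(p)
        # Deflate: remove the extracted component from E.
        E = [[E[i][j] - t[i] * p[j] for j in range(len(p))] for i in range(len(E))]
    return x_mean, y_mean, loadings, coefs

def predict_pcr(model, X_new):
    x_mean, y_mean, loadings, coefs = model
    preds = []
    for row in X_new:
        e = [v - m for v, m in zip(row, x_mean)]
        y_hat = y_mean
        for p, q in zip(loadings, coefs):
            y_hat += q * sum(a * b for a, b in zip(e, p))  # score times coefficient
        preds.append(y_hat)
    return preds

# Synthetic calibration set: columns are c, 2c, 3c for latent c = 1..5; y = 10c + 1.
X_cal = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0],
         [4.0, 8.0, 12.0], [5.0, 10.0, 15.0]]
y_cal = [11.0, 21.0, 31.0, 41.0, 51.0]

model = fit_pcr(X_cal, y_cal, n_components=1)
prediction = predict_pcr(model, [[6.0, 12.0, 18.0]])
print(prediction)  # close to [61.0], since the new sample has latent c = 6
```

With real spectra the number of components would normally be chosen by validation on held-out samples rather than fixed in advance.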

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Shen Yin ◽  
Lei Liu ◽  
Xin Gao ◽  
Hamid Reza Karimi

Soft measurement is a developing and promising industrial technology that is now widely used in industry. It plays a significant role especially where some key variables are difficult to measure with traditional measurement methods. In this paper, the quality of wine is evaluated from its physicochemical indexes using soft measurement based on multivariate methods. The multivariate methods used in this paper include ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR). A comparison of the four methods shows that the MPLSR prediction model gives better results than the others. In general, the quality of a wine is determined by hiring experienced wine tasters to taste it and make a decision. However, since the physicochemical indexes of a wine reflect its quality to some extent, soft measurement based on multivariate statistical methods can help the oenologist in wine evaluation.


2019 ◽  
Vol 8 (4) ◽  
pp. 496-505
Author(s):  
Vetranella .T.R.A. Sinaga ◽  
Diah Safitri ◽  
Rita Rahmawati

The classical assumptions of multiple regression are needed to obtain a linear, unbiased, minimum-variance estimator. In the data on the Human Development Index (HDI) and its influencing factors in East Java, two variables have a VIF greater than 10, so the assumption of non-multicollinearity is not fulfilled. Principal component regression (PCR) and partial least squares regression (PLS-R) can solve this problem. Principal component analysis yields two linear combinations that are taken as new independent variables free from collinearity. In PLS-R, the NIPALS algorithm is used to calculate the components and other structures and to estimate the parameters. While in PCR all independent variables are significant, in PLS-R the percentage of households with access to adequate drinking water is not significant. PLS-R’s R² of 95.85% is greater than PCR’s R² of 93.42%, and PCR’s PRESS of 81.78 is greater than PLS-R’s PRESS of 61.0595.

Keywords: Human Development Index (HDI), Multicollinearity, Principal Component Regression, Partial Least Squares Regression, R², PRESS
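The two diagnostics this abstract relies on, the variance inflation factor (VIF) and PRESS, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' computation: the data are synthetic, the helper names are hypothetical, and PRESS is computed here by leave-one-out refitting of an ordinary least squares model, whereas the abstract reports PRESS for PCR and PLS-R models. VIF for predictor j is 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors.

```python
# VIF and leave-one-out PRESS for an OLS model, plain Python.
# Data and helper names are illustrative assumptions.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def ols(X, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def r_squared(X, y):
    beta = ols(X, y)
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(X, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on all other columns."""
    out = []
    for j in range(len(X[0])):
        target = [row[j] for row in X]
        others = [row[:j] + row[j + 1:] for row in X]
        out.append(1.0 / (1.0 - r_squared(others, target)))
    return out

def press(X, y):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(X)):
        beta = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - predict(beta, X[i])) ** 2
    return total

# Synthetic data: the second column nearly duplicates the first.
X = [[1.0, 1.1, 2.0], [2.0, 1.9, 1.0], [3.0, 3.2, 4.0], [4.0, 3.9, 3.0],
     [5.0, 5.1, 6.0], [6.0, 5.8, 2.0], [7.0, 7.2, 5.0], [8.0, 7.9, 3.0]]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 9.0, 13.2, 12.1]

vifs = vif(X)
press_value = press(X, y)
print(vifs)         # the first two VIFs exceed 10, flagging multicollinearity
print(press_value)
```

A lower PRESS indicates better leave-one-out predictive ability, which is the sense in which the abstract compares the two models.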

