Examination of Criteria for Local Model Principal Component Regression

In analytical chemistry, principal component regression (PCR) is widely used as a method for calibration and prediction. The motivation behind PCR is to select factors associated with predictive information and eliminate those associated with noise. The classical approach, referred to as top-down selection, chooses sequential factors based on singular value magnitudes, and the same factors are used for all future unknown samples; i.e., a global model is formed. The number of factors needed is often determined through cross-validation on the calibration samples or with an external validation set. Alternatively, a model developed specific to an unknown sample, i.e., a local model or sample-dependent model, could offer improved accuracy. The idea behind sample-dependent PCR is that factors associated with small singular values not included in a top-down PCR model can still contain relevant predictive information. This paper shows that local models generated by selecting factors on a sample-by-sample basis often reduce prediction errors compared with those for the global top-down model. However, evidence is also provided that supports the use of global top-down models. Several criteria are proposed and examined for selecting factors on a sample-dependent basis. Observations and conclusions presented are based on two near-infrared data sets.
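The difference between a global top-down model and a sample-dependent model comes down to which SVD factors enter the regression vector. The following is a minimal sketch of that idea, not the selection criteria examined in the paper: the function names `pcr_fit` and `pcr_predict`, the synthetic data, and the factor subset `[0, 2, 4]` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pcr_fit(X, y, factors):
    """Fit PCR on the SVD factors whose zero-based indices are in `factors`."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    # Rows of Vt are loading vectors, ordered by decreasing singular value.
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    idx = np.asarray(factors)
    # Regression vector assembled from the selected factors only.
    b = Vt[idx].T @ ((U[:, idx].T @ (y - y_mean)) / s[idx])
    return x_mean, y_mean, b

def pcr_predict(model, X_new):
    """Predict with a model returned by pcr_fit."""
    x_mean, y_mean, b = model
    return (X_new - x_mean) @ b + y_mean

# Synthetic calibration data (an assumption, not data from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 4.0

# Top-down selection: the first three factors, used for every unknown sample.
global_model = pcr_fit(X, y, factors=range(3))
# Sample-dependent selection: a non-sequential subset picked for one sample.
local_model = pcr_fit(X, y, factors=[0, 2, 4])
```

In a sample-dependent scheme, the `factors` argument would be chosen separately for each unknown spectrum by whatever criterion is under study, whereas the top-down model fixes it once, typically by cross-validation.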

Nondestructive NIR and NIT Determination of Protein, Fat, and Water in Plastic-Wrapped, Homogenized Meat

Applied Spectroscopy ◽

10.1366/0003702924926745 ◽

1992 ◽

Vol 46 (11) ◽

pp. 1685-1694 ◽

Cited By ~ 37

Author(s):

Tomas Isaksson ◽

Charles E. Miller ◽

Tormod Næs

Keyword(s):

Diffuse Reflectance ◽

Near Infrared ◽

Multivariate Calibration ◽

Principal Component Regression ◽

Principal Component ◽

Prediction Errors ◽

Optimal Test ◽

Meat Samples ◽

Water Contents

In this work, the abilities of near-infrared diffuse reflectance (NIR) and transmittance (NIT) spectroscopy to noninvasively determine the protein, fat, and water contents of plastic-wrapped homogenized meat are evaluated. One hundred homogenized beef samples, ranging from 1 to 23% fat, wrapped in polyamide/polyethylene laminates, were used. Results of multivariate calibration and prediction for protein, fat, and water contents are presented. The optimal test set prediction errors (root mean square error of prediction, RMSEP), obtained with the principal component regression method on NIR data, were 0.45, 0.29, and 0.50 wt % for protein, fat, and water, respectively, for plastic-wrapped meat (compared to 0.40, 0.28, and 0.45 wt % for unwrapped meat). The optimal prediction errors for the NIT method were 0.31, 0.52, and 0.42 wt % for protein, fat, and water, respectively, for plastic-wrapped meat samples (compared to 0.27, 0.38, and 0.37 wt % for unwrapped meat). We can conclude that the addition of the laminate only slightly reduced the abilities of the NIR and NIT methods to predict protein, fat, and water contents in homogenized meat.
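For readers unfamiliar with the error measure quoted above, RMSEP is the square root of the mean squared difference between predicted and reference values on a test set. A small sketch follows; the function name `rmsep` and the example values are made up for illustration and are not data from the study.

```python
import numpy as np

def rmsep(y_reference, y_predicted):
    """Root mean square error of prediction over a test set."""
    y_reference = np.asarray(y_reference, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return float(np.sqrt(np.mean((y_predicted - y_reference) ** 2)))

# Hypothetical fat contents in wt %: every prediction is off by 0.5.
reference = [10.0, 12.5, 8.0, 20.0]
predicted = [10.5, 12.0, 8.5, 19.5]
print(rmsep(reference, predicted))  # prints 0.5
```

The result carries the same unit as the constituent being predicted, which is why the abstract reports it in wt %.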

The Effect of Multiplicative Scatter Correction (MSC) and Linearity Improvement in NIR Spectroscopy

Applied Spectroscopy ◽

10.1366/0003702884429869 ◽

1988 ◽

Vol 42 (7) ◽

pp. 1273-1284 ◽

Cited By ~ 362

Author(s):

Tomas Isaksson ◽

Tormod Næs

Keyword(s):

Near Infrared ◽

Scatter Correction ◽

Principal Component Regression ◽

NIR Spectroscopy ◽

Principal Component ◽

Data Sets ◽

Prediction Errors ◽

Regression Methods ◽

Multiplicative Scatter Correction ◽

Linearity Improvement

Near-infrared (NIR) reflectance spectra of five different food products were measured. The spectra were transformed by multiplicative scatter correction (MSC). Principal component regression (PCR) was performed on both scatter-corrected and uncorrected spectra. Calibration and prediction were performed for four food constituents: protein, fat, water, and carbohydrates. All regressions gave lower prediction errors (7–68% improvement) with MSC spectra than with uncorrected absorbance spectra. One of these data sets was studied in more detail to clarify the effects of MSC, using PCR score, residual, and leverage plots. A possible improvement from using nonlinear regression methods is indicated.
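As an illustration of the transform named in this abstract, MSC in its common textbook form regresses each spectrum on a reference spectrum (usually the mean of the calibration spectra) and then removes the fitted offset and slope. The sketch below follows that common form and is an assumption about the general technique, not a reproduction of the authors' implementation; the function name `msc` and the synthetic spectra are hypothetical.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction of row-wise spectra."""
    spectra = np.asarray(spectra, dtype=float)
    # Default reference: the mean spectrum of the set being corrected.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        # Fit x ≈ intercept + slope * ref by ordinary least squares.
        slope, intercept = np.polyfit(ref, x, 1)
        corrected[i] = (x - intercept) / slope
    return corrected

# Synthetic example: one underlying spectrum seen with different
# additive offsets and multiplicative scatter factors.
base = np.sin(np.linspace(0.0, 3.0, 50)) + 2.0
offsets = np.array([0.1, -0.2, 0.3])
gains = np.array([1.2, 0.8, 1.0])
raw = offsets[:, None] + gains[:, None] * base
corrected = msc(raw, reference=base)
```

In this noise-free example every corrected row collapses onto `base`, which is the effect MSC is meant to have on purely additive and multiplicative scatter differences.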

Measurements of Urea and Glucose in Aqueous Solutions with Dual-Beam Near-Infrared Fourier Transform Spectroscopy

Applied Spectroscopy ◽

10.1366/000370202321115869 ◽

2002 ◽

Vol 56 (12) ◽

pp. 1593-1599 ◽

Cited By ~ 6

Author(s):

Peter Snoer Jensen ◽

Jimmy Bak

Keyword(s):

Aqueous Solutions ◽

Near Infrared ◽

Principal Component Regression ◽

Principal Component ◽

Prediction Errors ◽

Single Beam ◽

Dual Beam ◽

Calibration Models ◽

FT-IR ◽

Beam Technique

This study investigates the use of a dual-beam, optical null, FT-IR spectrometer to measure trace organic components in aqueous solutions in the combination band region 5000–4000 cm⁻¹. The spectrometer may be used for both single- and dual-beam measurements, thereby facilitating comparison of these two modes of operation. The concentrations of aqueous solutions of urea and glucose in the ranges 0–40 mg/dL and 0–250 mg/dL, respectively, were determined by principal component regression using both modes. The dual-beam technique eliminated instrumental variations present in the single-beam measurements that must be taken into account when quantifying trace components from single-beam spectra. The data obtained with the dual-beam technique resulted in more stable calibration models based on principal component regression. These calibration models need fewer factors and yield lower prediction errors than those based on traditional single-beam data.

Locally Weighted Regression in Diffuse Near-Infrared Transmittance Spectroscopy

Applied Spectroscopy ◽

10.1366/0003702924444344 ◽

1992 ◽

Vol 46 (1) ◽

pp. 34-43 ◽

Cited By ~ 75

Author(s):

Tormod Næs ◽

Tomas Isaksson

Keyword(s):

Near Infrared ◽

Principal Component Regression ◽

Principal Component ◽

Distance Measures ◽

Weighted Regression ◽

Weight Functions ◽

Prediction Errors ◽

Least Squares Regression ◽

Locally Weighted Regression ◽

Infrared Transmittance

This paper presents an application of locally weighted regression (LWR) in diffuse near-infrared transmittance spectroscopy. The data are from beef and pork samples. The LWR method is based on the idea that a nonlinearity can be approximated by local linear equations. Different weight functions (for the samples) as well as different distance measures for “closeness” are tested. The LWR is compared to principal component regression and partial least-squares regression. The LWR with weighted principal components is shown to give the best results. The improvements with respect to linear regression are up to 15% of the prediction errors.
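The local-linear idea behind LWR can be sketched as follows: for each unknown sample, pick the nearest calibration samples, weight them by closeness, and fit a small linear model on principal component scores. The abstract states that several weight functions and distance measures were tested without naming them here, so the Euclidean distance in score space and the tricube weight used below are assumptions, as are the function name `lwr_predict` and its default parameters.

```python
import numpy as np

def lwr_predict(X, y, x_new, n_neighbors=10, n_components=2):
    """Predict y for one spectrum with a locally weighted linear fit on PC scores."""
    x_mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    P = Vt[:n_components].T              # loadings of the leading components
    T = (X - x_mean) @ P                 # calibration scores
    t_new = (x_new - x_mean) @ P         # scores of the unknown sample
    # Closeness: Euclidean distance in score space (an assumption).
    dist = np.linalg.norm(T - t_new, axis=1)
    nearest = np.argsort(dist)[:n_neighbors]
    d = dist[nearest]
    # Tricube weights (an assumption): 1 at zero distance, 0 at the farthest neighbor.
    w = (1.0 - (d / d.max()) ** 3) ** 3 if d.max() > 0 else np.ones_like(d)
    # Weighted least squares with an intercept, solved via row scaling.
    A = np.column_stack([np.ones(len(nearest)), T[nearest]])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y[nearest] * sw, rcond=None)
    return float(np.concatenate(([1.0], t_new)) @ coef)
```

A call such as `lwr_predict(X_cal, y_cal, x_unknown, n_neighbors=30, n_components=3)` (with hypothetical arrays) returns a single prediction; both tuning parameters would normally be chosen by validation.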

Analisis Kandungan Nutrisi Kulit Kopi (Coffea sp.) dengan Menggunakan Metode Near Infrared Reflectance Spectroscopy (NIRS)

Jurnal Ilmiah Mahasiswa Pertanian ◽

10.17969/jimfp.v2i4.5589 ◽

2017 ◽

Vol 2 (4) ◽

Author(s):

Andika Boy Yuliansyah ◽

Sitti Wajizah ◽

Samadi Samadi

Keyword(s):

Near Infrared ◽

Principal Component Regression ◽

Principal Component ◽

Coffea Canephora ◽

Reflectance Spectroscopy ◽

Infrared Reflectance ◽

Near Infrared Reflectance Spectroscopy ◽

Near Infrared Reflectance ◽

Coffee Pulp ◽

Infrared Reflectance Spectroscopy

Analysis of Coffee Pulp (Coffea sp.) Nutrition Content Using the Near Infrared Reflectance Spectroscopy (NIRS) Method. Abstract. The aim of this study was to evaluate the accuracy of near infrared reflectance spectroscopy (NIRS) as a feed analysis method for predicting the nutritional content of coffee pulp, and to identify the wavelengths involved. The study was conducted in the Nutrition Science and Feed Technology Laboratory, Department of Animal Husbandry, Faculty of Agriculture, Syiah Kuala University, from August to September 2017. Thirty coffee pulp samples from two coffee species, arabica (Coffea arabica) and robusta (Coffea canephora), were used. Spectra were scanned with an FT-IR IPTEK T-1516 instrument over the 1000 to 2500 nm wavelength range, and calibration and validation were carried out with The Unscrambler X version 10.4 software. The pretreatments used were multiplicative scatter correction (MSC) and detrending (DT), with principal component regression (PCR) as the calibration method. The nutritional parameters analyzed were dry matter (DM), crude protein (CP), and dietary fiber (DF). The results showed that the NIRS models built could not predict the DM content of coffee pulp well, as indicated by low values of r, R2, and RPD (0.58, 0.34, and 3.06) and a high RMSEC (3.60). NIRS predicted CP and DF content quite well with MSC pretreatment (CP: r = 0.87, R2 = 0.76, RMSEC = 0.45, RPD = 2.07; DF: r = 0.87, R2 = 0.75, RMSEC = 2.83, RPD = 2.03). Rough predictions of CP and DF content were obtained with DT pretreatment (CP: r = 0.75, R2 = 0.57, RMSEC = 0.60, RPD = 1.55; DF: r = 0.84, R2 = 0.71, RMSEC = 3.06, RPD = 1.88).

Use of residual maximum likelihood to evaluate accuracy of two NIRS calibration procedures, relative to Kjeldahl, for determining nitrogen concentration of soybeans

Canadian Journal of Plant Science ◽

10.4141/cjps91-053 ◽

1991 ◽

Vol 71 (2) ◽

pp. 385-392 ◽

Cited By ~ 5

Author(s):

G. B. Schaalje ◽

H. -H. Mündel

Keyword(s):

Maximum Likelihood ◽

Near Infrared ◽

Nitrogen Concentration ◽

Principal Component Regression ◽

Principal Component ◽

Field Trials ◽

Near Infrared Reflectance ◽

Residual Maximum Likelihood ◽

Maximum Likelihood Procedure ◽

Calibration Methods

The accuracy of estimates of plant properties based on near-infrared reflectance spectroscopy (NIRS) varies with many factors including the biological material in question and the method used to calibrate the NIRS instrument. This study investigated the accuracy, relative to Kjeldahl analysis, of NIRS analysis based on two calibration methods in estimating nitrogen concentration of four stages and/or parts of soybean (Glycine max (L.) Merr.) plants. Samples of whole top growth at anthesis, whole top growth at maturity, whole top growth at maturity excluding seeds, and seeds were obtained from two field trials and one phytotron experiment. Two Kjeldahl determinations of nitrogen concentration were obtained for each sample, as well as reflectance values at each of 19 infrared wavelengths, using a Technicon InfraAlyser 400R. Different subsets of the sample data were used for calibration and assessment of accuracy. The instrument was calibrated using stepwise multiple linear regression (SMLR) and principal component regression (PCR). The residual maximum likelihood procedure was useful in showing that NIRS estimates based on either SMLR or PCR were at least as accurate as Kjeldahl estimates for all stages and/or parts except whole top growth at maturity excluding seeds. Key words: Calibration, principal component regression, stepwise regression

Rapid identification of epoxy resin and phenolic resin using near infrared spectroscopy

Journal of Near Infrared Spectroscopy ◽

10.1177/0967033517732580 ◽

2017 ◽

Vol 25 (5) ◽

pp. 324-329 ◽

Cited By ~ 2

Author(s):

Li Dan ◽

Wu Yi-Hui

Keyword(s):

Infrared Spectroscopy ◽

Epoxy Resin ◽

Near Infrared Spectroscopy ◽

Near Infrared ◽

Phenolic Resin ◽

External Validation ◽

Principal Component ◽

Rapid Identification ◽

Classical Analysis ◽

Prediction Rate

The aim of this research was to investigate the feasibility of Fourier transform near infrared spectroscopy combined with chemometric analysis to develop a rapid method for identification of different resin types which had been deemed similar by a preliminary visual examination. Principal component analysis was applied on spectral data to classify two types of epoxy resin samples and three types of phenolic resin samples. In this case, a total of two hundred and fifteen samples were used for the evaluation and validation of two types of epoxy resin samples (SY1342 and SY1346) and three types of phenolic resin samples (Y3567, Y2705 and Y2137). All were correctly differentiated by their respective models. Moreover, in the external validation, the prediction rate of samples correctly classified was also 100%. Such classifications are very important for the detection of adulterated samples and for quality control. Near infrared spectroscopy was shown to be a very reliable, accurate and useful tool to classify resin samples in a fast, clean and inexpensive way compared to classical analysis, and it will enable copper clad laminate manufacturers to detect and take early corrective actions that will ultimately save time and money while establishing a uniform quality.

Comparative particle size determination of phenacetin bulk powder by using Kubelka–Munk theory and principal component regression analysis based on near-infrared spectroscopy

Powder Technology ◽

10.1016/j.powtec.2004.01.025 ◽

2004 ◽

Vol 141 (3) ◽

pp. 244-250 ◽

Cited By ~ 84

Author(s):

Makoto Otsuka

Keyword(s):

Particle Size ◽

Regression Analysis ◽

Near Infrared ◽

Principal Component Regression ◽

Principal Component ◽

Size Determination ◽

Bulk Powder ◽

Principal Component Regression Analysis ◽

Particle Size Determination

Near-infrared spectroscopy quantitative determination of Pefloxacin mesylate concentration in pharmaceuticals by using partial least squares and principal component regression multivariate calibration

Spectrochimica Acta Part A Molecular and Biomolecular Spectroscopy ◽

10.1016/j.saa.2010.02.012 ◽

2010 ◽

Vol 75 (5) ◽

pp. 1535-1539 ◽

Cited By ~ 25

Author(s):

Yunfei Xie ◽

Yan Song ◽

Yong Zhang ◽

Bing Zhao

Keyword(s):

Infrared Spectroscopy ◽

Quantitative Determination ◽

Least Squares ◽

Partial Least Squares ◽

Near Infrared Spectroscopy ◽

Near Infrared ◽

Multivariate Calibration ◽

Principal Component Regression ◽

Principal Component

A Comparison of Techniques for Modelling Data with Non-Linear Structure

Journal of Near Infrared Spectroscopy ◽

10.1255/jnirs.354 ◽

2003 ◽

Vol 11 (1) ◽

pp. 55-70 ◽

Cited By ~ 6

Author(s):

Laila Stordrange ◽

Olav M. Kvalheim ◽

Per A. Hassel ◽

Dick Malthe-Sørenssen ◽

Fred Olav Libnau

Keyword(s):

Near Infrared ◽

Principal Component Regression ◽

Predictive Ability ◽

Principal Component ◽

Linear Structure ◽

Data Set ◽

Linear Pattern ◽

Near Infrared Spectra ◽

Non Linear ◽

Regression Techniques

Partial least squares (PLS) is a powerful tool for multivariate linear regression. But what if the data show a non-linear structure? Near infrared spectra from a pharmaceutical process were used as a case study. An ANOVA test revealed that the data are well described by a 2nd order polynomial. This work investigates the application of regression techniques that account for slightly non-linear data. The regression techniques investigated are: linearising data by applying transformations, local PLS, i.e. splitting of data, and quadratic PLS. These models were compared with ordinary PLS and principal component regression (PCR). The predictive ability of the models was tested on an independent data set acquired a year later. Using the knowledge of non-linear pattern and important spectral regions, simpler models with better predictive ability can be obtained.
