Variable Selection in Multivariate Calibration of a Spectroscopic Glucose Sensor

1997 ◽  
Vol 51 (10) ◽  
pp. 1559-1564 ◽  
Author(s):  
Michael J. McShane ◽  
Gerard L. Coté ◽  
Clifford Spiegelman

A variable selection method that reduces prediction bias in partial least-squares regression models was developed and applied to near-infrared absorbance spectra of glucose in pH buffer and cell culture medium. Comparisons between calibration and prediction capability for full spectra and reduced sets were completed. Variable selection resulted in statistically equivalent errors while reducing the number of wavelengths needed to fit the calibration data and predict concentrations from new spectra. Fewer than 25 wavelengths were selected to produce errors statistically equivalent to those yielded by the full set containing over 500 wavelengths. The algorithm correctly chose the glucose absorption peak areas as the information-carrying spectral regions.

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Weiwei Jiang ◽  
Changhua Lu ◽  
Yujun Zhang ◽  
Wei Ju ◽  
Jizhou Wang ◽  
...  

The MC-UVE-SPA method, a cascade of Monte Carlo uninformative variable elimination (MC-UVE) and the successive projections algorithm (SPA), is commonly proposed as a variable selection approach for multivariate calibration. However, the SPA tends to select wavelength variables that are sparsely distributed over the wavelength ranges selected by the MC-UVE algorithm, and the MC-UVE-SPA cascade cannot remedy this discontinuity of wavelength points. This paper addresses the problem by proposing a moving-window- (MW-) improved MC-UVE-SPA wavelength selection algorithm. The proposed algorithm improves the continuity of the selected wavelength variables and thereby better exploits the advantages of the MC-UVE algorithm and the SPA to obtain regression models with high prediction accuracy. The MC-UVE, MC-UVE-SPA, and MC-UVE-SPA-MW algorithms are applied to wavelength variable selection for the NIR spectral absorbance data of corn, diesel fuel, and ethylene. Partial least squares regression (PLSR) models for the oil content of corn, the boiling point of diesel fuel, and the ethylene concentration are established after wavelength selection with the MC-UVE algorithm, and corresponding multiple linear regression (MLR) models are established after wavelength selection with the MC-UVE-SPA and MC-UVE-SPA-MW algorithms. Experimental results demonstrate that the progressive elimination of uncorrelated and collinear variables generates increasingly simplified partial-spectrum models with greater prediction accuracy than the full-spectrum model. Among the three wavelength selection algorithms, the MC-UVE-SPA selected the fewest wavelength variables, while the proposed MC-UVE-SPA-MW algorithm provided models with the greatest prediction accuracy.
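
The abstract does not spell out how the moving window operates, so the following is only a rough sketch of one way a window could restore continuity around sparsely selected wavelength indices. The function name `expand_with_window`, the `half_width` parameter, and the example indices are hypothetical and are not taken from the paper.

```python
def expand_with_window(selected, half_width, n_variables):
    """Grow each selected index into a contiguous window.

    Hypothetical illustration only: every selected wavelength index is
    replaced by the indices within `half_width` of it, clipped to the
    valid range [0, n_variables - 1]. Overlapping windows are merged.
    """
    expanded = set()
    for idx in selected:
        lo = max(0, idx - half_width)
        hi = min(n_variables - 1, idx + half_width)
        expanded.update(range(lo, hi + 1))
    return sorted(expanded)


# Made-up sparse selection, standing in for an SPA output.
sparse = [3, 10, 11, 40]
windows = expand_with_window(sparse, half_width=1, n_variables=42)
print(windows)
```

A wider window trades parsimony for continuity, which is consistent with the abstract's remark that MC-UVE-SPA alone selects the fewest variables.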


2017 ◽  
Vol 25 (4) ◽  
pp. 223-230 ◽  
Author(s):  
Joseph Dubrovkin

It was shown that linear transformations are suitable for use in multivariate calibration in near infrared spectroscopy as data compression tools. Partial Least Squares calibration models were built using spectral data transformed by expansion in the series of classical orthogonal polynomials, Fourier and wavelet harmonics. These models allowed effective prediction of the cetane number of diesel fuels, Brix and pol parameters of syrup in sugar production and fat and total protein content in milk. Depending on the compression ratio, prediction errors were no larger than 30% of corresponding errors obtained by the use of the non-transformed models. Although selection of the most suitable transformation depends on the calibration data and on the cross-validation method, in many cases Fourier transform gave satisfactory results.
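
As a minimal sketch of the compression idea, the snippet below reduces a spectrum to its first few Fourier harmonics, which could then replace the raw absorbances as inputs to a calibration model. The function name `fourier_features`, the synthetic Gaussian band, and the choice of four harmonics are assumptions for illustration; the paper's exact transforms and its PLS step are not reproduced here.

```python
import cmath
import math


def fourier_features(spectrum, n_harmonics):
    """Return the real and imaginary parts of the lowest DFT harmonics.

    Harmonic 0 (the mean) contributes one value; each higher harmonic
    contributes two, so the output has 2 * n_harmonics - 1 values.
    """
    n = len(spectrum)
    features = []
    for k in range(n_harmonics):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * j * k / n)
            for j, x in enumerate(spectrum)
        ) / n
        features.append(coeff.real)
        if k > 0:
            features.append(coeff.imag)
    return features


# Synthetic smooth band on 64 points (not data from the study).
spectrum = [math.exp(-((j - 32) / 8.0) ** 2) for j in range(64)]
features = fourier_features(spectrum, n_harmonics=4)

# Compression ratio: original variables per retained coefficient.
compression_ratio = len(spectrum) / len(features)
print(len(features), round(compression_ratio, 2))
```

Because the transform is linear, a linear calibration model fitted on these coefficients corresponds to a linear model on the original spectrum restricted to the retained harmonics.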


2021 ◽  
Author(s):  
Hayfa Zayani ◽  
Youssef Fouad ◽  
Didier Michot ◽  
Zeineb Kassouk ◽  
Zohra Lili-Chabaane ◽  
...  

Visible-Near Infrared (Vis-NIR) spectroscopy has proven its efficiency in predicting several soil properties such as soil organic carbon (SOC) content. In this preliminary study, we explored the ability of Vis-NIR to assess the temporal evolution of SOC content. Soil samples were collected in a watershed (ORE AgrHys) located in Brittany (Western France). Two sampling campaigns were carried out 5 years apart: in 2013, 198 soil samples were collected at two depths (0-15 and 15-25 cm) over an area of 1200 ha including different land uses and land covers; in 2018, 111 of the 198 sampling points of 2013 were selected and soil samples were collected from the same two depths. All samples were analyzed for their SOC content and scanned for their reflectance spectrum. Spectral information was acquired with a full-range Vis-NIR spectroradiometer (ASD Fieldspec®3) from samples sieved to 2 mm and oven-dried at 40°C for 24 h prior to spectra acquisition. The 2013 data set was used to calibrate the SOC content prediction model by means of Partial Least Squares Regression (PLSR). The 2018 data set was then used as the test set. Our results showed that the variation ∆SOC_obs obtained from observed values in 2013 and 2018 (∆SOC_obs = Observed SOC (2018) - Observed SOC (2013)) ranged from 0.1 to 25.9 g/kg. Moreover, the prediction performance of the calibrated model was improved by including 11 spectra of 2018 in the 2013 calibration data set (R² = 0.87, RMSE = 5.1 g/kg and RPD = 1.92). Furthermore, the comparison of predicted and observed ∆SOC between 2018 and 2013 showed that 69% of the variations were of the same sign, either positive or negative. For the remaining 31%, the variations were of opposite signs but concerned mainly samples for which ∆SOC_obs is less than 1.5 g/kg.
These results suggest that Vis-NIR spectroscopy is potentially suitable for detecting variations in SOC content and encourage further exploration of Vis-NIR spectroscopy for detecting changes in soil carbon stocks.
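
The evaluation quantities named in this abstract (RMSE, RPD, and the share of ∆SOC values with matching sign) can be sketched as follows. The numbers are made up for illustration and are not data from the study; the use of the sample standard deviation in RPD is an assumption, since the abstract does not state which convention was used.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    )


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observed values / RMSE.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    n = len(observed)
    mean = sum(observed) / n
    sd = math.sqrt(sum((o - mean) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)


def sign_agreement(delta_obs, delta_pred):
    """Fraction of samples whose observed and predicted changes share a sign."""
    same = sum(1 for o, p in zip(delta_obs, delta_pred) if o * p > 0)
    return same / len(delta_obs)


# Hypothetical SOC values in g/kg for four sampling points.
soc_2013_obs = [20.0, 35.0, 15.0, 42.0]
soc_2018_obs = [22.0, 30.0, 18.0, 41.0]
soc_2013_pred = [21.0, 33.0, 16.0, 40.0]
soc_2018_pred = [24.0, 31.0, 15.5, 42.0]

# Change in SOC = value in 2018 minus value in 2013, as in the abstract.
delta_obs = [b - a for a, b in zip(soc_2013_obs, soc_2018_obs)]
delta_pred = [b - a for a, b in zip(soc_2013_pred, soc_2018_pred)]

print(rmse(soc_2018_obs, soc_2018_pred))
print(rpd(soc_2018_obs, soc_2018_pred))
print(sign_agreement(delta_obs, delta_pred))
```

Small observed changes are the ones most likely to flip sign under prediction error, which matches the abstract's note that the disagreements mainly concern samples with small ∆SOC.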


2000 ◽  
Vol 54 (4) ◽  
pp. 608-623 ◽  
Author(s):  
Vítézslav Centner ◽  
Jorge Verdú-Andrés ◽  
Beata Walczak ◽  
Delphine Jouan-Rimbaud ◽  
Frédéric Despagne ◽  
...  

The present study compares the performance of different multivariate calibration techniques applied to four near-infrared data sets when test samples are well within the calibration domain. Three types of problems are discussed: the nonlinear calibration, the calibration using heterogeneous data sets, and the calibration in the presence of irrelevant information in the set of predictors. Recommendations are derived from the comparison, which should help to guide a nonchemometrician through the selection of an appropriate calibration method for a particular type of calibration data. A flexible methodology is proposed to allow selection of an appropriate calibration technique for a given calibration problem.


2000 ◽  
Vol 54 (2) ◽  
pp. 294-299 ◽  
Author(s):  
Songbiao Zhang ◽  
Babs R. Soller ◽  
Shubjeet Kaur ◽  
Kristen Perras ◽  
Thomas J. Vander Salm

Hematocrit (Hct), the volume percent of red cells in blood, is monitored routinely for blood donors, surgical patients, and trauma victims and requires blood to be removed from the patient. An accurate, noninvasive method for directly measuring hematocrit on patients is desired for these applications. The feasibility of noninvasive hematocrit measurement by using near-infrared (NIR) spectroscopy and partial least-squares (PLS) techniques was investigated, and methods of in vivo calibration were examined. Twenty Caucasian patients undergoing cardiac surgery on cardiopulmonary bypass were randomly selected to form two study groups. A fiber-optic probe was attached to the patient's forearm, and NIR spectra were continuously collected during surgery. Blood samples were simultaneously collected and reference Hct measurements were made with the spun capillary method. PLS multivariate calibration techniques were applied to investigate the relationship between spectral and Hct changes. Single patient calibration models were developed with good cross-validated estimation of accuracy (∼ 1 Hct%) and trending capability for most patients. Time-dependent system drift, patient temperature, and venous oxygen saturation were not correlated with the hematocrit measurements. Multi-subject models were developed for prediction of independent subjects. These models demonstrated a significant patient-specific offset that was shown to be partially related to spectrometer drift. The remaining offset is attributed to the large spectral variability of patient tissue, and a significantly larger set of patients would be required to adequately model this variability. After the removal of the offset, the cross-validated estimation of accuracy is 2 Hct%.


1996 ◽  
Vol 50 (2) ◽  
pp. 270-276 ◽  
Author(s):  
Hoeil Chung ◽  
Mark A. Arnold ◽  
Martin Rhiel ◽  
David W. Murhammer

Calibration models are generated and evaluated for the measurement of five different components in synthetic mixtures prepared in aqueous solutions. Mixtures of glucose, glutamine, ammonia, lactate, and glutamate were prepared to simulate concentration levels expected during routine bioreactor fermentation processes. Near-IR spectra were collected from these solutions over the spectral range from 5000 to 4000 cm⁻¹. This spectral information was used to build individual multivariate calibration models for each analyte. Models were constructed on the basis of partial least-squares regression of raw and Fourier-filtered absorbance spectra. Each analyte could be detected selectively with mean percent errors of prediction ranging from 4 to 8%.


2020 ◽  
Vol 12 (20) ◽  
pp. 3394
Author(s):  
Lu Xu ◽  
Yongsheng Hong ◽  
Yu Wei ◽  
Long Guo ◽  
Tiezhu Shi ◽  
...  

Visible and near-infrared reflectance (VIS-NIR) spectroscopy is widely applied to estimate soil organic carbon (SOC). Intense and diverse human activities increase the heterogeneity in the relationships between SOC and VIS-NIR spectra in anthropogenic soil, which results in poor performance of SOC estimation models. To improve model accuracy and parsimony, we investigated the performance of two variable selection algorithms, namely competitive adaptive reweighted sampling (CARS) and random frog (RF), coupled with five spectral pretreatments. A total of 108 samples were collected from Jianghan Plain, China, with the SOC content and VIS-NIR spectra measured in the laboratory. Results showed that both CARS and RF coupled with partial least squares regression (PLSR) outperformed PLSR alone, giving higher model accuracy with fewer spectral variables. This indicates that spectral variable selection can identify the important spectral variables that account for the relationships between SOC and VIS-NIR spectra, thereby improving the accuracy and parsimony of PLSR models in anthropogenic soil. Our findings are of significant practical value for SOC estimation in anthropogenic soil by VIS-NIR spectroscopy.

