Concentration Residual Augmented Classical Least Squares (CRACLS): A Multivariate Calibration Method with Advantages over Partial Least Squares

2002 ◽  
Vol 56 (5) ◽  
pp. 615-624 ◽  
Author(s):  
David K. Melgaard ◽  
David M. Haaland ◽  
Christine M. Wehlburg

A significant extension to the classical least-squares (CLS) algorithm called concentration residual augmented CLS (CRACLS) has been developed. Previously, unmodeled sources of spectral variation have rendered CLS models ineffective for most types of problems, but with the new CRACLS algorithm, CLS-type models can be applied to a significantly wider range of applications. This new quantitative multivariate spectral analysis algorithm iteratively augments the calibration matrix of reference concentrations with concentration residuals estimated during CLS prediction. Because these residuals represent linear combinations of the unmodeled spectrally active component concentrations, the effects of these components are removed from the calibration of the analytes of interest. This iterative process allows the development of a CLS-type calibration model comparable in prediction ability to implicit multivariate calibration methods such as partial least squares (PLS) even when unmodeled spectrally active components are present in the calibration sample spectra. In addition, CRACLS retains the improved qualitative spectral information of the CLS algorithm relative to PLS. More importantly, CRACLS provides a model compatible with the recently presented prediction-augmented CLS (PACLS) method. The CRACLS/PACLS combination generates an adaptable model that can achieve excellent prediction ability for samples of unknown composition that contain unmodeled sources of spectral variation. The CRACLS algorithm is demonstrated with both simulated and real data derived from a system of dilute aqueous solutions containing glucose, ethanol, and urea. The simulated data demonstrate the effectiveness of the new algorithm and help elucidate the principles behind the method. Using experimental data, we compare the prediction abilities of CRACLS and PLS during cross-validated calibration. 
In combination with PACLS, the CRACLS predictions are comparable to PLS for the prediction of the glucose, ethanol, and urea components for validation samples collected when significant instrument drift was present. However, the PLS predictions required recalibration using nonstandard cross-validated rotations while CRACLS/PACLS was rapidly updated during prediction without the need for time-consuming cross-validated recalibration. The CRACLS/PACLS algorithm provides a more general approach to removing the detrimental effects of unmodeled components.
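
The iterative augmentation described in this abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the choice to augment with the residual of a single analyte column, and the noise-free three-component simulation are all hypothetical. The abstract does not say how the number of augmentations is chosen, so `n_aug` is left as a user parameter.

```python
import numpy as np

def cls_calibrate(A, C):
    """Least-squares estimate of pure-component spectra K in A ≈ C @ K."""
    return np.linalg.pinv(C) @ A

def cls_predict(A, K):
    """Least-squares estimate of concentrations from spectra A given K."""
    return A @ np.linalg.pinv(K)

def cracls_calibrate(A, C, n_aug=1, analyte=0):
    """Augment C with concentration residuals n_aug times, then recalibrate.

    A: (n_samples, n_channels) calibration spectra.
    C: (n_samples, n_known) reference concentrations.
    analyte: column whose residual is appended (an assumption of this sketch).
    """
    C_aug = C.copy()
    for _ in range(n_aug):
        K = cls_calibrate(A, C_aug)                   # CLS calibration
        C_hat = cls_predict(A, K)                     # CLS prediction on the same spectra
        residual = C_hat[:, analyte] - C[:, analyte]  # concentration residual
        C_aug = np.column_stack([C_aug, residual])    # augment the concentration matrix
    return cls_calibrate(A, C_aug)

def cracls_predict(A_new, K_aug, n_known):
    """Predict, keeping only the columns that belong to the known analytes."""
    return cls_predict(A_new, K_aug)[:, :n_known]

# Noise-free simulation: three spectrally active components, references for two.
rng = np.random.default_rng(0)
K_true = rng.random((3, 50))
C_cal, C_val = rng.random((20, 3)), rng.random((10, 3))
A_cal, A_val = C_cal @ K_true, C_val @ K_true

K_cls = cls_calibrate(A_cal, C_cal[:, :2])
K_cracls = cracls_calibrate(A_cal, C_cal[:, :2], n_aug=1)
err_cls = np.abs(cls_predict(A_val, K_cls) - C_val[:, :2]).max()
err_cracls = np.abs(cracls_predict(A_val, K_cracls, 2) - C_val[:, :2]).max()
```

In this idealized setting the residual is proportional to the part of the unmodeled concentration that the known concentrations cannot explain, so one augmentation is enough for the augmented model to span the third component. With noisy data, the residuals would also carry noise and more care would be needed in choosing `n_aug`.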

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Background: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) to take into account groups of variables, and a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at the variable level and at the group level. Results: Our method has the advantage of providing a global, readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and gave an example of application on real data with the aim of highlighting common susceptibility variants to breast and thyroid cancers. Conclusion: The joint-sgPLS shows interesting properties for detecting a signal. As an extension of the PLS, the method is suited for data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is suited to any data with a high number of variables and an explicit a priori architecture in other application fields.


2003 ◽  
Vol 57 (2) ◽  
pp. 146-151 ◽  
Author(s):  
Babs R. Soller ◽  
Janice Favreau ◽  
Patrick O. Idwasi

The feasibility of using near-infrared (NIR) spectroscopy in combination with partial least-squares (PLS) regression was explored to measure electrolyte concentration in whole blood samples. Spectra were collected from diluted blood samples containing randomized, clinically relevant concentrations of Na+, K+, and Ca2+. Sodium was also studied in lysed blood. Reference measurements were made from the same samples using a standard clinical chemistry instrument. PLS was used to develop calibration models for each ion with acceptable results (Na+, R2 = 0.86, CVSEP = 9.5 mmol/L; K+, R2 = 0.54, CVSEP = 1.4 mmol/L; Ca2+, R2 = 0.56, CVSEP = 0.18 mmol/L). Slightly improved results were obtained using a narrower wavelength region (470–925 nm) where hemoglobin, but not water, absorbed, indicating that ionic interaction with hemoglobin is as effective as water in causing measurable spectral variation. Good models were also achieved for sodium in lysed blood, illustrating that cell swelling, which is correlated with sodium concentration, is not required for calibration model development.


2000 ◽  
Vol 54 (2) ◽  
pp. 246-254 ◽  
Author(s):  
David M. Haaland

The analysis accuracy and precision of any multivariate calibration method will be severely degraded if unmodeled sources of spectral variation are present in the unknown sample spectra. A synthetic method for correcting errors generated by the presence of unmodeled components or other sources of unmodeled spectral variation has been developed. If the spectral shape of the unmodeled spectral component can be obtained and mathematically added in variable amounts to the original calibration spectra, then a new synthetic multivariate calibration model can be generated from the augmented data to accommodate the presence of the unmodeled source of spectral variation. The new method is demonstrated for a case where unmodeled temperature variations are present in the unknown sample spectra of dilute aqueous solutions of urea, creatinine, and NaCl. When constant-temperature partial least-squares (PLS) models are applied to spectra of variable-temperature samples, the standard errors of prediction (SEP) are approximately an order of magnitude higher than those of the original cross-validated SEPs of the constant-temperature PLS models. Synthetic models based upon constant-temperature data augmented with a classical least-squares (CLS) estimate of the spectral effect of temperature obtained from variable-temperature aqueous sample spectra are demonstrated to significantly reduce errors when predicting concentrations from spectra of solutions at variable-temperature. We demonstrate that the prediction precisions approach the original calibration precisions when the new synthetic PLS models are applied to variable-temperature solution spectra. Although spectrometer drift added bias errors to the analyte determinations, a method is demonstrated that can minimize the effect of long-term drift on prediction errors through the measurement and use of spectra obtained from a small subset of samples measured during both calibration and prediction. 
In addition, sample temperature can be predicted with high accuracy (±0.13 °C) with this new synthetic PLS modeling method without the need to recalibrate using actual variable-temperature sample data. Therefore, the synthetic method eliminates the need for expensive generation of new calibration samples and collection of their spectra. The method is quite general and can be applied by using any known source of spectral variation and used with any multivariate calibration method.
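
The augmentation step described above, adding a known spectral shape in variable amounts to the original calibration spectra, can be sketched as follows. This is an illustrative NumPy sketch only: the function name `synthetic_augment` and the choice to replicate every calibration spectrum once per amount are assumptions, and the abstract does not specify how the amounts were chosen.

```python
import numpy as np

def synthetic_augment(A_cal, C_cal, shape, amounts):
    """Add `shape` to every calibration spectrum in each of the given amounts.

    Returns the stacked spectra, the matching (unchanged) concentrations,
    and the amount added to each row, which can serve as an extra reference
    value (for example temperature) when the model is recalibrated.
    """
    amounts = np.asarray(amounts, dtype=float)
    A_syn = np.vstack([A_cal + amt * shape for amt in amounts])
    C_syn = np.tile(C_cal, (len(amounts), 1))
    added = np.repeat(amounts, A_cal.shape[0])
    return A_syn, C_syn, added
```

The augmented arrays would then be passed to whichever multivariate calibration method is in use; that step is deliberately left out here.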


2002 ◽  
Vol 56 (5) ◽  
pp. 605-614 ◽  
Author(s):  
Christine M. Wehlburg ◽  
David M. Haaland ◽  
David K. Melgaard ◽  
Laura E. Martin

Our newly developed prediction-augmented classical least-squares/partial least-squares (PACLS/PLS) hybrid algorithm can correct for the presence of unmodeled sources of spectral variation such as instrument drift by explicitly incorporating known or empirically derived information about the unmodeled spectral variation. We have tested the ability of the new hybrid algorithm to maintain a multivariate calibration in the presence of instrument drift using a near-infrared (NIR) spectrometer (7500–11 000 cm−1) to quantitate dilute aqueous solutions containing glucose, ethanol, and urea. The spectral variations required to update the multivariate models for both short- and long-term drift were obtained using a single representative midpoint sample whose spectrum was repeatedly measured during collection of calibration data and during collection of separate validation sample spectra on three subsequent days. The performance of the PACLS/PLS model for maintaining a calibration was compared to PLS with subset recalibration, a method that has previously been applied to maintenance and transfer of calibration. Without drift corrections, both PACLS/PLS and PLS had poor predictive ability on sample spectra collected on subsequent days. Unlike previous maintenance of calibration studies that corrected for long-term drift only, the PACLS/PLS and PLS models demonstrated the best predictive abilities when short-term drift was also corrected. The PACLS/PLS hybrid model outperformed PLS with subset recalibration for near real-time predictions when instrument drift was determined from the repeat samples closest in time to the measurement of the unknown. Near real-time standard errors of prediction (SEPs) for the hybrid model were comparable to the cross-validated SEPs obtained with the original calibration model.
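
The idea of explicitly incorporating the spectral shape of unmodeled variation at prediction time can be illustrated with a plain CLS prediction step. This sketch covers only that augmentation idea, not the PACLS/PLS hybrid stage, and the names `pacls_predict`, `K_hat`, and `shapes` are hypothetical. In the study, such shapes were derived from repeat measurements of a midpoint sample; here they are simply passed in.

```python
import numpy as np

def pacls_predict(A_new, K_hat, shapes):
    """CLS prediction with the pure-component spectra augmented by extra shapes.

    A_new:  (n_samples, n_channels) spectra to predict.
    K_hat:  (n_analytes, n_channels) estimated pure-component spectra.
    shapes: (n_shapes, n_channels) spectral shapes of unmodeled variation,
            for example a drift difference spectrum.
    """
    K_hat = np.atleast_2d(K_hat)
    shapes = np.atleast_2d(shapes)
    K_aug = np.vstack([K_hat, shapes])
    coeffs = A_new @ np.linalg.pinv(K_aug)   # least-squares fit to all shapes
    return coeffs[:, :K_hat.shape[0]]        # drop the nuisance coefficients
```

Because the added shapes absorb the drift contribution in the least-squares fit, the analyte estimates are no longer biased by it, provided the shapes are not collinear with the analyte spectra.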


1995 ◽  
Vol 49 (4) ◽  
pp. 437-443 ◽  
Author(s):  
Jimmy Bak ◽  
Anders Larsen

Calibration spectra of CO in the 2.38–5100 ppm concentration range (22 spectra) have been measured with a spectral resolution of 4 cm−1, in the mid-IR (2186–2001 cm−1) region, with a Fourier transform infrared (FT-IR) instrument. The multivariate calibration method partial least-squares (PLS1) was used to model the CO calibration spectra in order to improve the sensitivity and to flag possible outliers in the prediction step. The relation between the absorbance values and concentrations was strongly nonlinear. This result was caused mainly by the low spectral resolution of the instrument. To improve the model predictions, we have linearized the data prior to making the model calculations. The linearization scheme presented here simplified the data pretreatment, because the function needed to linearize the data might be approximated by CO absorbance peak areas representing the concentrations. The integrated absorbance areas, rather than the concentration values, were used as input to the PLS algorithm. A fifth-order polynomial was used to calculate the concentrations from the predicted absorbance areas. The PLS algorithm used on the linearized data reduced the number of factors in the calibration model. Our results reveal that the calibration model based on the linearized data had a high concentration prediction accuracy throughout the entire concentration range.
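
The two pieces of the linearization scheme that the abstract names, integrated absorbance areas as the regression target and a fifth-order polynomial to convert predicted areas back to concentrations, can be sketched as below. The PLS step itself is omitted, and the function names and the trapezoidal integration rule are assumptions made for illustration.

```python
import numpy as np

def band_area(absorbance, axis_values):
    """Trapezoidal band area for each spectrum (one spectrum per row)."""
    widths = np.abs(np.diff(axis_values))  # works for ascending or descending axes
    heights = 0.5 * (absorbance[:, 1:] + absorbance[:, :-1])
    return (heights * widths).sum(axis=1)

def fit_area_to_concentration(areas, concentrations, order=5):
    """Fit a polynomial that maps band area back to concentration."""
    return np.polyfit(areas, concentrations, order)

def area_to_concentration(coeffs, areas):
    """Evaluate the fitted polynomial at the given (predicted) areas."""
    return np.polyval(coeffs, areas)
```

In use, the areas computed from the calibration spectra would replace the concentrations as the PLS response, and `area_to_concentration` would be applied to the PLS-predicted areas.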


2011 ◽  
Vol 8 (4) ◽  
pp. 1670-1679 ◽  
Author(s):  
Amir H. M. Sarrafi ◽  
Elahe Konoz ◽  
Maryam Ghiyasvand

Resolution of a binary mixture of atorvastatin (ATV) and amlodipine (AML) with minimum sample pretreatment and without analyte separation has been successfully achieved using a rapid method based on partial least squares analysis of UV spectral data. Multivariate calibration modeling procedures, namely traditional partial least squares (PLS-2), interval partial least squares (iPLS) and synergy interval partial least squares (siPLS), were applied to select a spectral range that provided the lowest prediction error in comparison to the full-spectrum model. The simultaneous determination of both analytes was possible by PLS processing of sample absorbance between 220 and 425 nm. For ATV and AML in synthetic mixtures, the correlation coefficients (R) were 0.9991 and 0.9958 and the root mean squared errors of cross validation (RMSECV) were 0.4538 and 0.2411, respectively, in the best siPLS models. The optimized method has been used for determination of ATV and AML in amostatin commercial tablets. The proposed method is simple, fast, and inexpensive, and does not need any separation or preparation steps.


Author(s):  
Margaretha Ohyver

A multivariate calibration model aims to predict measurements that are expensive to obtain from measurements that are cheap and easy to obtain. Several problems often occur in calibration modeling, among them multicollinearity. To overcome these problems, we used the partial least squares (PLS) method. The study applied the PLS method to gingerol data. A two-component model was obtained that explained 83.8032% of the variance in Y and 100% of the variance in X, with R2 = 83.8% and RMSE = 0.100891 for the calibration data and R2 = 84.2% and RMSEP = 0.199939 for the validation data.


RSC Advances ◽  
2014 ◽  
Vol 4 (77) ◽  
pp. 41039-41044 ◽  
Author(s):  
Shiva Mahramyari ◽  
Eslam Pourbasheer ◽  
Alireza Banaei ◽  
Mohammad Reza Ganjali ◽  
Parviz Norouzi

The simultaneous spectrophotometric determination of ceftazidime (CTZ) and sulbactam (SBT) in the presence of overlapping spectra was accomplished with the partial least squares (PLS) and genetic algorithm-partial least squares (GA-PLS) approaches.


Plant Methods ◽  
2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Jordi Ortuño ◽  
Sokratis Stergiadis ◽  
Anastasios Koidis ◽  
Jo Smith ◽  
Chris Humphrey ◽  
...  

Background: The presence of condensed tannins (CT) in tree fodders entails a series of productive, health and ecological benefits for ruminant nutrition. Current wet analytical methods employed for full CT characterisation are time- and resource-consuming, which limits their applicability to silvopastoral systems. The development of quick, safe and robust analytical techniques to monitor the full CT profile is crucial to properly understand CT variability and biological activity, which would help to develop efficient evidence-based decision-making to maximise CT-derived benefits. The present study investigates the suitability of Fourier-transform mid-infrared spectroscopy (MIR: 4000–550 cm−1) combined with multivariate analysis to determine CT concentration and structure (mean degree of polymerization, mDP; procyanidins:prodelphidins ratio, PC:PD; and cis:trans ratio) in oak, field maple and goat willow foliage, using HCl:Butanol:Acetone:Iron (HBAI) and thiolysis-HPLC as reference methods. Results: The MIR spectra obtained were first explored using Principal Component Analysis, and multivariate calibration models were then developed based on partial least-squares regression. MIR showed an excellent prediction capacity for the determination of PC:PD [coefficient of determination for prediction (R2P) = 0.96; ratio of prediction to deviation (RPD) = 5.26, range error ratio (RER) = 14.1] and cis:trans ratio (R2P = 0.95; RPD = 4.24; RER = 13.3); modest for CT quantification (HBAI: R2P = 0.92; RPD = 3.71; RER = 13.1; Thiolysis: R2P = 0.88; RPD = 2.80; RER = 11.5); and weak for mDP (R2P = 0.66; RPD = 1.86; RER = 7.16). Conclusions: MIR combined with chemometrics allowed rapid characterization of the full CT profile of tree foliage, which would help to better assess plant ecology variability and to improve the nutritional management of ruminant livestock.
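
The figures of merit quoted in this abstract (R2P, RPD, RER) can be computed from reference and predicted values as sketched below. The definitions used here are the ones conventionally seen in infrared chemometrics (RPD as the standard deviation of the reference values divided by the prediction error, RER as their range divided by the prediction error); the abstract does not state its exact formulas, so treat these as assumptions. The function name and the use of RMSEP rather than a bias-corrected SEP are likewise illustrative choices.

```python
import numpy as np

def prediction_metrics(y_ref, y_pred):
    """Return R2P, RMSEP, RPD and RER for a validation set."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_pred - y_ref
    rmsep = np.sqrt(np.mean(resid ** 2))
    r2p = 1.0 - np.sum(resid ** 2) / np.sum((y_ref - y_ref.mean()) ** 2)
    rpd = np.std(y_ref, ddof=1) / rmsep         # spread of references vs. error
    rer = (y_ref.max() - y_ref.min()) / rmsep   # range of references vs. error
    return {"R2P": r2p, "RMSEP": rmsep, "RPD": rpd, "RER": rer}
```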

