Infrared Spectral Search for Mixtures in Large-Size Libraries

1991 ◽  
Vol 45 (10) ◽  
pp. 1628-1632 ◽  
Author(s):  
Su-Chin Lo ◽  
Chris W. Brown

A routine for searching large spectral libraries with spectra of mixtures is presented. The dimensionality of a 3169-compound library is reduced to 12% of its original size by using Fourier transform compression and principal component analysis. A principal component regression is performed and used as a prefilter in selecting spectra having features (and chemical groups) similar to those of the unknown mixture. A dot-product metric is then used to identify a target component from the subgroup formed by the prefilter. This is followed by the application of an adaptive filter to remove the similarity of the target component from the subgroup and from the unknown mixture; the search is repeated on the modified data. Successive applications of the adaptive filter will produce minimum residuals if the correct identifications are made. Once the residuals are minimized, a similarity index is calculated to determine the closeness of the unknown mixture spectrum to a spectrum reconstructed from the library spectra. Four out of five two- and three-component spectra were correctly identified. One of the two components in the fifth mixture was correctly identified, and the residual values flagged the improper identification of the second component. After the adaptive filter was applied to the entire library, the second component was correctly identified. Results for this new algorithm are compared to those from four more traditional search routines, which were only completely successful on one of the unknown mixtures.
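The search-and-filter loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the adaptive filter, so an orthogonal projection is assumed in its place, and the Fourier/PCA compression and the PCR prefilter are left out. The function names (`best_match`, `remove_component`, `search_mixture`) are hypothetical.

```python
import numpy as np

def best_match(library, unknown):
    """Return the index of the library spectrum with the largest normalized dot product."""
    norms = np.linalg.norm(library, axis=1)
    # Rows already filtered down to (numerically) zero can never win the search.
    safe = np.where(norms > 1e-12, norms, np.inf)
    return int(np.argmax((library @ unknown) / safe))

def remove_component(spectra, target):
    """Project the direction of `target` out of every row of `spectra`.

    Assumption: an orthogonal projection stands in for the adaptive filter.
    """
    t = target / np.linalg.norm(target)
    return spectra - np.outer(spectra @ t, t)

def search_mixture(library, unknown, n_components):
    """Identify components one at a time, filtering each hit out before the next pass."""
    lib = np.asarray(library, dtype=float).copy()
    mix = np.asarray(unknown, dtype=float).copy()
    hits = []
    for _ in range(n_components):
        idx = best_match(lib, mix)
        hits.append(idx)
        target = lib[idx].copy()
        lib = remove_component(lib, target)
        mix = remove_component(mix[None, :], target)[0]
    # The norm of what is left of the mixture plays the role of the residual.
    return hits, float(np.linalg.norm(mix))
```

With a synthetic library of well-separated bands and a mixture built from two of them, the residual returned by `search_mixture` should fall to nearly zero once both components are found, mirroring the minimum-residual criterion in the abstract.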

1991 ◽  
Vol 45 (10) ◽  
pp. 1621-1627 ◽  
Author(s):  
Su-Chin Lo ◽  
Chris W. Brown

A new algorithm is presented for searching medium-size infrared spectral libraries for the components in spectra of mixtures. The algorithm treats the spectra in the library as an m-component quantitative analysis problem in which each of the library spectra represents a standard mixture having a concentration of 1.0 for that component. Principal component regression (PCR) is used to reduce the dimensionality of the problem and to provide the regression coefficients for determining pseudo-concentrations or composition indices (CI) in mixtures. The PCR analysis is followed by the application of an adaptive filter to remove all similarity of the first target component from the mixture and from a selected subgroup of the library. This is followed by a second PCR analysis on the modified spectral data to identify the next target compound. If the correct target components are selected with successive applications of the adaptive filter, the residuals will approach zero. All components in five two- and three-component mixtures were correctly identified by this new Mix-Match algorithm, whereas only two of the five mixtures were completely identified by a typical dot-product search routine.
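The idea of treating each library spectrum as a standard with concentration 1.0 can be illustrated with a short sketch. It assumes an uncentered SVD-based PCR and a hypothetical function name (`pcr_composition_indices`); the abstract does not give the preprocessing or the number of components retained, so those choices are assumptions.

```python
import numpy as np

def pcr_composition_indices(library, unknown, n_pc):
    """Composition indices of `unknown` against every library spectrum.

    library : (m, p) array, one spectrum per row; its concentration matrix is
              taken to be the m x m identity (each spectrum is 1.0 of itself).
    unknown : (p,) mixture spectrum.
    n_pc    : number of principal components kept (assumed to be chosen by the user).
    """
    U, s, Vt = np.linalg.svd(library, full_matrices=False)
    U, s, Vt = U[:, :n_pc], s[:n_pc], Vt[:n_pc]
    # Scores of the unknown on the retained loadings.
    scores = Vt @ unknown
    # Regressing the identity matrix on the library scores T = U * s gives
    # coefficients diag(1/s) U^T, so the predicted indices are U (scores / s).
    return U @ (scores / s)
```

When all components are kept and the library rows are linearly independent, a noise-free mixture such as 0.6 of one spectrum plus 0.4 of another returns indices of 0.6 and 0.4 for those two entries and approximately zero elsewhere.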


1987 ◽  
Vol 41 (3) ◽  
pp. 449-453 ◽  
Author(s):  
P. B. Harrington ◽  
T. L. Isenhour

Different methods of data preprocessing were evaluated for the compression of Fourier transform-infrared spectral libraries by principal component analysis (PCA). The effect of noise on compressed library searches was examined. A PCA compression of an infrared library achieved an 81% reduction in size without any loss in search performance.
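A rough sketch of PCA library compression is given below. It assumes mean-centering as the only preprocessing step (the abstract compares several preprocessing methods without naming them here), and the helper names are hypothetical. `stored_fraction` simply counts stored numbers; it is not the authors' measure of the 81% reduction.

```python
import numpy as np

def pca_compress(library, n_pc):
    """Replace each spectrum by its scores on the first `n_pc` principal components."""
    mean = library.mean(axis=0)
    centered = library - mean
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    loadings = Vt[:n_pc]                 # (n_pc, n_points)
    scores = centered @ loadings.T       # (n_spectra, n_pc)
    return mean, loadings, scores

def pca_reconstruct(mean, loadings, scores):
    """Approximate the original spectra from the compressed representation."""
    return mean + scores @ loadings

def stored_fraction(n_spectra, n_points, n_pc):
    """Numbers kept after compression (scores, loadings, mean) relative to the raw library."""
    return (n_spectra * n_pc + n_pc * n_points + n_points) / (n_spectra * n_points)
```

Searching can then be done on the score vectors instead of the full spectra, which is where the saving in storage and comparison cost comes from.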


1992 ◽  
Vol 46 (5) ◽  
pp. 790-796 ◽  
Author(s):  
Su-Chin Lo ◽  
Chris W. Brown

A mathematical technique for the identification of components in the near-infrared spectra of liquid mixtures without any prior chemical information is demonstrated. Originally, the technique was developed for searching mid-infrared spectral libraries. It utilizes principal component analysis to generate an orthonormal reference library and to compute the projections or scores of a mixture spectrum onto the principal space spanned by the orthonormal set. Both library and mixture spectra are analyzed and processed in the Fourier domain to enhance the searching performance. A calibration matrix is calculated from library scores and is used to predict the mixture composition. Five liquid mixtures were correctly identified with the use of the calibration algorithm, whereas only one mixture was correctly characterized with a straight dot-product metric. The predictions were verified with the use of an adaptive filter to remove each of the resulting components from the library and the mixture spectra. In addition, a similarity index between the original mixture spectrum and a regenerated mixture spectrum is used as a final confirmation of the predictions. The effects of random noise on the searching method were also examined, and further enhancements of searching performance are suggested for identifying poor-quality mixture spectra.
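The final confirmation step, comparing the original mixture with a regenerated spectrum, might look like the sketch below. The abstract does not define the similarity index, so the cosine between the two spectra is assumed here, and the regeneration is assumed to be an ordinary least-squares fit of the identified components. Both function names are hypothetical.

```python
import numpy as np

def reconstruct(mixture, components):
    """Least-squares fit of the mixture as a weighted sum of the identified library spectra.

    components : (k, p) array holding the k identified spectra.
    Returns the fitted weights and the regenerated mixture spectrum.
    """
    weights, *_ = np.linalg.lstsq(components.T, mixture, rcond=None)
    return weights, weights @ components

def similarity_index(a, b):
    """Cosine of the angle between two spectra (an assumed definition)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A similarity close to 1 indicates that the identified components account for the mixture spectrum; a clearly lower value suggests a missing or wrong component.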


2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Khairunnisa Khairunnisa ◽  
Rizka Pitri ◽  
Victor P Butar-Butar ◽  
Agus M Soleh

This research used CFSRv2 data as general circulation model output. CFSRv2 involves several highly correlated variables, so principal component regression (PCR) and partial least squares (PLS) were used to address the multicollinearity in the CFSRv2 data. The research aims to determine which of PCR and PLS is the better model for estimating rainfall at Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station by comparing RMSEP and correlation values. The domain sizes used were 3×3, 4×4, 5×5, 6×6, 7×7, 8×8, 9×9, and 11×11, located between (-4°) N - (-9°) S and 105° E - 110° E with a grid size of 0.5×0.5. PLS was the better model for statistical downscaling in this research because it obtained the lower RMSEP value and the higher correlation value. The best domain and RMSEP value for Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station are 9 × 9 with 100.06, 6 × 6 with 194.3, 8 × 8 with 117.6, and 6 × 6 with 108.2, respectively.
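The two comparison criteria named above, RMSEP and correlation, can be computed as in the following sketch. The helper names are hypothetical, and `better_model` applies only the RMSEP criterion; it illustrates the comparison and is not the study's code.

```python
import numpy as np

def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def correlation(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])

def better_model(observed, predictions):
    """Return the name of the model with the lowest RMSEP.

    predictions : dict mapping a model name (e.g. "PCR", "PLS") to its predicted values.
    """
    return min(predictions, key=lambda name: rmsep(observed, predictions[name]))
```

In practice the same comparison would be repeated for each station and each domain size, keeping the domain with the lowest RMSEP.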


2007 ◽  
Vol 90 (2) ◽  
pp. 391-404 ◽  
Author(s):  
Fadia H Metwally ◽  
Yasser S El-Saharty ◽  
Mohamed Refaat ◽  
Sonia Z El-Khateeb

New selective, precise, and accurate methods are described for the determination of a ternary mixture containing drotaverine hydrochloride (I), caffeine (II), and paracetamol (III). The first method uses first (D1) and third (D3) derivative spectrophotometry at 331 and 315 nm for the determination of (I) and (III), respectively, without interference from (II). The second method depends on the simultaneous use of the first derivative of the ratio spectra (DD1) with measurement at 312.4 nm for determination of (I) using the spectrum of 40 μg/mL (III) as a divisor, or measurement at 286.4 and 304 nm after using the spectrum of 4 μg/mL (I) as a divisor for the determination of (II) and (III), respectively. In the third method, the predictive abilities of classical least-squares, principal component regression, and partial least-squares were examined for the simultaneous determination of the ternary mixture. The last method depends on thin-layer chromatography-densitometry after separation of the mixture on silica gel plates using ethyl acetate-chloroform-methanol (16 + 3 + 1, v/v/v) as the mobile phase. The spots were scanned at 281, 272, and 248 nm for the determination of (I), (II), and (III), respectively. Regression analysis showed good correlation in the selected ranges with excellent percentage recoveries. The chemical variables affecting the analytical performance of the methodology were studied and optimized. The methods showed no significant interferences from excipients. Intraday and interday assay precision and accuracy values were within regulatory limits. The suggested procedures were checked using laboratory-prepared mixtures and were successfully applied for the analysis of their pharmaceutical preparations. The validity of the proposed methods was further assessed by applying a standard addition technique. The results obtained by applying the proposed methods were statistically analyzed and compared with those obtained by the manufacturer's method.
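Of the multivariate calibrations mentioned above, classical least-squares (CLS) is the simplest to write down. The sketch below assumes the usual Beer's-law model A = C K (absorbance equals concentrations times pure-component spectra) and uses hypothetical function names; the wavelength range and calibration design of the study are not reproduced.

```python
import numpy as np

def cls_calibrate(concentrations, absorbances):
    """Estimate the pure-component spectra K from A = C K by least squares.

    concentrations : (n_samples, n_analytes) calibration concentrations C.
    absorbances    : (n_samples, n_wavelengths) calibration spectra A.
    """
    K, *_ = np.linalg.lstsq(concentrations, absorbances, rcond=None)
    return K

def cls_predict(K, spectrum):
    """Estimate the analyte concentrations c of one spectrum a from a = c K."""
    c, *_ = np.linalg.lstsq(K.T, spectrum, rcond=None)
    return c
```

For a ternary mixture, `concentrations` would have three columns, one per analyte, and `cls_predict` returns the three estimated concentrations for a new spectrum.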


2021 ◽  
pp. 1471082X2110229
Author(s):  
Mikis D. Stasinopoulos ◽  
Robert A. Rigby ◽  
Nikolaos Georgikopoulos ◽  
Fernanda De Bastiani

A solution to the problem of having to deal with a large number of interrelated explanatory variables within a generalized additive model for location, scale and shape (GAMLSS) is given here using as an example the Greek–German government bond yield spreads from 25 April 2005 to 31 March 2010. Those were turbulent financial years, and in order to capture the spreads behaviour, a model has to be able to deal with the complex nature of the financial indicators used to predict the spreads. Fitting a model, using principal components regression of both main and first order interaction terms, for all the parameters of the assumed distribution of the response variable seems to produce promising results.


2021 ◽  
Vol 19 (1) ◽  
pp. 205-213
Author(s):  
Hany W. Darwish ◽  
Abdulrahman A. Al Majed ◽  
Ibrahim A. Al-Suwaidan ◽  
Ibrahim A. Darwish ◽  
Ahmed H. Bakheit ◽  
...  

Five chemometric methods were established for the simultaneous determination of azilsartan medoxomil (AZM) and chlorthalidone in the presence of azilsartan, which is the core impurity of AZM. The full spectrum-based chemometric techniques, namely partial least squares (PLS), principal component regression, and artificial neural networks (ANN), were among the applied methods. The other two methods were ANN and PLS extended with a genetic algorithm (GA) as a wavelength selection procedure (GA-ANN and GA-PLS). The models were developed by applying a multilevel multifactor experimental design. The predictive power of the suggested models was evaluated through a validation set containing nine mixtures with different ratios of the three analytes. All the proposed procedures were applied to the analysis of Edarbyclor® tablets, and the best results were achieved with the ANN, GA-ANN, and GA-PLS methods. These three methods proved suitable as quantitative tools for the analysis of the three components without any interference from the co-formulated excipient and without prior separation procedures. Moreover, the impact of GA on strengthening the predictive power of ANN- and PLS-based models was also highlighted.


Molecules ◽  
2021 ◽  
Vol 26 (6) ◽  
pp. 1546
Author(s):  
Ioanna Dagla ◽  
Anthony Tsarbopoulos ◽  
Evagelos Gikas

Colistimethate sodium (CMS) is widely administered for the treatment of life-threatening infections caused by multidrug-resistant Gram-negative bacteria. Until now, the quality control of CMS formulations has been based on microbiological assays. Herein, an ultra-high-performance liquid chromatography methodology coupled to an ultraviolet detector was developed for the quantitation of CMS in injectable formulations. Design of experiments was used to optimize the chromatographic parameters. The chromatographic separation was achieved using a Waters Acquity BEH C8 column employing gradient elution with a mobile phase consisting of (A) 0.001 M aq. ammonium formate and (B) methanol/acetonitrile 79/21 (v/v). CMS compounds were detected at 214 nm. In all, 23 univariate linear-regression models were constructed to measure CMS compounds separately, and one partial least-square regression (PLSr) model was constructed to assess the total CMS amount in formulations. The method was validated over the range 100–220 μg mL−1. The developed methodology was employed to analyze several batches of CMS injectable formulations, which were also compared against a reference batch employing principal component analysis, similarity and distance measures, heatmaps, and the structural similarity index. The methodology was based on freely available software in order to be readily available for the pharmaceutical industry.

