Reduction of the Number of Samples for Cost-Effective Hyperspectral Grape Quality Predictive Models

Foods ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 233
Author(s):  
Julio Nogales-Bueno ◽  
Francisco José Rodríguez-Pulido ◽  
Berta Baca-Bocanegra ◽  
Dolores Pérez-Marin ◽  
Francisco José Heredia ◽  
...  

Developing chemometric models from near-infrared (NIR) spectra requires a calibration set that is representative of the entire population. The calibration procedure therefore generally demands considerable resources, and there is great interest in identifying the most spectrally representative samples within a large population set. In this study, principal component analysis and hierarchical clustering analysis were compared for their ability to provide representative calibration sets. The calibration sets generated were used to control the technological maturity of grapes and the total phenolic compounds of grape skins in red and white cultivars. Finally, the accuracy and precision of the models built from the calibration sets produced by the selection algorithms were compared with each other and with those of models built from the whole set of samples, using an external validation set. Most of the standard errors of prediction (SEP) in external validation obtained from the reduced data sets were not significantly different from those obtained using the whole data set. Moreover, sample subsets resulting from hierarchical clustering analysis appear to produce slightly better results.
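One way to picture the sample-reduction idea in this abstract is the minimal sketch below. It is not the authors' procedure: the abstract does not say which linkage, distance, or selection rule was used, so average linkage, Euclidean distance, and "pick the sample closest to each cluster centroid" are all assumptions, as are the function names and the toy spectra.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two spectra (assumed metric).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(rows):
    # Column-wise mean of a list of equal-length spectra.
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def agglomerative_clusters(spectra, n_clusters):
    # Naive average-linkage agglomerative clustering (assumed linkage).
    # Returns a list of clusters, each a list of sample indices.
    clusters = [[i] for i in range(len(spectra))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(
                    euclidean(spectra[i], spectra[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

def select_representatives(spectra, n_clusters):
    # Keep one sample per cluster: the one nearest the cluster centroid.
    chosen = []
    for members in agglomerative_clusters(spectra, n_clusters):
        centroid = mean_vector([spectra[i] for i in members])
        chosen.append(min(members, key=lambda i: euclidean(spectra[i], centroid)))
    return sorted(chosen)

# Toy "spectra" (illustrative values only, not data from the study).
spectra = [
    [0.10, 0.20, 0.30],
    [0.11, 0.21, 0.29],
    [0.12, 0.19, 0.31],
    [0.80, 0.70, 0.90],
    [0.82, 0.71, 0.88],
    [0.50, 0.10, 0.05],
]
calibration_idx = select_representatives(spectra, n_clusters=3)
```

Only the samples in `calibration_idx` would then need reference analysis, which is where the cost saving would come from. A PCA-based variant would run the same selection on score vectors instead of raw spectra.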

2012 ◽  
Vol 80 (1) ◽  
pp. 103-112 ◽  
Author(s):  
Andreas Melfsen ◽  
Eberhard Hartung ◽  
Angelika Haeussermann

The robustness of in-line raw milk analysis with near-infrared spectroscopy (NIRS) was tested with respect to the prediction of the fat, protein and lactose contents of raw milk. Near-infrared (NIR) spectra of raw milk (n = 3119) were acquired on three different farms during 354 milkings over a period of six months. Calibration models were calculated for: a random data set of each farm (fully random internal calibration); the first two thirds of the visits per farm (internal calibration); the whole datasets of two of the three farms (external calibration); and combinations of external and internal datasets. Validation was done either on the remaining data set per farm (internal validation) or on data of the remaining farms (external validation). Excellent calibration results were obtained when fully randomised internal calibration sets were used for milk analysis. In this case, RPD values of around ten, five and three for the prediction of fat, protein and lactose content, respectively, were achieved. Farm internal calibrations achieved much poorer prediction results, especially for the prediction of protein and lactose, with RPD values of around two and one, respectively. The prediction accuracy improved when validation was done on spectra of an external farm, mainly due to the higher sample variation in external calibration sets in terms of feeding diets and individual cow effects. The results showed that further improvements were achieved when additional farm information was added to the calibration set. One of the main requirements for a robust calibration model is the ability to predict milk constituents in unknown future milk samples. The robustness and quality of prediction increase with increasing variation of, e.g., feeding and individual cow milk composition in the calibration model.
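The three calibration/validation schemes named in this abstract can be sketched as below. This is only an illustration of the splitting logic: the function names, the data layout, and the calibration fraction used for the random split are assumptions, since the abstract does not state them.

```python
import random

def first_two_thirds_split(visits):
    # Internal calibration: the first two thirds of a farm's visits
    # (in chronological order) calibrate, the remainder validates.
    cut = (2 * len(visits)) // 3
    return visits[:cut], visits[cut:]

def random_split(samples, calibration_fraction, seed=0):
    # Fully random internal calibration. The fraction is an assumed
    # parameter; the abstract does not give the proportion used.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(calibration_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def leave_farm_out(samples_by_farm, validation_farm):
    # External calibration: calibrate on all other farms,
    # validate on the farm that was held out.
    calibration = [
        s
        for farm, samples in samples_by_farm.items()
        if farm != validation_farm
        for s in samples
    ]
    return calibration, list(samples_by_farm[validation_farm])

# Hypothetical sample identifiers, for illustration only.
farms = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2"], "C": ["c1", "c2", "c3"]}
cal_ext, val_ext = leave_farm_out(farms, "C")
cal_int, val_int = first_two_thirds_split(list(range(9)))
cal_rnd, val_rnd = random_split(list(range(9)), 2 / 3)
```

The random split mixes samples from every visit into both sets, while the chronological and leave-farm-out splits force the model to predict conditions it has not seen.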


2017 ◽  
Vol 72 (2) ◽  
pp. 288-296 ◽  
Author(s):  
Michał Kwaśniewicz ◽  
Mirosław A. Czarnecki

The effect of chain length on the mid-infrared (MIR) and near-infrared (NIR) spectra of aliphatic 1-alcohols from methanol to 1-decanol was examined in detail. Of particular interest were the spectra-structure correlations in the NIR region and the correlation between the MIR and NIR spectra of 1-alcohols. Two-dimensional correlation analysis (2D-COS) and chemometric methods provided comprehensive information on spectral changes in the data set. Principal component analysis (PCA) and cluster analysis showed that the spectra of methanol, ethanol, and 1-propanol are noticeably different from the spectra of higher 1-alcohols. The similarity between the spectra increases with chain length; hence, the most similar are the spectra of 1-nonanol and 1-decanol. Two-dimensional hetero-correlation analysis is very helpful for identifying the origin of bands and may guide selection of the best spectral ranges for chemometric analysis. As shown, normalization of the spectra accentuates the intensity changes in various spectral regions and provides information not accessible from the raw data. The spectra of alcohols cannot be represented as a sum of the CH3, CH2, and OH group spectra, since the OH group is involved in hydrogen bonding. As a result, the spectral changes of this group are nonlinear and its spectral profile cannot be properly resolved. Finally, this work provides substantial evidence that the degree of self-association of 1-alcohols decreases with increasing chain length because of the growing importance of hydrophobic interactions. For butyl alcohol and higher 1-alcohols, the hydrophobic interactions are more important than the OH···OH interactions. Therefore, methanol, ethanol, and 1-propanol have unlimited miscibility with water, whereas 1-butanol and higher 1-alcohols have limited miscibility with water.


Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2600 ◽  
Author(s):  
Hui Xiao ◽  
Li Feng ◽  
Dajie Song ◽  
Kang Tu ◽  
Jing Peng ◽  
...  

The potential of visible-near infrared (vis/NIR) spectroscopy (400 nm to 1100 nm) for classifying grape berries on the basis of multiple internal quality parameters was investigated. Stored Vitis vinifera L. cv. Manicure Finger and Vitis vinifera L. cv. Ugni Blanc grape berries were separated into three classes based on the distribution of total soluble solid content (SSC) and total phenolic compounds (TP). Partial least squares regression (PLS) was applied to predict the quality parameters, including CIELAB color space values, SSC, and TP. The prediction results showed that the vis/NIR spectrum correlated with the SSC and TP present in the intact grape berries, with a determination coefficient of prediction (Rp²) in the range of 0.735 to 0.823. Next, the vis/NIR spectrum was used to distinguish between berries with different SSC and TP concentrations using partial least squares discriminant analysis (PLS-DA) with >77% accuracy. This study provides a method to identify stored grape quality classes based on spectroscopy and the distributions of multiple internal quality parameters.


2017 ◽  
Vol 25 (5) ◽  
pp. 324-329 ◽  
Author(s):  
Li Dan ◽  
Wu Yi-Hui

The aim of this research was to investigate the feasibility of Fourier transform near infrared spectroscopy combined with chemometric analysis as a rapid method for identifying different resin types that had been deemed similar by a preliminary visual examination. Principal component analysis was applied to spectral data to classify two types of epoxy resin samples and three types of phenolic resin samples. A total of 215 samples were used for the evaluation and validation of two types of epoxy resin samples (SY1342 and SY1346) and three types of phenolic resin samples (Y3567, Y2705 and Y2137). All were correctly differentiated by their respective models. Moreover, in the external validation, the rate of correctly classified samples was also 100%. Such classifications are very important for the detection of adulterated samples and for quality control. Near infrared spectroscopy was shown to be a very reliable, accurate and useful tool for classifying resin samples in a fast, clean and inexpensive way compared to classical analysis, and it will enable copper clad laminate manufacturers to detect problems and take early corrective actions that will ultimately save time and money while establishing uniform quality.


2003 ◽  
Vol 11 (1) ◽  
pp. 55-70 ◽  
Author(s):  
Laila Stordrange ◽  
Olav M. Kvalheim ◽  
Per A. Hassel ◽  
Dick Malthe-Sørenssen ◽  
Fred Olav Libnau

Partial least squares (PLS) is a powerful tool for multivariate linear regression. But what if the data show a non-linear structure? Near infrared spectra from a pharmaceutical process were used as a case study. An ANOVA test revealed that the data are well described by a 2nd order polynomial. This work investigates the application of regression techniques that account for slightly non-linear data. The regression techniques investigated are: linearising data by applying transformations, local PLS, i.e. splitting of data, and quadratic PLS. These models were compared with ordinary PLS and principal component regression (PCR). The predictive ability of the models was tested on an independent data set acquired a year later. Using the knowledge of non-linear pattern and important spectral regions, simpler models with better predictive ability can be obtained.


Antioxidants ◽  
2019 ◽  
Vol 8 (7) ◽  
pp. 203 ◽  
Author(s):  
Ziaul Hasan Rana ◽  
Mohammad Khairul Alam ◽  
Mohammad Akhtaruzzaman

Wild plants are considered the richest source of essential nutrients and other beneficial phytochemicals. Hence, the objective of this study was to evaluate the nutritional composition and the antioxidant and α-amylase inhibition activities of leaves and roots of selected Bangladeshi wild plants. These wild plants were found to have high fiber (13.78–22.26 g/100 g), protein (7.08–21.56 g/100 g) and ash (8.21–21.43 g/100 g) contents. The total phenolic and total flavonoid contents were significantly higher in the leaves than in the roots. Additionally, antioxidant activity was evaluated using ferric-reducing antioxidant power, 2,2-diphenyl-1-picrylhydrazyl radical (DPPH) and trolox equivalent antioxidant capacity assays and was strongly correlated with phenolic compounds. The leaf extracts of the selected plants also exhibited potent α-amylase inhibition (~71%), significantly higher than that of their root counterparts. Thus, the study concluded that the investigated plants were good sources of fiber, protein, minerals, natural antioxidant compounds and α-amylase inhibitors, and that their increased intake could provide health benefits. The principal component analysis (PCA) of the analyzed variables divided the samples into three clear groups, and the first two principal components accounted for 86.05% of the total data set variance.


2012 ◽  
Vol 95 (4) ◽  
pp. 1035-1042
Author(s):  
Alaa El-Gindy ◽  
Khalid Abdel-Salam Attia ◽  
Mohammad Wafaa Nassar ◽  
Hamed Hamed Abu Seda ◽  
Maisra Al-Shabrawi

Reflectance near-IR (RNIR) spectroscopy was used for the simultaneous determination of chondroitin (CH), glucosamine (GO), and methyl sulfonyl methane (MSM) in tablets. Simple sample preparation was done by grinding, sieving, and compression of the tablets for improving RNIR spectra. Principal component regression and partial least squares (PLS-1 and PLS-2) were successfully applied to quantify the three components in the studied mixture using information included in RNIR spectra in the range of 4350–9100 cm⁻¹. The calibration model was developed with drug concentration ranges of 14.5–44.2% (w/w) for CH, 18.4–55.3% (w/w) for GO, and 6–18.6% (w/w) for MSM with addition of tablet excipients to the calibration set in the same ratio as in the tested tablets. The calibration models were evaluated by internal validation, cross-validation, and external validation using synthetic and pharmaceutical preparations. The proposed method was applied for analysis of six batches of the pharmaceutical product. The results of the proposed method were compared with the results of the pharmacopoeial method for the same batch of the pharmaceutical product. No significant differences between the results were found. The RNIR method is accurate and precise, and can be used for QC of pharmaceutical products.


1997 ◽  
Vol 51 (12) ◽  
pp. 1814-1822 ◽  
Author(s):  
Gregory A. Bakken ◽  
Dixie R. Long ◽  
John H. Kalivas

In analytical chemistry, principal component regression (PCR) is widely used as a method for calibration and prediction. The motivation behind PCR is to select factors associated with predictive information and eliminate those associated with noise. The classical approach, referred to as top-down selection, chooses sequential factors based on singular value magnitudes, and the same factors are used for all future unknown samples; i.e., a global model is formed. The number of factors needed is often determined through cross-validation on the calibration samples or with an external validation set. Alternatively, a model developed specific to an unknown sample, i.e., a local model or sample-dependent model, could offer improved accuracy. The idea behind sample-dependent PCR is that factors associated with small singular values not included in a top-down PCR model can still contain relevant predictive information. This paper shows that local models generated by selecting factors on a sample-by-sample basis often reduce prediction errors compared with those for the global top-down model. However, evidence is also provided that supports the use of global top-down models. Several criteria are proposed and examined for selecting factors on a sample-dependent basis. Observations and conclusions presented are based on two near-infrared data sets.
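The difference between the global top-down model and a sample-dependent model described in this abstract can be written compactly as follows. The notation is an assumption introduced here for illustration (mean-centred spectra X, reference values y, singular triplets σ_i, u_i, v_i); it is the textbook form of PCR, not a formula quoted from the paper.

```latex
% Singular value decomposition of the centred calibration spectra
\[
  X = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\mathsf T},
  \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0 .
\]
% PCR regression vector built from a chosen factor set S
\[
  \hat b_S = \sum_{i \in S} \sigma_i^{-1} \, v_i \, \bigl(u_i^{\mathsf T} y\bigr),
  \qquad
  \hat y_{\mathrm{unk}} = x_{\mathrm{unk}}^{\mathsf T} \hat b_S .
\]
% Global top-down model: the same leading k factors for every unknown
\[
  S_{\text{top-down}} = \{1, 2, \dots, k\} .
\]
% Sample-dependent (local) model: the factor set may differ per unknown
% and may include factors with small singular values
\[
  S_{\text{local}} = S(x_{\mathrm{unk}}) \subseteq \{1, 2, \dots, r\} .
\]
```

In this form the two approaches differ only in how the index set S is chosen. The criteria used to pick S for each unknown sample are the subject of the paper and are not reproduced here.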


Molecules ◽  
2018 ◽  
Vol 23 (12) ◽  
pp. 3191 ◽  
Author(s):  
Eva Toledo-Martín ◽  
María García-García ◽  
Rafael Font ◽  
José Moreno-Rojas ◽  
María Salinas-Navarro ◽  
...  

A rapid method to quantify the total phenolic content (TPC) and total carotenoid content (TCC) in blackberries using near infrared spectroscopy (NIRS) was developed with the aim of reducing analysis time and cost for the food industry. A total of 106 samples were analysed using the Folin-Ciocalteu method for TPC and an ultraviolet-visible spectrometry method for TCC. The average contents found for TPC and TCC were 24.27 mg·g⁻¹ dw and 8.30 µg·g⁻¹ dw, respectively. Modified partial least squares (MPLS) regression was used to obtain the calibration models for these compounds. The RPD values from external validation (RPD is the ratio of the standard deviation of the reference data to the standard error of prediction, SEP) for both TPC and TCC were in the range 1.5 < RPDp < 2.5, and the RER values (ratio of the range in the reference data to SEP) were 5.92 for TPC and 8.63 for TCC. These values showed that both equations were suitable for screening purposes. MPLS loading plots showed a high contribution of sugars, chlorophyll, lipids and cellulose to the modelling of the prediction equations.
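The two figures of merit defined in this abstract are simple ratios, as the sketch below shows. The abstract does not give the SEP formula, so the common bias-corrected definition is assumed here; the function names and the numbers are illustrative and are not data from the study.

```python
import math
import statistics

def sep(reference, predicted):
    # Standard error of prediction, assumed bias-corrected:
    # the sample standard deviation of the residuals.
    residuals = [p - r for r, p in zip(reference, predicted)]
    bias = sum(residuals) / len(residuals)
    return math.sqrt(
        sum((e - bias) ** 2 for e in residuals) / (len(residuals) - 1)
    )

def rpd(reference, predicted):
    # Ratio of the standard deviation of the reference data to SEP.
    return statistics.stdev(reference) / sep(reference, predicted)

def rer(reference, predicted):
    # Ratio of the range of the reference data to SEP.
    return (max(reference) - min(reference)) / sep(reference, predicted)

# Made-up validation values, for illustration only.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 11.0, 15.0, 15.0, 19.0]
rpd_value = rpd(reference, predicted)
rer_value = rer(reference, predicted)
```

Both ratios grow as SEP shrinks relative to the spread of the reference data, so they describe how well a model resolves differences between samples, not only how small its error is.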


2018 ◽  
Vol 11 (02) ◽  
pp. 1850005 ◽  
Author(s):  
Lijun Yao ◽  
Weiqun Xu ◽  
Tao Pan ◽  
Jiemei Chen

The moving-window bis-correlation coefficients (MW-BiCC) method was proposed and employed for the discriminant analysis of transgenic sugarcane leaves and [Formula: see text]-thalassemia with visible and near-infrared (Vis–NIR) spectroscopy. The well-performing moving-window principal component analysis linear discriminant analysis (MW-PCA–LDA) was also conducted for comparison. A total of 306 transgenic (positive) and 150 nontransgenic (negative) leaf samples of sugarcane were collected and divided into calibration, prediction, and validation sets. The diffuse reflection spectra were corrected using Savitzky–Golay (SG) smoothing with first-order derivative ([Formula: see text]), third-degree polynomial ([Formula: see text]) and 25 smoothing points ([Formula: see text]). The waveband selected with MW-BiCC was 736–1054[Formula: see text]nm, and the positive and negative validation recognition rates ([Formula: see text]_REC[Formula: see text], [Formula: see text]_REC[Formula: see text]) were 100% and 98.0%, respectively, which achieved the same effect as MW-PCA–LDA. In a second example, 93 [Formula: see text]-thalassemia (positive) and 148 nonthalassemia (negative) human hemolytic samples were collected. The transmission spectra were corrected using SG smoothing with [Formula: see text], [Formula: see text] and [Formula: see text]. Using MW-BiCC, several optimal wavebands were selected (e.g., 1116–1146, 1794–1848 and 2284–2342[Formula: see text]nm). The [Formula: see text]_REC[Formula: see text] and [Formula: see text]_REC[Formula: see text] were both 100%, which achieved the same effect as MW-PCA–LDA. Importantly, BiCC only requires calculating correlation coefficients between the spectrum of the prediction sample and the average spectra of the two types of calibration samples. Thus, BiCC is algorithmically very simple and is expected to find wider application. The results first confirmed the feasibility of distinguishing [Formula: see text]-thalassemia and normal control samples by NIR spectroscopy, and provided a promising simple tool for large population thalassemia screening.
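The core BiCC computation, as the abstract describes it, needs only two correlation coefficients per unknown sample. The sketch below illustrates that step under stated assumptions: Pearson correlation is used, the sample is assigned to the class whose average spectrum correlates more strongly (the abstract does not spell out the decision rule), and the moving-window waveband search and SG preprocessing are omitted. All names and values are hypothetical.

```python
import math

def pearson(a, b):
    # Pearson correlation coefficient between two equal-length spectra.
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def mean_spectrum(spectra):
    # Average spectrum of one class of calibration samples.
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]

def bicc_classify(spectrum, positive_cal, negative_cal):
    # Correlate the unknown with each class average and pick the larger
    # coefficient (assumed decision rule).
    r_pos = pearson(spectrum, mean_spectrum(positive_cal))
    r_neg = pearson(spectrum, mean_spectrum(negative_cal))
    label = "positive" if r_pos >= r_neg else "negative"
    return label, r_pos, r_neg

# Toy calibration spectra within one waveband (illustrative values only).
positive_cal = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.2]]
negative_cal = [[4.0, 3.0, 2.0, 1.0], [4.2, 2.9, 2.1, 1.0]]

label_a, _, _ = bicc_classify([0.9, 2.2, 3.1, 3.9], positive_cal, negative_cal)
label_b, _, _ = bicc_classify([3.8, 3.1, 1.9, 1.2], positive_cal, negative_cal)
```

A moving-window version would repeat this over candidate wavebands and keep the band with the best recognition rates on the prediction set.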

