A new and efficient variable selection algorithm based on ant colony optimization. Applications to near infrared spectroscopy/partial least-squares analysis

2011 ◽  
Vol 699 (1) ◽  
pp. 18-25 ◽  
Author(s):  
Franco Allegrini ◽  
Alejandro C. Olivieri
2018 ◽  
Vol 26 (2) ◽  
pp. 95-100 ◽  
Author(s):  
Yanjie Li ◽  
Wenhao Shao ◽  
Ruxiang Dong ◽  
Jingmin Jiang ◽  
Songfeng Diao

In this study, near infrared spectroscopy was shown to determine the saponin content of soapnut fruits rapidly. Partial least squares analysis, combined with pre-processing methods and significance multivariate correlation variable selection, was used to develop a calibration model for saponin content. First-derivative pre-processing yielded the best partial least squares calibration models for spectra of both the surface of dried fruits and the powder of dry seeded fruits, with root mean square error of calibration values of 0.85% and 0.59%, respectively; the surface model was thus less accurate than the powder model. However, when significance multivariate correlation variable selection was applied to select the best variables from the spectra, the partial least squares models for surface and powder samples became similar, with higher R² values (0.84 and 0.90) and lower root mean square error of calibration values (0.23% and 0.39%). These results suggest that near infrared spectroscopy could be a promising and rapid method for predicting the saponin content of soapnut fruits without grinding them into powder.
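As a rough illustration of two quantities named in this abstract, the sketch below computes a first-derivative transform of spectra and a root mean square error. It is a minimal sketch under stated assumptions: the derivative is a plain finite difference (the abstract does not say which derivative filter was used), and the function names `first_derivative` and `rmse` are hypothetical, not taken from the paper.

```python
import numpy as np


def first_derivative(spectra, step=1.0):
    """Finite-difference first derivative along the wavelength axis.

    Assumption: rows are samples, columns are equally spaced wavelengths.
    """
    spectra = np.asarray(spectra, dtype=float)
    return np.gradient(spectra, step, axis=1)


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

Applied to calibration samples, `rmse` corresponds to a root mean square error of calibration; the regression model that produces the predictions is outside the scope of this sketch.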


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Lianqing Zhu ◽  
Haitao Chang ◽  
Qun Zhou ◽  
Zhongyu Wang

To improve the classification accuracy of Chinese Salvia miltiorrhiza using near-infrared spectroscopy, a novel local variable selection strategy is proposed. Combining the strengths of the local algorithm and interval partial least squares, the method first divides the spectral data into pairs of classes in the sample direction and into equidistant subintervals in the variable direction. For each pair of classes, a local classification model is then built, and the most suitable spectral region is selected using a new evaluation criterion that considers both classification error rate and predictive ability under leave-one-out cross validation. Finally, each observation is assigned to a class according to a statistical analysis of the classification results of the local models built on the selected variables. The performance of the proposed method was demonstrated on near-infrared spectra of cultivated or wild Salvia miltiorrhiza collected from 8 geographical origins in 5 provinces of China. For comparison, soft independent modelling of class analogy and partial least squares discriminant analysis were each employed as the classification model. Experimental results showed that the classification model with local variable selection performed clearly better than the one without variable selection.
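The interval step described in this abstract can be sketched for a single pair of classes as follows. This is a hedged illustration, not the authors' method: a nearest-centroid classifier stands in for the local classification model, the selection criterion uses only the leave-one-out error rate (the abstract's criterion also weighs predictive ability), and all function names are hypothetical.

```python
import numpy as np


def split_intervals(n_variables, n_intervals):
    """Index arrays for contiguous, near-equal-width subintervals."""
    return np.array_split(np.arange(n_variables), n_intervals)


def loo_error_rate(X, y):
    """Leave-one-out error rate of a nearest-centroid classifier.

    The nearest-centroid rule is a stand-in assumption for the local
    classification model mentioned in the abstract.
    """
    n = X.shape[0]
    errors = 0
    for i in range(n):
        keep = np.ones(n, dtype=bool)
        keep[i] = False  # hold out sample i
        X_train, y_train = X[keep], y[keep]
        labels = np.unique(y_train)
        centroids = np.array(
            [X_train[y_train == c].mean(axis=0) for c in labels]
        )
        distances = np.linalg.norm(centroids - X[i], axis=1)
        if labels[np.argmin(distances)] != y[i]:
            errors += 1
    return errors / n


def select_best_interval(X, y, n_intervals):
    """Return (index, variable indices, error rates) of the best interval."""
    intervals = split_intervals(X.shape[1], n_intervals)
    rates = [loo_error_rate(X[:, idx], y) for idx in intervals]
    best = int(np.argmin(rates))
    return best, intervals[best], rates
```

For a multi-class problem, the same selection would be repeated for each pair of classes, and the pairwise decisions combined afterwards.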


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1460
Author(s):  
Jinming Liu ◽  
Changhao Zeng ◽  
Na Wang ◽  
Jianfei Shi ◽  
Bo Zhang ◽  
...  

The biochemical methane potential (BMP) of anaerobic co-digestion (co-AD) feedstocks is an essential basis for optimizing material ratios. Because conventional BMP tests are time-consuming, a rapid estimation method for the BMP of co-AD, with straw and feces as feedstocks, was proposed based on near infrared spectroscopy (NIRS) combined with chemometrics. Partial least squares with several variable selection algorithms was used to establish calibration models. The variable selection methods were constructed by combining the genetic simulated annealing algorithm (GSA) with interval partial least squares (iPLS), synergy iPLS, backward iPLS, and competitive adaptive reweighted sampling (CARS), respectively. Comparing the modeling performance of the characteristic wavelengths selected by the different algorithms showed that the model built on the 57 characteristic wavelengths selected by CARS-GSA had the best prediction accuracy. For the validation set, the determination coefficient, root mean square error and relative root mean square error of the CARS-GSA model were 0.984, 6.293 and 2.600, respectively. These results show that the NIRS regression model built on characteristic wavelengths selected by CARS-GSA can meet actual detection requirements. Given a large number of collected samples, the proposed method can rapidly and accurately determine the BMP of co-AD raw materials in biogas engineering.
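To give a feel for stochastic wavelength selection of the kind this abstract describes, the sketch below runs plain simulated annealing over a binary wavelength mask. It is not the paper's CARS-GSA procedure: the genetic component and CARS are omitted, ordinary least squares stands in for partial least squares, and the function names, cooling schedule, and default parameters are all assumptions chosen for illustration.

```python
import numpy as np


def validation_rmse(mask, X_cal, y_cal, X_val, y_val):
    """Validation RMSE of a least-squares fit on the selected wavelengths.

    Ordinary least squares is a stand-in assumption for PLS.
    """
    if not mask.any():
        return np.inf  # an empty selection is never acceptable
    A_cal = np.column_stack([np.ones(len(X_cal)), X_cal[:, mask]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, mask]])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    return float(np.sqrt(np.mean((A_val @ coef - y_val) ** 2)))


def anneal_wavelengths(X_cal, y_cal, X_val, y_val,
                       n_steps=300, t_start=1.0, cooling=0.98, seed=0):
    """Simulated annealing over a boolean wavelength mask."""
    rng = np.random.default_rng(seed)
    n_vars = X_cal.shape[1]
    current = rng.random(n_vars) < 0.5
    if not current.any():
        current[rng.integers(n_vars)] = True
    current_score = validation_rmse(current, X_cal, y_cal, X_val, y_val)
    best, best_score = current.copy(), current_score
    temperature = t_start
    for _ in range(n_steps):
        candidate = current.copy()
        flip = rng.integers(n_vars)
        candidate[flip] = not candidate[flip]  # toggle one wavelength
        score = validation_rmse(candidate, X_cal, y_cal, X_val, y_val)
        delta = score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta <= 0 or rng.random() < np.exp(-delta / temperature):
            current, current_score = candidate, score
            if score < best_score:
                best, best_score = candidate.copy(), score
        temperature *= cooling
    return best, best_score
```

Scoring each mask on a separate validation set keeps the search from simply rewarding larger selections; with real spectra a cross-validated score would be a reasonable alternative.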

