High accuracy determination of Angelica dahurica origin based on near infrared spectroscopy and a random forest pruning algorithm

2019 ◽  
Vol 27 (4) ◽  
pp. 278-285 ◽  
Author(s):  
Yonghao Xu ◽  
Li Liu ◽  
Meizhen Huang ◽  
Ning Xu

A near infrared (NIR) spectroscopy method combined with a random forest pruning algorithm based on margin optimization and principal component analysis (PCA-MORFP) was proposed to identify the origin of Angelica dahurica. One hundred and ninety-six samples of A. dahurica were collected from four original cultivation regions; their NIR diffuse reflectance spectra were measured with a custom-built near infrared spectrometer that works in the range of 900–1700 nm with a resolution (full width at half maximum [FWHM]) of 4 nm. Combinations of Savitzky–Golay smoothing, standard normal variate, and first derivative transformations were used to preprocess the spectral data. The PCA-MORFP classification model was then constructed and compared with other classification approaches: principal component analysis-K-nearest neighbor, principal component analysis-support vector machine, and principal component analysis-random forest. Experimental results showed that PCA-MORFP achieved the best prediction performance among the compared methods. The recognition rates of the PCA-MORFP model reached 100% for the calibration set and 98.2% for the prediction set. The method provides a rapid and convenient detection technique for the origin identification of A. dahurica.
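The preprocessing steps named in this abstract can be illustrated with a minimal standard-library sketch. The abstract does not give the window size, polynomial order, or the order in which the transformations were combined, so the 5-point quadratic Savitzky–Golay window, the central-difference derivative, and the chain in `preprocess` are all assumptions made for illustration; the function names are hypothetical.

```python
import math

# 5-point quadratic Savitzky-Golay smoothing coefficients (assumed window; not from the paper).
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth5(spectrum):
    """Smooth with a 5-point quadratic Savitzky-Golay filter; the two points at each edge are left unchanged."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(c * spectrum[i + k - 2] for k, c in enumerate(SG5))
    return out


def snv(spectrum):
    """Standard normal variate: centre a spectrum on its own mean and scale by its own sample std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative; the result is two points shorter than the input."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
            for i in range(1, len(spectrum) - 1)]


def preprocess(spectrum):
    """One possible combination (assumed order): smooth, then SNV, then first derivative."""
    return first_derivative(snv(savgol_smooth5(spectrum)))
```

Each spectrum is treated independently here, which is how SNV is normally defined; a perfectly flat spectrum would have zero standard deviation and would need separate handling.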

2014 ◽  
Vol 07 (06) ◽  
pp. 1450028 ◽  
Author(s):  
Xuan Zhang ◽  
Yiping Du ◽  
Peijin Tong ◽  
Yuanlong Wei ◽  
Man Wang

Near infrared spectroscopy (NIRS), coupled with principal component analysis and wavelength selection techniques, has been used to develop a robust and reliable reduced-spectrum classification model for determining the geographical origins of Nanfeng mandarins. The application of the changeable size moving window principal component analysis (CSMWPCA) provided a notably improved classification model, with correct classification rates of 92.00%, 100.00%, 90.00%, 100.00%, 100.00%, 100.00% and 100.00% for the Fujian, Guangxi, Hunan, Baishe, Baofeng, Qiawan and Sanxi samples, respectively, and a total classification rate of 97.52% in the wavelength range from 1007 to 1296 nm. To test the proposed method, the procedure was applied to the analysis of 59 samples in an independent test set, and good identification results (correct rate of 96.61%) were also obtained. The improvement achieved by the CSMWPCA method was particularly remarkable given the low complexity of the final model (290 variables). The results of the study showed the great potential of NIRS as a fast, nondestructive and environmentally acceptable method for the rapid and reliable determination of the geographical origin of Nanfeng mandarins.
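The idea of a moving window whose size is allowed to change can be sketched as an exhaustive search over window start positions and widths. This is only an illustration of the search pattern, not the CSMWPCA procedure itself: the abstract does not describe how each window is scored, so `score` is a hypothetical callback (in practice it would be something like the classification rate of a PCA model built on that window), and `window_search` is an assumed name.

```python
def window_search(n_vars, min_width, max_width, score):
    """Evaluate every window of every allowed width; return (best_score, start, width).

    `score(start, width)` is a caller-supplied function (hypothetical here) that
    rates the spectral window covering variables start .. start + width - 1.
    On ties, the first window found is kept.
    """
    best = None
    for width in range(min_width, max_width + 1):
        for start in range(0, n_vars - width + 1):
            s = score(start, width)
            if best is None or s > best[0]:
                best = (s, start, width)
    return best


# Toy score for demonstration only: it peaks for the window starting at index 3 with width 3.
def toy_score(start, width):
    return -(abs(start - 3) + abs(width - 3))


best_score, best_start, best_width = window_search(10, 2, 5, toy_score)
```

The cost grows with the number of variables times the number of allowed widths, so a real implementation would likely restrict the width range or the step between window positions.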


2019 ◽  
Vol 59 (6) ◽  
pp. 1190 ◽  
Author(s):  
A. Bahri ◽  
S. Nawar ◽  
H. Selmi ◽  
M. Amraoui ◽  
H. Rouissi ◽  
...  

Rapid optical measurement techniques have the advantage over traditional methods of being faster and non-destructive. In this work, visible and near-infrared spectroscopy (vis-NIRS) was used to investigate differences between measured values of key milk properties (e.g. fat, protein and lactose) in 30 samples of ewes' milk according to three feed systems: faba beans, field peas and a control diet. A mobile fibre-optic vis-NIR spectrophotometer (350–2500 nm) was used to collect reflectance spectra from milk samples. Principal component analysis was used to explore differences between milk samples according to the feed supplied, and partial least-squares regression and random forest regression were adopted to develop calibration models for the prediction of milk properties. Results of the principal component analysis showed clear separation between the three groups of milk samples according to the diet of the ewes throughout the lactation period. Milk fat, protein and lactose were predicted with good accuracy by means of partial least-squares regression (R² = 0.70–0.83; ratio of prediction deviation, which is the ratio of standard deviation to root mean square error of prediction, = 1.85–2.44). However, the best prediction results were obtained with random forest regression models (R² = 0.86–0.90; ratio of prediction deviation = 2.73–3.26). The adoption of vis-NIRS coupled with multivariate modelling tools can be recommended for exploring differences between milk samples according to different feed systems, and for predicting key milk properties, based particularly on the random forest regression modelling technique.
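The two figures of merit quoted in this abstract can be computed as in the following sketch. The ratio of prediction deviation follows the definition given in the abstract (standard deviation divided by root mean square error of prediction); using the sample standard deviation of the reference values, and the function names themselves, are assumptions. The numbers in the example are made up and are not data from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rpd(y_true, y_pred):
    """Ratio of prediction deviation: SD of the reference values divided by RMSEP."""
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in y_true) / (n - 1))  # sample SD (assumed)
    return sd / rmsep(y_true, y_pred)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]
print(rmsep(reference, predicted), rpd(reference, predicted), r_squared(reference, predicted))
```

A higher ratio of prediction deviation means the prediction error is small relative to the natural spread of the property, which is why the random forest range of 2.73–3.26 is read as better than the partial least-squares range of 1.85–2.44.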


2013 ◽  
Vol 834-836 ◽  
pp. 935-938
Author(s):  
Lian Shun Zhang ◽  
Chao Guo ◽  
Bao Quan Wang

In this paper, liquor brands were identified using near infrared spectroscopy and principal component analysis. Sixty samples from 6 different liquor brands were measured with a USB4000 spectrometer. To eliminate noise caused by external factors, smoothing and multiplicative scatter correction were applied, yielding corrected spectra for the 60 samples. The differences in spectral shape between brands were not large enough to classify them directly, so principal component analysis was applied for further analysis. The results showed that the cumulative variance contribution rate of the first two principal components reached 99.06%, so they effectively represent the information in the preprocessed spectra. In the scatter plot of the first two principal components, the 6 brands of liquor were identified more accurately and more easily than from the spectral curves.
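The variance contribution rate mentioned here is the share of total variance carried by each principal component, that is, each eigenvalue of the covariance matrix divided by the sum of all eigenvalues. A minimal sketch follows; the eigenvalues are invented for illustration and are not taken from the paper, and the function names are hypothetical.

```python
def contribution_rates(eigenvalues):
    """Fraction of total variance explained by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def cumulative_contribution(eigenvalues, n_components):
    """Cumulative variance contribution of the first n_components (largest eigenvalues first)."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:n_components]) / sum(ordered)


# Illustrative eigenvalues only.
example_eigenvalues = [8.0, 1.5, 0.5]
print(cumulative_contribution(example_eigenvalues, 2))  # share carried by the first two components
```

When the first two components carry nearly all of the variance, as reported in the abstract, a two-dimensional score plot loses very little information relative to the full spectra.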


2013 ◽  
Vol 781-784 ◽  
pp. 1464-1468
Author(s):  
Xiu Hua Liu ◽  
Xiao Ting Li ◽  
Jing Wang ◽  
Rui Ying Li ◽  
Guang Chen Wu ◽  
...  

In order to identify authentic Pingli Gynostemma, a geographical indication product, diffuse reflectance spectra of Gynostemma from eight different origins were collected with a Fourier transform near-infrared spectrometer. The spectra were analyzed with chemometric methods and pretreated by vector normalization. The spectral range was 4800–10096 cm⁻¹. Calibration models of Gynostemma were established by principal component analysis, qualification testing and cluster analysis, respectively, and each model was verified. The results show that the optimal models established by principal component analysis, qualification testing and cluster analysis can effectively identify authentic Pingli Gynostemma, with an accuracy rate of 100%. In conclusion, Pingli Gynostemma can be identified accurately and quickly by the near-infrared spectroscopy technique.
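Two small calculations help when reading this abstract. Vector normalization is sketched below as mean-centring followed by scaling to unit Euclidean length, which is one common definition; the abstract does not state which variant was used, so this is an assumption. The wavenumber conversion uses the standard relation wavelength (nm) = 10⁷ / wavenumber (cm⁻¹), under which 4800–10096 cm⁻¹ corresponds to roughly 2083–990 nm. Function names are hypothetical.

```python
import math


def vector_normalize(spectrum):
    """Mean-centre a spectrum, then scale it to unit Euclidean (L2) norm (assumed variant)."""
    mean = sum(spectrum) / len(spectrum)
    centred = [x - mean for x in spectrum]
    norm = math.sqrt(sum(c * c for c in centred))
    return [c / norm for c in centred]


def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm1


print(round(wavenumber_to_nm(4800.0), 1), round(wavenumber_to_nm(10096.0), 1))
```

After this normalization every spectrum has zero mean and a sum of squares equal to one, which removes overall intensity differences between samples before modelling.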


2014 ◽  
Vol 989-994 ◽  
pp. 4028-4031
Author(s):  
Yan Ping Pang ◽  
Kun Liu ◽  
Li Ya Xia ◽  
Shao Long Yu

In order to identify Zherong Radix Pseudostellariae, a geographical indication product, diffuse reflectance spectra of Radix Pseudostellariae from ten different origins were collected with a Fourier transform near-infrared spectrometer. The spectra were analyzed with chemometric methods and pretreated by the second derivative, first derivative and straight-line subtraction. The spectral range was 3996.1–7282.5 cm⁻¹. Calibration models of Radix Pseudostellariae were established by qualification testing, principal component analysis and cluster analysis, respectively, and each model was verified. The results show that the optimal models established by qualification testing, principal component analysis and cluster analysis can effectively identify authentic Zherong Radix Pseudostellariae, with an accuracy rate of more than 97.5%. In conclusion, Zherong Radix Pseudostellariae can be identified accurately and quickly by the near-infrared spectroscopy technique.
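Straight-line subtraction, one of the pretreatments listed here, is usually understood as fitting a straight line to a spectrum by least squares and subtracting it, which removes a tilted baseline. The sketch below assumes that reading and assumes evenly spaced points indexed 0, 1, 2, ...; the abstract gives no implementation details, and the function name is hypothetical.

```python
def subtract_straight_line(spectrum):
    """Fit y = a + b*x by least squares (x = point index) and return the residuals."""
    n = len(spectrum)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(spectrum) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, spectrum))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, spectrum)]


# A purely linear "spectrum" is reduced to zeros; a peak on a flat baseline keeps its shape.
flat = subtract_straight_line([1.0, 3.0, 5.0, 7.0])
peak = subtract_straight_line([0.0, 1.0, 0.0])
```

The spectrum needs at least two points for the fit to be defined, and the residuals always sum to zero, so the corrected spectrum is also mean-centred.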

