Diffuse reflectance spectroscopy to estimate the concentration of chemical elements in soil and sediment combining pre-processing methods with machine learning 

Author(s):  
Gabriela Naibo ◽  
Rafael Ramon ◽  
Gustavo Pesini ◽  
Jean Michel Moura-Bueno ◽  
Claudia Alessandra Peixoto Barros ◽  
...  

<p>Intensive soil use under inadequate management can result in the continuous transport of sediments, along with the chemical elements adsorbed to them, into aquatic systems. Diffuse reflectance spectroscopy in the near-infrared (NIR) and mid-infrared (MIR) spectral ranges, combined with chemometrics and machine learning, is an analytical technique with the potential to quantify the concentration of chemical elements in the environment. However, there is no consensus on the best combination of calibration methods, spectral pre-processing and spectral ranges. The objective of this study was therefore to evaluate this technique, combining different spectral ranges, pre-processing techniques and machine-learning methods, to estimate the concentration of chemical elements in soil and sediment samples. We used a database of soil and sediment samples collected in the Guaporé River catchment, in southern Brazil. A total of 316 soil samples and 196 sediment samples were dried, disaggregated and sieved at 63 μm. Organic carbon (OC) was quantified by wet oxidation, and the total concentrations of 21 elements (Al, Ba, Be, Ca, Co, Cr, Cu, Fe, K, La, Li, Mg, Mn, Na, Ni, P, Pb, Sr, Ti, V and Zn) were quantified by ICP-OES after microwave-assisted digestion for 9.5 min at 182 °C with concentrated HCl and HNO<sub>3</sub> in a 3:1 ratio. NIR (1000-2500 nm) and MIR (2500-25000 nm) spectra were acquired for all soil and sediment samples. Two machine-learning methods were tested, Partial Least Squares Regression (PLSR) and Support Vector Machine (SVM), each combined with three spectral pre-processing methods, Detrend (DET), Savitzky-Golay Derivative (SGD) and Standard Normal Variate (SNV), and compared with raw data (RAW). Performance was assessed by the coefficient of determination (R²) and the ratio of performance to interquartile distance (RPIQ). 
The SVM model gave better predictions than PLSR in all evaluated cases, as indicated by the average goodness of fit (R²=0.87 for SVM and 0.62 for PLSR) and by the RPIQ values (7.14 for SVM and 2.22 for PLSR). The pre-processing methods increased the accuracy of the estimates in the following order: RAW < SNV < DET < SGD. With respect to spectral range, the best performance was observed for the MIR region, which was significantly superior to NIR and to the NIR+MIR combination. The fit of the models calibrated with soil (R²=0.91) or sediment (R²=0.90) data was better than that of the models calibrated with the combined soil + sediment data (R²=0.78). The calibration model with soil data showed the highest RPIQ value (9.29), which differed significantly from the others. In general, the results show that the combination of calibration method, spectral pre-processing and spectral range affects the accuracy of the estimates. The studied elements can be estimated by diffuse reflectance spectroscopy. However, the technique carries an associated error in the estimates, due to the heterogeneity of the chemical structure of the elements in the soil and sediment matrix and to the reference data obtained by chemical methods.</p>
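The three pre-processing steps and the two performance metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the detrend order, the Savitzky-Golay window or polynomial order, or the exact R² formula, so the linear detrend, the 5-point quadratic first derivative, the sample standard deviation in SNV and the `1 - SS_res / SS_tot` form of R² are all assumptions chosen for illustration. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, quantiles


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    m, s = mean(spectrum), stdev(spectrum)  # sample std (n - 1) is an assumption
    return [(x - m) / s for x in spectrum]


def detrend_linear(spectrum):
    """Subtract a least-squares straight line fitted against the band index.

    A first-order fit is an assumption; a second-order fit is also common.
    """
    n = len(spectrum)
    t_mean, x_mean = (n - 1) / 2, mean(spectrum)
    slope = (sum((t - t_mean) * (x - x_mean) for t, x in enumerate(spectrum))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = x_mean - slope * t_mean
    return [x - (intercept + slope * t) for t, x in enumerate(spectrum)]


def sg_first_derivative(spectrum):
    """Savitzky-Golay first derivative with a 5-point window and quadratic fit.

    Uses the standard coefficients (-2, -1, 0, 1, 2) / 10 with unit band spacing
    and returns interior points only, so the output is 4 values shorter.
    """
    coeffs = (-2, -1, 0, 1, 2)
    return [sum(c * spectrum[i + k] for k, c in enumerate(coeffs)) / 10
            for i in range(len(spectrum) - 4)]


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot (assumed form)."""
    o_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - o_mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rpiq(observed, predicted):
    """Ratio of performance to interquartile distance: IQR(observed) / RMSE."""
    rmse = sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse
```

In practice each pre-processing function would be applied to every spectrum before fitting PLSR or SVM, and `r_squared` and `rpiq` would be computed on held-out reference concentrations. A higher RPIQ means the prediction error is small relative to the spread of the observed values.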

PLoS ONE ◽  
2019 ◽  
Vol 14 (10) ◽  
pp. e0223682 ◽  
Author(s):  
Ulf Dahlstrand ◽  
Rafi Sheikh ◽  
Cu Dybelius Ansson ◽  
Khashayar Memarzadeh ◽  
Nina Reistad ◽  
...  

2021 ◽  
Vol 26 (05) ◽  
Author(s):  
Mayna H. Nguyen ◽  
Yao Zhang ◽  
Frank Wang ◽  
Jose De La Garza Evia Linan ◽  
Mia K. Markey ◽  
...  

2020 ◽  
Vol 11 (9) ◽  
pp. 5122
Author(s):  
Rajitha Gunaratne ◽  
Joshua Goncalves ◽  
Isaac Monteath ◽  
Raymond Sheh ◽  
Michael Kapfer ◽  
...  

1997 ◽  
Vol 5 (2) ◽  
pp. 67-75 ◽  
Author(s):  
M. Blanco ◽  
J. Coello ◽  
H. Iturriaga ◽  
S. Maspoch ◽  
C. de la Pezuela

The results obtained by implementing Principal Component Regression (PCR) according to three different criteria for choosing principal components (PCs), and those provided by Partial Least-Squares Regression (PLSR), in the determination of the active compound in a pharmaceutical preparation by near infrared diffuse reflectance spectroscopy are compared. The PCR-top down criterion used is commonly implemented in commercially available software: it selects consecutive PCs beginning with that possessing the largest eigenvalue. The other two criteria used do not assume the PCs with the largest eigenvalues to be the best predictors for the response variable; rather, the PCR-correlation criterion chooses only those PCs exhibiting the highest correlation with the response variable, and the PCR-best subset criterion selects those that provide the lowest predicted residual sum of squares (PRESS) for an external prediction set. All the calibration methods tested exhibited a similar predictive ability (prediction errors ranged from 1.34% to 1.49%); however, the number of PCs used in the regression varied among them. The PLSR technique did not outperform the methods based on selecting the best PCs for regression. Also, the PCR-correlation and PCR-best subset methods provided the same results and used fewer PCs than the PCR-top down method.
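The difference between the top-down and correlation criteria described above can be sketched in a few lines of Python. This is only an illustration of the selection step, not the paper's procedure: it assumes the PC score columns have already been computed by some PCA routine and are ordered by decreasing eigenvalue, and the function names and the fixed number of components `k` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def select_top_down(scores, k):
    """Top-down criterion: take the first k PCs, i.e. those with the largest eigenvalues."""
    return list(range(min(k, len(scores))))


def select_by_correlation(scores, y, k):
    """Correlation criterion: take the k PCs whose scores correlate most strongly with y."""
    ranked = sorted(range(len(scores)),
                    key=lambda j: abs(pearson(scores[j], y)),
                    reverse=True)
    return sorted(ranked[:k])


# Toy data: three PC score columns (largest eigenvalue first) and a response.
scores = [[1.0, -1.0, 1.0, -1.0],
          [-1.0, -0.5, 0.5, 1.0],
          [0.3, -0.1, -0.3, 0.1]]
y = [1.0, 2.0, 3.0, 4.0]

print(select_top_down(scores, 1))           # indices chosen by eigenvalue order
print(select_by_correlation(scores, y, 1))  # indices chosen by |correlation| with y
```

In this toy case the second PC tracks the response far better than the first, so the two criteria pick different components. The best-subset criterion would instead search over combinations of PCs and keep the one with the lowest PRESS on an external prediction set.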


2019 ◽  
Vol 10 (8) ◽  
pp. 3889 ◽  
Author(s):  
Rajitha Gunaratne ◽  
Isaac Monteath ◽  
Joshua Goncalves ◽  
Raymond Sheh ◽  
Charles N Ironside ◽  
...  
